Components Overview

Detailed documentation for each major component in the OpenQuery system.

📋 Table of Contents

  1. Component Hierarchy
  2. Core Components
  3. Services
  4. Data Models
  5. Component Interactions

Component Hierarchy

OpenQuery/
├── Program.cs                    [Entry Point, CLI]
├── OpenQuery.cs                  [OpenQueryApp - Orchestrator]
├── Tools/
│   └── SearchTool.cs            [Pipeline Orchestration]
├── Services/
│   ├── OpenRouterClient.cs      [LLM & Embedding API]
│   ├── SearxngClient.cs         [Search API]
│   ├── EmbeddingService.cs      [Embedding Generation + Math]
│   ├── ChunkingService.cs       [Text Splitting]
│   ├── ArticleService.cs        [Content Extraction]
│   ├── RateLimiter.cs           [Concurrency Control]
│   └── StatusReporter.cs        [Progress Display]
├── Models/
│   ├── OpenQueryOptions.cs      [CLI Options Record]
│   ├── Chunk.cs                 [Content + Metadata]
│   ├── ParallelOptions.cs       [Concurrency Settings]
│   ├── OpenRouter.cs            [API DTOs]
│   ├── Searxng.cs               [Search Result DTOs]
│   └── JsonContexts.cs          [JSON Context]
└── ConfigManager.cs             [Configuration Persistence]

Core Components

1. Program.cs

Type: Console Application Entry Point
Responsibilities: CLI parsing, dependency wiring, error handling

Key Elements:

  • RootCommand from System.CommandLine
  • Options: --chunks, --results, --queries, --short, --long, --verbose
  • Subcommand: configure (with interactive mode)
  • Configuration loading via ConfigManager.Load()
  • Environment variable resolution
  • Service instantiation and coordination
  • Top-level try-catch for error reporting

Code Flow:

  1. Load config file
  2. Define CLI options and commands
  3. Set handler for root command
  4. Handler: resolve API key/model → instantiate services → call OpenQueryApp.RunAsync()
  5. Set handler for configure command (writes config file)
  6. Invoke command parser: await rootCommand.InvokeAsync(args)

Exit Codes:

  • 0 = success
  • 1 = error

2. OpenQueryApp (OpenQuery.cs)

Type: Main Application Class
Responsibilities: Workflow orchestration, query generation, answer streaming

Constructor Parameters:

  • OpenRouterClient client - for query gen and final answer
  • SearchTool searchTool - for search-retrieve-rank pipeline
  • string model - LLM model identifier

Main Method: RunAsync(OpenQueryOptions options)

Workflow Steps:

  1. Create StatusReporter (for progress UI)
  2. Optional Query Generation (if options.Queries > 1):
    • Create system message instructing JSON array output
    • Create user message with options.Question
    • Call client.CompleteAsync() with query gen model
    • Parse JSON response; fall back to original question on failure
    • Result: List<string> queries (1 or many)
  3. Execute Search Pipeline:
    • Call _searchTool.ExecuteAsync() with queries, options
    • Receive string context (formatted context with source citations)
    • Progress reported via callback to StatusReporter
  4. Generate Final Answer:
    • Build system prompt (append "short" or "long" modifier)
    • Create user message with Context:\n{context}\n\nQuestion: {options.Question}
    • Stream answer via client.StreamAsync()
    • Write each chunk.TextDelta to Console as it arrives
    • Stop spinner on first chunk, continue streaming
  5. Dispose reporter
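
The parse-with-fallback in step 2 might look like the following hypothetical helper (the real code may inline this logic; the helper name is illustrative, not the project's):

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

static List<string> ParseGeneratedQueries(string? responseContent, string originalQuestion)
{
    try
    {
        var queries = JsonSerializer.Deserialize<List<string>>(responseContent ?? "");
        if (queries is { Count: > 0 })
            return queries;
    }
    catch (JsonException)
    {
        // Model did not return a valid JSON array: fall back to the original question.
    }
    return new List<string> { originalQuestion };
}
```

Either way the caller always gets a non-empty query list, so the pipeline never stalls on a malformed model response.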

Error Handling:

  • Exceptions propagate to Program.cs top-level handler
  • HttpRequestException (network/API failures) can be distinguished from generic Exception in the top-level handler

Note: Query generation uses the same model as final answer; could be separated for cost/performance.

3. SearchTool (Tools/SearchTool.cs)

Type: Pipeline Orchestrator
Responsibilities: Execute 4-phase search-retrieve-rank-return workflow

Constructor Parameters:

  • SearxngClient searxngClient
  • EmbeddingService embeddingService

Main Method: ExecuteAsync(originalQuery, generatedQueries, maxResults, topChunksLimit, onProgress, verbose)

Returns: Task<string> - formatted context string with source citations

Pipeline Phases:

Phase 1: ExecuteParallelSearchesAsync

  • Parallelize searxngClient.SearchAsync(query, maxResults) for each query
  • Collect all results in ConcurrentBag<SearxngResult>
  • Deduplicate by DistinctBy(r => r.Url)

Output: List<SearxngResult> (aggregated, unique)

Phase 2: ExecuteParallelArticleFetchingAsync

  • Semaphore: MaxConcurrentArticleFetches (default 10)
  • For each SearxngResult: fetch URL via ArticleService.FetchArticleAsync()
  • Extract article text, title
  • Chunk via ChunkingService.ChunkText(article.TextContent)
  • Add each chunk as new Chunk(content, url, title)

Output: List<Chunk> (potentially 50-100 chunks)

Phase 3: ExecuteParallelEmbeddingsAsync

  • Start two parallel tasks:
    1. Query embedding: embeddingService.GetEmbeddingAsync(originalQuery)
    2. Chunk embeddings: embeddingService.GetEmbeddingsWithRateLimitAsync(chunkTexts, onProgress)
  • Parallel.ForEachAsync with MaxConcurrentEmbeddingRequests (default 4)
  • Batch size: 300 chunks per embedding API call
  • Filter chunks with empty embeddings (failed batches)

Output: (float[] queryEmbedding, float[][] chunkEmbeddings)

Phase 4: RankAndSelectTopChunks

  • Calculate cosine similarity for each chunk vs query
  • Assign chunk.Score
  • Order by descending score
  • Take topChunksLimit (from --chunks option)
  • Return List<Chunk> (top N)
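
A self-contained sketch of Phase 4, assuming the shapes described above (the local `Cosine` stands in for `EmbeddingService.CosineSimilarity`, and chunks are simplified to content strings):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static float Cosine(float[] a, float[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return (float)(dot / (Math.Sqrt(na) * Math.Sqrt(nb)));
}

static List<(string Content, float Score)> RankAndSelectTopChunks(
    List<string> chunkContents, float[][] chunkEmbeddings,
    float[] queryEmbedding, int topChunksLimit)
{
    // Score each chunk against the query, then keep the N best.
    return chunkContents
        .Select((content, i) =>
            (Content: content, Score: Cosine(queryEmbedding, chunkEmbeddings[i])))
        .OrderByDescending(x => x.Score)
        .Take(topChunksLimit)
        .ToList();
}
```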

Formatting:

string context = string.Join("\n\n", topChunks.Select((c, i) =>
    $"[Source {i+1}: {c.Title ?? "Unknown"}]({c.SourceUrl})\n{c.Content}"));

Progress Callbacks: Invoked at each major step for UI feedback

Services

OpenRouterClient

Purpose: HTTP client for OpenRouter API (chat completions + embeddings)

Base URL: https://openrouter.ai/api/v1

Authentication: Authorization: Bearer {apiKey}

Methods:

StreamAsync(ChatCompletionRequest request, CancellationToken)

  • Sets request.Stream = true
  • POST to /chat/completions
  • Reads SSE stream line-by-line
  • Parses data: {json} chunks
  • Yields StreamChunk (text delta or tool call)
  • Supports cancellation
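
The SSE parsing step can be sketched as follows (OpenAI-style chunk shape assumed; the real method also surfaces tool-call deltas and reads the network stream incrementally rather than a whole string):

```csharp
using System.Collections.Generic;
using System.Text.Json;

static List<string> ParseSseDeltas(string sseBody)
{
    var deltas = new List<string>();
    foreach (var line in sseBody.Split('\n'))
    {
        if (!line.StartsWith("data: ")) continue;   // skip blanks and comments
        var payload = line["data: ".Length..].Trim();
        if (payload == "[DONE]") break;             // end-of-stream sentinel
        using var doc = JsonDocument.Parse(payload);
        var delta = doc.RootElement.GetProperty("choices")[0].GetProperty("delta");
        if (delta.TryGetProperty("content", out var content) && content.GetString() is { } text)
            deltas.Add(text);
    }
    return deltas;
}
```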

CompleteAsync(ChatCompletionRequest request)

  • Sets request.Stream = false
  • POST to /chat/completions
  • Deserializes full response
  • Returns ChatCompletionResponse

EmbedAsync(string model, List<string> inputs)

  • POST to /embeddings
  • Returns float[][] (ordered by input index)

Error Handling: EnsureSuccessStatusCode() throws HttpRequestException on failure

Design: Thin wrapper; no retry logic (delegated to EmbeddingService)

SearxngClient

Purpose: HTTP client for SearxNG metasearch

Base URL: Configurable (default http://localhost:8002)

Methods:

SearchAsync(string query, int limit = 10)

  • GET {baseUrl}/search?q={query}&format=json
  • Deserializes to SearxngRoot
  • Returns Results.Take(limit).ToList()
  • On failure: returns empty List<SearxngResult> (no exception)
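
One plausible shape for this method, under the assumption that the DTOs deserialize with web defaults (sketch only; the real class holds the base URL and HttpClient as fields):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

public record SearxngResult(string Title, string Url, string Content);
public record SearxngRoot(List<SearxngResult> Results);

static async Task<List<SearxngResult>> SearchAsync(
    HttpClient http, string baseUrl, string query, int limit = 10)
{
    try
    {
        var url = $"{baseUrl}/search?q={Uri.EscapeDataString(query)}&format=json";
        var root = await http.GetFromJsonAsync<SearxngRoot>(url);
        return root?.Results.Take(limit).ToList() ?? new List<SearxngResult>();
    }
    catch (Exception)
    {
        // Tolerated: an empty list lets the pipeline continue with other queries.
        return new List<SearxngResult>();
    }
}
```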

Design: Very simple; failures are tolerated (OpenQuery continues with other queries)

EmbeddingService

Purpose: Batch embedding generation with rate limiting, parallelization, and retries

Configuration (from ParallelProcessingOptions):

  • MaxConcurrentEmbeddingRequests = 4
  • EmbeddingBatchSize = 300

Default Embedding Model: openai/text-embedding-3-small

Methods:

GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)

  • Splits texts into batches of EmbeddingBatchSize
  • Parallelizes batches with Parallel.ForEachAsync + MaxConcurrentEmbeddingRequests
  • Each batch: rate-limited + retry-wrapped client.EmbedAsync(model, batch)
  • Collects results in order (by batch index)
  • Returns float[][] (same order as input texts)
  • Failed batches return empty float[] for each text
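
The batch-splitting step can be sketched like this (EmbeddingBatchSize = 300 per the doc):

```csharp
using System;
using System.Collections.Generic;

static List<List<string>> SplitIntoBatches(List<string> texts, int batchSize)
{
    var batches = new List<List<string>>();
    for (int i = 0; i < texts.Count; i += batchSize)
        batches.Add(texts.GetRange(i, Math.Min(batchSize, texts.Count - i)));
    return batches;
}
```

Because each batch carries its index, results can be written back at offset `batchIndex * batchSize`, so the final `float[][]` preserves input order even when batches finish out of order under `Parallel.ForEachAsync`.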

GetEmbeddingAsync(string text, CancellationToken)

  • Wraps single-text call in rate limiter + retry
  • Returns float[]

CosineSimilarity(float[] v1, float[] v2)

  • Static method using TensorPrimitives.CosineSimilarity
  • Returns float between -1 and 1 (typically 0-1 for normalized embeddings)

Retry Policy (Polly):

  • Max 3 attempts
  • 1s base delay, exponential backoff
  • Only HttpRequestException

Rate Limiting: RateLimiter semaphore with MaxConcurrentEmbeddingRequests
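
One plausible Polly policy matching the description above (v7-style API shown; the project's actual Polly version, and whether "3 attempts" counts the initial try, are assumptions):

```csharp
using System;
using Polly;
using System.Net.Http;

var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(
        retryCount: 2,  // 2 retries after the initial try = 3 attempts total
        sleepDurationProvider: attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt - 1)));

// Wrapped around each batch call, e.g.:
// var vectors = await retryPolicy.ExecuteAsync(() => client.EmbedAsync(model, batch));
```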

Design Notes:

  • Two similar methods (GetEmbeddingsAsync and GetEmbeddingsWithRateLimitAsync) - could be consolidated
  • Uses Polly for resilience (good pattern)
  • Concurrency control prevents overwhelming OpenRouter

ChunkingService

Purpose: Split long text into manageable pieces

Static Class (no dependencies, pure function)

Algorithm (in ChunkText(string text)):

  • Constant MAX_CHUNK_SIZE = 500
  • While remaining text:
    • Take up to 500 chars
    • If not at end, backtrack to last [' ', '\n', '\r', '.', '!']
    • Trim and add non-empty chunk
    • Advance start position

Rationale: 500 chars is a sweet spot for embeddings - long enough for context, short enough for semantic coherence.

Edge Cases: Handles text shorter than 500 chars, empty text, text with no natural breaks.
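
The algorithm above can be sketched as (MAX_CHUNK_SIZE and break characters per the doc; the project's exact backtracking details may differ):

```csharp
using System;
using System.Collections.Generic;

public static List<string> ChunkText(string text)
{
    const int MAX_CHUNK_SIZE = 500;
    char[] breaks = { ' ', '\n', '\r', '.', '!' };
    var chunks = new List<string>();
    int start = 0;
    while (start < text.Length)
    {
        int length = Math.Min(MAX_CHUNK_SIZE, text.Length - start);
        if (start + length < text.Length)
        {
            // Not at the end: backtrack to the last natural break, if any.
            int lastBreak = text.LastIndexOfAny(breaks, start + length - 1, length);
            if (lastBreak > start)
                length = lastBreak - start + 1;
        }
        var chunk = text.Substring(start, length).Trim();
        if (chunk.Length > 0)
            chunks.Add(chunk);
        start += length;
    }
    return chunks;
}
```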

ArticleService

Purpose: Extract clean article content from URLs

Method: FetchArticleAsync(string url)

Implementation: Delegates to SmartReader.ParseArticleAsync(url)

Returns: Article object (from SmartReader)

  • Title (string)
  • TextContent (string) - cleaned article body
  • IsReadable (bool) - quality indicator
  • Other metadata (author, date, etc.)

Error Handling: Exceptions propagate (handled by SearchTool)

Design: Thin wrapper around third-party library. Could be extended to add caching, custom extraction rules, etc.

RateLimiter

Purpose: Limit concurrent operations via semaphore

Interface:

Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken cancellationToken);
Task ExecuteAsync(Func<Task> action, CancellationToken cancellationToken);

Implementation: SemaphoreSlim with WaitAsync and Release

Disposal: IAsyncDisposable (awaits semaphore disposal)

Usage: Wrap API calls that need concurrency control

var result = await _rateLimiter.ExecuteAsync(async () =>
    await _client.EmbedAsync(model, batch), cancellationToken);

Design: Simple, reusable. Could be replaced with Polly.RateLimiting policy but this is lightweight.
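
A minimal version of the class described above (sketch; the real one also implements IAsyncDisposable and the void-returning overload):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class RateLimiter
{
    private readonly SemaphoreSlim _semaphore;

    public RateLimiter(int maxConcurrency) =>
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken ct = default)
    {
        await _semaphore.WaitAsync(ct);
        try { return await action(); }
        finally { _semaphore.Release(); }
    }
}
```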

StatusReporter

Purpose: Real-time progress UI with spinner and verbose modes

Architecture:

  • Producer: UpdateStatus(text) → writes to Channel<string>
  • Consumer: Background task ProcessStatusUpdatesAsync() reads from channel
  • Spinner: Separate task animates Braille characters every 100ms

Modes:

Verbose Mode (_verbose = true):

  • All progress messages written as Console.WriteLine()
  • No spinner
  • Full audit trail

Compact Mode (default):

  • Status line with spinner (overwrites same line)
  • Only latest status visible
  • Example: ⠋ Fetching articles 3/10...

Key Methods:

  • UpdateStatus(message) - fire-and-forget, non-blocking
  • WriteLine(text) - stops spinner temporarily, writes full line
  • StartSpinner() / StopSpinner() - manual control
  • ClearStatus() - ANSI escape \r\x1b[K to clear line
  • Dispose() - completes channel, waits for background tasks

Spinner Chars: ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏'] (Braille patterns, smooth animation)

ANSI Codes: \r (carriage return), \x1b[K (erase to end of line)

Thread Safety: Channel is thread-safe; multiple components can write concurrently without locks

Design: Well-encapsulated; could be reused in other CLI projects.
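
The core of the producer/consumer pattern described above can be sketched as (member names per the doc; bodies are assumptions):

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

var channel = Channel.CreateUnbounded<string>();

// Producer side: UpdateStatus is fire-and-forget and never blocks.
void UpdateStatus(string message) => channel.Writer.TryWrite(message);

// Consumer side: a background task redraws the status line per message.
async Task ProcessStatusUpdatesAsync()
{
    await foreach (var message in channel.Reader.ReadAllAsync())
        Console.Write($"\r\x1b[K{message}");  // carriage return + erase-to-EOL
}
```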

ConfigManager

Purpose: Load/save configuration from XDG-compliant location

Config Path:

  • ~/.config/openquery/config (user home resolved via Environment.SpecialFolder.UserProfile)

Schema (AppConfig):

public class AppConfig
{
    public string ApiKey { get; set; } = "";
    public string Model { get; set; } = "qwen/qwen3.5-flash-02-23";
    public int DefaultQueries { get; set; } = 3;
    public int DefaultChunks { get; set; } = 3;
    public int DefaultResults { get; set; } = 5;
}

Format: Simple key=value (no INI parser, manual line split)

Methods:

  • Load() → reads file if exists, returns AppConfig (with defaults)
  • Save(AppConfig) → writes all 5 keys, overwrites existing

Design:

  • Static class (no instances)
  • Creates directory if missing
  • No validation (writes whatever values given)
  • Could be improved with JSON format (but keep simple)
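
The manual key=value parsing might look like this (sketch; no INI library involved):

```csharp
using System.Collections.Generic;

static Dictionary<string, string> ParseConfig(string fileContents)
{
    var values = new Dictionary<string, string>();
    foreach (var line in fileContents.Split('\n'))
    {
        int eq = line.IndexOf('=');
        if (eq <= 0) continue;  // skip blank or malformed lines
        values[line[..eq].Trim()] = line[(eq + 1)..].Trim();
    }
    return values;
}
```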

Data Models

OpenQueryOptions

Location: Models/OpenQueryOptions.cs

Type: record

Purpose: Immutable options object passed through workflow

Properties:

  • int Chunks - top N chunks for context
  • int Results - search results per query
  • int Queries - number of expanded queries to generate
  • bool Short - concise answer flag
  • bool Long - detailed answer flag
  • bool Verbose - verbose logging flag
  • string Question - original user question

Created: In Program.cs from CLI options + config defaults

Used By: OpenQueryApp.RunAsync()

Chunk

Location: Models/Chunk.cs

Type: record

Purpose: Content chunk with metadata and embedding

Properties:

  • string Content - extracted text (~500 chars)
  • string SourceUrl - article URL
  • string? Title - article title (nullable)
  • float[]? Embedding - vector embedding (populated by EmbeddingService)
  • float Score - relevance score (populated during ranking)

Lifecycle:

  1. Instantiated in SearchTool.ExecuteParallelArticleFetchingAsync with content, url, title
  2. Embedding set in ExecuteParallelEmbeddingsAsync after batch processing
  3. Score set in RankAndSelectTopChunks after cosine similarity
  4. Serialized into context string for final answer

Equality: Records provide value equality (based on all properties)
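
A shape consistent with the lifecycle above (sketch: positional properties are fixed at creation, while Embedding and Score stay mutable so later phases can fill them in):

```csharp
public record Chunk(string Content, string SourceUrl, string? Title)
{
    public float[]? Embedding { get; set; }
    public float Score { get; set; }
}
```

One caveat on the equality note: for the array-typed Embedding property, record equality compares references, not elements, so two chunks with identical but distinct embedding arrays are not equal.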

ParallelProcessingOptions

Location: Models/ParallelOptions.cs

Type: class (mutable)

Purpose: Concurrency settings for parallel operations

Properties (with defaults):

  • MaxConcurrentArticleFetches = 10
  • MaxConcurrentEmbeddingRequests = 4
  • EmbeddingBatchSize = 300

Used By: EmbeddingService (for embeddings), SearchTool (for article fetching)

Currently: Hardcoded in SearchTool constructor; could be made configurable

OpenRouter Models (Models/OpenRouter.cs)

Purpose: DTOs for OpenRouter API (JSON serializable)

Chat Completion:

  • ChatCompletionRequest (model, messages, tools, stream)
  • ChatCompletionResponse (choices[], usage)
  • Message (role, content, tool_calls, tool_call_id)
  • ToolDefinition, ToolFunction, ToolCall, FunctionCall
  • Choice, Usage

Embedding:

  • EmbeddingRequest (model, input[])
  • EmbeddingResponse (data[], usage)
  • EmbeddingData (embedding[], index)

Streaming:

  • StreamChunk (TextDelta, Tool)
  • ChatCompletionChunk, ChunkChoice, ChunkDelta

JSON Properties: Uses [JsonPropertyName] to match API

Serialization: System.Text.Json with source generation (AppJsonContext)

Searxng Models (Models/Searxng.cs)

Purpose: DTOs for SearxNG search results

Records:

  • SearxngRoot with List<SearxngResult> Results
  • SearxngResult with Title, Url, Content (snippet)

Usage: Deserialized from SearxNG's JSON response

JsonContexts

Location: Models/JsonContexts.cs

Purpose: Source-generated JSON serializer context for AOT compatibility

Pattern:

[JsonSerializable(typeof(ChatCompletionRequest))]
[JsonSerializable(typeof(ChatCompletionResponse))]
... etc ...
internal partial class AppJsonContext : JsonSerializerContext
{
}

Generated: Partial class compiled by source generator

Used By: All JsonSerializer.Serialize/Deserialize calls with AppJsonContext.Default.{Type}

Benefits:

  • AOT-compatible (no reflection)
  • Faster serialization (compiled delegates)
  • Smaller binary (trimming-safe)

Component Interactions

Dependencies Graph

Program.cs
├── ConfigManager (load/save)
├── OpenRouterClient ──┐
├── SearxngClient ─────┤
├── EmbeddingService ──┤
└── SearchTool ────────┤
                        │
OpenQueryApp ◄──────────┘
    │
    ├── OpenRouterClient (query gen + answer streaming)
    ├── SearchTool (pipeline)
    │   ├── SearxngClient (searches)
    │   ├── ArticleService (fetch)
    │   ├── ChunkingService (split)
    │   ├── EmbeddingService (embeddings)
    │   ├── RateLimiter (concurrency)
    │   └── StatusReporter (progress via callback)
    └── StatusReporter (UI)

Data Flow Between Components

OpenQueryOptions
    ↓
OpenQueryApp
    ├─ Query Generation
    │   └─ OpenRouterClient.CompleteAsync()
    │       → List<string> generatedQueries
    │
    ├─ Search Pipeline
    │   └─ SearchTool.ExecuteAsync(originalQuery, generatedQueries, ...)
    │       ↓
    │   Phase 1: SearxngClient.SearchAsync(query) × N
    │       → ConcurrentBag<SearxngResult>
    │       → List<SearxngResult> (unique)
    │       ↓
    │   Phase 2: ArticleService.FetchArticleAsync(url) × M
    │       → ChunkingService.ChunkText(article.TextContent)
    │       → ConcurrentBag<Chunk> (content, url, title)
    │       ↓
    │   Phase 3: EmbeddingService.GetEmbeddingsAsync(chunkContents)
    │       → (queryEmbedding, chunkEmbeddings)
    │       ↓
    │   Phase 4: CosineSimilarity + Rank
    │       → List<Chunk> topChunks (with Score, Embedding set)
    │       ↓
    │   Format: context string with [Source N: Title](Url)
    │       → return context string
    │
    └─ Final Answer
        └─ OpenRouterClient.StreamAsync(prompt with context)
            → stream deltas to Console

Interface Contracts

SearchTool → Progress:

// Invoked as: onProgress?.Invoke("[Fetching article 1/10: example.com]")
Action<string>? onProgress

StatusReporter ← Progress:

// Handler in OpenQueryApp:
(progress) => {
    if (options.Verbose) reporter.WriteLine(progress);
    else reporter.UpdateStatus(parsedShorterMessage);
}

SearchTool → ArticleService:

Article article = await ArticleService.FetchArticleAsync(url);

SearchTool → EmbeddingService:

(float[] queryEmbedding, float[][] chunkEmbeddings) = await ExecuteParallelEmbeddingsAsync(...);
// Also: embeddingService.GetEmbeddingAsync(text), GetEmbeddingsWithRateLimitAsync(...)

SearchTool → ChunkingService:

List<string> chunks = ChunkingService.ChunkText(article.TextContent);

SearchTool → RateLimiter:

await _rateLimiter.ExecuteAsync(async () => await _client.EmbedAsync(...), ct);

Next Steps