# Components Overview

Detailed documentation for each major component in the OpenQuery system.

## 📋 Table of Contents

1. [Component Hierarchy](#component-hierarchy)
2. [Core Components](#core-components)
3. [Services](#services)
4. [Data Models](#data-models)
5. [Component Interactions](#component-interactions)

## Component Hierarchy

```
OpenQuery/
├── Program.cs                  [Entry Point, CLI]
├── OpenQuery.cs                [OpenQueryApp - Orchestrator]
├── Tools/
│   └── SearchTool.cs           [Pipeline Orchestration]
├── Services/
│   ├── OpenRouterClient.cs     [LLM & Embedding API]
│   ├── SearxngClient.cs        [Search API]
│   ├── EmbeddingService.cs     [Embedding Generation + Math]
│   ├── ChunkingService.cs      [Text Splitting]
│   ├── ArticleService.cs       [Content Extraction]
│   ├── RateLimiter.cs          [Concurrency Control]
│   └── StatusReporter.cs       [Progress Display]
├── Models/
│   ├── OpenQueryOptions.cs     [CLI Options Record]
│   ├── Chunk.cs                [Content + Metadata]
│   ├── ParallelOptions.cs      [Concurrency Settings]
│   ├── OpenRouter.cs           [API DTOs]
│   ├── Searxng.cs              [Search Result DTOs]
│   └── JsonContexts.cs         [JSON Context]
└── ConfigManager.cs            [Configuration Persistence]
```

## Core Components

### 1. Program.cs

**Type**: Console Application Entry Point

**Responsibilities**: CLI parsing, dependency wiring, error handling

**Key Elements**:

- `RootCommand` from System.CommandLine
- Options: `--chunks`, `--results`, `--queries`, `--short`, `--long`, `--verbose`
- Subcommand: `configure` (with interactive mode)
- Configuration loading via `ConfigManager.Load()`
- Environment variable resolution
- Service instantiation and coordination
- Top-level try-catch for error reporting

**Code Flow**:

1. Load the config file
2. Define CLI options and commands
3. Set the handler for the root command
4. Handler: resolve API key/model → instantiate services → call `OpenQueryApp.RunAsync()`
5. Set the handler for the `configure` command (writes the config file)
6. Invoke the command parser: `await rootCommand.InvokeAsync(args)`

**Exit Codes**:

- 0 = success
- 1 = error

### 2. OpenQueryApp (OpenQuery.cs)

**Type**: Main Application Class

**Responsibilities**: Workflow orchestration, query generation, answer streaming

**Constructor Parameters**:

- `OpenRouterClient client` - for query generation and the final answer
- `SearchTool searchTool` - for the search-retrieve-rank pipeline
- `string model` - LLM model identifier

**Main Method**: `RunAsync(OpenQueryOptions options)`

**Workflow Steps**:

1. Create a `StatusReporter` (for the progress UI)
2. **Optional Query Generation** (if `options.Queries > 1`):
   - Create a system message instructing JSON array output
   - Create a user message with `options.Question`
   - Call `client.CompleteAsync()` with the query generation model
   - Parse the JSON response; fall back to the original question on failure
   - Result: `List<string> queries` (1 or many)
3. **Execute Search Pipeline**:
   - Call `_searchTool.ExecuteAsync()` with the queries and options
   - Receive `string context` (formatted context with source citations)
   - Progress reported via callback to `StatusReporter`
4. **Generate Final Answer**:
   - Build the system prompt (append a "short" or "long" modifier)
   - Create a user message with `Context:\n{context}\n\nQuestion: {options.Question}`
   - Stream the answer via `client.StreamAsync()`
   - Write each `chunk.TextDelta` to the console as it arrives
   - Stop the spinner on the first chunk, then continue streaming
5. Dispose the reporter

**Error Handling**:

- Exceptions propagate to the `Program.cs` top-level handler
- `HttpRequestException` vs generic `Exception`

**Note**: Query generation uses the same model as the final answer; the two could be separated for cost/performance.

### 3. SearchTool (Tools/SearchTool.cs)

**Type**: Pipeline Orchestrator

**Responsibilities**: Execute the 4-phase search-retrieve-rank-return workflow

**Constructor Parameters**:

- `SearxngClient searxngClient`
- `EmbeddingService embeddingService`

**Main Method**: `ExecuteAsync(originalQuery, generatedQueries, maxResults, topChunksLimit, onProgress, verbose)`

**Returns**: `Task<string>` - formatted context string with source citations

**Pipeline Phases**:

#### Phase 1: ExecuteParallelSearchesAsync

- Parallelize `searxngClient.SearchAsync(query, maxResults)` for each query
- Collect all results in a `ConcurrentBag<SearxngResult>`
- Deduplicate by `DistinctBy(r => r.Url)`

**Output**: `List<SearxngResult>` (aggregated, unique)

#### Phase 2: ExecuteParallelArticleFetchingAsync

- Semaphore: `MaxConcurrentArticleFetches` (default 10)
- For each `SearxngResult`: fetch the URL via `ArticleService.FetchArticleAsync()`
- Extract the article text and title
- Chunk via `ChunkingService.ChunkText(article.TextContent)`
- Add each chunk as a new `Chunk(content, url, title)`

**Output**: `List<Chunk>` (potentially 50-100 chunks)

#### Phase 3: ExecuteParallelEmbeddingsAsync

- Start two parallel tasks:
  1. Query embedding: `embeddingService.GetEmbeddingAsync(originalQuery)`
  2. Chunk embeddings: `embeddingService.GetEmbeddingsWithRateLimitAsync(chunkTexts, onProgress)`
     - `Parallel.ForEachAsync` with `MaxConcurrentEmbeddingRequests` (default 4)
     - Batch size: 300 chunks per embedding API call
- Filter out chunks with empty embeddings (failed batches)

**Output**: `(float[] queryEmbedding, float[][] chunkEmbeddings)`

#### Phase 4: RankAndSelectTopChunks

- Calculate the cosine similarity between each chunk and the query
- Assign `chunk.Score`
- Order by descending score
- Take `topChunksLimit` (from the `--chunks` option)
- Return `List<Chunk>` (top N)

**Formatting**:

```csharp
string context = string.Join("\n\n", topChunks.Select((c, i) =>
    $"[Source {i+1}: {c.Title ?? "Unknown"}]({c.SourceUrl})\n{c.Content}"));
```

**Progress Callbacks**: Invoked at each major step for UI feedback

## Services

### OpenRouterClient

**Purpose**: HTTP client for the OpenRouter API (chat completions + embeddings)

**Base URL**: `https://openrouter.ai/api/v1`

**Authentication**: `Authorization: Bearer {apiKey}`

**Methods**:

#### `StreamAsync(ChatCompletionRequest request, CancellationToken)`

- Sets `request.Stream = true`
- POST to `/chat/completions`
- Reads the SSE stream line-by-line
- Parses `data: {json}` chunks
- Yields `StreamChunk` (text delta or tool call)
- Supports cancellation

#### `CompleteAsync(ChatCompletionRequest request)`

- Sets `request.Stream = false`
- POST to `/chat/completions`
- Deserializes the full response
- Returns `ChatCompletionResponse`

#### `EmbedAsync(string model, List<string> inputs)`

- POST to `/embeddings`
- Returns `float[][]` (ordered by input index)

**Error Handling**: `EnsureSuccessStatusCode()` throws `HttpRequestException` on failure

**Design**: Thin wrapper; no retry logic (delegated to EmbeddingService)

### SearxngClient

**Purpose**: HTTP client for SearxNG metasearch

**Base URL**: Configurable (default `http://localhost:8002`)

**Methods**:

#### `SearchAsync(string query, int limit = 10)`

- GET `{baseUrl}/search?q={query}&format=json`
- Deserializes to `SearxngRoot`
- Returns `Results.Take(limit).ToList()`
- On failure: returns an empty `List<SearxngResult>` (no exception)

**Design**: Very simple; failures are tolerated (OpenQuery continues with the other queries)

### EmbeddingService

**Purpose**: Batch embedding generation with rate limiting, parallelization, and retries

**Configuration** (from `ParallelProcessingOptions`):

- `MaxConcurrentEmbeddingRequests` = 4
- `EmbeddingBatchSize` = 300

**Default Embedding Model**: `openai/text-embedding-3-small`

**Methods**:

#### `GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)`

- Splits `texts` into batches of `EmbeddingBatchSize`
- Parallelizes batches with `Parallel.ForEachAsync` + `MaxConcurrentEmbeddingRequests`
- Each batch: rate-limited + retry-wrapped `client.EmbedAsync(model, batch)`
- Collects results in order (by batch index)
- Returns `float[][]` (same order as the input texts)
- Failed batches return an empty `float[]` for each text

#### `GetEmbeddingAsync(string text, CancellationToken)`

- Wraps a single-text call in the rate limiter + retry
- Returns `float[]`

#### `CosineSimilarity(float[] v1, float[] v2)`

- Static method using `TensorPrimitives.CosineSimilarity`
- Returns a float between -1 and 1 (typically 0-1 for normalized embeddings)

**Retry Policy** (Polly):

- Max 3 attempts
- 1s base delay, exponential backoff
- Only `HttpRequestException`

**Rate Limiting**: `RateLimiter` semaphore with `MaxConcurrentEmbeddingRequests`

**Design Notes**:

- Two similar methods (`GetEmbeddingsAsync` and `GetEmbeddingsWithRateLimitAsync`) - could be consolidated
- Uses Polly for resilience (good pattern)
- Concurrency control prevents overwhelming OpenRouter

### ChunkingService

**Purpose**: Split long text into manageable pieces

**Static Class** (no dependencies, pure function)

**Algorithm** (in `ChunkText(string text)`):

- Constant `MAX_CHUNK_SIZE = 500`
- While text remains:
  - Take up to 500 chars
  - If not at the end, backtrack to the last `[' ', '\n', '\r', '.', '!']`
  - Trim and add the non-empty chunk
  - Advance the start position

**Rationale**: 500 chars is a sweet spot for embeddings - long enough for context, short enough for semantic coherence.

**Edge Cases**: Handles text shorter than 500 chars, empty text, and text with no natural breaks.
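The loop described above can be sketched as a pure function. This is a minimal reconstruction for illustration, not the exact source; the class name `ChunkingSketch` is invented here, while the 500-char limit and break-character set follow the description above:

```csharp
using System;
using System.Collections.Generic;

public static class ChunkingSketch
{
    private const int MaxChunkSize = 500;
    private static readonly char[] BreakChars = { ' ', '\n', '\r', '.', '!' };

    public static List<string> ChunkText(string text)
    {
        var chunks = new List<string>();
        if (string.IsNullOrWhiteSpace(text)) return chunks;

        int start = 0;
        while (start < text.Length)
        {
            int length = Math.Min(MaxChunkSize, text.Length - start);

            // If we are not at the end of the text, backtrack to the last
            // natural break so words/sentences are not cut mid-way.
            if (start + length < text.Length)
            {
                int lastBreak = text.LastIndexOfAny(BreakChars, start + length - 1, length);
                if (lastBreak > start) length = lastBreak - start + 1;
            }

            var chunk = text.Substring(start, length).Trim();
            if (chunk.Length > 0) chunks.Add(chunk);
            start += length;
        }
        return chunks;
    }
}
```

Note the `lastBreak > start` guard: when no break character is found (or only at the very start of the window), the full 500-char window is taken, which guarantees forward progress on text with no natural breaks.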
### ArticleService

**Purpose**: Extract clean article content from URLs

**Method**: `FetchArticleAsync(string url)`

**Implementation**: Delegates to `SmartReader.ParseArticleAsync(url)`

**Returns**: `Article` object (from SmartReader)

- `Title` (string)
- `TextContent` (string) - cleaned article body
- `IsReadable` (bool) - quality indicator
- Other metadata (author, date, etc.)

**Error Handling**: Exceptions propagate (handled by `SearchTool`)

**Design**: Thin wrapper around a third-party library. Could be extended with caching, custom extraction rules, etc.

### RateLimiter

**Purpose**: Limit concurrent operations via a semaphore

**Interface**:

```csharp
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken ct);
public async Task ExecuteAsync(Func<Task> action, CancellationToken ct);
```

**Implementation**: `SemaphoreSlim` with `WaitAsync` and `Release`

**Disposal**: `IAsyncDisposable` (awaits semaphore disposal)

**Usage**: Wrap API calls that need concurrency control

```csharp
var result = await _rateLimiter.ExecuteAsync(
    async () => await _client.EmbedAsync(model, batch),
    cancellationToken);
```

**Design**: Simple and reusable. Could be replaced with a `Polly.RateLimiting` policy, but this is lightweight.
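The semaphore pattern can be sketched as follows. This is an illustrative reconstruction under the description above, not the real class: only the generic overload is shown, the name `RateLimiterSketch` is invented, and the real class implements `IAsyncDisposable` rather than `IDisposable`:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Semaphore-gated executor: at most maxConcurrency actions run at once.
public sealed class RateLimiterSketch : IDisposable
{
    private readonly SemaphoreSlim _semaphore;

    public RateLimiterSketch(int maxConcurrency) =>
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken ct = default)
    {
        await _semaphore.WaitAsync(ct);   // waits when all slots are in use
        try { return await action(); }
        finally { _semaphore.Release(); } // always free the slot, even on failure
    }

    public void Dispose() => _semaphore.Dispose();
}
```

Wrapping each `EmbedAsync` call this way caps in-flight requests at `MaxConcurrentEmbeddingRequests`, regardless of how many batches `Parallel.ForEachAsync` schedules.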
### StatusReporter

**Purpose**: Real-time progress UI with spinner and verbose modes

**Architecture**:

- Producer: `UpdateStatus(text)` → writes to a `Channel<string>`
- Consumer: Background task `ProcessStatusUpdatesAsync()` reads from the channel
- Spinner: Separate task animates Braille characters every 100ms

**Modes**:

**Verbose Mode** (`_verbose = true`):

- All progress messages written via `Console.WriteLine()`
- No spinner
- Full audit trail

**Compact Mode** (default):

- Status line with spinner (overwrites the same line)
- Only the latest status is visible
- Example: `⠋ Fetching articles 3/10...`

**Key Methods**:

- `UpdateStatus(message)` - fire-and-forget, non-blocking
- `WriteLine(text)` - stops the spinner temporarily, writes a full line
- `StartSpinner()` / `StopSpinner()` - manual control
- `ClearStatus()` - ANSI escape `\r\x1b[K` to clear the line
- `Dispose()` - completes the channel, waits for the background tasks

**Spinner Chars**: `['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏']` (Braille patterns, smooth animation)

**ANSI Codes**: `\r` (carriage return), `\x1b[K` (erase to end of line)

**Thread Safety**: The channel is thread-safe; multiple components can write concurrently without locks

**Design**: Well-encapsulated; could be reused in other CLI projects.
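The producer/consumer core can be modeled with `System.Threading.Channels`. This is a simplified sketch that collects updates into a list instead of rendering a spinner to the console; the class and member names are illustrative, not the real API:

```csharp
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Simplified model of the StatusReporter pipeline: UpdateStatus is a
// non-blocking producer; a background task consumes and renders updates.
public sealed class StatusPipeSketch
{
    private readonly Channel<string> _channel = Channel.CreateUnbounded<string>();
    private readonly Task _consumer;

    // Stand-in for the console status line; the real reporter overwrites
    // one line with "\r\x1b[K" instead of accumulating.
    public List<string> Rendered { get; } = new();

    public StatusPipeSketch() => _consumer = ConsumeAsync();

    // Fire-and-forget: TryWrite never blocks on an unbounded channel.
    public void UpdateStatus(string message) => _channel.Writer.TryWrite(message);

    private async Task ConsumeAsync()
    {
        await foreach (var msg in _channel.Reader.ReadAllAsync())
            Rendered.Add(msg);
    }

    public async Task CompleteAsync()
    {
        _channel.Writer.Complete(); // stop the consumer once drained
        await _consumer;
    }
}
```

Because `TryWrite` is lock-free and thread-safe, any component can report progress concurrently while a single consumer serializes all console writes, which is what makes the spinner line safe to overwrite.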
### ConfigManager

**Purpose**: Load/save configuration from an XDG-compliant location

**Config Path**:

- `Environment.SpecialFolder.UserProfile` → `~/.config/openquery/config`

**Schema** (`AppConfig`):

```csharp
public class AppConfig
{
    public string ApiKey { get; set; } = "";
    public string Model { get; set; } = "qwen/qwen3.5-flash-02-23";
    public int DefaultQueries { get; set; } = 3;
    public int DefaultChunks { get; set; } = 3;
    public int DefaultResults { get; set; } = 5;
}
```

**Format**: Simple `key=value` (no INI parser, manual line split)

**Methods**:

- `Load()` → reads the file if it exists, returns an `AppConfig` (with defaults)
- `Save(AppConfig)` → writes all 5 keys, overwriting the existing file

**Design**:

- Static class (no instances)
- Creates the directory if missing
- No validation (writes whatever values are given)
- Could be improved with a JSON format (but keep it simple)

## Data Models

### OpenQueryOptions

**Location**: `Models/OpenQueryOptions.cs`

**Type**: `record`

**Purpose**: Immutable options object passed through the workflow

**Properties**:

- `int Chunks` - top N chunks for context
- `int Results` - search results per query
- `int Queries` - number of expanded queries to generate
- `bool Short` - concise answer flag
- `bool Long` - detailed answer flag
- `bool Verbose` - verbose logging flag
- `string Question` - original user question

**Created**: In `Program.cs` from CLI options + config defaults

**Used By**: `OpenQueryApp.RunAsync()`

### Chunk

**Location**: `Models/Chunk.cs`

**Type**: `record`

**Purpose**: Content chunk with metadata and embedding

**Properties**:

- `string Content` - extracted text (~500 chars)
- `string SourceUrl` - article URL
- `string? Title` - article title (nullable)
- `float[]? Embedding` - vector embedding (populated by EmbeddingService)
- `float Score` - relevance score (populated during ranking)

**Lifecycle**:

1. Instantiated in `SearchTool.ExecuteParallelArticleFetchingAsync` with content, URL, and title
2. `Embedding` set in `ExecuteParallelEmbeddingsAsync` after batch processing
3. `Score` set in `RankAndSelectTopChunks` after cosine similarity
4. Serialized into the context string for the final answer

**Equality**: Records provide value equality (based on all properties)

### ParallelProcessingOptions

**Location**: `Models/ParallelOptions.cs`

**Type**: `class` (mutable)

**Purpose**: Concurrency settings for parallel operations

**Properties** (with defaults):

- `MaxConcurrentArticleFetches` = 10
- `MaxConcurrentEmbeddingRequests` = 4
- `EmbeddingBatchSize` = 300

**Used By**: `EmbeddingService` (for embeddings), `SearchTool` (for article fetching)

**Currently**: Hardcoded in the `SearchTool` constructor; could be made configurable

### OpenRouter Models (Models/OpenRouter.cs)

**Purpose**: DTOs for the OpenRouter API (JSON serializable)

**Chat Completion**:

- `ChatCompletionRequest` (model, messages, tools, stream)
- `ChatCompletionResponse` (choices[], usage)
- `Message` (role, content, tool_calls, tool_call_id)
- `ToolDefinition`, `ToolFunction`, `ToolCall`, `FunctionCall`
- `Choice`, `Usage`

**Embedding**:

- `EmbeddingRequest` (model, input[])
- `EmbeddingResponse` (data[], usage)
- `EmbeddingData` (embedding[], index)

**Streaming**:

- `StreamChunk` (TextDelta, Tool)
- `ChatCompletionChunk`, `ChunkChoice`, `ChunkDelta`

**JSON Properties**: Uses `[JsonPropertyName]` to match the API

**Serialization**: System.Text.Json with source generation (AppJsonContext)

### Searxng Models (Models/Searxng.cs)

**Purpose**: DTOs for SearxNG search results

**Records**:

- `SearxngRoot` with `List<SearxngResult> Results`
- `SearxngResult` with `Title`, `Url`, `Content` (snippet)

**Usage**: Deserialized from SearxNG's JSON response

### JsonContexts

**Location**: `Models/JsonContexts.cs`

**Purpose**: Source-generated JSON serializer context for AOT compatibility

**Pattern**:

```csharp
[JsonSerializable(typeof(ChatCompletionRequest))]
[JsonSerializable(typeof(ChatCompletionResponse))]
// ... etc ...
internal partial class AppJsonContext : JsonSerializerContext { }
```

**Generated**: Partial class compiled by the source generator

**Used By**: All `JsonSerializer.Serialize/Deserialize` calls with `AppJsonContext.Default.{Type}`

**Benefits**:

- AOT-compatible (no reflection)
- Faster serialization (compiled delegates)
- Smaller binary (trimming-safe)

## Component Interactions

### Dependencies Graph

```
Program.cs
├── ConfigManager (load/save)
├── OpenRouterClient ──┐
├── SearxngClient ─────┤
├── EmbeddingService ──┤
└── SearchTool ────────┤
                       │
OpenQueryApp ◄─────────┘
│
├── OpenRouterClient (query gen + answer streaming)
├── SearchTool (pipeline)
│   ├── SearxngClient (searches)
│   ├── ArticleService (fetch)
│   ├── ChunkingService (split)
│   ├── EmbeddingService (embeddings)
│   ├── RateLimiter (concurrency)
│   └── StatusReporter (progress via callback)
└── StatusReporter (UI)
```

### Data Flow Between Components

```
OpenQueryOptions
   ↓
OpenQueryApp
├─ Query Generation
│  └─ OpenRouterClient.CompleteAsync()
│     → List<string> generatedQueries
│
├─ Search Pipeline
│  └─ SearchTool.ExecuteAsync(originalQuery, generatedQueries, ...)
│     ↓
│     Phase 1: SearxngClient.SearchAsync(query) × N
│       → ConcurrentBag<SearxngResult>
│       → List<SearxngResult> (unique)
│     ↓
│     Phase 2: ArticleService.FetchArticleAsync(url) × M
│       → ChunkingService.ChunkText(article.TextContent)
│       → ConcurrentBag<Chunk> (content, url, title)
│     ↓
│     Phase 3: EmbeddingService.GetEmbeddingsAsync(chunkContents)
│       → (queryEmbedding, chunkEmbeddings)
│     ↓
│     Phase 4: CosineSimilarity + Rank
│       → List<Chunk> topChunks (with Score, Embedding set)
│     ↓
│     Format: context string with [Source N: Title](Url)
│     → return context string
│
└─ Final Answer
   └─ OpenRouterClient.StreamAsync(prompt with context)
      → stream deltas to Console
```

### Interface Contracts

**SearchTool → Progress**:

```csharp
// Invoked as: onProgress?.Invoke("[Fetching article 1/10: example.com]")
Action<string>? onProgress
```

**StatusReporter ← Progress**:

```csharp
// Handler in OpenQueryApp:
(progress) =>
{
    if (options.Verbose) reporter.WriteLine(progress);
    else reporter.UpdateStatus(parsedShorterMessage);
}
```

**SearchTool → ArticleService**:

```csharp
Article article = await ArticleService.FetchArticleAsync(url);
```

**SearchTool → EmbeddingService**:

```csharp
(float[] queryEmbedding, float[][] chunkEmbeddings) = await ExecuteParallelEmbeddingsAsync(...);
// Also: embeddingService.GetEmbeddingAsync(text), GetEmbeddingsWithRateLimitAsync(...)
```

**SearchTool → ChunkingService**:

```csharp
List<string> chunks = ChunkingService.ChunkText(article.TextContent);
```

**SearchTool → RateLimiter**:

```csharp
await _rateLimiter.ExecuteAsync(async () => await _client.EmbedAsync(...), ct);
```

---

## Next Steps

- [OpenQueryApp](openquery-app.md) - Main orchestrator details
- [SearchTool](search-tool.md) - Pipeline implementation
- [Services](services.md) - All service classes documented
- [Models](models.md) - Complete data model reference