# Components Overview
Detailed documentation for each major component in the OpenQuery system.
## 📋 Table of Contents
1. [Component Hierarchy](#component-hierarchy)
2. [Core Components](#core-components)
3. [Services](#services)
4. [Data Models](#data-models)
5. [Component Interactions](#component-interactions)
## Component Hierarchy
```
OpenQuery/
├── Program.cs                  [Entry Point, CLI]
├── OpenQuery.cs                [OpenQueryApp - Orchestrator]
├── Tools/
│   └── SearchTool.cs           [Pipeline Orchestration]
├── Services/
│   ├── OpenRouterClient.cs     [LLM & Embedding API]
│   ├── SearxngClient.cs        [Search API]
│   ├── EmbeddingService.cs     [Embedding Generation + Math]
│   ├── ChunkingService.cs      [Text Splitting]
│   ├── ArticleService.cs       [Content Extraction]
│   ├── RateLimiter.cs          [Concurrency Control]
│   └── StatusReporter.cs       [Progress Display]
├── Models/
│   ├── OpenQueryOptions.cs     [CLI Options Record]
│   ├── Chunk.cs                [Content + Metadata]
│   ├── ParallelOptions.cs      [Concurrency Settings]
│   ├── OpenRouter.cs           [API DTOs]
│   ├── Searxng.cs              [Search Result DTOs]
│   └── JsonContexts.cs         [JSON Context]
└── ConfigManager.cs            [Configuration Persistence]
```
## Core Components
### 1. Program.cs
**Type**: Console Application Entry Point
**Responsibilities**: CLI parsing, dependency wiring, error handling
**Key Elements**:
- `RootCommand` from System.CommandLine
- Options: `--chunks`, `--results`, `--queries`, `--short`, `--long`, `--verbose`
- Subcommand: `configure` (with interactive mode)
- Configuration loading via `ConfigManager.Load()`
- Environment variable resolution
- Service instantiation and coordination
- Top-level try-catch for error reporting
**Code Flow**:
1. Load config file
2. Define CLI options and commands
3. Set handler for root command
4. Handler: resolve API key/model → instantiate services → call `OpenQueryApp.RunAsync()`
5. Set handler for configure command (writes config file)
6. Invoke command parser: `await rootCommand.InvokeAsync(args)`
**Exit Codes**:
- 0 = success
- 1 = error
### 2. OpenQueryApp (OpenQuery.cs)
**Type**: Main Application Class
**Responsibilities**: Workflow orchestration, query generation, answer streaming
**Constructor Parameters**:
- `OpenRouterClient client` - for query gen and final answer
- `SearchTool searchTool` - for search-retrieve-rank pipeline
- `string model` - LLM model identifier
**Main Method**: `RunAsync(OpenQueryOptions options)`
**Workflow Steps**:
1. Create `StatusReporter` (for progress UI)
2. **Optional Query Generation** (if `options.Queries > 1`):
   - Create a system message instructing JSON array output
   - Create a user message with `options.Question`
   - Call `client.CompleteAsync()` with the query generation model
   - Parse the JSON response; fall back to the original question on failure
   - Result: `List<string> queries` (one or many)
3. **Execute Search Pipeline**:
   - Call `_searchTool.ExecuteAsync()` with the queries and options
   - Receive `string context` (formatted context with source citations)
   - Progress is reported via callback to `StatusReporter`
4. **Generate Final Answer**:
   - Build the system prompt (append a "short" or "long" modifier)
   - Create a user message with `Context:\n{context}\n\nQuestion: {options.Question}`
   - Stream the answer via `client.StreamAsync()`
   - Write each `chunk.TextDelta` to the console as it arrives
   - Stop the spinner on the first chunk, then continue streaming
5. Dispose the reporter
**Error Handling**:
- Exceptions propagate to `Program.cs` top-level handler
- `HttpRequestException` (network/API failures) is distinguished from generic `Exception` when reporting
**Note**: Query generation uses the same model as final answer; could be separated for cost/performance.
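The parse-with-fallback in step 2 can be sketched as follows (a hypothetical helper; method and variable names are illustrative, not the actual implementation):

```csharp
// Hypothetical sketch of the query-generation fallback: the LLM is asked
// for a JSON array of queries; anything malformed falls back to the
// original question as the single query.
using System.Text.Json;

static List<string> ParseGeneratedQueries(string llmResponse, string originalQuestion)
{
    try
    {
        var queries = JsonSerializer.Deserialize<List<string>>(llmResponse);
        if (queries is { Count: > 0 })
            return queries;
    }
    catch (JsonException)
    {
        // Malformed output: fall through to the fallback below.
    }
    return new List<string> { originalQuestion };
}
```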
### 3. SearchTool (Tools/SearchTool.cs)
**Type**: Pipeline Orchestrator
**Responsibilities**: Execute 4-phase search-retrieve-rank-return workflow
**Constructor Parameters**:
- `SearxngClient searxngClient`
- `EmbeddingService embeddingService`
**Main Method**: `ExecuteAsync(originalQuery, generatedQueries, maxResults, topChunksLimit, onProgress, verbose)`
**Returns**: `Task<string>` - formatted context string with source citations
**Pipeline Phases**:
#### Phase 1: ExecuteParallelSearchesAsync
- Parallelize `searxngClient.SearchAsync(query, maxResults)` for each query
- Collect all results in `ConcurrentBag<SearxngResult>`
- Deduplicate by `DistinctBy(r => r.Url)`
**Output**: `List<SearxngResult>` (aggregated, unique)
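The aggregation and deduplication can be sketched as (result shape simplified to tuples for illustration; the real code collects `SearxngResult` records):

```csharp
// Sketch of Phase 1: parallel searches write into a thread-safe bag,
// then results are deduplicated by URL.
using System.Collections.Concurrent;
using System.Linq;

var allResults = new ConcurrentBag<(string Title, string Url)>();
// ... in the real code each parallel SearchAsync call adds its results here ...
allResults.Add(("A", "https://a.example"));
allResults.Add(("A again", "https://a.example"));   // duplicate URL from another query
allResults.Add(("B", "https://b.example"));

var unique = allResults.DistinctBy(r => r.Url).ToList();
// unique contains one entry per URL
```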
#### Phase 2: ExecuteParallelArticleFetchingAsync
- Semaphore: `MaxConcurrentArticleFetches` (default 10)
- For each `SearxngResult`: fetch URL via `ArticleService.FetchArticleAsync()`
- Extract article text, title
- Chunk via `ChunkingService.ChunkText(article.TextContent)`
- Add each chunk as new `Chunk(content, url, title)`
**Output**: `List<Chunk>` (potentially 50-100 chunks)
#### Phase 3: ExecuteParallelEmbeddingsAsync
- Start two parallel tasks:
  1. Query embedding: `embeddingService.GetEmbeddingAsync(originalQuery)`
  2. Chunk embeddings: `embeddingService.GetEmbeddingsWithRateLimitAsync(chunkTexts, onProgress)`
- `Parallel.ForEachAsync` with `MaxConcurrentEmbeddingRequests` (default 4)
- Batch size: 300 chunks per embedding API call
- Filter chunks with empty embeddings (failed batches)
**Output**: `(float[] queryEmbedding, float[][] chunkEmbeddings)`
#### Phase 4: RankAndSelectTopChunks
- Calculate cosine similarity for each chunk vs query
- Assign `chunk.Score`
- Order by descending score
- Take `topChunksLimit` (from `--chunks` option)
- Return `List<Chunk>` (top N)
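The ranking step can be sketched as follows (chunk shape simplified to tuples; `Cosine` stands in for `EmbeddingService.CosineSimilarity`):

```csharp
// Self-contained sketch of Phase 4: score each chunk against the query
// embedding, sort descending, take the top N.
using System.Linq;

float[] query = { 1f, 0f };
var chunks = new[]
{
    (Content: "about cats", Embedding: new float[] { 0f, 1f }),
    (Content: "about dogs", Embedding: new float[] { 1f, 0f }),
};
int topChunksLimit = 1;

var top = chunks
    .Select(c => (c.Content, Score: Cosine(query, c.Embedding)))
    .OrderByDescending(c => c.Score)
    .Take(topChunksLimit)
    .ToList();
// top[0].Content == "about dogs" (highest cosine similarity to the query)

static float Cosine(float[] a, float[] b)
{
    float dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
    return dot / (MathF.Sqrt(na) * MathF.Sqrt(nb));
}
```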
**Formatting**:
```csharp
string context = string.Join("\n\n", topChunks.Select((c, i) =>
    $"[Source {i + 1}: {c.Title ?? "Unknown"}]({c.SourceUrl})\n{c.Content}"));
```
**Progress Callbacks**: Invoked at each major step for UI feedback
## Services
### OpenRouterClient
**Purpose**: HTTP client for OpenRouter API (chat completions + embeddings)
**Base URL**: `https://openrouter.ai/api/v1`
**Authentication**: `Authorization: Bearer {apiKey}`
**Methods**:
#### `StreamAsync(ChatCompletionRequest request, CancellationToken)`
- Sets `request.Stream = true`
- POST to `/chat/completions`
- Reads SSE stream line-by-line
- Parses `data: {json}` chunks
- Yields `StreamChunk` (text delta or tool call)
- Supports cancellation
#### `CompleteAsync(ChatCompletionRequest request)`
- Sets `request.Stream = false`
- POST to `/chat/completions`
- Deserializes full response
- Returns `ChatCompletionResponse`
#### `EmbedAsync(string model, List<string> inputs)`
- POST to `/embeddings`
- Returns `float[][]` (ordered by input index)
**Error Handling**: `EnsureSuccessStatusCode()` throws `HttpRequestException` on failure
**Design**: Thin wrapper; no retry logic (delegated to EmbeddingService)
### SearxngClient
**Purpose**: HTTP client for SearxNG metasearch
**Base URL**: Configurable (default `http://localhost:8002`)
**Methods**:
#### `SearchAsync(string query, int limit = 10)`
- GET `{baseUrl}/search?q={query}&format=json`
- Deserializes to `SearxngRoot`
- Returns `Results.Take(limit).ToList()`
- On failure: returns empty `List<SearxngResult>` (no exception)
**Design**: Very simple; failures are tolerated (OpenQuery continues with other queries)
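The request URL the client builds can be sketched as (an assumed shape; note that the query string must be URL-escaped):

```csharp
// Building the SearxNG search URL; Uri.EscapeDataString handles spaces
// and other reserved characters in the user's query.
string baseUrl = "http://localhost:8002";
string query = "open source metasearch";
string url = $"{baseUrl}/search?q={Uri.EscapeDataString(query)}&format=json";
// → http://localhost:8002/search?q=open%20source%20metasearch&format=json
```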
### EmbeddingService
**Purpose**: Batch embedding generation with rate limiting, parallelization, and retries
**Configuration** (from `ParallelProcessingOptions`):
- `MaxConcurrentEmbeddingRequests` = 4
- `EmbeddingBatchSize` = 300
**Default Embedding Model**: `openai/text-embedding-3-small`
**Methods**:
#### `GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)`
- Splits `texts` into batches of `EmbeddingBatchSize`
- Parallelizes batches with `Parallel.ForEachAsync` + `MaxConcurrentEmbeddingRequests`
- Each batch: rate-limited + retry-wrapped `client.EmbedAsync(model, batch)`
- Collects results in order (by batch index)
- Returns `float[][]` (same order as input texts)
- Failed batches return empty `float[]` for each text
#### `GetEmbeddingAsync(string text, CancellationToken)`
- Wraps single-text call in rate limiter + retry
- Returns `float[]`
#### `CosineSimilarity(float[] v1, float[] v2)`
- Static method using `TensorPrimitives.CosineSimilarity`
- Returns float between -1 and 1 (typically 0-1 for normalized embeddings)
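Written out, the computation is equivalent to the following (a plain sketch of what `TensorPrimitives.CosineSimilarity` computes; the real service calls the library method directly):

```csharp
// Cosine similarity: dot product of the vectors divided by the product
// of their magnitudes. 1 = same direction, 0 = orthogonal, -1 = opposite.
static float CosineSimilarity(float[] v1, float[] v2)
{
    float dot = 0f, norm1 = 0f, norm2 = 0f;
    for (int i = 0; i < v1.Length; i++)
    {
        dot   += v1[i] * v2[i];
        norm1 += v1[i] * v1[i];
        norm2 += v2[i] * v2[i];
    }
    return dot / (MathF.Sqrt(norm1) * MathF.Sqrt(norm2));
}
```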
**Retry Policy** (Polly):
- Max 3 attempts
- 1s base delay, exponential backoff
- Only `HttpRequestException`
**Rate Limiting**: `RateLimiter` semaphore with `MaxConcurrentEmbeddingRequests`
**Design Notes**:
- Two similar methods (`GetEmbeddingsAsync` and `GetEmbeddingsWithRateLimitAsync`) - could be consolidated
- Uses Polly for resilience (good pattern)
- Concurrency control prevents overwhelming OpenRouter
### ChunkingService
**Purpose**: Split long text into manageable pieces
**Static Class** (no dependencies, pure function)
**Algorithm** (in `ChunkText(string text)`):
- Constant `MAX_CHUNK_SIZE = 500`
- While remaining text:
- Take up to 500 chars
- If not at end, backtrack to last `[' ', '\n', '\r', '.', '!']`
- Trim and add non-empty chunk
- Advance start position
**Rationale**: 500 chars is a sweet spot for embeddings - long enough for context, short enough for semantic coherence.
**Edge Cases**: Handles text shorter than 500 chars, empty text, text with no natural breaks.
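The loop described above can be sketched as (an assumed shape; the actual `ChunkingService` may differ in detail):

```csharp
// Sketch of the chunking algorithm: take up to maxChunkSize characters,
// backtrack to the last natural break if mid-text, trim, and advance.
static List<string> ChunkText(string text, int maxChunkSize = 500)
{
    var chunks = new List<string>();
    char[] breakChars = { ' ', '\n', '\r', '.', '!' };
    int start = 0;
    while (start < text.Length)
    {
        int length = Math.Min(maxChunkSize, text.Length - start);
        // If not at the end of the text, backtrack to the last break char.
        if (start + length < text.Length)
        {
            int lastBreak = text.LastIndexOfAny(breakChars, start + length - 1, length);
            if (lastBreak > start)
                length = lastBreak - start + 1;
        }
        string chunk = text.Substring(start, length).Trim();
        if (chunk.Length > 0)
            chunks.Add(chunk);   // drop chunks that are pure whitespace
        start += length;
    }
    return chunks;
}
```

With no break characters at all, the sketch falls back to hard cuts at the size limit, which matches the "no natural breaks" edge case above.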
### ArticleService
**Purpose**: Extract clean article content from URLs
**Method**: `FetchArticleAsync(string url)`
**Implementation**: Delegates to `SmartReader.ParseArticleAsync(url)`
**Returns**: `Article` object (from SmartReader)
- `Title` (string)
- `TextContent` (string) - cleaned article body
- `IsReadable` (bool) - quality indicator
- Other metadata (author, date, etc.)
**Error Handling**: Exceptions propagate (handled by `SearchTool`)
**Design**: Thin wrapper around third-party library. Could be extended to add caching, custom extraction rules, etc.
### RateLimiter
**Purpose**: Limit concurrent operations via semaphore
**Interface**:
```csharp
public Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken ct);
public Task ExecuteAsync(Func<Task> action, CancellationToken ct);
```
**Implementation**: `SemaphoreSlim` with `WaitAsync` and `Release`
**Disposal**: `IAsyncDisposable` (awaits semaphore disposal)
**Usage**: Wrap API calls that need concurrency control
```csharp
var result = await _rateLimiter.ExecuteAsync(async () =>
    await _client.EmbedAsync(model, batch), cancellationToken);
```
**Design**: Simple, reusable. Could be replaced with `Polly.RateLimiting` policy but this is lightweight.
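A plausible implementation matching the interface above looks like this (a sketch; details of the real class may differ):

```csharp
// SemaphoreSlim-backed rate limiter: at most maxConcurrency actions run
// at once; Release() in finally guarantees slots are returned on failure.
public sealed class RateLimiter : IAsyncDisposable
{
    private readonly SemaphoreSlim _semaphore;

    public RateLimiter(int maxConcurrency) =>
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken ct = default)
    {
        await _semaphore.WaitAsync(ct);
        try { return await action(); }
        finally { _semaphore.Release(); }
    }

    public async Task ExecuteAsync(Func<Task> action, CancellationToken ct = default)
    {
        await _semaphore.WaitAsync(ct);
        try { await action(); }
        finally { _semaphore.Release(); }
    }

    public ValueTask DisposeAsync()
    {
        _semaphore.Dispose();
        return ValueTask.CompletedTask;
    }
}
```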
### StatusReporter
**Purpose**: Real-time progress UI with spinner and verbose modes
**Architecture**:
- Producer: UpdateStatus(text) → writes to `Channel<string>`
- Consumer: Background task `ProcessStatusUpdatesAsync()` reads from channel
- Spinner: Separate task animates Braille characters every 100ms
**Modes**:
**Verbose Mode** (`_verbose = true`):
- All progress messages written as `Console.WriteLine()`
- No spinner
- Full audit trail
**Compact Mode** (default):
- Status line with spinner (overwrites same line)
- Only latest status visible
- Example: `⠋ Fetching articles 3/10...`
**Key Methods**:
- `UpdateStatus(message)` - fire-and-forget, non-blocking
- `WriteLine(text)` - stops spinner temporarily, writes full line
- `StartSpinner()` / `StopSpinner()` - manual control
- `ClearStatus()` - ANSI escape `\r\x1b[K` to clear line
- `Dispose()` - completes channel, waits for background tasks
**Spinner Chars**: `['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏']` (Braille patterns, smooth animation)
**ANSI Codes**: `\r` (carriage return), `\x1b[K` (erase to end of line)
**Thread Safety**: Channel is thread-safe; multiple components can write concurrently without locks
**Design**: Well-encapsulated; could be reused in other CLI projects.
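The producer/consumer core of this pattern can be sketched as (illustrative; the real `StatusReporter` adds the spinner and ANSI handling on top):

```csharp
// Minimal channel-based status pipeline: producers write without blocking,
// one background consumer drains updates in order.
using System.Threading.Channels;

var channel = Channel.CreateUnbounded<string>();
var seen = new List<string>();

// Consumer: background task drains status updates as they arrive.
var consumer = Task.Run(async () =>
{
    await foreach (var status in channel.Reader.ReadAllAsync())
        seen.Add(status);   // the real reporter renders this to the console
});

// Producers: fire-and-forget, safe from multiple threads without locks.
channel.Writer.TryWrite("Fetching articles 3/10...");
channel.Writer.TryWrite("Embedding chunks 120/300...");

channel.Writer.Complete();   // Dispose() completes the channel
await consumer;              // and waits for the consumer to finish
```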
### ConfigManager
**Purpose**: Load/save configuration from XDG-compliant location
**Config Path**:
- Resolved from `Environment.SpecialFolder.UserProfile` → `~/.config/openquery/config`
**Schema** (`AppConfig`):
```csharp
public class AppConfig
{
    public string ApiKey { get; set; } = "";
    public string Model { get; set; } = "qwen/qwen3.5-flash-02-23";
    public int DefaultQueries { get; set; } = 3;
    public int DefaultChunks { get; set; } = 3;
    public int DefaultResults { get; set; } = 5;
}
```
**Format**: Simple `key=value` (no INI parser, manual line split)
**Methods**:
- `Load()` → reads file if exists, returns `AppConfig` (with defaults)
- `Save(AppConfig)` → writes all 5 keys, overwrites existing
**Design**:
- Static class (no instances)
- Creates directory if missing
- No validation (writes whatever values given)
- Could be improved with JSON format (but keep simple)
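The `key=value` parsing can be sketched as (a hypothetical helper; `ConfigManager` maps the resulting values onto `AppConfig`):

```csharp
// Manual key=value parsing as described above: split on the first '=',
// trim both sides, skip blank or malformed lines.
static Dictionary<string, string> ParseConfig(string fileContents)
{
    var values = new Dictionary<string, string>();
    foreach (var line in fileContents.Split('\n'))
    {
        int eq = line.IndexOf('=');
        if (eq <= 0) continue;   // skip blank/malformed lines and empty keys
        values[line[..eq].Trim()] = line[(eq + 1)..].Trim();
    }
    return values;
}
```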
## Data Models
### OpenQueryOptions
**Location**: `Models/OpenQueryOptions.cs`
**Type**: `record`
**Purpose**: Immutable options object passed through workflow
**Properties**:
- `int Chunks` - top N chunks for context
- `int Results` - search results per query
- `int Queries` - number of expanded queries to generate
- `bool Short` - concise answer flag
- `bool Long` - detailed answer flag
- `bool Verbose` - verbose logging flag
- `string Question` - original user question
**Created**: In `Program.cs` from CLI options + config defaults
**Used By**: `OpenQueryApp.RunAsync()`
### Chunk
**Location**: `Models/Chunk.cs`
**Type**: `record`
**Purpose**: Content chunk with metadata and embedding
**Properties**:
- `string Content` - extracted text (~500 chars)
- `string SourceUrl` - article URL
- `string? Title` - article title (nullable)
- `float[]? Embedding` - vector embedding (populated by EmbeddingService)
- `float Score` - relevance score (populated during ranking)
**Lifecycle**:
1. Instantiated in `SearchTool.ExecuteParallelArticleFetchingAsync` with content, url, title
2. `Embedding` set in `ExecuteParallelEmbeddingsAsync` after batch processing
3. `Score` set in `RankAndSelectTopChunks` after cosine similarity
4. Serialized into context string for final answer
**Equality**: Records provide value equality (based on all properties)
### ParallelProcessingOptions
**Location**: `Models/ParallelOptions.cs`
**Type**: `class` (mutable)
**Purpose**: Concurrency settings for parallel operations
**Properties** (with defaults):
- `MaxConcurrentArticleFetches` = 10
- `MaxConcurrentEmbeddingRequests` = 4
- `EmbeddingBatchSize` = 300
**Used By**: `EmbeddingService` (for embeddings), `SearchTool` (for article fetching)
**Currently**: Hardcoded in `SearchTool` constructor; could be made configurable
### OpenRouter Models (Models/OpenRouter.cs)
**Purpose**: DTOs for OpenRouter API (JSON serializable)
**Chat Completion**:
- `ChatCompletionRequest` (model, messages, tools, stream)
- `ChatCompletionResponse` (choices[], usage)
- `Message` (role, content, tool_calls, tool_call_id)
- `ToolDefinition`, `ToolFunction`, `ToolCall`, `FunctionCall`
- `Choice`, `Usage`
**Embedding**:
- `EmbeddingRequest` (model, input[])
- `EmbeddingResponse` (data[], usage)
- `EmbeddingData` (embedding[], index)
**Streaming**:
- `StreamChunk` (TextDelta, Tool)
- `ChatCompletionChunk`, `ChunkChoice`, `ChunkDelta`
**JSON Properties**: Uses `[JsonPropertyName]` to match API
**Serialization**: System.Text.Json with source generation (AppJsonContext)
### Searxng Models (Models/Searxng.cs)
**Purpose**: DTOs for SearxNG search results
**Records**:
- `SearxngRoot` with `List<SearxngResult> Results`
- `SearxngResult` with `Title`, `Url`, `Content` (snippet)
**Usage**: Deserialized from SearxNG's JSON response
### JsonContexts
**Location**: `Models/JsonContexts.cs`
**Purpose**: Source-generated JSON serializer context for AOT compatibility
**Pattern**:
```csharp
[JsonSerializable(typeof(ChatCompletionRequest))]
[JsonSerializable(typeof(ChatCompletionResponse))]
// ... etc ...
internal partial class AppJsonContext : JsonSerializerContext
{
}
```
**Generated**: Partial class compiled by source generator
**Used By**: All `JsonSerializer.Serialize/Deserialize` calls with `AppJsonContext.Default.{Type}`
**Benefits**:
- AOT-compatible (no reflection)
- Faster serialization (compiled delegates)
- Smaller binary (trimming-safe)
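A minimal self-contained illustration of the pattern (the `SearchHit` record and `DemoJsonContext` are illustrative, not from the codebase):

```csharp
// Source-generated serialization: passing the generated type-info instead
// of typeof(T) means no reflection at runtime (AOT- and trimming-safe).
using System.Text.Json;
using System.Text.Json.Serialization;

var hit = new SearchHit("OpenQuery", "https://example.com");
string json = JsonSerializer.Serialize(hit, DemoJsonContext.Default.SearchHit);
Console.WriteLine(json);

public record SearchHit(string Title, string Url);

[JsonSerializable(typeof(SearchHit))]
internal partial class DemoJsonContext : JsonSerializerContext { }
```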
## Component Interactions
### Dependencies Graph
```
Program.cs
├── ConfigManager (load/save)
├── OpenRouterClient ──┐
├── SearxngClient ─────┤
├── EmbeddingService ──┤
└── SearchTool ────────┤
                       │
OpenQueryApp ◄─────────┘
├── OpenRouterClient (query gen + answer streaming)
├── SearchTool (pipeline)
│   ├── SearxngClient (searches)
│   ├── ArticleService (fetch)
│   ├── ChunkingService (split)
│   ├── EmbeddingService (embeddings)
│   ├── RateLimiter (concurrency)
│   └── StatusReporter (progress via callback)
└── StatusReporter (UI)
```
### Data Flow Between Components
```
OpenQueryOptions
      ↓
OpenQueryApp
├─ Query Generation
│  └─ OpenRouterClient.CompleteAsync()
│     → List<string> generatedQueries
├─ Search Pipeline
│  └─ SearchTool.ExecuteAsync(originalQuery, generatedQueries, ...)
│     ↓
│     Phase 1: SearxngClient.SearchAsync(query) × N
│       → ConcurrentBag<SearxngResult>
│       → List<SearxngResult> (unique)
│     ↓
│     Phase 2: ArticleService.FetchArticleAsync(url) × M
│       → ChunkingService.ChunkText(article.TextContent)
│       → ConcurrentBag<Chunk> (content, url, title)
│     ↓
│     Phase 3: EmbeddingService.GetEmbeddingsAsync(chunkContents)
│       → (queryEmbedding, chunkEmbeddings)
│     ↓
│     Phase 4: CosineSimilarity + Rank
│       → List<Chunk> topChunks (with Score, Embedding set)
│     ↓
│     Format: context string with [Source N: Title](Url)
│       → return context string
└─ Final Answer
   └─ OpenRouterClient.StreamAsync(prompt with context)
      → stream deltas to Console
### Interface Contracts
**SearchTool → Progress**:
```csharp
// Invoked as: onProgress?.Invoke("[Fetching article 1/10: example.com]")
Action<string>? onProgress
```
**StatusReporter ← Progress**:
```csharp
// Handler in OpenQueryApp:
(progress) => {
    if (options.Verbose) reporter.WriteLine(progress);
    else reporter.UpdateStatus(parsedShorterMessage);
}
```
**SearchTool → ArticleService**:
```csharp
Article article = await ArticleService.FetchArticleAsync(url);
```
**SearchTool → EmbeddingService**:
```csharp
(float[] queryEmbedding, float[][] chunkEmbeddings) = await ExecuteParallelEmbeddingsAsync(...);
// Also: embeddingService.GetEmbeddingAsync(text), GetEmbeddingsWithRateLimitAsync(...)
```
**SearchTool → ChunkingService**:
```csharp
List<string> chunks = ChunkingService.ChunkText(article.TextContent);
```
**SearchTool → RateLimiter**:
```csharp
await _rateLimiter.ExecuteAsync(async () => await _client.EmbedAsync(...), ct);
```
---
## Next Steps
- [OpenQueryApp](openquery-app.md) - Main orchestrator details
- [SearchTool](search-tool.md) - Pipeline implementation
- [Services](services.md) - All service classes documented
- [Models](models.md) - Complete data model reference