# Components Overview
Detailed documentation for each major component in the OpenQuery system.
## Component Hierarchy

```
OpenQuery/
├── Program.cs              [Entry Point, CLI]
├── OpenQuery.cs            [OpenQueryApp - Orchestrator]
├── Tools/
│   └── SearchTool.cs       [Pipeline Orchestration]
├── Services/
│   ├── OpenRouterClient.cs [LLM & Embedding API]
│   ├── SearxngClient.cs    [Search API]
│   ├── EmbeddingService.cs [Embedding Generation + Math]
│   ├── ChunkingService.cs  [Text Splitting]
│   ├── ArticleService.cs   [Content Extraction]
│   ├── RateLimiter.cs      [Concurrency Control]
│   └── StatusReporter.cs   [Progress Display]
├── Models/
│   ├── OpenQueryOptions.cs [CLI Options Record]
│   ├── Chunk.cs            [Content + Metadata]
│   ├── ParallelOptions.cs  [Concurrency Settings]
│   ├── OpenRouter.cs       [API DTOs]
│   ├── Searxng.cs          [Search Result DTOs]
│   └── JsonContexts.cs     [JSON Context]
└── ConfigManager.cs        [Configuration Persistence]
```
## Core Components
### 1. Program.cs

**Type:** Console Application Entry Point

**Responsibilities:** CLI parsing, dependency wiring, error handling

**Key Elements:**
- `RootCommand` from `System.CommandLine`
- Options: `--chunks`, `--results`, `--queries`, `--short`, `--long`, `--verbose`
- Subcommand: `configure` (with interactive mode)
- Configuration loading via `ConfigManager.Load()`
- Environment variable resolution
- Service instantiation and coordination
- Top-level try-catch for error reporting

**Code Flow:**
1. Load config file
2. Define CLI options and commands
3. Set handler for root command
4. Handler: resolve API key/model → instantiate services → call `OpenQueryApp.RunAsync()`
5. Set handler for `configure` command (writes config file)
6. Invoke command parser: `await rootCommand.InvokeAsync(args)`

**Exit Codes:**
- `0` = success
- `1` = error
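The wiring above can be sketched with System.CommandLine. This is an illustrative sketch only, assuming the `System.CommandLine` 2.0.0-beta4 API; the option defaults and handler body here are placeholders, not the project's actual code.

```csharp
using System.CommandLine;

// Two of the documented options, with assumed defaults for illustration.
var chunksOption = new Option<int>("--chunks", () => 3, "Top N chunks for context");
var verboseOption = new Option<bool>("--verbose", "Verbose logging");

var rootCommand = new RootCommand("OpenQuery: search-augmented question answering");
rootCommand.AddOption(chunksOption);
rootCommand.AddOption(verboseOption);

// Handler: in the real Program.cs this resolves the API key/model,
// instantiates services, and calls OpenQueryApp.RunAsync().
rootCommand.SetHandler(async (int chunks, bool verbose) =>
{
    await Task.CompletedTask; // placeholder for service wiring + RunAsync
}, chunksOption, verboseOption);

return await rootCommand.InvokeAsync(args);
```

An unhandled exception in the handler would bubble up to the top-level try-catch, which maps success to exit code 0 and errors to 1.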
### 2. OpenQueryApp (OpenQuery.cs)

**Type:** Main Application Class

**Responsibilities:** Workflow orchestration, query generation, answer streaming

**Constructor Parameters:**
- `OpenRouterClient client` - for query generation and the final answer
- `SearchTool searchTool` - for the search-retrieve-rank pipeline
- `string model` - LLM model identifier

**Main Method:** `RunAsync(OpenQueryOptions options)`

**Workflow Steps:**
1. Create `StatusReporter` (for progress UI)
2. Optional query generation (if `options.Queries > 1`):
   - Create system message instructing JSON array output
   - Create user message with `options.Question`
   - Call `client.CompleteAsync()` with the query-gen model
   - Parse JSON response; fall back to the original question on failure
   - Result: `List<string> queries` (1 or many)
3. Execute search pipeline:
   - Call `_searchTool.ExecuteAsync()` with queries and options
   - Receive `string context` (formatted context with source citations)
   - Progress reported via callback to `StatusReporter`
4. Generate final answer:
   - Build system prompt (append "short" or "long" modifier)
   - Create user message with `Context:\n{context}\n\nQuestion: {options.Question}`
   - Stream answer via `client.StreamAsync()`
   - Write each `chunk.TextDelta` to the console as it arrives
   - Stop the spinner on the first chunk, then continue streaming
5. Dispose the reporter

**Error Handling:**
- Exceptions propagate to the `Program.cs` top-level handler, which distinguishes `HttpRequestException` from a generic `Exception`

**Note:** Query generation uses the same model as the final answer; the two could be separated for cost/performance.
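The parse-then-fall-back step can be sketched as below. This is a hypothetical helper, not the actual implementation; the real code uses the source-generated `AppJsonContext` rather than reflection-based deserialization.

```csharp
using System.Text.Json;

static List<string> ParseQueries(string llmReply, string originalQuestion)
{
    try
    {
        // The system prompt asks the model for a JSON array of strings.
        var queries = JsonSerializer.Deserialize<List<string>>(llmReply);
        if (queries is { Count: > 0 })
            return queries;
    }
    catch (JsonException)
    {
        // Malformed JSON from the model: ignore and fall through.
    }
    // Fallback: search with the user's original question only.
    return new List<string> { originalQuestion };
}
```

The fallback keeps the pipeline robust: a bad LLM reply degrades to a single-query search instead of failing the run.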
### 3. SearchTool (Tools/SearchTool.cs)

**Type:** Pipeline Orchestrator

**Responsibilities:** Execute the 4-phase search-retrieve-rank-return workflow

**Constructor Parameters:**
- `SearxngClient searxngClient`
- `EmbeddingService embeddingService`

**Main Method:** `ExecuteAsync(originalQuery, generatedQueries, maxResults, topChunksLimit, onProgress, verbose)`

**Returns:** `Task<string>` - formatted context string with source citations
**Pipeline Phases:**

#### Phase 1: ExecuteParallelSearchesAsync
- Parallelize `searxngClient.SearchAsync(query, maxResults)` for each query
- Collect all results in a `ConcurrentBag<SearxngResult>`
- Deduplicate by `DistinctBy(r => r.Url)`

**Output:** `List<SearxngResult>` (aggregated, unique)
#### Phase 2: ExecuteParallelArticleFetchingAsync
- Semaphore: `MaxConcurrentArticleFetches` (default 10)
- For each `SearxngResult`: fetch the URL via `ArticleService.FetchArticleAsync()`
- Extract article text and title
- Chunk via `ChunkingService.ChunkText(article.TextContent)`
- Add each chunk as a new `Chunk` (content, url, title)

**Output:** `List<Chunk>` (potentially 50-100 chunks)
#### Phase 3: ExecuteParallelEmbeddingsAsync
- Start two parallel tasks:
  - Query embedding: `embeddingService.GetEmbeddingAsync(originalQuery)`
  - Chunk embeddings: `embeddingService.GetEmbeddingsWithRateLimitAsync(chunkTexts, onProgress)`
    - `Parallel.ForEachAsync` with `MaxConcurrentEmbeddingRequests` (default 4)
    - Batch size: 300 chunks per embedding API call
- Filter out chunks with empty embeddings (failed batches)

**Output:** `(float[] queryEmbedding, float[][] chunkEmbeddings)`
#### Phase 4: RankAndSelectTopChunks
- Calculate cosine similarity of each chunk against the query
- Assign `chunk.Score`
- Order by descending score
- Take `topChunksLimit` (from the `--chunks` option)
- Return `List<Chunk>` (top N)

**Formatting:**

```csharp
string context = string.Join("\n\n", topChunks.Select((c, i) =>
    $"[Source {i+1}: {c.Title ?? "Unknown"}]({c.SourceUrl})\n{c.Content}"));
```

**Progress Callbacks:** Invoked at each major step for UI feedback
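The rank-and-select step can be sketched in a few lines of LINQ. This is a hedged illustration, assuming `Chunk` is a record whose `Score` property is settable via a `with` expression; the actual method may mutate in place.

```csharp
// chunks and chunkEmbeddings are index-aligned (Phase 2 / Phase 3 outputs).
List<Chunk> topChunks = chunks
    .Zip(chunkEmbeddings, (chunk, embedding) =>
        chunk with { Score = EmbeddingService.CosineSimilarity(queryEmbedding, embedding) })
    .OrderByDescending(c => c.Score)   // most relevant first
    .Take(topChunksLimit)              // --chunks option
    .ToList();
```

Because scoring is a pure function of two vectors, this step is cheap relative to the network-bound phases before it.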
## Services
### OpenRouterClient

**Purpose:** HTTP client for the OpenRouter API (chat completions + embeddings)

**Base URL:** `https://openrouter.ai/api/v1`

**Authentication:** `Authorization: Bearer {apiKey}`

**Methods:**

`StreamAsync(ChatCompletionRequest request, CancellationToken)`
- Sets `request.Stream = true`
- POST to `/chat/completions`
- Reads the SSE stream line-by-line
- Parses `data: {json}` chunks
- Yields `StreamChunk` (text delta or tool call)
- Supports cancellation

`CompleteAsync(ChatCompletionRequest request)`
- Sets `request.Stream = false`
- POST to `/chat/completions`
- Deserializes the full response
- Returns `ChatCompletionResponse`

`EmbedAsync(string model, List<string> inputs)`
- POST to `/embeddings`
- Returns `float[][]` (ordered by input index)

**Error Handling:** `EnsureSuccessStatusCode()` throws `HttpRequestException` on failure

**Design:** Thin wrapper; no retry logic (delegated to `EmbeddingService`)
### SearxngClient

**Purpose:** HTTP client for SearxNG metasearch

**Base URL:** Configurable (default `http://localhost:8002`)

**Methods:**

`SearchAsync(string query, int limit = 10)`
- GET `{baseUrl}/search?q={query}&format=json`
- Deserializes to `SearxngRoot`
- Returns `Results.Take(limit).ToList()`
- On failure: returns an empty `List<SearxngResult>` (no exception)

**Design:** Very simple; failures are tolerated (OpenQuery continues with other queries)
### EmbeddingService

**Purpose:** Batch embedding generation with rate limiting, parallelization, and retries

**Configuration (from `ParallelProcessingOptions`):**
- `MaxConcurrentEmbeddingRequests` = 4
- `EmbeddingBatchSize` = 300

**Default Embedding Model:** `openai/text-embedding-3-small`

**Methods:**

`GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)`
- Splits `texts` into batches of `EmbeddingBatchSize`
- Parallelizes batches with `Parallel.ForEachAsync` + `MaxConcurrentEmbeddingRequests`
- Each batch: rate-limited and retry-wrapped `client.EmbedAsync(model, batch)`
- Collects results in order (by batch index)
- Returns `float[][]` (same order as input texts)
- Failed batches return an empty `float[]` for each text

`GetEmbeddingAsync(string text, CancellationToken)`
- Wraps a single-text call in rate limiter + retry
- Returns `float[]`
`CosineSimilarity(float[] v1, float[] v2)`
- Static method using `TensorPrimitives.CosineSimilarity`
- Returns a float between -1 and 1 (typically 0-1 for normalized embeddings)

**Retry Policy (Polly):**
- Max 3 attempts
- 1s base delay, exponential backoff
- Only retries `HttpRequestException`

**Rate Limiting:** `RateLimiter` semaphore with `MaxConcurrentEmbeddingRequests`

**Design Notes:**
- Two similar methods (`GetEmbeddingsAsync` and `GetEmbeddingsWithRateLimitAsync`) - could be consolidated
- Uses Polly for resilience (good pattern)
- Concurrency control prevents overwhelming OpenRouter
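The retry policy described above could look like the following sketch, assuming Polly v8's `ResiliencePipeline` API (the project may use the older `Policy` API instead; the `EmbedAsync` call is the documented client method):

```csharp
using Polly;
using Polly.Retry;

var retryPipeline = new ResiliencePipelineBuilder()
    .AddRetry(new RetryStrategyOptions
    {
        MaxRetryAttempts = 2,                        // 2 retries → up to 3 attempts total
        Delay = TimeSpan.FromSeconds(1),             // 1s base delay
        BackoffType = DelayBackoffType.Exponential,  // 1s, 2s, ...
        ShouldHandle = new PredicateBuilder().Handle<HttpRequestException>()
    })
    .Build();

// Wrap one embedding batch in the pipeline.
float[][] embeddings = await retryPipeline.ExecuteAsync(
    async ct => await client.EmbedAsync(model, batch), cancellationToken);
```

Handling only `HttpRequestException` keeps the policy from masking programming errors: a serialization bug fails fast, while transient network and 5xx failures are retried.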
### ChunkingService

**Purpose:** Split long text into manageable pieces

**Static Class** (no dependencies, pure function)

**Algorithm (in `ChunkText(string text)`):**
1. Constant `MAX_CHUNK_SIZE = 500`
2. While text remains:
   - Take up to 500 chars
   - If not at the end, backtrack to the last `[' ', '\n', '\r', '.', '!']`
   - Trim and add the non-empty chunk
   - Advance the start position

**Rationale:** 500 chars is a sweet spot for embeddings - long enough for context, short enough for semantic coherence.

**Edge Cases:** Handles text shorter than 500 chars, empty text, and text with no natural breaks.
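The algorithm above can be sketched as follows. This is a reconstruction from the description, not the actual `ChunkingService` source; names and exact break handling may differ.

```csharp
static class ChunkingSketch
{
    const int MaxChunkSize = 500;
    static readonly char[] BreakChars = { ' ', '\n', '\r', '.', '!' };

    public static List<string> ChunkText(string text)
    {
        var chunks = new List<string>();
        int start = 0;
        while (start < text.Length)
        {
            int length = Math.Min(MaxChunkSize, text.Length - start);
            // If we're not at the end, backtrack to the last natural break
            // so chunks end on a word or sentence boundary.
            if (start + length < text.Length)
            {
                int lastBreak = text.LastIndexOfAny(BreakChars, start + length - 1, length);
                if (lastBreak > start)
                    length = lastBreak - start + 1;
            }
            string chunk = text.Substring(start, length).Trim();
            if (chunk.Length > 0)
                chunks.Add(chunk);
            start += length; // always advances, so the loop terminates
        }
        return chunks;
    }
}
```

If no break character is found in a window (the "no natural breaks" edge case), the sketch falls back to a hard 500-char cut rather than looping.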
### ArticleService

**Purpose:** Extract clean article content from URLs

**Method:** `FetchArticleAsync(string url)`

**Implementation:** Delegates to `SmartReader.ParseArticleAsync(url)`

**Returns:** `Article` object (from SmartReader)
- `Title` (string)
- `TextContent` (string) - cleaned article body
- `IsReadable` (bool) - quality indicator
- Other metadata (author, date, etc.)

**Error Handling:** Exceptions propagate (handled by `SearchTool`)

**Design:** Thin wrapper around a third-party library. Could be extended with caching, custom extraction rules, etc.
### RateLimiter

**Purpose:** Limit concurrent operations via semaphore

**Interface:**

```csharp
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken);
public async Task ExecuteAsync(Func<Task> action, CancellationToken);
```

**Implementation:** `SemaphoreSlim` with `WaitAsync` and `Release`

**Disposal:** `IAsyncDisposable` (awaits semaphore disposal)

**Usage:** Wrap API calls that need concurrency control

```csharp
var result = await _rateLimiter.ExecuteAsync(async () =>
    await _client.EmbedAsync(model, batch), cancellationToken);
```

**Design:** Simple and reusable. Could be replaced with a `Polly.RateLimiting` policy, but this is lightweight.
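A minimal implementation of the interface above might look like this. It is a sketch: the real class implements `IAsyncDisposable` and a second `Func<Task>` overload, which are elided here for brevity.

```csharp
public sealed class RateLimiter : IDisposable
{
    private readonly SemaphoreSlim _semaphore;

    public RateLimiter(int maxConcurrency) =>
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken ct)
    {
        await _semaphore.WaitAsync(ct); // block when maxConcurrency calls are in flight
        try
        {
            return await action();
        }
        finally
        {
            _semaphore.Release(); // always release, even if action throws
        }
    }

    public void Dispose() => _semaphore.Dispose();
}
```

The try/finally is the essential part: a throwing `action` must still release its slot, or the limiter would slowly deadlock under errors.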
### StatusReporter

**Purpose:** Real-time progress UI with spinner and verbose modes

**Architecture:**
- Producer: `UpdateStatus(text)` → writes to a `Channel<string>`
- Consumer: Background task `ProcessStatusUpdatesAsync()` reads from the channel
- Spinner: Separate task animates Braille characters every 100ms

**Modes:**

Verbose Mode (`_verbose = true`):
- All progress messages written via `Console.WriteLine()`
- No spinner
- Full audit trail

Compact Mode (default):
- Status line with spinner (overwrites the same line)
- Only the latest status visible
- Example: `⠋ Fetching articles 3/10...`

**Key Methods:**
- `UpdateStatus(message)` - fire-and-forget, non-blocking
- `WriteLine(text)` - stops the spinner temporarily, writes a full line
- `StartSpinner()` / `StopSpinner()` - manual control
- `ClearStatus()` - ANSI escape `\r\x1b[K` to clear the line
- `Dispose()` - completes the channel, waits for background tasks

**Spinner Chars:** `['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏']` (Braille patterns, smooth animation)

**ANSI Codes:** `\r` (carriage return), `\x1b[K` (erase to end of line)

**Thread Safety:** The channel is thread-safe; multiple components can write concurrently without locks

**Design:** Well-encapsulated; could be reused in other CLI projects.
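The producer/consumer core can be sketched with `System.Threading.Channels`. This is an assumed shape, not the actual class; the spinner task and verbose-mode branching are omitted.

```csharp
using System.Threading.Channels;

var channel = Channel.CreateUnbounded<string>();

// Producer side (UpdateStatus): non-blocking fire-and-forget.
void UpdateStatus(string message) => channel.Writer.TryWrite(message);

// Consumer side: a background task drains the channel and repaints
// the status line in place using \r + erase-to-end-of-line.
var consumer = Task.Run(async () =>
{
    await foreach (string status in channel.Reader.ReadAllAsync())
        Console.Write($"\r\x1b[K{status}");
});

UpdateStatus("Fetching articles 3/10...");

// Dispose(): complete the channel, then await the background task.
channel.Writer.Complete();
await consumer;
```

Because `Channel<T>` serializes delivery, many components can call `UpdateStatus` concurrently while only one task ever touches the console, so no locks are needed.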
### ConfigManager

**Purpose:** Load/save configuration from an XDG-compliant location

**Config Path:** `Environment.SpecialFolder.UserProfile` → `~/.config/openquery/config`

**Schema (`AppConfig`):**

```csharp
public class AppConfig
{
    public string ApiKey { get; set; } = "";
    public string Model { get; set; } = "qwen/qwen3.5-flash-02-23";
    public int DefaultQueries { get; set; } = 3;
    public int DefaultChunks { get; set; } = 3;
    public int DefaultResults { get; set; } = 5;
}
```

**Format:** Simple `key=value` (no INI parser, manual line split)

**Methods:**
- `Load()` → reads the file if it exists, returns `AppConfig` (with defaults)
- `Save(AppConfig)` → writes all 5 keys, overwriting any existing file

**Design:**
- Static class (no instances)
- Creates the directory if missing
- No validation (writes whatever values are given)
- Could be improved with a JSON format (but kept simple)
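The manual `key=value` loader described above might look like this sketch. The key names come from the `AppConfig` schema; the path handling and the exact parsing details are assumptions.

```csharp
static AppConfig Load(string path)
{
    var config = new AppConfig(); // property initializers supply the defaults
    if (!File.Exists(path))
        return config;

    foreach (string line in File.ReadAllLines(path))
    {
        int eq = line.IndexOf('=');
        if (eq <= 0) continue; // skip blank or malformed lines
        string key = line[..eq].Trim();
        string value = line[(eq + 1)..].Trim();
        switch (key)
        {
            case "ApiKey": config.ApiKey = value; break;
            case "Model": config.Model = value; break;
            case "DefaultQueries" when int.TryParse(value, out var q): config.DefaultQueries = q; break;
            case "DefaultChunks" when int.TryParse(value, out var c): config.DefaultChunks = c; break;
            case "DefaultResults" when int.TryParse(value, out var r): config.DefaultResults = r; break;
        }
    }
    return config;
}
```

Unknown keys and unparsable integers fall through silently, which matches the documented "no validation" design.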
## Data Models
### OpenQueryOptions

**Location:** `Models/OpenQueryOptions.cs`

**Type:** record

**Purpose:** Immutable options object passed through the workflow

**Properties:**
- `int Chunks` - top N chunks for context
- `int Results` - search results per query
- `int Queries` - number of expanded queries to generate
- `bool Short` - concise answer flag
- `bool Long` - detailed answer flag
- `bool Verbose` - verbose logging flag
- `string Question` - original user question

**Created:** In `Program.cs` from CLI options + config defaults

**Used By:** `OpenQueryApp.RunAsync()`
### Chunk

**Location:** `Models/Chunk.cs`

**Type:** record

**Purpose:** Content chunk with metadata and embedding

**Properties:**
- `string Content` - extracted text (~500 chars)
- `string SourceUrl` - article URL
- `string? Title` - article title (nullable)
- `float[]? Embedding` - vector embedding (populated by `EmbeddingService`)
- `float Score` - relevance score (populated during ranking)

**Lifecycle:**
1. Instantiated in `SearchTool.ExecuteParallelArticleFetchingAsync` with content, url, title
2. `Embedding` set in `ExecuteParallelEmbeddingsAsync` after batch processing
3. `Score` set in `RankAndSelectTopChunks` after cosine similarity
4. Serialized into the context string for the final answer

**Equality:** Records provide value equality (based on all properties)
### ParallelProcessingOptions

**Location:** `Models/ParallelOptions.cs`

**Type:** class (mutable)

**Purpose:** Concurrency settings for parallel operations

**Properties (with defaults):**
- `MaxConcurrentArticleFetches` = 10
- `MaxConcurrentEmbeddingRequests` = 4
- `EmbeddingBatchSize` = 300

**Used By:** `EmbeddingService` (for embeddings), `SearchTool` (for article fetching)

**Currently:** Hardcoded in the `SearchTool` constructor; could be made configurable
### OpenRouter Models (Models/OpenRouter.cs)

**Purpose:** DTOs for the OpenRouter API (JSON serializable)

**Chat Completion:**
- `ChatCompletionRequest` (model, messages, tools, stream)
- `ChatCompletionResponse` (choices[], usage[])
- `Message` (role, content, tool_calls, tool_call_id)
- `ToolDefinition`, `ToolFunction`, `ToolCall`, `FunctionCall`
- `Choice`, `Usage`

**Embedding:**
- `EmbeddingRequest` (model, input[])
- `EmbeddingResponse` (data[], usage)
- `EmbeddingData` (embedding[], index)

**Streaming:**
- `StreamChunk` (TextDelta, Tool)
- `ChatCompletionChunk`, `ChunkChoice`, `ChunkDelta`

**JSON Properties:** Uses `[JsonPropertyName]` to match the API

**Serialization:** System.Text.Json with source generation (`AppJsonContext`)
### Searxng Models (Models/Searxng.cs)

**Purpose:** DTOs for SearxNG search results

**Records:**
- `SearxngRoot` with `List<SearxngResult> Results`
- `SearxngResult` with `Title`, `Url`, `Content` (snippet)

**Usage:** Deserialized from SearxNG's JSON response
### JsonContexts

**Location:** `Models/JsonContexts.cs`

**Purpose:** Source-generated JSON serializer context for AOT compatibility

**Pattern:**

```csharp
[JsonSerializable(typeof(ChatCompletionRequest))]
[JsonSerializable(typeof(ChatCompletionResponse))]
// ... etc ...
internal partial class AppJsonContext : JsonSerializerContext
{
}
```

**Generated:** Partial class compiled by the source generator

**Used By:** All `JsonSerializer.Serialize`/`Deserialize` calls with `AppJsonContext.Default.{Type}`

**Benefits:**
- AOT-compatible (no reflection)
- Faster serialization (compiled delegates)
- Smaller binary (trimming-safe)
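Call sites pass the generated `JsonTypeInfo` instead of relying on reflection. A usage sketch (the `request` variable is illustrative):

```csharp
using System.Text.Json;

// Serialize with the source-generated metadata (no reflection at runtime).
string json = JsonSerializer.Serialize(request, AppJsonContext.Default.ChatCompletionRequest);

// Deserialize the same way.
ChatCompletionResponse? response = JsonSerializer.Deserialize(
    responseJson, AppJsonContext.Default.ChatCompletionResponse);
```

Because the metadata is compiled, the trimmer can safely remove unused reflection paths, which is what makes the binary smaller and AOT-safe.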
## Component Interactions

### Dependencies Graph

```
Program.cs
 ├── ConfigManager (load/save)
 ├── OpenRouterClient ──┐
 ├── SearxngClient ─────┤
 ├── EmbeddingService ──┤
 └── SearchTool ────────┤
                        │
OpenQueryApp ◄──────────┘
 │
 ├── OpenRouterClient (query gen + answer streaming)
 ├── SearchTool (pipeline)
 │    ├── SearxngClient (searches)
 │    ├── ArticleService (fetch)
 │    ├── ChunkingService (split)
 │    ├── EmbeddingService (embeddings)
 │    ├── RateLimiter (concurrency)
 │    └── StatusReporter (progress via callback)
 └── StatusReporter (UI)
```
### Data Flow Between Components

```
OpenQueryOptions
      ↓
OpenQueryApp
 ├─ Query Generation
 │   └─ OpenRouterClient.CompleteAsync()
 │       → List<string> generatedQueries
 │
 ├─ Search Pipeline
 │   └─ SearchTool.ExecuteAsync(originalQuery, generatedQueries, ...)
 │        ↓
 │       Phase 1: SearxngClient.SearchAsync(query) × N
 │        → ConcurrentBag<SearxngResult>
 │        → List<SearxngResult> (unique)
 │        ↓
 │       Phase 2: ArticleService.FetchArticleAsync(url) × M
 │        → ChunkingService.ChunkText(article.TextContent)
 │        → ConcurrentBag<Chunk> (content, url, title)
 │        ↓
 │       Phase 3: EmbeddingService.GetEmbeddingsAsync(chunkContents)
 │        → (queryEmbedding, chunkEmbeddings)
 │        ↓
 │       Phase 4: CosineSimilarity + Rank
 │        → List<Chunk> topChunks (with Score, Embedding set)
 │        ↓
 │       Format: context string with [Source N: Title](Url)
 │        → return context string
 │
 └─ Final Answer
     └─ OpenRouterClient.StreamAsync(prompt with context)
         → stream deltas to Console
```
### Interface Contracts

**SearchTool → Progress:**

```csharp
// Invoked as: onProgress?.Invoke("[Fetching article 1/10: example.com]")
Action<string>? onProgress
```

**StatusReporter ← Progress:**

```csharp
// Handler in OpenQueryApp:
(progress) => {
    if (options.Verbose) reporter.WriteLine(progress);
    else reporter.UpdateStatus(parsedShorterMessage);
}
```

**SearchTool → ArticleService:**

```csharp
Article article = await ArticleService.FetchArticleAsync(url);
```

**SearchTool → EmbeddingService:**

```csharp
(float[] queryEmbedding, float[][] chunkEmbeddings) = await ExecuteParallelEmbeddingsAsync(...);
// Also: embeddingService.GetEmbeddingAsync(text), GetEmbeddingsWithRateLimitAsync(...)
```

**SearchTool → ChunkingService:**

```csharp
List<string> chunks = ChunkingService.ChunkText(article.TextContent);
```

**SearchTool → RateLimiter:**

```csharp
await _rateLimiter.ExecuteAsync(async () => await _client.EmbedAsync(...), ct);
```
## Next Steps

- **OpenQueryApp** - Main orchestrator details
- **SearchTool** - Pipeline implementation
- **Services** - All service classes documented
- **Models** - Complete data model reference