# Components Overview
Detailed documentation for each major component in the OpenQuery system.
## Component Hierarchy

```
OpenQuery/
├── Program.cs              [Entry Point, CLI]
├── OpenQuery.cs            [OpenQueryApp - Orchestrator]
├── Tools/
│   └── SearchTool.cs       [Pipeline Orchestration]
├── Services/
│   ├── OpenRouterClient.cs [LLM & Embedding API]
│   ├── SearxngClient.cs    [Search API]
│   ├── EmbeddingService.cs [Embedding Generation + Math]
│   ├── ChunkingService.cs  [Text Splitting]
│   ├── ArticleService.cs   [Content Extraction]
│   ├── RateLimiter.cs      [Concurrency Control]
│   └── StatusReporter.cs   [Progress Display]
├── Models/
│   ├── OpenQueryOptions.cs [CLI Options Record]
│   ├── Chunk.cs            [Content + Metadata]
│   ├── ParallelOptions.cs  [Concurrency Settings]
│   ├── OpenRouter.cs       [API DTOs]
│   ├── Searxng.cs          [Search Result DTOs]
│   └── JsonContexts.cs     [JSON Context]
└── ConfigManager.cs        [Configuration Persistence]
```
## Core Components
### 1. Program.cs

**Type:** Console Application Entry Point

**Responsibilities:** CLI parsing, dependency wiring, error handling

**Key Elements:**
- `RootCommand` from `System.CommandLine`
- Options: `--chunks`, `--results`, `--queries`, `--short`, `--long`, `--verbose`
- Subcommand: `configure` (with interactive mode)
- Configuration loading via `ConfigManager.Load()`
- Environment variable resolution
- Service instantiation and coordination
- Top-level try-catch for error reporting

**Code Flow:**
1. Load config file
2. Define CLI options and commands
3. Set handler for root command
4. Handler: resolve API key/model → instantiate services → call `OpenQueryApp.RunAsync()`
5. Set handler for `configure` command (writes config file)
6. Invoke command parser: `await rootCommand.InvokeAsync(args)`

**Exit Codes:**
- `0` = success
- `1` = error
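The wiring above can be sketched with System.CommandLine. This is an illustrative sketch only, assuming the `System.CommandLine` 2.0.0-beta4 API; the option defaults and handler body here are placeholders, not the project's actual code.

```csharp
using System.CommandLine;

// Two of the documented options, with assumed defaults for illustration.
var chunksOption = new Option<int>("--chunks", () => 3, "Top N chunks for context");
var verboseOption = new Option<bool>("--verbose", "Verbose logging");

var rootCommand = new RootCommand("OpenQuery: search-augmented question answering");
rootCommand.AddOption(chunksOption);
rootCommand.AddOption(verboseOption);

// Handler: in the real Program.cs this resolves the API key/model,
// instantiates services, and calls OpenQueryApp.RunAsync().
rootCommand.SetHandler(async (int chunks, bool verbose) =>
{
    await Task.CompletedTask; // placeholder for service wiring + RunAsync
}, chunksOption, verboseOption);

return await rootCommand.InvokeAsync(args);
```

An unhandled exception in the handler would bubble up to the top-level try-catch, which maps success to exit code 0 and errors to 1.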
### 2. OpenQueryApp (OpenQuery.cs)

**Type:** Main Application Class

**Responsibilities:** Workflow orchestration, query generation, answer streaming

**Constructor Parameters:**
- `OpenRouterClient client` - for query generation and the final answer
- `SearchTool searchTool` - for the search-retrieve-rank pipeline
- `string model` - LLM model identifier

**Main Method:** `RunAsync(OpenQueryOptions options)`

**Workflow Steps:**
1. Create `StatusReporter` (for progress UI)
2. Optional query generation (if `options.Queries > 1`):
   - Create system message instructing JSON array output
   - Create user message with `options.Question`
   - Call `client.CompleteAsync()` with the query-gen model
   - Parse JSON response; fall back to the original question on failure
   - Result: `List<string> queries` (1 or many)
3. Execute search pipeline:
   - Call `_searchTool.ExecuteAsync()` with queries and options
   - Receive `string context` (formatted context with source citations)
   - Progress reported via callback to `StatusReporter`
4. Generate final answer:
   - Build system prompt (append "short" or "long" modifier)
   - Create user message with `Context:\n{context}\n\nQuestion: {options.Question}`
   - Stream answer via `client.StreamAsync()`
   - Write each `chunk.TextDelta` to the console as it arrives
   - Stop the spinner on the first chunk, then continue streaming
5. Dispose the reporter

**Error Handling:**
- Exceptions propagate to the `Program.cs` top-level handler, which distinguishes `HttpRequestException` from a generic `Exception`

**Note:** Query generation uses the same model as the final answer; the two could be separated for cost/performance.
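The parse-then-fall-back step can be sketched as below. This is a hypothetical helper, not the actual implementation; the real code uses the source-generated `AppJsonContext` rather than reflection-based deserialization.

```csharp
using System.Text.Json;

static List<string> ParseQueries(string llmReply, string originalQuestion)
{
    try
    {
        // The system prompt asks the model for a JSON array of strings.
        var queries = JsonSerializer.Deserialize<List<string>>(llmReply);
        if (queries is { Count: > 0 })
            return queries;
    }
    catch (JsonException)
    {
        // Malformed JSON from the model: ignore and fall through.
    }
    // Fallback: search with the user's original question only.
    return new List<string> { originalQuestion };
}
```

The fallback keeps the pipeline robust: a bad LLM reply degrades to a single-query search instead of failing the run.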
### 3. SearchTool (Tools/SearchTool.cs)

**Type:** Pipeline Orchestrator

**Responsibilities:** Execute the 4-phase search-retrieve-rank-return workflow

**Constructor Parameters:**
- `SearxngClient searxngClient`
- `EmbeddingService embeddingService`

**Main Method:** `ExecuteAsync(originalQuery, generatedQueries, maxResults, topChunksLimit, onProgress, verbose)`

**Returns:** `Task<string>` - formatted context string with source citations
**Pipeline Phases:**

#### Phase 1: ExecuteParallelSearchesAsync
- Parallelize `searxngClient.SearchAsync(query, maxResults)` for each query
- Collect all results in a `ConcurrentBag<SearxngResult>`
- Deduplicate by `DistinctBy(r => r.Url)`

**Output:** `List<SearxngResult>` (aggregated, unique)
#### Phase 2: ExecuteParallelArticleFetchingAsync
- Semaphore: `MaxConcurrentArticleFetches` (default 10)
- For each `SearxngResult`: fetch the URL via `ArticleService.FetchArticleAsync()`
- Extract article text and title
- Chunk via `ChunkingService.ChunkText(article.TextContent)`
- Add each chunk as a new `Chunk` (content, url, title)

**Output:** `List<Chunk>` (potentially 50-100 chunks)
#### Phase 3: ExecuteParallelEmbeddingsAsync
- Start two parallel tasks:
  - Query embedding: `embeddingService.GetEmbeddingAsync(originalQuery)`
  - Chunk embeddings: `embeddingService.GetEmbeddingsWithRateLimitAsync(chunkTexts, onProgress)`
    - `Parallel.ForEachAsync` with `MaxConcurrentEmbeddingRequests` (default 4)
    - Batch size: 300 chunks per embedding API call
- Filter out chunks with empty embeddings (failed batches)

**Output:** `(float[] queryEmbedding, float[][] chunkEmbeddings)`
#### Phase 4: RankAndSelectTopChunks
- Calculate cosine similarity of each chunk against the query
- Assign `chunk.Score`
- Order by descending score
- Take `topChunksLimit` (from the `--chunks` option)
- Return `List<Chunk>` (top N)

**Formatting:**

```csharp
string context = string.Join("\n\n", topChunks.Select((c, i) =>
    $"[Source {i+1}: {c.Title ?? "Unknown"}]({c.SourceUrl})\n{c.Content}"));
```

**Progress Callbacks:** Invoked at each major step for UI feedback
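The rank-and-select step can be sketched in a few lines of LINQ. This is a hedged illustration, assuming `Chunk` is a record whose `Score` property is settable via a `with` expression; the actual method may mutate in place.

```csharp
// chunks and chunkEmbeddings are index-aligned (Phase 2 / Phase 3 outputs).
List<Chunk> topChunks = chunks
    .Zip(chunkEmbeddings, (chunk, embedding) =>
        chunk with { Score = EmbeddingService.CosineSimilarity(queryEmbedding, embedding) })
    .OrderByDescending(c => c.Score)   // most relevant first
    .Take(topChunksLimit)              // --chunks option
    .ToList();
```

Because scoring is a pure function of two vectors, this step is cheap relative to the network-bound phases before it.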
## Services
### OpenRouterClient

**Purpose:** HTTP client for the OpenRouter API (chat completions + embeddings)

**Base URL:** `https://openrouter.ai/api/v1`

**Authentication:** `Authorization: Bearer {apiKey}`

**Methods:**

`StreamAsync(ChatCompletionRequest request, CancellationToken)`
- Sets `request.Stream = true`
- POST to `/chat/completions`
- Reads the SSE stream line-by-line
- Parses `data: {json}` chunks
- Yields `StreamChunk` (text delta or tool call)
- Supports cancellation

`CompleteAsync(ChatCompletionRequest request)`
- Sets `request.Stream = false`
- POST to `/chat/completions`
- Deserializes the full response
- Returns `ChatCompletionResponse`

`EmbedAsync(string model, List<string> inputs)`
- POST to `/embeddings`
- Returns `float[][]` (ordered by input index)

**Error Handling:** `EnsureSuccessStatusCode()` throws `HttpRequestException` on failure

**Design:** Thin wrapper; no retry logic (delegated to `EmbeddingService`)
### SearxngClient

**Purpose:** HTTP client for SearxNG metasearch

**Base URL:** Configurable (default `http://localhost:8002`)

**Methods:**

`SearchAsync(string query, int limit = 10)`
- GET `{baseUrl}/search?q={query}&format=json`
- Deserializes to `SearxngRoot`
- Returns `Results.Take(limit).ToList()`
- On failure: returns an empty `List<SearxngResult>` (no exception)

**Design:** Very simple; failures are tolerated (OpenQuery continues with other queries)
### EmbeddingService

**Purpose:** Batch embedding generation with rate limiting, parallelization, and retries

**Configuration (from `ParallelProcessingOptions`):**
- `MaxConcurrentEmbeddingRequests` = 4
- `EmbeddingBatchSize` = 300

**Default Embedding Model:** `openai/text-embedding-3-small`

**Methods:**

`GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)`
- Splits `texts` into batches of `EmbeddingBatchSize`
- Parallelizes batches with `Parallel.ForEachAsync` + `MaxConcurrentEmbeddingRequests`
- Each batch: rate-limited and retry-wrapped `client.EmbedAsync(model, batch)`
- Collects results in order (by batch index)
- Returns `float[][]` (same order as input texts)
- Failed batches return an empty `float[]` for each text

`GetEmbeddingAsync(string text, CancellationToken)`
- Wraps a single-text call in rate limiter + retry
- Returns `float[]`
`CosineSimilarity(float[] v1, float[] v2)`
- Static method using `TensorPrimitives.CosineSimilarity`
- Returns a float between -1 and 1 (typically 0-1 for normalized embeddings)

**Retry Policy (Polly):**
- Max 3 attempts
- 1s base delay, exponential backoff
- Only retries `HttpRequestException`

**Rate Limiting:** `RateLimiter` semaphore with `MaxConcurrentEmbeddingRequests`

**Design Notes:**
- Two similar methods (`GetEmbeddingsAsync` and `GetEmbeddingsWithRateLimitAsync`) - could be consolidated
- Uses Polly for resilience (good pattern)
- Concurrency control prevents overwhelming OpenRouter
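The retry policy described above could look like the following sketch, assuming Polly v8's `ResiliencePipeline` API (the project may use the older `Policy` API instead; the `EmbedAsync` call is the documented client method):

```csharp
using Polly;
using Polly.Retry;

var retryPipeline = new ResiliencePipelineBuilder()
    .AddRetry(new RetryStrategyOptions
    {
        MaxRetryAttempts = 2,                        // 2 retries → up to 3 attempts total
        Delay = TimeSpan.FromSeconds(1),             // 1s base delay
        BackoffType = DelayBackoffType.Exponential,  // 1s, 2s, ...
        ShouldHandle = new PredicateBuilder().Handle<HttpRequestException>()
    })
    .Build();

// Wrap one embedding batch in the pipeline.
float[][] embeddings = await retryPipeline.ExecuteAsync(
    async ct => await client.EmbedAsync(model, batch), cancellationToken);
```

Handling only `HttpRequestException` keeps the policy from masking programming errors: a serialization bug fails fast, while transient network and 5xx failures are retried.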
### ChunkingService

**Purpose:** Split long text into manageable pieces

**Static Class** (no dependencies, pure function)

**Algorithm (in `ChunkText(string text)`):**
1. Constant `MAX_CHUNK_SIZE = 500`
2. While text remains:
   - Take up to 500 chars
   - If not at the end, backtrack to the last `[' ', '\n', '\r', '.', '!']`
   - Trim and add the non-empty chunk
   - Advance the start position

**Rationale:** 500 chars is a sweet spot for embeddings - long enough for context, short enough for semantic coherence.

**Edge Cases:** Handles text shorter than 500 chars, empty text, and text with no natural breaks.
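The algorithm above can be sketched as follows. This is a reconstruction from the description, not the actual `ChunkingService` source; names and exact break handling may differ.

```csharp
static class ChunkingSketch
{
    const int MaxChunkSize = 500;
    static readonly char[] BreakChars = { ' ', '\n', '\r', '.', '!' };

    public static List<string> ChunkText(string text)
    {
        var chunks = new List<string>();
        int start = 0;
        while (start < text.Length)
        {
            int length = Math.Min(MaxChunkSize, text.Length - start);
            // If we're not at the end, backtrack to the last natural break
            // so chunks end on a word or sentence boundary.
            if (start + length < text.Length)
            {
                int lastBreak = text.LastIndexOfAny(BreakChars, start + length - 1, length);
                if (lastBreak > start)
                    length = lastBreak - start + 1;
            }
            string chunk = text.Substring(start, length).Trim();
            if (chunk.Length > 0)
                chunks.Add(chunk);
            start += length; // always advances, so the loop terminates
        }
        return chunks;
    }
}
```

If no break character is found in a window (the "no natural breaks" edge case), the sketch falls back to a hard 500-char cut rather than looping.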
### ArticleService

**Purpose:** Extract clean article content from URLs

**Method:** `FetchArticleAsync(string url)`

**Implementation:** Delegates to `SmartReader.ParseArticleAsync(url)`

**Returns:** `Article` object (from SmartReader)
- `Title` (string)
- `TextContent` (string) - cleaned article body
- `IsReadable` (bool) - quality indicator
- Other metadata (author, date, etc.)

**Error Handling:** Exceptions propagate (handled by `SearchTool`)

**Design:** Thin wrapper around a third-party library. Could be extended with caching, custom extraction rules, etc.
### RateLimiter

**Purpose:** Limit concurrent operations via semaphore

**Interface:**

```csharp
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken);
public async Task ExecuteAsync(Func<Task> action, CancellationToken);
```

**Implementation:** `SemaphoreSlim` with `WaitAsync` and `Release`

**Disposal:** `IAsyncDisposable` (awaits semaphore disposal)

**Usage:** Wrap API calls that need concurrency control

```csharp
var result = await _rateLimiter.ExecuteAsync(async () =>
    await _client.EmbedAsync(model, batch), cancellationToken);
```

**Design:** Simple and reusable. Could be replaced with a `Polly.RateLimiting` policy, but this is lightweight.
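A minimal implementation of the interface above might look like this. It is a sketch: the real class implements `IAsyncDisposable` and a second `Func<Task>` overload, which are elided here for brevity.

```csharp
public sealed class RateLimiter : IDisposable
{
    private readonly SemaphoreSlim _semaphore;

    public RateLimiter(int maxConcurrency) =>
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken ct)
    {
        await _semaphore.WaitAsync(ct); // block when maxConcurrency calls are in flight
        try
        {
            return await action();
        }
        finally
        {
            _semaphore.Release(); // always release, even if action throws
        }
    }

    public void Dispose() => _semaphore.Dispose();
}
```

The try/finally is the essential part: a throwing `action` must still release its slot, or the limiter would slowly deadlock under errors.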
### StatusReporter

**Purpose:** Real-time progress UI with spinner and verbose modes

**Architecture:**
- Producer: `UpdateStatus(text)` → writes to a `Channel<string>`
- Consumer: Background task `ProcessStatusUpdatesAsync()` reads from the channel
- Spinner: Separate task animates Braille characters every 100ms

**Modes:**

Verbose Mode (`_verbose = true`):
- All progress messages written via `Console.WriteLine()`
- No spinner
- Full audit trail

Compact Mode (default):
- Status line with spinner (overwrites the same line)
- Only the latest status visible
- Example: `⠋ Fetching articles 3/10...`

**Key Methods:**
- `UpdateStatus(message)` - fire-and-forget, non-blocking
- `WriteLine(text)` - stops the spinner temporarily, writes a full line
- `StartSpinner()` / `StopSpinner()` - manual control
- `ClearStatus()` - ANSI escape `\r\x1b[K` to clear the line
- `Dispose()` - completes the channel, waits for background tasks

**Spinner Chars:** `['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏']` (Braille patterns, smooth animation)

**ANSI Codes:** `\r` (carriage return), `\x1b[K` (erase to end of line)

**Thread Safety:** The channel is thread-safe; multiple components can write concurrently without locks

**Design:** Well-encapsulated; could be reused in other CLI projects.
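The producer/consumer core can be sketched with `System.Threading.Channels`. This is an assumed shape, not the actual class; the spinner task and verbose-mode branching are omitted.

```csharp
using System.Threading.Channels;

var channel = Channel.CreateUnbounded<string>();

// Producer side (UpdateStatus): non-blocking fire-and-forget.
void UpdateStatus(string message) => channel.Writer.TryWrite(message);

// Consumer side: a background task drains the channel and repaints
// the status line in place using \r + erase-to-end-of-line.
var consumer = Task.Run(async () =>
{
    await foreach (string status in channel.Reader.ReadAllAsync())
        Console.Write($"\r\x1b[K{status}");
});

UpdateStatus("Fetching articles 3/10...");

// Dispose(): complete the channel, then await the background task.
channel.Writer.Complete();
await consumer;
```

Because `Channel<T>` serializes delivery, many components can call `UpdateStatus` concurrently while only one task ever touches the console, so no locks are needed.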
### ConfigManager

**Purpose:** Load/save configuration from an XDG-compliant location

**Config Path:** `Environment.SpecialFolder.UserProfile` → `~/.config/openquery/config`

**Schema (`AppConfig`):**

```csharp
public class AppConfig
{
    public string ApiKey { get; set; } = "";
    public string Model { get; set; } = "qwen/qwen3.5-flash-02-23";
    public int DefaultQueries { get; set; } = 3;
    public int DefaultChunks { get; set; } = 3;
    public int DefaultResults { get; set; } = 5;
}
```

**Format:** Simple `key=value` (no INI parser, manual line split)

**Methods:**
- `Load()` → reads the file if it exists, returns `AppConfig` (with defaults)
- `Save(AppConfig)` → writes all 5 keys, overwriting any existing file

**Design:**
- Static class (no instances)
- Creates the directory if missing
- No validation (writes whatever values are given)
- Could be improved with a JSON format (but kept simple)
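The manual `key=value` loader described above might look like this sketch. The key names come from the `AppConfig` schema; the path handling and the exact parsing details are assumptions.

```csharp
static AppConfig Load(string path)
{
    var config = new AppConfig(); // property initializers supply the defaults
    if (!File.Exists(path))
        return config;

    foreach (string line in File.ReadAllLines(path))
    {
        int eq = line.IndexOf('=');
        if (eq <= 0) continue; // skip blank or malformed lines
        string key = line[..eq].Trim();
        string value = line[(eq + 1)..].Trim();
        switch (key)
        {
            case "ApiKey": config.ApiKey = value; break;
            case "Model": config.Model = value; break;
            case "DefaultQueries" when int.TryParse(value, out var q): config.DefaultQueries = q; break;
            case "DefaultChunks" when int.TryParse(value, out var c): config.DefaultChunks = c; break;
            case "DefaultResults" when int.TryParse(value, out var r): config.DefaultResults = r; break;
        }
    }
    return config;
}
```

Unknown keys and unparsable integers fall through silently, which matches the documented "no validation" design.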
## Data Models
### OpenQueryOptions

**Location:** `Models/OpenQueryOptions.cs`

**Type:** record

**Purpose:** Immutable options object passed through the workflow

**Properties:**
- `int Chunks` - top N chunks for context
- `int Results` - search results per query
- `int Queries` - number of expanded queries to generate
- `bool Short` - concise answer flag
- `bool Long` - detailed answer flag
- `bool Verbose` - verbose logging flag
- `string Question` - original user question

**Created:** In `Program.cs` from CLI options + config defaults

**Used By:** `OpenQueryApp.RunAsync()`
### Chunk

**Location:** `Models/Chunk.cs`

**Type:** record

**Purpose:** Content chunk with metadata and embedding

**Properties:**
- `string Content` - extracted text (~500 chars)
- `string SourceUrl` - article URL
- `string? Title` - article title (nullable)
- `float[]? Embedding` - vector embedding (populated by `EmbeddingService`)
- `float Score` - relevance score (populated during ranking)

**Lifecycle:**
1. Instantiated in `SearchTool.ExecuteParallelArticleFetchingAsync` with content, url, title
2. `Embedding` set in `ExecuteParallelEmbeddingsAsync` after batch processing
3. `Score` set in `RankAndSelectTopChunks` after cosine similarity
4. Serialized into the context string for the final answer

**Equality:** Records provide value equality (based on all properties)
### ParallelProcessingOptions

**Location:** `Models/ParallelOptions.cs`

**Type:** class (mutable)

**Purpose:** Concurrency settings for parallel operations

**Properties (with defaults):**
- `MaxConcurrentArticleFetches` = 10
- `MaxConcurrentEmbeddingRequests` = 4
- `EmbeddingBatchSize` = 300

**Used By:** `EmbeddingService` (for embeddings), `SearchTool` (for article fetching)

**Currently:** Hardcoded in the `SearchTool` constructor; could be made configurable
### OpenRouter Models (Models/OpenRouter.cs)

**Purpose:** DTOs for the OpenRouter API (JSON serializable)

**Chat Completion:**
- `ChatCompletionRequest` (model, messages, tools, stream)
- `ChatCompletionResponse` (choices[], usage[])
- `Message` (role, content, tool_calls, tool_call_id)
- `ToolDefinition`, `ToolFunction`, `ToolCall`, `FunctionCall`
- `Choice`, `Usage`

**Embedding:**
- `EmbeddingRequest` (model, input[])
- `EmbeddingResponse` (data[], usage)
- `EmbeddingData` (embedding[], index)

**Streaming:**
- `StreamChunk` (TextDelta, Tool)
- `ChatCompletionChunk`, `ChunkChoice`, `ChunkDelta`

**JSON Properties:** Uses `[JsonPropertyName]` to match the API

**Serialization:** System.Text.Json with source generation (`AppJsonContext`)
### Searxng Models (Models/Searxng.cs)

**Purpose:** DTOs for SearxNG search results

**Records:**
- `SearxngRoot` with `List<SearxngResult> Results`
- `SearxngResult` with `Title`, `Url`, `Content` (snippet)

**Usage:** Deserialized from SearxNG's JSON response
### JsonContexts

**Location:** `Models/JsonContexts.cs`

**Purpose:** Source-generated JSON serializer context for AOT compatibility

**Pattern:**

```csharp
[JsonSerializable(typeof(ChatCompletionRequest))]
[JsonSerializable(typeof(ChatCompletionResponse))]
// ... etc ...
internal partial class AppJsonContext : JsonSerializerContext
{
}
```

**Generated:** Partial class compiled by the source generator

**Used By:** All `JsonSerializer.Serialize`/`Deserialize` calls with `AppJsonContext.Default.{Type}`

**Benefits:**
- AOT-compatible (no reflection)
- Faster serialization (compiled delegates)
- Smaller binary (trimming-safe)
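Call sites pass the generated `JsonTypeInfo` instead of relying on reflection. A usage sketch (the `request` variable is illustrative):

```csharp
using System.Text.Json;

// Serialize with the source-generated metadata (no reflection at runtime).
string json = JsonSerializer.Serialize(request, AppJsonContext.Default.ChatCompletionRequest);

// Deserialize the same way.
ChatCompletionResponse? response = JsonSerializer.Deserialize(
    responseJson, AppJsonContext.Default.ChatCompletionResponse);
```

Because the metadata is compiled, the trimmer can safely remove unused reflection paths, which is what makes the binary smaller and AOT-safe.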
## Component Interactions

### Dependencies Graph

```
Program.cs
 ├── ConfigManager (load/save)
 ├── OpenRouterClient ──┐
 ├── SearxngClient ─────┤
 ├── EmbeddingService ──┤
 └── SearchTool ────────┤
                        │
OpenQueryApp ◄──────────┘
 │
 ├── OpenRouterClient (query gen + answer streaming)
 ├── SearchTool (pipeline)
 │    ├── SearxngClient (searches)
 │    ├── ArticleService (fetch)
 │    ├── ChunkingService (split)
 │    ├── EmbeddingService (embeddings)
 │    ├── RateLimiter (concurrency)
 │    └── StatusReporter (progress via callback)
 └── StatusReporter (UI)
```
### Data Flow Between Components

```
OpenQueryOptions
      ↓
OpenQueryApp
 ├─ Query Generation
 │   └─ OpenRouterClient.CompleteAsync()
 │       → List<string> generatedQueries
 │
 ├─ Search Pipeline
 │   └─ SearchTool.ExecuteAsync(originalQuery, generatedQueries, ...)
 │        ↓
 │       Phase 1: SearxngClient.SearchAsync(query) × N
 │        → ConcurrentBag<SearxngResult>
 │        → List<SearxngResult> (unique)
 │        ↓
 │       Phase 2: ArticleService.FetchArticleAsync(url) × M
 │        → ChunkingService.ChunkText(article.TextContent)
 │        → ConcurrentBag<Chunk> (content, url, title)
 │        ↓
 │       Phase 3: EmbeddingService.GetEmbeddingsAsync(chunkContents)
 │        → (queryEmbedding, chunkEmbeddings)
 │        ↓
 │       Phase 4: CosineSimilarity + Rank
 │        → List<Chunk> topChunks (with Score, Embedding set)
 │        ↓
 │       Format: context string with [Source N: Title](Url)
 │        → return context string
 │
 └─ Final Answer
     └─ OpenRouterClient.StreamAsync(prompt with context)
         → stream deltas to Console
```
### Interface Contracts

**SearchTool → Progress:**

```csharp
// Invoked as: onProgress?.Invoke("[Fetching article 1/10: example.com]")
Action<string>? onProgress
```

**StatusReporter ← Progress:**

```csharp
// Handler in OpenQueryApp:
(progress) => {
    if (options.Verbose) reporter.WriteLine(progress);
    else reporter.UpdateStatus(parsedShorterMessage);
}
```

**SearchTool → ArticleService:**

```csharp
Article article = await ArticleService.FetchArticleAsync(url);
```

**SearchTool → EmbeddingService:**

```csharp
(float[] queryEmbedding, float[][] chunkEmbeddings) = await ExecuteParallelEmbeddingsAsync(...);
// Also: embeddingService.GetEmbeddingAsync(text), GetEmbeddingsWithRateLimitAsync(...)
```

**SearchTool → ChunkingService:**

```csharp
List<string> chunks = ChunkingService.ChunkText(article.TextContent);
```

**SearchTool → RateLimiter:**

```csharp
await _rateLimiter.ExecuteAsync(async () => await _client.EmbedAsync(...), ct);
```
## Next Steps

- **OpenQueryApp** - Main orchestrator details
- **SearchTool** - Pipeline implementation
- **Services** - All service classes documented
- **Models** - Complete data model reference