- Add user-friendly README.md with quick start guide - Create docs/ folder with structured technical documentation: - installation.md: Build and setup instructions - configuration.md: Complete config reference - usage.md: CLI usage guide with examples - architecture.md: System design and patterns - components/: Deep dive into each component (OpenQueryApp, SearchTool, Services, Models) - api/: CLI reference, environment variables, programmatic API - troubleshooting.md: Common issues and solutions - performance.md: Latency, throughput, and optimization - All documentation fully cross-referenced with internal links - Covers project overview, architecture, components, APIs, and support See individual files for complete documentation.
14 KiB
Services Overview
Comprehensive reference for all service classes in OpenQuery.
📋 Table of Contents
Service Catalog
OpenQuery's services are organized into three categories:
| Category | Services | Purpose |
|---|---|---|
| Clients | OpenRouterClient, SearxngClient |
External API communication |
| Processors | EmbeddingService, ChunkingService, ArticleService |
Data transformation & extraction |
| Infrastructure | RateLimiter, StatusReporter |
Cross-cutting concerns |
All services are stateless (except for internal configuration) and can be safely reused across multiple operations.
Client Services
OpenRouterClient
Location: Services/OpenRouterClient.cs
Purpose: HTTP client for OpenRouter AI APIs (chat completions & embeddings)
API Endpoints
| Method | Endpoint | Purpose |
|---|---|---|
| POST | /chat/completions |
Chat completion (streaming or non-streaming) |
| POST | /embeddings |
Embedding generation for text inputs |
Authentication
Authorization: Bearer {apiKey}
Accept: application/json
Public Methods
StreamAsync(ChatCompletionRequest request, CancellationToken cancellationToken)
- Returns:
IAsyncEnumerable<StreamChunk> - Behavior: Sets
request.Stream = true, posts, reads Server-Sent Events stream - Use Case: Final answer streaming, real-time responses
- Stream Format: SSE lines
data: {json}; yieldsTextDeltaorToolCall
CompleteAsync(ChatCompletionRequest request)
- Returns:
Task<ChatCompletionResponse> - Behavior: Sets
request.Stream = false, posts, returns full response - Use Case: Query generation (non-streaming)
EmbedAsync(string model, List<string> inputs)
- Returns:
Task<float[][]> - Behavior: POST
/embeddings, returns array of vectors (ordered by input index) - Use Case: Batch embedding generation
HttpClient
- Property: Internal
_httpClient(created per instance) - Note: Could use
IHttpClientFactoryfor pooling (not needed for CLI)
Error Handling
EnsureSuccessStatusCode()throwsHttpRequestExceptionon 4xx/5xx- No retry logic (handled by
EmbeddingService)
Configuration
public OpenRouterClient(string apiKey)
{
_apiKey = apiKey;
_httpClient = new HttpClient();
_httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
_httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
}
Example Usage
var client = new OpenRouterClient("sk-or-...");
var request = new ChatCompletionRequest("model", new List<Message> { ... });
await foreach (var chunk in client.StreamAsync(request))
{
Console.Write(chunk.TextDelta);
}
SearxngClient
Location: Services/SearxngClient.cs
Purpose: HTTP client for SearxNG metasearch engine
API Endpoint
GET /search?q={query}&format=json
Constructor
public SearxngClient(string baseUrl) // e.g., "http://localhost:8002"
baseUrltrimmed of trailing/
Public Methods
SearchAsync(string query, int limit = 10)
- Returns:
Task<List<SearxngResult>> - Behavior: GET request, deserialize JSON, take up to
limitresults - On Failure: Returns empty
List<SearxngResult>(no exception)
Error Handling
response.EnsureSuccessStatusCode()would throw, but code doesn't call it- If invalid JSON or missing
Results, returns empty list - Failures are tolerated - individual search queries may fail without aborting whole operation
Example Searxng Response
{
"results": [
{
"title": "Quantum Entanglement - Wikipedia",
"url": "https://en.wikipedia.org/wiki/Quantum_entanglement",
"content": "Quantum entanglement is a physical phenomenon..."
},
...
]
}
Processing Services
EmbeddingService
Location: Services/EmbeddingService.cs
Purpose: Generate embeddings with batching, rate limiting, and retry logic
Configuration
Embedding Model: openai/text-embedding-3-small (default, configurable via constructor)
ParallelProcessingOptions (hardcoded defaults):
public class ParallelProcessingOptions
{
public int MaxConcurrentEmbeddingRequests { get; set; } = 4;
public int EmbeddingBatchSize { get; set; } = 300;
}
Public Methods
GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)
- Returns:
Task<float[][]> - Behavior:
- Splits
textsinto batches ofEmbeddingBatchSize - Parallel executes batches (max
MaxConcurrentEmbeddingRequestsconcurrent) - Each batch: rate-limited, retry-wrapped
client.EmbedAsync(model, batch) - Reassembles in original order
- Failed batches → empty
float[]for each text
- Splits
- Progress: Invokes
onProgressfor each batch:"[Generating embeddings: batch X/Y]" - Thread-Safe: Uses lock for collecting results
GetEmbeddingAsync(string text, CancellationToken)
- Returns:
Task<float[]> - Behavior: Single embedding with rate limiting and retry
- Use Case: Query embedding
`Cos static float CosineSimilarity(float[] vector1, float[] vector2)
Uses `System.Numerics.Tensors.TensorPrimitives.CosineSimilarity`
Returns float between -1 and 1 (typically 0-1 for normalized embeddings)
Implementation: Single line calling SIMD-accelerated tensor primitive
ArticleService
Location: Services/ArticleService.cs
Purpose: Extract clean article content from web URLs
Public Methods
FetchArticleAsync(string url)
- Returns:
Task<Article> - Behavior: Delegates to
SmartReader.ParseArticleAsync(url) - Result:
ArticlewithTitle,TextContent,IsReadable, and metadata
Errors
- Propagates exceptions (SmartReader may throw on network failures, malformed HTML)
SearchToolcatches and logs
SmartReader Notes
- Open-source article extraction library (bundled via NuGet)
- Uses Readability algorithm (similar to Firefox Reader View)
- Removes ads, navigation, boilerplate
IsReadableindicates quality (e.g., not a 404 page, not too short)
ChunkingService
Location: Services/ChunkingService.cs
Purpose: Split text into 500-character chunks at natural boundaries
Public Methods
ChunkText(string text)
- Returns:
List<string> - Algorithm:
- Constant
MAX_CHUNK_SIZE = 500 - While remaining text:
- Take up to 500 chars
- If not at end, backtrack to last
[' ', '\n', '\r', '.', '!'] - Trim, add if non-empty
- Advance start
- Returns all chunks
- Constant
Characteristics
- Static class (no instances)
- Pure function (no side effects)
- Zero dependencies
- Handles edge cases (empty text, short text, text without breaks)
Infrastructure Services
RateLimiter
Location: Services/RateLimiter.cs
Purpose: Limit concurrent operations using semaphore
Constructor
public RateLimiter(int maxConcurrentRequests)
Creates SemaphoreSlim with maxConcurrentRequests
Public Methods
ExecuteAsync<T>(Func<Task<T>> action, CancellationToken)
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken cancellationToken = default)
{
await _semaphore.WaitAsync(cancellationToken);
try
{
return await action();
}
finally
{
_semaphore.Release();
}
}
- Waits for semaphore slot
- Executes
action(typically an API call) - Releases semaphore (even if exception)
- Returns result from
action
ExecuteAsync(Func<Task> action, CancellationToken)
- Non-generic version (for void-returning actions)
Disposal
public async ValueTask DisposeAsync()
{
_semaphore.Dispose();
}
Implements IAsyncDisposable for async cleanup
Usage Pattern
var result = await _rateLimiter.ExecuteAsync(async () =>
{
return await SomeApiCall();
}, cancellationToken);
Where Used
EmbeddingService: Limits concurrent embedding batch requests (default 4)
StatusReporter
Location: Services/StatusReporter.cs
Purpose: Real-time progress display with spinner (compact) or verbose lines
Constructor
public StatusReporter(bool verbose)
verbose = true: all progress viaWriteLine()(no spinner)verbose = false: spinner with latest status
Architecture
Components:
Channel<string> _statusChannel- producer-consumer queueTask _statusProcessor- background task reading from channelCancellationTokenSource _spinnerCts- spinner task cancellationTask _spinnerTask- spinner animation taskchar[] _spinnerChars- Braille spinner pattern
Spinner Animation:
- Runs at 10 FPS (100ms interval)
- Cycles through
['⠋','⠙','⠹','⠸','⠼','⠴','⠦','⠧','⠇','⠏'] - Displays:
⠋ Fetching articles... - Updates in place using ANSI:
\r\x1b[K(carriage return + erase line)
Public Methods
UpdateStatus(string message)
- Fire-and-forget: writes to channel via
TryWrite(non-blocking) - If channel full, message dropped (acceptable loss for UI)
WriteLine(string text)
- Stops spinner temporarily
- Clears current status line
- Writes
textwith newline - In verbose mode: just
Console.WriteLine(text)
ClearStatus()
- In compact mode:
Console.Write("\r\x1b[K")(erase line) - In verbose: no-op
- Sets
_currentMessage = null
StartSpinner() / StopSpinner()
- Manual control (usually
StartSpinnerconstructor call,StopSpinnerbyDispose)
Dispose()
- Completes channel writer
- Awaits
_statusProcessorcompletion - Calls
StopSpinner()
Background Processing
Status Processor:
private async Task ProcessStatusUpdatesAsync()
{
await foreach (var message in _statusChannel.Reader.ReadAllAsync())
{
if (_verbose)
{
Console.WriteLine(message);
continue;
}
Console.Write("\r\x1b[K"); // Clear line
Console.Write($"{_spinnerChars[0]} {message}"); // Static spinner
_currentMessage = message;
}
}
Spinner Task:
_spinnerTask = Task.Run(async () =>
{
while (_spinnerCts is { Token.IsCancellationRequested: false })
{
if (_currentMessage != null)
{
Console.Write("\r\x1b[K");
var charIndex = index++ % spinner.Length;
Console.Write($"{spinner[charIndex]} {_currentMessage}");
}
await Task.Delay(100, _spinnerCts.Token);
}
});
Thread Safety
UpdateStatus(producer) writes to channelProcessStatusUpdatesAsync(consumer) reads from channel_spinnerTaskruns concurrently- All UI writes happen in consumer/spinner task context (single-threaded UI)
Design Notes
- Could be simplified: just use
Console.CursorLeftfor spinner, no channel - Channel allows random
UpdateStatuscalls from any thread without blocking - Braille spinner requires terminal that supports Unicode (most modern terminals do)
Service Interactions
Dependency Graph
OpenQueryApp
├── OpenRouterClient ← (used for query gen + final answer)
└── SearchTool
├── SearxngClient
├── ArticleService (uses SmartReader)
├── ChunkingService (static)
├── EmbeddingService
│ └── OpenRouterClient (different instance)
│ └── RateLimiter
└── ParallelProcessingOptions (config)
Service Lifetimes
All services are transient (new instance per query execution):
OpenRouterClient→ 1 instance for query gen + answerSearxngClient→ 1 instance for all searchesEmbeddingService→ 1 instance with its ownOpenRouterClientandRateLimiterSearchTool→ 1 instance per query (constructed inProgram.cs)
No singleton or static state (except static utility classes like ChunkingService).
Data Flow Through Services
OpenQueryApp
│
├─ OpenRouterClient.CompleteAsync() → query generation
│ Messages → JSON → HTTP request → response → JSON → Messages
│
└─ SearchTool.ExecuteAsync()
│
├─ SearxngClient.SearchAsync() × N
│ query → URL encode → GET → JSON → SearxngResult[]
│
├─ ArticleService.FetchArticleAsync() × M
│ URL → HTTP GET → SmartReader → Article
│
├─ ChunkingService.ChunkText() × M
│ Article.TextContent → List<string> chunks
│
├─ EmbeddingService.GetEmbeddingAsync(query) + GetEmbeddingsAsync(chunks[])
│ texts → batches → rate-limited HTTP POST → JSON → float[][]
│
├─ CosineSimilarity(queryEmbedding, chunkEmbedding) × M
│ Vectors → dot product → magnitude → score
│
└─ return context string (formatted chunks)
Next Steps
- OpenQueryApp - Orchestrates services
- SearchTool - Coordinates pipeline
- Models - Data structures passed between services
- API Reference - CLI that uses these services
Service Design Principles:
- Single Responsibility: Each service does one thing well
- Stateless: No instance state beyond constructor args
- Composable: Services depend on abstractions (other services) not implementations
- Testable: Can mock dependencies for unit testing