# Services Overview Comprehensive reference for all service classes in OpenQuery. ## 📋 Table of Contents 1. [Service Catalog](#service-catalog) 2. [Client Services](#client-services) 3. [Processing Services](#processing-services) 4. [Infrastructure Services](#infrastructure-services) 5. [Service Interactions](#service-interactions) ## Service Catalog OpenQuery's services are organized into three categories: | Category | Services | Purpose | |-----------|----------|---------| | **Clients** | `OpenRouterClient`, `SearxngClient` | External API communication | | **Processors** | `EmbeddingService`, `ChunkingService`, `ArticleService` | Data transformation & extraction | | **Infrastructure** | `RateLimiter`, `StatusReporter` | Cross-cutting concerns | All services are **stateless** (except for internal configuration) and can be safely reused across multiple operations. --- ## Client Services ### OpenRouterClient **Location**: `Services/OpenRouterClient.cs` **Purpose**: HTTP client for OpenRouter AI APIs (chat completions & embeddings) #### API Endpoints | Method | Endpoint | Purpose | |--------|----------|---------| | POST | `/chat/completions` | Chat completion (streaming or non-streaming) | | POST | `/embeddings` | Embedding generation for text inputs | #### Authentication ``` Authorization: Bearer {apiKey} Accept: application/json ``` #### Public Methods ##### `StreamAsync(ChatCompletionRequest request, CancellationToken cancellationToken)` - **Returns**: `IAsyncEnumerable` - **Behavior**: Sets `request.Stream = true`, posts, reads Server-Sent Events stream - **Use Case**: Final answer streaming, real-time responses - **Stream Format**: SSE lines `data: {json}`; yields `TextDelta` or `ToolCall` ##### `CompleteAsync(ChatCompletionRequest request)` - **Returns**: `Task` - **Behavior**: Sets `request.Stream = false`, posts, returns full response - **Use Case**: Query generation (non-streaming) ##### `EmbedAsync(string model, List inputs)` - **Returns**: `Task` - **Behavior**: POST `/embeddings`, returns array of vectors (ordered by input index) - **Use Case**: Batch embedding generation ##### `HttpClient` - **Property**: Internal `_httpClient` (created per instance) - **Note**: Could use `IHttpClientFactory` for pooling (not needed for CLI) #### Error Handling - `EnsureSuccessStatusCode()` throws `HttpRequestException` on 4xx/5xx - No retry logic (handled by `EmbeddingService`) #### Configuration ```csharp public OpenRouterClient(string apiKey) { _apiKey = apiKey; _httpClient = new HttpClient(); _httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey); _httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json")); } ``` #### Example Usage ```csharp var client = new OpenRouterClient("sk-or-..."); var request = new ChatCompletionRequest("model", new List { ... }); await foreach (var chunk in client.StreamAsync(request)) { Console.Write(chunk.TextDelta); } ``` --- ### SearxngClient **Location**: `Services/SearxngClient.cs` **Purpose**: HTTP client for SearxNG metasearch engine #### API Endpoint ``` GET /search?q={query}&format=json ``` #### Constructor ```csharp public SearxngClient(string baseUrl) // e.g., "http://localhost:8002" ``` - `baseUrl` trimmed of trailing `/` #### Public Methods ##### `SearchAsync(string query, int limit = 10)` - **Returns**: `Task>` - **Behavior**: GET request, deserialize JSON, take up to `limit` results - **On Failure**: Returns empty `List` (no exception) #### Error Handling - `response.EnsureSuccessStatusCode()` would throw, but code doesn't call it - If invalid JSON or missing `Results`, returns empty list - Failures are **tolerated** - individual search queries may fail without aborting whole operation #### Example Searxng Response ```json { "results": [ { "title": "Quantum Entanglement - Wikipedia", "url": "https://en.wikipedia.org/wiki/Quantum_entanglement", "content": "Quantum entanglement is a physical phenomenon..." }, ... ] } ``` --- ## Processing Services ### EmbeddingService **Location**: `Services/EmbeddingService.cs` **Purpose**: Generate embeddings with batching, rate limiting, and retry logic #### Configuration **Embedding Model**: `openai/text-embedding-3-small` (default, configurable via constructor) **ParallelProcessingOptions** (hardcoded defaults): ```csharp public class ParallelProcessingOptions { public int MaxConcurrentEmbeddingRequests { get; set; } = 4; public int EmbeddingBatchSize { get; set; } = 300; } ``` #### Public Methods ##### `GetEmbeddingsAsync(List texts, Action? onProgress, CancellationToken)` - **Returns**: `Task` - **Behavior**: - Splits `texts` into batches of `EmbeddingBatchSize` - Parallel executes batches (max `MaxConcurrentEmbeddingRequests` concurrent) - Each batch: rate-limited, retry-wrapped `client.EmbedAsync(model, batch)` - Reassembles in original order - Failed batches → empty `float[]` for each text - **Progress**: Invokes `onProgress` for each batch: `"[Generating embeddings: batch X/Y]"` - **Thread-Safe**: Uses lock for collecting results ##### `GetEmbeddingAsync(string text, CancellationToken)` - **Returns**: `Task` - **Behavior**: Single embedding with rate limiting and retry - **Use Case**: Query embedding ##### `Cos static float CosineSimilarity(float[] vector1, float[] vector2) ``` Uses `System.Numerics.Tensors.TensorPrimitives.CosineSimilarity` Returns float between -1 and 1 (typically 0-1 for normalized embeddings) ``` **Implementation**: Single line calling SIMD-accelerated tensor primitive --- ### ArticleService **Location**: `Services/ArticleService.cs` **Purpose**: Extract clean article content from web URLs #### Public Methods ##### `FetchArticleAsync(string url)` - **Returns**: `Task
` - **Behavior**: Delegates to `SmartReader.ParseArticleAsync(url)` - **Result**: `Article` with `Title`, `TextContent`, `IsReadable`, and metadata #### Errors - Propagates exceptions (SmartReader may throw on network failures, malformed HTML) - `SearchTool` catches and logs #### SmartReader Notes - Open-source article extraction library (bundled via NuGet) - Uses Readability algorithm (similar to Firefox Reader View) - Removes ads, navigation, boilerplate - `IsReadable` indicates quality (e.g., not a 404 page, not too short) --- ### ChunkingService **Location**: `Services/ChunkingService.cs` **Purpose**: Split text into 500-character chunks at natural boundaries #### Public Methods ##### `ChunkText(string text)` - **Returns**: `List` - **Algorithm**: - Constant `MAX_CHUNK_SIZE = 500` - While remaining text: - Take up to 500 chars - If not at end, backtrack to last `[' ', '\n', '\r', '.', '!']` - Trim, add if non-empty - Advance start - Returns all chunks #### Characteristics - Static class (no instances) - Pure function (no side effects) - Zero dependencies - Handles edge cases (empty text, short text, text without breaks) --- ## Infrastructure Services ### RateLimiter **Location**: `Services/RateLimiter.cs` **Purpose**: Limit concurrent operations using semaphore #### Constructor ```csharp public RateLimiter(int maxConcurrentRequests) ``` Creates `SemaphoreSlim` with `maxConcurrentRequests` #### Public Methods ##### `ExecuteAsync(Func> action, CancellationToken)` ```csharp public async Task ExecuteAsync(Func> action, CancellationToken cancellationToken = default) { await _semaphore.WaitAsync(cancellationToken); try { return await action(); } finally { _semaphore.Release(); } } ``` - Waits for semaphore slot - Executes `action` (typically an API call) - Releases semaphore (even if exception) - Returns result from `action` ##### `ExecuteAsync(Func action, CancellationToken)` - Non-generic version (for void-returning actions) #### Disposal ```csharp public async ValueTask DisposeAsync() { _semaphore.Dispose(); } ``` Implements `IAsyncDisposable` for async cleanup #### Usage Pattern ```csharp var result = await _rateLimiter.ExecuteAsync(async () => { return await SomeApiCall(); }, cancellationToken); ``` #### Where Used - `EmbeddingService`: Limits concurrent embedding batch requests (default 4) --- ### StatusReporter **Location**: `Services/StatusReporter.cs` **Purpose**: Real-time progress display with spinner (compact) or verbose lines #### Constructor ```csharp public StatusReporter(bool verbose) ``` - `verbose = true`: all progress via `WriteLine()` (no spinner) - `verbose = false`: spinner with latest status #### Architecture **Components**: - `Channel _statusChannel` - producer-consumer queue - `Task _statusProcessor` - background task reading from channel - `CancellationTokenSource _spinnerCts` - spinner task cancellation - `Task _spinnerTask` - spinner animation task - `char[] _spinnerChars` - Braille spinner pattern **Spinner Animation**: - Runs at 10 FPS (100ms interval) - Cycles through `['⠋','⠙','⠹','⠸','⠼','⠴','⠦','⠧','⠇','⠏']` - Displays: `⠋ Fetching articles...` - Updates in place using ANSI: `\r\x1b[K` (carriage return + erase line) #### Public Methods ##### `UpdateStatus(string message)` - Fire-and-forget: writes to channel via `TryWrite` (non-blocking) - If channel full, message dropped (acceptable loss for UI) ##### `WriteLine(string text)` - Stops spinner temporarily - Clears current status line - Writes `text` with newline - In verbose mode: just `Console.WriteLine(text)` ##### `ClearStatus()` - In compact mode: `Console.Write("\r\x1b[K")` (erase line) - In verbose: no-op - Sets `_currentMessage = null` ##### `StartSpinner()` / `StopSpinner()` - Manual control (usually `StartSpinner` constructor call, `StopSpinner` by `Dispose`) ##### `Dispose()` - Completes channel writer - Awaits `_statusProcessor` completion - Calls `StopSpinner()` #### Background Processing **Status Processor**: ```csharp private async Task ProcessStatusUpdatesAsync() { await foreach (var message in _statusChannel.Reader.ReadAllAsync()) { if (_verbose) { Console.WriteLine(message); continue; } Console.Write("\r\x1b[K"); // Clear line Console.Write($"{_spinnerChars[0]} {message}"); // Static spinner _currentMessage = message; } } ``` **Spinner Task**: ```csharp _spinnerTask = Task.Run(async () => { while (_spinnerCts is { Token.IsCancellationRequested: false }) { if (_currentMessage != null) { Console.Write("\r\x1b[K"); var charIndex = index++ % spinner.Length; Console.Write($"{spinner[charIndex]} {_currentMessage}"); } await Task.Delay(100, _spinnerCts.Token); } }); ``` #### Thread Safety - `UpdateStatus` (producer) writes to channel - `ProcessStatusUpdatesAsync` (consumer) reads from channel - `_spinnerTask` runs concurrently - All UI writes happen in consumer/spinner task context (single-threaded UI) #### Design Notes - Could be simplified: just use `Console.CursorLeft` for spinner, no channel - Channel allows random `UpdateStatus` calls from any thread without blocking - Braille spinner requires terminal that supports Unicode (most modern terminals do) --- ## Service Interactions ### Dependency Graph ``` OpenQueryApp ├── OpenRouterClient ← (used for query gen + final answer) └── SearchTool ├── SearxngClient ├── ArticleService (uses SmartReader) ├── ChunkingService (static) ├── EmbeddingService │ └── OpenRouterClient (different instance) │ └── RateLimiter └── ParallelProcessingOptions (config) ``` ### Service Lifetimes All services are **transient** (new instance per query execution): - `OpenRouterClient` → 1 instance for query gen + answer - `SearxngClient` → 1 instance for all searches - `EmbeddingService` → 1 instance with its own `OpenRouterClient` and `RateLimiter` - `SearchTool` → 1 instance per query (constructed in `Program.cs`) No singleton or static state (except static utility classes like `ChunkingService`). ### Data Flow Through Services ``` OpenQueryApp │ ├─ OpenRouterClient.CompleteAsync() → query generation │ Messages → JSON → HTTP request → response → JSON → Messages │ └─ SearchTool.ExecuteAsync() │ ├─ SearxngClient.SearchAsync() × N │ query → URL encode → GET → JSON → SearxngResult[] │ ├─ ArticleService.FetchArticleAsync() × M │ URL → HTTP GET → SmartReader → Article │ ├─ ChunkingService.ChunkText() × M │ Article.TextContent → List chunks │ ├─ EmbeddingService.GetEmbeddingAsync(query) + GetEmbeddingsAsync(chunks[]) │ texts → batches → rate-limited HTTP POST → JSON → float[][] │ ├─ CosineSimilarity(queryEmbedding, chunkEmbedding) × M │ Vectors → dot product → magnitude → score │ └─ return context string (formatted chunks) ``` --- ## Next Steps - **[OpenQueryApp](../components/openquery-app.md)** - Orchestrates services - **[SearchTool](../components/search-tool.md)** - Coordinates pipeline - **[Models](../components/models.md)** - Data structures passed between services - **[API Reference](../../api/cli.md)** - CLI that uses these services --- **Service Design Principles**: - Single Responsibility: Each service does one thing well - Stateless: No instance state beyond constructor args - Composable: Services depend on abstractions (other services) not implementations - Testable: Can mock dependencies for unit testing