OpenQuery/docs/components/services.md
OpenQuery Documentation 65ca2401ae docs: add comprehensive documentation with README and detailed guides
2026-03-19 10:01:58 +01:00

# Services Overview
Comprehensive reference for all service classes in OpenQuery.
## 📋 Table of Contents
1. [Service Catalog](#service-catalog)
2. [Client Services](#client-services)
3. [Processing Services](#processing-services)
4. [Infrastructure Services](#infrastructure-services)
5. [Service Interactions](#service-interactions)
## Service Catalog
OpenQuery's services are organized into three categories:
| Category | Services | Purpose |
|-----------|----------|---------|
| **Clients** | `OpenRouterClient`, `SearxngClient` | External API communication |
| **Processors** | `EmbeddingService`, `ChunkingService`, `ArticleService` | Data transformation & extraction |
| **Infrastructure** | `RateLimiter`, `StatusReporter` | Cross-cutting concerns |
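Most services hold no per-operation state beyond their constructor configuration and can be safely reused across multiple operations. `RateLimiter` and `StatusReporter` are the exceptions: they hold internal synchronization state (a semaphore, and a channel plus spinner task, respectively).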
---
## Client Services
### OpenRouterClient
**Location**: `Services/OpenRouterClient.cs`
**Purpose**: HTTP client for OpenRouter AI APIs (chat completions & embeddings)
#### API Endpoints
| Method | Endpoint | Purpose |
|--------|----------|---------|
| POST | `/chat/completions` | Chat completion (streaming or non-streaming) |
| POST | `/embeddings` | Embedding generation for text inputs |
#### Authentication
```
Authorization: Bearer {apiKey}
Accept: application/json
```
#### Public Methods
##### `StreamAsync(ChatCompletionRequest request, CancellationToken cancellationToken)`
- **Returns**: `IAsyncEnumerable<StreamChunk>`
- **Behavior**: Sets `request.Stream = true`, posts, reads Server-Sent Events stream
- **Use Case**: Final answer streaming, real-time responses
- **Stream Format**: SSE lines `data: {json}`; yields `TextDelta` or `ToolCall`
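The `data: {json}` framing can be illustrated with a small line parser. This is a minimal sketch, not OpenQuery's actual code: `SseParser` and `TryGetPayload` are hypothetical names, and the `[DONE]` end-of-stream sentinel is an assumption based on OpenAI-compatible streaming APIs.

```csharp
using System;

// Usage: feed raw SSE lines through the parser; only the JSON payload is printed.
string[] lines =
{
    ": keep-alive comment",
    "data: {\"choices\":[{\"delta\":{\"content\":\"Hi\"}}]}",
    "",
    "data: [DONE]",
};

foreach (var line in lines)
{
    if (SseParser.TryGetPayload(line, out var json))
        Console.WriteLine(json);
}

// Hypothetical helper for illustration; not part of OpenQuery.
static class SseParser
{
    private const string Prefix = "data: ";

    // Returns true and the JSON payload for a "data: {json}" line.
    // Blank lines, comment lines, and the assumed "[DONE]" sentinel return false.
    public static bool TryGetPayload(string line, out string payload)
    {
        payload = string.Empty;
        if (!line.StartsWith(Prefix, StringComparison.Ordinal))
            return false;

        var body = line.Substring(Prefix.Length).Trim();
        if (body.Length == 0 || body == "[DONE]")
            return false;

        payload = body;
        return true;
    }
}
```

Each extracted payload would then be deserialized into a `StreamChunk` carrying either a `TextDelta` or a `ToolCall`.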
##### `CompleteAsync(ChatCompletionRequest request)`
- **Returns**: `Task<ChatCompletionResponse>`
- **Behavior**: Sets `request.Stream = false`, posts, returns full response
- **Use Case**: Query generation (non-streaming)
##### `EmbedAsync(string model, List<string> inputs)`
- **Returns**: `Task<float[][]>`
- **Behavior**: POST `/embeddings`, returns array of vectors (ordered by input index)
- **Use Case**: Batch embedding generation
##### `HttpClient`
- **Field**: Private `_httpClient` (created per instance; not part of the public API)
- **Note**: `IHttpClientFactory` could be used for connection pooling, but this is not needed for a CLI
#### Error Handling
- `EnsureSuccessStatusCode()` throws `HttpRequestException` on 4xx/5xx
- No retry logic (handled by `EmbeddingService`)
#### Configuration
```csharp
// Requires: using System.Net.Http; using System.Net.Http.Headers;
public OpenRouterClient(string apiKey)
{
    _apiKey = apiKey;
    _httpClient = new HttpClient();

    // Every request carries the bearer token and asks for JSON.
    _httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
    _httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
}
```
#### Example Usage
```csharp
var client = new OpenRouterClient("sk-or-...");
var request = new ChatCompletionRequest("model", new List<Message> { ... });
await foreach (var chunk in client.StreamAsync(request))
{
    Console.Write(chunk.TextDelta);
}
```
---
### SearxngClient
**Location**: `Services/SearxngClient.cs`
**Purpose**: HTTP client for SearxNG metasearch engine
#### API Endpoint
```
GET /search?q={query}&format=json
```
#### Constructor
```csharp
public SearxngClient(string baseUrl) // e.g., "http://localhost:8002"
```
- Any trailing `/` is trimmed from `baseUrl`
#### Public Methods
##### `SearchAsync(string query, int limit = 10)`
- **Returns**: `Task<List<SearxngResult>>`
- **Behavior**: GET request, deserialize JSON, take up to `limit` results
- **On Failure**: Returns empty `List<SearxngResult>` (no exception)
#### Error Handling
- The code never calls `response.EnsureSuccessStatusCode()`, so a non-success HTTP status does not throw by itself
- If the body is invalid JSON or `Results` is missing, an empty list is returned
- Failures are **tolerated**: individual search queries may fail without aborting the whole operation
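The tolerant behavior described above can be sketched as a parser that falls back to an empty list. This is a minimal illustration under stated assumptions: `TolerantSearch` and `SearchHit` are hypothetical stand-ins (the real `SearxngResult` model may differ), and only the JSON handling is shown, not the HTTP call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Usage: a valid body yields results, a broken body yields an empty list.
var ok = "{\"results\":[{\"title\":\"A\",\"url\":\"https://a.example\",\"content\":\"x\"}]}";
Console.WriteLine(TolerantSearch.ParseResults(ok, 10).Count);         // 1
Console.WriteLine(TolerantSearch.ParseResults("not json", 10).Count); // 0

// Hypothetical stand-in for SearxngResult.
record SearchHit(string Title, string Url, string Content);

static class TolerantSearch
{
    // Parses a SearxNG-style response body; any malformed input gives an empty list.
    public static List<SearchHit> ParseResults(string body, int limit)
    {
        try
        {
            using var doc = JsonDocument.Parse(body);
            if (doc.RootElement.ValueKind != JsonValueKind.Object ||
                !doc.RootElement.TryGetProperty("results", out var results) ||
                results.ValueKind != JsonValueKind.Array)
            {
                return new List<SearchHit>();
            }

            // ToList() materializes the hits before the document is disposed.
            return results.EnumerateArray()
                .Take(limit)
                .Select(r => new SearchHit(
                    GetString(r, "title"),
                    GetString(r, "url"),
                    GetString(r, "content")))
                .ToList();
        }
        catch (JsonException)
        {
            return new List<SearchHit>();
        }
    }

    // Reads a string property, returning "" when it is absent or not a string.
    private static string GetString(JsonElement element, string name) =>
        element.ValueKind == JsonValueKind.Object &&
        element.TryGetProperty(name, out var value) &&
        value.ValueKind == JsonValueKind.String
            ? value.GetString() ?? string.Empty
            : string.Empty;
}
```

Returning an empty list rather than throwing lets the caller merge results from several queries without special-casing the ones that failed.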
#### Example SearxNG Response
```json
{
  "results": [
    {
      "title": "Quantum Entanglement - Wikipedia",
      "url": "https://en.wikipedia.org/wiki/Quantum_entanglement",
      "content": "Quantum entanglement is a physical phenomenon..."
    },
    ...
  ]
}
```
---
## Processing Services
### EmbeddingService
**Location**: `Services/EmbeddingService.cs`
**Purpose**: Generate embeddings with batching, rate limiting, and retry logic
#### Configuration
**Embedding Model**: `openai/text-embedding-3-small` (default, configurable via constructor)
**ParallelProcessingOptions** (hardcoded defaults):
```csharp
public class ParallelProcessingOptions
{
    public int MaxConcurrentEmbeddingRequests { get; set; } = 4;
    public int EmbeddingBatchSize { get; set; } = 300;
}
```
#### Public Methods
##### `GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)`
- **Returns**: `Task<float[][]>`
- **Behavior**:
  - Splits `texts` into batches of `EmbeddingBatchSize`
  - Runs batches in parallel (at most `MaxConcurrentEmbeddingRequests` at a time)
  - Each batch: rate-limited, retry-wrapped `client.EmbedAsync(model, batch)`
  - Reassembles results in the original order
  - Failed batches → empty `float[]` for each text
- **Progress**: Invokes `onProgress` for each batch: `"[Generating embeddings: batch X/Y]"`
- **Thread-Safe**: Uses lock for collecting results
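The batching flow above can be sketched as follows. This is an illustrative sketch, not the service's code: `BatchSketch`, `EmbedAllAsync`, and the `embedBatch` delegate are hypothetical, retries are omitted, and results are written to disjoint array slots instead of being collected under a lock as the real service is described as doing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Usage with a fake embedder that returns one-dimensional vectors.
var texts = Enumerable.Range(0, 7).Select(i => $"text {i}").ToList();
var vectors = await BatchSketch.EmbedAllAsync(
    texts,
    batchSize: 3,
    maxConcurrent: 2,
    embedBatch: batch => Task.FromResult(batch.Select(t => new[] { (float)t.Length }).ToArray()));
Console.WriteLine(vectors.Length); // 7: one vector per input text

static class BatchSketch
{
    public static async Task<float[][]> EmbedAllAsync(
        IReadOnlyList<string> texts,
        int batchSize,
        int maxConcurrent,
        Func<IReadOnlyList<string>, Task<float[][]>> embedBatch,
        CancellationToken cancellationToken = default)
    {
        var results = new float[texts.Count][];
        using var gate = new SemaphoreSlim(maxConcurrent);
        var tasks = new List<Task>();

        // Split into consecutive batches; each remembers its offset in the input.
        for (var start = 0; start < texts.Count; start += batchSize)
        {
            var batch = texts.Skip(start).Take(batchSize).ToList();
            tasks.Add(RunBatchAsync(start, batch));
        }

        await Task.WhenAll(tasks);
        return results;

        async Task RunBatchAsync(int offset, List<string> batch)
        {
            await gate.WaitAsync(cancellationToken); // cap concurrent batches
            try
            {
                float[][] batchVectors;
                try
                {
                    batchVectors = await embedBatch(batch);
                }
                catch (Exception) when (!cancellationToken.IsCancellationRequested)
                {
                    // Failed batch: one empty vector per text.
                    batchVectors = batch.Select(_ => Array.Empty<float>()).ToArray();
                }

                // Writing at offset + i restores the original input order.
                for (var i = 0; i < batch.Count; i++)
                    results[offset + i] = i < batchVectors.Length ? batchVectors[i] : Array.Empty<float>();
            }
            finally
            {
                gate.Release();
            }
        }
    }
}
```

Because failed batches come back as empty vectors, callers should skip zero-length embeddings before computing similarities.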
##### `GetEmbeddingAsync(string text, CancellationToken)`
- **Returns**: `Task<float[]>`
- **Behavior**: Single embedding with rate limiting and retry
- **Use Case**: Query embedding
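The document does not specify the retry policy, so the following is only a sketch of one common approach (exponential backoff on `HttpRequestException`). `RetrySketch`, `WithRetryAsync`, the attempt count, and the delays are all assumptions for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Usage: the action fails twice with a transient error, then succeeds.
var attempts = 0;
var value = await RetrySketch.WithRetryAsync(() =>
{
    attempts++;
    if (attempts < 3) throw new HttpRequestException("transient failure");
    return Task.FromResult(42);
}, maxAttempts: 3, baseDelay: TimeSpan.FromMilliseconds(10));
Console.WriteLine($"{value} after {attempts} attempts"); // 42 after 3 attempts

// Hypothetical helper; the real retry policy in EmbeddingService may differ.
static class RetrySketch
{
    public static async Task<T> WithRetryAsync<T>(
        Func<Task<T>> action,
        int maxAttempts,
        TimeSpan baseDelay,
        CancellationToken cancellationToken = default)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await action();
            }
            catch (HttpRequestException) when (attempt < maxAttempts)
            {
                // Exponential backoff: baseDelay, 2x, 4x, ...
                // On the final attempt the filter is false and the exception propagates.
                var delay = TimeSpan.FromMilliseconds(
                    baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1));
                await Task.Delay(delay, cancellationToken);
            }
        }
    }
}
```

In the service, such a wrapper would sit inside the rate-limited section so that each retry still counts against the concurrency limit.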
##### `CosineSimilarity(float[] vector1, float[] vector2)`
- **Signature**: `static float CosineSimilarity(float[] vector1, float[] vector2)`
- **Behavior**: Uses `System.Numerics.Tensors.TensorPrimitives.CosineSimilarity`
- **Returns**: `float` between -1 and 1 (in practice typically 0-1 for text embeddings)
- **Implementation**: Single line calling SIMD-accelerated tensor primitive
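For reference, the quantity being computed is the dot product divided by the product of the vector magnitudes. The scalar loop below is a hand-written sketch of that formula, not the SIMD-accelerated primitive the service calls; `CosineSketch` is a hypothetical name.

```csharp
using System;

Console.WriteLine(CosineSketch.Similarity(new[] { 1f, 0f }, new[] { 1f, 0f })); // 1 (same direction)
Console.WriteLine(CosineSketch.Similarity(new[] { 1f, 0f }, new[] { 0f, 1f })); // 0 (orthogonal)

static class CosineSketch
{
    // cos(a, b) = (a · b) / (|a| * |b|)
    public static float Similarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        float dot = 0f, normA = 0f, normB = 0f;
        for (var i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // In this sketch, zero-length or all-zero vectors give NaN (0 / 0).
        return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
    }
}
```

Cosine similarity ignores vector length, so scaling either embedding by a positive factor leaves the score unchanged.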
---
### ArticleService
**Location**: `Services/ArticleService.cs`
**Purpose**: Extract clean article content from web URLs
#### Public Methods
##### `FetchArticleAsync(string url)`
- **Returns**: `Task<Article>`
- **Behavior**: Delegates to `SmartReader.ParseArticleAsync(url)`
- **Result**: `Article` with `Title`, `TextContent`, `IsReadable`, and metadata
#### Errors
- Propagates exceptions (SmartReader may throw on network failures, malformed HTML)
- `SearchTool` catches and logs
#### SmartReader Notes
- Open-source article extraction library (bundled via NuGet)
- Uses Readability algorithm (similar to Firefox Reader View)
- Removes ads, navigation, boilerplate
- `IsReadable` indicates quality (e.g., not a 404 page, not too short)
---
### ChunkingService
**Location**: `Services/ChunkingService.cs`
**Purpose**: Split text into chunks of at most 500 characters, breaking at natural boundaries
#### Public Methods
##### `ChunkText(string text)`
- **Returns**: `List<string>`
- **Algorithm**:
  - Constant `MAX_CHUNK_SIZE = 500`
  - While text remains:
    - Take up to 500 chars
    - If not at end, backtrack to the last occurrence of one of `[' ', '\n', '\r', '.', '!']`
    - Trim, add if non-empty
    - Advance start
  - Returns all chunks
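The steps above can be sketched as follows. This is one possible reading of the algorithm, not the actual `ChunkingService` source: `ChunkSketch` is a hypothetical name, the chunk size is a parameter so the example stays small, and keeping the break character with the preceding chunk is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Usage: a 21-character string with a 12-character limit splits at the space.
var chunks = ChunkSketch.ChunkText(new string('a', 10) + " " + new string('b', 10), maxChunkSize: 12);
foreach (var chunk in chunks)
    Console.WriteLine(chunk); // prints "aaaaaaaaaa" then "bbbbbbbbbb"

static class ChunkSketch
{
    private static readonly char[] Breaks = { ' ', '\n', '\r', '.', '!' };

    public static List<string> ChunkText(string text, int maxChunkSize = 500)
    {
        if (maxChunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxChunkSize));

        var chunks = new List<string>();
        var start = 0;
        while (start < text.Length)
        {
            var length = Math.Min(maxChunkSize, text.Length - start);

            if (start + length < text.Length)
            {
                // Not at the end: search backwards inside the window for a break.
                var lastBreak = text.LastIndexOfAny(Breaks, start + length - 1, length);

                // Only backtrack if that still makes progress; text without
                // breaks is cut at the hard limit instead.
                if (lastBreak > start)
                    length = lastBreak - start + 1;
            }

            var chunk = text.Substring(start, length).Trim();
            if (chunk.Length > 0)
                chunks.Add(chunk);

            start += length;
        }
        return chunks;
    }
}
```

The `lastBreak > start` guard is what keeps the loop from stalling on text with no break characters.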
#### Characteristics
- Static class (no instances)
- Pure function (no side effects)
- Zero dependencies
- Handles edge cases (empty text, short text, text without breaks)
---
## Infrastructure Services
### RateLimiter
**Location**: `Services/RateLimiter.cs`
**Purpose**: Limit concurrent operations using semaphore
#### Constructor
```csharp
public RateLimiter(int maxConcurrentRequests)
```
Creates `SemaphoreSlim` with `maxConcurrentRequests`
#### Public Methods
##### `ExecuteAsync<T>(Func<Task<T>> action, CancellationToken)`
```csharp
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken cancellationToken = default)
{
    await _semaphore.WaitAsync(cancellationToken);
    try
    {
        return await action();
    }
    finally
    {
        _semaphore.Release();
    }
}
```
- Waits for semaphore slot
- Executes `action` (typically an API call)
- Releases semaphore (even if exception)
- Returns result from `action`
##### `ExecuteAsync(Func<Task> action, CancellationToken)`
- Non-generic version (for void-returning actions)
#### Disposal
```csharp
public async ValueTask DisposeAsync()
{
    _semaphore.Dispose();
}
```
Implements `IAsyncDisposable` for async cleanup
#### Usage Pattern
```csharp
var result = await _rateLimiter.ExecuteAsync(async () =>
{
    return await SomeApiCall();
}, cancellationToken);
```
#### Where Used
- `EmbeddingService`: Limits concurrent embedding batch requests (default 4)
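To make the effect of the semaphore concrete, the sketch below measures the peak number of jobs running at once. It uses `SemaphoreSlim` directly instead of the `RateLimiter` class, and `LimiterDemo` and `MeasurePeakAsync` are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Usage: ten jobs behind a limit of three; the peak never exceeds 3.
var peak = await LimiterDemo.MeasurePeakAsync(limit: 3, jobs: 10);
Console.WriteLine($"peak concurrency: {peak}");

static class LimiterDemo
{
    public static async Task<int> MeasurePeakAsync(int limit, int jobs)
    {
        using var semaphore = new SemaphoreSlim(limit);
        var gate = new object();
        int running = 0, peak = 0;

        async Task RunAsync()
        {
            await semaphore.WaitAsync(); // wait for a free slot
            try
            {
                lock (gate)
                {
                    running++;
                    peak = Math.Max(peak, running);
                }

                await Task.Delay(20); // stand-in for an API call

                lock (gate)
                {
                    running--;
                }
            }
            finally
            {
                semaphore.Release(); // always free the slot
            }
        }

        await Task.WhenAll(Enumerable.Range(0, jobs).Select(_ => RunAsync()));
        return peak;
    }
}
```

With the default `MaxConcurrentEmbeddingRequests` of 4, the same pattern keeps at most four embedding requests in flight.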
---
### StatusReporter
**Location**: `Services/StatusReporter.cs`
**Purpose**: Real-time progress display with spinner (compact) or verbose lines
#### Constructor
```csharp
public StatusReporter(bool verbose)
```
- `verbose = true`: all progress via `WriteLine()` (no spinner)
- `verbose = false`: spinner with latest status
#### Architecture
**Components**:
- `Channel<string> _statusChannel` - producer-consumer queue
- `Task _statusProcessor` - background task reading from channel
- `CancellationTokenSource _spinnerCts` - spinner task cancellation
- `Task _spinnerTask` - spinner animation task
- `char[] _spinnerChars` - Braille spinner pattern
**Spinner Animation**:
- Runs at 10 FPS (100ms interval)
- Cycles through `['⠋','⠙','⠹','⠸','⠼','⠴','⠦','⠧','⠇','⠏']`
- Displays: `⠋ Fetching articles...`
- Updates in place using ANSI: `\r\x1b[K` (carriage return + erase line)
#### Public Methods
##### `UpdateStatus(string message)`
- Fire-and-forget: writes to channel via `TryWrite` (non-blocking)
- If channel full, message dropped (acceptable loss for UI)
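The drop-on-full behavior follows from how `TryWrite` works on a bounded channel. The sketch below assumes a bounded channel with a capacity of 2 purely for illustration; the actual channel type and capacity used by `StatusReporter` are not stated in this document.

```csharp
using System;
using System.Threading.Channels;

// Assumed configuration: bounded channel, default full mode (Wait).
var channel = Channel.CreateBounded<string>(capacity: 2);

// TryWrite never blocks: it returns false when there is no room.
Console.WriteLine(channel.Writer.TryWrite("first"));  // True
Console.WriteLine(channel.Writer.TryWrite("second")); // True
Console.WriteLine(channel.Writer.TryWrite("third"));  // False: channel full, message dropped

// Reading one item frees a slot, so the next write is accepted again.
channel.Reader.TryRead(out var oldest);
Console.WriteLine(oldest);                            // first
Console.WriteLine(channel.Writer.TryWrite("fourth")); // True
```

With an unbounded channel, `TryWrite` only returns false once the writer has been completed, so no status messages would be dropped while the reporter is running.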
##### `WriteLine(string text)`
- Stops spinner temporarily
- Clears current status line
- Writes `text` with newline
- In verbose mode: just `Console.WriteLine(text)`
##### `ClearStatus()`
- In compact mode: `Console.Write("\r\x1b[K")` (erase line)
- In verbose: no-op
- Sets `_currentMessage = null`
##### `StartSpinner()` / `StopSpinner()`
- Manual control (`StartSpinner` is usually called by the constructor, `StopSpinner` by `Dispose`)
##### `Dispose()`
- Completes channel writer
- Awaits `_statusProcessor` completion
- Calls `StopSpinner()`
#### Background Processing
**Status Processor**:
```csharp
private async Task ProcessStatusUpdatesAsync()
{
    await foreach (var message in _statusChannel.Reader.ReadAllAsync())
    {
        if (_verbose)
        {
            Console.WriteLine(message);
            continue;
        }
        Console.Write("\r\x1b[K"); // Clear line
        Console.Write($"{_spinnerChars[0]} {message}"); // Static spinner
        _currentMessage = message;
    }
}
```
**Spinner Task**:
```csharp
_spinnerTask = Task.Run(async () =>
{
    while (_spinnerCts is { Token.IsCancellationRequested: false })
    {
        if (_currentMessage != null)
        {
            Console.Write("\r\x1b[K");
            var charIndex = index++ % spinner.Length;
            Console.Write($"{spinner[charIndex]} {_currentMessage}");
        }
        await Task.Delay(100, _spinnerCts.Token);
    }
});
```
#### Thread Safety
- `UpdateStatus` (producer) writes to channel
- `ProcessStatusUpdatesAsync` (consumer) reads from channel
- `_spinnerTask` runs concurrently
- Status-line writes triggered by `UpdateStatus` happen in the consumer or spinner task, never on the calling thread
#### Design Notes
- Could be simplified: just use `Console.CursorLeft` for spinner, no channel
- The channel allows arbitrary `UpdateStatus` calls from any thread without blocking
- Braille spinner requires terminal that supports Unicode (most modern terminals do)
---
## Service Interactions
### Dependency Graph
```
OpenQueryApp
├── OpenRouterClient ← (used for query gen + final answer)
└── SearchTool
    ├── SearxngClient
    ├── ArticleService (uses SmartReader)
    ├── ChunkingService (static)
    ├── EmbeddingService
    │   ├── OpenRouterClient (different instance)
    │   └── RateLimiter
    └── ParallelProcessingOptions (config)
```
### Service Lifetimes
All services are **transient** (new instance per query execution):
- `OpenRouterClient` → 1 instance for query gen + answer
- `SearxngClient` → 1 instance for all searches
- `EmbeddingService` → 1 instance with its own `OpenRouterClient` and `RateLimiter`
- `SearchTool` → 1 instance per query (constructed in `Program.cs`)
No singleton or static state (except static utility classes like `ChunkingService`).
### Data Flow Through Services
```
OpenQueryApp
├─ OpenRouterClient.CompleteAsync() → query generation
│    Messages → JSON → HTTP request → response → JSON → Messages
└─ SearchTool.ExecuteAsync()
   ├─ SearxngClient.SearchAsync() × N
   │    query → URL encode → GET → JSON → SearxngResult[]
   ├─ ArticleService.FetchArticleAsync() × M
   │    URL → HTTP GET → SmartReader → Article
   ├─ ChunkingService.ChunkText() × M
   │    Article.TextContent → List<string> chunks
   ├─ EmbeddingService.GetEmbeddingAsync(query) + GetEmbeddingsAsync(chunks[])
   │    texts → batches → rate-limited HTTP POST → JSON → float[][]
   ├─ CosineSimilarity(queryEmbedding, chunkEmbedding) × M
   │    Vectors → dot product → magnitude → score
   └─ return context string (formatted chunks)
```
---
## Next Steps
- **[OpenQueryApp](../components/openquery-app.md)** - Orchestrates services
- **[SearchTool](../components/search-tool.md)** - Coordinates pipeline
- **[Models](../components/models.md)** - Data structures passed between services
- **[API Reference](../api/cli.md)** - CLI that uses these services
---
**Service Design Principles**:
- Single Responsibility: Each service does one thing well
- Stateless where possible: Most services keep no instance state beyond constructor args
- Composable: Services depend on abstractions (other services) not implementations
- Testable: Can mock dependencies for unit testing