- Add user-friendly README.md with quick start guide - Create docs/ folder with structured technical documentation: - installation.md: Build and setup instructions - configuration.md: Complete config reference - usage.md: CLI usage guide with examples - architecture.md: System design and patterns - components/: Deep dive into each component (OpenQueryApp, SearchTool, Services, Models) - api/: CLI reference, environment variables, programmatic API - troubleshooting.md: Common issues and solutions - performance.md: Latency, throughput, and optimization - All documentation fully cross-referenced with internal links - Covers project overview, architecture, components, APIs, and support See individual files for complete documentation.
472 lines
14 KiB
Markdown
472 lines
14 KiB
Markdown
# Services Overview
|
||
|
||
Comprehensive reference for all service classes in OpenQuery.
|
||
|
||
## 📋 Table of Contents
|
||
|
||
1. [Service Catalog](#service-catalog)
|
||
2. [Client Services](#client-services)
|
||
3. [Processing Services](#processing-services)
|
||
4. [Infrastructure Services](#infrastructure-services)
|
||
5. [Service Interactions](#service-interactions)
|
||
|
||
## Service Catalog
|
||
|
||
OpenQuery's services are organized into three categories:
|
||
|
||
| Category | Services | Purpose |
|
||
|-----------|----------|---------|
|
||
| **Clients** | `OpenRouterClient`, `SearxngClient` | External API communication |
|
||
| **Processors** | `EmbeddingService`, `ChunkingService`, `ArticleService` | Data transformation & extraction |
|
||
| **Infrastructure** | `RateLimiter`, `StatusReporter` | Cross-cutting concerns |
|
||
|
||
All services are **stateless** (except for internal configuration) and can be safely reused across multiple operations.
|
||
|
||
---
|
||
|
||
## Client Services
|
||
|
||
### OpenRouterClient
|
||
|
||
**Location**: `Services/OpenRouterClient.cs`
|
||
**Purpose**: HTTP client for OpenRouter AI APIs (chat completions & embeddings)
|
||
|
||
#### API Endpoints
|
||
|
||
| Method | Endpoint | Purpose |
|
||
|--------|----------|---------|
|
||
| POST | `/chat/completions` | Chat completion (streaming or non-streaming) |
|
||
| POST | `/embeddings` | Embedding generation for text inputs |
|
||
|
||
#### Authentication
|
||
```
|
||
Authorization: Bearer {apiKey}
|
||
Accept: application/json
|
||
```
|
||
|
||
#### Public Methods
|
||
|
||
##### `StreamAsync(ChatCompletionRequest request, CancellationToken cancellationToken)`
|
||
- **Returns**: `IAsyncEnumerable<StreamChunk>`
|
||
- **Behavior**: Sets `request.Stream = true`, posts, reads Server-Sent Events stream
|
||
- **Use Case**: Final answer streaming, real-time responses
|
||
- **Stream Format**: SSE lines `data: {json}`; yields `TextDelta` or `ToolCall`
|
||
|
||
##### `CompleteAsync(ChatCompletionRequest request)`
|
||
- **Returns**: `Task<ChatCompletionResponse>`
|
||
- **Behavior**: Sets `request.Stream = false`, posts, returns full response
|
||
- **Use Case**: Query generation (non-streaming)
|
||
|
||
##### `EmbedAsync(string model, List<string> inputs)`
|
||
- **Returns**: `Task<float[][]>`
|
||
- **Behavior**: POST `/embeddings`, returns array of vectors (ordered by input index)
|
||
- **Use Case**: Batch embedding generation
|
||
|
||
##### `HttpClient`
|
||
- **Property**: Internal `_httpClient` (created per instance)
|
||
- **Note**: Could use `IHttpClientFactory` for pooling (not needed for CLI)
|
||
|
||
#### Error Handling
|
||
- `EnsureSuccessStatusCode()` throws `HttpRequestException` on 4xx/5xx
|
||
- No retry logic (handled by `EmbeddingService`)
|
||
|
||
#### Configuration
|
||
```csharp
|
||
public OpenRouterClient(string apiKey)
|
||
{
|
||
_apiKey = apiKey;
|
||
_httpClient = new HttpClient();
|
||
_httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
|
||
_httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
|
||
}
|
||
```
|
||
|
||
#### Example Usage
|
||
```csharp
|
||
var client = new OpenRouterClient("sk-or-...");
|
||
var request = new ChatCompletionRequest("model", new List<Message> { ... });
|
||
await foreach (var chunk in client.StreamAsync(request))
|
||
{
|
||
Console.Write(chunk.TextDelta);
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
### SearxngClient
|
||
|
||
**Location**: `Services/SearxngClient.cs`
|
||
**Purpose**: HTTP client for SearxNG metasearch engine
|
||
|
||
#### API Endpoint
|
||
```
|
||
GET /search?q={query}&format=json
|
||
```
|
||
|
||
#### Constructor
|
||
```csharp
|
||
public SearxngClient(string baseUrl) // e.g., "http://localhost:8002"
|
||
```
|
||
- `baseUrl` trimmed of trailing `/`
|
||
|
||
#### Public Methods
|
||
|
||
##### `SearchAsync(string query, int limit = 10)`
|
||
- **Returns**: `Task<List<SearxngResult>>`
|
||
- **Behavior**: GET request, deserialize JSON, take up to `limit` results
|
||
- **On Failure**: Returns empty `List<SearxngResult>` (no exception)
|
||
|
||
#### Error Handling
|
||
- `response.EnsureSuccessStatusCode()` would throw, but code doesn't call it
|
||
- If invalid JSON or missing `Results`, returns empty list
|
||
- Failures are **tolerated** - individual search queries may fail without aborting whole operation
|
||
|
||
#### Example Searxng Response
|
||
```json
|
||
{
|
||
"results": [
|
||
{
|
||
"title": "Quantum Entanglement - Wikipedia",
|
||
"url": "https://en.wikipedia.org/wiki/Quantum_entanglement",
|
||
"content": "Quantum entanglement is a physical phenomenon..."
|
||
},
|
||
...
|
||
]
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
## Processing Services
|
||
|
||
### EmbeddingService
|
||
|
||
**Location**: `Services/EmbeddingService.cs`
|
||
**Purpose**: Generate embeddings with batching, rate limiting, and retry logic
|
||
|
||
#### Configuration
|
||
|
||
**Embedding Model**: `openai/text-embedding-3-small` (default, configurable via constructor)
|
||
|
||
**ParallelProcessingOptions** (hardcoded defaults):
|
||
```csharp
|
||
public class ParallelProcessingOptions
|
||
{
|
||
public int MaxConcurrentEmbeddingRequests { get; set; } = 4;
|
||
public int EmbeddingBatchSize { get; set; } = 300;
|
||
}
|
||
```
|
||
|
||
#### Public Methods
|
||
|
||
##### `GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)`
|
||
- **Returns**: `Task<float[][]>`
|
||
- **Behavior**:
|
||
- Splits `texts` into batches of `EmbeddingBatchSize`
|
||
- Parallel executes batches (max `MaxConcurrentEmbeddingRequests` concurrent)
|
||
- Each batch: rate-limited, retry-wrapped `client.EmbedAsync(model, batch)`
|
||
- Reassembles in original order
|
||
- Failed batches → empty `float[]` for each text
|
||
- **Progress**: Invokes `onProgress` for each batch: `"[Generating embeddings: batch X/Y]"`
|
||
- **Thread-Safe**: Uses lock for collecting results
|
||
|
||
##### `GetEmbeddingAsync(string text, CancellationToken)`
|
||
- **Returns**: `Task<float[]>`
|
||
- **Behavior**: Single embedding with rate limiting and retry
|
||
- **Use Case**: Query embedding
|
||
|
||
##### `Cos static float CosineSimilarity(float[] vector1, float[] vector2)
|
||
```
|
||
Uses `System.Numerics.Tensors.TensorPrimitives.CosineSimilarity`
|
||
|
||
Returns float between -1 and 1 (typically 0-1 for normalized embeddings)
|
||
```
|
||
|
||
**Implementation**: Single line calling SIMD-accelerated tensor primitive
|
||
|
||
---
|
||
|
||
### ArticleService
|
||
|
||
**Location**: `Services/ArticleService.cs`
|
||
**Purpose**: Extract clean article content from web URLs
|
||
|
||
#### Public Methods
|
||
|
||
##### `FetchArticleAsync(string url)`
|
||
- **Returns**: `Task<Article>`
|
||
- **Behavior**: Delegates to `SmartReader.ParseArticleAsync(url)`
|
||
- **Result**: `Article` with `Title`, `TextContent`, `IsReadable`, and metadata
|
||
|
||
#### Errors
|
||
- Propagates exceptions (SmartReader may throw on network failures, malformed HTML)
|
||
- `SearchTool` catches and logs
|
||
|
||
#### SmartReader Notes
|
||
- Open-source article extraction library (bundled via NuGet)
|
||
- Uses Readability algorithm (similar to Firefox Reader View)
|
||
- Removes ads, navigation, boilerplate
|
||
- `IsReadable` indicates quality (e.g., not a 404 page, not too short)
|
||
|
||
---
|
||
|
||
### ChunkingService
|
||
|
||
**Location**: `Services/ChunkingService.cs`
|
||
**Purpose**: Split text into 500-character chunks at natural boundaries
|
||
|
||
#### Public Methods
|
||
|
||
##### `ChunkText(string text)`
|
||
- **Returns**: `List<string>`
|
||
- **Algorithm**:
|
||
- Constant `MAX_CHUNK_SIZE = 500`
|
||
- While remaining text:
|
||
- Take up to 500 chars
|
||
- If not at end, backtrack to last `[' ', '\n', '\r', '.', '!']`
|
||
- Trim, add if non-empty
|
||
- Advance start
|
||
- Returns all chunks
|
||
|
||
#### Characteristics
|
||
- Static class (no instances)
|
||
- Pure function (no side effects)
|
||
- Zero dependencies
|
||
- Handles edge cases (empty text, short text, text without breaks)
|
||
|
||
---
|
||
|
||
## Infrastructure Services
|
||
|
||
### RateLimiter
|
||
|
||
**Location**: `Services/RateLimiter.cs`
|
||
**Purpose**: Limit concurrent operations using semaphore
|
||
|
||
#### Constructor
|
||
```csharp
|
||
public RateLimiter(int maxConcurrentRequests)
|
||
```
|
||
Creates `SemaphoreSlim` with `maxConcurrentRequests`
|
||
|
||
#### Public Methods
|
||
|
||
##### `ExecuteAsync<T>(Func<Task<T>> action, CancellationToken)`
|
||
```csharp
|
||
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken cancellationToken = default)
|
||
{
|
||
await _semaphore.WaitAsync(cancellationToken);
|
||
try
|
||
{
|
||
return await action();
|
||
}
|
||
finally
|
||
{
|
||
_semaphore.Release();
|
||
}
|
||
}
|
||
```
|
||
- Waits for semaphore slot
|
||
- Executes `action` (typically an API call)
|
||
- Releases semaphore (even if exception)
|
||
- Returns result from `action`
|
||
|
||
##### `ExecuteAsync(Func<Task> action, CancellationToken)`
|
||
- Non-generic version (for void-returning actions)
|
||
|
||
#### Disposal
|
||
```csharp
|
||
public async ValueTask DisposeAsync()
|
||
{
|
||
_semaphore.Dispose();
|
||
}
|
||
```
|
||
Implements `IAsyncDisposable` for async cleanup
|
||
|
||
#### Usage Pattern
|
||
```csharp
|
||
var result = await _rateLimiter.ExecuteAsync(async () =>
|
||
{
|
||
return await SomeApiCall();
|
||
}, cancellationToken);
|
||
```
|
||
|
||
#### Where Used
|
||
- `EmbeddingService`: Limits concurrent embedding batch requests (default 4)
|
||
|
||
---
|
||
|
||
### StatusReporter
|
||
|
||
**Location**: `Services/StatusReporter.cs`
|
||
**Purpose**: Real-time progress display with spinner (compact) or verbose lines
|
||
|
||
#### Constructor
|
||
```csharp
|
||
public StatusReporter(bool verbose)
|
||
```
|
||
- `verbose = true`: all progress via `WriteLine()` (no spinner)
|
||
- `verbose = false`: spinner with latest status
|
||
|
||
#### Architecture
|
||
|
||
**Components**:
|
||
- `Channel<string> _statusChannel` - producer-consumer queue
|
||
- `Task _statusProcessor` - background task reading from channel
|
||
- `CancellationTokenSource _spinnerCts` - spinner task cancellation
|
||
- `Task _spinnerTask` - spinner animation task
|
||
- `char[] _spinnerChars` - Braille spinner pattern
|
||
|
||
**Spinner Animation**:
|
||
- Runs at 10 FPS (100ms interval)
|
||
- Cycles through `['⠋','⠙','⠹','⠸','⠼','⠴','⠦','⠧','⠇','⠏']`
|
||
- Displays: `⠋ Fetching articles...`
|
||
- Updates in place using ANSI: `\r\x1b[K` (carriage return + erase line)
|
||
|
||
#### Public Methods
|
||
|
||
##### `UpdateStatus(string message)`
|
||
- Fire-and-forget: writes to channel via `TryWrite` (non-blocking)
|
||
- If channel full, message dropped (acceptable loss for UI)
|
||
|
||
##### `WriteLine(string text)`
|
||
- Stops spinner temporarily
|
||
- Clears current status line
|
||
- Writes `text` with newline
|
||
- In verbose mode: just `Console.WriteLine(text)`
|
||
|
||
##### `ClearStatus()`
|
||
- In compact mode: `Console.Write("\r\x1b[K")` (erase line)
|
||
- In verbose: no-op
|
||
- Sets `_currentMessage = null`
|
||
|
||
##### `StartSpinner()` / `StopSpinner()`
|
||
- Manual control (usually `StartSpinner` constructor call, `StopSpinner` by `Dispose`)
|
||
|
||
##### `Dispose()`
|
||
- Completes channel writer
|
||
- Awaits `_statusProcessor` completion
|
||
- Calls `StopSpinner()`
|
||
|
||
#### Background Processing
|
||
|
||
**Status Processor**:
|
||
```csharp
|
||
private async Task ProcessStatusUpdatesAsync()
|
||
{
|
||
await foreach (var message in _statusChannel.Reader.ReadAllAsync())
|
||
{
|
||
if (_verbose)
|
||
{
|
||
Console.WriteLine(message);
|
||
continue;
|
||
}
|
||
Console.Write("\r\x1b[K"); // Clear line
|
||
Console.Write($"{_spinnerChars[0]} {message}"); // Static spinner
|
||
_currentMessage = message;
|
||
}
|
||
}
|
||
```
|
||
|
||
**Spinner Task**:
|
||
```csharp
|
||
_spinnerTask = Task.Run(async () =>
|
||
{
|
||
while (_spinnerCts is { Token.IsCancellationRequested: false })
|
||
{
|
||
if (_currentMessage != null)
|
||
{
|
||
Console.Write("\r\x1b[K");
|
||
var charIndex = index++ % spinner.Length;
|
||
Console.Write($"{spinner[charIndex]} {_currentMessage}");
|
||
}
|
||
await Task.Delay(100, _spinnerCts.Token);
|
||
}
|
||
});
|
||
```
|
||
|
||
#### Thread Safety
|
||
- `UpdateStatus` (producer) writes to channel
|
||
- `ProcessStatusUpdatesAsync` (consumer) reads from channel
|
||
- `_spinnerTask` runs concurrently
|
||
- All UI writes happen in consumer/spinner task context (single-threaded UI)
|
||
|
||
#### Design Notes
|
||
- Could be simplified: just use `Console.CursorLeft` for spinner, no channel
|
||
- Channel allows random `UpdateStatus` calls from any thread without blocking
|
||
- Braille spinner requires terminal that supports Unicode (most modern terminals do)
|
||
|
||
---
|
||
|
||
## Service Interactions
|
||
|
||
### Dependency Graph
|
||
|
||
```
|
||
OpenQueryApp
|
||
├── OpenRouterClient ← (used for query gen + final answer)
|
||
└── SearchTool
|
||
├── SearxngClient
|
||
├── ArticleService (uses SmartReader)
|
||
├── ChunkingService (static)
|
||
├── EmbeddingService
|
||
│ └── OpenRouterClient (different instance)
|
||
│ └── RateLimiter
|
||
└── ParallelProcessingOptions (config)
|
||
```
|
||
|
||
### Service Lifetimes
|
||
|
||
All services are **transient** (new instance per query execution):
|
||
- `OpenRouterClient` → 1 instance for query gen + answer
|
||
- `SearxngClient` → 1 instance for all searches
|
||
- `EmbeddingService` → 1 instance with its own `OpenRouterClient` and `RateLimiter`
|
||
- `SearchTool` → 1 instance per query (constructed in `Program.cs`)
|
||
|
||
No singleton or static state (except static utility classes like `ChunkingService`).
|
||
|
||
### Data Flow Through Services
|
||
|
||
```
|
||
OpenQueryApp
|
||
│
|
||
├─ OpenRouterClient.CompleteAsync() → query generation
|
||
│ Messages → JSON → HTTP request → response → JSON → Messages
|
||
│
|
||
└─ SearchTool.ExecuteAsync()
|
||
│
|
||
├─ SearxngClient.SearchAsync() × N
|
||
│ query → URL encode → GET → JSON → SearxngResult[]
|
||
│
|
||
├─ ArticleService.FetchArticleAsync() × M
|
||
│ URL → HTTP GET → SmartReader → Article
|
||
│
|
||
├─ ChunkingService.ChunkText() × M
|
||
│ Article.TextContent → List<string> chunks
|
||
│
|
||
├─ EmbeddingService.GetEmbeddingAsync(query) + GetEmbeddingsAsync(chunks[])
|
||
│ texts → batches → rate-limited HTTP POST → JSON → float[][]
|
||
│
|
||
├─ CosineSimilarity(queryEmbedding, chunkEmbedding) × M
|
||
│ Vectors → dot product → magnitude → score
|
||
│
|
||
└─ return context string (formatted chunks)
|
||
```
|
||
|
||
---
|
||
|
||
## Next Steps
|
||
|
||
- **[OpenQueryApp](../components/openquery-app.md)** - Orchestrates services
|
||
- **[SearchTool](../components/search-tool.md)** - Coordinates pipeline
|
||
- **[Models](../components/models.md)** - Data structures passed between services
|
||
- **[API Reference](../../api/cli.md)** - CLI that uses these services
|
||
|
||
---
|
||
|
||
**Service Design Principles**:
|
||
- Single Responsibility: Each service does one thing well
|
||
- Stateless: No instance state beyond constructor args
|
||
- Composable: Services depend on abstractions (other services) not implementations
|
||
- Testable: Can mock dependencies for unit testing
|