docs: add comprehensive documentation with README and detailed guides
- Add user-friendly README.md with quick start guide - Create docs/ folder with structured technical documentation: - installation.md: Build and setup instructions - configuration.md: Complete config reference - usage.md: CLI usage guide with examples - architecture.md: System design and patterns - components/: Deep dive into each component (OpenQueryApp, SearchTool, Services, Models) - api/: CLI reference, environment variables, programmatic API - troubleshooting.md: Common issues and solutions - performance.md: Latency, throughput, and optimization - All documentation fully cross-referenced with internal links - Covers project overview, architecture, components, APIs, and support See individual files for complete documentation.
This commit is contained in:
471
docs/components/services.md
Normal file
471
docs/components/services.md
Normal file
@@ -0,0 +1,471 @@
|
||||
# Services Overview
|
||||
|
||||
Comprehensive reference for all service classes in OpenQuery.
|
||||
|
||||
## 📋 Table of Contents
|
||||
|
||||
1. [Service Catalog](#service-catalog)
|
||||
2. [Client Services](#client-services)
|
||||
3. [Processing Services](#processing-services)
|
||||
4. [Infrastructure Services](#infrastructure-services)
|
||||
5. [Service Interactions](#service-interactions)
|
||||
|
||||
## Service Catalog
|
||||
|
||||
OpenQuery's services are organized into three categories:
|
||||
|
||||
| Category | Services | Purpose |
|
||||
|-----------|----------|---------|
|
||||
| **Clients** | `OpenRouterClient`, `SearxngClient` | External API communication |
|
||||
| **Processors** | `EmbeddingService`, `ChunkingService`, `ArticleService` | Data transformation & extraction |
|
||||
| **Infrastructure** | `RateLimiter`, `StatusReporter` | Cross-cutting concerns |
|
||||
|
||||
All services are **stateless** (except for internal configuration) and can be safely reused across multiple operations.
|
||||
|
||||
---
|
||||
|
||||
## Client Services
|
||||
|
||||
### OpenRouterClient
|
||||
|
||||
**Location**: `Services/OpenRouterClient.cs`
|
||||
**Purpose**: HTTP client for OpenRouter AI APIs (chat completions & embeddings)
|
||||
|
||||
#### API Endpoints
|
||||
|
||||
| Method | Endpoint | Purpose |
|
||||
|--------|----------|---------|
|
||||
| POST | `/chat/completions` | Chat completion (streaming or non-streaming) |
|
||||
| POST | `/embeddings` | Embedding generation for text inputs |
|
||||
|
||||
#### Authentication
|
||||
```
|
||||
Authorization: Bearer {apiKey}
|
||||
Accept: application/json
|
||||
```
|
||||
|
||||
#### Public Methods
|
||||
|
||||
##### `StreamAsync(ChatCompletionRequest request, CancellationToken cancellationToken)`
|
||||
- **Returns**: `IAsyncEnumerable<StreamChunk>`
|
||||
- **Behavior**: Sets `request.Stream = true`, posts, reads Server-Sent Events stream
|
||||
- **Use Case**: Final answer streaming, real-time responses
|
||||
- **Stream Format**: SSE lines `data: {json}`; yields `TextDelta` or `ToolCall`
|
||||
|
||||
##### `CompleteAsync(ChatCompletionRequest request)`
|
||||
- **Returns**: `Task<ChatCompletionResponse>`
|
||||
- **Behavior**: Sets `request.Stream = false`, posts, returns full response
|
||||
- **Use Case**: Query generation (non-streaming)
|
||||
|
||||
##### `EmbedAsync(string model, List<string> inputs)`
|
||||
- **Returns**: `Task<float[][]>`
|
||||
- **Behavior**: POST `/embeddings`, returns array of vectors (ordered by input index)
|
||||
- **Use Case**: Batch embedding generation
|
||||
|
||||
##### `HttpClient`
|
||||
- **Property**: Internal `_httpClient` (created per instance)
|
||||
- **Note**: Could use `IHttpClientFactory` for pooling (not needed for CLI)
|
||||
|
||||
#### Error Handling
|
||||
- `EnsureSuccessStatusCode()` throws `HttpRequestException` on 4xx/5xx
|
||||
- No retry logic (handled by `EmbeddingService`)
|
||||
|
||||
#### Configuration
|
||||
```csharp
|
||||
public OpenRouterClient(string apiKey)
|
||||
{
|
||||
_apiKey = apiKey;
|
||||
_httpClient = new HttpClient();
|
||||
_httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
|
||||
_httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
|
||||
}
|
||||
```
|
||||
|
||||
#### Example Usage
|
||||
```csharp
|
||||
var client = new OpenRouterClient("sk-or-...");
|
||||
var request = new ChatCompletionRequest("model", new List<Message> { ... });
|
||||
await foreach (var chunk in client.StreamAsync(request))
|
||||
{
|
||||
Console.Write(chunk.TextDelta);
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### SearxngClient
|
||||
|
||||
**Location**: `Services/SearxngClient.cs`
|
||||
**Purpose**: HTTP client for SearxNG metasearch engine
|
||||
|
||||
#### API Endpoint
|
||||
```
|
||||
GET /search?q={query}&format=json
|
||||
```
|
||||
|
||||
#### Constructor
|
||||
```csharp
|
||||
public SearxngClient(string baseUrl) // e.g., "http://localhost:8002"
|
||||
```
|
||||
- `baseUrl` trimmed of trailing `/`
|
||||
|
||||
#### Public Methods
|
||||
|
||||
##### `SearchAsync(string query, int limit = 10)`
|
||||
- **Returns**: `Task<List<SearxngResult>>`
|
||||
- **Behavior**: GET request, deserialize JSON, take up to `limit` results
|
||||
- **On Failure**: Returns empty `List<SearxngResult>` (no exception)
|
||||
|
||||
#### Error Handling
|
||||
- `response.EnsureSuccessStatusCode()` would throw, but code doesn't call it
|
||||
- If invalid JSON or missing `Results`, returns empty list
|
||||
- Failures are **tolerated** - individual search queries may fail without aborting whole operation
|
||||
|
||||
#### Example Searxng Response
|
||||
```json
|
||||
{
|
||||
"results": [
|
||||
{
|
||||
"title": "Quantum Entanglement - Wikipedia",
|
||||
"url": "https://en.wikipedia.org/wiki/Quantum_entanglement",
|
||||
"content": "Quantum entanglement is a physical phenomenon..."
|
||||
},
|
||||
...
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Processing Services
|
||||
|
||||
### EmbeddingService
|
||||
|
||||
**Location**: `Services/EmbeddingService.cs`
|
||||
**Purpose**: Generate embeddings with batching, rate limiting, and retry logic
|
||||
|
||||
#### Configuration
|
||||
|
||||
**Embedding Model**: `openai/text-embedding-3-small` (default, configurable via constructor)
|
||||
|
||||
**ParallelProcessingOptions** (hardcoded defaults):
|
||||
```csharp
|
||||
public class ParallelProcessingOptions
|
||||
{
|
||||
public int MaxConcurrentEmbeddingRequests { get; set; } = 4;
|
||||
public int EmbeddingBatchSize { get; set; } = 300;
|
||||
}
|
||||
```
|
||||
|
||||
#### Public Methods
|
||||
|
||||
##### `GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)`
|
||||
- **Returns**: `Task<float[][]>`
|
||||
- **Behavior**:
|
||||
- Splits `texts` into batches of `EmbeddingBatchSize`
|
||||
- Parallel executes batches (max `MaxConcurrentEmbeddingRequests` concurrent)
|
||||
- Each batch: rate-limited, retry-wrapped `client.EmbedAsync(model, batch)`
|
||||
- Reassembles in original order
|
||||
- Failed batches → empty `float[]` for each text
|
||||
- **Progress**: Invokes `onProgress` for each batch: `"[Generating embeddings: batch X/Y]"`
|
||||
- **Thread-Safe**: Uses lock for collecting results
|
||||
|
||||
##### `GetEmbeddingAsync(string text, CancellationToken)`
|
||||
- **Returns**: `Task<float[]>`
|
||||
- **Behavior**: Single embedding with rate limiting and retry
|
||||
- **Use Case**: Query embedding
|
||||
|
||||
##### `Cos static float CosineSimilarity(float[] vector1, float[] vector2)
|
||||
```
|
||||
Uses `System.Numerics.Tensors.TensorPrimitives.CosineSimilarity`
|
||||
|
||||
Returns float between -1 and 1 (typically 0-1 for normalized embeddings)
|
||||
```
|
||||
|
||||
**Implementation**: Single line calling SIMD-accelerated tensor primitive
|
||||
|
||||
---
|
||||
|
||||
### ArticleService
|
||||
|
||||
**Location**: `Services/ArticleService.cs`
|
||||
**Purpose**: Extract clean article content from web URLs
|
||||
|
||||
#### Public Methods
|
||||
|
||||
##### `FetchArticleAsync(string url)`
|
||||
- **Returns**: `Task<Article>`
|
||||
- **Behavior**: Delegates to `SmartReader.ParseArticleAsync(url)`
|
||||
- **Result**: `Article` with `Title`, `TextContent`, `IsReadable`, and metadata
|
||||
|
||||
#### Errors
|
||||
- Propagates exceptions (SmartReader may throw on network failures, malformed HTML)
|
||||
- `SearchTool` catches and logs
|
||||
|
||||
#### SmartReader Notes
|
||||
- Open-source article extraction library (bundled via NuGet)
|
||||
- Uses Readability algorithm (similar to Firefox Reader View)
|
||||
- Removes ads, navigation, boilerplate
|
||||
- `IsReadable` indicates quality (e.g., not a 404 page, not too short)
|
||||
|
||||
---
|
||||
|
||||
### ChunkingService
|
||||
|
||||
**Location**: `Services/ChunkingService.cs`
|
||||
**Purpose**: Split text into 500-character chunks at natural boundaries
|
||||
|
||||
#### Public Methods
|
||||
|
||||
##### `ChunkText(string text)`
|
||||
- **Returns**: `List<string>`
|
||||
- **Algorithm**:
|
||||
- Constant `MAX_CHUNK_SIZE = 500`
|
||||
- While remaining text:
|
||||
- Take up to 500 chars
|
||||
- If not at end, backtrack to last `[' ', '\n', '\r', '.', '!']`
|
||||
- Trim, add if non-empty
|
||||
- Advance start
|
||||
- Returns all chunks
|
||||
|
||||
#### Characteristics
|
||||
- Static class (no instances)
|
||||
- Pure function (no side effects)
|
||||
- Zero dependencies
|
||||
- Handles edge cases (empty text, short text, text without breaks)
|
||||
|
||||
---
|
||||
|
||||
## Infrastructure Services
|
||||
|
||||
### RateLimiter
|
||||
|
||||
**Location**: `Services/RateLimiter.cs`
|
||||
**Purpose**: Limit concurrent operations using semaphore
|
||||
|
||||
#### Constructor
|
||||
```csharp
|
||||
public RateLimiter(int maxConcurrentRequests)
|
||||
```
|
||||
Creates `SemaphoreSlim` with `maxConcurrentRequests`
|
||||
|
||||
#### Public Methods
|
||||
|
||||
##### `ExecuteAsync<T>(Func<Task<T>> action, CancellationToken)`
|
||||
```csharp
|
||||
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken cancellationToken = default)
|
||||
{
|
||||
await _semaphore.WaitAsync(cancellationToken);
|
||||
try
|
||||
{
|
||||
return await action();
|
||||
}
|
||||
finally
|
||||
{
|
||||
_semaphore.Release();
|
||||
}
|
||||
}
|
||||
```
|
||||
- Waits for semaphore slot
|
||||
- Executes `action` (typically an API call)
|
||||
- Releases semaphore (even if exception)
|
||||
- Returns result from `action`
|
||||
|
||||
##### `ExecuteAsync(Func<Task> action, CancellationToken)`
|
||||
- Non-generic version (for void-returning actions)
|
||||
|
||||
#### Disposal
|
||||
```csharp
|
||||
public async ValueTask DisposeAsync()
|
||||
{
|
||||
_semaphore.Dispose();
|
||||
}
|
||||
```
|
||||
Implements `IAsyncDisposable` for async cleanup
|
||||
|
||||
#### Usage Pattern
|
||||
```csharp
|
||||
var result = await _rateLimiter.ExecuteAsync(async () =>
|
||||
{
|
||||
return await SomeApiCall();
|
||||
}, cancellationToken);
|
||||
```
|
||||
|
||||
#### Where Used
|
||||
- `EmbeddingService`: Limits concurrent embedding batch requests (default 4)
|
||||
|
||||
---
|
||||
|
||||
### StatusReporter
|
||||
|
||||
**Location**: `Services/StatusReporter.cs`
|
||||
**Purpose**: Real-time progress display with spinner (compact) or verbose lines
|
||||
|
||||
#### Constructor
|
||||
```csharp
|
||||
public StatusReporter(bool verbose)
|
||||
```
|
||||
- `verbose = true`: all progress via `WriteLine()` (no spinner)
|
||||
- `verbose = false`: spinner with latest status
|
||||
|
||||
#### Architecture
|
||||
|
||||
**Components**:
|
||||
- `Channel<string> _statusChannel` - producer-consumer queue
|
||||
- `Task _statusProcessor` - background task reading from channel
|
||||
- `CancellationTokenSource _spinnerCts` - spinner task cancellation
|
||||
- `Task _spinnerTask` - spinner animation task
|
||||
- `char[] _spinnerChars` - Braille spinner pattern
|
||||
|
||||
**Spinner Animation**:
|
||||
- Runs at 10 FPS (100ms interval)
|
||||
- Cycles through `['⠋','⠙','⠹','⠸','⠼','⠴','⠦','⠧','⠇','⠏']`
|
||||
- Displays: `⠋ Fetching articles...`
|
||||
- Updates in place using ANSI: `\r\x1b[K` (carriage return + erase line)
|
||||
|
||||
#### Public Methods
|
||||
|
||||
##### `UpdateStatus(string message)`
|
||||
- Fire-and-forget: writes to channel via `TryWrite` (non-blocking)
|
||||
- If channel full, message dropped (acceptable loss for UI)
|
||||
|
||||
##### `WriteLine(string text)`
|
||||
- Stops spinner temporarily
|
||||
- Clears current status line
|
||||
- Writes `text` with newline
|
||||
- In verbose mode: just `Console.WriteLine(text)`
|
||||
|
||||
##### `ClearStatus()`
|
||||
- In compact mode: `Console.Write("\r\x1b[K")` (erase line)
|
||||
- In verbose: no-op
|
||||
- Sets `_currentMessage = null`
|
||||
|
||||
##### `StartSpinner()` / `StopSpinner()`
|
||||
- Manual control (usually `StartSpinner` constructor call, `StopSpinner` by `Dispose`)
|
||||
|
||||
##### `Dispose()`
|
||||
- Completes channel writer
|
||||
- Awaits `_statusProcessor` completion
|
||||
- Calls `StopSpinner()`
|
||||
|
||||
#### Background Processing
|
||||
|
||||
**Status Processor**:
|
||||
```csharp
|
||||
private async Task ProcessStatusUpdatesAsync()
|
||||
{
|
||||
await foreach (var message in _statusChannel.Reader.ReadAllAsync())
|
||||
{
|
||||
if (_verbose)
|
||||
{
|
||||
Console.WriteLine(message);
|
||||
continue;
|
||||
}
|
||||
Console.Write("\r\x1b[K"); // Clear line
|
||||
Console.Write($"{_spinnerChars[0]} {message}"); // Static spinner
|
||||
_currentMessage = message;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Spinner Task**:
|
||||
```csharp
|
||||
_spinnerTask = Task.Run(async () =>
|
||||
{
|
||||
while (_spinnerCts is { Token.IsCancellationRequested: false })
|
||||
{
|
||||
if (_currentMessage != null)
|
||||
{
|
||||
Console.Write("\r\x1b[K");
|
||||
var charIndex = index++ % spinner.Length;
|
||||
Console.Write($"{spinner[charIndex]} {_currentMessage}");
|
||||
}
|
||||
await Task.Delay(100, _spinnerCts.Token);
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
#### Thread Safety
|
||||
- `UpdateStatus` (producer) writes to channel
|
||||
- `ProcessStatusUpdatesAsync` (consumer) reads from channel
|
||||
- `_spinnerTask` runs concurrently
|
||||
- All UI writes happen in consumer/spinner task context (single-threaded UI)
|
||||
|
||||
#### Design Notes
|
||||
- Could be simplified: just use `Console.CursorLeft` for spinner, no channel
|
||||
- Channel allows random `UpdateStatus` calls from any thread without blocking
|
||||
- Braille spinner requires terminal that supports Unicode (most modern terminals do)
|
||||
|
||||
---
|
||||
|
||||
## Service Interactions
|
||||
|
||||
### Dependency Graph
|
||||
|
||||
```
|
||||
OpenQueryApp
|
||||
├── OpenRouterClient ← (used for query gen + final answer)
|
||||
└── SearchTool
|
||||
├── SearxngClient
|
||||
├── ArticleService (uses SmartReader)
|
||||
├── ChunkingService (static)
|
||||
├── EmbeddingService
|
||||
│ └── OpenRouterClient (different instance)
|
||||
│ └── RateLimiter
|
||||
└── ParallelProcessingOptions (config)
|
||||
```
|
||||
|
||||
### Service Lifetimes
|
||||
|
||||
All services are **transient** (new instance per query execution):
|
||||
- `OpenRouterClient` → 1 instance for query gen + answer
|
||||
- `SearxngClient` → 1 instance for all searches
|
||||
- `EmbeddingService` → 1 instance with its own `OpenRouterClient` and `RateLimiter`
|
||||
- `SearchTool` → 1 instance per query (constructed in `Program.cs`)
|
||||
|
||||
No singleton or static state (except static utility classes like `ChunkingService`).
|
||||
|
||||
### Data Flow Through Services
|
||||
|
||||
```
|
||||
OpenQueryApp
|
||||
│
|
||||
├─ OpenRouterClient.CompleteAsync() → query generation
|
||||
│ Messages → JSON → HTTP request → response → JSON → Messages
|
||||
│
|
||||
└─ SearchTool.ExecuteAsync()
|
||||
│
|
||||
├─ SearxngClient.SearchAsync() × N
|
||||
│ query → URL encode → GET → JSON → SearxngResult[]
|
||||
│
|
||||
├─ ArticleService.FetchArticleAsync() × M
|
||||
│ URL → HTTP GET → SmartReader → Article
|
||||
│
|
||||
├─ ChunkingService.ChunkText() × M
|
||||
│ Article.TextContent → List<string> chunks
|
||||
│
|
||||
├─ EmbeddingService.GetEmbeddingAsync(query) + GetEmbeddingsAsync(chunks[])
|
||||
│ texts → batches → rate-limited HTTP POST → JSON → float[][]
|
||||
│
|
||||
├─ CosineSimilarity(queryEmbedding, chunkEmbedding) × M
|
||||
│ Vectors → dot product → magnitude → score
|
||||
│
|
||||
└─ return context string (formatted chunks)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- **[OpenQueryApp](../components/openquery-app.md)** - Orchestrates services
|
||||
- **[SearchTool](../components/search-tool.md)** - Coordinates pipeline
|
||||
- **[Models](../components/models.md)** - Data structures passed between services
|
||||
- **[API Reference](../../api/cli.md)** - CLI that uses these services
|
||||
|
||||
---
|
||||
|
||||
**Service Design Principles**:
|
||||
- Single Responsibility: Each service does one thing well
|
||||
- Stateless: No instance state beyond constructor args
|
||||
- Composable: Services depend on abstractions (other services) not implementations
|
||||
- Testable: Can mock dependencies for unit testing
|
||||
Reference in New Issue
Block a user