docs: add comprehensive documentation with README and detailed guides

- Add user-friendly README.md with quick start guide
- Create docs/ folder with structured technical documentation:
  - installation.md: Build and setup instructions
  - configuration.md: Complete config reference
  - usage.md: CLI usage guide with examples
  - architecture.md: System design and patterns
  - components/: Deep dive into each component (OpenQueryApp, SearchTool, Services, Models)
  - api/: CLI reference, environment variables, programmatic API
  - troubleshooting.md: Common issues and solutions
  - performance.md: Latency, throughput, and optimization
- All documentation fully cross-referenced with internal links
- Covers project overview, architecture, components, APIs, and support

See individual files for complete documentation.
OpenQuery Documentation
2026-03-19 10:01:58 +01:00
parent b28d8998f7
commit 65ca2401ae
16 changed files with 7073 additions and 0 deletions

docs/components/services.md Normal file

@@ -0,0 +1,471 @@
# Services Overview
Comprehensive reference for all service classes in OpenQuery.
## 📋 Table of Contents
1. [Service Catalog](#service-catalog)
2. [Client Services](#client-services)
3. [Processing Services](#processing-services)
4. [Infrastructure Services](#infrastructure-services)
5. [Service Interactions](#service-interactions)
## Service Catalog
OpenQuery's services are organized into three categories:
| Category | Services | Purpose |
|-----------|----------|---------|
| **Clients** | `OpenRouterClient`, `SearxngClient` | External API communication |
| **Processors** | `EmbeddingService`, `ChunkingService`, `ArticleService` | Data transformation & extraction |
| **Infrastructure** | `RateLimiter`, `StatusReporter` | Cross-cutting concerns |
All services are **stateless** (except for internal configuration) and can be safely reused across multiple operations.
---
## Client Services
### OpenRouterClient
**Location**: `Services/OpenRouterClient.cs`
**Purpose**: HTTP client for OpenRouter AI APIs (chat completions & embeddings)
#### API Endpoints
| Method | Endpoint | Purpose |
|--------|----------|---------|
| POST | `/chat/completions` | Chat completion (streaming or non-streaming) |
| POST | `/embeddings` | Embedding generation for text inputs |
#### Authentication
```
Authorization: Bearer {apiKey}
Accept: application/json
```
#### Public Methods
##### `StreamAsync(ChatCompletionRequest request, CancellationToken cancellationToken)`
- **Returns**: `IAsyncEnumerable<StreamChunk>`
- **Behavior**: Sets `request.Stream = true`, posts, reads Server-Sent Events stream
- **Use Case**: Final answer streaming, real-time responses
- **Stream Format**: SSE lines `data: {json}`; yields `TextDelta` or `ToolCall`
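The `data: {json}` line format above can be illustrated with a small parser. This is a minimal sketch, not the client's actual code: `ParseSseLine` is a hypothetical helper, and both the `[DONE]` end marker and the `choices[0].delta.content` payload shape are assumptions based on the OpenAI-compatible streaming convention.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: extracts the text delta from one SSE line, or returns
// null when the line carries no text (comment, end marker, tool call).
static string? ParseSseLine(string line)
{
    const string prefix = "data: ";
    if (!line.StartsWith(prefix, StringComparison.Ordinal)) return null;

    var payload = line.Substring(prefix.Length).Trim();
    if (payload == "[DONE]") return null; // assumed end-of-stream marker

    using var doc = JsonDocument.Parse(payload);
    // Assumed payload shape: { "choices": [ { "delta": { "content": "..." } } ] }
    if (doc.RootElement.TryGetProperty("choices", out var choices)
        && choices.ValueKind == JsonValueKind.Array
        && choices.GetArrayLength() > 0
        && choices[0].TryGetProperty("delta", out var delta)
        && delta.TryGetProperty("content", out var content)
        && content.ValueKind == JsonValueKind.String)
    {
        return content.GetString();
    }
    return null;
}

var sseLines = new[]
{
    ": keep-alive comment",
    "data: {\"choices\":[{\"delta\":{\"content\":\"Hel\"}}]}",
    "data: {\"choices\":[{\"delta\":{\"content\":\"lo\"}}]}",
    "data: [DONE]",
};
foreach (var sseLine in sseLines)
{
    Console.Write(ParseSseLine(sseLine)); // writing a null string writes nothing
}
Console.WriteLine();
```

Lines without a text delta are skipped here, whereas the real client also yields `ToolCall` chunks.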
##### `CompleteAsync(ChatCompletionRequest request)`
- **Returns**: `Task<ChatCompletionResponse>`
- **Behavior**: Sets `request.Stream = false`, posts, returns full response
- **Use Case**: Query generation (non-streaming)
##### `EmbedAsync(string model, List<string> inputs)`
- **Returns**: `Task<float[][]>`
- **Behavior**: POST `/embeddings`, returns array of vectors (ordered by input index)
- **Use Case**: Batch embedding generation
##### `HttpClient`
- **Property**: Internal `_httpClient` (created per instance)
- **Note**: Could use `IHttpClientFactory` for pooling (not needed for CLI)
#### Error Handling
- `EnsureSuccessStatusCode()` throws `HttpRequestException` on 4xx/5xx
- No retry logic (handled by `EmbeddingService`)
#### Configuration
```csharp
public OpenRouterClient(string apiKey)
{
    _apiKey = apiKey;
    _httpClient = new HttpClient();
    _httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
    _httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
}
```
#### Example Usage
```csharp
var client = new OpenRouterClient("sk-or-...");
var request = new ChatCompletionRequest("model", new List<Message> { ... });
await foreach (var chunk in client.StreamAsync(request))
{
    Console.Write(chunk.TextDelta);
}
---
### SearxngClient
**Location**: `Services/SearxngClient.cs`
**Purpose**: HTTP client for SearxNG metasearch engine
#### API Endpoint
```
GET /search?q={query}&format=json
```
#### Constructor
```csharp
public SearxngClient(string baseUrl) // e.g., "http://localhost:8002"
```
- `baseUrl` trimmed of trailing `/`
#### Public Methods
##### `SearchAsync(string query, int limit = 10)`
- **Returns**: `Task<List<SearxngResult>>`
- **Behavior**: GET request, deserialize JSON, take up to `limit` results
- **On Failure**: Returns empty `List<SearxngResult>` (no exception)
#### Error Handling
- The code does not call `response.EnsureSuccessStatusCode()`, so a non-success status code does not throw by itself
- If the JSON is invalid or `Results` is missing, returns an empty list
- Failures are **tolerated**: an individual search query may fail without aborting the whole operation
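The tolerant behavior described above can be sketched as a pure parsing step. This is an illustration under assumptions, not the client's code: `ParseResults` is a hypothetical helper, and results are reduced to `(Title, Url)` tuples rather than the real `SearxngResult` type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper: turns a raw SearxNG JSON body into at most `limit`
// (title, url) pairs and returns an empty list instead of throwing.
static List<(string Title, string Url)> ParseResults(string json, int limit = 10)
{
    var parsed = new List<(string Title, string Url)>();
    try
    {
        using var doc = JsonDocument.Parse(json);
        if (doc.RootElement.ValueKind != JsonValueKind.Object
            || !doc.RootElement.TryGetProperty("results", out var results)
            || results.ValueKind != JsonValueKind.Array)
        {
            return parsed; // missing "results": tolerate and return nothing
        }

        foreach (var item in results.EnumerateArray().Take(limit))
        {
            if (item.ValueKind != JsonValueKind.Object) continue;
            if (item.TryGetProperty("url", out var u) && u.ValueKind == JsonValueKind.String)
            {
                var hasTitle = item.TryGetProperty("title", out var t)
                               && t.ValueKind == JsonValueKind.String;
                parsed.Add((hasTitle ? t.GetString()! : "", u.GetString()!));
            }
        }
    }
    catch (JsonException)
    {
        // Invalid JSON: tolerate and fall through to the empty list.
    }
    return parsed;
}

var body = "{\"results\":[{\"title\":\"A\",\"url\":\"https://a.example\"},"
         + "{\"title\":\"B\",\"url\":\"https://b.example\"}]}";
Console.WriteLine(ParseResults(body, limit: 1).Count); // only the first result is kept
Console.WriteLine(ParseResults("not json").Count);     // empty list, no exception
```

Returning an empty list keeps the caller's loop over N queries simple, at the cost of hiding why a particular query produced nothing.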
#### Example SearxNG Response
```json
{
  "results": [
    {
      "title": "Quantum Entanglement - Wikipedia",
      "url": "https://en.wikipedia.org/wiki/Quantum_entanglement",
      "content": "Quantum entanglement is a physical phenomenon..."
    },
    ...
  ]
}
```
---
## Processing Services
### EmbeddingService
**Location**: `Services/EmbeddingService.cs`
**Purpose**: Generate embeddings with batching, rate limiting, and retry logic
#### Configuration
**Embedding Model**: `openai/text-embedding-3-small` (default, configurable via constructor)
**ParallelProcessingOptions** (hardcoded defaults):
```csharp
public class ParallelProcessingOptions
{
    public int MaxConcurrentEmbeddingRequests { get; set; } = 4;
    public int EmbeddingBatchSize { get; set; } = 300;
}
```
#### Public Methods
##### `GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)`
- **Returns**: `Task<float[][]>`
- **Behavior**:
  - Splits `texts` into batches of `EmbeddingBatchSize`
  - Executes batches in parallel (max `MaxConcurrentEmbeddingRequests` concurrent)
  - Each batch: rate-limited, retry-wrapped `client.EmbedAsync(model, batch)`
  - Reassembles in original order
  - Failed batches → empty `float[]` for each text
- **Progress**: Invokes `onProgress` for each batch: `"[Generating embeddings: batch X/Y]"`
- **Thread-Safe**: Uses lock for collecting results
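The batching steps above can be sketched as follows. This is a simplified illustration under assumptions, not the service's implementation: `EmbedInBatchesAsync` and `FakeEmbedAsync` are hypothetical names, retries and progress reporting are omitted, and each batch writes to its own slice of the result array instead of collecting results under a lock.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the real embedding call: returns one 1-element
// vector per input (the text length) so the ordering is easy to inspect.
static Task<float[][]> FakeEmbedAsync(List<string> batch) =>
    Task.FromResult(batch.Select(t => new[] { (float)t.Length }).ToArray());

static async Task<float[][]> EmbedInBatchesAsync(
    List<string> texts, int batchSize, int maxConcurrent,
    Func<List<string>, Task<float[][]>> embedBatch)
{
    var results = new float[texts.Count][];
    using var gate = new SemaphoreSlim(maxConcurrent);
    int batchCount = (texts.Count + batchSize - 1) / batchSize;

    var tasks = Enumerable.Range(0, batchCount)
        .Select(async batchIndex =>
        {
            int start = batchIndex * batchSize;
            var batch = texts.GetRange(start, Math.Min(batchSize, texts.Count - start));
            await gate.WaitAsync(); // cap the number of in-flight batches
            try
            {
                var vectors = await embedBatch(batch);
                // Each batch owns its slice of `results`, so no lock is needed.
                Array.Copy(vectors, 0, results, start, vectors.Length);
            }
            catch (Exception)
            {
                // Failed batch: one empty vector per text in the batch.
                for (int i = 0; i < batch.Count; i++) results[start + i] = Array.Empty<float>();
            }
            finally
            {
                gate.Release();
            }
        })
        .ToList(); // materialize so every batch task is started

    await Task.WhenAll(tasks);
    return results;
}

var sampleTexts = new List<string> { "a", "bb", "ccc", "dddd", "eeeee" };
var sampleVectors = await EmbedInBatchesAsync(sampleTexts, batchSize: 2, maxConcurrent: 2, FakeEmbedAsync);
Console.WriteLine(string.Join(", ", sampleVectors.Select(v => v[0])));
```

Because batch `i` always fills positions starting at `i * batchSize`, the output order matches the input order regardless of which batch finishes first.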
##### `GetEmbeddingAsync(string text, CancellationToken)`
- **Returns**: `Task<float[]>`
- **Behavior**: Single embedding with rate limiting and retry
- **Use Case**: Query embedding
##### `CosineSimilarity(float[] vector1, float[] vector2)`
- **Signature**: `static float CosineSimilarity(float[] vector1, float[] vector2)`
- **Behavior**: Uses `System.Numerics.Tensors.TensorPrimitives.CosineSimilarity`
- **Returns**: `float` between -1 and 1 (typically 0-1 for normalized embeddings)
- **Implementation**: Single line calling the SIMD-accelerated tensor primitive
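For readers unfamiliar with the measure, the quantity being computed is `dot(a, b) / (|a| * |b|)`. The sketch below is a plain-loop reference version for illustration only. `CosineSimilarityReference` is a hypothetical name, and the service itself calls the tensor primitive instead.

```csharp
using System;

// Reference computation: dot(a, b) / (|a| * |b|), written as a plain loop.
static float CosineSimilarityReference(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    float dot = 0f, normA = 0f, normB = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

Console.WriteLine(CosineSimilarityReference(new[] { 1f, 0f }, new[] { 2f, 0f }));  // same direction
Console.WriteLine(CosineSimilarityReference(new[] { 1f, 0f }, new[] { 0f, 3f }));  // orthogonal
Console.WriteLine(CosineSimilarityReference(new[] { 1f, 0f }, new[] { -1f, 0f })); // opposite direction
```

The score depends only on direction, not magnitude, which is why scaling either vector leaves the result unchanged.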
---
### ArticleService
**Location**: `Services/ArticleService.cs`
**Purpose**: Extract clean article content from web URLs
#### Public Methods
##### `FetchArticleAsync(string url)`
- **Returns**: `Task<Article>`
- **Behavior**: Delegates to `SmartReader.ParseArticleAsync(url)`
- **Result**: `Article` with `Title`, `TextContent`, `IsReadable`, and metadata
#### Errors
- Propagates exceptions (SmartReader may throw on network failures, malformed HTML)
- `SearchTool` catches and logs
#### SmartReader Notes
- Open-source article extraction library (bundled via NuGet)
- Uses Readability algorithm (similar to Firefox Reader View)
- Removes ads, navigation, boilerplate
- `IsReadable` indicates quality (e.g., not a 404 page, not too short)
---
### ChunkingService
**Location**: `Services/ChunkingService.cs`
**Purpose**: Split text into chunks of at most 500 characters at natural boundaries
#### Public Methods
##### `ChunkText(string text)`
- **Returns**: `List<string>`
- **Algorithm**:
  - Constant `MAX_CHUNK_SIZE = 500`
  - While text remains:
    - Take up to 500 characters
    - If not at the end, backtrack to the last occurrence of any of `[' ', '\n', '\r', '.', '!']`
    - Trim, add if non-empty
    - Advance start
  - Returns all chunks
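The algorithm above can be sketched as a single loop. This is one possible reading of the steps, not the actual `ChunkingService` source: `ChunkTextSketch` is a hypothetical name, the chunk size is a parameter so the demo can use a small value, and the break character is kept at the end of the chunk it closes.

```csharp
using System;
using System.Collections.Generic;

static List<string> ChunkTextSketch(string text, int maxChunkSize = 500)
{
    var chunks = new List<string>();
    char[] breakChars = { ' ', '\n', '\r', '.', '!' };
    int start = 0;

    while (start < text.Length)
    {
        int length = Math.Min(maxChunkSize, text.Length - start);
        if (start + length < text.Length)
        {
            // Not at the end: backtrack to the last break character in the window.
            int lastBreak = text.LastIndexOfAny(breakChars, start + length - 1, length);
            if (lastBreak > start) length = lastBreak - start + 1;
            // No usable break found: fall back to a hard cut at the window edge.
        }

        var chunk = text.Substring(start, length).Trim();
        if (chunk.Length > 0) chunks.Add(chunk);
        start += length; // length is at least 1, so the loop always advances
    }
    return chunks;
}

foreach (var chunk in ChunkTextSketch("Hello world. This is a test!", maxChunkSize: 12))
{
    Console.WriteLine($"[{chunk}]");
}
```

The hard-cut fallback is what lets text without any break characters terminate instead of looping forever.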
#### Characteristics
- Static class (no instances)
- Pure function (no side effects)
- Zero dependencies
- Handles edge cases (empty text, short text, text without breaks)
---
## Infrastructure Services
### RateLimiter
**Location**: `Services/RateLimiter.cs`
**Purpose**: Limit concurrent operations using semaphore
#### Constructor
```csharp
public RateLimiter(int maxConcurrentRequests)
```
Creates `SemaphoreSlim` with `maxConcurrentRequests`
#### Public Methods
##### `ExecuteAsync<T>(Func<Task<T>> action, CancellationToken)`
```csharp
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken cancellationToken = default)
{
    await _semaphore.WaitAsync(cancellationToken);
    try
    {
        return await action();
    }
    finally
    {
        _semaphore.Release();
    }
}
```
- Waits for semaphore slot
- Executes `action` (typically an API call)
- Releases semaphore (even if exception)
- Returns result from `action`
##### `ExecuteAsync(Func<Task> action, CancellationToken)`
- Non-generic version (for void-returning actions)
#### Disposal
```csharp
public async ValueTask DisposeAsync()
{
    _semaphore.Dispose();
}
```
Implements `IAsyncDisposable` for async cleanup
#### Usage Pattern
```csharp
var result = await _rateLimiter.ExecuteAsync(async () =>
{
    return await SomeApiCall();
}, cancellationToken);
```
#### Where Used
- `EmbeddingService`: Limits concurrent embedding batch requests (default 4)
---
### StatusReporter
**Location**: `Services/StatusReporter.cs`
**Purpose**: Real-time progress display with spinner (compact) or verbose lines
#### Constructor
```csharp
public StatusReporter(bool verbose)
```
- `verbose = true`: all progress via `WriteLine()` (no spinner)
- `verbose = false`: spinner with latest status
#### Architecture
**Components**:
- `Channel<string> _statusChannel` - producer-consumer queue
- `Task _statusProcessor` - background task reading from channel
- `CancellationTokenSource _spinnerCts` - spinner task cancellation
- `Task _spinnerTask` - spinner animation task
- `char[] _spinnerChars` - Braille spinner pattern
**Spinner Animation**:
- Runs at 10 FPS (100ms interval)
- Cycles through `['⠋','⠙','⠹','⠸','⠼','⠴','⠦','⠧','⠇','⠏']`
- Displays: `⠋ Fetching articles...`
- Updates in place using ANSI: `\r\x1b[K` (carriage return + erase line)
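The in-place update described above can be sketched as a function that builds the string for one animation tick. `RenderFrame` is a hypothetical helper introduced for illustration, and the real spinner runs on a background task rather than in a blocking loop.

```csharp
using System;
using System.Text;
using System.Threading;

char[] frames = { '⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏' };

// Hypothetical helper: builds the text written for one animation tick.
// "\r" returns to column 0 and ESC[K erases from the cursor to the end of the line.
string RenderFrame(int tick, string message) =>
    $"\r\u001b[K{frames[tick % frames.Length]} {message}";

Console.OutputEncoding = Encoding.UTF8; // Braille characters need a Unicode-capable encoding
for (int tick = 0; tick < 10; tick++)
{
    Console.Write(RenderFrame(tick, "Fetching articles..."));
    Thread.Sleep(100); // 100 ms per frame, roughly 10 FPS
}
Console.WriteLine();
```

Erasing the line before each write matters when a new status is shorter than the previous one, since otherwise the tail of the old message stays visible.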
#### Public Methods
##### `UpdateStatus(string message)`
- Fire-and-forget: writes to channel via `TryWrite` (non-blocking)
- If channel full, message dropped (acceptable loss for UI)
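The drop-on-full behavior can be demonstrated with a bounded channel. This sketch assumes a bounded channel with the default full mode. The capacity of 2 is an arbitrary demo value, since this document does not state how `_statusChannel` is actually created.

```csharp
using System;
using System.Threading.Channels;

// Capacity 2 is an assumption for the demo, not the reporter's real setting.
var channel = Channel.CreateBounded<string>(2);

bool first = channel.Writer.TryWrite("Searching...");
bool second = channel.Writer.TryWrite("Fetching articles...");
bool third = channel.Writer.TryWrite("Generating embeddings..."); // channel is full here

Console.WriteLine($"{first} {second} {third}");

// Draining one item frees a slot, so the next write is accepted again.
channel.Reader.TryRead(out var oldest);
bool fourth = channel.Writer.TryWrite("Ranking chunks...");
Console.WriteLine($"{oldest} -> {fourth}");
```

`TryWrite` never blocks the caller, so a slow console cannot stall the pipeline; the price is that a status message may be skipped.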
##### `WriteLine(string text)`
- Stops spinner temporarily
- Clears current status line
- Writes `text` with newline
- In verbose mode: just `Console.WriteLine(text)`
##### `ClearStatus()`
- In compact mode: `Console.Write("\r\x1b[K")` (erase line)
- In verbose: no-op
- Sets `_currentMessage = null`
##### `StartSpinner()` / `StopSpinner()`
- Manual control (usually `StartSpinner` is called by the constructor and `StopSpinner` by `Dispose`)
##### `Dispose()`
- Completes channel writer
- Awaits `_statusProcessor` completion
- Calls `StopSpinner()`
#### Background Processing
**Status Processor**:
```csharp
private async Task ProcessStatusUpdatesAsync()
{
    await foreach (var message in _statusChannel.Reader.ReadAllAsync())
    {
        if (_verbose)
        {
            Console.WriteLine(message);
            continue;
        }
        Console.Write("\r\x1b[K"); // Clear line
        Console.Write($"{_spinnerChars[0]} {message}"); // Static spinner
        _currentMessage = message;
    }
}
```
**Spinner Task**:
```csharp
_spinnerTask = Task.Run(async () =>
{
    while (_spinnerCts is { Token.IsCancellationRequested: false })
    {
        if (_currentMessage != null)
        {
            Console.Write("\r\x1b[K");
            var charIndex = index++ % spinner.Length;
            Console.Write($"{spinner[charIndex]} {_currentMessage}");
        }
        await Task.Delay(100, _spinnerCts.Token);
    }
});
```
#### Thread Safety
- `UpdateStatus` (producer) writes to channel
- `ProcessStatusUpdatesAsync` (consumer) reads from channel
- `_spinnerTask` runs concurrently
- All UI writes happen in consumer/spinner task context (single-threaded UI)
#### Design Notes
- Could be simplified: just use `Console.CursorLeft` for spinner, no channel
- The channel allows arbitrary `UpdateStatus` calls from any thread without blocking
- Braille spinner requires terminal that supports Unicode (most modern terminals do)
---
## Service Interactions
### Dependency Graph
```
OpenQueryApp
├── OpenRouterClient ← (used for query gen + final answer)
└── SearchTool
    ├── SearxngClient
    ├── ArticleService (uses SmartReader)
    ├── ChunkingService (static)
    ├── EmbeddingService
    │   ├── OpenRouterClient (different instance)
    │   └── RateLimiter
    └── ParallelProcessingOptions (config)
```
### Service Lifetimes
All services are **transient** (new instance per query execution):
- `OpenRouterClient` → 1 instance for query gen + answer
- `SearxngClient` → 1 instance for all searches
- `EmbeddingService` → 1 instance with its own `OpenRouterClient` and `RateLimiter`
- `SearchTool` → 1 instance per query (constructed in `Program.cs`)
No singleton or static state (except static utility classes like `ChunkingService`).
### Data Flow Through Services
```
OpenQueryApp
├─ OpenRouterClient.CompleteAsync() → query generation
│    Messages → JSON → HTTP request → response → JSON → Messages
└─ SearchTool.ExecuteAsync()
   ├─ SearxngClient.SearchAsync() × N
   │    query → URL encode → GET → JSON → SearxngResult[]
   ├─ ArticleService.FetchArticleAsync() × M
   │    URL → HTTP GET → SmartReader → Article
   ├─ ChunkingService.ChunkText() × M
   │    Article.TextContent → List<string> chunks
   ├─ EmbeddingService.GetEmbeddingAsync(query) + GetEmbeddingsAsync(chunks[])
   │    texts → batches → rate-limited HTTP POST → JSON → float[][]
   ├─ CosineSimilarity(queryEmbedding, chunkEmbedding) × M
   │    Vectors → dot product → magnitude → score
   └─ return context string (formatted chunks)
```
---
## Next Steps
- **[OpenQueryApp](../components/openquery-app.md)** - Orchestrates services
- **[SearchTool](../components/search-tool.md)** - Coordinates pipeline
- **[Models](../components/models.md)** - Data structures passed between services
- **[API Reference](../api/cli.md)** - CLI that uses these services
---
**Service Design Principles**:
- Single Responsibility: Each service does one thing well
- Stateless: No instance state beyond constructor args
- Composable: Services depend on abstractions (other services) not implementations
- Testable: Can mock dependencies for unit testing