Services Overview

Comprehensive reference for all service classes in OpenQuery.

📋 Table of Contents

  1. Service Catalog
  2. Client Services
  3. Processing Services
  4. Infrastructure Services
  5. Service Interactions

Service Catalog

OpenQuery's services are organized into three categories:

| Category | Services | Purpose |
|----------|----------|---------|
| Clients | OpenRouterClient, SearxngClient | External API communication |
| Processors | EmbeddingService, ChunkingService, ArticleService | Data transformation & extraction |
| Infrastructure | RateLimiter, StatusReporter | Cross-cutting concerns |

All services are stateless (except for internal configuration) and can be safely reused across multiple operations.


Client Services

OpenRouterClient

Location: Services/OpenRouterClient.cs
Purpose: HTTP client for OpenRouter AI APIs (chat completions & embeddings)

API Endpoints

| Method | Endpoint | Purpose |
|--------|----------|---------|
| POST | /chat/completions | Chat completion (streaming or non-streaming) |
| POST | /embeddings | Embedding generation for text inputs |

Authentication

Authorization: Bearer {apiKey}
Accept: application/json

Public Methods

StreamAsync(ChatCompletionRequest request, CancellationToken cancellationToken)
  • Returns: IAsyncEnumerable<StreamChunk>
  • Behavior: Sets request.Stream = true, posts, reads Server-Sent Events stream
  • Use Case: Final answer streaming, real-time responses
  • Stream Format: SSE lines data: {json}; yields TextDelta or ToolCall
CompleteAsync(ChatCompletionRequest request)
  • Returns: Task<ChatCompletionResponse>
  • Behavior: Sets request.Stream = false, posts, returns full response
  • Use Case: Query generation (non-streaming)
EmbedAsync(string model, List<string> inputs)
  • Returns: Task<float[][]>
  • Behavior: POST /embeddings, returns array of vectors (ordered by input index)
  • Use Case: Batch embedding generation
HttpClient
  • Property: Internal _httpClient (created per instance)
  • Note: Could use IHttpClientFactory for pooling (not needed for CLI)
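
The SSE stream format mentioned for StreamAsync can be read line by line. The following is a minimal sketch, not the actual OpenQuery code: `SseReader` and `ReadDataPayloadsAsync` are hypothetical names, the `[DONE]` sentinel is an assumption borrowed from the common chat-completion streaming convention, and mapping each JSON payload to a StreamChunk is left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of OpenQuery: extracts the JSON payloads
// from an SSE stream made of "data: {json}" lines.
public static class SseReader
{
    public static async IAsyncEnumerable<string> ReadDataPayloadsAsync(
        TextReader reader,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        while (true)
        {
            cancellationToken.ThrowIfCancellationRequested();

            string? line = await reader.ReadLineAsync();
            if (line is null) yield break;          // end of stream

            // Skip blank separator lines and SSE comments (lines starting with ':').
            if (!line.StartsWith("data:", StringComparison.Ordinal)) continue;

            string payload = line.Substring("data:".Length).Trim();
            if (payload == "[DONE]") yield break;   // assumed end-of-stream sentinel
            if (payload.Length > 0) yield return payload;
        }
    }
}
```

A caller would typically wrap the HTTP response stream in a `StreamReader`, pass it to this helper, and deserialize each payload as it arrives.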

Error Handling

  • EnsureSuccessStatusCode() throws HttpRequestException on 4xx/5xx
  • No retry logic in the client itself (retries for embedding calls are handled by EmbeddingService)

Configuration

public OpenRouterClient(string apiKey)
{
    _apiKey = apiKey;
    _httpClient = new HttpClient();
    _httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
    _httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
}

Example Usage

var client = new OpenRouterClient("sk-or-...");
var request = new ChatCompletionRequest("model", new List<Message> { ... });
await foreach (var chunk in client.StreamAsync(request))
{
    Console.Write(chunk.TextDelta);
}

SearxngClient

Location: Services/SearxngClient.cs
Purpose: HTTP client for SearxNG metasearch engine

API Endpoint

GET /search?q={query}&format=json

Constructor

public SearxngClient(string baseUrl)  // e.g., "http://localhost:8002"
  • baseUrl trimmed of trailing /

Public Methods

SearchAsync(string query, int limit = 10)
  • Returns: Task<List<SearxngResult>>
  • Behavior: GET request, deserialize JSON, take up to limit results
  • On Failure: Returns empty List<SearxngResult> (no exception)

Error Handling

  • The code does not call response.EnsureSuccessStatusCode(), so HTTP error statuses do not throw
  • If the JSON is invalid or Results is missing, returns an empty list
  • Failures are tolerated: an individual search query may fail without aborting the whole operation
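
The request shape and the tolerant parsing described above could look roughly like this. It is a sketch under assumptions rather than the real SearxngClient: `SearchHit` is a hypothetical stand-in for SearxngResult, `SearxngSketch` is an invented name, and the parser reads only the three fields shown in the example response.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical stand-in for SearxngResult; the real model may have more fields.
public record SearchHit(string Title, string Url, string Content);

public static class SearxngSketch
{
    // Builds GET /search?q={query}&format=json; a trailing '/' on baseUrl is trimmed.
    public static string BuildSearchUrl(string baseUrl, string query) =>
        $"{baseUrl.TrimEnd('/')}/search?q={Uri.EscapeDataString(query)}&format=json";

    // Returns up to `limit` hits; invalid JSON or a missing "results" array yields an empty list.
    public static List<SearchHit> ParseResults(string json, int limit = 10)
    {
        var hits = new List<SearchHit>();
        try
        {
            using var doc = JsonDocument.Parse(json);
            if (doc.RootElement.ValueKind != JsonValueKind.Object ||
                !doc.RootElement.TryGetProperty("results", out var results) ||
                results.ValueKind != JsonValueKind.Array)
            {
                return hits;
            }

            foreach (var item in results.EnumerateArray())
            {
                if (hits.Count >= limit) break;
                if (item.ValueKind != JsonValueKind.Object) continue;
                hits.Add(new SearchHit(
                    GetString(item, "title"),
                    GetString(item, "url"),
                    GetString(item, "content")));
            }
        }
        catch (JsonException)
        {
            return new List<SearchHit>();   // tolerate malformed responses
        }
        return hits;
    }

    // Reads a string property, falling back to "" when it is absent or not a string.
    private static string GetString(JsonElement item, string name) =>
        item.TryGetProperty(name, out var value) && value.ValueKind == JsonValueKind.String
            ? value.GetString() ?? string.Empty
            : string.Empty;
}
```

Returning an empty list instead of throwing lets the caller treat a failed query the same way as a query with no hits.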

Example Searxng Response

{
  "results": [
    {
      "title": "Quantum Entanglement - Wikipedia",
      "url": "https://en.wikipedia.org/wiki/Quantum_entanglement",
      "content": "Quantum entanglement is a physical phenomenon..."
    },
    ...
  ]
}

Processing Services

EmbeddingService

Location: Services/EmbeddingService.cs
Purpose: Generate embeddings with batching, rate limiting, and retry logic

Configuration

Embedding Model: openai/text-embedding-3-small (default, configurable via constructor)

ParallelProcessingOptions (hardcoded defaults):

public class ParallelProcessingOptions
{
    public int MaxConcurrentEmbeddingRequests { get; set; } = 4;
    public int EmbeddingBatchSize { get; set; } = 300;
}

Public Methods

GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)
  • Returns: Task<float[][]>
  • Behavior:
    • Splits texts into batches of EmbeddingBatchSize
    • Parallel executes batches (max MaxConcurrentEmbeddingRequests concurrent)
    • Each batch: rate-limited, retry-wrapped client.EmbedAsync(model, batch)
    • Reassembles in original order
    • Failed batches → empty float[] for each text
  • Progress: Invokes onProgress for each batch: "[Generating embeddings: batch X/Y]"
  • Thread-Safe: Uses lock for collecting results
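
The batching behavior listed for GetEmbeddingsAsync can be sketched as follows. This is an illustration, not the service's code: `BatchEmbeddingSketch`, `EmbedAllAsync`, and the `embedBatch` delegate are hypothetical, with the delegate standing in for the rate-limited, retry-wrapped client call. The sketch also writes each batch into its own slice of a preallocated array instead of using a lock, which is a different technique from the one the service is described as using.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BatchEmbeddingSketch
{
    public static async Task<float[][]> EmbedAllAsync(
        IReadOnlyList<string> texts,
        Func<IReadOnlyList<string>, CancellationToken, Task<float[][]>> embedBatch,
        int batchSize = 300,
        int maxConcurrency = 4,
        CancellationToken cancellationToken = default)
    {
        // One slot per input text keeps the output in the original order.
        var results = new float[texts.Count][];
        using var gate = new SemaphoreSlim(maxConcurrency);

        var tasks = new List<Task>();
        for (int start = 0; start < texts.Count; start += batchSize)
        {
            var batch = texts.Skip(start).Take(batchSize).ToList();
            tasks.Add(RunBatchAsync(start, batch));
        }
        await Task.WhenAll(tasks);
        return results;

        async Task RunBatchAsync(int offset, List<string> batch)
        {
            await gate.WaitAsync(cancellationToken);   // at most maxConcurrency batches in flight
            try
            {
                float[][] vectors;
                try
                {
                    vectors = await embedBatch(batch, cancellationToken);
                }
                catch (Exception ex) when (ex is not OperationCanceledException)
                {
                    vectors = Array.Empty<float[]>();  // failed batch: empty vectors below
                }

                for (int i = 0; i < batch.Count; i++)
                    results[offset + i] = i < vectors.Length ? vectors[i] : Array.Empty<float>();
            }
            finally
            {
                gate.Release();
            }
        }
    }
}
```

Because every batch writes to a disjoint index range, no additional synchronization is needed in this variant; a lock becomes necessary when results are appended to a shared collection.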
GetEmbeddingAsync(string text, CancellationToken)
  • Returns: Task<float[]>
  • Behavior: Single embedding with rate limiting and retry
  • Use Case: Query embedding
static float CosineSimilarity(float[] vector1, float[] vector2)
  • Returns: float between -1 and 1 (typically 0-1 for normalized embeddings)
  • Implementation: Single line calling the SIMD-accelerated System.Numerics.Tensors.TensorPrimitives.CosineSimilarity
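
For readers who want to see what the similarity score computes, here is a plain-loop equivalent. It is a sketch for illustration only: `CosineSketch` is a hypothetical name, and the loop replaces the SIMD-accelerated library call rather than reproducing it.

```csharp
using System;

public static class CosineSketch
{
    // Cosine similarity: dot(a, b) / (|a| * |b|).
    public static float CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        float dot = 0f, normA = 0f, normB = 0f;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];      // numerator
            normA += a[i] * a[i];    // squared magnitude of a
            normB += b[i] * b[i];    // squared magnitude of b
        }
        return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
    }
}
```

In this sketch, empty or all-zero vectors divide zero by zero and return NaN, so a caller that ranks chunks would want to skip the empty vectors produced for failed batches.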


ArticleService

Location: Services/ArticleService.cs
Purpose: Extract clean article content from web URLs

Public Methods

FetchArticleAsync(string url)
  • Returns: Task<Article>
  • Behavior: Delegates to SmartReader.ParseArticleAsync(url)
  • Result: Article with Title, TextContent, IsReadable, and metadata

Errors

  • Propagates exceptions (SmartReader may throw on network failures, malformed HTML)
  • SearchTool catches and logs

SmartReader Notes

  • Open-source article extraction library (bundled via NuGet)
  • Uses Readability algorithm (similar to Firefox Reader View)
  • Removes ads, navigation, boilerplate
  • IsReadable indicates quality (e.g., not a 404 page, not too short)

ChunkingService

Location: Services/ChunkingService.cs
Purpose: Split text into chunks of at most 500 characters, breaking at natural boundaries

Public Methods

ChunkText(string text)
  • Returns: List<string>
  • Algorithm:
    • Constant MAX_CHUNK_SIZE = 500
    • While remaining text:
      • Take up to 500 chars
      • If not at end, backtrack to last [' ', '\n', '\r', '.', '!']
      • Trim, add if non-empty
      • Advance start
    • Returns all chunks
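
The algorithm above can be sketched as follows. This is one possible reading of the steps, not the actual ChunkingService source: `ChunkingSketch` is a hypothetical name, and two details are assumptions, namely that the break character stays in the chunk it ends and that text without any break character is cut hard at 500 characters.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkingSketch
{
    private const int MaxChunkSize = 500;
    private static readonly char[] BreakChars = { ' ', '\n', '\r', '.', '!' };

    public static List<string> ChunkText(string text)
    {
        var chunks = new List<string>();
        if (string.IsNullOrWhiteSpace(text)) return chunks;

        int start = 0;
        while (start < text.Length)
        {
            int length = Math.Min(MaxChunkSize, text.Length - start);

            if (start + length < text.Length)
            {
                // Not at the end: backtrack to the last break character in the window.
                int breakIndex = text.LastIndexOfAny(BreakChars, start + length - 1, length);
                if (breakIndex > start)
                    length = breakIndex - start + 1;   // keep the break character in this chunk
                // No usable break found: fall back to a hard cut at MaxChunkSize.
            }

            string chunk = text.Substring(start, length).Trim();
            if (chunk.Length > 0) chunks.Add(chunk);

            start += length;   // length is always >= 1, so the loop terminates
        }
        return chunks;
    }
}
```

Under these assumptions, a 1,200-character string with no break characters produces chunks of 500, 500, and 200 characters.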

Characteristics

  • Static class (no instances)
  • Pure function (no side effects)
  • Zero dependencies
  • Handles edge cases (empty text, short text, text without breaks)

Infrastructure Services

RateLimiter

Location: Services/RateLimiter.cs
Purpose: Limit concurrent operations using semaphore

Constructor

public RateLimiter(int maxConcurrentRequests)

Creates SemaphoreSlim with maxConcurrentRequests

Public Methods

ExecuteAsync<T>(Func<Task<T>> action, CancellationToken)
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken cancellationToken = default)
{
    await _semaphore.WaitAsync(cancellationToken);
    try
    {
        return await action();
    }
    finally
    {
        _semaphore.Release();
    }
}
  • Waits for semaphore slot
  • Executes action (typically an API call)
  • Releases semaphore (even if exception)
  • Returns result from action
ExecuteAsync(Func<Task> action, CancellationToken)
  • Non-generic version (for void-returning actions)
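
The non-generic overload follows the same wait, run, release pattern as the generic one. A minimal self-contained sketch is shown below; `RateLimiterSketch` is an illustrative re-creation rather than the actual class, and it implements IDisposable only to keep the example short.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Illustrative re-creation, not the actual OpenQuery source.
public sealed class RateLimiterSketch : IDisposable
{
    private readonly SemaphoreSlim _semaphore;

    public RateLimiterSketch(int maxConcurrentRequests) =>
        _semaphore = new SemaphoreSlim(maxConcurrentRequests, maxConcurrentRequests);

    // Non-generic overload: same pattern as ExecuteAsync<T>, without a result value.
    public async Task ExecuteAsync(Func<Task> action, CancellationToken cancellationToken = default)
    {
        await _semaphore.WaitAsync(cancellationToken);   // wait for a free slot
        try
        {
            await action();
        }
        finally
        {
            _semaphore.Release();                        // released even if action throws
        }
    }

    public void Dispose() => _semaphore.Dispose();
}
```

With a limit of 3, starting ten actions at once means no more than three of them are inside `action()` at any moment; the rest wait on the semaphore.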

Disposal

public async ValueTask DisposeAsync()
{
    _semaphore.Dispose();
}

Implements IAsyncDisposable for async cleanup

Usage Pattern

var result = await _rateLimiter.ExecuteAsync(async () =>
{
    return await SomeApiCall();
}, cancellationToken);

Where Used

  • EmbeddingService: Limits concurrent embedding batch requests (default 4)

StatusReporter

Location: Services/StatusReporter.cs
Purpose: Real-time progress display with spinner (compact) or verbose lines

Constructor

public StatusReporter(bool verbose)
  • verbose = true: all progress via WriteLine() (no spinner)
  • verbose = false: spinner with latest status

Architecture

Components:

  • Channel<string> _statusChannel - producer-consumer queue
  • Task _statusProcessor - background task reading from channel
  • CancellationTokenSource _spinnerCts - spinner task cancellation
  • Task _spinnerTask - spinner animation task
  • char[] _spinnerChars - Braille spinner pattern

Spinner Animation:

  • Runs at 10 FPS (100ms interval)
  • Cycles through ['⠋','⠙','⠹','⠸','⠼','⠴','⠦','⠧','⠇','⠏']
  • Displays: ⠋ Fetching articles...
  • Updates in place using ANSI: \r\x1b[K (carriage return + erase line)

Public Methods

UpdateStatus(string message)
  • Fire-and-forget: writes to channel via TryWrite (non-blocking)
  • If channel full, message dropped (acceptable loss for UI)
WriteLine(string text)
  • Stops spinner temporarily
  • Clears current status line
  • Writes text with newline
  • In verbose mode: just Console.WriteLine(text)
ClearStatus()
  • In compact mode: Console.Write("\r\x1b[K") (erase line)
  • In verbose: no-op
  • Sets _currentMessage = null
StartSpinner() / StopSpinner()
  • Manual control (StartSpinner is usually called from the constructor, StopSpinner from Dispose)
Dispose()
  • Completes channel writer
  • Awaits _statusProcessor completion
  • Calls StopSpinner()
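
The fire-and-forget behavior of UpdateStatus, including the dropped message when the channel is full, can be illustrated with a bounded channel. This is a sketch under assumptions: the source does not state how the channel is created or what its capacity is, and `StatusQueueSketch`, `Complete`, and `DrainAsync` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class StatusQueueSketch
{
    private readonly Channel<string> _channel;

    public StatusQueueSketch(int capacity)
    {
        // Assumption: a bounded channel. In Wait mode, TryWrite returns false when full.
        _channel = Channel.CreateBounded<string>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.Wait,
            SingleReader = true
        });
    }

    // Never blocks; returns false when the message was dropped.
    public bool UpdateStatus(string message) => _channel.Writer.TryWrite(message);

    // Signals that no more messages will be written, as Dispose does in the list above.
    public void Complete() => _channel.Writer.Complete();

    // Consumer side: reads until the writer completes and the queue is empty.
    public async Task<List<string>> DrainAsync()
    {
        var seen = new List<string>();
        await foreach (var message in _channel.Reader.ReadAllAsync())
            seen.Add(message);
        return seen;
    }
}
```

With a capacity of 2 and no consumer running, the third `UpdateStatus` call returns false and that message never reaches the reader.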

Background Processing

Status Processor:

private async Task ProcessStatusUpdatesAsync()
{
    await foreach (var message in _statusChannel.Reader.ReadAllAsync())
    {
        if (_verbose)
        {
            Console.WriteLine(message);
            continue;
        }
        Console.Write("\r\x1b[K");  // Clear line
        Console.Write($"{_spinnerChars[0]} {message}");  // Static spinner
        _currentMessage = message;
    }
}

Spinner Task:

_spinnerTask = Task.Run(async () =>
{
    while (_spinnerCts is { Token.IsCancellationRequested: false })
    {
        if (_currentMessage != null)
        {
            Console.Write("\r\x1b[K");
            var charIndex = index++ % spinner.Length;
            Console.Write($"{spinner[charIndex]} {_currentMessage}");
        }
        await Task.Delay(100, _spinnerCts.Token);
    }
});

Thread Safety

  • UpdateStatus (producer) writes to channel
  • ProcessStatusUpdatesAsync (consumer) reads from channel
  • _spinnerTask runs concurrently
  • Status output is written only from the consumer and spinner tasks, never from the thread that calls UpdateStatus

Design Notes

  • Could be simplified: just use Console.CursorLeft for spinner, no channel
  • The channel allows UpdateStatus to be called from any thread at any time without blocking
  • Braille spinner requires terminal that supports Unicode (most modern terminals do)

Service Interactions

Dependency Graph

OpenQueryApp
├── OpenRouterClient ← (used for query gen + final answer)
└── SearchTool
    ├── SearxngClient
    ├── ArticleService (uses SmartReader)
    ├── ChunkingService (static)
    ├── EmbeddingService
    │   ├── OpenRouterClient (different instance)
    │   └── RateLimiter
    └── ParallelProcessingOptions (config)

Service Lifetimes

All services are transient (new instance per query execution):

  • OpenRouterClient → 1 instance for query gen + answer
  • SearxngClient → 1 instance for all searches
  • EmbeddingService → 1 instance with its own OpenRouterClient and RateLimiter
  • SearchTool → 1 instance per query (constructed in Program.cs)

No singleton or static state (except static utility classes like ChunkingService).

Data Flow Through Services

OpenQueryApp
  │
  ├─ OpenRouterClient.CompleteAsync() → query generation
  │   Messages → JSON → HTTP request → response → JSON → Messages
  │
  └─ SearchTool.ExecuteAsync()
      │
      ├─ SearxngClient.SearchAsync() × N
      │   query → URL encode → GET → JSON → SearxngResult[]
      │
      ├─ ArticleService.FetchArticleAsync() × M
      │   URL → HTTP GET → SmartReader → Article
      │
      ├─ ChunkingService.ChunkText() × M
      │   Article.TextContent → List<string> chunks
      │
      ├─ EmbeddingService.GetEmbeddingAsync(query) + GetEmbeddingsAsync(chunks[])
      │   texts → batches → rate-limited HTTP POST → JSON → float[][]
      │
      ├─ CosineSimilarity(queryEmbedding, chunkEmbedding) × M
      │   Vectors → dot product → magnitude → score
      │
      └─ return context string (formatted chunks)


Service Design Principles:

  • Single Responsibility: Each service does one thing well
  • Stateless: No instance state beyond constructor args
  • Composable: Services depend on abstractions (other services) not implementations
  • Testable: Can mock dependencies for unit testing