# Models Reference

Complete reference for all data models, DTOs, and records in OpenQuery.

## 📋 Table of Contents

1. [Core Data Models](#core-data-models)
2. [OpenRouter API Models](#openrouter-api-models)
3. [SearxNG API Models](#searxng-api-models)
4. [JSON Serialization](#json-serialization)
5. [Model Relationships](#model-relationships)

## Core Data Models

### OpenQueryOptions

**Location**: `Models/OpenQueryOptions.cs`  
**Type**: `record`  
**Purpose**: Immutable options object for a single query execution

```csharp
public record OpenQueryOptions(
    int Chunks,      // Number of top chunks to include in context
    int Results,     // Search results per generated query
    int Queries,     // Number of search queries to generate (if >1)
    bool Short,      // Request concise answer
    bool Long,       // Request detailed answer
    bool Verbose,    // Enable verbose logging
    string Question  // Original user question (required)
);
```

**Lifecycle**:

- Created in `Program.cs` by combining CLI options, config defaults, and environment variables
- Passed to `OpenQueryApp.RunAsync(options)`

**Validation**: None. The record assumes the CLI parser and config have already supplied valid values.

**Example**:

```csharp
var options = new OpenQueryOptions(
    Chunks: 3,
    Results: 5,
    Queries: 3,
    Short: false,
    Long: false,
    Verbose: true,
    Question: "What is quantum entanglement?"
);
```

---

### Chunk

**Location**: `Models/Chunk.cs`  
**Type**: `record`  
**Purpose**: Content chunk with metadata, embedding, and relevance score

```csharp
public record Chunk(
    string Content,       // Text content (typically ~500 chars)
    string SourceUrl,     // Original article URL
    string? Title = null  // Article title (optional, may be null)
)
{
    public float[]? Embedding { get; set; }  // Vector embedding (1536-dim for text-embedding-3-small)
    public float Score { get; set; }         // Relevance score (0-1, higher = more relevant)
}
```

**Lifecycle**:

1. **Created** in `SearchTool.ExecuteParallelArticleFetchingAsync`:

   ```csharp
   chunks.Add(new Chunk(chunkText, result.Url, article.Title));
   ```

   At this point: `Embedding = null`, `Score = 0`

2. **Embedded** in `SearchTool.ExecuteParallelEmbeddingsAsync`:

   ```csharp
   validChunks[i].Embedding = validEmbeddings[i];
   ```

3. **Scored** in `SearchTool.RankAndSelectTopChunks`:

   ```csharp
   chunk.Score = EmbeddingService.CosineSimilarity(queryEmbedding, chunk.Embedding!);
   ```

4. **Formatted** into context string:

   ```csharp
   $"[Source {i+1}: {c.Title ?? "Unknown"}]({c.SourceUrl})\n{c.Content}"
   ```

**Properties**:

- `Content`: Never null or empty (`ChunkingService` filters out empty chunks)
- `SourceUrl`: Always provided (from `SearxngResult.Url`)
- `Title`: May be null if article extraction failed to get a title
- `Embedding`: Null until the embedding step (lifecycle step 2, pipeline Phase 3); may remain null if embedding failed
- `Score`: 0 until the ranking step (lifecycle step 3, pipeline Phase 4); meaningless for chunks that were never embedded

**Equality**: Records use value equality over every field declared in the record, including the backing fields of `Embedding` and `Score`. Two chunks with the same content, URL, and title are therefore equal only if their `Score` values also match and their `Embedding` properties are both null or point to the same array (arrays compare by reference, not by element). Because these two properties are mutable, a chunk's equality and hash code can change as it moves through the pipeline, so avoid using `Chunk` as a dictionary or set key before ranking is complete.
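The four lifecycle steps above can be sketched end to end on a single chunk. This is a minimal illustration, not OpenQuery's actual code: `ChunkLifecycleSketch`, its `CosineSimilarity` helper (a hypothetical stand-in for `EmbeddingService.CosineSimilarity`), the two-dimensional embeddings, and the example URL are all assumptions made for the example.

```csharp
using System;

// Copy of the Chunk record from this page so the sketch compiles on its own.
public record Chunk(string Content, string SourceUrl, string? Title = null)
{
    public float[]? Embedding { get; set; }
    public float Score { get; set; }
}

public static class ChunkLifecycleSketch
{
    // Hypothetical stand-in for EmbeddingService.CosineSimilarity.
    public static float CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        float dot = 0f, normA = 0f, normB = 0f;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // Treat a zero vector as "no similarity" instead of dividing by zero.
        if (normA == 0f || normB == 0f) return 0f;
        return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
    }

    // Walks one chunk through creation, embedding, scoring, and formatting.
    public static string Run()
    {
        // Step 1: created with Embedding == null and Score == 0.
        var chunk = new Chunk("Entangled particles share state.", "https://example.org/a", "Example");

        // Step 2: embedded (made-up 2-dimensional vector instead of a real embedding).
        chunk.Embedding = new float[] { 1f, 0f };

        // Step 3: scored against a made-up query embedding.
        var queryEmbedding = new float[] { 1f, 0f };
        chunk.Score = CosineSimilarity(queryEmbedding, chunk.Embedding);

        // Step 4: formatted the same way as the context string above.
        return $"[Source 1: {chunk.Title ?? "Unknown"}]({chunk.SourceUrl})\n{chunk.Content}";
    }
}
```

Calling `ChunkLifecycleSketch.Run()` returns the formatted context entry for the single chunk; in the real pipeline the same steps run over many chunks in parallel.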

---

### ParallelProcessingOptions

**Location**: `Models/ParallelOptions.cs`  
**Type**: `class`  
**Purpose**: Configuration for parallel/concurrent operations

```csharp
public class ParallelProcessingOptions
{
    public int MaxConcurrentArticleFetches { get; set; } = 10;
    public int MaxConcurrentEmbeddingRequests { get; set; } = 4;
    public int EmbeddingBatchSize { get; set; } = 300;
}
```

**Usage**:

- Instantiated in the `SearchTool` constructor (hardcoded `new`)
- Passed to the `EmbeddingService` constructor
- Read by `SearchTool` for the article-fetching semaphore

**Default Values**:

| Property | Default | Effect |
|----------|---------|--------|
| `MaxConcurrentArticleFetches` | 10 | Up to 10 articles fetched simultaneously |
| `MaxConcurrentEmbeddingRequests` | 4 | Up to 4 embedding batches in parallel |
| `EmbeddingBatchSize` | 300 | Each embedding API call handles up to 300 texts |

**Current Limitation**: These are **compile-time defaults** (hardcoded in `SearchTool.cs`). To make them configurable:

1. Add to `AppConfig`
2. Read in `ConfigManager`
3. Pass through `SearchTool` constructor

---

## OpenRouter API Models

**Location**: `Models/OpenRouter.cs`  
**Purpose**: DTOs for OpenRouter's REST API (JSON serialization)

### Chat Completion

#### `ChatCompletionRequest`

```csharp
public record ChatCompletionRequest(
    [property: JsonPropertyName("model")] string Model,
    [property: JsonPropertyName("messages")] List<Message> Messages,
    [property: JsonPropertyName("tools")] List<ToolDefinition>? Tools = null,
    [property: JsonPropertyName("stream")] bool Stream = false
);
```

**Example**:

```json
{
  "model": "qwen/qwen3.5-flash-02-23",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is 2+2?" }
  ],
  "stream": true
}
```

#### `Message`

```csharp
public record Message(
    [property: JsonPropertyName("role")] string Role,
    [property: JsonPropertyName("content")] string? Content = null,
    [property: JsonPropertyName("tool_calls")] List<ToolCall>? ToolCalls = null,
    [property: JsonPropertyName("tool_call_id")] string? ToolCallId = null
)
{
    // Factory method for tool responses
    public static Message FromTool(string content, string toolCallId) =>
        new Message("tool", content, null, toolCallId);
}
```

**Roles**: `"system"`, `"user"`, `"assistant"`, `"tool"`

**Usage**:

- `Content` for text messages
- `ToolCalls` when the assistant requests tool use
- `ToolCallId` when responding to a tool call

#### `ChatCompletionResponse`

```csharp
public record ChatCompletionResponse(
    [property: JsonPropertyName("choices")] List<Choice> Choices,
    [property: JsonPropertyName("usage")] Usage? Usage = null
);

public record Choice(
    [property: JsonPropertyName("message")] Message Message,
    [property: JsonPropertyName("finish_reason")] string? FinishReason = null
);
```

**Response Example**:

```json
{
  "choices": [
    {
      "message": { "role": "assistant", "content": "Answer text..." },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 100, "completion_tokens": 50, "total_tokens": 150 }
}
```

#### `Usage`

```csharp
public record Usage(
    [property: JsonPropertyName("prompt_tokens")] int PromptTokens,
    [property: JsonPropertyName("completion_tokens")] int CompletionTokens,
    [property: JsonPropertyName("total_tokens")] int TotalTokens
);
```

### Tool Calling (Not Currently Used)

#### `ToolDefinition` / `ToolFunction`

```csharp
public record ToolDefinition(
    [property: JsonPropertyName("type")] string Type,  // e.g., "function"
    [property: JsonPropertyName("function")] ToolFunction Function
);

public record ToolFunction(
    [property: JsonPropertyName("name")] string Name,
    [property: JsonPropertyName("description")] string Description,
    [property: JsonPropertyName("parameters")] JsonElement Parameters  // JSON Schema
);
```

#### `ToolCall` / `FunctionCall`

```csharp
public record ToolCall(
    [property: JsonPropertyName("id")] string Id,
    [property: JsonPropertyName("type")] string Type,
    [property: JsonPropertyName("function")] FunctionCall Function
);

public record FunctionCall(
    [property: JsonPropertyName("name")] string Name,
    [property: JsonPropertyName("arguments")] string Arguments  // JSON string
);
```

**Note**: OpenQuery does not currently use tools; these models are defined for a future tool-calling capability.

### Streaming

#### `StreamChunk`

```csharp
public record StreamChunk(
    string? TextDelta = null,
    ClientToolCall? Tool = null
);
```

Yielded by `OpenRouterClient.StreamAsync()` for each SSE event.

#### `ChatCompletionChunk` (Server Response)

```csharp
public record ChatCompletionChunk(
    [property: JsonPropertyName("choices")] List<ChunkChoice> Choices
);

public record ChunkChoice(
    [property: JsonPropertyName("delta")] ChunkDelta Delta
);

public record ChunkDelta(
    [property: JsonPropertyName("content")] string? Content = null,
    [property: JsonPropertyName("tool_calls")] List<ToolCall>? ToolCalls = null
);
```

**Streaming Response Example** (SSE):

```
data: {"choices":[{"delta":{"content":"Hello"}}]}

data: {"choices":[{"delta":{"content":" world"}}]}

data: [DONE]
```

`OpenRouterClient.StreamAsync` parses each event and yields a `StreamChunk` with a non-null `TextDelta` whenever the delta carries content.

### Embeddings

#### `EmbeddingRequest`

```csharp
public record EmbeddingRequest(
    [property: JsonPropertyName("model")] string Model,
    [property: JsonPropertyName("input")] List<string> Input
);
```

**Example**:

```json
{
  "model": "openai/text-embedding-3-small",
  "input": ["text 1", "text 2", ...]
}
```

#### `EmbeddingResponse`

```csharp
public record EmbeddingResponse(
    [property: JsonPropertyName("data")] List<EmbeddingData> Data,
    [property: JsonPropertyName("usage")] Usage Usage
);

public record EmbeddingData(
    [property: JsonPropertyName("embedding")] float[] Embedding,
    [property: JsonPropertyName("index")] int Index
);
```

**Response Example**:

```json
{
  "data": [
    { "embedding": [0.1, 0.2, ...], "index": 0 },
    { "embedding": [0.3, 0.4, ...], "index": 1 }
  ],
  "usage": { "prompt_tokens": 100, "total_tokens": 100 }
}
```

**Note**: `_client.EmbedAsync` sorts the returned items by `index` so that the embeddings line up with the input order.
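
The index-based ordering mentioned in the note can be sketched as follows. This is an assumption about how such a step might look, not the actual body of `EmbedAsync`; `EmbeddingOrderSketch` and `InInputOrder` are hypothetical names, and the `JsonPropertyName` attributes are omitted to keep the example short.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified copy of EmbeddingData (serialization attributes left out).
public record EmbeddingData(float[] Embedding, int Index);

public static class EmbeddingOrderSketch
{
    // Returns embeddings so that result[i] belongs to input text i,
    // even if the API returned the items in a different order.
    public static float[][] InInputOrder(IEnumerable<EmbeddingData> data) =>
        data.OrderBy(d => d.Index)
            .Select(d => d.Embedding)
            .ToArray();
}
```

Sorting by `Index` rather than trusting the response order keeps each embedding paired with the chunk text that produced it, which matters once the vectors are assigned back with `validChunks[i].Embedding = validEmbeddings[i]`.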

---

## SearxNG API Models

**Location**: `Models/Searxng.cs`  
**Purpose**: DTOs for SearxNG's JSON response format

### `SearxngRoot`

```csharp
public record SearxngRoot(
    [property: JsonPropertyName("results")] List<SearxngResult> Results
);
```

Top-level response object.

### `SearxngResult`

```csharp
public record SearxngResult(
    [property: JsonPropertyName("title")] string Title,
    [property: JsonPropertyName("url")] string Url,
    [property: JsonPropertyName("content")] string Content  // Snippet/description
);
```

**Fields**:

- `Title`: Result title (from the page `<title>` or OpenGraph)
- `Url`: Absolute URL to the article
- `Content`: Short snippet (~200 chars) from the search engine

**Usage**:

- `Url` is passed to `ArticleService.FetchArticleAsync`
- `Title` is used as a fallback if article extraction fails
- `Content` is currently unused (it could serve quick answers without fetching the article)

**Example Response**:

```json
{
  "results": [
    {
      "title": "Quantum Entanglement - Wikipedia",
      "url": "https://en.wikipedia.org/wiki/Quantum_entanglement",
      "content": "Quantum entanglement is a physical phenomenon..."
    }
  ]
}
```

---

## JSON Serialization

### JsonContext (Source Generation)

**Location**: `Models/JsonContexts.cs`  
**Purpose**: Provide a source-generated JSON serializer context for AOT compatibility

#### Declaration

```csharp
[JsonSerializable(typeof(ChatCompletionRequest))]
[JsonSerializable(typeof(ChatCompletionResponse))]
[JsonSerializable(typeof(ChatCompletionChunk))]
[JsonSerializable(typeof(EmbeddingRequest))]
[JsonSerializable(typeof(EmbeddingResponse))]
[JsonSerializable(typeof(SearxngRoot))]
[JsonSerializable(typeof(List<string>))]
internal partial class AppJsonContext : JsonSerializerContext
{
}
```

**Usage**:

```csharp
var json = JsonSerializer.Serialize(request, AppJsonContext.Default.ChatCompletionRequest);
var response = JsonSerializer.Deserialize(json, AppJsonContext.Default.ChatCompletionResponse);
```

**Benefits**:

- **AOT-compatible**: No reflection, works with `PublishAot=true`
- **Performance**: Pre-compiled serializers are faster
- **Trimming safe**: Unused serializers are trimmed automatically

**Generated**: The partial class is completed by the source generator (no manual implementation).

**Important**: Every type that is serialized or deserialized must be listed in a `[JsonSerializable]` attribute; a missing type causes a runtime exception under AOT.

---

## Model Relationships

### Object Graph (Typical Execution)

```
OpenQueryOptions
    ↓
OpenQueryApp.RunAsync()
    │
    ├─ queryGenerationMessages (List<Message>)
    │     ├─ system: "You are an expert researcher..."
    │     └─ user: "Generate N queries for: {question}"
    │           ↓
    │     ChatCompletionRequest → OpenRouter → ChatCompletionResponse
    │           ↓
    │     List<string> generatedQueries
    │
    ├─ SearchTool.ExecuteAsync()
    │     ↓
    │     ┌─────────────────────────────────────
    │     │ Phase 1: Parallel Searches
    │     │ SearxngClient.SearchAsync(query) × N
    │     │   → List<SearxngResult>
    │     │     (Title, Url, Content)
    │     └─────────────────────────────────────
    │     ↓
    │     ┌─────────────────────────────────────
    │     │ Phase 2: Article Fetch & Chunking
    │     │ ArticleService.FetchAsync(Url) × M
    │     │   → Article (TextContent, Title)
    │     │   → ChunkingService.ChunkText → List<string> chunks
    │     │   → Chunk(content, url, title) × K
    │     └─────────────────────────────────────
    │     ↓
    │     ┌─────────────────────────────────────
    │     │ Phase 3: Embeddings
    │     │ EmbeddingService.GetEmbeddingsAsync(chunkContents)
    │     │   → float[][] chunkEmbeddings
    │     │   → Set chunk.Embedding for each
    │     │ Also: GetEmbeddingAsync(question) → float[] queryEmbedding
    │     └─────────────────────────────────────
    │     ↓
    │     ┌─────────────────────────────────────
    │     │ Phase 4: Ranking
    │     │ For each chunk: Score = CosineSimilarity(queryEmbedding, chunk.Embedding)
    │     │   → Set chunk.Score
    │     │   → OrderByDescending(Score)
    │     │   → Take(topChunksLimit) → topChunks (List<Chunk>)
    │     └─────────────────────────────────────
    │     ↓
    │     Context string: formatted topChunks
    │     ↓
    └─ OpenQueryApp → final ChatCompletionRequest
          System: "Answer based on context..."
          User: "Context:\n{context}\n\nQuestion: {question}"
              ↓
          StreamAsync() → StreamChunk.TextDelta → Console
```

### Record Immutability

Most DTOs are `record` types:

- **Immutable**: Positional properties are init-only (`{ get; init; }`)
- **Value semantics**: Equality is based on content
- **Thread-safe**: Instances can be shared across threads, provided the collections they reference (such as `List<Message>`) are not modified after construction

**Exceptions**:

- `Chunk`: Has mutable properties `Embedding` and `Score` (set during the pipeline)
- `ParallelProcessingOptions`: Class with mutable setters
- `AppConfig`: Class with mutable setters

---

## Next Steps

- **[API Reference](../../api/cli.md)** - How these models are used in CLI commands
- **[OpenRouterClient](../../services/OpenRouterClient.md)** - Uses OpenRouter models
- **[SearxngClient](../../services/SearxngClient.md)** - Uses Searxng models
- **[SearchTool](../../components/search-tool.md)** - Orchestrates all models

---

**Quick Reference Table**

| Model | Category | Purpose | Mutable? |
|-------|----------|---------|----------|
| `OpenQueryOptions` | Core | CLI options | No (record) |
| `Chunk` | Core | Content + metadata + ranking | Partially (Embedding, Score) |
| `ParallelProcessingOptions` | Config | Concurrency settings | Yes (class) |
| `ChatCompletionRequest/Response` | OpenRouter | LLM API | No |
| `EmbeddingRequest/Response` | OpenRouter | Embeddings API | No |
| `SearxngRoot/Result` | SearxNG | Search results | No |
| `AppJsonContext` | Internal | JSON serialization | No (generated partial) |
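
The "No (record)" entries in the table mean a value cannot be changed in place after construction. When one option needs to differ, a `with` expression produces a modified copy and leaves the original untouched. A minimal sketch, assuming the `OpenQueryOptions` shape shown earlier; `OptionsSketch` and `WithChunks` are hypothetical names introduced only for this example.

```csharp
// Copy of the OpenQueryOptions record from this page so the sketch compiles on its own.
public record OpenQueryOptions(
    int Chunks,
    int Results,
    int Queries,
    bool Short,
    bool Long,
    bool Verbose,
    string Question
);

public static class OptionsSketch
{
    // Returns a copy with a different chunk count; the input record is not modified.
    public static OpenQueryOptions WithChunks(OpenQueryOptions options, int chunks) =>
        options with { Chunks = chunks };
}
```

The same pattern does not protect `Chunk.Embedding` or `Chunk.Score`, which have public setters and can be changed on an existing instance.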