docs: add comprehensive documentation with README and detailed guides

- Add user-friendly README.md with quick start guide
- Create docs/ folder with structured technical documentation:
  - installation.md: Build and setup instructions
  - configuration.md: Complete config reference
  - usage.md: CLI usage guide with examples
  - architecture.md: System design and patterns
  - components/: Deep dive into each component (OpenQueryApp, SearchTool, Services, Models)
  - api/: CLI reference, environment variables, programmatic API
  - troubleshooting.md: Common issues and solutions
  - performance.md: Latency, throughput, and optimization
- All documentation fully cross-referenced with internal links
- Covers project overview, architecture, components, APIs, and support

See individual files for complete documentation.
Author: OpenQuery Documentation
Date: 2026-03-19 10:01:58 +01:00
Parent: b28d8998f7
Commit: 65ca2401ae
16 changed files with 7073 additions and 0 deletions

# OpenQueryApp Component
Deep dive into the `OpenQueryApp` class - the main application orchestrator.
## Overview
`OpenQueryApp` is the heart of OpenQuery. It coordinates all components, manages the workflow from question to answer, and handles progress reporting.
## Location
`OpenQuery.cs` in project root
## Class Definition
```csharp
public class OpenQueryApp
{
    private readonly OpenRouterClient _client;
    private readonly SearchTool _searchTool;
    private readonly string _model;

    public OpenQueryApp(
        OpenRouterClient client,
        SearchTool searchTool,
        string model);

    public async Task RunAsync(OpenQueryOptions options);
}
```
**Dependencies**:
- `OpenRouterClient` - for query generation and final answer streaming
- `SearchTool` - for search-retrieve-rank pipeline
- `string _model` - model identifier to use for LLM calls
**Lifecycle**: Instantiated once per query execution in `Program.cs`, then `RunAsync()` called once.
## RunAsync Workflow
```csharp
public async Task RunAsync(OpenQueryOptions options)
{
    // 1. Setup
    using var reporter = new StatusReporter(options.Verbose);
    reporter.StartSpinner();

    // 2. Query Generation (if needed)
    List<string> queries = await GenerateQueriesIfNeededAsync(options, reporter);

    // 3. Search Pipeline
    string searchResult = await ExecuteSearchPipelineAsync(options, queries, reporter);

    // 4. Final Answer Streaming
    await StreamFinalAnswerAsync(options, searchResult, reporter);
}
```
### Step 1: Status Reporter Setup
```csharp
using var reporter = new StatusReporter(options.Verbose);
reporter.StartSpinner();
```
- Creates `StatusReporter` (implements `IDisposable`)
- Starts spinner animation (unless verbose)
- `using` ensures disposal on exit
### Step 2: Query Generation
**When**: `options.Queries > 1` (user wants multiple search queries)
**Purpose**: Use LLM to generate diverse, optimized search queries from the original question
**System Prompt** (hardcoded in `OpenQuery.cs`):
```
You are an expert researcher. The user will ask a question. Your task is to
generate optimal search queries to gather comprehensive information.
Instructions:
1. Break down complex questions.
2. Use synonyms and alternative phrasing.
3. Target different aspects (entities, mechanisms, pros/cons, history).
CRITICAL: Output must be a valid JSON array of strings ONLY. No markdown,
explanations, or other text.
```
**Request**:
```csharp
var queryGenMessages = new List<Message>
{
    new Message("system", systemPrompt),
    new Message("user", $"Generate {options.Queries} distinct search queries for:\n{options.Question}")
};
var request = new ChatCompletionRequest(_model, queryGenMessages);
var response = await _client.CompleteAsync(request);
```
**Response Parsing**:
```csharp
var content = response.Choices.FirstOrDefault()?.Message.Content;
if (!string.IsNullOrEmpty(content))
{
    // Remove markdown code fences if present
    content = Regex.Replace(content, @"```json\s*|\s*```", "").Trim();

    // Deserialize to List<string>
    var generatedQueries = JsonSerializer.Deserialize(content, AppJsonContext.Default.ListString);
    if (generatedQueries != null && generatedQueries.Count > 0)
    {
        queries = generatedQueries;
    }
}
```
**Fallback**: If any step fails (exception, null or empty content, invalid JSON), fall back to `new List<string> { options.Question }` — a single query containing the original question.
**Note**: Query generation reuses the same model as final answer. This could be optimized:
- Use cheaper/faster model for query gen
- Separate model configuration
- Cache query generation results
### Step 3: Search Pipeline Execution
```csharp
var searchResult = await _searchTool.ExecuteAsync(
    options.Question,
    queries,
    options.Results,
    options.Chunks,
    (progress) =>
    {
        if (options.Verbose)
            reporter.WriteLine(progress);
        else
            reporter.UpdateStatus(progress); // condensed to a short status line
    },
    options.Verbose);
```
**Parameters**:
- `originalQuery`: User's original question (used for final embedding)
- `generatedQueries`: From step 2 (or fallback)
- `maxResults`: `options.Results` (search results per query)
- `topChunksLimit`: `options.Chunks` (top N chunks to return)
- `onProgress`: Callback to update UI
- `verbose`: Passed through to `SearchTool`
**Returns**: `string context` - formatted context with source citations
**Progress Handling**:
- In verbose mode: all progress printed as lines (via `reporter.WriteLine()`)
- In compact mode: parse progress messages to show concise status (e.g., "Fetching articles 3/10...")
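A compact-mode parser might look like the following sketch. The bracketed message prefixes are assumptions about what `SearchTool` emits, not its actual format:

```csharp
// Sketch: condense a verbose pipeline message into a short status.
// The "[fetch]" / "[rank]" prefixes are hypothetical examples of what
// SearchTool might emit; adjust to the real message format.
static string ToCompactStatus(string progress)
{
    if (progress.StartsWith("[fetch]")) return "Fetching articles...";
    if (progress.StartsWith("[rank]"))  return "Ranking chunks...";
    return "Searching...";
}

// Usage inside the onProgress callback:
// reporter.UpdateStatus(ToCompactStatus(progress));
```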
### Step 4: Final Answer Streaming
**Status Update**:
```csharp
if (!options.Verbose)
{
    reporter.UpdateStatus("Asking AI...");
}
else
{
    reporter.ClearStatus();
    Console.WriteLine();
}
```
**Build System Prompt**:
```csharp
var systemPrompt = "You are a helpful AI assistant. Answer the user's question in depth, based on the provided context. Be precise and accurate. You can mention sources or citations.";
if (options.Short) systemPrompt += " Give a very short concise answer.";
if (options.Long) systemPrompt += " Give a long elaborate detailed answer.";
```
**Prompt Structure**:
```
System: {systemPrompt}
User: Context:
{searchResult}
Question: {options.Question}
```
Where `searchResult` is:
```
[Source 1: Title](URL)
Content chunk 1
[Source 2: Title](URL)
Content chunk 2
...
```
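Assembled as chat messages, this structure could be built as follows (a sketch using the `Message` type shown earlier; it produces the `messages` list consumed by the streaming call):

```csharp
var messages = new List<Message>
{
    new Message("system", systemPrompt),
    new Message("user", $"Context:\n{searchResult}\n\nQuestion: {options.Question}")
};
```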
**Streaming**:
```csharp
var requestStream = new ChatCompletionRequest(_model, messages);
var assistantResponse = new StringBuilder();
var isFirstChunk = true;

using var streamCts = new CancellationTokenSource();
await foreach (var chunk in _client.StreamAsync(requestStream, streamCts.Token))
{
    if (chunk.TextDelta == null) continue;

    if (isFirstChunk)
    {
        reporter.StopSpinner();
        if (!options.Verbose) reporter.ClearStatus();
        else Console.Write("Assistant: ");
        isFirstChunk = false;
    }

    Console.Write(chunk.TextDelta);
    assistantResponse.Append(chunk.TextDelta);
}
```
**Key Points**:
- `StreamAsync` yields `StreamChunk` objects (text deltas)
- First chunk stops spinner and clears status line
- Each delta written to Console immediately (real-time feel)
- Entire response accumulated in `assistantResponse` (though not used elsewhere)
- `CancellationTokenSource` passed but not canceled (Ctrl+C would cancel from outside)
**Finally Block**:
```csharp
finally
{
    reporter.StopSpinner();
}
```
Ensures spinner stops even if streaming fails.
**End**:
```csharp
Console.WriteLine(); // Newline after complete answer
```
## Error Handling
`RunAsync` itself does not catch exceptions. All exceptions propagate to `Program.cs`:
```csharp
try
{
    var openQuery = new OpenQueryApp(client, searchTool, model);
    await openQuery.RunAsync(options);
}
catch (HttpRequestException ex)
{
    Console.Error.WriteLine($"\n[Error] Network request failed. Details: {ex.Message}");
    Environment.Exit(1);
}
catch (Exception ex)
{
    Console.Error.WriteLine($"\n[Error] An unexpected error occurred: {ex.Message}");
    Environment.Exit(1);
}
```
**Common Exceptions**:
- `HttpRequestException` - network failures, API errors
- `JsonException` - malformed JSON from API
- `TaskCanceledException` - timeout or user interrupt
- `Exception` - anything else
**No Retries at This Level**: Fail fast; user sees error immediately. Lower-level retries exist (embedding service).
## Performance Characteristics
**Query Generation**:
- One non-streaming LLM call
- Takes 2-5 seconds depending on model
- Typically <1000 tokens
**Search Pipeline** (`SearchTool.ExecuteAsync`):
- See `SearchTool.md` for detailed timing breakdown
- Total 10-30 seconds typically
**Final Answer Streaming**:
- Streaming LLM call
- Time depends on answer length (typically 5-20 seconds)
- User sees words appear progressively
**Total End-to-End**: 15-50 seconds for typical query
## Design Decisions
### Why Not Stream Query Generation?
Query generation currently uses `CompleteAsync` (non-streaming). It could be streamed, but:
- Queries are short (JSON array)
- Streaming offers no UX benefit (user doesn't see intermediate queries)
- Simpler to wait for all queries before proceeding
### Why Build Prompt Manually Instead of Templates?
Simple string concatenation is fine for a small number of prompts. Pros:
- No template dependencies
- Easy to read and modify
- No runtime compilation overhead
Cons:
- No validation
- Could benefit from prompt engineering framework
### Why Accumulate `assistantResponse` StringBuilder?
Currently built but not used. Could be:
- Saved to file (future feature: `--output file.md`)
- Analyzed for token counting
- Removed if not needed
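If an `--output` flag were added, the accumulated response could be flushed after the streaming loop. This is hypothetical: `options.Output` does not exist in `OpenQueryOptions` today.

```csharp
// Hypothetical: options.Output is an assumed property, not a real one.
if (!string.IsNullOrEmpty(options.Output))
{
    await File.WriteAllTextAsync(options.Output, assistantResponse.ToString());
}
```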
### Could Query Generation Be Cached?
Yes! For repeated questions (common in scripts), cache query results:
- `Dictionary<string, List<string>>` cache in memory
- Or persistent cache (Redis, file)
- Not implemented (low priority)
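An in-memory variant would be small. The field name and helper below are hypothetical:

```csharp
// Hypothetical cache keyed by the original question.
private readonly Dictionary<string, List<string>> _queryCache = new();

private bool TryGetCachedQueries(string question, out List<string>? queries)
    => _queryCache.TryGetValue(question, out queries);
```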
### Single Responsibility Violation?
`OpenQueryApp` does:
- Query generation
- Pipeline orchestration
- Answer streaming
That's 3 responsibilities, but they're tightly coupled to the "query → answer" workflow. Separating them would add complexity without clear benefit. Acceptable as "application coordinator".
## Extension Points
### Adding New Model for Query Generation
Currently uses same `_model` for queries and answer. To use different models:
1. Add `queryGenerationModel` parameter to constructor
2. Use it for query gen: `new ChatCompletionRequest(queryGenerationModel, queryGenMessages)`
3. Keep `_model` for final answer
Or make it configurable via environment variable: `OPENROUTER_QUERY_MODEL`
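The environment-variable approach could be wired in like this (a sketch; `OPENROUTER_QUERY_MODEL` is the name suggested above, not an existing variable):

```csharp
// Fall back to the main model when OPENROUTER_QUERY_MODEL is unset.
var queryGenModel =
    Environment.GetEnvironmentVariable("OPENROUTER_QUERY_MODEL") ?? _model;
var request = new ChatCompletionRequest(queryGenModel, queryGenMessages);
```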
### Post-Processing Answer
Opportunity to add:
- Source citation formatting (footnotes, clickable links)
- Answer summarization
- Export to Markdown/JSON
- Text-to-speech
Add after streaming loop, before final newline.
### Progress UI Enhancement
Current `StatusReporter` is basic. Could add:
- Progress bar with percentage
- ETA calculation
- Colors (ANSI) for different message types
- Logging to file
- Web dashboard
Would require extending `StatusReporter` or replacing it.
## Testing Considerations
**Challenges**:
- `RunAsync` is monolithic (hard to unit test in isolation)
- Depends on many services (need mocks)
- Asynchronous and streaming
**Recommended Approach**:
1. Extract interfaces:
- `ISearchTool` (wrapper around `SearchTool`)
- `IOpenRouterClient` (wrapper around `OpenRouterClient`)
2. Mock interfaces in tests
3. Test query generation parsing separately
4. Test progress callback counting
5. Test final answer prompt construction
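The interface extraction in step 1 might look like this sketch. Member names mirror the calls shown in this document; `ChatCompletionResponse` is the assumed return type of `CompleteAsync`:

```csharp
// Minimal interfaces mirroring the calls OpenQueryApp makes,
// so tests can substitute mocks.
public interface ISearchTool
{
    Task<string> ExecuteAsync(
        string originalQuery,
        List<string> queries,
        int maxResults,
        int topChunksLimit,
        Action<string> onProgress,
        bool verbose);
}

public interface IOpenRouterClient
{
    Task<ChatCompletionResponse> CompleteAsync(ChatCompletionRequest request);
    IAsyncEnumerable<StreamChunk> StreamAsync(
        ChatCompletionRequest request,
        CancellationToken cancellationToken);
}
```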
**Integration Tests**:
- End-to-end with real/mocked APIs
- Automated tests with test SearxNG/OpenRouter instances
## Related Components
- **[SearchTool](search-tool.md)** - pipeline executed by `OpenQueryApp`
- **[Program.cs](../Program.md)** - creates `OpenQueryApp`
- **[StatusReporter](../services/StatusReporter.md)** - progress UI used by `OpenQueryApp`
---
## Next Steps
- [SearchTool](search-tool.md) - See the pipeline in detail
- [Services](../services/overview.md) - Understand each service
- [CLI Reference](../../api/cli.md) - How users invoke this