docs: add comprehensive documentation with README and detailed guides
- Add user-friendly README.md with quick start guide
- Create docs/ folder with structured technical documentation:
  - installation.md: build and setup instructions
  - configuration.md: complete config reference
  - usage.md: CLI usage guide with examples
  - architecture.md: system design and patterns
  - components/: deep dive into each component (OpenQueryApp, SearchTool, Services, Models)
  - api/: CLI reference, environment variables, programmatic API
  - troubleshooting.md: common issues and solutions
  - performance.md: latency, throughput, and optimization
- All documentation fully cross-referenced with internal links
- Covers project overview, architecture, components, APIs, and support

See individual files for complete documentation.
# OpenQueryApp Component

Deep dive into the `OpenQueryApp` class - the main application orchestrator.

## Overview

`OpenQueryApp` is the heart of OpenQuery. It coordinates all components, manages the workflow from question to answer, and handles progress reporting.

## Location

`OpenQuery.cs` in the project root.

## Class Definition

```csharp
public class OpenQueryApp
{
    private readonly OpenRouterClient _client;
    private readonly SearchTool _searchTool;
    private readonly string _model;

    public OpenQueryApp(
        OpenRouterClient client,
        SearchTool searchTool,
        string model);

    public async Task RunAsync(OpenQueryOptions options);
}
```

**Dependencies**:
- `OpenRouterClient` - for query generation and final-answer streaming
- `SearchTool` - for the search-retrieve-rank pipeline
- `string _model` - model identifier used for LLM calls

**Lifecycle**: Instantiated once per query execution in `Program.cs`; `RunAsync()` is then called once.

## RunAsync Workflow

```csharp
public async Task RunAsync(OpenQueryOptions options)
{
    // 1. Setup
    using var reporter = new StatusReporter(options.Verbose);
    reporter.StartSpinner();

    // 2. Query Generation (if needed)
    List<string> queries = await GenerateQueriesIfNeededAsync(options, reporter);

    // 3. Search Pipeline
    string searchResult = await ExecuteSearchPipelineAsync(options, queries, reporter);

    // 4. Final Answer Streaming
    await StreamFinalAnswerAsync(options, searchResult, reporter);
}
```

### Step 1: Status Reporter Setup

```csharp
using var reporter = new StatusReporter(options.Verbose);
reporter.StartSpinner();
```

- Creates a `StatusReporter` (implements `IDisposable`)
- Starts the spinner animation (unless verbose)
- `using` ensures disposal on exit

### Step 2: Query Generation

**When**: `options.Queries > 1` (the user wants multiple search queries)

**Purpose**: Use the LLM to generate diverse, optimized search queries from the original question.

**System Prompt** (hardcoded in `OpenQuery.cs`):
```
You are an expert researcher. The user will ask a question. Your task is to
generate optimal search queries to gather comprehensive information.

Instructions:
1. Break down complex questions.
2. Use synonyms and alternative phrasing.
3. Target different aspects (entities, mechanisms, pros/cons, history).

CRITICAL: Output must be a valid JSON array of strings ONLY. No markdown,
explanations, or other text.
```

**Request**:
```csharp
var queryGenMessages = new List<Message>
{
    new Message("system", systemPrompt),
    new Message("user", $"Generate {options.Queries} distinct search queries for:\n{options.Question}")
};
var request = new ChatCompletionRequest(_model, queryGenMessages);
var response = await _client.CompleteAsync(request);
```

**Response Parsing**:
```csharp
var content = response.Choices.FirstOrDefault()?.Message.Content;
if (!string.IsNullOrEmpty(content))
{
    // Remove markdown code fences if present
    content = Regex.Replace(content, @"```json\s*|\s*```", "").Trim();

    // Deserialize to List<string>
    var generatedQueries = JsonSerializer.Deserialize(content, AppJsonContext.Default.ListString);
    if (generatedQueries != null && generatedQueries.Count > 0)
    {
        queries = generatedQueries;
    }
}
```

**Fallback**: If any step fails (exception, null, empty, or invalid JSON), fall back to `new List<string> { options.Question }` (a single query: the original question).
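The parse-plus-fallback behavior can be condensed into a small, self-contained sketch (the helper name `ParseOrFallback` is ours, not the project's):

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.RegularExpressions;

static class QueryParsing
{
    // Parse an LLM response into a query list; on any failure
    // (null/empty content, malformed JSON, empty array) fall back
    // to the original question as the single query.
    public static List<string> ParseOrFallback(string? content, string question)
    {
        try
        {
            if (!string.IsNullOrEmpty(content))
            {
                // Strip markdown code fences if present, as above.
                var cleaned = Regex.Replace(content, @"```json\s*|\s*```", "").Trim();
                var queries = JsonSerializer.Deserialize<List<string>>(cleaned);
                if (queries is { Count: > 0 }) return queries;
            }
        }
        catch (JsonException)
        {
            // Malformed JSON -> fall through to the fallback.
        }
        return new List<string> { question };
    }
}
```

In the real code the fallback also covers exceptions thrown by the LLM call itself, before parsing ever runs.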

**Note**: Query generation reuses the same model as the final answer. This could be optimized:
- Use a cheaper/faster model for query generation
- Separate model configuration
- Cache query-generation results

### Step 3: Search Pipeline Execution

```csharp
var searchResult = await _searchTool.ExecuteAsync(
    options.Question,
    queries,
    options.Results,
    options.Chunks,
    (progress) => {
        if (options.Verbose)
            reporter.WriteLine(progress);
        else
            reporter.UpdateStatus(parsedMessage); // parsedMessage is derived from progress in the real code
    },
    options.Verbose);
```

**Parameters**:
- `originalQuery`: the user's original question (used for the final embedding)
- `generatedQueries`: from step 2 (or the fallback)
- `maxResults`: `options.Results` (search results per query)
- `topChunksLimit`: `options.Chunks` (top N chunks to return)
- `onProgress`: callback to update the UI
- `verbose`: passed through to `SearchTool`

**Returns**: `string context` - formatted context with source citations

**Progress Handling**:
- In verbose mode: all progress is printed line by line (via `reporter.WriteLine()`)
- In compact mode: progress messages are parsed into a concise status (e.g., "Fetching articles 3/10...")

### Step 4: Final Answer Streaming

**Status Update**:
```csharp
if (!options.Verbose)
    reporter.UpdateStatus("Asking AI...");
else
{
    reporter.ClearStatus();
    Console.WriteLine();
}
```

**Build System Prompt**:
```csharp
var systemPrompt = "You are a helpful AI assistant. Answer the user's question in depth, based on the provided context. Be precise and accurate. You can mention sources or citations.";
if (options.Short) systemPrompt += " Give a very short concise answer.";
if (options.Long) systemPrompt += " Give a long elaborate detailed answer.";
```

**Prompt Structure**:
```
System: {systemPrompt}
User: Context:
{searchResult}

Question: {options.Question}
```

Where `searchResult` looks like:
```
[Source 1: Title](URL)
Content chunk 1

[Source 2: Title](URL)
Content chunk 2

...
```

**Streaming**:
```csharp
var requestStream = new ChatCompletionRequest(_model, messages);
var assistantResponse = new StringBuilder();
var isFirstChunk = true;

using var streamCts = new CancellationTokenSource();
await foreach (var chunk in _client.StreamAsync(requestStream, streamCts.Token))
{
    if (chunk.TextDelta == null) continue;

    if (isFirstChunk)
    {
        reporter.StopSpinner();
        if (!options.Verbose) reporter.ClearStatus();
        else Console.Write("Assistant: ");
        isFirstChunk = false;
    }

    Console.Write(chunk.TextDelta);
    assistantResponse.Append(chunk.TextDelta);
}
```

**Key Points**:
- `StreamAsync` yields `StreamChunk` objects (text deltas)
- The first chunk stops the spinner and clears the status line
- Each delta is written to the console immediately (real-time feel)
- The entire response is accumulated in `assistantResponse` (though not used elsewhere)
- A `CancellationTokenSource` is passed but never canceled (Ctrl+C cancels from outside)

**Finally Block**:
```csharp
finally
{
    reporter.StopSpinner();
}
```
Ensures the spinner stops even if streaming fails.

**End**:
```csharp
Console.WriteLine(); // Newline after the complete answer
```

## Error Handling

`RunAsync` itself does not catch exceptions. All exceptions propagate to `Program.cs`:

```csharp
try
{
    var openQuery = new OpenQueryApp(client, searchTool, model);
    await openQuery.RunAsync(options);
}
catch (HttpRequestException ex)
{
    Console.Error.WriteLine($"\n[Error] Network request failed. Details: {ex.Message}");
    Environment.Exit(1);
}
catch (Exception ex)
{
    Console.Error.WriteLine($"\n[Error] An unexpected error occurred: {ex.Message}");
    Environment.Exit(1);
}
```

**Common Exceptions**:
- `HttpRequestException` - network failures, API errors
- `JsonException` - malformed JSON from the API
- `TaskCanceledException` - timeout or user interrupt
- `Exception` - anything else

**No Retries at This Level**: Fail fast; the user sees the error immediately. Lower-level retries exist (in the embedding service).
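If retries were ever wanted at this level, a generic wrapper is only a few lines (illustrative sketch; not part of the codebase):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

static class Retry
{
    // Retry transient network failures with simple exponential backoff.
    // Non-network exceptions still fail fast, matching the current policy.
    public static async Task<T> WithRetryAsync<T>(Func<Task<T>> action, int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try { return await action(); }
            catch (HttpRequestException) when (attempt < maxAttempts)
            {
                await Task.Delay(TimeSpan.FromMilliseconds(100 * Math.Pow(2, attempt)));
            }
        }
    }
}
```

Usage would look like `await Retry.WithRetryAsync(() => _client.CompleteAsync(request));` - whether that is desirable is a product decision, since it hides failures from the user.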

## Performance Characteristics

**Query Generation**:
- One non-streaming LLM call
- Takes 2-5 seconds depending on the model
- Typically under 1,000 tokens

**Search Pipeline** (`SearchTool.ExecuteAsync`):
- See [SearchTool](search-tool.md) for a detailed timing breakdown
- Typically 10-30 seconds total

**Final Answer Streaming**:
- Streaming LLM call
- Time depends on answer length (typically 5-20 seconds)
- The user sees words appear progressively

**Total End-to-End**: roughly the sum of the three phases; 15-50 seconds for a typical query

## Design Decisions

### Why Not Stream Query Generation?

Query generation currently uses `CompleteAsync` (non-streaming). It could be streamed, but:
- Queries are short (a JSON array)
- Streaming offers no UX benefit (the user doesn't see intermediate queries)
- It is simpler to wait for all queries before proceeding

### Why Build Prompts Manually Instead of Using Templates?

Simple string concatenation is fine for a handful of prompts. Pros:
- No template dependencies
- Easy to read and modify
- No runtime compilation overhead

Cons:
- No validation
- Could benefit from a prompt-engineering framework

### Why Accumulate the `assistantResponse` StringBuilder?

Currently built but not used. It could be:
- Saved to a file (future feature: `--output file.md`)
- Analyzed for token counting
- Removed if not needed

### Could Query Generation Be Cached?

Yes. For repeated questions (common in scripts), cache the query results:
- A `Dictionary<string, List<string>>` cache in memory
- Or a persistent cache (Redis, a file)
- Not implemented (low priority)
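The in-memory variant is small enough to sketch (class and method names are ours, not the project's):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Sketch of an in-memory query cache keyed by (question, query count).
sealed class QueryCache
{
    private readonly ConcurrentDictionary<(string Question, int Count), List<string>> _cache = new();

    public async Task<List<string>> GetOrGenerateAsync(
        string question, int count, Func<Task<List<string>>> generate)
    {
        if (_cache.TryGetValue((question, count), out var cached))
            return cached; // Cache hit: skip the LLM call entirely.

        var queries = await generate();
        _cache[(question, count)] = queries;
        return queries;
    }
}
```

Keying on the query count matters because `-q 3` and `-q 5` for the same question should not share a cache entry.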

### Single Responsibility Violation?

`OpenQueryApp` does:
- Query generation
- Pipeline orchestration
- Answer streaming

Those are three responsibilities, but they are tightly coupled to the "query → answer" workflow. Separating them would add complexity without clear benefit; the class is acceptable as an "application coordinator".

## Extension Points

### Adding a New Model for Query Generation

Currently the same `_model` is used for queries and the answer. To use different models:

1. Add a `queryGenerationModel` parameter to the constructor
2. Use it for query generation: `new ChatCompletionRequest(queryGenerationModel, queryGenMessages)`
3. Keep `_model` for the final answer

Or make it configurable via an environment variable, e.g. `OPENROUTER_QUERY_MODEL`.
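The resolution logic for such a setup might look like this (sketch; the env-var name `OPENROUTER_QUERY_MODEL` is a suggestion, not an existing variable):

```csharp
using System;

static class ModelConfig
{
    // Resolve the query-generation model: an explicit parameter wins,
    // then the (suggested) environment variable, then the answer model.
    public static string ResolveQueryModel(string answerModel, string? explicitQueryModel = null) =>
        explicitQueryModel
        ?? Environment.GetEnvironmentVariable("OPENROUTER_QUERY_MODEL")
        ?? answerModel;
}
```

The null-coalescing chain keeps today's behavior (one model everywhere) as the default, so the change is backward compatible.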

### Post-Processing the Answer

Opportunities to add:
- Source citation formatting (footnotes, clickable links)
- Answer summarization
- Export to Markdown/JSON
- Text-to-speech

Add these after the streaming loop, before the final newline.
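For example, the already-accumulated `assistantResponse` makes a Markdown export nearly free (hypothetical `--output` option; the helper name is ours):

```csharp
using System.IO;
using System.Text;

static class AnswerExport
{
    // Write the accumulated answer to disk when an output path was given.
    // A null/empty path means the feature was not requested: do nothing.
    public static void SaveIfRequested(StringBuilder assistantResponse, string? outputPath)
    {
        if (string.IsNullOrEmpty(outputPath)) return;
        File.WriteAllText(outputPath, assistantResponse.ToString());
    }
}
```

This is the simplest argument for keeping the `StringBuilder` around rather than removing it.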

### Progress UI Enhancement

The current `StatusReporter` is basic. It could gain:
- A progress bar with percentage
- ETA calculation
- ANSI colors for different message types
- Logging to a file
- A web dashboard

This would require extending `StatusReporter` or replacing it.

## Testing Considerations

**Challenges**:
- `RunAsync` is cohesive (hard to unit test in isolation)
- It depends on many services (mocks needed)
- It is asynchronous and streaming

**Recommended Approach**:
1. Extract interfaces:
   - `ISearchTool` (wrapper around `SearchTool`)
   - `IOpenRouterClient` (wrapper around `OpenRouterClient`)
2. Mock the interfaces in tests
3. Test query-generation parsing separately
4. Test progress-callback counting
5. Test final-answer prompt construction
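A sketch of step 1 plus a hand-rolled stub (the interface shape is inferred from the `ExecuteAsync` call shown in step 3; parameter names are assumptions):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Interface extracted from SearchTool's public surface (shape assumed
// from the ExecuteAsync call in step 3).
public interface ISearchTool
{
    Task<string> ExecuteAsync(string originalQuery, List<string> queries,
        int maxResults, int topChunksLimit, Action<string> onProgress, bool verbose);
}

// Hand-rolled stub for unit tests: records calls and emits fake progress.
public sealed class FakeSearchTool : ISearchTool
{
    public int Calls { get; private set; }

    public Task<string> ExecuteAsync(string originalQuery, List<string> queries,
        int maxResults, int topChunksLimit, Action<string> onProgress, bool verbose)
    {
        Calls++;
        onProgress($"Searched {queries.Count} queries");
        return Task.FromResult("[Source 1: Stub](https://example.com)\nStub content");
    }
}
```

With `OpenQueryApp` taking `ISearchTool` instead of `SearchTool`, a test can assert the pipeline was invoked exactly once and that progress callbacks fired, without any network access.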

**Integration Tests**:
- End-to-end with real or mocked APIs
- Automated tests against test SearxNG/OpenRouter instances

## Related Components

- **[SearchTool](search-tool.md)** - the pipeline executed by `OpenQueryApp`
- **[Program.cs](../Program.md)** - creates `OpenQueryApp`
- **[StatusReporter](../services/StatusReporter.md)** - the progress UI used by `OpenQueryApp`

---

## Next Steps

- [SearchTool](search-tool.md) - see the pipeline in detail
- [Services](../services/overview.md) - understand each service
- [CLI Reference](../../api/cli.md) - how users invoke this