# OpenQueryApp Component

Deep dive into the `OpenQueryApp` class, the main application orchestrator.
## Overview

`OpenQueryApp` is the heart of OpenQuery. It coordinates all components, manages the workflow from question to answer, and handles progress reporting.
## Location

`OpenQuery.cs` in the project root.
## Class Definition

```csharp
public class OpenQueryApp
{
    private readonly OpenRouterClient _client;
    private readonly SearchTool _searchTool;
    private readonly string _model;

    public OpenQueryApp(
        OpenRouterClient client,
        SearchTool searchTool,
        string model);

    public async Task RunAsync(OpenQueryOptions options);
}
```
**Dependencies:**

- `OpenRouterClient` - for query generation and final answer streaming
- `SearchTool` - for the search-retrieve-rank pipeline
- `string _model` - model identifier to use for LLM calls
**Lifecycle:** Instantiated once per query execution in `Program.cs`; `RunAsync()` is then called once.
## `RunAsync` Workflow
```csharp
public async Task RunAsync(OpenQueryOptions options)
{
    // 1. Setup
    using var reporter = new StatusReporter(options.Verbose);
    reporter.StartSpinner();

    // 2. Query generation (if needed)
    List<string> queries = await GenerateQueriesIfNeededAsync(options, reporter);

    // 3. Search pipeline
    string searchResult = await ExecuteSearchPipelineAsync(options, queries, reporter);

    // 4. Final answer streaming
    await StreamFinalAnswerAsync(options, searchResult, reporter);
}
```
### Step 1: Status Reporter Setup

```csharp
using var reporter = new StatusReporter(options.Verbose);
reporter.StartSpinner();
```

- Creates `StatusReporter` (implements `IDisposable`)
- Starts the spinner animation (unless verbose)
- `using` ensures disposal on exit
### Step 2: Query Generation

**When:** `options.Queries > 1` (user wants multiple search queries)

**Purpose:** Use the LLM to generate diverse, optimized search queries from the original question
**System Prompt** (hardcoded in `OpenQuery.cs`):

```text
You are an expert researcher. The user will ask a question. Your task is to
generate optimal search queries to gather comprehensive information.

Instructions:
1. Break down complex questions.
2. Use synonyms and alternative phrasing.
3. Target different aspects (entities, mechanisms, pros/cons, history).

CRITICAL: Output must be a valid JSON array of strings ONLY. No markdown,
explanations, or other text.
```
**Request:**

```csharp
var queryGenMessages = new List<Message>
{
    new Message("system", systemPrompt),
    new Message("user", $"Generate {options.Queries} distinct search queries for:\n{options.Question}")
};
var request = new ChatCompletionRequest(_model, queryGenMessages);
var response = await _client.CompleteAsync(request);
```
**Response Parsing:**

````csharp
var content = response.Choices.FirstOrDefault()?.Message.Content;
if (!string.IsNullOrEmpty(content))
{
    // Remove markdown code fences if present
    content = Regex.Replace(content, @"```json\s*|\s*```", "").Trim();

    // Deserialize to List<string>
    var generatedQueries = JsonSerializer.Deserialize(content, AppJsonContext.Default.ListString);
    if (generatedQueries != null && generatedQueries.Count > 0)
    {
        queries = generatedQueries;
    }
}
````
**Fallback:** If any step fails (exception, null, empty, or invalid JSON), use `new List<string> { options.Question }` (a single query: the original question).
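The parse-then-fall-back logic above can be condensed into a self-contained helper. This sketch uses plain `JsonSerializer.Deserialize<List<string>>` rather than the app's source-generated `AppJsonContext`, and the helper name is illustrative, not from the codebase:

````csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.RegularExpressions;

static class QueryParsing
{
    // Parse an LLM response expected to be a JSON array of strings,
    // stripping any ```json fences; fall back to the original question.
    public static List<string> ParseQueries(string? content, string fallbackQuestion)
    {
        if (!string.IsNullOrEmpty(content))
        {
            var cleaned = Regex.Replace(content, @"```json\s*|\s*```", "").Trim();
            try
            {
                var queries = JsonSerializer.Deserialize<List<string>>(cleaned);
                if (queries is { Count: > 0 })
                    return queries;
            }
            catch (JsonException)
            {
                // Malformed JSON: fall through to the fallback below.
            }
        }
        // Any failure (null, empty, invalid JSON) yields the single original query.
        return new List<string> { fallbackQuestion };
    }
}
````

Centralizing the fallback this way also makes the parsing path easy to unit test without any LLM call.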
**Note:** Query generation reuses the same model as the final answer. This could be optimized:

- Use a cheaper/faster model for query generation
- Separate model configuration
- Cache query generation results
### Step 3: Search Pipeline Execution

```csharp
var searchResult = await _searchTool.ExecuteAsync(
    options.Question,
    queries,
    options.Results,
    options.Chunks,
    (progress) =>
    {
        if (options.Verbose)
            reporter.WriteLine(progress);
        else
            reporter.UpdateStatus(parsedMessage); // parsedMessage: concise status derived from progress
    },
    options.Verbose);
```
**Parameters:**

- `originalQuery`: user's original question (used for the final embedding)
- `generatedQueries`: from step 2 (or the fallback)
- `maxResults`: `options.Results` (search results per query)
- `topChunksLimit`: `options.Chunks` (top N chunks to return)
- `onProgress`: callback to update the UI
- `verbose`: passed through to `SearchTool`
**Returns:** `string context` - formatted context with source citations
**Progress Handling:**

- In verbose mode: all progress is printed as lines (via `reporter.WriteLine()`)
- In compact mode: progress messages are parsed into a concise status (e.g., "Fetching articles 3/10...")
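The compact-mode parsing could look like the sketch below. The verbose message format ("Fetching article 3 of 10: ...") is an assumption here, not taken from `SearchTool`:

```csharp
using System.Text.RegularExpressions;

static class ProgressParsing
{
    // Turn an assumed verbose line like "Fetching article 3 of 10: <url>"
    // into a short status like "Fetching articles 3/10...".
    public static string ToCompactStatus(string progress)
    {
        var m = Regex.Match(progress, @"^(\w+) article (\d+) of (\d+)");
        return m.Success
            ? $"{m.Groups[1].Value} articles {m.Groups[2].Value}/{m.Groups[3].Value}..."
            : progress; // unrecognized messages pass through unchanged
    }
}
```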
### Step 4: Final Answer Streaming

**Status Update:**

```csharp
if (!options.Verbose)
    reporter.UpdateStatus("Asking AI...");
else
{
    reporter.ClearStatus();
    Console.WriteLine();
}
```
**Build System Prompt:**

```csharp
var systemPrompt = "You are a helpful AI assistant. Answer the user's question in depth, based on the provided context. Be precise and accurate. You can mention sources or citations.";
if (options.Short) systemPrompt += " Give a very short concise answer.";
if (options.Long) systemPrompt += " Give a long elaborate detailed answer.";
```
**Prompt Structure:**

```text
System: {systemPrompt}

User: Context:
{searchResult}

Question: {options.Question}
```
Where `searchResult` looks like:

```text
[Source 1: Title](URL)
Content chunk 1

[Source 2: Title](URL)
Content chunk 2
...
```
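Assembling the user message from these pieces is a single string interpolation; this sketch follows the structure above, though the exact whitespace between sections is an assumption rather than copied from the codebase:

```csharp
static class PromptBuilder
{
    // Combine the pipeline's formatted context and the original question
    // into the single user message sent alongside the system prompt.
    public static string BuildUserPrompt(string searchResult, string question) =>
        $"Context:\n{searchResult}\n\nQuestion: {question}";
}
```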
**Streaming:**

```csharp
var requestStream = new ChatCompletionRequest(_model, messages);
var assistantResponse = new StringBuilder();
var isFirstChunk = true;
using var streamCts = new CancellationTokenSource();

await foreach (var chunk in _client.StreamAsync(requestStream, streamCts.Token))
{
    if (chunk.TextDelta == null) continue;

    if (isFirstChunk)
    {
        reporter.StopSpinner();
        if (!options.Verbose) reporter.ClearStatus();
        else Console.Write("Assistant: ");
        isFirstChunk = false;
    }

    Console.Write(chunk.TextDelta);
    assistantResponse.Append(chunk.TextDelta);
}
```
**Key Points:**

- `StreamAsync` yields `StreamChunk` objects (text deltas)
- The first chunk stops the spinner and clears the status line
- Each delta is written to the console immediately (real-time feel)
- The entire response is accumulated in `assistantResponse` (though not used elsewhere)
- A `CancellationTokenSource` is passed but never canceled (Ctrl+C would cancel from outside)
**Finally Block:**

```csharp
finally
{
    reporter.StopSpinner();
}
```

Ensures the spinner stops even if streaming fails.
**End:**

```csharp
Console.WriteLine(); // Newline after the complete answer
```
## Error Handling

`RunAsync` itself does not catch exceptions. All exceptions propagate to `Program.cs`:
```csharp
try
{
    var openQuery = new OpenQueryApp(client, searchTool, model);
    await openQuery.RunAsync(options);
}
catch (HttpRequestException ex)
{
    Console.Error.WriteLine($"\n[Error] Network request failed. Details: {ex.Message}");
    Environment.Exit(1);
}
catch (Exception ex)
{
    Console.Error.WriteLine($"\n[Error] An unexpected error occurred: {ex.Message}");
    Environment.Exit(1);
}
```
**Common Exceptions:**

- `HttpRequestException` - network failures, API errors
- `JsonException` - malformed JSON from the API
- `TaskCanceledException` - timeout or user interruption
- `Exception` - anything else

**No Retries at This Level:** Fail fast; the user sees the error immediately. Lower-level retries exist (embedding service).
## Performance Characteristics

**Query Generation:**

- One non-streaming LLM call
- Takes 2-5 seconds depending on the model
- Typically <1000 tokens

**Search Pipeline** (`SearchTool.ExecuteAsync`):

- See `SearchTool.md` for a detailed timing breakdown
- Typically 10-30 seconds total

**Final Answer Streaming:**

- Streaming LLM call
- Time depends on answer length (typically 5-20 seconds)
- User sees words appear progressively

**Total End-to-End:** 15-50 seconds for a typical query
## Design Decisions

### Why Not Stream Query Generation?

Query generation currently uses `CompleteAsync` (non-streaming). It could be streamed, but:

- Queries are short (a JSON array)
- Streaming offers no UX benefit (the user doesn't see intermediate queries)
- It's simpler to wait for all queries before proceeding
### Why Build Prompts Manually Instead of with Templates?

Simple string concatenation is fine for a handful of prompts.

Pros:

- No template dependencies
- Easy to read and modify
- No runtime compilation overhead

Cons:

- No validation
- Could benefit from a prompt engineering framework
### Why Accumulate `assistantResponse` in a StringBuilder?

It is currently built but not used. It could be:

- Saved to a file (future feature: `--output file.md`)
- Analyzed for token counting
- Removed if not needed
### Could Query Generation Be Cached?

Yes! For repeated questions (common in scripts), cache the generated queries:

- `Dictionary<string, List<string>>` cache in memory
- Or a persistent cache (Redis, file)
- Not implemented (low priority)
### Single Responsibility Violation?

`OpenQueryApp` does:

- Query generation
- Pipeline orchestration
- Answer streaming

That is three responsibilities, but they are tightly coupled to the "query → answer" workflow. Separating them would add complexity without clear benefit; it is acceptable as an "application coordinator".
## Extension Points

### Adding a Separate Model for Query Generation

Currently the same `_model` is used for queries and the answer. To use different models:

1. Add a `queryGenerationModel` parameter to the constructor
2. Use it for query generation: `new ChatCompletionRequest(queryGenerationModel, queryGenMessages)`
3. Keep `_model` for the final answer

Or make it configurable via an environment variable: `OPENROUTER_QUERY_MODEL`
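The environment-variable approach could be as small as this sketch (`OPENROUTER_QUERY_MODEL` is the proposed variable from above, not one the app currently reads):

```csharp
using System;

static class ModelConfig
{
    // Use OPENROUTER_QUERY_MODEL for query generation when set,
    // otherwise fall back to the main model.
    public static string ResolveQueryModel(string defaultModel) =>
        Environment.GetEnvironmentVariable("OPENROUTER_QUERY_MODEL")
            is { Length: > 0 } configured ? configured : defaultModel;
}
```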
### Post-Processing the Answer

Opportunities to add:

- Source citation formatting (footnotes, clickable links)
- Answer summarization
- Export to Markdown/JSON
- Text-to-speech

Add after the streaming loop, before the final newline.
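For example, the Markdown export could reuse the `assistantResponse` StringBuilder that is already accumulated during streaming. This is a hypothetical sketch; the `--output` flag and helper name are not implemented:

```csharp
using System.IO;
using System.Text;

static class AnswerExport
{
    // Write the question and the accumulated answer as a Markdown file.
    public static void SaveAnswer(string path, string question, StringBuilder assistantResponse)
    {
        var markdown = $"# {question}\n\n{assistantResponse}\n";
        File.WriteAllText(path, markdown);
    }
}
```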
### Progress UI Enhancement

The current `StatusReporter` is basic. It could add:

- A progress bar with percentage
- ETA calculation
- ANSI colors for different message types
- Logging to file
- A web dashboard

This would require extending `StatusReporter` or replacing it.
## Testing Considerations

**Challenges:**

- `RunAsync` is cohesive (hard to unit test in isolation)
- Depends on many services (needs mocks)
- Asynchronous and streaming

**Recommended Approach:**

1. Extract interfaces:
   - `ISearchTool` (wrapper around `SearchTool`)
   - `IOpenRouterClient` (wrapper around `OpenRouterClient`)
2. Mock the interfaces in tests
3. Test query-generation parsing separately
4. Test progress callback counting
5. Test final-answer prompt construction
**Integration Tests:**

- End-to-end with real or mocked APIs
- Automated tests against test SearxNG/OpenRouter instances
## Related Components

- `SearchTool` - pipeline executed by `OpenQueryApp`
- `Program.cs` - creates `OpenQueryApp`
- `StatusReporter` - progress UI used by `OpenQueryApp`
## Next Steps

- SearchTool - see the pipeline in detail
- Services - understand each service
- CLI Reference - how users invoke this