# OpenQueryApp Component

Deep dive into the `OpenQueryApp` class - the main application orchestrator.

## Overview

`OpenQueryApp` is the heart of OpenQuery. It coordinates all components, manages the workflow from question to answer, and handles progress reporting.

## Location

`OpenQuery.cs` in the project root.
## Class Definition

```csharp
public class OpenQueryApp
{
    private readonly OpenRouterClient _client;
    private readonly SearchTool _searchTool;
    private readonly string _model;

    public OpenQueryApp(
        OpenRouterClient client,
        SearchTool searchTool,
        string model);

    public async Task RunAsync(OpenQueryOptions options);
}
```

**Dependencies**:

- `OpenRouterClient` - for query generation and final answer streaming
- `SearchTool` - for the search-retrieve-rank pipeline
- `string _model` - model identifier to use for LLM calls

**Lifecycle**: Instantiated once per query execution in `Program.cs`; `RunAsync()` is then called once.
## RunAsync Workflow

```csharp
public async Task RunAsync(OpenQueryOptions options)
{
    // 1. Setup
    using var reporter = new StatusReporter(options.Verbose);
    reporter.StartSpinner();

    // 2. Query generation (if needed)
    List<string> queries = await GenerateQueriesIfNeededAsync(options, reporter);

    // 3. Search pipeline
    string searchResult = await ExecuteSearchPipelineAsync(options, queries, reporter);

    // 4. Final answer streaming
    await StreamFinalAnswerAsync(options, searchResult, reporter);
}
```
### Step 1: Status Reporter Setup

```csharp
using var reporter = new StatusReporter(options.Verbose);
reporter.StartSpinner();
```

- Creates a `StatusReporter` (implements `IDisposable`)
- Starts the spinner animation (unless verbose)
- `using` ensures disposal on exit
### Step 2: Query Generation

**When**: `options.Queries > 1` (the user wants multiple search queries)

**Purpose**: Use the LLM to generate diverse, optimized search queries from the original question.

**System Prompt** (hardcoded in `OpenQuery.cs`):

```
You are an expert researcher. The user will ask a question. Your task is to
generate optimal search queries to gather comprehensive information.

Instructions:
1. Break down complex questions.
2. Use synonyms and alternative phrasing.
3. Target different aspects (entities, mechanisms, pros/cons, history).

CRITICAL: Output must be a valid JSON array of strings ONLY. No markdown,
explanations, or other text.
```
**Request**:

```csharp
var queryGenMessages = new List<Message>
{
    new Message("system", systemPrompt),
    new Message("user", $"Generate {options.Queries} distinct search queries for:\n{options.Question}")
};
var request = new ChatCompletionRequest(_model, queryGenMessages);
var response = await _client.CompleteAsync(request);
```
**Response Parsing**:

```csharp
var content = response.Choices.FirstOrDefault()?.Message.Content;
if (!string.IsNullOrEmpty(content))
{
    // Remove markdown code fences if present
    content = Regex.Replace(content, @"```json\s*|\s*```", "").Trim();

    // Deserialize to List<string>
    var generatedQueries = JsonSerializer.Deserialize(content, AppJsonContext.Default.ListString);
    if (generatedQueries != null && generatedQueries.Count > 0)
    {
        queries = generatedQueries;
    }
}
```
**Fallback**: If any step fails (exception, null, empty, or invalid JSON), use `new List<string> { options.Question }` (a single query: the original question).

**Note**: Query generation reuses the same model as the final answer. This could be optimized:

- Use a cheaper/faster model for query generation
- Separate model configuration
- Cache query generation results
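The parsing and fallback described above can be folded into one small helper, which also makes the behavior easy to unit test. This is a sketch: the helper name is illustrative, and it uses reflection-based `JsonSerializer.Deserialize<T>` for brevity instead of the app's source-generated `AppJsonContext`.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.RegularExpressions;

public static class QueryParsing
{
    // Hypothetical helper mirroring the parsing described above:
    // strip code fences, deserialize, and fall back to the original question.
    public static List<string> ParseQueriesOrFallback(string? content, string originalQuestion)
    {
        if (!string.IsNullOrEmpty(content))
        {
            try
            {
                var cleaned = Regex.Replace(content, @"```json\s*|\s*```", "").Trim();
                var queries = JsonSerializer.Deserialize<List<string>>(cleaned);
                if (queries is { Count: > 0 })
                    return queries;
            }
            catch (JsonException)
            {
                // Malformed JSON: fall through to the fallback below.
            }
        }
        return new List<string> { originalQuestion };
    }
}
```

With this shape, the fallback path can be exercised directly in a test rather than by forcing an LLM failure.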
### Step 3: Search Pipeline Execution

```csharp
var searchResult = await _searchTool.ExecuteAsync(
    options.Question,
    queries,
    options.Results,
    options.Chunks,
    (progress) =>
    {
        if (options.Verbose)
            reporter.WriteLine(progress);
        else
            reporter.UpdateStatus(progress); // compact mode: message may be condensed first
    },
    options.Verbose);
```
**Parameters**:

- `originalQuery`: the user's original question (used for the final embedding)
- `generatedQueries`: from step 2 (or the fallback)
- `maxResults`: `options.Results` (search results per query)
- `topChunksLimit`: `options.Chunks` (top N chunks to return)
- `onProgress`: callback to update the UI
- `verbose`: passed through to `SearchTool`

**Returns**: `string context` - formatted context with source citations

**Progress Handling**:

- In verbose mode: all progress is printed line by line (via `reporter.WriteLine()`)
- In compact mode: progress messages are parsed into a concise status (e.g., "Fetching articles 3/10...")
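The compact-mode parsing mentioned above is not shown in the snippet. One plausible shape, assuming the pipeline emits messages like "Fetched article 3 of 10: <url>" (the message format here is an assumption, not taken from the codebase):

```csharp
using System;
using System.Text.RegularExpressions;

public static class ProgressParsing
{
    // Hypothetical: condense a verbose pipeline message into a short status line,
    // passing unrecognized messages through unchanged.
    public static string ToCompactStatus(string progress)
    {
        var m = Regex.Match(progress, @"article (\d+) of (\d+)");
        return m.Success
            ? $"Fetching articles {m.Groups[1].Value}/{m.Groups[2].Value}..."
            : progress;
    }
}
```

Keeping the pass-through branch means new pipeline messages still surface in the UI even before a compact form is defined for them.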
### Step 4: Final Answer Streaming

**Status Update**:

```csharp
if (!options.Verbose)
    reporter.UpdateStatus("Asking AI...");
else
{
    reporter.ClearStatus();
    Console.WriteLine();
}
```
**Build System Prompt**:

```csharp
var systemPrompt = "You are a helpful AI assistant. Answer the user's question in depth, based on the provided context. Be precise and accurate. You can mention sources or citations.";
if (options.Short) systemPrompt += " Give a very short concise answer.";
if (options.Long) systemPrompt += " Give a long elaborate detailed answer.";
```
**Prompt Structure**:

```
System: {systemPrompt}
User: Context:
{searchResult}

Question: {options.Question}
```

Where `searchResult` is:

```
[Source 1: Title](URL)
Content chunk 1

[Source 2: Title](URL)
Content chunk 2

...
```
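Assembling the messages for that structure amounts to simple interpolation. A testable sketch (the `Message` record below is a stand-in for the project's `Message` type, and the exact whitespace is an assumption consistent with the structure shown):

```csharp
using System.Collections.Generic;

// Stand-in for the project's Message type (role + content).
public record Message(string Role, string Content);

public static class PromptBuilder
{
    // Builds the two-message prompt per the structure shown above.
    public static List<Message> Build(string systemPrompt, string searchResult, string question)
        => new()
        {
            new Message("system", systemPrompt),
            new Message("user", $"Context:\n{searchResult}\n\nQuestion: {question}")
        };
}
```

Isolating prompt construction like this is also what makes test item 5 under Testing Considerations straightforward.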
**Streaming**:

```csharp
var requestStream = new ChatCompletionRequest(_model, messages);
var assistantResponse = new StringBuilder();
var isFirstChunk = true;

using var streamCts = new CancellationTokenSource();
await foreach (var chunk in _client.StreamAsync(requestStream, streamCts.Token))
{
    if (chunk.TextDelta == null) continue;

    if (isFirstChunk)
    {
        reporter.StopSpinner();
        if (!options.Verbose) reporter.ClearStatus();
        else Console.Write("Assistant: ");
        isFirstChunk = false;
    }

    Console.Write(chunk.TextDelta);
    assistantResponse.Append(chunk.TextDelta);
}
```
**Key Points**:

- `StreamAsync` yields `StreamChunk` objects (text deltas)
- The first chunk stops the spinner and clears the status line
- Each delta is written to the console immediately (real-time feel)
- The entire response is accumulated in `assistantResponse` (though not used elsewhere)
- A `CancellationTokenSource` is passed but never canceled here (Ctrl+C cancels from outside)

**Finally Block**:

```csharp
finally
{
    reporter.StopSpinner();
}
```

Ensures the spinner stops even if streaming fails.

**End**:

```csharp
Console.WriteLine(); // Newline after the complete answer
```
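If in-process cancellation were ever wanted, `streamCts` could be wired to Ctrl+C so the streaming loop ends cleanly instead of the process being killed. This is a sketch of an alternative, not what the current code does:

```csharp
using System;
using System.Threading;

// Sketch: route Ctrl+C into the streaming token instead of terminating the process.
var streamCts = new CancellationTokenSource();
Console.CancelKeyPress += (_, e) =>
{
    e.Cancel = true;       // keep the process alive
    streamCts.Cancel();    // the await-foreach over StreamAsync then throws OperationCanceledException
};
```

The trade-off is that the caller must then catch `OperationCanceledException` around the streaming loop to exit gracefully.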
## Error Handling

`RunAsync` itself does not catch exceptions. All exceptions propagate to `Program.cs`:

```csharp
try
{
    var openQuery = new OpenQueryApp(client, searchTool, model);
    await openQuery.RunAsync(options);
}
catch (HttpRequestException ex)
{
    Console.Error.WriteLine($"\n[Error] Network request failed. Details: {ex.Message}");
    Environment.Exit(1);
}
catch (Exception ex)
{
    Console.Error.WriteLine($"\n[Error] An unexpected error occurred: {ex.Message}");
    Environment.Exit(1);
}
```
**Common Exceptions**:

- `HttpRequestException` - network failures, API errors
- `JsonException` - malformed JSON from the API
- `TaskCanceledException` - timeout or user interrupt
- `Exception` - anything else

**No Retries at This Level**: Fail fast; the user sees the error immediately. Lower-level retries exist (embedding service).
## Performance Characteristics

**Query Generation**:

- One non-streaming LLM call
- Takes 2-5 seconds depending on the model
- Typically <1000 tokens

**Search Pipeline** (`SearchTool.ExecuteAsync`):

- See `SearchTool.md` for a detailed timing breakdown
- Typically 10-30 seconds total

**Final Answer Streaming**:

- Streaming LLM call
- Time depends on answer length (typically 5-20 seconds)
- The user sees words appear progressively

**Total End-to-End**: 15-50 seconds for a typical query
## Design Decisions

### Why Not Stream Query Generation?

Query generation currently uses `CompleteAsync` (non-streaming). It could be streamed, but:

- Queries are short (a JSON array)
- Streaming offers no UX benefit (the user doesn't see intermediate queries)
- It is simpler to wait for all queries before proceeding

### Why Build Prompt Manually Instead of Templates?

Simple string concatenation is fine for a handful of prompts. Pros:

- No template dependencies
- Easy to read and modify
- No runtime compilation overhead

Cons:

- No validation
- Could benefit from a prompt engineering framework
### Why Accumulate `assistantResponse` StringBuilder?

It is currently built but not used. It could be:

- Saved to a file (future feature: `--output file.md`)
- Analyzed for token counting
- Removed if not needed

### Could Query Generation Be Cached?

Yes. For repeated questions (common in scripts), cache query results:

- A `Dictionary<string, List<string>>` cache in memory
- Or a persistent cache (Redis, file)
- Not implemented (low priority)
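A minimal in-memory version of that cache could look like the following (names are illustrative; nothing below exists in the codebase). `ConcurrentDictionary` is used so the cache stays safe if queries ever run in parallel:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical per-process cache keyed by the original question.
public class QueryCache
{
    private readonly ConcurrentDictionary<string, List<string>> _cache = new();

    // Returns cached queries, or invokes the generator once and stores the result.
    public List<string> GetOrAdd(string question, Func<string, List<string>> generate)
        => _cache.GetOrAdd(question, generate);
}
```

For example, `cache.GetOrAdd(options.Question, q => GenerateQueries(q))` would skip the LLM call on repeats within the same process; a persistent cache would need invalidation policy on top of this.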
### Single Responsibility Violation?

`OpenQueryApp` does:

- Query generation
- Pipeline orchestration
- Answer streaming

That's three responsibilities, but they are tightly coupled to the "query → answer" workflow. Separating them would add complexity without clear benefit. Acceptable as an "application coordinator".
## Extension Points

### Adding New Model for Query Generation

Currently the same `_model` is used for queries and the answer. To use different models:

1. Add a `queryGenerationModel` parameter to the constructor
2. Use it for query generation: `new ChatCompletionRequest(queryGenerationModel, queryGenMessages)`
3. Keep `_model` for the final answer

Or make it configurable via an environment variable: `OPENROUTER_QUERY_MODEL`
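The environment-variable route could be resolved in `Program.cs` roughly as follows. The variable name comes from the suggestion above; the fall-back-to-main-model behavior and the helper name are assumptions:

```csharp
using System;

public static class ModelConfig
{
    // Hypothetical helper: prefer OPENROUTER_QUERY_MODEL when set,
    // otherwise reuse the main answer model.
    public static string ResolveQueryModel(string answerModel)
    {
        var fromEnv = Environment.GetEnvironmentVariable("OPENROUTER_QUERY_MODEL");
        return string.IsNullOrWhiteSpace(fromEnv) ? answerModel : fromEnv;
    }
}
```

Falling back to the answer model keeps existing setups working with no new required configuration.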
### Post-Processing Answer

Opportunities to add:

- Source citation formatting (footnotes, clickable links)
- Answer summarization
- Export to Markdown/JSON
- Text-to-speech

Add these after the streaming loop, before the final newline.
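The already-accumulated `assistantResponse` would feed such a step directly. A sketch of a Markdown export (the `--output` flag mentioned earlier is a future feature; the helper names here are hypothetical):

```csharp
using System.IO;
using System.Text;

public static class AnswerExport
{
    // Hypothetical post-processing step: wrap the streamed answer
    // in a small Markdown document.
    public static string ToMarkdown(string question, string answer)
    {
        var sb = new StringBuilder();
        sb.AppendLine($"# {question}");
        sb.AppendLine();
        sb.AppendLine(answer);
        return sb.ToString();
    }

    // Write the document to disk, e.g. for a future --output file.md flag.
    public static void SaveAnswer(string path, string question, string answer)
        => File.WriteAllText(path, ToMarkdown(question, answer));
}
```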
### Progress UI Enhancement

The current `StatusReporter` is basic. It could add:

- A progress bar with percentage
- ETA calculation
- Colors (ANSI) for different message types
- Logging to a file
- A web dashboard

This would require extending `StatusReporter` or replacing it.
## Testing Considerations

**Challenges**:

- `RunAsync` is cohesive (hard to unit test in isolation)
- Depends on several services (needs mocks)
- Asynchronous and streaming

**Recommended Approach**:

1. Extract interfaces:
   - `ISearchTool` (wrapper around `SearchTool`)
   - `IOpenRouterClient` (wrapper around `OpenRouterClient`)
2. Mock the interfaces in tests
3. Test query generation parsing separately
4. Test progress callback counting
5. Test final answer prompt construction

**Integration Tests**:

- End-to-end with real or mocked APIs
- Automated tests against test SearxNG/OpenRouter instances
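The `ISearchTool` extraction in step 1 might look like this, with the interface members inferred from the `ExecuteAsync` call shown earlier on this page (treat the exact signatures as assumptions), plus a fake for tests:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical interface mirroring the SearchTool.ExecuteAsync call shown above.
public interface ISearchTool
{
    Task<string> ExecuteAsync(
        string originalQuery,
        List<string> generatedQueries,
        int maxResults,
        int topChunksLimit,
        Action<string> onProgress,
        bool verbose);
}

// Test double: emits one progress message and returns canned context.
public class FakeSearchTool : ISearchTool
{
    public List<string> ProgressMessages { get; } = new();

    public Task<string> ExecuteAsync(
        string originalQuery, List<string> generatedQueries,
        int maxResults, int topChunksLimit,
        Action<string> onProgress, bool verbose)
    {
        const string msg = "Searching...";
        ProgressMessages.Add(msg);
        onProgress(msg);
        return Task.FromResult("[Source 1: Title](URL)\nContent chunk 1");
    }
}
```

With `OpenQueryApp` taking `ISearchTool`, tests can assert on progress callback counts (item 4) without any network access.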
## Related Components

- **[SearchTool](search-tool.md)** - the pipeline executed by `OpenQueryApp`
- **[Program.cs](../Program.md)** - creates `OpenQueryApp`
- **[StatusReporter](../services/StatusReporter.md)** - the progress UI used by `OpenQueryApp`

---

## Next Steps

- [SearchTool](search-tool.md) - see the pipeline in detail
- [Services](../services/overview.md) - understand each service
- [CLI Reference](../../api/cli.md) - how users invoke this