# OpenQuery Documentation

Welcome to the comprehensive documentation for OpenQuery - the AI-powered search and answer system.

## 📚 Documentation Overview

### Getting Started
- **[Installation Guide](installation.md)** - Build, install, and setup instructions
- **[Configuration](configuration.md)** - Configure API keys, models, and settings
- **[Usage Guide](usage.md)** - Complete CLI reference with examples

### Deep Dive
- **[Architecture](architecture.md)** - System design, patterns, and data flow
- **[Components](components/overview.md)** - Detailed component documentation
  - [OpenQueryApp](components/openquery-app.md)
  - [SearchTool](components/search-tool.md)
  - [Services](components/services.md)
  - [Models](components/models.md)
- **[API Reference](api/cli.md)** - Complete command-line interface reference
  - [Environment Variables](api/environment-variables.md)
  - [Programmatic APIs](api/programmatic.md)

### Support
- **[Troubleshooting](troubleshooting.md)** - Common issues and solutions
- **[Performance](performance.md)** - Performance characteristics and optimization

## 🎯 Quick Links

### For Users
- [Install OpenQuery](installation.md) in 5 minutes
- [Configure your API key](configuration.md)
- [Learn the basics](usage.md)
- [Solve common problems](troubleshooting.md)

### For Developers
- [Understand the architecture](architecture.md)
- [Explore components](components/overview.md)
- [Use the APIs programmatically](api/programmatic.md)
- [Performance tuning](performance.md)

## 📋 Table of Contents

1. [Project Overview](#project-overview)
2. [Key Concepts](#key-concepts)
3. [Technology Stack](#technology-stack)
4. [System Workflow](#system-workflow)

## Project Overview

**OpenQuery** is a sophisticated CLI tool that combines the power of large language models with web search to provide accurate, well-sourced answers to complex questions.

### What It Does
- Takes a natural language question as input
- Generates multiple diverse search queries
- Searches the web via SearxNG
- Extracts and processes article content
- Uses semantic similarity to rank relevance
- Synthesizes a comprehensive AI-generated answer with citations

### Why Use OpenQuery?
- **Accuracy**: Multiple search queries reduce bias and increase coverage
- **Transparency**: Sources are cited in the final answer
- **Speed**: Parallel processing minimizes latency
- **Control**: Fine-tune every aspect, from query count to chunk selection
- **Privacy**: SearxNG provides anonymous, aggregated search

## Key Concepts

### Search Queries
Instead of using your exact question, OpenQuery generates multiple optimized search queries (default: 3). For example, "What is quantum entanglement?" might become:
- "quantum entanglement definition"
- "how quantum entanglement works"
- "quantum entanglement experiments"

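As a rough illustration of this step, query generation amounts to asking the LLM for a fixed number of queries and parsing its reply. The prompt wording and function names below are hypothetical sketches, not OpenQuery's actual implementation:

```python
import json

# Hypothetical sketch: ask the LLM for N search queries as a JSON array,
# then parse its reply. Names and prompt text are illustrative only.
def build_query_prompt(question: str, count: int = 3) -> str:
    return (
        f"Generate {count} diverse web search queries for the question below. "
        f"Reply with a JSON array of strings only.\n\nQuestion: {question}"
    )

def parse_queries(llm_reply: str, count: int = 3) -> list[str]:
    queries = json.loads(llm_reply)
    return [q.strip() for q in queries][:count]

# Example with a canned LLM reply (no API call):
reply = ('["quantum entanglement definition", '
         '"how quantum entanglement works", '
         '"quantum entanglement experiments"]')
print(parse_queries(reply))
```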
### Content Chunks
Long articles are split into ~500-character chunks. Each chunk is:
- Stored with its source URL and title
- Converted to a vector embedding (1536 dimensions)
- Scored against your query embedding

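A minimal sketch of the chunking step (illustrative Python, not OpenQuery's .NET code; the real splitter may respect word or sentence boundaries rather than cutting at exact character offsets):

```python
from dataclasses import dataclass

# Each chunk keeps its text alongside the article it came from.
@dataclass
class Chunk:
    text: str
    source_url: str
    title: str

def chunk_article(text: str, url: str, title: str, size: int = 500) -> list[Chunk]:
    # Naive fixed-size split at `size`-character boundaries.
    return [Chunk(text[i:i + size], url, title) for i in range(0, len(text), size)]

chunks = chunk_article("a" * 1200, "https://example.com", "Example")
print(len(chunks))  # → 3 (two 500-char chunks plus a 200-char remainder)
```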
### Semantic Ranking
Using cosine similarity between embeddings, OpenQuery ranks chunks by relevance and selects the top N (default: 3) for the final context.

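The ranking step can be sketched as follows. This is illustrative Python with toy 3-dimensional vectors standing in for the 1536-dimensional embeddings; OpenQuery itself uses System.Numerics.Tensors for this math:

```python
import math

# Cosine similarity: dot product divided by the product of magnitudes.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Score every chunk vector against the query vector, sort descending,
# and return the indices of the top-n chunks.
def top_n(query_vec: list[float], chunk_vecs: list[list[float]], n: int = 3) -> list[int]:
    scored = sorted(enumerate(chunk_vecs),
                    key=lambda iv: cosine(query_vec, iv[1]),
                    reverse=True)
    return [i for i, _ in scored[:n]]

print(top_n([1, 0, 0],
            [[0, 1, 0], [1, 0.1, 0], [0.9, 0, 0.1], [0, 0, 1]],
            n=2))  # → [1, 2]: the two vectors closest in direction to the query
```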
### Streaming Answer
The LLM receives your question plus the top chunks as context and streams the answer in real time, citing sources like `[Source 1]`.

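Assembling the context with numbered source labels might look like the following sketch. The exact prompt format OpenQuery uses is not documented on this page, so the field names and layout here are assumptions:

```python
# Hypothetical sketch: join the top-ranked chunks into a single context
# string, tagging each with the [Source N] label the answer can cite.
def build_context(chunks: list[dict]) -> str:
    return "\n\n".join(
        f"[Source {i}] ({c['url']})\n{c['text']}"
        for i, c in enumerate(chunks, start=1)
    )

ctx = build_context([
    {"url": "https://a.example", "text": "Entanglement links particle states."},
    {"url": "https://b.example", "text": "Measured correlations exceed classical limits."},
])
print(ctx.splitlines()[0])  # → [Source 1] (https://a.example)
```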
## Technology Stack

| Layer | Technology | Purpose |
|-------|------------|---------|
| Runtime | .NET 10.0 AOT | Native performance, minimal footprint |
| LLM | OpenRouter API | Chat completions and embeddings |
| Search | SearxNG | Metasearch engine |
| Content Extraction | SmartReader | Article text extraction |
| Vector Math | System.Numerics.Tensors | High-performance cosine similarity |
| Resilience | Polly | Retry and circuit breaker policies |
| CLI | System.CommandLine | Command parsing and help |
| JSON | System.Text.Json (source-gen) | Fast serialization |

## System Workflow

```
OpenQuery Workflow

1. User Query: "What is quantum entanglement?"

2. Query Generation (optional)
   LLM generates: ["quantum entanglement physics",
                   "quantum entanglement definition",
                   "how does quantum entanglement work"]

3. Parallel Searches
   Query 1 ──→ SearxNG ──→ Results ┐
   Query 2 ──→ SearxNG ──→ Results ├──→ combined results
   Query 3 ──→ SearxNG ──→ Results ┘

4. Parallel Article Fetching  (concurrent, max 10 at a time)
   URL 1 ──→ Article ──→ Chunks
   URL 2 ──→ Article ──→ Chunks
   ...

5. Parallel Embeddings  (batches of 300, up to 4 concurrent)
   Chunk Batch 1 ──→ Embedding API ──→ Vectors
   Chunk Batch 2 ──→ Embedding API ──→ Vectors

6. Semantic Ranking
   Query Embedding + Chunk Embeddings → Cosine Similarity →
   Score → Sort Descending → Top 3 Chunks

7. Final Answer Generation
   ┌────────────────────────────────────────────┐
   │ System: "Answer based on this context:"    │
   │ Context: [Top 3 chunks with sources]       │
   │ Question: "What is quantum entanglement?"  │
   └────────────────────────────────────────────┘
                        ↓
               LLM streams the answer:
               "Quantum entanglement is..."
               with citations like [Source 1]
```
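The bounded parallelism in steps 4 and 5 can be sketched with a semaphore that caps in-flight work. This is an illustrative Python sketch, not OpenQuery's .NET code; `fetch_article` is a stand-in for the real fetcher:

```python
import asyncio

# Stand-in fetcher: simulates network latency, returns a fake body.
async def fetch_article(url: str) -> str:
    await asyncio.sleep(0.01)
    return f"article body for {url}"

# Run all fetches concurrently, but allow at most `limit` in flight
# at once (the diagram's "max 10 at a time").
async def fetch_all(urls: list[str], limit: int = 10) -> list[str]:
    sem = asyncio.Semaphore(limit)

    async def bounded(url: str) -> str:
        async with sem:
            return await fetch_article(url)

    return await asyncio.gather(*(bounded(u) for u in urls))

bodies = asyncio.run(fetch_all([f"https://example.com/{i}" for i in range(25)]))
print(len(bodies))  # → 25, results returned in input order
```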

## Next Steps

1. **[Install OpenQuery](installation.md)**
2. **[Configure it](configuration.md)**
3. **[Start asking questions](usage.md)**

For detailed technical information, continue to [the architecture guide](architecture.md).

---

**Need help?** Check the [Troubleshooting](troubleshooting.md) guide.