# OpenQuery Documentation

Welcome to the comprehensive documentation for OpenQuery, the AI-powered search and answer system.
## 📚 Documentation Overview

### Getting Started

- [Installation Guide](installation.md) - Build, install, and setup instructions
- [Configuration](configuration.md) - Configure API keys, models, and settings
- [Usage Guide](usage.md) - Complete CLI reference with examples

### Deep Dive

- [Architecture](architecture.md) - System design, patterns, and data flow
- [Components](components/) - Detailed component documentation
- [API Reference](api/) - Complete command-line interface reference

### Support

- [Troubleshooting](troubleshooting.md) - Common issues and solutions
- [Performance](performance.md) - Performance characteristics and optimization
## Project Overview

OpenQuery is a CLI tool that combines large language models with web search to provide accurate, well-sourced answers to complex questions.
### What It Does
- Takes a natural language question as input
- Generates multiple diverse search queries
- Searches the web via SearxNG
- Extracts and processes article content
- Uses semantic similarity to rank relevance
- Synthesizes a comprehensive AI-generated answer with citations
### Why Use OpenQuery?

- Accuracy: Multiple search queries reduce bias and increase coverage
- Transparency: Sources are cited in the final answer
- Speed: Parallel processing minimizes latency
- Control: Fine-tune every aspect, from query count to chunk selection
- Privacy: SearxNG aggregates search results without tracking you
## Key Concepts

### Search Queries
Instead of using your exact question, OpenQuery generates multiple optimized search queries (default: 3). For example, "What is quantum entanglement?" might become:
- "quantum entanglement definition"
- "how quantum entanglement works"
- "quantum entanglement experiments"
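Conceptually, this step is just a prompt plus a small parser. The sketch below is in Python for brevity (the actual implementation is C#/.NET), and `build_query_prompt` and `parse_queries` are hypothetical names, not OpenQuery's real API:

```python
def build_query_prompt(question: str, n: int = 3) -> str:
    """Ask the LLM for n diverse search queries, one per line."""
    return (
        f"Generate {n} diverse web search queries for the question below, "
        f"one per line, with no numbering.\n\nQuestion: {question}"
    )

def parse_queries(llm_response: str, n: int = 3) -> list[str]:
    """Keep the first n non-empty lines of the model's reply."""
    lines = [line.strip() for line in llm_response.splitlines() if line.strip()]
    return lines[:n]
```

Parsing defensively (stripping blanks, capping at `n`) matters because LLM output formatting is not guaranteed.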
### Content Chunks
Long articles are split into ~500-character chunks. Each chunk is:
- Stored with its source URL and title
- Converted to a vector embedding (1536 dimensions)
- Scored against your query embedding
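A minimal sketch of the chunking step, in Python with invented names (`Chunk`, `chunk_article`) rather than the real C# types:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    url: str    # source URL the chunk came from
    title: str  # title of the source article

def chunk_article(text: str, url: str, title: str, size: int = 500) -> list[Chunk]:
    """Split article text into ~size-character chunks, each tagged with its source."""
    return [Chunk(text[i:i + size], url, title) for i in range(0, len(text), size)]
```

Keeping the URL and title on every chunk is what makes per-source citations possible later, after ranking has discarded most of the article.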
### Semantic Ranking
Using cosine similarity between embeddings, OpenQuery ranks chunks by relevance and selects the top N (default: 3) for the final context.
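The ranking step reduces to cosine similarity plus a sort. A plain-Python sketch of the same math (the production code uses System.Numerics.Tensors for this):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_chunks(query_vec: list[float],
               chunk_vecs: list[list[float]],
               chunks: list[str],
               n: int = 3) -> list[str]:
    """Score each chunk against the query embedding and keep the top n."""
    scored = sorted(zip(chunks, (cosine_similarity(query_vec, v) for v in chunk_vecs)),
                    key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in scored[:n]]
```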
### Streaming Answer
The LLM receives your question plus the top chunks as context and streams the answer in real-time, citing sources like [Source 1].
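The final prompt can be assembled along these lines; this Python sketch uses an invented `build_prompt` helper and illustrative wording, not the exact prompt OpenQuery sends:

```python
def build_prompt(question: str, ranked_chunks: list[tuple[str, str]]) -> str:
    """ranked_chunks: (text, url) pairs, already sorted by relevance."""
    context = "\n\n".join(
        f"[Source {i}] ({url})\n{text}"
        for i, (text, url) in enumerate(ranked_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite sources like [Source 1].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Numbering the sources in the prompt is what lets the model emit stable `[Source N]` citations that map back to URLs.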
## Technology Stack
| Layer | Technology | Purpose |
|---|---|---|
| Runtime | .NET 10.0 AOT | Native performance, minimal footprint |
| LLM | OpenRouter API | Chat completions and embeddings |
| Search | SearxNG | Metasearch engine |
| Content Extraction | SmartReader | Article text extraction |
| Vector Math | System.Numerics.Tensors | High-performance cosine similarity |
| Resilience | Polly | Retry and circuit breaker policies |
| CLI | System.CommandLine | Command parsing and help |
| JSON | System.Text.Json (source-gen) | Fast serialization |
## System Workflow

```
OpenQuery Workflow
==================

1. User Query: "What is quantum entanglement?"

2. Query Generation (Optional)
   LLM generates: ["quantum entanglement physics",
                   "quantum entanglement definition",
                   "how does quantum entanglement work"]

3. Parallel Searches
   Query 1 ──→ SearxNG ──→ Results ─┐
   Query 2 ──→ SearxNG ──→ Results ─┼─→ Results (combined)
   Query 3 ──→ SearxNG ──→ Results ─┘

4. Parallel Article Fetching
   URL 1 ──→ Article ──→ Chunks
   URL 2 ──→ Article ──→ Chunks
   ... (concurrent, max 10 at a time)

5. Parallel Embeddings
   Chunk Batch 1 ──→ Embedding API ──→ Vectors
   Chunk Batch 2 ──→ Embedding API ──→ Vectors
   (batches of 300, up to 4 concurrent)

6. Semantic Ranking
   Query Embedding + Chunk Embeddings → Cosine Similarity
   → Score → Sort Descending → Top 3 Chunks

7. Final Answer Generation
   System:   "Answer based on this context:"
   Context:  [Top 3 chunks with sources]
   Question: "What is quantum entanglement?"
        ↓
   LLM streams the answer in real time:
   "Quantum entanglement is..." with citations like [Source 1]
```
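The "max 10 at a time" bound in the fetching step is a standard semaphore pattern. Sketched here in Python with asyncio (the .NET implementation uses its own concurrency primitives; `fetch_all` and `fetch_one` are hypothetical names):

```python
import asyncio

MAX_CONCURRENT_FETCHES = 10  # the "max 10 at a time" bound from the fetching step

async def fetch_all(urls, fetch_one):
    """Run fetch_one for every URL concurrently, never exceeding the bound."""
    sem = asyncio.Semaphore(MAX_CONCURRENT_FETCHES)

    async def bounded(url):
        async with sem:  # at most MAX_CONCURRENT_FETCHES tasks pass at once
            return await fetch_one(url)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(bounded(u) for u in urls))
```

The same shape applies to the embedding step, just with a bound of 4 concurrent batches instead of 10 fetches.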
## Next Steps

For detailed technical information, continue to the [architecture guide](architecture.md).

Need help? Check the [Troubleshooting](troubleshooting.md) guide.