docs: add comprehensive documentation with README and detailed guides
- Add user-friendly README.md with quick start guide - Create docs/ folder with structured technical documentation: - installation.md: Build and setup instructions - configuration.md: Complete config reference - usage.md: CLI usage guide with examples - architecture.md: System design and patterns - components/: Deep dive into each component (OpenQueryApp, SearchTool, Services, Models) - api/: CLI reference, environment variables, programmatic API - troubleshooting.md: Common issues and solutions - performance.md: Latency, throughput, and optimization - All documentation fully cross-referenced with internal links - Covers project overview, architecture, components, APIs, and support See individual files for complete documentation.
This commit is contained in:
173
docs/index.md
Normal file
173
docs/index.md
Normal file
@@ -0,0 +1,173 @@
|
||||
# OpenQuery Documentation
|
||||
|
||||
Welcome to the comprehensive documentation for OpenQuery - the AI-powered search and answer system.
|
||||
|
||||
## 📚 Documentation Overview
|
||||
|
||||
### Getting Started
|
||||
- **[Installation Guide](installation.md)** - Build, install, and setup instructions
|
||||
- **[Configuration](configuration.md)** - Configure API keys, models, and settings
|
||||
- **[Usage Guide](usage.md)** - Complete CLI reference with examples
|
||||
|
||||
### Deep Dive
|
||||
- **[Architecture](architecture.md)** - System design, patterns, and data flow
|
||||
- **[Components](components/overview.md)** - Detailed component documentation
|
||||
- [OpenQueryApp](components/openquery-app.md)
|
||||
- [SearchTool](components/search-tool.md)
|
||||
- [Services](components/services.md)
|
||||
- [Models](components/models.md)
|
||||
- **[API Reference](api/cli.md)** - Complete command-line interface reference
|
||||
- [Environment Variables](api/environment-variables.md)
|
||||
- [Programmatic APIs](api/programmatic.md)
|
||||
|
||||
### Support
|
||||
- **[Troubleshooting](troubleshooting.md)** - Common issues and solutions
|
||||
- **[Performance](performance.md)** - Performance characteristics and optimization
|
||||
|
||||
## 🎯 Quick Links
|
||||
|
||||
### For Users
|
||||
- [Install OpenQuery](installation.md) in 5 minutes
|
||||
- [Configure your API key](configuration.md)
|
||||
- [Learn the basics](usage.md)
|
||||
- [Solve common problems](troubleshooting.md)
|
||||
|
||||
### For Developers
|
||||
- [Understand the architecture](architecture.md)
|
||||
- [Explore components](components/overview.md)
|
||||
- [Use the APIs programmatically](api/programmatic.md)
|
||||
- [Performance tuning](performance.md)
|
||||
|
||||
## 📋 Table of Contents
|
||||
|
||||
1. [Project Overview](#project-overview)
|
||||
2. [Key Concepts](#key-concepts)
|
||||
3. [Technology Stack](#technology-stack)
|
||||
4. [System Workflow](#system-workflow)
|
||||
|
||||
## Project Overview
|
||||
|
||||
**OpenQuery** is a sophisticated CLI tool that combines the power of large language models with web search to provide accurate, well-sourced answers to complex questions.
|
||||
|
||||
### What It Does
|
||||
- Takes a natural language question as input
|
||||
- Generates multiple diverse search queries
|
||||
- Searches the web via SearxNG
|
||||
- Extracts and processes article content
|
||||
- Uses semantic similarity to rank relevance
|
||||
- Synthesizes a comprehensive AI-generated answer with citations
|
||||
|
||||
### Why Use OpenQuery?
|
||||
- **Accuracy**: Multiple search queries reduce bias and increase coverage
|
||||
- **Transparency**: Sources are cited in the final answer
|
||||
- **Speed**: Parallel processing minimizes latency
|
||||
- **Control**: Fine-tune every aspect from query count to chunk selection
|
||||
- **Privacy**: SearxNG provides anonymous, aggregating search
|
||||
|
||||
## Key Concepts
|
||||
|
||||
### Search Queries
|
||||
Instead of using your exact question, OpenQuery generates multiple optimized search queries (default: 3). For example, "What is quantum entanglement?" might become:
|
||||
- "quantum entanglement definition"
|
||||
- "how quantum entanglement works"
|
||||
- "quantum entanglement experiments"
|
||||
|
||||
### Content Chunks
|
||||
Long articles are split into ~500-character chunks. Each chunk is:
|
||||
- Stored with its source URL and title
|
||||
- Converted to a vector embedding (1536 dimensions)
|
||||
- Scored against your query embedding
|
||||
|
||||
### Semantic Ranking
|
||||
Using cosine similarity between embeddings, OpenQuery ranks chunks by relevance and selects the top N (default: 3) for the final context.
|
||||
|
||||
### Streaming Answer
|
||||
The LLM receives your question plus the top chunks as context and streams the answer in real-time, citing sources like `[Source 1]`.
|
||||
|
||||
## Technology Stack
|
||||
|
||||
| Layer | Technology | Purpose |
|
||||
|-------|------------|---------|
|
||||
| Runtime | .NET 10.0 AOT | Native performance, minimal footprint |
|
||||
| LLM | OpenRouter API | Chat completions and embeddings |
|
||||
| Search | SearxNG | Metasearch engine |
|
||||
| Content Extraction | SmartReader | Article text extraction |
|
||||
| Vector Math | System.Numerics.Tensors | High-performance cosine similarity |
|
||||
| Resilience | Polly | Retry and circuit breaker policies |
|
||||
| CLI | System.CommandLine | Command parsing and help |
|
||||
| JSON | System.Text.Json (source-gen) | Fast serialization |
|
||||
|
||||
## System Workflow
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ OpenQuery Workflow │
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ 1. User Query: "What is quantum entanglement?" │
|
||||
│ │
|
||||
│ 2. Query Generation (Optional) │
|
||||
│ LLM generates: ["quantum entanglement physics", │
|
||||
│ "quantum entanglement definition", │
|
||||
│ "how does quantum entanglement work"] │
|
||||
│ │
|
||||
│ 3. Parallel Searches │
|
||||
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
|
||||
│ │ Query 1 → │→ │ SearxNG │→ │ Results │ │
|
||||
│ └────────────┘ └────────────┘ └────────────┘ │
|
||||
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
|
||||
│ │ Query 2 → │→ │ SearxNG │→ │ Results │ │
|
||||
│ └────────────┘ └────────────┘ └────────────┘ │
|
||||
│ ┌────────────┐ ┌────────────┐ ┌────────────┘ │
|
||||
│ │ Query 3 → │→ │ SearxNG │→ │ Results (combined) │
|
||||
│ └────────────┘ └────────────┘ └────────────┘ │
|
||||
│ │
|
||||
│ 4. Parallel Article Fetching │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
|
||||
│ │ URL 1 → │→ │ Article │→ │ Chunks │ │
|
||||
│ └──────────┘ └──────────┘ └──────────┘ │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
|
||||
│ │ URL 2 → │→ │ Article │→ │ Chunks │ │
|
||||
│ └──────────┘ └──────────┘ └──────────┘ │
|
||||
│ ... (concurrent, max 10 at a time) │
|
||||
│ │
|
||||
│ 5. Parallel Embeddings │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
|
||||
│ │ Chunks │→ │ Embed- │→ │ Vectors │ │
|
||||
│ │ Batch 1 │ │ ding API │ │ │ │
|
||||
│ └──────────┘ └──────────┘ └──────────┘ │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
|
||||
│ │ Chunks │→ │ Embed- │→ │ Vectors │ │
|
||||
│ │ Batch 2 │ │ ding API │ │ │ │
|
||||
│ └──────────┘ └──────────┘ └──────────┘ │
|
||||
│ (batches of 300, up to 4 concurrent) │
|
||||
│ │
|
||||
│ 6. Semantic Ranking │
|
||||
│ Query Embedding + Chunk Embeddings → Cosine Similarity → │
|
||||
│ Score → Sort Descending → Top 3 Chunks │
|
||||
│ │
|
||||
│ 7. Final Answer Generation │
|
||||
│ ┌────────────────────────────────────────────┐ │
|
||||
│ │ System: "Answer based on this context:" │ │
|
||||
│ │ Context: [Top 3 chunks with sources] │ │
|
||||
│ │ Question: "What is quantum entanglement?" │ │
|
||||
│ └────────────────────────────────────────────┘ │
|
||||
│ ↓ │
|
||||
│ LLM Streams Answer │
|
||||
│ "Quantum entanglement is..." │
|
||||
│ with citations like [Source 1] │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **[Install OpenQuery](installation.md)**
|
||||
2. **[Configure it](configuration.md)**
|
||||
3. **[Start asking questions](usage.md)**
|
||||
|
||||
For detailed technical information, continue to [the architecture guide](architecture.md).
|
||||
|
||||
---
|
||||
|
||||
**Need help?** Check the [Troubleshooting](troubleshooting.md) guide.
|
||||
Reference in New Issue
Block a user