docs: add comprehensive documentation with README and detailed guides

- Add user-friendly README.md with quick start guide
- Create docs/ folder with structured technical documentation:
  - installation.md: Build and setup instructions
  - configuration.md: Complete config reference
  - usage.md: CLI usage guide with examples
  - architecture.md: System design and patterns
  - components/: Deep dive into each component (OpenQueryApp, SearchTool, Services, Models)
  - api/: CLI reference, environment variables, programmatic API
  - troubleshooting.md: Common issues and solutions
  - performance.md: Latency, throughput, and optimization
- All documentation fully cross-referenced with internal links
- Covers project overview, architecture, components, APIs, and support

See individual files for complete documentation.

README.md (new file, 196 lines)
# OpenQuery

**AI-powered search and answer system** that finds accurate, well-sourced answers to your questions by searching the web, extracting relevant content, and synthesizing intelligent responses.

![.NET](https://img.shields.io/badge/.NET-10.0-blue)
![AOT](https://img.shields.io/badge/AOT-Compiled-green)
![License](https://img.shields.io/badge/license-MIT-green)

## ✨ Features

- 🤖 **Smart Query Generation** - Automatically creates multiple diverse search queries from your question
- ⚡ **Parallel Processing** - Fast concurrent searches, article fetching, and embedding generation
- 🎯 **Semantic Search** - Uses vector embeddings to find the most relevant information
- 📚 **Clean Article Extraction** - Intelligently extracts article content using SmartReader
- 🔄 **Streaming Responses** - Watch the AI answer generate in real-time
- ⚙️ **Fully Configurable** - Control queries, results, and context chunks
- 🛡️ **Production Ready** - Built with rate limiting, retries, and error handling

## 🚀 Quick Start

### 1. Prerequisites

- A **SearxNG** instance (Docker recommended):

```bash
docker run -d --name searxng -p 8002:8080 searxng/searxng:latest
```

- An **OpenRouter API key** from [openrouter.ai](https://openrouter.ai)

### 2. Installation

```bash
# Clone and build
git clone <your-repo-url>
cd OpenQuery
chmod +x install.sh
./install.sh

# Or build manually
dotnet publish -c Release -r linux-x64 --self-contained true /p:PublishAot=true
```

### 3. Configuration

```bash
# Interactive setup
openquery configure -i

# Or set environment variables
export OPENROUTER_API_KEY="sk-or-..."
export SEARXNG_URL="http://localhost:8002"  # default
```

### 4. Ask a Question

```bash
openquery "What is quantum entanglement and how does it work?"
```

That's it! The system will:

1. Generate 3 search queries (configurable)
2. Search the web via SearxNG
3. Extract and chunk relevant articles
4. Rank content by semantic relevance
5. Stream a comprehensive answer with citations

## 📖 Usage Examples

```bash
# Concise answer
openquery -s "Who won the 2024 US presidential election?"

# Detailed research
openquery -l -q 5 -r 10 "Explain quantum computing and its applications"

# See everything
openquery -v "What are the health benefits of meditation?"

# Customize
openquery -c 5 -r 8 "Current state of SpaceX Starship development"
```

## 🔧 Options

```
-c, --chunks N     Number of top context chunks (default: 3)
-r, --results N    Search results per query (default: 5)
-q, --queries N    Number of search queries to generate (default: 3)
-s, --short        Give a concise answer
-l, --long         Give a detailed answer
-v, --verbose      Show detailed progress
```

## 🌐 Supported Models

OpenQuery works with any OpenRouter model. Popular choices:

- `qwen/qwen3.5-flash-02-23` (default, fast & affordable)
- `google/gemini-3-flash-preview`
- `deepseek/deepseek-v3.2`
- `moonshotai/kimi-k2.5`

Configure your preferred model:

```bash
openquery configure --model "google/gemini-3-flash-preview"
```

## 📁 Project Structure

```
OpenQuery/
├── README.md            # This file
├── docs/                # Detailed documentation
│   ├── installation.md
│   ├── configuration.md
│   ├── usage.md
│   ├── architecture.md
│   ├── components/
│   └── troubleshooting.md
├── Program.cs           # CLI entry point
├── OpenQuery.cs         # Main application logic
├── Services/            # Business logic services
├── Models/              # Data models
├── Tools/               # Search orchestration
└── ConfigManager.cs     # Configuration management
```

## 🏗️ Architecture

OpenQuery uses a multi-stage pipeline:

```
Query → Multiple Searches → Article Fetching → Embeddings → Ranking → AI Answer
```

1. **Query Expansion**: LLM generates diverse search queries
2. **Parallel Search**: SearxNG executes all queries simultaneously
3. **Content Extraction**: SmartReader pulls clean article text
4. **Embedding Generation**: Vectorize query and chunks
5. **Semantic Ranking**: Cosine similarity scoring
6. **Answer Synthesis**: Final LLM response with sources
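
Between extraction and embedding, article text is split into roughly fixed-size chunks. The idea can be pictured with `fold` (a rough sketch; OpenQuery does this in C# with chunks of about 500 characters):

```bash
# Split a 100-character string into fixed-width chunks
# (40 chars here for brevity; OpenQuery uses ~500)
text=$(printf 'a%.0s' {1..100})
printf '%s\n' "$text" | fold -w 40 | wc -l   # 3 chunks: 40 + 40 + 20 chars
```

Each chunk is embedded independently, so relevance can be scored per passage rather than per article.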

## 🔍 How It Works

1. **You ask a question** → OpenQuery generates 3 optimized search queries
2. **Searches the web** → All queries run in parallel via SearxNG
3. **Fetches articles** → Extracts clean content from top results
4. **Splits into chunks** → ~500 character pieces for embedding
5. **Ranks by relevance** → Semantic similarity to your question
6. **Synthesizes answer** → LLM reviews top 3 chunks and responds with citations
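
The relevance score in step 5 is cosine similarity between the question's embedding and each chunk's embedding. A toy awk sketch on tiny vectors (real embeddings are high-dimensional floats from the embedding model):

```bash
# cos(a, b) = dot(a, b) / (|a| * |b|), on comma-separated vectors
cosine() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, ","); split(b, y, ",")
    for (i = 1; i <= n; i++) { dot += x[i]*y[i]; na += x[i]^2; nb += y[i]^2 }
    printf "%.4f\n", dot / (sqrt(na) * sqrt(nb))
  }'
}

cosine "1,2,3" "1,2,3"   # same direction -> 1.0000
cosine "1,0" "0,1"       # orthogonal    -> 0.0000
```

Chunks are sorted by this score and the top N are passed to the LLM as context.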

## 🐛 Troubleshooting

**Common issues** and solutions:

| Problem | Solution |
|---------|----------|
| "API Key is missing" | Run `openquery configure -i` or set `OPENROUTER_API_KEY` |
| No search results | Check that your SearxNG instance is running (`curl http://localhost:8002`) |
| Slow performance | Reduce the `--results` or `--queries` count |
| Articles failing to fetch | Some sites block scrapers; try different queries |

See [docs/troubleshooting.md](docs/troubleshooting.md) for detailed help.

## 📚 Documentation

- **[Installation Guide](docs/installation.md)** - Build and setup instructions
- **[Configuration](docs/configuration.md)** - All config options and environment variables
- **[Usage Guide](docs/usage.md)** - Complete CLI reference and examples
- **[Architecture](docs/architecture.md)** - System design and patterns
- **[Components](docs/components/)** - Deep dive into each module
- **[Troubleshooting](docs/troubleshooting.md)** - Solve common problems
- **[API Reference](docs/api-reference.md)** - Programmatic interfaces

## 🤝 Contributing

Contributions welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request

## 📄 License

MIT License - see LICENSE file for details.

## 🙏 Acknowledgments

- [OpenRouter](https://openrouter.ai) - LLM API aggregation
- [SearxNG](https://searx.space) - Privacy-respecting metasearch
- [SmartReader](https://github.com/kfasten/SmartReader) - Article extraction
- Built with [.NET](https://dotnet.microsoft.com)

---

**Need more details?** Check the comprehensive documentation in the [docs/](docs/) folder.

docs/api/cli.md (new file, 309 lines)
# CLI Reference

Complete command-line interface reference for OpenQuery.

## 📋 Table of Contents

1. [Command Structure](#command-structure)
2. [Main Command: `openquery`](#main-command-openquery)
3. [Configure Command: `openquery configure`](#configure-command-openquery-configure)
4. [Exit Codes](#exit-codes)
5. [Examples by Use Case](#examples-by-use-case)
6. [Shell Integration](#shell-integration)

## Command Structure

OpenQuery uses [System.CommandLine](https://learn.microsoft.com/dotnet/standard/commandline/) for CLI parsing.

### Syntax

```bash
openquery [GLOBAL-OPTIONS] <COMMAND> [COMMAND-OPTIONS] [ARGUMENTS]
```

If no command is specified, the main `openquery` command is assumed.

### Help

```bash
openquery --help
openquery configure --help
```

Shows usage, options, and examples.

### Version

```bash
openquery --version  # if implemented
```

---

## Main Command: `openquery`

Ask a question and get an AI-powered answer.

### Synopsis

```bash
openquery [OPTIONS] <question>
```

### Arguments

| Name | Arity | Type | Description |
|------|-------|------|-------------|
| `question` | ZeroOrMore | `string[]` | The question to ask (positional, concatenated with spaces) |

**Notes**:
- `ZeroOrMore` means you can omit the question (shows help)
- Multiple words are combined: `openquery what is quantum` → `"what is quantum"`
- Use quotes for questions with special characters: `openquery "what's the weather?"`

### Options

| Option | Aliases | Type | Default | Description |
|--------|---------|------|---------|-------------|
| `--chunks` | `-c` | `int` | `DefaultChunks` (config) | Number of top context chunks to pass to the LLM |
| `--results` | `-r` | `int` | `DefaultResults` (config) | Number of search results per query |
| `--queries` | `-q` | `int` | `DefaultQueries` (config) | Number of search queries to generate |
| `--short` | `-s` | `bool` | `false` | Request a concise answer |
| `--long` | `-l` | `bool` | `false` | Request a detailed answer |
| `--verbose` | `-v` | `bool` | `false` | Show detailed progress information |

**Option Notes**:
- `--short` and `--long` are flags; if both are specified, `--long` takes precedence
- Integer options are validated as positive numbers (parsed by System.CommandLine)
- Defaults come from the config file or are hardcoded (3, 5, and 3 respectively)

### Behavior

1. Loads the API key (env `OPENROUTER_API_KEY` or config file)
2. Loads the model (env `OPENROUTER_MODEL` or config)
3. Executes the workflow:
   - Generate queries (if `--queries > 1`)
   - Run the search pipeline
   - Stream the final answer
4. Exits with code 0 on success, 1 on error

### Examples

```bash
# Basic
openquery "What is the capital of France?"

# With options
openquery -q 5 -r 10 -c 4 "Explain quantum computing"

# Short answer
openquery -s "Who won the 2024 election?"

# Verbose mode
openquery -v "How does photosynthesis work?"

# Combined
openquery -l -v -q 8 "History of the internet"
```

---

## Configure Command: `openquery configure`

Configure OpenQuery settings (API key, model, defaults).

### Synopsis

```bash
openquery configure [OPTIONS]
```

### Options

| Option | Type | Description |
|--------|------|-------------|
| `--interactive` / `-i` | `bool` | Launch interactive configuration wizard |
| `--key` | `string` | Set OpenRouter API key |
| `--model` | `string` | Set default LLM model |
| `--queries` | `int?` | Set default number of queries |
| `--chunks` | `int?` | Set default number of chunks |
| `--results` | `int?` | Set default number of results |

**Note**: Nullable options (`int?`) only update if provided.

### Behavior

- **Interactive mode** (`-i`): Prompts for each setting with current defaults shown in brackets
- **Non-interactive**: Only updates the provided options, leaving others untouched
- Writes to `~/.config/openquery/config` (creates the directory if missing)
- Overwrites the entire file (not incremental)
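
The file itself is a plain key=value store. A plausible example (key names inferred from the configuration and environment-variable docs; the exact format may differ):

```
ApiKey=sk-or-xxxxxxxxxxxx
Model=qwen/qwen3.5-flash-02-23
DefaultQueries=3
DefaultChunks=3
DefaultResults=5
```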

### Interactive Mode Details

Models are presented with a numbered menu:

```
Available models:
  1. qwen/qwen3.5-flash-02-23
  2. qwen/qwen3.5-122b-a10b
  3. minimax/minimax-m2.5
  4. google/gemini-3-flash-preview
  5. deepseek/deepseek-v3.2
  6. moonshotai/kimi-k2.5

Model [qwen/qwen3.5-flash-02-23]:
```

- Enter a number (1-6) to select a preset
- Or enter a custom model string (any OpenRouter model)
### Examples

```bash
# Interactive wizard
openquery configure -i

# Set just the API key
openquery configure --key "sk-or-xxxxxxxxxxxx"

# Set multiple defaults
openquery configure --model "google/gemini-3-flash-preview" --queries 5 --chunks 4

# Update model only
openquery configure --model "deepseek/deepseek-v3.2"
```

---

## Exit Codes

| Code | Meaning |
|------|---------|
| `0` | Success - answer generated and streamed |
| `1` | Error - API key missing, network failure, or exception |

**Usage in scripts**:
```bash
if openquery "question"; then
    echo "Success"
else
    echo "Failed" >&2
fi
```

---

## Examples by Use Case

### Quick Facts
```bash
openquery -s "capital of France"
```
Fast, concise, minimal tokens.

### Research Paper
```bash
openquery -l -q 5 -r 10 -c 4 "quantum entanglement experiments"
```
Multiple angles, deep sources, detailed synthesis.

### News & Current Events
```bash
openquery -v "latest news about OpenAI"
```
See everything: the queries, the results, and which sources were fetched.

### Troubleshooting
```bash
# Reduce scope if errors occur
openquery -q 1 -r 2 "test question"
```

### Save Answer to File
```bash
openquery "question" 2>/dev/null | sed 's/.\x08//g' > answer.md
```

(Removes spinner characters)
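
The `sed` expression deletes each character that is immediately followed by a backspace, which is how the progress spinner redraws itself. You can verify it on synthetic input (note: `\x08` is a GNU sed extension):

```bash
# '-' and '\' are each overwritten by a backspace, leaving only the real text
printf -- '-\b\\\bdone\n' | sed 's/.\x08//g'   # prints: done
```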

### Batch Processing
```bash
# Read questions line by line (a plain `for q in $(cat ...)` would
# split each question into separate words)
while IFS= read -r q; do
    echo "## $q" >> all-answers.md
    openquery -s "$q" 2>/dev/null | sed 's/.\x08//g' >> all-answers.md
    echo "" >> all-answers.md
done < questions.txt
```

---

## Shell Integration

### Aliases (add to ~/.bashrc or ~/.zshrc)

```bash
# Short alias
alias oq='openquery'

# Presets
alias oqs='openquery -s'              # short
alias oql='openquery -l'              # long
alias oqv='openquery -v'              # verbose
alias oqr='openquery -q 5 -r 10 -c 4' # research mode

# Config shortcuts
alias oqcfg='openquery configure -i'
```

### Functions

```bash
# Save answer cleanly (removes spinner chars)
oqsave() {
    local query="$*"
    local filename="answer-$(date +%Y%m%d-%H%M%S).md"
    openquery "$query" 2>/dev/null | sed 's/.\x08//g' > "$filename"
    echo "Saved to $filename"
}

# Search and grep results
oqgrep() {
    openquery "$1" 2>/dev/null | sed 's/.\x08//g' | grep -i "$2"
}
```

### Environment Setup Script

```bash
# ~/.local/bin/openquery-env.sh
export OPENROUTER_API_KEY="sk-or-..."
export OPENROUTER_MODEL="qwen/qwen3.5-flash-02-23"
export SEARXNG_URL="http://localhost:8002"
```

Source it: `source ~/.local/bin/openquery-env.sh`

---

## Next Steps

- **[Configuration](configuration.md)** - Set up your environment
- **[Usage](usage.md)** - Learn usage patterns and tips
- **[Troubleshooting](troubleshooting.md)** - Fix common problems

---

**Quick Reference Card**

```
# Ask
openquery "question"
openquery -s "quick fact"
openquery -l -q 5 "deep research"

# Configure
openquery configure -i
openquery configure --key "..."
openquery configure --model "..."

# Debug
openquery -v "question"

# Help
openquery --help
```

docs/api/environment-variables.md (new file, 235 lines)
# Environment Variables

Reference for all environment variables used by OpenQuery.

## 📋 Summary

| Variable | Purpose | Required | Default | Example |
|----------|---------|----------|---------|---------|
| `OPENROUTER_API_KEY` | OpenRouter authentication | **Yes** | (none) | `sk-or-...` |
| `OPENROUTER_MODEL` | Override default LLM model | No | `qwen/qwen3.5-flash-02-23` | `google/gemini-3-flash-preview` |
| `SEARXNG_URL` | SearxNG instance URL | No | `http://localhost:8002` | `https://searx.example.com` |

## Detailed Reference

### `OPENROUTER_API_KEY`

**Purpose**: Your OpenRouter API authentication token.

**Required**: Yes, unless you have `ApiKey` set in the config file.

**How to Obtain**:
1. Sign up at https://openrouter.ai
2. Go to Dashboard → API Keys
3. Copy your key (starts with `sk-or-`)

**Priority**: Overrides the config file `ApiKey`.

**Setting**:

```bash
# Bash/Zsh
export OPENROUTER_API_KEY="sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

# Fish
set -x OPENROUTER_API_KEY "sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

# PowerShell
$env:OPENROUTER_API_KEY="sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

# Windows CMD
set OPENROUTER_API_KEY=sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

**Security**:
- Never commit the API key to version control
- Don't share the key publicly
- Use environment variables or a config file with restrictive permissions (600)
- Rotate the key if it is accidentally exposed

**Validation**: OpenQuery checks whether the key is an empty string and exits with an error if it is missing:

```
[Error] API Key is missing. Set OPENROUTER_API_KEY environment variable or run 'configure -i' to set it up.
```
### `OPENROUTER_MODEL`

**Purpose**: Override the default LLM model used for both query generation and the final answer.

**Required**: No.

**Default**: `qwen/qwen3.5-flash-02-23`

**Available Models** (from OpenRouter):

| Model | Provider | Context | Cost (Input/Output per 1M tokens) |
|-------|----------|---------|-----------------------------------|
| `qwen/qwen3.5-flash-02-23` | Alibaba | 200K | \$0.10 / \$0.20 |
| `qwen/qwen3.5-122b-a10b` | Alibaba | 200K | ~\$0.20 / ~\$0.40 |
| `minimax/minimax-m2.5` | MiniMax | 200K | ~\$0.20 / ~\$0.40 |
| `google/gemini-3-flash-preview` | Google | 1M | ~\$0.10 / ~\$0.40 |
| `deepseek/deepseek-v3.2` | DeepSeek | 200K | ~\$0.10 / ~\$0.30 |
| `moonshotai/kimi-k2.5` | Moonshot AI | 200K | ~\$0.10 / ~\$0.30 |

(See OpenRouter for current pricing.)

**Setting**:

```bash
export OPENROUTER_MODEL="google/gemini-3-flash-preview"
```

**Interactive Config Models**: The `configure -i` wizard shows only these 6 models for convenience, but you can set any OpenRouter model via the environment variable or non-interactive configure.

**Note**: Models differ in:
- Speed (Flash models are faster)
- Cost (check pricing)
- Quality (may vary by task)
- Context window size (Gemini 3 Flash has 1M tokens, others ~200K)
### `SEARXNG_URL`

**Purpose**: URL of the SearxNG metasearch instance.

**Required**: No.

**Default**: `http://localhost:8002`

**Format**: Must include the protocol (`http://` or `https://`) and host:port.

**Setting**:

```bash
# Local Docker instance
export SEARXNG_URL="http://localhost:8002"

# Remote instance with HTTPS
export SEARXNG_URL="https://searx.example.com"

# Custom port
export SEARXNG_URL="http://localhost:8080"
```

**Finding a Public Instance**:
- Visit https://searx.space for a list of public instances
- Choose one with HTTPS and low latency
- Note: Public instances may have rate limits or require attribution

**Priority**: Overrides the built-in default. There is currently no config-file setting for the SearxNG URL (environment variable only); one could be added in the future.
**Test Your Instance**:

```bash
curl "$SEARXNG_URL/search?q=test&format=json" | head
```

Expected: JSON with `"results": [...]`.

---

## Configuration Priority Recap

When OpenQuery needs a value:

1. **Command-line option** (`--model`, `--key` from configure) - highest
2. **Environment variable** (`OPENROUTER_MODEL`, `OPENROUTER_API_KEY`, `SEARXNG_URL`)
3. **Configuration file** (`~/.config/openquery/config`: `Model`, `ApiKey`)
4. **Hard-coded default** (only for the model)
**Example**:
```bash
# Config file: Model=qwen/qwen3.5-flash-02-23
export OPENROUTER_MODEL="deepseek/deepseek-v3.2"
openquery --model "google/gemini-3-flash-preview" "question"
# Uses google/gemini-3-flash-preview: the CLI option wins over both the env var and the config file
```
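
The same precedence chain maps naturally onto shell parameter expansion; a sketch of the resolution logic (illustrative only — OpenQuery resolves this in C#, and `resolve_model` and its arguments are made up for the demo):

```bash
# CLI flag > OPENROUTER_MODEL env var > config file value > hard-coded default
resolve_model() {
  local cli_model="$1" config_model="$2"
  echo "${cli_model:-${OPENROUTER_MODEL:-${config_model:-qwen/qwen3.5-flash-02-23}}}"
}

unset OPENROUTER_MODEL
resolve_model "" ""                          # hard-coded default
OPENROUTER_MODEL="deepseek/deepseek-v3.2" \
  resolve_model "" "config-model"            # env var beats config
resolve_model "cli-model" "config-model"     # CLI flag beats everything
```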

---

## Troubleshooting Environment Variables

### Variable Not Taking Effect

**Symptom**: `openquery` still uses the old value after an export.

**Causes**:
- Exported in a different shell session
- Exported after running `openquery`
- Shell profile not reloaded

**Check**:
```bash
echo $OPENROUTER_API_KEY
# Should print the key (or nothing if unset)
```
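
To check whether the key is set without echoing the secret itself to the terminal:

```bash
if [ -n "${OPENROUTER_API_KEY:-}" ]; then
  echo "OPENROUTER_API_KEY is set (${#OPENROUTER_API_KEY} chars)"
else
  echo "OPENROUTER_API_KEY is NOT set"
fi
```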

**Fix**:
```bash
# Export in the current session
export OPENROUTER_API_KEY="sk-or-..."

# Or add it to ~/.bashrc / ~/.zshrc and restart the terminal
```

### Special Characters in Values

If your API key contains special characters (`$`, `!`, etc.), quote it properly:

```bash
export OPENROUTER_API_KEY='sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
# Single quotes prevent shell expansion
```
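
The difference is easy to see: double quotes let the shell expand `$`-sequences, single quotes pass them through untouched:

```bash
b="EXPANDED"
echo "sk-or-a$b"   # prints sk-or-aEXPANDED  (double quotes expand $b)
echo 'sk-or-a$b'   # prints sk-or-a$b        (single quotes preserve it)
```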

### Variable Name Typos

`OPENROUTER_API_KEY` is all caps with underscores. `openrouter_api_key` (lowercase) won't work.

**Check spelling**:
```bash
env | grep -i openrouter
```

### Windows Environment Variables

On Windows, environment variables are set per-session or at the user level:

**PowerShell** (current session):
```powershell
$env:OPENROUTER_API_KEY="sk-or-..."
```

**Persistent** (PowerShell):
```powershell
[Environment]::SetEnvironmentVariable("OPENROUTER_API_KEY", "sk-or-...", "User")
```

**CMD**:
```cmd
set OPENROUTER_API_KEY=sk-or-...
```

**System Properties** → Advanced → Environment Variables (GUI)

---

## Next Steps

- **[Configuration File](../configuration.md)** - Persistent configuration
- **[Usage Guide](../usage.md)** - How to use these variables
- **[Troubleshooting](../troubleshooting.md)** - Diagnose environment issues

---

**Quick Reference**

```bash
# Required
export OPENROUTER_API_KEY="sk-or-..."

# Optional (override defaults)
export OPENROUTER_MODEL="google/gemini-3-flash-preview"
export SEARXNG_URL="https://searx.example.com"

# Run
openquery "your question"
```

docs/api/programmatic.md (new file, 508 lines)
# Programmatic API Reference

How to use OpenQuery components programmatically in your own C# code.

## 📋 Table of Contents

1. [Overview](#overview)
2. [Using OpenQueryApp Programmatically](#using-openqueryapp-programmatically)
3. [Using Individual Services](#using-individual-services)
4. [Custom Implementations](#custom-implementations)
5. [Thread Safety](#thread-safety)
6. [Error Handling](#error-handling)

## Overview

OpenQuery is designed as a library of composable services, not just a CLI tool. You can reference the project (or extract the core classes) and use them in your own applications.

### Core Interfaces

Currently, OpenQuery uses concrete classes rather than interfaces. To use it programmatically:

1. Reference the `OpenQuery` project/dll
2. Add `using OpenQuery.Services;` and `using OpenQuery.Tools;`
3. Instantiate the dependencies
4. Call their methods

### Dependency Chain

```
Your Code
├── OpenRouterClient (LLM API)
├── SearxngClient (Search API)
├── EmbeddingService (requires OpenRouterClient)
└── SearchTool (requires SearxngClient + EmbeddingService)
    └── (internally uses ArticleService, ChunkingService, RateLimiter)
```
---

## Using OpenQueryApp Programmatically

### Minimal Example

```csharp
using OpenQuery;
using OpenQuery.Services;
using OpenQuery.Tools;
using OpenQuery.Models;

// 1. Configure
string apiKey = Environment.GetEnvironmentVariable("OPENROUTER_API_KEY")
    ?? throw new InvalidOperationException("API key required");
string searxngUrl = Environment.GetEnvironmentVariable("SEARXNG_URL")
    ?? "http://localhost:8002";
string model = Environment.GetEnvironmentVariable("OPENROUTER_MODEL")
    ?? "qwen/qwen3.5-flash-02-23";

// 2. Instantiate services
var openRouterClient = new OpenRouterClient(apiKey);
var searxngClient = new SearxngClient(searxngUrl);
var embeddingService = new EmbeddingService(openRouterClient);
var searchTool = new SearchTool(searxngClient, embeddingService);
var openQuery = new OpenQueryApp(openRouterClient, searchTool, model);

// 3. Execute
var options = new OpenQueryOptions(
    Chunks: 3,
    Results: 5,
    Queries: 3,
    Short: false,
    Long: false,
    Verbose: false,
    Question: "What is quantum entanglement?"
);

await openQuery.RunAsync(options);
```

**Output**: Streams the answer to `Console.Out` (hardcoded in `OpenQueryApp`). To capture the output, modify `OpenQueryApp` or redirect the console.

### Capturing Output

`OpenQueryApp.RunAsync` writes directly to `Console`. To capture it:

**Option 1**: Redirect the console (hacky)
```csharp
var sw = new StringWriter();
Console.SetOut(sw);
await openQuery.RunAsync(options);
string answer = sw.ToString();
```

**Option 2**: Modify `OpenQueryApp` to accept a `TextWriter` (not currently supported)

**Option 3**: Reimplement the flow using OpenQuery components without `OpenQueryApp`

```csharp
public async Task<string> GetAnswerAsync(string question, OpenQueryOptions options)
{
    var sb = new StringBuilder();
    var reporter = new StatusReporter(options.Verbose);

    // Replicate OpenQueryApp.RunAsync but collect output
    // ... (copy logic from OpenQuery.cs)

    return sb.ToString();
}
```

---

## Using Individual Services

### OpenRouterClient

```csharp
var client = new OpenRouterClient("your-api-key");

// Non-streaming chat completion
var request = new ChatCompletionRequest(
    model: "qwen/qwen3.5-flash-02-23",
    messages: new List<Message>
    {
        new Message("system", "You are a helpful assistant."),
        new Message("user", "What is 2+2?")
    }
);

var response = await client.CompleteAsync(request);
Console.WriteLine(response.Choices[0].Message.Content);

// Streaming chat completion
var streamRequest = request with { Stream = true };
await foreach (var chunk in client.StreamAsync(streamRequest))
{
    if (chunk.TextDelta != null)
        Console.Write(chunk.TextDelta);
}

// Embeddings
float[][] embeddings = await client.EmbedAsync(
    "openai/text-embedding-3-small",
    new List<string> { "text 1", "text 2" }
);
// embeddings[0] is the vector for "text 1"
```

### SearxngClient

```csharp
var searxng = new SearxngClient("http://localhost:8002");

List<SearxngResult> results = await searxng.SearchAsync("quantum physics", limit: 5);

foreach (var result in results)
{
    Console.WriteLine(result.Title);
    Console.WriteLine(result.Url);
    Console.WriteLine(result.Content);
    Console.WriteLine();
}
```

### EmbeddingService

```csharp
var client = new OpenRouterClient("your-api-key");
var embeddingService = new EmbeddingService(client); // default model: openai/text-embedding-3-small

// Single embedding
float[] embedding = await embeddingService.GetEmbeddingAsync("Hello world");

// Batch embeddings (with progress)
List<string> texts = new() { "text 1", "text 2", "text 3" };
float[][] embeddings = await embeddingService.GetEmbeddingsAsync(
    texts,
    onProgress: msg => Console.WriteLine(msg)
);

// Cosine similarity
float similarity = EmbeddingService.CosineSimilarity(embedding1, embedding2);
```

### ArticleService

```csharp
var article = await ArticleService.FetchArticleAsync("https://example.com/article");
Console.WriteLine(article.Title);
Console.WriteLine(article.TextContent);
Console.WriteLine($"Readable: {article.IsReadable}");
```

Note: the `Article` type comes from the SmartReader library (it is not OpenQuery-specific).

### ChunkingService

```csharp
List<string> chunks = ChunkingService.ChunkText("Long article text...");

foreach (var chunk in chunks)
{
    var preview = chunk.Length <= 50 ? chunk : chunk[..50];
    Console.WriteLine($"Chunk ({chunk.Length} chars): {preview}...");
}
```

### SearchTool (Orchestration)

```csharp
var searxngClient = new SearxngClient("http://localhost:8002");
var embeddingService = new EmbeddingService(openRouterClient);
var searchTool = new SearchTool(searxngClient, embeddingService);

string context = await searchTool.ExecuteAsync(
    originalQuery: "What is quantum entanglement?",
    generatedQueries: new List<string>
    {
        "quantum entanglement definition",
        "how quantum entanglement works"
    },
    maxResults: 5,
    topChunksLimit: 3,
    onProgress: msg => Console.WriteLine(msg),
    verbose: true
);

Console.WriteLine("Context:");
Console.WriteLine(context);
```

The output is a formatted string:

```
[Source 1: Title](https://example.com/1)
Content chunk...

[Source 2: Title](https://example.com/2)
Content chunk...
```

---

## Custom Implementations

### Custom Progress Reporter

`SearchTool.ExecuteAsync` accepts an `Action<string>? onProgress` callback. Provide your own:

```csharp
public class MyProgressReporter
{
    public void Report(string message)
    {
        // Log to file
        File.AppendAllText("log.txt", $"{DateTime.UtcNow}: {message}\n");

        // Update UI
        myLabel.Text = message;

        // Send to telemetry
        Telemetry.TrackEvent("OpenQueryProgress", new { message });
    }
}

// Usage
var reporter = new MyProgressReporter();
await searchTool.ExecuteAsync(..., reporter.Report, verbose: false);
```

### Custom Chunking Strategy

Extend `ChunkingService` or implement your own:

```csharp
public static class MyChunkingService
{
    public static List<string> ChunkText(string text, int maxSize = 500, int overlap = 50)
    {
        // Overlapping chunks for better context retrieval
        var chunks = new List<string>();
        int start = 0;
        while (start < text.Length)
        {
            int end = Math.Min(start + maxSize, text.Length);
            chunks.Add(text.Substring(start, end - start));
            start += maxSize - overlap; // Slide window
        }
        return chunks;
    }
}
```

### Custom Rate Limiter

Implement `IAsyncDisposable` with your own strategy (token bucket, leaky bucket):

```csharp
public class TokenBucketRateLimiter : IAsyncDisposable
{
    private readonly SemaphoreSlim _semaphore;
    private readonly TimeSpan _refillPeriod;
    private int _tokens;
    private readonly int _maxTokens;

    // Implementation details...

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken ct)
    {
        await WaitForTokenAsync(ct);
        try
        {
            return await action();
        }
        finally
        {
            // Return tokens or replenish bucket
        }
    }
}
```

---

## Thread Safety

**Thread-safe components**:
- `RateLimiter` - `SemaphoreSlim` is thread-safe
- `StatusReporter` - the underlying channel is thread-safe
- Static utility classes (`ChunkingService`) - no state

**Not thread-safe** (instances should not be shared across threads):
- `OpenRouterClient` - wraps `HttpClient` (which is thread-safe, but the instance may carry state)
- `SearxngClient` - wraps `HttpClient` (thread-safe, but the usual reuse recommendations apply)
- `EmbeddingService` - has mutable fields (`_rateLimiter`, `_retryPipeline`)
- `SearchTool` - has a mutable `_options` field

**Recommendation**: Create new instances per operation, or use locks if sharing.

### Example: Parallel Queries

```csharp
var tasks = questions.Select(async question =>
{
    // Separate instances per task
    var options = new OpenQueryOptions(..., Question: question);
    var query = new OpenQueryApp(client, searchTool, model);
    await query.RunAsync(options);
});

await Task.WhenAll(tasks);
```

**Better**: Create a factory that spawns fresh instances.

---

## Error Handling

All public async methods may throw:

- `HttpRequestException` - network errors, non-2xx responses
- `TaskCanceledException` - timeout or cancellation
- `JsonException` - malformed JSON
- `Argument*Exception` - invalid arguments
- `Exception` - any other error

### Pattern: Try-Catch

```csharp
try
{
    var response = await client.CompleteAsync(request);
    Console.WriteLine(response.Choices[0].Message.Content);
}
catch (HttpRequestException ex)
{
    Console.Error.WriteLine($"Network error: {ex.Message}");
}
catch (Exception ex)
{
    Console.Error.WriteLine($"Unexpected error: {ex.Message}");
}
```

### Pattern: Resilience with Polly

`EmbeddingService` already wraps `client.EmbedAsync` with a Polly retry policy. For other calls, you can add your own:

```csharp
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

await retryPolicy.ExecuteAsync(async () =>
{
    var response = await client.CompleteAsync(request);
    // ...
});
```

---

## Advanced Usage

### Streaming Responses to a Network Stream

```csharp
var request = new ChatCompletionRequest(model, messages) { Stream = true };

await foreach (var chunk in client.StreamAsync(request))
{
    if (chunk.TextDelta != null)
    {
        await networkStream.WriteAsync(Encoding.UTF8.GetBytes(chunk.TextDelta));
    }
}
```

### Parallel Embedding Batches with Progress

```csharp
var texts = Enumerable.Range(0, 1000).Select(i => $"Text {i}").ToList();

await embeddingService.GetEmbeddingsAsync(texts,
    onProgress: progress =>
    {
        Console.WriteLine(progress); // e.g. "[Generating embeddings: batch 2/4]"
    });
```

### Custom Embedding Service with a Different Model

```csharp
var client = new OpenRouterClient(apiKey);
var customService = new EmbeddingService(client, "your-embedding-model");

float[] embedding = await customService.GetEmbeddingAsync("text");
```

---

## Limitations

### No Interface-based Design

OpenQuery uses concrete classes. For mocking in tests, you would need to create wrappers, or use tools like JustMock/Moq workarounds that can mock non-virtual methods (not recommended). Better: define interfaces such as `IOpenRouterClient` and have the concrete classes implement them.
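
A minimal sketch of that seam, assuming an `EmbedAsync(model, input)` shape like the one used elsewhere in this document (the interface and the fake implementation are illustrative, not part of the current source):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical testing seam; the real OpenRouterClient signatures may differ.
public interface IOpenRouterClient
{
    Task<float[][]> EmbedAsync(string model, List<string> input);
}

// Test double returning canned vectors, so embedding-dependent logic
// can be exercised without network access.
public sealed class FakeOpenRouterClient : IOpenRouterClient
{
    public Task<float[][]> EmbedAsync(string model, List<string> input)
    {
        var vectors = new float[input.Count][];
        for (int i = 0; i < input.Count; i++)
            vectors[i] = new float[] { i, i + 1f };
        return Task.FromResult(vectors);
    }
}

public static class Demo
{
    public static async Task Main()
    {
        IOpenRouterClient client = new FakeOpenRouterClient();
        float[][] vecs = await client.EmbedAsync("any-model", new List<string> { "a", "b" });
        Console.WriteLine(vecs.Length); // 2
    }
}
```

The concrete `OpenRouterClient` would then declare `: IOpenRouterClient`, and consumers such as `EmbeddingService` would accept the interface instead of the class.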

### Hardcoded Concurrency Settings

`ParallelProcessingOptions` is instantiated inside `SearchTool` with hardcoded defaults. To customize it, you would need to:

1. Subclass `SearchTool` and expose `_options` (requires source changes anyway)
2. Modify the source so the constructor accepts a `ParallelProcessingOptions`
3. Use reflection (hacky)

Suggested improvement: add a constructor parameter.
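
A sketch of that suggested change, using stand-in types to show the shape (the real `ParallelProcessingOptions` members may differ):

```csharp
using System;

// Stand-in for the real options type, just to show the shape of the change.
public record ParallelProcessingOptions(
    int MaxConcurrentArticleFetches = 10,
    int MaxConcurrentEmbeddingBatches = 4);

public sealed class SearchToolSketch
{
    private readonly ParallelProcessingOptions _options;

    // An optional parameter keeps existing call sites compiling unchanged.
    public SearchToolSketch(ParallelProcessingOptions? options = null)
        => _options = options ?? new ParallelProcessingOptions();

    public int MaxFetches => _options.MaxConcurrentArticleFetches;
}

public static class Demo
{
    public static void Main()
    {
        Console.WriteLine(new SearchToolSketch().MaxFetches);                                 // 10
        Console.WriteLine(new SearchToolSketch(new ParallelProcessingOptions(3)).MaxFetches); // 3
    }
}
```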

### Single Responsibility Blur

`OpenQueryApp` handles query generation, pipeline execution, and streaming. It could be split into:
- `IQueryGenerator` (for expanding queries)
- `IPipelineExecutor` (for the search tool)
- `IAnswerStreamer` (for final LLM streaming)

Currently, `OpenQueryApp` acts as the facade.

---

## Next Steps

- **[Components](../components/overview.md)** - Understand the architecture
- **[CLI Reference](../api/cli.md)** - The CLI that uses these APIs
- **[Source Code](../)** - Read the implementation details

---

**Code Snippet: Full Programmatic Flow**

```csharp
using OpenQuery.Services;
using OpenQuery.Tools;
using OpenQuery.Models;

async Task<string> Research(string question)
{
    var apiKey = GetApiKey(); // your method
    var client = new OpenRouterClient(apiKey);
    var searxng = new SearxngClient("http://localhost:8002");
    var embeddings = new EmbeddingService(client);
    var search = new SearchTool(searxng, embeddings);
    var app = new OpenQueryApp(client, search, "qwen/qwen3.5-flash-02-23");

    var options = new OpenQueryOptions(
        Chunks: 3,
        Results: 5,
        Queries: 3,
        Short: false,
        Long: false,
        Verbose: false,
        Question: question
    );

    // Capture output by redirecting Console or modifying OpenQueryApp
    await app.RunAsync(options);
    return "streamed to console"; // would need custom capture
}
```

682	docs/architecture.md	Normal file

@@ -0,0 +1,682 @@
# Architecture

Deep dive into OpenQuery's system design, architectural patterns, and data flow.

## 📋 Table of Contents

1. [System Overview](#system-overview)
2. [Architectural Patterns](#architectural-patterns)
3. [Component Architecture](#component-architecture)
4. [Data Flow](#data-flow)
5. [Concurrency Model](#concurrency-model)
6. [Error Handling & Resilience](#error-handling--resilience)
7. [Performance Considerations](#performance-considerations)
8. [Design Decisions](#design-decisions)
## System Overview

OpenQuery is a **pipeline-based AI application** that orchestrates multiple external services (OpenRouter, SearxNG) to answer user questions with web-sourced, semantically ranked content.

### Core Design Principles

1. **Separation of Concerns** - Each component has a single, well-defined responsibility
2. **Parallel First** - Wherever possible, operations are parallelized for speed
3. **Resilient by Default** - Built-in retries, rate limiting, and graceful degradation
4. **Configurable** - Most parameters can be adjusted without code changes
5. **Observable** - Progress reporting and verbose mode for debugging

### High-Level Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                          User Layer                          │
│           CLI (System.CommandLine) → OpenQueryApp            │
└─────────────────────────────┬───────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                     Orchestration Layer                      │
│        OpenQueryApp → SearchTool (4-phase pipeline)          │
└─────────────────────────────┬───────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
        ▼                     ▼                     ▼
┌──────────────┐     ┌──────────────────┐   ┌────────────────┐
│ Search Layer │     │ Processing Layer │   │    AI Layer    │
│              │     │                  │   │                │
│ SearxngClient│     │ ArticleService   │   │OpenRouterClient│
│              │     │ ChunkingService  │   │                │
│              │     │ EmbeddingService │   │                │
└──────────────┘     └──────────────────┘   └────────────────┘
```

## Architectural Patterns

### 1. Pipeline Pattern

The main workflow (`SearchTool.ExecuteAsync`) implements a multi-stage pipeline:

```
Phase 1: ExecuteParallelSearchesAsync
    ↓ (List<SearxngResult>)
Phase 2: ExecuteParallelArticleFetchingAsync
    ↓ (List<Chunk>)
Phase 3: ExecuteParallelEmbeddingsAsync
    ↓ ((queryEmbedding, chunkEmbeddings))
Phase 4: RankAndSelectTopChunks
    ↓ (List<Chunk> topChunks)
→ Formatted context string returned
```

Each phase:
- Accepts input from the previous phase
- Processes in parallel where applicable
- Returns output to the next phase
- Reports progress via callbacks
### 2. Service Layer Pattern

Services (in the `Services/` directory) are stateless classes that encapsulate specific operations:

- **Clients**: `OpenRouterClient`, `SearxngClient` (HTTP communication)
- **Processors**: `EmbeddingService`, `ChunkingService` (data transformation)
- **Extractors**: `ArticleService` (content extraction)
- **Infrastructure**: `RateLimiter`, `StatusReporter` (cross-cutting concerns)

All dependencies are explicit (constructor injection), making services easy to test.

### 3. Dependency Injection (Manual)

While it does not use a DI container, OpenQuery follows DI principles:

```csharp
// Program.cs: instantiate dependencies with explicit parameters
var client = new OpenRouterClient(apiKey);
var searxngClient = new SearxngClient(searxngUrl);
var embeddingService = new EmbeddingService(client);
var searchTool = new SearchTool(searxngClient, embeddingService);
var openQuery = new OpenQueryApp(client, searchTool, model);
```

Benefits:
- Clear dependency graph
- Easy to substitute mocks for testing
- No magic; construction is visible
### 4. Observer Pattern (Progress Reporting)

`StatusReporter` and progress callbacks implement the observer pattern:

```csharp
// SearchTool receives a progress callback
public Task<string> ExecuteAsync(..., Action<string>? onProgress = null, ...)

// Components invoke the callback at key milestones
onProgress?.Invoke($"[Fetching article {current}/{total}: {domain}]");

// The caller (OpenQueryApp) passes the reporter's update method as the callback
_searchTool.ExecuteAsync(..., (progress) => reporter.WriteLine(progress), ...);
```

### 5. Resilience Patterns (Polly)

`EmbeddingService` uses Polly's retry policy:

```csharp
_retryPipeline = new ResiliencePipelineBuilder()
    .AddRetry(new RetryStrategyOptions
    {
        MaxRetryAttempts = 3,
        Delay = TimeSpan.FromSeconds(1),
        BackoffType = DelayBackoffType.Exponential,
        ShouldHandle = new PredicateBuilder()
            .Handle<HttpRequestException>()
    })
    .Build();
```

This automatically retries failed embedding requests with exponential backoff.
### 6. Producer-Consumer Pattern (Channel-based)

`StatusReporter` uses `System.Threading.Channels.Channel<string>` for asynchronous progress updates:

- Producer: `UpdateStatus()` writes messages to the channel
- Consumer: a background task, `ProcessStatusUpdatesAsync()`, reads and displays them
- Benefit: progress generation never blocks on display
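
The same mechanism in miniature, using only `System.Threading.Channels` (a sketch of the pattern, not `StatusReporter` itself):

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelDemo
{
    public static async Task Main()
    {
        var channel = Channel.CreateUnbounded<string>();

        // Consumer: a background task drains the channel (StatusReporter's role).
        var consumer = Task.Run(async () =>
        {
            await foreach (var msg in channel.Reader.ReadAllAsync())
                Console.WriteLine(msg);
        });

        // Producers: any thread can write without blocking the pipeline.
        for (int i = 1; i <= 3; i++)
            await channel.Writer.WriteAsync($"[step {i}/3]");

        channel.Writer.Complete(); // signal that no more updates will arrive
        await consumer;            // wait for the consumer to drain the channel
    }
}
```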

### 7. Disposable Pattern

Components that hold unmanaged resources implement `IDisposable` or `IAsyncDisposable`:

- `StatusReporter` - stops the background spinner task
- `RateLimiter` - disposes its semaphore

Both are used via `using` statements for deterministic cleanup.
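
The cleanup contract can be sketched with a minimal stand-in (illustrative only; this is not the actual `RateLimiter` source):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Minimal stand-in showing the IAsyncDisposable cleanup pattern.
public sealed class DemoLimiter : IAsyncDisposable
{
    private readonly SemaphoreSlim _semaphore = new(1);

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action)
    {
        await _semaphore.WaitAsync();
        try { return await action(); }
        finally { _semaphore.Release(); }
    }

    public ValueTask DisposeAsync()
    {
        _semaphore.Dispose();
        return ValueTask.CompletedTask;
    }
}

public static class Demo
{
    public static async Task Main()
    {
        // `await using` guarantees DisposeAsync runs even if the body throws.
        await using var limiter = new DemoLimiter();
        int result = await limiter.ExecuteAsync(() => Task.FromResult(42));
        Console.WriteLine(result); // 42
    }
}
```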

## Component Architecture

### OpenQueryApp (OpenQuery.cs)

**Role**: Main orchestrator; coordinates the entire workflow

**Responsibilities**:
- Parse CLI options into `OpenQueryOptions`
- Load configuration and resolve API keys/models
- Optionally generate expanded search queries via the LLM
- Invoke `SearchTool` with progress callbacks
- Stream the final answer from the LLM

**Key Methods**:
- `RunAsync(OpenQueryOptions)` - main entry point

**Interactions**:
- Instantiates `OpenRouterClient` (for both query generation and the final answer)
- Instantiates `SearxngClient` (passed to `SearchTool`)
- Instantiates `EmbeddingService` (passed to `SearchTool`)
- Instantiates `SearchTool` (orchestration)
- Uses `StatusReporter` for UI updates
### SearchTool (Tools/SearchTool.cs)

**Role**: Core search-retrieve-rank pipeline orchestrator

**Responsibilities**:
- Execute the 4-phase pipeline (search → fetch → embed → rank)
- Manage concurrency limits (via semaphores)
- Coordinate parallel operations
- Generate the context string for the final answer

**Interactions**:
- Uses `SearxngClient` for Phase 1
- Uses `ArticleService` + `ChunkingService` for Phase 2
- Uses `EmbeddingService` for Phase 3
- Has no external UI dependency (pure logic)

**Parallelization Strategy**:
- **Phase 1**: `Task.WhenAll` on search tasks (unbounded, but limited in practice by the SearxNG instance)
- **Phase 2**: Semaphore (max 10 concurrent fetches)
- **Phase 3**: `Parallel.ForEachAsync` (max 4 concurrent embedding batches)
### EmbeddingService (Services/EmbeddingService.cs)

**Role**: Generate vector embeddings with batching, rate limiting, and retries

**Responsibilities**:
- Batch embedding requests (default: 300 per batch)
- Parallelize batches (default: 4 concurrent)
- Apply rate limiting (via `RateLimiter`)
- Retry failed requests (Polly)
- Calculate cosine similarity

**Key Methods**:
- `GetEmbeddingsAsync(List<string> texts, ...)` - batch with progress
- `GetEmbeddingAsync(string text)` - single embedding
- `CosineSimilarity(float[], float[])` - static vector math

**Design Notes**:
- Rate limiting is crucial to avoid overwhelming OpenRouter's embedding endpoint
- Batches of 300 reduce per-request API overhead
- Polly retry handles transient failures (429s, 500s, network blips)
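
The similarity math itself is plain vector algebra; a self-contained sketch (the actual `EmbeddingService.CosineSimilarity` implementation may differ in details):

```csharp
using System;

public static class VectorMath
{
    // cos(θ) = (a · b) / (|a| · |b|); 1 = same direction, 0 = orthogonal.
    public static float CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
    }
}

public static class Demo
{
    public static void Main()
    {
        Console.WriteLine(VectorMath.CosineSimilarity(
            new float[] { 1, 0 }, new float[] { 1, 0 })); // 1
        Console.WriteLine(VectorMath.CosineSimilarity(
            new float[] { 1, 0 }, new float[] { 0, 1 })); // 0
    }
}
```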

### OpenRouterClient (Services/OpenRouterClient.cs)

**Role**: HTTP client for the OpenRouter API (completions + embeddings)

**Responsibilities**:
- Serialize requests to JSON (source-generated)
- Send HTTP requests with the authorization header
- Stream responses for chat completions (`IAsyncEnumerable`)
- Return full responses for non-streaming calls
- Throw on non-2xx status codes

**Endpoints**:
- POST `/chat/completions` (streaming and non-streaming)
- POST `/embeddings`

**Configuration**:
- Base URL: `https://openrouter.ai/api/v1`
- Headers: `Authorization: Bearer {apiKey}`, `Accept: application/json`

**Design**:
- Low-level client; no retry logic (retries live in `EmbeddingService`)
- Thin wrapper around `HttpClient`
- Could be replaced with `HttpClientFactory` in larger apps

### SearxngClient (Services/SearxngClient.cs)

**Role**: HTTP client for SearxNG metasearch

**Responsibilities**:
- Construct the search URL with the query parameter
- Issue the GET request and deserialize the JSON response
- Limit results (`.Take(limit)`)
- Return an empty list on failure (no exceptions)

**Endpoint**: `GET /search?q={query}&format=json`

**Design**:
- Very simple; no retry (failures are acceptable, since OpenQuery continues with the other queries)
- `DistinctBy(r => r.Url)` deduplication happens upstream

### ArticleService (Services/ArticleService.cs)

**Role**: Extract clean article content from URLs

**Responsibilities**:
- Call `SmartReader.ParseArticleAsync(url)`
- Return an `Article` object with `Title`, `TextContent`, and `IsReadable`

**Design**:
- Single responsibility: extraction only (no fetching, no chunking)
- SmartReader handles the complexity (HTML parsing, boilerplate removal)
- Exceptions propagate to `SearchTool`, which handles them

### ChunkingService (Services/ChunkingService.cs)

**Role**: Split long text into ~500-character chunks at natural boundaries

**Algorithm**:
1. Start at index 0
2. Take up to 500 characters
3. If not at the end of the text, backtrack to the last space, newline, or period
4. Add the chunk and advance the start index
5. Repeat until done
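
The steps above can be sketched as follows (illustrative; this is not the actual `ChunkingService` source):

```csharp
using System;
using System.Collections.Generic;

public static class ChunkerSketch
{
    // Boundary-aware chunking: take up to maxSize chars, then backtrack
    // to the last natural boundary inside the window.
    public static List<string> ChunkText(string text, int maxSize = 500)
    {
        var chunks = new List<string>();
        int start = 0;
        while (start < text.Length)
        {
            int end = Math.Min(start + maxSize, text.Length);
            if (end < text.Length)
            {
                // Search backward from end-1 through start for a boundary char.
                int boundary = text.LastIndexOfAny(
                    new[] { ' ', '\n', '.' }, end - 1, end - start);
                if (boundary > start)
                    end = boundary + 1; // keep the boundary char in this chunk
            }
            chunks.Add(text.Substring(start, end - start));
            start = end;
        }
        return chunks;
    }
}

public static class Demo
{
    public static void Main()
    {
        var chunks = ChunkerSketch.ChunkText("hello world foo", maxSize: 7);
        Console.WriteLine(string.Join("|", chunks)); // hello |world |foo
    }
}
```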

**Design**:
- Static class (stateless utility)
- No dependencies
- Pure function (input text → output chunks)

### RateLimiter (Services/RateLimiter.cs)

**Role**: Limit concurrent operations via a semaphore

**Responsibilities**:
- Wrap actions with semaphore acquisition
- Support both sync and async actions
- Implement `IAsyncDisposable` for cleanup

**Usage Pattern**:

```csharp
await _rateLimiter.ExecuteAsync(async () =>
{
    // operation limited by the semaphore
    return await SomeApiCall();
});
```

**Design**:
- Generic wrapper (can return a `TResult`)
- `SemaphoreSlim` initialized at construction
- Used in `EmbeddingService` for parallel embedding batches

### StatusReporter (Services/StatusReporter.cs)

**Role**: Show real-time progress with a spinner, or verbose line-by-line output

**Responsibilities**:
- Maintain the spinner animation (background task)
- Receive status updates via a channel
- Display updates with appropriate formatting
- Stop the spinner on completion

**Features**:
- **Spinner mode** (non-verbose): `⠋ Fetching...` with animated Braille characters
- **Verbose mode**: `[Fetching article 1/10: example.com]` on separate lines
- **Thread-safe**: the channel is safe for concurrent writes
- **Non-blocking**: the background spinner does not block updates

**Design**:
- `Channel<string>` for an asynchronous producer-consumer queue
- A background task (`_statusProcessor`) reads from the channel
- The spinner runs on its own task with a 100 ms delay per frame
- `IDisposable` ensures proper cleanup

### ConfigManager (ConfigManager.cs)

**Role**: Load and save configuration from/to a file

**Responsibilities**:
- Resolve the config path (XDG: `~/.config/openquery/config`)
- Parse key-value pairs (manual parsing, no INI library)
- Provide an `AppConfig` object with defaults
- Save settings back to the file

**Design**:
- Static class (no instances)
- Creates the config directory if missing
- Line-by-line parsing (simple, no dependencies)
- Could be improved with a proper INI parser or JSON
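
The line-by-line parsing can be sketched like this (the `key = value` shape and comment handling are assumptions; the real `ConfigManager` may differ):

```csharp
using System;
using System.Collections.Generic;

public static class ConfigParserSketch
{
    // Minimal key=value parser: skips blanks, comments, and malformed lines.
    public static Dictionary<string, string> Parse(string[] lines)
    {
        var config = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line.Length == 0 || line.StartsWith("#"))
                continue; // skip blanks and comments

            int eq = line.IndexOf('=');
            if (eq <= 0)
                continue; // no separator or empty key; ignore the line

            config[line[..eq].Trim()] = line[(eq + 1)..].Trim();
        }
        return config;
    }
}

public static class Demo
{
    public static void Main()
    {
        var cfg = ConfigParserSketch.Parse(new[]
        {
            "# OpenQuery config",
            "model = qwen/qwen3.5-flash-02-23",
            "searxng_url = http://localhost:8002"
        });
        Console.WriteLine(cfg["model"]); // qwen/qwen3.5-flash-02-23
    }
}
```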

## Data Flow

### End-to-End Data Flow

```
User: "What is quantum entanglement?"

1. OpenQueryOptions created
   { Question = "...", Queries = 3, Results = 5, Chunks = 3, ... }

2. Query Generation (if Queries > 1)
   → ChatCompletionRequest to OpenRouter (system prompt asking for JSON queries)
   → Deserialize to List<string> (generatedQueries)

3. Search Phase
   generatedQueries → parallel SearxngClient.SearchAsync → ConcurrentBag<SearxngResult>
   → DistinctBy(Url) → List<SearxngResult> (up to 15 results = 3 queries × 5 results)

4. Fetch Phase
   searchResults → parallel ArticleService.FetchArticleAsync → Article
   → ChunkingService.ChunkText (split into ~500-char pieces)
   → ConcurrentBag<Chunk> (typically 50-100 chunks from 15 articles)

5. Embedding Phase
   originalQuery → EmbeddingService.GetEmbeddingAsync → float[] (queryEmbedding)
   chunk.Contents → EmbeddingService.GetEmbeddingsAsync → float[][] (chunkEmbeddings)

6. Ranking Phase
   For each Chunk: Score = CosineSimilarity(queryEmbedding, chunkEmbedding)
   OrderByDescending(Score).Take(3) → topChunks (final 3 chunks)

7. Answer Phase
   context = string.Join("\n\n", topChunks.Select(...))
   → ChatCompletionRequest to OpenRouter with context + question
   → StreamAsync → Console.Write(delta) (real-time display)

Result: User sees the answer with [Source N] citations
```
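
The ranking step (phase 6) is plain LINQ over scored chunks; a self-contained sketch of the select-top-N logic:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record ScoredChunk(string Content, float Score);

public static class RankingSketch
{
    // Phase 6 in miniature: attach scores, sort descending, keep the top N.
    public static List<ScoredChunk> TopChunks(
        IEnumerable<(string Content, float Similarity)> scored, int limit) =>
        scored.Select(s => new ScoredChunk(s.Content, s.Similarity))
              .OrderByDescending(c => c.Score)
              .Take(limit)
              .ToList();
}

public static class Demo
{
    public static void Main()
    {
        var top = RankingSketch.TopChunks(new[]
        {
            ("a", 0.42f), ("b", 0.91f), ("c", 0.63f), ("d", 0.10f)
        }, limit: 2);
        Console.WriteLine(string.Join(",", top.Select(c => c.Content))); // b,c
    }
}
```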

### Data Structures

**Chunk** - the core data structure flowing through the pipeline:

```csharp
public record Chunk(
    string Content,        // Text content (~500 chars)
    string SourceUrl,      // Where it came from
    string? Title = null   // Article title
)
{
    public float[]? Embedding { get; set; } // Added in Phase 3
    public float Score { get; set; }        // Added in Phase 4
}
```

**Data Flow State**:
- Phases 1-2: `Chunk` without an embedding
- Phase 3: `Chunk.Embedding` populated
- Phase 4: `Chunk.Score` populated
- Phase 5: serialized into the context string
|
||||||
|
|
||||||
|
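The ranking step is plain vector math: score every chunk against the query embedding, then keep the top few. A minimal sketch of the same scoring and selection logic, written in Python for brevity (the project itself is C#; names here are illustrative):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); returns 0.0 for zero vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_chunks(query_embedding, chunks, top_k=3):
    # chunks: list of (content, embedding) pairs; returns top_k (score, content) pairs
    scored = [(cosine_similarity(query_embedding, emb), text) for text, emb in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

This is the brute-force variant: O(n) similarity computations per query, which is fine at the 50-100 chunk scale described below.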
### Memory Footprint

**Per 15-article run (approximate)**:

- Raw HTML (fetched): ~5MB (transient, discarded after extraction)
- Articles: ~500KB (15 articles × ~30KB extracted text)
- Chunks: ~50-100 items × 500 chars ≈ 25-50KB text
- Embeddings: ~50-100 × 1536 floats × 4 bytes ≈ 300-600KB
- Total peak: ~1-2MB (excluding OpenRouter's memory usage)

**Note**: AOT compilation reduces runtime memory compared to JIT.
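The embedding figure above is straightforward arithmetic; a quick check (in Python, for illustration):

```python
def embedding_bytes(n_chunks, dims=1536, bytes_per_float=4):
    # Total bytes held by n_chunks float32 embeddings of the given dimension
    return n_chunks * dims * bytes_per_float

low = embedding_bytes(50)    # ~300 KB
high = embedding_bytes(100)  # ~600 KB
```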
## Concurrency Model

OpenQuery uses multiple parallelization strategies:

### Unbounded Parallelism (Task.WhenAll)

**Where**: Search queries (Phase 1)

```csharp
var searchTasks = generatedQueries.Select(async query => { ... });
await Task.WhenAll(searchTasks);
```

**Rationale**: SearxNG can handle concurrent queries; there is no need to limit them (it's a local, single-user tool), and SearxNG itself may throttle internally.

**Risk**: Could overwhelm SearxNG if `--queries` is set very high (100+). The default of 3 is safe.
### Semaphore-Controlled Parallelism

**Where**: Article fetching (Phase 2)

```csharp
var semaphore = new SemaphoreSlim(_options.MaxConcurrentArticleFetches); // 10
await Task.WhenAll(fetchTasks); // Each task waits on the semaphore
```

**Rationale**: Prevents flooding target websites with requests (DoS-like behavior). 10 concurrent fetches is polite but fast.

**Configurable**: Yes, via `ParallelProcessingOptions.MaxConcurrentArticleFetches` (currently a compile-time constant).
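The semaphore-plus-fan-out pattern is the same in any async runtime. An illustrative Python sketch of the equivalent bounded fan-out (the real code is C# with `SemaphoreSlim`; names and the sleep stand-in are assumptions):

```python
import asyncio

async def fetch_all(urls, max_concurrent=10):
    # Bound the fan-out with a semaphore, mirroring SemaphoreSlim + Task.WhenAll
    sem = asyncio.Semaphore(max_concurrent)

    async def fetch_one(url):
        async with sem:
            await asyncio.sleep(0)  # stand-in for the real HTTP fetch
            return f"article:{url}"

    # gather preserves input order, like awaiting the task list in order
    return await asyncio.gather(*(fetch_one(u) for u in urls))
```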
### Parallel.ForEachAsync with MaxDegreeOfParallelism

**Where**: Embedding batch processing (Phase 3)

```csharp
await Parallel.ForEachAsync(
    batchIndices,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    async (batchIndex, ct) => { ... }
);
```

**Rationale**: Limits API concurrency to respect OpenRouter rate limits. 4 concurrent embedding requests is a safe default.

**Configurable**: Yes, via `ParallelProcessingOptions.MaxConcurrentEmbeddingRequests` (compile-time).
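Before the bounded loop runs, the chunk texts are split into API-sized batches (300 per request, per the defaults below). The batching itself is a one-liner; sketched in Python for illustration:

```python
def make_batches(items, batch_size=300):
    # Split the chunk texts into API-sized batches, preserving order
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```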
### Progress Reporting (Channel)

**Where**: All phases pass an `onProgress` callback

**Implementation**:

- `StatusReporter.UpdateStatus()` → writes to a channel
- A background task reads the channel and displays updates
- Non-blocking; callbacks are fire-and-forget (`TryWrite`)

**Thread Safety**: The channel is thread-safe; multiple phases may write concurrently.
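The producer/consumer shape of the reporter can be sketched with a thread-safe queue; an illustrative Python analogue (the real implementation is a .NET `Channel`; this class and its method names are assumptions):

```python
import queue

class StatusReporter:
    # Producers call update_status (non-blocking); a consumer drains the queue.
    def __init__(self):
        self._queue = queue.SimpleQueue()

    def update_status(self, message):
        self._queue.put_nowait(message)  # fire-and-forget, like channel TryWrite

    def drain(self):
        # Consumer side: pull everything currently queued
        out = []
        while not self._queue.empty():
            out.append(self._queue.get_nowait())
        return out
```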
## Error Handling & Resilience

### HTTP Errors

**OpenRouterClient**:

- Calls `response.EnsureSuccessStatusCode()` → throws `HttpRequestException` on 4xx/5xx
- No retry (handled at a higher level in `EmbeddingService`)

**SearxngClient**:

- Returns an empty `List<SearxngResult>` on non-success
- No exception thrown (searches are non-critical; if some queries fail, others proceed)
### Retry Policy (Polly)

**Location**: `EmbeddingService` constructor

**Scope**: Only embedding requests (`_client.EmbedAsync`)

**Policy**:

- Max 3 attempts
- Exponential backoff: 1s, 2s, 4s
- Only retries `HttpRequestException` (network errors, 429, 5xx)

**Why not on chat completions?**

- Query generation and the final answer are critical; failures should surface immediately
- Retries could be added in the future if transient failures turn out to be common
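The policy above (3 attempts, 1s/2s/4s backoff, retry only network-level errors) is a standard retry loop. A minimal Python sketch of the same idea, not the Polly API itself (the injectable `sleep` is an assumption, used here so the backoff can be observed without waiting):

```python
import time

class NetworkError(Exception):
    pass

def with_retry(operation, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    # Retry only NetworkError, with exponential backoff: 1s, 2s, 4s, ...
    for attempt in range(max_attempts):
        try:
            return operation()
        except NetworkError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure
            sleep(base_delay * (2 ** attempt))
```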
### Graceful Degradation

**Query Generation Failure**:

```csharp
try { ... generate queries ... }
catch (Exception ex)
{
    // Fall back to the original question as the sole query
    if (options.Verbose) reporter.WriteLine($"[Failed to generate queries, falling back to original question]");
}
```

**Embedding Batch Failure**:

```csharp
catch
{
    // Return empty embeddings for this batch (chunks will be filtered out)
    var emptyBatch = new float[batch.Count][];
    // fill with empty arrays
    results.Add((batchIndex, emptyBatch));
}
```

**Article Fetch Failure**:

```csharp
try { await ArticleService.FetchArticleAsync(url); }
catch (Exception ex)
{
    if (verbose) Console.WriteLine($"Warning: Failed to fetch article {url}: {ex.Message}");
    // Chunk not added; continue with the others
}
```
### User-Facing Errors

Top-level exception handler in `Program.cs`:

```csharp
try { await openQuery.RunAsync(options); }
catch (HttpRequestException ex)
{
    Console.Error.WriteLine($"\n[Error] Network request failed. Details: {ex.Message}");
    Environment.Exit(1);
}
catch (Exception ex)
{
    Console.Error.WriteLine($"\n[Error] An unexpected error occurred: {ex.Message}");
    Environment.Exit(1);
}
```
### Cancellation Support

`OpenRouterClient.StreamAsync` and the `EmbeddingService` methods accept a `CancellationToken`.

Used in:

- Streaming the answer (Ctrl+C stops it immediately)
- Parallel embeddings (can be cancelled)
## Performance Considerations

### Latency Breakdown (Typical)

| Stage | Time | Description |
|-------|------|-------------|
| Query generation | 2-5s | LLM generates 3-5 queries |
| Searches | 3-8s | 3-5 parallel SearxNG queries |
| Article fetching | 5-15s | 10-20 parallel fetches (network + parse) |
| Embeddings | 2-4s | 50-100 chunks in 4-parallel batches |
| Final answer | 5-20s | Depends on answer length (streaming) |
| **Total** | **15-50s** | Varies widely based on network & content |
### Bottlenecks

1. **Network I/O** (article fetching, API calls) - dominated by the network, not the CPU
2. **OpenRouter API latency** - varies by model and load
3. **SmartReader parsing** - CPU-bound for large HTML
4. **Embedding API rate** - OpenRouter may rate-limit if too many requests run concurrently
### Optimization Strategies

- **Parallelism**: Already maximized within API constraints
- **Caching**: Not implemented; a future enhancement could cache embeddings per URL
- **Batching**: 300-chunk batches reduce API overhead
- **AOT**: Native compilation reduces startup overhead vs JIT
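The per-URL embedding cache mentioned above is a simple memoization layer: look the key up first, call the API only on a miss. A hypothetical sketch in Python (this cache does not exist in the codebase; `EmbeddingCache` and its methods are invented for illustration):

```python
class EmbeddingCache:
    # In-memory per-key cache: compute each embedding once, reuse afterwards.
    def __init__(self, embed_fn):
        self._embed_fn = embed_fn  # the expensive API call
        self._cache = {}

    def get(self, key, text):
        if key not in self._cache:
            self._cache[key] = self._embed_fn(text)
        return self._cache[key]
```

A real version would also bound the cache size (e.g. LRU eviction) and persist across runs to save API cost on repeated questions.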
### Scalability Limits

- **Memory**: Scales with the number of chunks. 100 chunks × 1536 floats × 4 bytes ≈ 600KB. Thousands of chunks are fine, but similarity scoring is O(n) in the number of chunks.
- **API Limits**: OpenRouter rate-limits per API key; concurrency may need to be reduced if you hit 429s.
- **SearxNG Limits**: A single SearxNG instance can handle ~10-50 QPS; above that, load balancing would be needed (not in scope).
## Design Decisions

### Why Not Use a DI Container?

OpenQuery manually wires dependencies in `Program.cs`. For a small CLI app, this is:

- Simpler (no container configuration)
- More explicit (easy to trace the dependency graph)
- Free of runtime overhead
- Easier for contributors to understand

Container-based DI would be overengineering.
### Why AOT?

.NET 10 AOT provides:

- **Fast startup** (<100ms vs ~500ms JIT)
- **Smaller footprint** (trimmed, no JIT)
- **No runtime dependencies** (self-contained)
- Better for CLI tools distributed to users

Trade-offs:

- Longer build time
- Some reflection-based APIs not supported (not needed here)
- Less flexible (can't load dynamic assemblies, but not needed)
### Why SmartReader for Article Extraction?

SmartReader uses a Readability-based algorithm similar to Firefox Reader View:

- Removes ads, navigation, comments, and boilerplate
- Extracts the main article content
- Handles malformed HTML gracefully
- Zero dependencies (pure .NET)

Alternatives considered:

- `HtmlAgilityPack` (too low-level; extraction logic would have to be implemented by hand)
- `AngleSharp` (similar; still needs extraction logic)
- External services (like Diffbot) - require API keys and cost money

SmartReader is the sweet spot: free, good quality, easy integration.
### Why Embeddings + Cosine Similarity vs Full-Text Search?

Full-text search (like Lucene) would:

- Require an inverted index (more complex)
- Be faster for exact keyword matching
- Not understand semantic similarity

Embeddings provide:

- Semantic similarity (understanding meaning, not just keywords)
- Simple math (cosine similarity of float arrays)
- No index to maintain (embeddings are computed on the fly)

Trade-off: embedding API cost and latency, which an LRU cache could mitigate.
### Why Not a RAG (Retrieval-Augmented Generation) Framework?

OpenQuery is essentially a lightweight custom RAG system. Using a full framework (like LangChain) would:

- Add dependency bloat
- Reduce control
- Increase abstraction complexity

The custom implementation is ~1000 LOC and matches the project's needs exactly.
### Why System.CommandLine?

Provides:

- Native-like CLI help (`openquery --help`)
- Strongly-typed options
- Command hierarchy (main + subcommands)
- Good error messages

Alternatives: `CommandLineParser` (older) or manual parsing. System.CommandLine is modern and actively developed.
---

## Next Steps

- [Components](components/overview.md) - Deep dive into each module
- [API Reference](api/cli.md) - Complete command documentation
- [Troubleshooting](troubleshooting.md) - Debug issues

---

**Want to extend OpenQuery?** Check the [Components](components/overview.md) guide to understand each piece.
528
docs/components/models.md
Normal file
@@ -0,0 +1,528 @@
# Models Reference

Complete reference for all data models, DTOs, and records in OpenQuery.

## 📋 Table of Contents

1. [Core Data Models](#core-data-models)
2. [OpenRouter API Models](#openrouter-api-models)
3. [SearxNG API Models](#searxng-api-models)
4. [JSON Serialization](#json-serialization)
5. [Model Relationships](#model-relationships)
## Core Data Models

### OpenQueryOptions

**Location**: `Models/OpenQueryOptions.cs`
**Type**: `record`
**Purpose**: Immutable options object for a single query execution

```csharp
public record OpenQueryOptions(
    int Chunks,     // Number of top chunks to include in the context
    int Results,    // Search results per generated query
    int Queries,    // Number of search queries to generate (if >1)
    bool Short,     // Request a concise answer
    bool Long,      // Request a detailed answer
    bool Verbose,   // Enable verbose logging
    string Question // Original user question (required)
);
```

**Lifecycle**:

- Created in `Program.cs` by combining CLI options, config defaults, and environment variables
- Passed to `OpenQueryApp.RunAsync(options)`

**Validation**: None (assumes valid values from the CLI parser/config)

**Example**:

```csharp
var options = new OpenQueryOptions(
    Chunks: 3,
    Results: 5,
    Queries: 3,
    Short: false,
    Long: false,
    Verbose: true,
    Question: "What is quantum entanglement?"
);
```
---

### Chunk

**Location**: `Models/Chunk.cs`
**Type**: `record`
**Purpose**: Content chunk with metadata, embedding, and relevance score

```csharp
public record Chunk(
    string Content,      // Text content (typically ~500 chars)
    string SourceUrl,    // Original article URL
    string? Title = null // Article title (optional, may be null)
)
{
    public float[]? Embedding { get; set; } // Vector embedding (1536-dim for text-embedding-3-small)
    public float Score { get; set; }        // Relevance score (0-1, higher = more relevant)
}
```

**Lifecycle**:

1. **Created** in `SearchTool.ExecuteParallelArticleFetchingAsync`:
   ```csharp
   chunks.Add(new Chunk(chunkText, result.Url, article.Title));
   ```
   At this point: `Embedding = null`, `Score = 0`

2. **Embedded** in `SearchTool.ExecuteParallelEmbeddingsAsync`:
   ```csharp
   validChunks[i].Embedding = validEmbeddings[i];
   ```

3. **Scored** in `SearchTool.RankAndSelectTopChunks`:
   ```csharp
   chunk.Score = EmbeddingService.CosineSimilarity(queryEmbedding, chunk.Embedding!);
   ```

4. **Formatted** into the context string:
   ```csharp
   $"[Source {i+1}: {c.Title ?? "Unknown"}]({c.SourceUrl})\n{c.Content}"
   ```

**Properties**:

- `Content`: Never null/empty (`ChunkingService` filters out empty chunks)
- `SourceUrl`: Always provided (from `SearxngResult.Url`)
- `Title`: May be null if article extraction failed to find a title
- `Embedding`: Null until Phase 3; may remain null if embedding failed
- `Score`: 0 until Phase 4; meaningless for non-embedded chunks

**Equality**: Records use value equality, and the synthesized comparison includes all instance fields, including the mutable `Embedding` and `Score`: `Score` is compared by value, `Embedding` by array reference. So two chunks with the same content/url/title but different scores (or separately allocated embedding arrays) are not equal.
---

### ParallelProcessingOptions

**Location**: `Models/ParallelOptions.cs`
**Type**: `class`
**Purpose**: Configuration for parallel/concurrent operations

```csharp
public class ParallelProcessingOptions
{
    public int MaxConcurrentArticleFetches { get; set; } = 10;
    public int MaxConcurrentEmbeddingRequests { get; set; } = 4;
    public int EmbeddingBatchSize { get; set; } = 300;
}
```

**Usage**:

- Instantiated in the `SearchTool` constructor (hardcoded `new`)
- Passed to the `EmbeddingService` constructor
- Read by `SearchTool` for the article-fetching semaphore

**Default Values**:

| Property | Default | Effect |
|----------|---------|--------|
| `MaxConcurrentArticleFetches` | 10 | Up to 10 articles fetched simultaneously |
| `MaxConcurrentEmbeddingRequests` | 4 | Up to 4 embedding batches in parallel |
| `EmbeddingBatchSize` | 300 | Each embedding API call handles up to 300 texts |

**Current Limitation**: These are **compile-time defaults** (hardcoded in `SearchTool.cs`). To make them configurable:

1. Add them to `AppConfig`
2. Read them in `ConfigManager`
3. Pass them through the `SearchTool` constructor

---
## OpenRouter API Models

**Location**: `Models/OpenRouter.cs`
**Purpose**: DTOs for OpenRouter's REST API (JSON serialization)

### Chat Completion

#### `ChatCompletionRequest`

```csharp
public record ChatCompletionRequest(
    [property: JsonPropertyName("model")] string Model,
    [property: JsonPropertyName("messages")] List<Message> Messages,
    [property: JsonPropertyName("tools")] List<ToolDefinition>? Tools = null,
    [property: JsonPropertyName("stream")] bool Stream = false
);
```

**Example**:

```json
{
  "model": "qwen/qwen3.5-flash-02-23",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is 2+2?" }
  ],
  "stream": true
}
```
#### `Message`

```csharp
public record Message(
    [property: JsonPropertyName("role")] string Role,
    [property: JsonPropertyName("content")] string? Content = null,
    [property: JsonPropertyName("tool_calls")] List<ToolCall>? ToolCalls = null,
    [property: JsonPropertyName("tool_call_id")] string? ToolCallId = null
)
{
    // Factory method for tool responses
    public static Message FromTool(string content, string toolCallId) =>
        new Message("tool", content, null, toolCallId);
}
```

**Roles**: `"system"`, `"user"`, `"assistant"`, `"tool"`

**Usage**:

- `Content` for text messages
- `ToolCalls` when the assistant requests tool use
- `ToolCallId` when responding to a tool call
#### `ChatCompletionResponse`

```csharp
public record ChatCompletionResponse(
    [property: JsonPropertyName("choices")] List<Choice> Choices,
    [property: JsonPropertyName("usage")] Usage? Usage = null
);

public record Choice(
    [property: JsonPropertyName("message")] Message Message,
    [property: JsonPropertyName("finish_reason")] string? FinishReason = null
);
```

**Response Example**:

```json
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Answer text..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 50,
    "total_tokens": 150
  }
}
```
#### `Usage`

```csharp
public record Usage(
    [property: JsonPropertyName("prompt_tokens")] int PromptTokens,
    [property: JsonPropertyName("completion_tokens")] int CompletionTokens,
    [property: JsonPropertyName("total_tokens")] int TotalTokens
);
```
### Tool Calling (Not Currently Used)

#### `ToolDefinition` / `ToolFunction`

```csharp
public record ToolDefinition(
    [property: JsonPropertyName("type")] string Type, // e.g., "function"
    [property: JsonPropertyName("function")] ToolFunction Function
);

public record ToolFunction(
    [property: JsonPropertyName("name")] string Name,
    [property: JsonPropertyName("description")] string Description,
    [property: JsonPropertyName("parameters")] JsonElement Parameters // JSON Schema
);
```

#### `ToolCall` / `FunctionCall`

```csharp
public record ToolCall(
    [property: JsonPropertyName("id")] string Id,
    [property: JsonPropertyName("type")] string Type,
    [property: JsonPropertyName("function")] FunctionCall Function
);

public record FunctionCall(
    [property: JsonPropertyName("name")] string Name,
    [property: JsonPropertyName("arguments")] string Arguments // JSON string
);
```

**Note**: OpenQuery doesn't use tools currently, but the models are defined for future tool-calling capability.
### Streaming

#### `StreamChunk`

```csharp
public record StreamChunk(
    string? TextDelta = null,
    ClientToolCall? Tool = null
);
```

Yielded by `OpenRouterClient.StreamAsync()` for each SSE event.

#### `ChatCompletionChunk` (Server Response)

```csharp
public record ChatCompletionChunk(
    [property: JsonPropertyName("choices")] List<ChunkChoice> Choices
);

public record ChunkChoice(
    [property: JsonPropertyName("delta")] ChunkDelta Delta
);

public record ChunkDelta(
    [property: JsonPropertyName("content")] string? Content = null,
    [property: JsonPropertyName("tool_calls")] List<ToolCall>? ToolCalls = null
);
```

**Streaming Response Example** (SSE):

```
data: {"choices":[{"delta":{"content":"Hello"}}]}
data: {"choices":[{"delta":{"content":" world"}}]}
data: [DONE]
```

`OpenRouterClient.StreamAsync` parses these events and yields a `StreamChunk` with a non-null `TextDelta` for each content delta.
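SSE parsing is simple line-oriented work: take the payload after each `data: ` prefix, stop at `[DONE]`, and pull the `content` delta out of each JSON chunk. An illustrative Python sketch of that logic (the real implementation is C#; the function name is an assumption):

```python
import json

def parse_sse_deltas(lines):
    # Collect text deltas from "data: {...}" lines; stop at "data: [DONE]"
    deltas = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # SSE comments, blank keep-alive lines, etc.
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        content = chunk["choices"][0]["delta"].get("content")
        if content is not None:
            deltas.append(content)
    return deltas
```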
### Embeddings

#### `EmbeddingRequest`

```csharp
public record EmbeddingRequest(
    [property: JsonPropertyName("model")] string Model,
    [property: JsonPropertyName("input")] List<string> Input
);
```

**Example**:

```json
{
  "model": "openai/text-embedding-3-small",
  "input": ["text 1", "text 2", ...]
}
```
#### `EmbeddingResponse`

```csharp
public record EmbeddingResponse(
    [property: JsonPropertyName("data")] List<EmbeddingData> Data,
    [property: JsonPropertyName("usage")] Usage Usage
);

public record EmbeddingData(
    [property: JsonPropertyName("embedding")] float[] Embedding,
    [property: JsonPropertyName("index")] int Index
);
```

**Response Example**:

```json
{
  "data": [
    { "embedding": [0.1, 0.2, ...], "index": 0 },
    { "embedding": [0.3, 0.4, ...], "index": 1 }
  ],
  "usage": {
    "prompt_tokens": 100,
    "total_tokens": 100
  }
}
```

**Note**: `_client.EmbedAsync` orders results by `index` to match the input order.
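Re-ordering by `index` matters because the `data` array is not guaranteed to arrive in input order. A small sketch of the same step, in Python for illustration:

```python
def order_embeddings(data):
    # data: list of {"embedding": [...], "index": n}
    # Return the embeddings sorted back into input order.
    return [item["embedding"] for item in sorted(data, key=lambda d: d["index"])]
```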
---

## SearxNG API Models

**Location**: `Models/Searxng.cs`
**Purpose**: DTOs for SearxNG's JSON response format

### `SearxngRoot`

```csharp
public record SearxngRoot(
    [property: JsonPropertyName("results")] List<SearxngResult> Results
);
```

Top-level response object.

### `SearxngResult`

```csharp
public record SearxngResult(
    [property: JsonPropertyName("title")] string Title,
    [property: JsonPropertyName("url")] string Url,
    [property: JsonPropertyName("content")] string Content // Snippet/description
);
```

**Fields**:

- `Title`: Result title (from the page `<title>` or OpenGraph)
- `Url`: Absolute URL of the article
- `Content`: Short snippet (~200 chars) from the search engine

**Usage**:

- `Url` is passed to `ArticleService.FetchArticleAsync`
- `Title` is used as a fallback if article extraction fails
- `Content` is currently unused (it could power a quick answer without fetching)

**Example Response**:

```json
{
  "results": [
    {
      "title": "Quantum Entanglement - Wikipedia",
      "url": "https://en.wikipedia.org/wiki/Quantum_entanglement",
      "content": "Quantum entanglement is a physical phenomenon..."
    }
  ]
}
```

---
## JSON Serialization

### JsonContext (Source Generation)

**Location**: `Models/JsonContexts.cs`
**Purpose**: Provide a source-generated JSON serializer context for AOT compatibility

#### Declaration

```csharp
[JsonSerializable(typeof(ChatCompletionRequest))]
[JsonSerializable(typeof(ChatCompletionResponse))]
[JsonSerializable(typeof(ChatCompletionChunk))]
[JsonSerializable(typeof(EmbeddingRequest))]
[JsonSerializable(typeof(EmbeddingResponse))]
[JsonSerializable(typeof(SearxngRoot))]
[JsonSerializable(typeof(List<string>))]
internal partial class AppJsonContext : JsonSerializerContext
{
}
```

**Usage**:

```csharp
var json = JsonSerializer.Serialize(request, AppJsonContext.Default.ChatCompletionRequest);
var response = JsonSerializer.Deserialize(json, AppJsonContext.Default.ChatCompletionResponse);
```

**Benefits**:

- **AOT-compatible**: No reflection; works with `PublishAot=true`
- **Performance**: Pre-compiled serializers are faster
- **Trimming safe**: Unused serializers are trimmed automatically

**Generated**: Partial class compiled by the source generator (no manual implementation)

**Important**: ALL types that will be serialized/deserialized must appear in `[JsonSerializable]` attributes; otherwise a runtime exception occurs under AOT.

---
## Model Relationships

### Object Graph (Typical Execution)

```
OpenQueryOptions
    ↓
OpenQueryApp.RunAsync()
 │
 ├─ queryGenerationMessages (List<Message>)
 │    ├─ system: "You are an expert researcher..."
 │    └─ user: "Generate N queries for: {question}"
 │         ↓
 │    ChatCompletionRequest → OpenRouter → ChatCompletionResponse
 │         ↓
 │    List<string> generatedQueries
 │
 ├─ SearchTool.ExecuteAsync()
 │    ↓
 │   Phase 1: Parallel Searches
 │     SearxngClient.SearchAsync(query) × N
 │     → List<SearxngResult> (Title, Url, Content)
 │    ↓
 │   Phase 2: Article Fetch & Chunking
 │     ArticleService.FetchAsync(Url) × M
 │     → Article (TextContent, Title)
 │     → ChunkingService.ChunkText → List<string> chunks
 │     → Chunk(content, url, title) × K
 │    ↓
 │   Phase 3: Embeddings
 │     EmbeddingService.GetEmbeddingsAsync(chunkContents)
 │     → float[][] chunkEmbeddings
 │     → Set chunk.Embedding for each
 │     Also: GetEmbeddingAsync(question) → float[] queryEmbedding
 │    ↓
 │   Phase 4: Ranking
 │     For each chunk: Score = CosineSimilarity(queryEmbedding, chunk.Embedding)
 │     → Set chunk.Score
 │     → OrderByDescending(Score)
 │     → Take(topChunksLimit) → topChunks (List<Chunk>)
 │    ↓
 │   Context string: formatted topChunks
 │    ↓
 └─ OpenQueryApp → final ChatCompletionRequest
      System: "Answer based on context..."
      User: "Context:\n{context}\n\nQuestion: {question}"
        ↓
      StreamAsync() → StreamChunk.TextDelta → Console
```
### Record Immutability

Most DTOs are `record` types:

- **Immutable**: Properties are init-only (`{ get; init; }`)
- **Value semantics**: Equality based on content
- **Thread-safe**: Can be shared across threads

**Exceptions**:

- `Chunk`: Has mutable properties `Embedding` and `Score` (set during the pipeline)
- `ParallelProcessingOptions`: Class with mutable setters
- `AppConfig`: Class with mutable setters

---
## Next Steps

- **[API Reference](../../api/cli.md)** - How these models are used in CLI commands
- **[OpenRouterClient](../../services/OpenRouterClient.md)** - Uses the OpenRouter models
- **[SearxngClient](../../services/SearxngClient.md)** - Uses the SearxNG models
- **[SearchTool](../../components/search-tool.md)** - Orchestrates all models

---
**Quick Reference Table**

| Model | Category | Purpose | Mutable? |
|-------|----------|---------|----------|
| `OpenQueryOptions` | Core | CLI options | No (record) |
| `Chunk` | Core | Content + metadata + ranking | Partially (Embedding, Score) |
| `ParallelProcessingOptions` | Config | Concurrency settings | Yes (class) |
| `ChatCompletionRequest/Response` | OpenRouter | LLM API | No |
| `EmbeddingRequest/Response` | OpenRouter | Embeddings API | No |
| `SearxngRoot/Result` | SearxNG | Search results | No |
| `AppJsonContext` | Internal | JSON serialization | No (generated partial) |
395
docs/components/openquery-app.md
Normal file
@@ -0,0 +1,395 @@
# OpenQueryApp Component

Deep dive into the `OpenQueryApp` class - the main application orchestrator.

## Overview

`OpenQueryApp` is the heart of OpenQuery. It coordinates all components, manages the workflow from question to answer, and handles progress reporting.

## Location

`OpenQuery.cs` in the project root

## Class Definition

```csharp
public class OpenQueryApp
{
    private readonly OpenRouterClient _client;
    private readonly SearchTool _searchTool;
    private readonly string _model;

    public OpenQueryApp(
        OpenRouterClient client,
        SearchTool searchTool,
        string model);

    public async Task RunAsync(OpenQueryOptions options);
}
```

**Dependencies**:
- `OpenRouterClient` - for query generation and final answer streaming
- `SearchTool` - for the search-retrieve-rank pipeline
- `string _model` - model identifier used for LLM calls

**Lifecycle**: Instantiated once per query execution in `Program.cs`; `RunAsync()` is then called once.

## RunAsync Workflow

```csharp
public async Task RunAsync(OpenQueryOptions options)
{
    // 1. Setup
    using var reporter = new StatusReporter(options.Verbose);
    reporter.StartSpinner();

    // 2. Query generation (if needed)
    List<string> queries = await GenerateQueriesIfNeededAsync(options, reporter);

    // 3. Search pipeline
    string searchResult = await ExecuteSearchPipelineAsync(options, queries, reporter);

    // 4. Final answer streaming
    await StreamFinalAnswerAsync(options, searchResult, reporter);
}
```

### Step 1: Status Reporter Setup

```csharp
using var reporter = new StatusReporter(options.Verbose);
reporter.StartSpinner();
```

- Creates a `StatusReporter` (implements `IDisposable`)
- Starts the spinner animation (unless verbose)
- `using` ensures disposal on exit
### Step 2: Query Generation

**When**: `options.Queries > 1` (the user wants multiple search queries)

**Purpose**: Use the LLM to generate diverse, optimized search queries from the original question

**System Prompt** (hardcoded in `OpenQuery.cs`):
```
You are an expert researcher. The user will ask a question. Your task is to
generate optimal search queries to gather comprehensive information.

Instructions:
1. Break down complex questions.
2. Use synonyms and alternative phrasing.
3. Target different aspects (entities, mechanisms, pros/cons, history).

CRITICAL: Output must be a valid JSON array of strings ONLY. No markdown,
explanations, or other text.
```

**Request**:
```csharp
var queryGenMessages = new List<Message>
{
    new Message("system", systemPrompt),
    new Message("user", $"Generate {options.Queries} distinct search queries for:\n{options.Question}")
};
var request = new ChatCompletionRequest(_model, queryGenMessages);
var response = await _client.CompleteAsync(request);
```

**Response Parsing**:
```csharp
var content = response.Choices.FirstOrDefault()?.Message.Content;
if (!string.IsNullOrEmpty(content))
{
    // Remove markdown code fences if present
    content = Regex.Replace(content, @"```json\s*|\s*```", "").Trim();

    // Deserialize to List<string>
    var generatedQueries = JsonSerializer.Deserialize(content, AppJsonContext.Default.ListString);
    if (generatedQueries != null && generatedQueries.Count > 0)
    {
        queries = generatedQueries;
    }
}
```

**Fallback**: If any step fails (exception, null, empty, or invalid JSON), use `new List<string> { options.Question }` (a single query: the original question).

**Note**: Query generation reuses the same model as the final answer. This could be optimized:
- Use a cheaper/faster model for query generation
- Separate the model configuration
- Cache query generation results
### Step 3: Search Pipeline Execution

```csharp
var searchResult = await _searchTool.ExecuteAsync(
    options.Question,
    queries,
    options.Results,
    options.Chunks,
    (progress) => {
        if (options.Verbose)
            reporter.WriteLine(progress);
        else
            reporter.UpdateStatus(progress); // compact mode: overwrite the status line
    },
    options.Verbose);
```

**Parameters**:
- `originalQuery`: the user's original question (used for the final embedding)
- `generatedQueries`: from step 2 (or the fallback)
- `maxResults`: `options.Results` (search results per query)
- `topChunksLimit`: `options.Chunks` (top N chunks to return)
- `onProgress`: callback to update the UI
- `verbose`: passed through to `SearchTool`

**Returns**: `string context` - formatted context with source citations

**Progress Handling**:
- In verbose mode: all progress is printed as full lines (via `reporter.WriteLine()`)
- In compact mode: progress messages update a concise status line (e.g., "Fetching articles 3/10...")
### Step 4: Final Answer Streaming

**Status Update**:
```csharp
if (!options.Verbose)
    reporter.UpdateStatus("Asking AI...");
else
{
    reporter.ClearStatus();
    Console.WriteLine();
}
```

**Build System Prompt**:
```csharp
var systemPrompt = "You are a helpful AI assistant. Answer the user's question in depth, based on the provided context. Be precise and accurate. You can mention sources or citations.";
if (options.Short) systemPrompt += " Give a very short concise answer.";
if (options.Long) systemPrompt += " Give a long elaborate detailed answer.";
```

**Prompt Structure**:
```
System: {systemPrompt}
User: Context:
{searchResult}

Question: {options.Question}
```

Where `searchResult` is:
```
[Source 1: Title](URL)
Content chunk 1

[Source 2: Title](URL)
Content chunk 2

...
```

**Streaming**:
```csharp
var requestStream = new ChatCompletionRequest(_model, messages);
var assistantResponse = new StringBuilder();
var isFirstChunk = true;

using var streamCts = new CancellationTokenSource();
await foreach (var chunk in _client.StreamAsync(requestStream, streamCts.Token))
{
    if (chunk.TextDelta == null) continue;

    if (isFirstChunk)
    {
        reporter.StopSpinner();
        if (!options.Verbose) reporter.ClearStatus();
        else Console.Write("Assistant: ");
        isFirstChunk = false;
    }

    Console.Write(chunk.TextDelta);
    assistantResponse.Append(chunk.TextDelta);
}
```

**Key Points**:
- `StreamAsync` yields `StreamChunk` objects (text deltas)
- The first chunk stops the spinner and clears the status line
- Each delta is written to the console immediately (real-time feel)
- The entire response is accumulated in `assistantResponse` (though not used elsewhere)
- A `CancellationTokenSource` is passed but never canceled here (Ctrl+C cancels from outside)

**Finally Block**:
```csharp
finally
{
    reporter.StopSpinner();
}
```
Ensures the spinner stops even if streaming fails.

**End**:
```csharp
Console.WriteLine(); // Newline after the complete answer
```
## Error Handling

`RunAsync` itself does not catch exceptions. All exceptions propagate to `Program.cs`:

```csharp
try
{
    var openQuery = new OpenQueryApp(client, searchTool, model);
    await openQuery.RunAsync(options);
}
catch (HttpRequestException ex)
{
    Console.Error.WriteLine($"\n[Error] Network request failed. Details: {ex.Message}");
    Environment.Exit(1);
}
catch (Exception ex)
{
    Console.Error.WriteLine($"\n[Error] An unexpected error occurred: {ex.Message}");
    Environment.Exit(1);
}
```

**Common Exceptions**:
- `HttpRequestException` - network failures, API errors
- `JsonException` - malformed JSON from the API
- `TaskCanceledException` - timeout or user interrupt
- `Exception` - anything else

**No Retries at This Level**: Fail fast; the user sees the error immediately. Lower-level retries exist (embedding service).

## Performance Characteristics

**Query Generation**:
- One non-streaming LLM call
- Takes 2-5 seconds depending on the model
- Typically <1000 tokens

**Search Pipeline** (`SearchTool.ExecuteAsync`):
- See `SearchTool.md` for a detailed timing breakdown
- Typically 10-30 seconds total

**Final Answer Streaming**:
- A streaming LLM call
- Time depends on answer length (typically 5-20 seconds)
- The user sees words appear progressively

**Total End-to-End**: 15-50 seconds for a typical query
## Design Decisions

### Why Not Stream Query Generation?

Query generation currently uses `CompleteAsync` (non-streaming). It could be streamed, but:
- Queries are short (a JSON array)
- Streaming offers no UX benefit (the user doesn't see intermediate queries)
- It is simpler to wait for all queries before proceeding

### Why Build Prompts Manually Instead of Using Templates?

Simple string concatenation is fine for a handful of prompts. Pros:
- No template dependencies
- Easy to read and modify
- No runtime compilation overhead

Cons:
- No validation
- Could benefit from a prompt-engineering framework

### Why Accumulate `assistantResponse` in a StringBuilder?

It is currently built but not used. It could be:
- Saved to a file (future feature: `--output file.md`)
- Analyzed for token counting
- Removed if not needed

### Could Query Generation Be Cached?

Yes! For repeated questions (common in scripts), query results could be cached:
- A `Dictionary<string, List<string>>` cache in memory
- Or a persistent cache (Redis, file)
- Not implemented (low priority)

### Single Responsibility Violation?

`OpenQueryApp` does:
- Query generation
- Pipeline orchestration
- Answer streaming

That is three responsibilities, but they are tightly coupled to the "question → answer" workflow. Separating them would add complexity without clear benefit. Acceptable as an "application coordinator".
## Extension Points

### Using a Different Model for Query Generation

Currently the same `_model` is used for query generation and the final answer. To use different models:

1. Add a `queryGenerationModel` parameter to the constructor
2. Use it for query generation: `new ChatCompletionRequest(queryGenerationModel, queryGenMessages)`
3. Keep `_model` for the final answer

Or make it configurable via an environment variable: `OPENROUTER_QUERY_MODEL`

### Post-Processing the Answer

Opportunities to add:
- Source citation formatting (footnotes, clickable links)
- Answer summarization
- Export to Markdown/JSON
- Text-to-speech

Add these after the streaming loop, before the final newline.

### Progress UI Enhancement

The current `StatusReporter` is basic. It could gain:
- A progress bar with percentage
- ETA calculation
- ANSI colors for different message types
- Logging to a file
- A web dashboard

This would require extending `StatusReporter` or replacing it.
## Testing Considerations

**Challenges**:
- `RunAsync` is cohesive (hard to unit test in isolation)
- It depends on many services (mocks needed)
- It is asynchronous and streaming

**Recommended Approach**:
1. Extract interfaces:
   - `ISearchTool` (wrapper around `SearchTool`)
   - `IOpenRouterClient` (wrapper around `OpenRouterClient`)
2. Mock the interfaces in tests
3. Test query generation parsing separately
4. Test progress callback counting
5. Test final answer prompt construction

**Integration Tests**:
- End-to-end with real/mocked APIs
- Automated tests against test SearxNG/OpenRouter instances

## Related Components

- **[SearchTool](search-tool.md)** - the pipeline executed by `OpenQueryApp`
- **[Program.cs](../Program.md)** - creates `OpenQueryApp`
- **[StatusReporter](../services/StatusReporter.md)** - progress UI used by `OpenQueryApp`

---

## Next Steps

- [SearchTool](search-tool.md) - See the pipeline in detail
- [Services](../services/overview.md) - Understand each service
- [CLI Reference](../../api/cli.md) - How users invoke this
603
docs/components/overview.md
Normal file
@@ -0,0 +1,603 @@
# Components Overview

Detailed documentation for each major component in the OpenQuery system.

## 📋 Table of Contents

1. [Component Hierarchy](#component-hierarchy)
2. [Core Components](#core-components)
3. [Services](#services)
4. [Data Models](#data-models)
5. [Component Interactions](#component-interactions)

## Component Hierarchy

```
OpenQuery/
├── Program.cs                  [Entry Point, CLI]
├── OpenQuery.cs                [OpenQueryApp - Orchestrator]
├── Tools/
│   └── SearchTool.cs           [Pipeline Orchestration]
├── Services/
│   ├── OpenRouterClient.cs     [LLM & Embedding API]
│   ├── SearxngClient.cs        [Search API]
│   ├── EmbeddingService.cs     [Embedding Generation + Math]
│   ├── ChunkingService.cs      [Text Splitting]
│   ├── ArticleService.cs       [Content Extraction]
│   ├── RateLimiter.cs          [Concurrency Control]
│   └── StatusReporter.cs       [Progress Display]
├── Models/
│   ├── OpenQueryOptions.cs     [CLI Options Record]
│   ├── Chunk.cs                [Content + Metadata]
│   ├── ParallelOptions.cs      [Concurrency Settings]
│   ├── OpenRouter.cs           [API DTOs]
│   ├── Searxng.cs              [Search Result DTOs]
│   └── JsonContexts.cs         [JSON Context]
└── ConfigManager.cs            [Configuration Persistence]
```
## Core Components

### 1. Program.cs

**Type**: Console application entry point
**Responsibilities**: CLI parsing, dependency wiring, error handling

**Key Elements**:
- `RootCommand` from System.CommandLine
- Options: `--chunks`, `--results`, `--queries`, `--short`, `--long`, `--verbose`
- Subcommand: `configure` (with interactive mode)
- Configuration loading via `ConfigManager.Load()`
- Environment variable resolution
- Service instantiation and coordination
- Top-level try-catch for error reporting

**Code Flow**:
1. Load the config file
2. Define CLI options and commands
3. Set the handler for the root command
4. Handler: resolve API key/model → instantiate services → call `OpenQueryApp.RunAsync()`
5. Set the handler for the `configure` command (writes the config file)
6. Invoke the command parser: `await rootCommand.InvokeAsync(args)`

**Exit Codes**:
- 0 = success
- 1 = error
### 2. OpenQueryApp (OpenQuery.cs)

**Type**: Main application class
**Responsibilities**: Workflow orchestration, query generation, answer streaming

**Constructor Parameters**:
- `OpenRouterClient client` - for query generation and the final answer
- `SearchTool searchTool` - for the search-retrieve-rank pipeline
- `string model` - LLM model identifier

**Main Method**: `RunAsync(OpenQueryOptions options)`

**Workflow Steps**:
1. Create a `StatusReporter` (for the progress UI)
2. **Optional query generation** (if `options.Queries > 1`):
   - Create a system message instructing JSON-array output
   - Create a user message with `options.Question`
   - Call `client.CompleteAsync()` with the query-generation model
   - Parse the JSON response; fall back to the original question on failure
   - Result: `List<string> queries` (one or many)
3. **Execute the search pipeline**:
   - Call `_searchTool.ExecuteAsync()` with the queries and options
   - Receive `string context` (formatted context with source citations)
   - Progress is reported via a callback to `StatusReporter`
4. **Generate the final answer**:
   - Build the system prompt (appending a "short" or "long" modifier)
   - Create a user message with `Context:\n{context}\n\nQuestion: {options.Question}`
   - Stream the answer via `client.StreamAsync()`
   - Write each `chunk.TextDelta` to the console as it arrives
   - Stop the spinner on the first chunk, then continue streaming
5. Dispose the reporter

**Error Handling**:
- Exceptions propagate to the top-level handler in `Program.cs`
- `HttpRequestException` vs. generic `Exception`

**Note**: Query generation uses the same model as the final answer; the two could be separated for cost/performance.
### 3. SearchTool (Tools/SearchTool.cs)

**Type**: Pipeline orchestrator
**Responsibilities**: Execute the 4-phase search-retrieve-rank-return workflow

**Constructor Parameters**:
- `SearxngClient searxngClient`
- `EmbeddingService embeddingService`

**Main Method**: `ExecuteAsync(originalQuery, generatedQueries, maxResults, topChunksLimit, onProgress, verbose)`

**Returns**: `Task<string>` - formatted context string with source citations

**Pipeline Phases**:

#### Phase 1: ExecuteParallelSearchesAsync
- Parallelize `searxngClient.SearchAsync(query, maxResults)` for each query
- Collect all results in a `ConcurrentBag<SearxngResult>`
- Deduplicate with `DistinctBy(r => r.Url)`

**Output**: `List<SearxngResult>` (aggregated, unique)

#### Phase 2: ExecuteParallelArticleFetchingAsync
- Semaphore: `MaxConcurrentArticleFetches` (default 10)
- For each `SearxngResult`: fetch the URL via `ArticleService.FetchArticleAsync()`
- Extract the article text and title
- Chunk via `ChunkingService.ChunkText(article.TextContent)`
- Add each chunk as a new `Chunk(content, url, title)`

**Output**: `List<Chunk>` (potentially 50-100 chunks)

#### Phase 3: ExecuteParallelEmbeddingsAsync
- Start two parallel tasks:
  1. Query embedding: `embeddingService.GetEmbeddingAsync(originalQuery)`
  2. Chunk embeddings: `embeddingService.GetEmbeddingsWithRateLimitAsync(chunkTexts, onProgress)`
- `Parallel.ForEachAsync` with `MaxConcurrentEmbeddingRequests` (default 4)
- Batch size: 300 chunks per embedding API call
- Filter out chunks with empty embeddings (failed batches)

**Output**: `(float[] queryEmbedding, float[][] chunkEmbeddings)`

#### Phase 4: RankAndSelectTopChunks
- Calculate cosine similarity for each chunk vs. the query
- Assign `chunk.Score`
- Order by descending score
- Take `topChunksLimit` (from the `--chunks` option)
- Return `List<Chunk>` (top N)

**Formatting**:
```csharp
string context = string.Join("\n\n", topChunks.Select((c, i) =>
    $"[Source {i+1}: {c.Title ?? "Unknown"}]({c.SourceUrl})\n{c.Content}"));
```

**Progress Callbacks**: Invoked at each major step for UI feedback
## Services

### OpenRouterClient

**Purpose**: HTTP client for the OpenRouter API (chat completions + embeddings)

**Base URL**: `https://openrouter.ai/api/v1`

**Authentication**: `Authorization: Bearer {apiKey}`

**Methods**:

#### `StreamAsync(ChatCompletionRequest request, CancellationToken)`
- Sets `request.Stream = true`
- POSTs to `/chat/completions`
- Reads the SSE stream line by line
- Parses `data: {json}` chunks
- Yields `StreamChunk` (text delta or tool call)
- Supports cancellation

#### `CompleteAsync(ChatCompletionRequest request)`
- Sets `request.Stream = false`
- POSTs to `/chat/completions`
- Deserializes the full response
- Returns `ChatCompletionResponse`

#### `EmbedAsync(string model, List<string> inputs)`
- POSTs to `/embeddings`
- Returns `float[][]` (ordered by input index)

**Error Handling**: `EnsureSuccessStatusCode()` throws `HttpRequestException` on failure

**Design**: Thin wrapper; no retry logic (that is delegated to `EmbeddingService`)
### SearxngClient

**Purpose**: HTTP client for SearxNG metasearch

**Base URL**: Configurable (default `http://localhost:8002`)

**Methods**:

#### `SearchAsync(string query, int limit = 10)`
- GETs `{baseUrl}/search?q={query}&format=json`
- Deserializes to `SearxngRoot`
- Returns `Results.Take(limit).ToList()`
- On failure: returns an empty `List<SearxngResult>` (no exception)

**Design**: Very simple; failures are tolerated (OpenQuery continues with the other queries)
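
The failure-tolerant shape described above can be sketched roughly as follows (field and helper names are assumptions; the real DTOs live in `Models/Searxng.cs`):

```csharp
// Rough sketch, not the project's actual implementation: swallow failures
// and return an empty list so one bad query never aborts the pipeline.
public async Task<List<SearxngResult>> SearchAsync(string query, int limit = 10)
{
    try
    {
        var url = $"{_baseUrl}/search?q={Uri.EscapeDataString(query)}&format=json";
        var json = await _httpClient.GetStringAsync(url);
        var root = JsonSerializer.Deserialize<SearxngRoot>(json);
        return root?.Results.Take(limit).ToList() ?? new List<SearxngResult>();
    }
    catch
    {
        return new List<SearxngResult>(); // tolerate network/parse failures
    }
}
```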

### EmbeddingService

**Purpose**: Batch embedding generation with rate limiting, parallelization, and retries

**Configuration** (from `ParallelProcessingOptions`):
- `MaxConcurrentEmbeddingRequests` = 4
- `EmbeddingBatchSize` = 300

**Default Embedding Model**: `openai/text-embedding-3-small`

**Methods**:

#### `GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)`
- Splits `texts` into batches of `EmbeddingBatchSize`
- Parallelizes batches with `Parallel.ForEachAsync` + `MaxConcurrentEmbeddingRequests`
- Each batch: a rate-limited, retry-wrapped `client.EmbedAsync(model, batch)`
- Collects results in order (by batch index)
- Returns `float[][]` (same order as the input texts)
- Failed batches return an empty `float[]` for each text

#### `GetEmbeddingAsync(string text, CancellationToken)`
- Wraps a single-text call in the rate limiter + retry
- Returns `float[]`

#### `CosineSimilarity(float[] v1, float[] v2)`
- Static method using `TensorPrimitives.CosineSimilarity`
- Returns a float between -1 and 1 (typically 0-1 for normalized embeddings)
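
Phase 4 of the pipeline (score, sort, take top N) reduces to a few lines on top of this method. A minimal sketch, assuming the chunks already carry non-null embeddings:

```csharp
// Sketch of the ranking step: score each chunk against the query
// embedding, then keep the best topChunksLimit chunks.
using System.Numerics.Tensors;

List<Chunk> RankAndSelectTopChunks(float[] queryEmbedding, List<Chunk> chunks, int topChunksLimit)
{
    foreach (var chunk in chunks)
        chunk.Score = TensorPrimitives.CosineSimilarity(queryEmbedding, chunk.Embedding!);

    return chunks.OrderByDescending(c => c.Score)
                 .Take(topChunksLimit)
                 .ToList();
}
```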

**Retry Policy** (Polly):
- Max 3 attempts
- 1s base delay with exponential backoff
- Retries only `HttpRequestException`
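
Under those assumptions (the exact builder calls may differ from the project's code), such a policy could look like:

```csharp
// Sketch of the described retry policy using Polly's classic API:
// 3 attempts, exponential backoff from a 1s base, HttpRequestException only.
using Polly;

var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(
        retryCount: 3,
        sleepDurationProvider: attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt - 1)));

// Usage (hypothetical call site):
// var result = await retryPolicy.ExecuteAsync(() => client.EmbedAsync(model, batch));
```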

**Rate Limiting**: a `RateLimiter` semaphore with `MaxConcurrentEmbeddingRequests` permits

**Design Notes**:
- Two similar methods (`GetEmbeddingsAsync` and `GetEmbeddingsWithRateLimitAsync`) - could be consolidated
- Uses Polly for resilience (a good pattern)
- Concurrency control prevents overwhelming OpenRouter
### ChunkingService

**Purpose**: Split long text into manageable pieces

**Static class** (no dependencies, pure function)

**Algorithm** (in `ChunkText(string text)`):
- Constant `MAX_CHUNK_SIZE = 500`
- While text remains:
  - Take up to 500 chars
  - If not at the end, backtrack to the last of `[' ', '\n', '\r', '.', '!']`
  - Trim and add the chunk if non-empty
  - Advance the start position

**Rationale**: 500 characters is a sweet spot for embeddings - long enough for context, short enough for semantic coherence.

**Edge Cases**: Handles text shorter than 500 chars, empty text, and text with no natural breaks.
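
A sketch of that loop (the project's actual implementation may differ in details such as break-character handling):

```csharp
// Sketch of ChunkText: greedy 500-char windows, backtracking to a
// natural break so words and sentences are not split mid-token.
public static List<string> ChunkText(string text)
{
    const int MAX_CHUNK_SIZE = 500;
    var breakChars = new[] { ' ', '\n', '\r', '.', '!' };
    var chunks = new List<string>();
    int start = 0;

    while (start < text.Length)
    {
        int length = Math.Min(MAX_CHUNK_SIZE, text.Length - start);
        if (start + length < text.Length)
        {
            // Backtrack to the last natural break inside the window.
            int lastBreak = text.LastIndexOfAny(breakChars, start + length - 1, length);
            if (lastBreak > start) length = lastBreak - start + 1;
        }

        var chunk = text.Substring(start, length).Trim();
        if (chunk.Length > 0) chunks.Add(chunk);
        start += length;
    }
    return chunks;
}
```

When no break character exists in a window, the sketch falls back to a hard 500-character cut, which covers the "no natural breaks" edge case noted above.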

### ArticleService

**Purpose**: Extract clean article content from URLs

**Method**: `FetchArticleAsync(string url)`

**Implementation**: Delegates to `SmartReader.ParseArticleAsync(url)`

**Returns**: an `Article` object (from SmartReader)
- `Title` (string)
- `TextContent` (string) - the cleaned article body
- `IsReadable` (bool) - quality indicator
- Other metadata (author, date, etc.)

**Error Handling**: Exceptions propagate (handled by `SearchTool`)

**Design**: Thin wrapper around a third-party library. Could be extended with caching, custom extraction rules, etc.
### RateLimiter

**Purpose**: Limit concurrent operations via a semaphore

**Interface**:
```csharp
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken);
public async Task ExecuteAsync(Func<Task> action, CancellationToken);
```

**Implementation**: `SemaphoreSlim` with `WaitAsync` and `Release`

**Disposal**: `IAsyncDisposable` (awaits semaphore disposal)

**Usage**: Wrap API calls that need concurrency control
```csharp
var result = await _rateLimiter.ExecuteAsync(async () =>
    await _client.EmbedAsync(model, batch), cancellationToken);
```

**Design**: Simple and reusable. Could be replaced with a `Polly.RateLimiting` policy, but this is lightweight.
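
The described interface fits in a few lines. A minimal sketch (not the project's exact code; disposal here is synchronous for brevity):

```csharp
// Sketch of a SemaphoreSlim-backed rate limiter matching the interface above:
// WaitAsync gates entry, Release in finally guarantees the permit returns.
public sealed class RateLimiterSketch : IAsyncDisposable
{
    private readonly SemaphoreSlim _semaphore;

    public RateLimiterSketch(int maxConcurrency) =>
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken ct = default)
    {
        await _semaphore.WaitAsync(ct);
        try { return await action(); }
        finally { _semaphore.Release(); }
    }

    public async Task ExecuteAsync(Func<Task> action, CancellationToken ct = default)
    {
        await _semaphore.WaitAsync(ct);
        try { await action(); }
        finally { _semaphore.Release(); }
    }

    public ValueTask DisposeAsync()
    {
        _semaphore.Dispose();
        return ValueTask.CompletedTask;
    }
}
```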

### StatusReporter

**Purpose**: Real-time progress UI with spinner and verbose modes

**Architecture**:
- Producer: `UpdateStatus(text)` writes to a `Channel<string>`
- Consumer: a background task, `ProcessStatusUpdatesAsync()`, reads from the channel
- Spinner: a separate task animates Braille characters every 100 ms

**Modes**:

**Verbose Mode** (`_verbose = true`):
- All progress messages are written with `Console.WriteLine()`
- No spinner
- Full audit trail

**Compact Mode** (default):
- Status line with spinner (overwrites the same line)
- Only the latest status is visible
- Example: `⠋ Fetching articles 3/10...`

**Key Methods**:
- `UpdateStatus(message)` - fire-and-forget, non-blocking
- `WriteLine(text)` - stops the spinner temporarily, writes a full line
- `StartSpinner()` / `StopSpinner()` - manual control
- `ClearStatus()` - ANSI escape `\r\x1b[K` to clear the line
- `Dispose()` - completes the channel, waits for background tasks

**Spinner Chars**: `['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏']` (Braille patterns, smooth animation)

**ANSI Codes**: `\r` (carriage return), `\x1b[K` (erase to end of line)

**Thread Safety**: The channel is thread-safe; multiple components can write concurrently without locks

**Design**: Well encapsulated; could be reused in other CLI projects.
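
The producer/consumer core of that design can be sketched with `System.Threading.Channels` (simplified; the real class also manages the spinner task and verbose mode):

```csharp
// Sketch: an unbounded channel decouples status producers from a single
// console-writing consumer, so UpdateStatus never blocks callers.
using System.Threading.Channels;

public sealed class StatusReporterSketch : IDisposable
{
    private readonly Channel<string> _channel = Channel.CreateUnbounded<string>();
    private readonly Task _consumer;

    public StatusReporterSketch() => _consumer = Task.Run(ProcessStatusUpdatesAsync);

    // Fire-and-forget: TryWrite always succeeds on an unbounded channel.
    public void UpdateStatus(string message) => _channel.Writer.TryWrite(message);

    private async Task ProcessStatusUpdatesAsync()
    {
        await foreach (var message in _channel.Reader.ReadAllAsync())
            Console.Write($"\r\x1b[K{message}"); // overwrite the status line
    }

    public void Dispose()
    {
        _channel.Writer.Complete(); // drain remaining messages, then exit
        _consumer.Wait();
    }
}
```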

### ConfigManager

**Purpose**: Load/save configuration from an XDG-compliant location

**Config Path**:
- `Environment.SpecialFolder.UserProfile` → `~/.config/openquery/config`

**Schema** (`AppConfig`):
```csharp
public class AppConfig
{
    public string ApiKey { get; set; } = "";
    public string Model { get; set; } = "qwen/qwen3.5-flash-02-23";
    public int DefaultQueries { get; set; } = 3;
    public int DefaultChunks { get; set; } = 3;
    public int DefaultResults { get; set; } = 5;
}
```

**Format**: Simple `key=value` lines (no INI parser; manual line splitting)

**Methods**:
- `Load()` - reads the file if it exists, returns an `AppConfig` (with defaults)
- `Save(AppConfig)` - writes all 5 keys, overwriting the existing file

**Design**:
- Static class (no instances)
- Creates the directory if missing
- No validation (writes whatever values it is given)
- Could be improved with a JSON format (but keep it simple)
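
A minimal sketch of that `key=value` parsing (an assumed helper shape; the real `ConfigManager` may differ):

```csharp
// Sketch of the described manual parsing: split each line on the first '=',
// ignore unknown/malformed lines, keep defaults for anything missing.
public static AppConfig Load(string path)
{
    var config = new AppConfig();
    if (!File.Exists(path)) return config;

    foreach (var line in File.ReadAllLines(path))
    {
        var idx = line.IndexOf('=');
        if (idx <= 0) continue; // skip malformed lines
        var key = line[..idx].Trim();
        var value = line[(idx + 1)..].Trim();

        switch (key)
        {
            case "ApiKey": config.ApiKey = value; break;
            case "Model": config.Model = value; break;
            case "DefaultQueries": config.DefaultQueries = int.Parse(value); break;
            case "DefaultChunks": config.DefaultChunks = int.Parse(value); break;
            case "DefaultResults": config.DefaultResults = int.Parse(value); break;
        }
    }
    return config;
}
```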
## Data Models

### OpenQueryOptions

**Location**: `Models/OpenQueryOptions.cs`

**Type**: `record`

**Purpose**: Immutable options object passed through the workflow

**Properties**:

- `int Chunks` - top N chunks for context
- `int Results` - search results per query
- `int Queries` - number of expanded queries to generate
- `bool Short` - concise answer flag
- `bool Long` - detailed answer flag
- `bool Verbose` - verbose logging flag
- `string Question` - original user question

**Created**: In `Program.cs` from CLI options + config defaults

**Used By**: `OpenQueryApp.RunAsync()`

### Chunk

**Location**: `Models/Chunk.cs`

**Type**: `record`

**Purpose**: Content chunk with metadata and embedding

**Properties**:

- `string Content` - extracted text (~500 chars)
- `string SourceUrl` - article URL
- `string? Title` - article title (nullable)
- `float[]? Embedding` - vector embedding (populated by EmbeddingService)
- `float Score` - relevance score (populated during ranking)

**Lifecycle**:

1. Instantiated in `SearchTool.ExecuteParallelArticleFetchingAsync` with content, url, title
2. `Embedding` set in `ExecuteParallelEmbeddingsAsync` after batch processing
3. `Score` set in `RankAndSelectTopChunks` after cosine similarity
4. Serialized into the context string for the final answer

**Equality**: Records provide value equality (based on all properties)

### ParallelProcessingOptions

**Location**: `Models/ParallelOptions.cs`

**Type**: `class` (mutable)

**Purpose**: Concurrency settings for parallel operations

**Properties** (with defaults):

- `MaxConcurrentArticleFetches` = 10
- `MaxConcurrentEmbeddingRequests` = 4
- `EmbeddingBatchSize` = 300

**Used By**: `EmbeddingService` (for embeddings), `SearchTool` (for article fetching)

**Currently**: Hardcoded in the `SearchTool` constructor; could be made configurable

### OpenRouter Models (Models/OpenRouter.cs)

**Purpose**: DTOs for the OpenRouter API (JSON serializable)

**Chat Completion**:

- `ChatCompletionRequest` (model, messages, tools, stream)
- `ChatCompletionResponse` (choices[], usage)
- `Message` (role, content, tool_calls, tool_call_id)
- `ToolDefinition`, `ToolFunction`, `ToolCall`, `FunctionCall`
- `Choice`, `Usage`

**Embedding**:

- `EmbeddingRequest` (model, input[])
- `EmbeddingResponse` (data[], usage)
- `EmbeddingData` (embedding[], index)

**Streaming**:

- `StreamChunk` (TextDelta, Tool)
- `ChatCompletionChunk`, `ChunkChoice`, `ChunkDelta`

**JSON Properties**: Uses `[JsonPropertyName]` to match the API

**Serialization**: System.Text.Json with source generation (AppJsonContext)

### Searxng Models (Models/Searxng.cs)

**Purpose**: DTOs for SearxNG search results

**Records**:

- `SearxngRoot` with `List<SearxngResult> Results`
- `SearxngResult` with `Title`, `Url`, `Content` (snippet)

**Usage**: Deserialized from SearxNG's JSON response

### JsonContexts

**Location**: `Models/JsonContexts.cs`

**Purpose**: Source-generated JSON serializer context for AOT compatibility

**Pattern**:

```csharp
[JsonSerializable(typeof(ChatCompletionRequest))]
[JsonSerializable(typeof(ChatCompletionResponse))]
// ... etc ...
internal partial class AppJsonContext : JsonSerializerContext
{
}
```

**Generated**: Partial class completed by the source generator

**Used By**: All `JsonSerializer.Serialize/Deserialize` calls with `AppJsonContext.Default.{Type}`

**Benefits**:

- AOT-compatible (no reflection)
- Faster serialization (compiled delegates)
- Smaller binary (trimming-safe)
## Component Interactions

### Dependencies Graph

```
Program.cs
 ├── ConfigManager (load/save)
 ├── OpenRouterClient ──┐
 ├── SearxngClient ─────┤
 ├── EmbeddingService ──┤
 └── SearchTool ────────┤
                        │
OpenQueryApp ◄──────────┘
 │
 ├── OpenRouterClient (query gen + answer streaming)
 ├── SearchTool (pipeline)
 │    ├── SearxngClient (searches)
 │    ├── ArticleService (fetch)
 │    ├── ChunkingService (split)
 │    ├── EmbeddingService (embeddings)
 │    ├── RateLimiter (concurrency)
 │    └── StatusReporter (progress via callback)
 └── StatusReporter (UI)
```
### Data Flow Between Components

```
OpenQueryOptions
      ↓
OpenQueryApp
 ├─ Query Generation
 │   └─ OpenRouterClient.CompleteAsync()
 │       → List<string> generatedQueries
 │
 ├─ Search Pipeline
 │   └─ SearchTool.ExecuteAsync(originalQuery, generatedQueries, ...)
 │       ↓
 │      Phase 1: SearxngClient.SearchAsync(query) × N
 │       → ConcurrentBag<SearxngResult>
 │       → List<SearxngResult> (unique)
 │       ↓
 │      Phase 2: ArticleService.FetchArticleAsync(url) × M
 │       → ChunkingService.ChunkText(article.TextContent)
 │       → ConcurrentBag<Chunk> (content, url, title)
 │       ↓
 │      Phase 3: EmbeddingService.GetEmbeddingsAsync(chunkContents)
 │       → (queryEmbedding, chunkEmbeddings)
 │       ↓
 │      Phase 4: CosineSimilarity + Rank
 │       → List<Chunk> topChunks (with Score, Embedding set)
 │       ↓
 │      Format: context string with [Source N: Title](Url)
 │       → return context string
 │
 └─ Final Answer
     └─ OpenRouterClient.StreamAsync(prompt with context)
         → stream deltas to Console
```
### Interface Contracts

**SearchTool → Progress**:

```csharp
// Invoked as: onProgress?.Invoke("[Fetching article 1/10: example.com]")
Action<string>? onProgress
```

**StatusReporter ← Progress**:

```csharp
// Handler in OpenQueryApp:
(progress) => {
    if (options.Verbose) reporter.WriteLine(progress);
    else reporter.UpdateStatus(parsedShorterMessage);
}
```

**SearchTool → ArticleService**:

```csharp
Article article = await ArticleService.FetchArticleAsync(url);
```

**SearchTool → EmbeddingService**:

```csharp
(float[] queryEmbedding, float[][] chunkEmbeddings) = await ExecuteParallelEmbeddingsAsync(...);
// Also: embeddingService.GetEmbeddingAsync(text), GetEmbeddingsWithRateLimitAsync(...)
```

**SearchTool → ChunkingService**:

```csharp
List<string> chunks = ChunkingService.ChunkText(article.TextContent);
```

**SearchTool → RateLimiter**:

```csharp
await _rateLimiter.ExecuteAsync(async () => await _client.EmbedAsync(...), ct);
```

---
## Next Steps

- [OpenQueryApp](openquery-app.md) - Main orchestrator details
- [SearchTool](search-tool.md) - Pipeline implementation
- [Services](services.md) - All service classes documented
- [Models](models.md) - Complete data model reference

555
docs/components/search-tool.md
Normal file
@@ -0,0 +1,555 @@

# SearchTool Component

Deep dive into `SearchTool` - the core pipeline orchestrator that implements the 4-phase search-retrieve-rank workflow.

## Overview

`SearchTool` is the workhorse of OpenQuery. It takes search queries, fetches articles, generates embeddings, ranks by relevance, and returns formatted context for the final AI answer.

## Location

`Tools/SearchTool.cs`
## Class Definition

```csharp
public class SearchTool
{
    private readonly SearxngClient _searxngClient;
    private readonly EmbeddingService _embeddingService;
    private readonly ParallelProcessingOptions _options;

    public static string Name => "search";
    public static string Description => "Search the web for information on a topic";

    public SearchTool(
        SearxngClient searxngClient,
        EmbeddingService embeddingService);

    public Task<string> ExecuteAsync(
        string originalQuery,
        List<string> generatedQueries,
        int maxResults,
        int topChunksLimit,
        Action<string>? onProgress = null,
        bool verbose = true);
}
```

**Dependencies**:

- `SearxngClient` - for web searches
- `EmbeddingService` - for vector generation
- `ParallelProcessingOptions` - concurrency settings (hardcoded new instance)

**Static Properties**:

- `Name` - tool identifier (currently "search")
- `Description` - tool description
## ExecuteAsync Method

**Signature**:

```csharp
public async Task<string> ExecuteAsync(
    string originalQuery,          // User's original question
    List<string> generatedQueries, // Expanded search queries
    int maxResults,                // Results per query
    int topChunksLimit,            // Top N chunks to return
    Action<string>? onProgress,    // Progress callback
    bool verbose)                  // Verbose mode flag
```

**Returns**: `Task<string>` - formatted context with source citations

**Contract**:

- Never returns `null` (returns "No search results found." on zero results)
- Progress callback may be invoked frequently (many phases)
- `verbose` is passed to sub-components for their own logging
## The 4-Phase Pipeline

```
ExecuteAsync()
 │
 ├─ Phase 1: ExecuteParallelSearchesAsync
 │   Input:  generatedQueries × maxResults
 │   Output: List<SearxngResult> (deduplicated)
 │
 ├─ Phase 2: ExecuteParallelArticleFetchingAsync
 │   Input:  List<SearxngResult>
 │   Output: List<Chunk> (with content, url, title)
 │
 ├─ Phase 3: ExecuteParallelEmbeddingsAsync
 │   Input:  originalQuery + List<Chunk>
 │   Output: (queryEmbedding, chunkEmbeddings)
 │   (also sets Chunk.Embedding for valid chunks)
 │
 ├─ Phase 4: RankAndSelectTopChunks
 │   Input:  List<Chunk> + queryEmbedding + chunkEmbeddings
 │   Output: List<Chunk> topChunks (with Score set)
 │
 └─ Format Context → return string
```
### Phase 1: ExecuteParallelSearchesAsync

**Purpose**: Execute all search queries in parallel, collect and deduplicate results.

**Implementation**:

```csharp
var allResults = new ConcurrentBag<SearxngResult>();

var searchTasks = generatedQueries.Select(async query =>
{
    onProgress?.Invoke($"[Searching web for '{query}'...]");
    try
    {
        var results = await _searxngClient.SearchAsync(query, maxResults);
        foreach (var result in results)
        {
            allResults.Add(result);
        }
    }
    catch (Exception ex)
    {
        if (verbose)
            Console.WriteLine($"Warning: Search failed for query '{query}': {ex.Message}");
    }
});

await Task.WhenAll(searchTasks);

var uniqueResults = allResults.DistinctBy(r => r.Url).ToList();
return uniqueResults;
```

**Details**:

- `ConcurrentBag<SearxngResult>` collects results thread-safely
- `Task.WhenAll` - unbounded parallelism (up to `generatedQueries.Count` concurrent searches)
- Each task calls `_searxngClient.SearchAsync(query, maxResults)`
- Errors are caught and logged (verbose only); other queries continue
- `DistinctBy(r => r.Url)` removes duplicates
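The dedup step behaves as follows (`DistinctBy` keeps the first occurrence per key); the URLs and titles here are made up for illustration:

```csharp
using System;
using System.Linq;

// DistinctBy keeps the first element seen for each key — here, the URL —
// so the same page surfaced by two queries/engines survives only once.
var results = new[]
{
    (Url: "https://a.example/post", Title: "A (engine 1)"),
    (Url: "https://b.example/page", Title: "B"),
    (Url: "https://a.example/post", Title: "A (engine 2)"), // duplicate URL
};

var unique = results.DistinctBy(r => r.Url).ToList();
Console.WriteLine(unique.Count);    // 2
Console.WriteLine(unique[0].Title); // A (engine 1) — first occurrence wins
```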

**Return**: `List<SearxngResult>` (unique URLs only)

**Progress**: `[Searching web for '{query}'...]`

**Potential Issues**:

- Could overwhelm a local SearxNG instance if `generatedQueries` is large (100+)
- SearxNG itself may have its own rate limiting

**Future Enhancements**:

- Add a semaphore to limit search concurrency
- Add a timeout per search task
- Cache search results (same query across runs)
### Phase 2: ExecuteParallelArticleFetchingAsync

**Purpose**: Fetch each search result URL, extract article content, split into chunks.

**Implementation**:

```csharp
var chunks = new ConcurrentBag<Chunk>();
var completedFetches = 0;
var totalFetches = searchResults.Count;
var semaphore = new SemaphoreSlim(_options.MaxConcurrentArticleFetches); // 10

var fetchTasks = searchResults.Select(async result =>
{
    await semaphore.WaitAsync();
    try
    {
        var current = Interlocked.Increment(ref completedFetches);
        var uri = new Uri(result.Url);
        var domain = uri.Host;
        onProgress?.Invoke($"[Fetching article {current}/{totalFetches}: {domain}]");

        try
        {
            var article = await ArticleService.FetchArticleAsync(result.Url);
            if (!article.IsReadable || string.IsNullOrEmpty(article.TextContent))
                return;

            var textChunks = ChunkingService.ChunkText(article.TextContent);
            foreach (var chunkText in textChunks)
            {
                chunks.Add(new Chunk(chunkText, result.Url, article.Title));
            }
        }
        catch (Exception ex)
        {
            if (verbose)
                Console.WriteLine($"Warning: Failed to fetch article {result.Url}: {ex.Message}");
        }
    }
    finally
    {
        semaphore.Release();
    }
});

await Task.WhenAll(fetchTasks);
return chunks.ToList();
```

**Details**:

- `SemaphoreSlim` limits concurrency to `MaxConcurrentArticleFetches` (10)
- `Interlocked.Increment` for thread-safe progress counting
- Progress: `[Fetching article X/Y: domain]` (extracts the host from the URL)
- `ArticleService.FetchArticleAsync` uses SmartReader
- An article must be `IsReadable` and have `TextContent`
- `ChunkingService.ChunkText` splits into ~500-char pieces
- Each chunk becomes a `Chunk(content, url, article.Title)`
- Errors are logged (verbose only); failed URLs yield no chunks

**Return**: `List<Chunk>` (potentially many per article)

**Chunk Count Estimate**:

- 15 articles × average 3000 chars/article = 45,000 chars
- With 500-char chunks ≈ 90 chunks
- With natural breaks → maybe 70-80 chunks

**Potential Issues**:

- Some sites block SmartReader (JS-heavy, paywalls)
- Slow article fetches may cause long-tail latency
- Large articles create many chunks → memory + embedding cost

**Future Enhancements**:

- Add a per-URL timeout
- Filter chunks by a length threshold (skip tiny chunks)
- Deduplicate chunks across articles (same content on different sites)
- Cache article fetches by URL
### Phase 3: ExecuteParallelEmbeddingsAsync

**Purpose**: Generate embeddings for the original query and all chunks, with batching, rate limiting, and concurrency control.

**Implementation**:

```csharp
onProgress?.Invoke($"[Generating embeddings for {chunks.Count} chunks and query...]");

// Start query embedding (single) and chunk embeddings (batch) concurrently
var queryEmbeddingTask = _embeddingService.GetEmbeddingAsync(originalQuery);

var chunkTexts = chunks.Select(c => c.Content).ToList();
var chunkEmbeddingsTask = _embeddingService.GetEmbeddingsWithRateLimitAsync(
    chunkTexts, onProgress);

await Task.WhenAll(queryEmbeddingTask, chunkEmbeddingsTask);

var queryEmbedding = await queryEmbeddingTask;
var chunkEmbeddings = await chunkEmbeddingsTask;

// Filter out chunks with empty embeddings
var validChunks = new List<Chunk>();
var validEmbeddings = new List<float[]>();

for (var i = 0; i < chunks.Count; i++)
{
    if (chunkEmbeddings[i].Length > 0)
    {
        validChunks.Add(chunks[i]);
        validEmbeddings.Add(chunkEmbeddings[i]);
    }
}

// Update chunks with embeddings
for (var i = 0; i < validChunks.Count; i++)
{
    validChunks[i].Embedding = validEmbeddings[i];
}

return (queryEmbedding, validEmbeddings.ToArray());
```

**Details**:

- **Query embedding**: Single request for the original question (one embedding)
- **Chunk embeddings**: Batch processing of all chunk texts
- Both run concurrently via `Task.WhenAll`
- `_embeddingService.GetEmbeddingsWithRateLimitAsync` uses:
  - Batch size: 300 (default)
  - Max concurrent batches: 4 (default)
  - Polly retry (3 attempts, exponential backoff)
  - `RateLimiter` (semaphore) for API concurrency
- Failed batches return an empty `float[]` (length 0)
- Failed chunks are filtered out (they won't be ranked)
- `validChunks[i].Embedding = validEmbeddings[i]` attaches each embedding to its chunk

**Return**: `(float[] queryEmbedding, float[][] chunkEmbeddings)` where:

- `chunkEmbeddings` length = `validChunks.Count` (filtered)
- Order matches `validChunks` order (the parallel arrays are filtered together)

**Progress**: Interleaved from the embedding service's own progress callbacks (batch X/Y)

**Potential Issues**:

- `GetEmbeddingsWithRateLimitAsync` writes `results[batchIndex] = ...` from parallel tasks; this is safe because each task writes a distinct index of a pre-sized array, and writes to different array elements never overlap
- The filtering loop assumes `chunkEmbeddings` has the same count as `chunks`; `GetEmbeddingsWithRateLimitAsync` returns `results.SelectMany(r => r).ToArray()`, which preserves the input count (failed batches contribute empty arrays), so the indexing is safe

**Memory Consideration**:

- `chunkTexts` holds all chunk strings (may be large, but still in memory)
- `chunkEmbeddings` holds all float arrays (~600KB for 100 chunks)
- Total: modest (a few MB)

**Future Enhancements**:

- Stream embeddings? (No benefit; all are needed for ranking)
- Cache embeddings by content hash (cross-run)
- Support a different embedding model per query
### Phase 4: RankAndSelectTopChunks

**Purpose**: Score chunks by semantic relevance to the query, sort, and select the top N.

**Implementation**:

```csharp
var chunksWithEmbeddings = chunks.Where(c => c.Embedding != null).ToList();

foreach (var chunk in chunksWithEmbeddings)
{
    chunk.Score = EmbeddingService.CosineSimilarity(queryEmbedding, chunk.Embedding!);
}

var topChunks = chunksWithEmbeddings
    .OrderByDescending(c => c.Score)
    .Take(topChunksLimit)
    .ToList();

return topChunks;
```

**Details**:

- Filters to chunks that have embeddings (successful phase 3)
- For each: `Score = CosineSimilarity(queryEmbedding, chunkEmbedding)`
- Uses `TensorPrimitives.CosineSimilarity` (SIMD-accelerated)
- Returns a float typically in 0-1 (higher = more relevant)
- `OrderByDescending` - highest scores first
- `Take(topChunksLimit)` - select top N (from the `--chunks` option)
- Returns `List<Chunk>` (now with `Score` set)
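For clarity, the similarity measure can be unrolled by hand; this sketch is the manual equivalent of the SIMD `TensorPrimitives.CosineSimilarity` call the real code uses:

```csharp
using System;

// Cosine similarity = dot(a, b) / (|a| * |b|); accumulate in double for
// precision, return float to match the Chunk.Score type.
static float CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
}

Console.WriteLine(CosineSimilarity(new float[] { 1, 0 }, new float[] { 1, 0 })); // 1
Console.WriteLine(CosineSimilarity(new float[] { 1, 0 }, new float[] { 0, 1 })); // 0
```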

**Return**: Top N chunks ready for context formatting

**Complexity**:

- O(n) for scoring (where n = valid chunks, typically 50-100)
- O(n log n) for sorting (fast for n = 100)
- Negligible CPU time

**Edge Cases**:

- If `topChunksLimit` > `chunksWithEmbeddings.Count`, returns all (no padding)
- If all embeddings failed, returns an empty list
- Should handle `topChunksLimit == 0` (returns empty)
### Context Formatting (After Phase 4)

**Location**: In `ExecuteAsync`, after ranking:

```csharp
var context = string.Join("\n\n", topChunks.Select((c, i) =>
    $"[Source {i + 1}: {c.Title ?? "Unknown"}]({c.SourceUrl})\n{c.Content}"));

return context;
```

**Format**:

```
[Source 1: Article Title](https://example.com/article)
Chunk content text...

[Source 2: Another Title](https://example.com/another)
Chunk content text...

[Source 3: Third Title](https://example.com/third)
Chunk content text...
```

**Features**:

- Each source numbered 1, 2, 3... (matches the order of topChunks = descending relevance)
- Title, or "Unknown" if null
- Title is a markdown link to the original URL
- Chunk content as plain text (may contain its own formatting)
- Double newline between sources

**Rationale**:

- Markdown links allow copy-pasting into browsers
- Numbers allow the LLM to cite `[Source 1]` in the answer
- The original title helps the user recognize the source

**Potential Issues**:

- The LLM might misinterpret "Source 1" as a literal citation requirement
- If chunks contain markdown, it may conflict (no escaping)
- Some titles may contain markdown special characters (unlikely but possible)

**Alternative**: Could use XML-style tags or a more robust citation format.
## Error Handling & Edge Cases

### Empty Results Handling

At the end of `ExecuteAsync`:

```csharp
if (searchResults.Count == 0)
    return "No search results found.";

if (chunks.Count == 0)
    return "Found search results but could not extract readable content.";
```

These messages appear in the final answer (the LLM responds to these contexts).

### Partial Failures

- Some search queries fail → proceed with the others
- Some articles fail to fetch → continue
- Some embedding batches fail → those chunks are filtered out
- Ranking proceeds with whatever valid embeddings exist

### Verbose vs Compact Progress

The `verbose` parameter affects what's passed to the phases:

- **Article fetching**: errors shown only if `verbose`
- **Embeddings**: always shows batch progress via `onProgress` (from EmbeddingService)
- **Searches**: no error suppression (warnings always logged to Console, not through the callback)

### Progress Callback Pattern

`onProgress` is invoked at major milestones:

- Searching: `[Searching web for '{query}'...]`
- Article fetch: `[Fetching article X/Y: domain]`
- Embeddings: `[Generating embeddings: batch X/Y]`
- Final: `[Found top X most relevant chunks overall. Generating answer...]`

Each phase may invoke it many times (e.g., embedding batches). `StatusReporter` handles these appropriately.
## Performance Characteristics

### Time Estimate per Phase (typical: 3 queries, 5 results each, ~15 articles)

| Phase | Time | Dominated By |
|-------|------|--------------|
| Searches | 3-8s | Network latency to SearxNG |
| Article Fetching | 5-15s | Network + SmartReader CPU |
| Embeddings | 2-4s | OpenRouter API latency (4 concurrent batches) |
| Ranking | <0.1s | CPU (O(n log n) sort, n ≈ 100) |
| **Total Pipeline** | **10-30s** | Articles + Searches |

### Concurrency Limits Effect

**Article Fetching** (`MaxConcurrentArticleFetches` = 10):

- 15 articles → 2 waves (10, then 5)
- If each takes 2s → ~4s total (vs 30s sequential)

**Embedding Batching** (`MaxConcurrentEmbeddingRequests` = 4, `EmbeddingBatchSize` = 300):

- 80 chunks → 1 batch of up to 300 (all fit)
- Even 300 chunks fit in 1 batch; the limit of 4 concurrent batches only matters for larger inputs or multiple embedding calls
- Here: a single embedding call with 80 items = 1 batch (no parallelism needed)

### Memory Usage

- `searchResults` (15 items) → ~30KB
- `chunks` (80 items × 500 chars) → ~40KB text + ~480KB embeddings (80 × 1536 × 4 bytes)
- Total ≈ 500KB excluding temporary HTTP buffers
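The embedding arithmetic, assuming 1536-dimension float32 vectors (the dimension is taken from the estimate above, not a documented constant), works out as:

```csharp
using System;

// 80 chunks × 1536 dims × 4 bytes per float32
const int chunks = 80, dims = 1536, bytesPerFloat = 4;
var embeddingBytes = chunks * dims * bytesPerFloat;
Console.WriteLine(embeddingBytes);        // 491520
Console.WriteLine(embeddingBytes / 1024); // 480 (KB)
```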
## Design Decisions

### Why Use ConcurrentBag for Results/Chunks?

A thread-safe collection lets parallel tasks add items without locks. `ConcurrentBag` does not guarantee enumeration order, so the sequence seen by `ToList()` and `DistinctBy` (which keeps the first occurrence) is nondeterministic. That is acceptable here because order doesn't matter: ranking is semantic. If insertion order mattered, a `ConcurrentQueue` or explicit sorting by source would be needed.

### Why Not Use Parallel.ForEach for Article Fetching?

We use `Task.WhenAll` with `Select` + a semaphore. `Parallel.ForEachAsync` could also work, but it requires .NET 6+ and we want the same pattern as the other phases. The semaphore gives explicit concurrency control.

### Why Separate Query Embedding from Chunk Embeddings?

`GetEmbeddingAsync` is called directly (not batched) because there's only one query. It could be batched with the chunks, but:

- The query is small (a single string)
- The batch API has overhead (request structure)
- A separate call allows independent completion (no need to wait for chunks to start the query embedding)

### Why Two Different Embedding Methods?

`EmbeddingService` has:

- `GetEmbeddingsWithRateLimitAsync` (used in SearchTool)
- `GetEmbeddingsAsync` (similar but a different implementation)

Probably a legacy/refactor artifact. Could be consolidated.

### Why Not Deduplicate URLs Earlier?

Deduplication happens after search aggregation. Results could also be deduplicated within each search (SearxNG might already dedupe across engines), but the global dedupe is necessary regardless.

### Why Not Early Filtering (e.g., by domain, length)?

It could improve quality:

- Filter by domain reputation
- Filter articles that are too short (<200 chars) or too long (>50KB)
- Not implemented (kept simple)
## Testing Considerations

**Unit Testability**: `SearchTool` is fairly testable with mocks:

- Mock `SearxngClient` to return predetermined results
- Mock `ArticleService` via `EmbeddingService` (or mock that too)
- Verify progress callback invocations
- Verify the final context format

**Integration Testing**:

- End-to-end with real/mocked external services
- Needs a test SearxNG instance and a test OpenRouter key (or mocked responses)

**Performance Testing**:

- Benchmark with different concurrency settings
- Profile memory for large result sets (1000+ articles)
- Measure embedding API latency impact
## Known Issues

### Bug in ExecuteParallelEmbeddingsAsync?

The actual source of `ExecuteParallelEmbeddingsAsync` in the core `SearchTool` passes chunk text (not embeddings) to the embedding service:

```csharp
var chunkTexts = chunks.Select(c => c.Content).ToList();
var chunkEmbeddingsTask = _embeddingService.GetEmbeddingsWithRateLimitAsync(
    chunkTexts, onProgress);
```

This is correct; there is no bug here.

### Potential Race Condition in GetEmbeddingsWithRateLimitAsync

```csharp
results[batchIndex] = batchResults;
```

This writes to an array index from multiple parallel tasks. Each task writes a distinct index, and writes to different array elements never overlap, so this is safe.

### Progress Callback May Overwhelm

If invoked synchronously from many parallel tasks, progress messages could saturate the channel. `Channel.TryWrite` returns false when the buffer is full, and the return value is ignored, so messages can be dropped under heavy load. This is acceptable for a CLI UI: some messages may be lost, but overall progress remains visible.
## Related Components

- **[OpenQueryApp](openquery-app.md)** - calls this component
- **[SearxngClient](../../services/SearxngClient.md)** - phase 1
- **[ArticleService](../../services/ArticleService.md)** - phase 2a
- **[ChunkingService](../../services/ChunkingService.md)** - phase 2b
- **[EmbeddingService](../../services/EmbeddingService.md)** - phase 3
- **[Ranking](../../services/EmbeddingService.md#cosinesimilarity)** - cosine similarity

---

## Next Steps

- [Services Overview](../services/overview.md) - See supporting services
- [CLI Reference](../../api/cli.md) - How users trigger this pipeline
- [Performance](../performance.md) - Optimize pipeline settings
471
docs/components/services.md
Normal file
@@ -0,0 +1,471 @@
# Services Overview

Comprehensive reference for all service classes in OpenQuery.

## 📋 Table of Contents

1. [Service Catalog](#service-catalog)
2. [Client Services](#client-services)
3. [Processing Services](#processing-services)
4. [Infrastructure Services](#infrastructure-services)
5. [Service Interactions](#service-interactions)

## Service Catalog

OpenQuery's services are organized into three categories:

| Category | Services | Purpose |
|----------|----------|---------|
| **Clients** | `OpenRouterClient`, `SearxngClient` | External API communication |
| **Processors** | `EmbeddingService`, `ChunkingService`, `ArticleService` | Data transformation & extraction |
| **Infrastructure** | `RateLimiter`, `StatusReporter` | Cross-cutting concerns |

All services are **stateless** (apart from internal configuration) and can be safely reused across multiple operations.

---

## Client Services

### OpenRouterClient

**Location**: `Services/OpenRouterClient.cs`
**Purpose**: HTTP client for OpenRouter AI APIs (chat completions & embeddings)

#### API Endpoints

| Method | Endpoint | Purpose |
|--------|----------|---------|
| POST | `/chat/completions` | Chat completion (streaming or non-streaming) |
| POST | `/embeddings` | Embedding generation for text inputs |

#### Authentication

```
Authorization: Bearer {apiKey}
Accept: application/json
```

#### Public Methods

##### `StreamAsync(ChatCompletionRequest request, CancellationToken cancellationToken)`

- **Returns**: `IAsyncEnumerable<StreamChunk>`
- **Behavior**: Sets `request.Stream = true`, posts the request, and reads the Server-Sent Events stream
- **Use Case**: Final answer streaming, real-time responses
- **Stream Format**: SSE lines of the form `data: {json}`; yields `TextDelta` or `ToolCall` chunks
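As a rough illustration of that SSE framing (not the project's actual implementation; `ParseSsePayloads` is a hypothetical helper), each `data:` line carries one JSON chunk and `data: [DONE]` terminates the stream:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of SSE framing as StreamAsync might consume it:
// each "data: {json}" line carries one chunk; "data: [DONE]" ends the stream.
IEnumerable<string> ParseSsePayloads(IEnumerable<string> lines)
{
    foreach (var line in lines)
    {
        if (!line.StartsWith("data: ")) continue;      // skip comments and blank lines
        var payload = line.Substring("data: ".Length);
        if (payload == "[DONE]") yield break;          // end-of-stream sentinel
        yield return payload;                          // raw JSON for one chunk
    }
}

var payloads = ParseSsePayloads(new[] { "data: {\"a\":1}", "", "data: [DONE]" }).ToList();
Console.WriteLine(payloads.Count);   // one payload before the sentinel
```

The real client then deserializes each payload into a `StreamChunk` before yielding it.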
##### `CompleteAsync(ChatCompletionRequest request)`

- **Returns**: `Task<ChatCompletionResponse>`
- **Behavior**: Sets `request.Stream = false`, posts the request, returns the full response
- **Use Case**: Query generation (non-streaming)

##### `EmbedAsync(string model, List<string> inputs)`

- **Returns**: `Task<float[][]>`
- **Behavior**: POST `/embeddings`, returns an array of vectors ordered by input index
- **Use Case**: Batch embedding generation

##### `HttpClient`

- **Property**: Internal `_httpClient` (created per instance)
- **Note**: Could use `IHttpClientFactory` for pooling (not needed for a CLI)

#### Error Handling

- `EnsureSuccessStatusCode()` throws `HttpRequestException` on 4xx/5xx
- No retry logic (handled by `EmbeddingService`)

#### Configuration

```csharp
public OpenRouterClient(string apiKey)
{
    _apiKey = apiKey;
    _httpClient = new HttpClient();
    _httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
    _httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
}
```

#### Example Usage

```csharp
var client = new OpenRouterClient("sk-or-...");
var request = new ChatCompletionRequest("model", new List<Message> { ... });
await foreach (var chunk in client.StreamAsync(request))
{
    Console.Write(chunk.TextDelta);
}
```

---

### SearxngClient

**Location**: `Services/SearxngClient.cs`
**Purpose**: HTTP client for the SearxNG metasearch engine

#### API Endpoint

```
GET /search?q={query}&format=json
```

#### Constructor

```csharp
public SearxngClient(string baseUrl) // e.g., "http://localhost:8002"
```

- `baseUrl` is trimmed of any trailing `/`

#### Public Methods

##### `SearchAsync(string query, int limit = 10)`

- **Returns**: `Task<List<SearxngResult>>`
- **Behavior**: Issues the GET request, deserializes the JSON, and takes up to `limit` results
- **On Failure**: Returns an empty `List<SearxngResult>` (no exception)

#### Error Handling

- `response.EnsureSuccessStatusCode()` would throw, but the code does not call it
- Invalid JSON or a missing `Results` property yields an empty list
- Failures are **tolerated**: individual search queries may fail without aborting the whole operation
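Assembling the request URL is the only delicate part, since queries may contain spaces or reserved characters. A hypothetical sketch (`BuildSearchUrl` is illustrative, not OpenQuery's actual code):

```csharp
using System;

// Illustrative only: build the SearxNG search URL the way SearchAsync might.
// Uri.EscapeDataString protects queries containing spaces, '&', '#', etc.
string BuildSearchUrl(string baseUrl, string query) =>
    $"{baseUrl.TrimEnd('/')}/search?q={Uri.EscapeDataString(query)}&format=json";

Console.WriteLine(BuildSearchUrl("http://localhost:8002/", "quantum entanglement"));
// → http://localhost:8002/search?q=quantum%20entanglement&format=json
```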
#### Example SearxNG Response

```json
{
  "results": [
    {
      "title": "Quantum Entanglement - Wikipedia",
      "url": "https://en.wikipedia.org/wiki/Quantum_entanglement",
      "content": "Quantum entanglement is a physical phenomenon..."
    },
    ...
  ]
}
```

---

## Processing Services

### EmbeddingService

**Location**: `Services/EmbeddingService.cs`
**Purpose**: Generate embeddings with batching, rate limiting, and retry logic

#### Configuration

**Embedding Model**: `openai/text-embedding-3-small` (default, configurable via constructor)

**ParallelProcessingOptions** (hardcoded defaults):

```csharp
public class ParallelProcessingOptions
{
    public int MaxConcurrentEmbeddingRequests { get; set; } = 4;
    public int EmbeddingBatchSize { get; set; } = 300;
}
```

#### Public Methods

##### `GetEmbeddingsAsync(List<string> texts, Action<string>? onProgress, CancellationToken)`

- **Returns**: `Task<float[][]>`
- **Behavior**:
  - Splits `texts` into batches of `EmbeddingBatchSize`
  - Executes batches in parallel (at most `MaxConcurrentEmbeddingRequests` concurrent)
  - Each batch: rate-limited, retry-wrapped `client.EmbedAsync(model, batch)`
  - Reassembles results in the original order
  - Failed batches produce an empty `float[]` for each of their texts
- **Progress**: Invokes `onProgress` for each batch: `"[Generating embeddings: batch X/Y]"`
- **Thread-Safe**: Uses a lock when collecting results
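The split-into-batches / reassemble-in-order step can be sketched as follows. This is a simplified, synchronous illustration under assumed names (`EmbedInBatches` and the fake embedder are not OpenQuery's API); the key point is that results are indexed by batch number, so parallel completion order cannot scramble the final ordering:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: split inputs into fixed-size batches, "embed" each batch,
// then flatten back so output[i] corresponds to texts[i].
List<float[]> EmbedInBatches(List<string> texts, int batchSize,
    Func<List<string>, List<float[]>> embedBatch)
{
    var batches = texts
        .Select((t, i) => (t, batch: i / batchSize))
        .GroupBy(x => x.batch)
        .Select(g => g.Select(x => x.t).ToList())
        .ToList();

    var results = new List<float[]>[batches.Count];
    for (var b = 0; b < batches.Count; b++)        // would run in parallel in the real service
        results[b] = embedBatch(batches[b]);

    return results.SelectMany(r => r).ToList();    // reassemble in original order
}

// Fake embedder: a one-element vector holding the text length.
var vecs = EmbedInBatches(new List<string> { "a", "bb", "ccc", "dddd", "eeeee" }, 2,
    batch => batch.Select(t => new[] { (float)t.Length }).ToList());
Console.WriteLine(string.Join(",", vecs.Select(v => v[0])));   // → 1,2,3,4,5
```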
##### `GetEmbeddingAsync(string text, CancellationToken)`

- **Returns**: `Task<float[]>`
- **Behavior**: Single embedding with rate limiting and retry
- **Use Case**: Query embedding

##### `CosineSimilarity(float[] vector1, float[] vector2)`

- **Signature**: `static float CosineSimilarity(float[] vector1, float[] vector2)`
- **Returns**: A float between -1 and 1 (typically 0-1 for normalized embeddings)
- **Implementation**: A single line calling the SIMD-accelerated `System.Numerics.Tensors.TensorPrimitives.CosineSimilarity`
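For reference, cosine similarity is just the dot product normalized by the vector magnitudes. A plain scalar version, equivalent in result to the SIMD primitive but not the code the service actually uses:

```csharp
using System;

// Scalar cosine similarity: dot(a, b) / (|a| * |b|).
float CosineSimilarity(float[] a, float[] b)
{
    float dot = 0, magA = 0, magB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(magA) * MathF.Sqrt(magB));
}

Console.WriteLine(CosineSimilarity(new[] { 1f, 0f }, new[] { 1f, 0f }));  // identical → 1
Console.WriteLine(CosineSimilarity(new[] { 1f, 0f }, new[] { 0f, 1f }));  // orthogonal → 0
```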
---

### ArticleService

**Location**: `Services/ArticleService.cs`
**Purpose**: Extract clean article content from web URLs

#### Public Methods

##### `FetchArticleAsync(string url)`

- **Returns**: `Task<Article>`
- **Behavior**: Delegates to `SmartReader.ParseArticleAsync(url)`
- **Result**: `Article` with `Title`, `TextContent`, `IsReadable`, and metadata

#### Errors

- Propagates exceptions (SmartReader may throw on network failures or malformed HTML)
- `SearchTool` catches and logs them

#### SmartReader Notes

- Open-source article extraction library (bundled via NuGet)
- Uses the Readability algorithm (similar to Firefox Reader View)
- Removes ads, navigation, and boilerplate
- `IsReadable` indicates quality (e.g., not a 404 page, not too short)

---

### ChunkingService

**Location**: `Services/ChunkingService.cs`
**Purpose**: Split text into 500-character chunks at natural boundaries

#### Public Methods

##### `ChunkText(string text)`

- **Returns**: `List<string>`
- **Algorithm**:
  - Constant `MAX_CHUNK_SIZE = 500`
  - While text remains:
    - Take up to 500 chars
    - If not at the end, backtrack to the last of `[' ', '\n', '\r', '.', '!']`
    - Trim and add the chunk if non-empty
    - Advance the start position
  - Returns all chunks

#### Characteristics

- Static class (no instances)
- Pure function (no side effects)
- Zero dependencies
- Handles edge cases (empty text, short text, text without break characters)
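A minimal sketch of that loop (illustrative only; the real `ChunkingService` may differ in details such as how it handles the break character at the boundary):

```csharp
using System;
using System.Collections.Generic;

const int MaxChunkSize = 500;
char[] breaks = { ' ', '\n', '\r', '.', '!' };

List<string> ChunkText(string text)
{
    var chunks = new List<string>();
    var start = 0;
    while (start < text.Length)
    {
        var len = Math.Min(MaxChunkSize, text.Length - start);
        if (start + len < text.Length)                          // not at the end: seek a boundary
        {
            var brk = text.LastIndexOfAny(breaks, start + len - 1, len);
            if (brk > start) len = brk - start + 1;             // cut just after the break char
        }
        var chunk = text.Substring(start, len).Trim();
        if (chunk.Length > 0) chunks.Add(chunk);
        start += len;
    }
    return chunks;
}

// 300 a's, a sentence break, then 300 b's: splits at the "." instead of at char 500.
var parts = ChunkText(new string('a', 300) + ". " + new string('b', 300));
Console.WriteLine(parts.Count);   // → 2
```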
---

## Infrastructure Services

### RateLimiter

**Location**: `Services/RateLimiter.cs`
**Purpose**: Limit concurrent operations using a semaphore

#### Constructor

```csharp
public RateLimiter(int maxConcurrentRequests)
```

Creates a `SemaphoreSlim` with `maxConcurrentRequests` slots.

#### Public Methods

##### `ExecuteAsync<T>(Func<Task<T>> action, CancellationToken)`

```csharp
public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, CancellationToken cancellationToken = default)
{
    await _semaphore.WaitAsync(cancellationToken);
    try
    {
        return await action();
    }
    finally
    {
        _semaphore.Release();
    }
}
```

- Waits for a semaphore slot
- Executes `action` (typically an API call)
- Releases the semaphore (even on exception)
- Returns the result of `action`

##### `ExecuteAsync(Func<Task> action, CancellationToken)`

- Non-generic version (for void-returning actions)

#### Disposal

```csharp
public async ValueTask DisposeAsync()
{
    _semaphore.Dispose();
}
```

Implements `IAsyncDisposable` for async cleanup.

#### Usage Pattern

```csharp
var result = await _rateLimiter.ExecuteAsync(async () =>
{
    return await SomeApiCall();
}, cancellationToken);
```

#### Where Used

- `EmbeddingService`: Limits concurrent embedding batch requests (default 4)

---
### StatusReporter

**Location**: `Services/StatusReporter.cs`
**Purpose**: Real-time progress display with a spinner (compact mode) or verbose lines

#### Constructor

```csharp
public StatusReporter(bool verbose)
```

- `verbose = true`: all progress via `WriteLine()` (no spinner)
- `verbose = false`: spinner showing the latest status

#### Architecture

**Components**:

- `Channel<string> _statusChannel` - producer-consumer queue
- `Task _statusProcessor` - background task reading from the channel
- `CancellationTokenSource _spinnerCts` - spinner task cancellation
- `Task _spinnerTask` - spinner animation task
- `char[] _spinnerChars` - Braille spinner pattern

**Spinner Animation**:

- Runs at 10 FPS (100 ms interval)
- Cycles through `['⠋','⠙','⠹','⠸','⠼','⠴','⠦','⠧','⠇','⠏']`
- Displays: `⠋ Fetching articles...`
- Updates in place using ANSI: `\r\x1b[K` (carriage return + erase line)

#### Public Methods

##### `UpdateStatus(string message)`

- Fire-and-forget: writes to the channel via `TryWrite` (non-blocking)
- If the channel is full, the message is dropped (acceptable loss for a UI)

##### `WriteLine(string text)`

- Stops the spinner temporarily
- Clears the current status line
- Writes `text` followed by a newline
- In verbose mode: just `Console.WriteLine(text)`

##### `ClearStatus()`

- In compact mode: `Console.Write("\r\x1b[K")` (erase line)
- In verbose mode: no-op
- Sets `_currentMessage = null`

##### `StartSpinner()` / `StopSpinner()`

- Manual control (normally `StartSpinner` is called from the constructor and `StopSpinner` from `Dispose`)

##### `Dispose()`

- Completes the channel writer
- Awaits `_statusProcessor` completion
- Calls `StopSpinner()`

#### Background Processing

**Status Processor**:

```csharp
private async Task ProcessStatusUpdatesAsync()
{
    await foreach (var message in _statusChannel.Reader.ReadAllAsync())
    {
        if (_verbose)
        {
            Console.WriteLine(message);
            continue;
        }
        Console.Write("\r\x1b[K"); // Clear line
        Console.Write($"{_spinnerChars[0]} {message}"); // Static spinner frame
        _currentMessage = message;
    }
}
```

**Spinner Task**:

```csharp
_spinnerTask = Task.Run(async () =>
{
    var index = 0;
    while (!_spinnerCts.Token.IsCancellationRequested)
    {
        if (_currentMessage != null)
        {
            Console.Write("\r\x1b[K");
            var charIndex = index++ % _spinnerChars.Length;
            Console.Write($"{_spinnerChars[charIndex]} {_currentMessage}");
        }
        await Task.Delay(100, _spinnerCts.Token);
    }
});
```

#### Thread Safety

- `UpdateStatus` (producer) writes to the channel
- `ProcessStatusUpdatesAsync` (consumer) reads from the channel
- `_spinnerTask` runs concurrently
- All UI writes happen in the consumer/spinner task context (single-threaded UI)

#### Design Notes

- Could be simplified: just use `Console.CursorLeft` for the spinner, with no channel
- The channel allows `UpdateStatus` calls from any thread without blocking
- The Braille spinner requires a terminal that supports Unicode (most modern terminals do)

---
## Service Interactions

### Dependency Graph

```
OpenQueryApp
├── OpenRouterClient ← (used for query gen + final answer)
└── SearchTool
    ├── SearxngClient
    ├── ArticleService (uses SmartReader)
    ├── ChunkingService (static)
    ├── EmbeddingService
    │   ├── OpenRouterClient (different instance)
    │   └── RateLimiter
    └── ParallelProcessingOptions (config)
```

### Service Lifetimes

All services are **transient** (new instances per query execution):

- `OpenRouterClient` → 1 instance for query generation + answer
- `SearxngClient` → 1 instance for all searches
- `EmbeddingService` → 1 instance with its own `OpenRouterClient` and `RateLimiter`
- `SearchTool` → 1 instance per query (constructed in `Program.cs`)

No singleton or static state (except static utility classes like `ChunkingService`).

### Data Flow Through Services

```
OpenQueryApp
│
├─ OpenRouterClient.CompleteAsync() → query generation
│    Messages → JSON → HTTP request → response → JSON → Messages
│
└─ SearchTool.ExecuteAsync()
     │
     ├─ SearxngClient.SearchAsync() × N
     │    query → URL encode → GET → JSON → SearxngResult[]
     │
     ├─ ArticleService.FetchArticleAsync() × M
     │    URL → HTTP GET → SmartReader → Article
     │
     ├─ ChunkingService.ChunkText() × M
     │    Article.TextContent → List<string> chunks
     │
     ├─ EmbeddingService.GetEmbeddingAsync(query) + GetEmbeddingsAsync(chunks[])
     │    texts → batches → rate-limited HTTP POST → JSON → float[][]
     │
     ├─ CosineSimilarity(queryEmbedding, chunkEmbedding) × M
     │    vectors → dot product → magnitudes → score
     │
     └─ return context string (formatted chunks)
```

---

## Next Steps

- **[OpenQueryApp](../components/openquery-app.md)** - Orchestrates services
- **[SearchTool](../components/search-tool.md)** - Coordinates the pipeline
- **[Models](../components/models.md)** - Data structures passed between services
- **[API Reference](../../api/cli.md)** - CLI that uses these services

---

**Service Design Principles**:

- Single Responsibility: each service does one thing well
- Stateless: no instance state beyond constructor arguments
- Composable: services depend on abstractions (other services), not implementations
- Testable: dependencies can be mocked for unit testing
356
docs/configuration.md
Normal file
@@ -0,0 +1,356 @@
# Configuration

Complete guide to configuring OpenQuery for your environment.

## 📋 Table of Contents

1. [Configuration Methods](#configuration-methods)
2. [Configuration File](#configuration-file)
3. [Environment Variables](#environment-variables)
4. [Command-Line Options](#command-line-options)
5. [Configuration Priority](#configuration-priority)
6. [Recommended Settings](#recommended-settings)
7. [Advanced Configuration](#advanced-configuration)

## Configuration Methods

OpenQuery can be configured through three methods, which merge together with a clear priority:

| Method | Persistence | Use Case |
|--------|-------------|----------|
| Configuration File | Permanent | Default values you use daily |
| Environment Variables | Session/Shell | CI/CD, scripting, temporary overrides |
| Command-Line Options | Per-execution | One-off customizations |

## Configuration File

### Location

OpenQuery follows the [XDG Base Directory](https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html) specification:

- **Linux/macOS**: `~/.config/openquery/config`
- **Windows**: `%APPDATA%\openquery\config` (e.g., `C:\Users\<user>\AppData\Roaming\openquery\config`)

### Format

Simple `key=value` pairs, one per line:

```ini
ApiKey=sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Model=qwen/qwen3.5-flash-02-23
DefaultQueries=3
DefaultChunks=3
DefaultResults=5
```

### Schema

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `ApiKey` | string | "" | OpenRouter API authentication key |
| `Model` | string | `qwen/qwen3.5-flash-02-23` | Default LLM model to use |
| `DefaultQueries` | int | 3 | Number of search queries to generate |
| `DefaultChunks` | int | 3 | Number of top context chunks to include |
| `DefaultResults` | int | 5 | Number of search results per query |
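Parsing this format is straightforward. A minimal sketch (illustrative, not the project's actual loader; the comment-skipping rule is an assumption) splits each line on the first `=` so values may themselves contain `=`:

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: parse key=value lines into a dictionary.
Dictionary<string, string> ParseConfig(string text)
{
    var config = new Dictionary<string, string>();
    foreach (var raw in text.Split('\n'))
    {
        var line = raw.Trim();
        if (line.Length == 0 || line.StartsWith("#")) continue;  // skip blanks/comments (assumed)
        var eq = line.IndexOf('=');
        if (eq <= 0) continue;                                   // skip malformed lines
        config[line.Substring(0, eq).Trim()] = line.Substring(eq + 1).Trim();
    }
    return config;
}

var cfg = ParseConfig("ApiKey=sk-or-abc\nDefaultQueries=3\n");
Console.WriteLine(cfg["DefaultQueries"]);   // → 3
```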
### Example Configurations

**Minimal** (just the API key):

```ini
ApiKey=sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

**Optimized for Research**:

```ini
ApiKey=sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Model=google/gemini-3-flash-preview
DefaultQueries=5
DefaultChunks=4
DefaultResults=10
```

**Cost-Conscious**:

```ini
ApiKey=sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Model=qwen/qwen3.5-flash-02-23
DefaultQueries=2
DefaultChunks=2
DefaultResults=3
```

## Environment Variables

Environment variables override the configuration file and can be set temporarily or permanently in your shell profile.

### Available Variables

| Variable | Purpose | Required | Example |
|----------|---------|----------|---------|
| `OPENROUTER_API_KEY` | OpenRouter API key | **Yes** (unless in config file) | `export OPENROUTER_API_KEY="sk-or-..."` |
| `OPENROUTER_MODEL` | Override default LLM model | No | `export OPENROUTER_MODEL="deepseek/deepseek-v3.2"` |
| `SEARXNG_URL` | URL of SearxNG instance | No (default: `http://localhost:8002`) | `export SEARXNG_URL="https://searx.example.com"` |

### Setting Environment Variables

#### Temporary (Current Session)

```bash
# Linux/macOS
export OPENROUTER_API_KEY="sk-or-..."
export SEARXNG_URL="http://localhost:8002"

# Windows PowerShell
$env:OPENROUTER_API_KEY="sk-or-..."
$env:SEARXNG_URL="http://localhost:8002"
```

#### Permanent (Shell Profile)

**bash** (`~/.bashrc` or `~/.bash_profile`):

```bash
export OPENROUTER_API_KEY="sk-or-..."
export SEARXNG_URL="http://localhost:8002"
```

**zsh** (`~/.zshrc`):

```zsh
export OPENROUTER_API_KEY="sk-or-..."
export SEARXNG_URL="http://localhost:8002"
```

**fish** (`~/.config/fish/config.fish`):

```fish
set -x OPENROUTER_API_KEY "sk-or-..."
set -x SEARXNG_URL "http://localhost:8002"
```

**Windows** (PowerShell profile):

```powershell
[Environment]::SetEnvironmentVariable("OPENROUTER_API_KEY", "sk-or-...", "User")
[Environment]::SetEnvironmentVariable("SEARXNG_URL", "http://localhost:8002", "User")
```

After editing profile files, restart your terminal or run `source ~/.bashrc` (or the equivalent for your shell).

### Security Note

Never commit your API key to version control. Use environment variables, or keep the key in the config file: it needs no `.gitignore` entry because it lives outside the project directory (`~/.config/`). The default `.gitignore` already excludes common build directories.

## Command-Line Options

Options passed directly to the `openquery` command override both the config file and environment variables for that specific execution.

### Main Command Options

```bash
openquery [OPTIONS] <question>
```

| Option | Aliases | Type | Default Source | Description |
|--------|---------|------|----------------|-------------|
| `--chunks` | `-c` | int | Config `DefaultChunks` | Number of top context chunks |
| `--results` | `-r` | int | Config `DefaultResults` | Search results per query |
| `--queries` | `-q` | int | Config `DefaultQueries` | Number of search queries |
| `--short` | `-s` | bool | false | Request a concise answer |
| `--long` | `-l` | bool | false | Request a detailed answer |
| `--verbose` | `-v` | bool | false | Show detailed progress |

### Configure Command Options

```bash
openquery configure [OPTIONS]
```

| Option | Type | Description |
|--------|------|-------------|
| `--interactive` / `-i` | bool | Launch interactive configuration wizard |
| `--key` | string | Set API key |
| `--model` | string | Set default model |
| `--queries` | int? | Set default queries |
| `--chunks` | int? | Set default chunks |
| `--results` | int? | Set default results |

## Configuration Priority

When OpenQuery needs a value, it checks sources in this order (highest to lowest priority):

1. **Command-line option** (if provided)
2. **Environment variable** (if set)
3. **Configuration file** (if the key exists)
4. **Hard-coded default** (if all of the above are missing)
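That lookup order amounts to a chain of null-coalescing fallbacks; a hypothetical sketch (`Resolve` and its parameters are illustrative, not OpenQuery's API):

```csharp
using System;

// Illustrative only: the first non-null source wins, mirroring the priority list above.
string Resolve(string? cli, string? env, string? configFile, string hardDefault) =>
    cli ?? env ?? configFile ?? hardDefault;

Console.WriteLine(Resolve(null, "deepseek/deepseek-v3.2", "qwen/qwen3.5-flash-02-23", "fallback"));
// → deepseek/deepseek-v3.2  (env wins because no CLI option was given)
```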
### Examples
|
||||||
|
|
||||||
|
**Example 1**: Environment overrides config
|
||||||
|
```bash
|
||||||
|
# config file: DefaultQueries=5
|
||||||
|
export OPENROUTER_MODEL="deepseek/deepseek-v3.2"
|
||||||
|
openquery --queries 2 "question" # Uses: queries=2 (CLI), model=deepseek (env), chunks=3 (config)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Example 2**: CLI overrides everything
|
||||||
|
```bash
|
||||||
|
export OPENROUTER_MODEL="qwen/qwen3.5-flash-02-23"
|
||||||
|
openquery --model "google/gemini-3-flash-preview" --chunks 5 "question"
|
||||||
|
# Uses: model=google (CLI), chunks=5 (CLI), queries=3 (default)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Example 3**: All sources combined
|
||||||
|
```bash
|
||||||
|
# config: DefaultChunks=4
|
||||||
|
# env: OPENROUTER_MODEL="moonshotai/kimi-k2.5", SEARXNG_URL="http://custom:8002"
|
||||||
|
# CLI: --queries 6 --short
|
||||||
|
openquery "question"
|
||||||
|
# Uses: queries=6 (CLI), chunks=4 (config), results=5 (config),
|
||||||
|
# model=kimi-k2.5 (env), searxng=custom (env), short=true (CLI)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Recommended Settings
|
||||||
|
|
||||||
|
### For Quick Questions (Facts, Definitions)
|
||||||
|
```bash
|
||||||
|
openquery -q 2 -r 3 -c 2 "What is the capital of France?"
|
||||||
|
```
|
||||||
|
- Few queries (2) for straightforward facts
|
||||||
|
- Few results (3) to minimize processing
|
||||||
|
- Few chunks (2) for focused answer
|
||||||
|
|
||||||
|
### For Research (Complex Topics)
|
||||||
|
```bash
|
||||||
|
openquery -q 5 -r 10 -c 4 -l "Explain the causes of the French Revolution"
|
||||||
|
```
|
||||||
|
- More queries (5) for diverse perspectives
|
||||||
|
- More results (10) for comprehensive coverage
|
||||||
|
- More chunks (4) for rich context
|
||||||
|
- Long format for depth
|
||||||
|
|
||||||
|
### For Exploration (Broad Topics)
|
||||||
|
```bash
|
||||||
|
openquery -q 8 -r 15 -c 5 "What are the latest developments in AI?"
|
||||||
|
```
|
||||||
|
- Many queries (8) to explore different angles
|
||||||
|
- Many results (15) for breadth
|
||||||
|
- More chunks (5) for extensive context
|
||||||
|
|
||||||
|
### Cost Optimization
|
||||||
|
```bash
|
||||||
|
openquery configure --model "qwen/qwen3.5-flash-02-23"
|
||||||
|
# Keep defaults: -q 3 -r 5 -c 3
|
||||||
|
```
|
||||||
|
- Qwen Flash is very cost-effective
|
||||||
|
- Default parameters provide good balance
|
||||||
|
|
||||||
|
### Performance Optimization
|
||||||
|
```bash
|
||||||
|
# Adjust ParallelProcessingOptions in SearchTool.cs if needed
|
||||||
|
# Default: MaxConcurrentArticleFetches=10, MaxConcurrentEmbeddingRequests=4
|
||||||
|
```
|
||||||
|
- Reduce these values if you see rate limits or memory pressure
|
||||||
|
- Increase them if you have fast network/API and want more speed
|
||||||
|
|
||||||
|
## Advanced Configuration
|
||||||
|
|
||||||
|
### Changing Concurrency Limits
|
||||||
|
|
||||||
|
Concurrency limits are currently hardcoded in `SearchTool.cs` but can be adjusted:
|
||||||
|
|
||||||
|
```csharp
|
||||||
|
public class ParallelProcessingOptions
|
||||||
|
{
|
||||||
|
public int MaxConcurrentArticleFetches { get; set; } = 10; // ← Change this
|
||||||
|
public int MaxConcurrentEmbeddingRequests { get; set; } = 4; // ← Change this
|
||||||
|
public int EmbeddingBatchSize { get; set; } = 300; // ← Change this
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
To make these configurable, you could:
|
||||||
|
1. Add fields to `AppConfig`
|
||||||
|
2. Read from config file
|
||||||
|
3. Pass through to `SearchTool` constructor
|
||||||
|
|
||||||
|
### Custom Embedding Model
|
||||||
|
|
||||||
|
The embedding model is hardcoded to `openai/text-embedding-3-small`. To change:
|
||||||
|
|
||||||
|
Edit the `EmbeddingService` constructor:
|
||||||
|
```csharp
|
||||||
|
public EmbeddingService(OpenRouterClient client, string embeddingModel = "your-model")
|
||||||
|
```
|
||||||
|
|
||||||
|
Or make it configurable via CLI/config (future enhancement).
|
||||||
|
|
||||||
|
### Changing Chunk Size

Chunk size (500 characters) is defined in `ChunkingService.cs`:

```csharp
private const int MAX_CHUNK_SIZE = 500;
```

Modify this constant to change how articles are split. Larger chunks:

- ✅ More context per chunk
- ❌ Fewer chunks for the same article
- ❌ Higher token usage in the final answer

Smaller chunks:

- ✅ More granular matching
- ❌ May lose context across chunk boundaries
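For reference, fixed-size splitting amounts to the following. This is a minimal sketch, not the real `ChunkingService`, which may also clean HTML and prefer sentence or paragraph boundaries.

```csharp
// Minimal sketch of fixed-size chunking; the real ChunkingService may
// split on sentence or paragraph boundaries instead of raw offsets.
static IEnumerable<string> SplitIntoChunks(string text, int maxChunkSize = 500)
{
    for (int i = 0; i < text.Length; i += maxChunkSize)
        // The last chunk may be shorter than maxChunkSize.
        yield return text.Substring(i, Math.Min(maxChunkSize, text.Length - i));
}
```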
### Using a Custom SearxNG Instance

Some SearxNG deployments may require HTTPS, authentication, or custom paths:

```bash
# With authentication (if supported)
export SEARXNG_URL="https://user:pass@searx.example.com:8080"

# With a custom path
export SEARXNG_URL="https://searx.example.com/custom-path"
```

Note: most SearxNG instances don't require auth, as they're designed for privacy.
### OpenRouter Settings

OpenRouter supports additional parameters (not yet exposed in OpenQuery):

- `temperature` - randomness (0-2, default ~1)
- `max_tokens` - response length limit
- `top_p` - nucleus sampling
- `frequency_penalty` / `presence_penalty`

These could be added to `ChatCompletionRequest` in future versions.
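If you want to experiment, exposing these parameters could look roughly like the following. This is a hypothetical sketch: the existing members of `ChatCompletionRequest` are assumptions, not the real type.

```csharp
using System.Text.Json.Serialization;

// Hypothetical sketch: optional sampling parameters serialized with the
// snake_case names OpenRouter expects. Existing members are assumptions.
public record ChatCompletionRequest(
    [property: JsonPropertyName("model")] string Model,
    [property: JsonPropertyName("temperature")] double? Temperature = null,
    [property: JsonPropertyName("max_tokens")] int? MaxTokens = null);
```

Nullable properties keep the fields out of the request when unset (given a serializer configured to skip nulls), so defaults stay server-side.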
## Managing Multiple Configurations

You can maintain multiple config files and symlink or set one per project:

```bash
# Create a project-specific config
cp ~/.config/openquery/config ~/myproject/openquery.config

# Use it temporarily
OPENQUERY_CONFIG=~/myproject/openquery.config openquery "question"
```

**Note**: Currently OpenQuery only reads `~/.config/openquery/config`. Multi-config support would require code changes (reading from an `OPENQUERY_CONFIG` env var).
## Configuration Validation

OpenQuery doesn't strictly validate config values. Invalid settings may cause runtime errors:

- `DefaultQueries <= 0` → may cause exceptions or zero queries
- `DefaultChunks <= 0` → may return no context
- `DefaultResults <= 0` → no search results

Validate manually:

```bash
# Check that your config loads
cat ~/.config/openquery/config

# Test with verbose mode
openquery -v "test"
```
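The manual checks above can be scripted. This is a minimal sketch that assumes the config is stored as `key=value` lines; the key names match the ones listed above, but the file format is an assumption.

```shell
#!/bin/sh
# Sketch: warn about non-positive values in an openquery config file.
# Assumes a simple key=value format; adjust the sed pattern if yours differs.
check_config() {
  config_file="$1"
  [ -f "$config_file" ] || { echo "config not found: $config_file" >&2; return 0; }
  for key in DefaultQueries DefaultChunks DefaultResults; do
    # Extract the value for this key, if present.
    value=$(sed -n "s/^$key=//p" "$config_file")
    if [ -n "$value" ] && [ "$value" -le 0 ] 2>/dev/null; then
      echo "WARNING: $key=$value (must be > 0)"
    fi
  done
}

check_config ~/.config/openquery/config
```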

---

## Next Steps

- [Usage Guide](usage.md) - Learn how to use the CLI
- [Architecture](architecture.md) - Understand the system design
- [Troubleshooting](troubleshooting.md) - Fix common issues
173
docs/index.md
Normal file
@@ -0,0 +1,173 @@
# OpenQuery Documentation

Welcome to the comprehensive documentation for OpenQuery - the AI-powered search and answer system.

## 📚 Documentation Overview

### Getting Started

- **[Installation Guide](installation.md)** - Build, install, and setup instructions
- **[Configuration](configuration.md)** - Configure API keys, models, and settings
- **[Usage Guide](usage.md)** - Complete CLI reference with examples

### Deep Dive

- **[Architecture](architecture.md)** - System design, patterns, and data flow
- **[Components](components/overview.md)** - Detailed component documentation
  - [OpenQueryApp](components/openquery-app.md)
  - [SearchTool](components/search-tool.md)
  - [Services](components/services.md)
  - [Models](components/models.md)
- **[API Reference](api/cli.md)** - Complete command-line interface reference
  - [Environment Variables](api/environment-variables.md)
  - [Programmatic APIs](api/programmatic.md)

### Support

- **[Troubleshooting](troubleshooting.md)** - Common issues and solutions
- **[Performance](performance.md)** - Performance characteristics and optimization
## 🎯 Quick Links

### For Users

- [Install OpenQuery](installation.md) in 5 minutes
- [Configure your API key](configuration.md)
- [Learn the basics](usage.md)
- [Solve common problems](troubleshooting.md)

### For Developers

- [Understand the architecture](architecture.md)
- [Explore components](components/overview.md)
- [Use the APIs programmatically](api/programmatic.md)
## 📋 Table of Contents

1. [Project Overview](#project-overview)
2. [Key Concepts](#key-concepts)
3. [Technology Stack](#technology-stack)
4. [System Workflow](#system-workflow)
## Project Overview

**OpenQuery** is a sophisticated CLI tool that combines the power of large language models with web search to provide accurate, well-sourced answers to complex questions.

### What It Does

- Takes a natural-language question as input
- Generates multiple diverse search queries
- Searches the web via SearxNG
- Extracts and processes article content
- Uses semantic similarity to rank relevance
- Synthesizes a comprehensive AI-generated answer with citations

### Why Use OpenQuery?

- **Accuracy**: Multiple search queries reduce bias and increase coverage
- **Transparency**: Sources are cited in the final answer
- **Speed**: Parallel processing minimizes latency
- **Control**: Fine-tune every aspect from query count to chunk selection
- **Privacy**: SearxNG provides anonymous, aggregating search
## Key Concepts

### Search Queries

Instead of using your exact question, OpenQuery generates multiple optimized search queries (default: 3). For example, "What is quantum entanglement?" might become:

- "quantum entanglement definition"
- "how quantum entanglement works"
- "quantum entanglement experiments"
### Content Chunks

Long articles are split into ~500-character chunks. Each chunk is:

- Stored with its source URL and title
- Converted to a vector embedding (1536 dimensions)
- Scored against your query embedding
### Semantic Ranking

Using cosine similarity between embeddings, OpenQuery ranks chunks by relevance and selects the top N (default: 3) for the final context.
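The ranking step can be sketched with `System.Numerics.Tensors`, which the project already uses for vector math. This is a minimal sketch under assumptions: the chunk shape and method names are illustrative, not OpenQuery's actual types.

```csharp
using System.Numerics.Tensors;

// Minimal sketch of semantic ranking. The ScoredChunk record and
// RankChunks method are illustrative, not OpenQuery's actual types.
record ScoredChunk(string Text, float Score);

static ScoredChunk[] RankChunks(
    float[] queryEmbedding,
    IEnumerable<(string Text, float[] Embedding)> chunks,
    int topN)
{
    return chunks
        // CosineSimilarity computes dot(a, b) / (|a| * |b|), SIMD-accelerated.
        .Select(c => new ScoredChunk(c.Text,
            TensorPrimitives.CosineSimilarity(queryEmbedding, c.Embedding)))
        .OrderByDescending(c => c.Score) // highest similarity first
        .Take(topN)                      // keep the top-N chunks as context
        .ToArray();
}
```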
### Streaming Answer

The LLM receives your question plus the top chunks as context and streams the answer in real time, citing sources like `[Source 1]`.
## Technology Stack

| Layer | Technology | Purpose |
|-------|------------|---------|
| Runtime | .NET 10.0 AOT | Native performance, minimal footprint |
| LLM | OpenRouter API | Chat completions and embeddings |
| Search | SearxNG | Metasearch engine |
| Content Extraction | SmartReader | Article text extraction |
| Vector Math | System.Numerics.Tensors | High-performance cosine similarity |
| Resilience | Polly | Retry and circuit breaker policies |
| CLI | System.CommandLine | Command parsing and help |
| JSON | System.Text.Json (source-gen) | Fast serialization |
## System Workflow

```
1. User Query
   "What is quantum entanglement?"

2. Query Generation (optional)
   LLM generates: ["quantum entanglement physics",
                   "quantum entanglement definition",
                   "how does quantum entanglement work"]

3. Parallel Searches
   Query 1 ─→ SearxNG ─→ Results ┐
   Query 2 ─→ SearxNG ─→ Results ├─→ combined
   Query 3 ─→ SearxNG ─→ Results ┘

4. Parallel Article Fetching (concurrent, max 10 at a time)
   URL 1 ─→ Article ─→ Chunks
   URL 2 ─→ Article ─→ Chunks
   ...

5. Parallel Embeddings (batches of 300, up to 4 concurrent)
   Chunk Batch 1 ─→ Embedding API ─→ Vectors
   Chunk Batch 2 ─→ Embedding API ─→ Vectors

6. Semantic Ranking
   Query Embedding + Chunk Embeddings ─→ Cosine Similarity
   ─→ Score ─→ Sort Descending ─→ Top 3 Chunks

7. Final Answer Generation
   ┌────────────────────────────────────────────┐
   │ System: "Answer based on this context:"    │
   │ Context: [Top 3 chunks with sources]       │
   │ Question: "What is quantum entanglement?"  │
   └────────────────────────────────────────────┘
                       ↓
   LLM streams the answer:
   "Quantum entanglement is..."
   with citations like [Source 1]
```
## Next Steps

1. **[Install OpenQuery](installation.md)**
2. **[Configure it](configuration.md)**
3. **[Start asking questions](usage.md)**

For detailed technical information, continue to [the architecture guide](architecture.md).

---

**Need help?** Check the [Troubleshooting](troubleshooting.md) guide.
358
docs/installation.md
Normal file
@@ -0,0 +1,358 @@
# Installation Guide

This guide covers how to build, install, and configure OpenQuery on your system.

## 📋 Table of Contents

1. [Prerequisites](#prerequisites)
2. [Quick Install](#quick-install)
3. [Manual Build](#manual-build)
4. [Platform-Specific Instructions](#platform-specific-instructions)
5. [Post-Installation](#post-installation)
6. [Verification](#verification)
7. [Uninstallation](#uninstallation)
## Prerequisites

### Required Software

- **.NET SDK 10.0** or later
  - Download from [dotnet.microsoft.com](https://dotnet.microsoft.com/download)
  - Verify: `dotnet --version` should show 10.x or higher

### External Services (Setup Required)

1. **SearxNG Instance** - Metasearch engine
   - **Docker (recommended)**:

     ```bash
     docker run -d \
       --name searxng \
       -p 8002:8080 \
       -v searxng-data:/etc/searxng \
       searxng/searxng:latest
     ```

   - Access at `http://localhost:8002`
   - **Alternative**: Use a public SearxNG instance from [searx.space](https://searx.space)

2. **OpenRouter API Key** - AI model provider
   - Sign up at [openrouter.ai](https://openrouter.ai)
   - Get your API key from the dashboard
   - Free tier available, with rate limits
## Quick Install

The easiest way to get OpenQuery up and running:

```bash
# 1. Clone the repository
git clone <your-repo-url>
cd OpenQuery

# 2. Make the install script executable and run it
chmod +x install.sh
./install.sh

# 3. Configure your API key
openquery configure -i

# 4. Test it
openquery "Hello world"
```

**What the install script does**:

- Builds the project in Release mode
- Publishes a self-contained AOT binary
- Copies it to `~/.local/bin/OpenQuery` (Linux/macOS)
- Creates the config directory `~/.config/openquery/`
## Manual Build

If you prefer to build manually or need a specific platform:

### Step 1: Restore Dependencies

```bash
dotnet restore
```

### Step 2: Build

```bash
dotnet build -c Release
```

### Step 3: Publish

#### For Current Platform (Self-Contained AOT)

```bash
dotnet publish -c Release \
  --self-contained true \
  /p:PublishAot=true
```

The binary will be at:

```
bin/Release/net10.0/<rid>/publish/OpenQuery
```

#### For Specific Platform (Cross-Compilation)

**Runtime Identifiers (RIDs)**:

| Platform | RID |
|----------|-----|
| Linux x64 | `linux-x64` |
| Linux ARM64 | `linux-arm64` |
| macOS x64 | `osx-x64` |
| macOS ARM64 | `osx-arm64` |
| Windows x64 | `win-x64` |
| Windows ARM64 | `win-arm64` |

Example for Linux x64:

```bash
dotnet publish -c Release \
  -r linux-x64 \
  --self-contained true \
  /p:PublishAot=true
```

### Step 4: Deploy

Copy the binary to a directory in your PATH:

```bash
# Linux/macOS
sudo cp bin/Release/net10.0/linux-x64/publish/OpenQuery /usr/local/bin/
chmod +x /usr/local/bin/OpenQuery
```

```powershell
# Windows (PowerShell as Admin)
Copy-Item bin\Release\net10.0\win-x64\publish\OpenQuery.exe "C:\Program Files\OpenQuery\"
```

Or use a local bin directory:

```bash
mkdir -p ~/.local/bin
cp bin/Release/net10.0/linux-x64/publish/OpenQuery ~/.local/bin/
# Add to PATH if not already there: export PATH="$HOME/.local/bin:$PATH"
```
## Platform-Specific Instructions

### Linux

#### Ubuntu/Debian

```bash
# Install .NET SDK 10.0
wget https://dot.net/v10/dotnet-install.sh -O dotnet-install.sh
chmod +x dotnet-install.sh
./dotnet-install.sh --channel 10.0

# Add to PATH
export PATH="$HOME/.dotnet:$PATH"

# Build and install (as shown above)
```

#### With Systemd Service (Optional)

If you run SearxNG locally, you might want it as a service:

```bash
# Create a systemd service for SearxNG (if using Docker)
sudo nano /etc/systemd/system/searxng.service
```

```ini
[Unit]
Description=SearxNG Search Engine
Requires=docker.service
After=docker.service

[Service]
Restart=always
ExecStart=/usr/bin/docker start -a searxng
ExecStop=/usr/bin/docker stop -t 2 searxng

[Install]
WantedBy=multi-user.target
```

```bash
sudo systemctl enable searxng
sudo systemctl start searxng
```

### macOS

#### Homebrew Install (if .NET available)

```bash
brew install dotnet-sdk
```

#### M1/M2 (ARM64) Notes

- Use RID: `osx-arm64`
- Ensure you have the ARM64 version of the .NET SDK

### Windows

#### Using Winget (Windows 10/11)

```powershell
winget install Microsoft.DotNet.SDK.10
```

#### Manual Install

1. Download the installer from [dotnet.microsoft.com](https://dotnet.microsoft.com/download)
2. Run the installer
3. Verify in PowerShell:

   ```powershell
   dotnet --version
   ```

#### Building

```powershell
dotnet publish -c Release -r win-x64 --self-contained true /p:PublishAot=true
```
## Post-Installation

### 1. Verify SearxNG is Running

```bash
curl "http://localhost:8002/search?q=test&format=json"
```

Expected: a JSON response with a results array.

### 2. Configure OpenQuery

```bash
# Interactive setup
openquery configure -i

# Or via an environment variable (Linux/macOS)
export OPENROUTER_API_KEY="sk-or-..."
```

```powershell
# Windows
setx OPENROUTER_API_KEY "sk-or-..."
```

### 3. Optional: Set Defaults

```bash
openquery configure --queries 5 --chunks 4 --results 10
```
## Verification

### Test Installation

```bash
# Check that the binary exists and is executable
which openquery   # Linux/macOS
where openquery   # Windows

# If installed as OpenQuery (capital O)
which OpenQuery
```

### Test Configuration

```bash
# Should show your config or defaults
cat ~/.config/openquery/config
```

### Test the System

```bash
# Simple query (should work with any API key)
openquery "What is 2+2?"

# More complex query
openquery -v "What are the benefits of exercise?"
```

Expected output:

- Spinner animation with status updates
- Streaming answer from the AI
- Citations like `[Source 1](url)` in the answer
## Uninstallation

### Using the Uninstall Script

```bash
chmod +x uninstall.sh
./uninstall.sh
```

The script will:

- Remove the binary from `~/.local/bin/`
- Ask if you want to delete the config directory

### Manual Removal

```bash
# Remove binary
rm ~/.local/bin/OpenQuery

# Remove config (optional)
rm -r ~/.config/openquery
```

### Remove SearxNG (if no longer needed)

```bash
docker rm -f searxng
docker volume rm searxng-data
```
## Advanced Build Options

### Reduce Binary Size

Edit `OpenQuery.csproj`:

```xml
<PropertyGroup>
  <PublishAot>true</PublishAot>
  <InvariantGlobalization>true</InvariantGlobalization> <!-- Already set -->
  <StripSymbols>true</StripSymbols>
</PropertyGroup>
```

### Debug Build

```bash
dotnet build -c Debug
dotnet run -- "your question"
```

### With Symbols (for debugging)

```bash
dotnet publish -c Release -r linux-x64 \
  --self-contained true \
  /p:PublishAot=true \
  /p:DebugType=portable
```
## Troubleshooting Installation

### "dotnet: command not found"

- Add `.dotnet` to PATH: `export PATH="$HOME/.dotnet:$PATH"`
- Restart your terminal or source your shell config

### "The SDK 'Microsoft.NET.Sdk' was not found"

- The .NET SDK is not installed correctly
- Re-run the installer or use `dotnet-install.sh`

### AOT Build Fails

- Some platforms may not support AOT yet
- Remove `/p:PublishAot=true` to use JIT
- Check [.NET AOT support](https://docs.microsoft.com/dotnet/core/deploying/native-aot/)

### Docker Pull Fails (SearxNG)

```bash
# Pull the image separately first
docker pull searxng/searxng:latest
# Then run the container
docker run -d --name searxng -p 8002:8080 searxng/searxng
```

### Port 8002 Already in Use

Change the host port in the docker command:

```bash
docker run -d --name searxng -p 8080:8080 searxng/searxng
# Then set SEARXNG_URL=http://localhost:8080
```
## Next Steps

After successful installation:

1. [Configure OpenQuery](configuration.md)
2. [Learn how to use it](usage.md)
3. Read the [Architecture](architecture.md) to understand how it works

---

**Need help?** See [Troubleshooting](troubleshooting.md) or open an issue.
522
docs/performance.md
Normal file
@@ -0,0 +1,522 @@
# Performance

Performance characteristics, optimization strategies, and scalability considerations for OpenQuery.

## 📋 Table of Contents

1. [Performance Overview](#performance-overview)
2. [Latency Breakdown](#latency-breakdown)
3. [Throughput](#throughput)
4. [Memory Usage](#memory-usage)
5. [Benchmarking](#benchmarking)
6. [Optimization Strategies](#optimization-strategies)
7. [Scalability Limits](#scalability-limits)

## Performance Overview

OpenQuery is designed for **low-latency interactive use** (15-50 seconds end-to-end) and maximizes parallelization to minimize wait time.

### Key Metrics

| Metric | Typical | Best Case | Worst Case |
|--------|---------|-----------|------------|
| **End-to-End Latency** | 15-50s | 10s | 120s+ |
| **API Cost** | $0.01-0.05 | $0.005 | $0.20+ |
| **Memory Footprint** | 100-300MB | 50MB | 1GB+ |
| **Network I/O** | 5-20MB | 1MB | 100MB+ |

**Note**: The wide variance is due to network latency, content size, and LLM speed.

---
## Latency Breakdown

### Default Configuration

`-q 3 -r 5 -c 3` (3 queries, 5 results each, 3 final chunks)

| Stage | Operation | Parallelism | Time (p50) | Time (p95) | Dominant Factor |
|-------|-----------|-------------|------------|------------|-----------------|
| 1 | Query Generation | 1 | 2-5s | 10s | LLM inference speed |
| 2a | Searches (3 queries × 5 results) | 3 concurrent | 3-8s | 15s | SearxNG latency |
| 2b | Article Fetching (≈15 URLs) | 10 concurrent | 5-15s | 30s | Each site's response time |
| 2c | Chunking | 10 concurrent | <1s | 2s | CPU (HTML parsing) |
| 3a | Query Embedding | 1 | 0.5-1s | 3s | Embedding API latency |
| 3b | Chunk Embeddings (≈50 chunks) | 4 concurrent | 1-3s | 10s | Batch API latency |
| 4 | Ranking | 1 | <0.1s | 0.5s | CPU (vector math) |
| 5 | Final Answer Streaming | 1 | 5-20s | 40s | LLM generation speed |
| **Total** | | | **16-50s** | **~60s** | |
### Phase Details

#### Phase 1: Query Generation (2-5s)

- Single non-streaming LLM call
- Input: system prompt + user question (~200 tokens)
- Output: JSON array of 3-5 short strings (~50 tokens)
- Fast because of the small context and output

#### Phase 2a: Searches (3-8s)

- 3 parallel `SearxngClient.SearchAsync` calls
- Each: query → SearxNG → aggregator engines → scraped results
- Latency is highly variable, based on:
  - SearxNG instance performance
  - Network distance to SearxNG
  - SearxNG's upstream search engines

#### Phase 2b: Article Fetching (5-15s)

- ≈15 URLs to fetch (3 queries × 5 results, minus duplicates)
- Up to 10 concurrent fetches (semaphore)
- Each: TCP connect + TLS handshake + HTTP GET + SmartReader parse
- Latency:
  - Fast sites (CDN, cached): 200-500ms
  - Normal sites: 1-3s
  - Slow/unresponsive sites: timeout after ~30s

Why 5-15s for 15 URLs with 10 concurrent slots?

- First wave (10 URLs): bounded by the slowest among them, ≈3s
- Second wave (5 URLs): another ≈3s, for a total of ≈6s
- Many URLs are faster (≈500ms), pulling the total down toward 2-3s
- But some sites take 5-10s, and those dominate

**Tail latency**: The slowest few URLs can dominate total time. The pipeline cannot proceed until all fetch attempts complete (or fail).
#### Phase 2c: Chunking (<1s)

- CPU-bound HTML cleaning and splitting
- SmartReader's C# HTML parser is fast
- Typically 100-300 chunks total
- <1s on a modern CPU

#### Phase 3: Embeddings (1.5-4s)

- **Query embedding**: 1 call, ~200 tokens, ≈0.5-1s
- **Chunk embeddings**: ≈50 chunks → 1 batch of 50 (the batch size of 300 is not reached here)
  - A batch of 50 is still a single API call, roughly 6K tokens (50 × 500 chars ≈ 25K chars)
  - With `text-embedding-3-small` at $0.00002 per 1K tokens, that is roughly $0.0001 per batch
  - Latency: 1-3s for the embedding API

With more chunks (say 500), this would be 2 batches → perhaps 2-4s.

Parallel batches (4 concurrent) help when there are many batches (1500+ chunks).
#### Phase 4: Ranking (<0.1s)

- Cosine similarity for 50-100 chunks
- Each: dot product + normalization, O(dim) with dim = 1536
- 100 × 1536 ≈ 150K FLOPs → negligible on a modern CPU
- SIMD acceleration from `TensorPrimitives`

#### Phase 5: Final Answer (5-20s)

- Streaming chat completion
- Input: system prompt + context (~400 tokens for 3 × 500-char chunks) + question
- Output: varies widely (typically 200-2000 tokens)
- Longer context slightly increases latency
- Model choice is the major factor:
  - Qwen Flash: fast (5-10s for 1000 output tokens)
  - Gemini Flash: moderate (10-15s)
  - Llama-class: slower (20-40s)

---
## Throughput

### Sequential Execution

Running queries one after another (the default CLI behavior):

- Latency per query: 16-50s
- Throughput: 1 query / 20s ≈ 180 queries/hour (theoretically)

But API rate limits will kick in before that:

- OpenRouter free tier: limited RPM/TPM
- Even paid plans have soft limits

### Concurrent Execution (Multiple OpenQuery Instances)

You could run multiple OpenQuery processes in parallel (in different terminals), but they share:

- The same API key (OpenRouter rate limits are per API key, not per process)
- The same SearxNG instance (which could become saturated)

**Practical limit**: 3-5 concurrent processes before hitting diminishing returns or rate limits.

### Throughput Optimization

To maximize queries per hour:

1. Use the fastest model (Qwen Flash)
2. Reduce `--chunks` to 1-2
3. Reduce `--queries` to 1
4. Use a local/fast SearxNG
5. Cache embedding results (not implemented)
6. Batch multiple questions in one process (not implemented; would require a redesign)

**Achievable**: Perhaps 500-1000 queries/hour on a paid OpenRouter plan with aggressive settings.

---
## Memory Usage
|
||||||
|
|
||||||
|
### Baseline
|
||||||
|
|
||||||
|
.NET 10 AOT app with dependencies:
|
||||||
|
- **Code**: ~30MB (AOT compiled native code)
|
||||||
|
- **Runtime**: ~20MB (.NET runtime overhead)
|
||||||
|
- **Base Memory**: ~50MB
|
||||||
|
|
||||||
|
### Per-Query Memory
|
||||||
|
|
||||||
|
| Component | Memory | Lifetime |
|
||||||
|
|-----------|--------|----------|
|
||||||
|
| Search results (15 items) | ~30KB | Pipeline |
|
||||||
|
| Articles (raw HTML) | ~5MB (transient) | Freed after parse |
|
||||||
|
| Articles (extracted text) | ~500KB | Until pipeline complete |
|
||||||
|
| Chunks (≈100 items) | ~50KB text + embeddings 600KB | Until pipeline complete |
|
||||||
|
| Embeddings (100 × 1536 floats) | ~600KB | Until pipeline complete |
|
||||||
|
| HTTP buffers | ~1MB per concurrent request | Short-lived |
|
||||||
|
| **Total per query** | **~2-5MB** (excluding base) | Released after complete |
|
||||||
|
|
||||||
|
**Peak**: When all articles fetched but not yet embedded, we have text ~500KB + chunks ~650KB = ~1.2MB + overhead ≈ 2-3MB.
|
||||||
|
|
||||||
|
**If processing many queries in parallel** (unlikely for CLI), memory would scale linearly.
|
||||||
|
|
||||||
|
### Memory Leak Risks

- `HttpClient` instances: created per `OpenRouterClient` and `SearxngClient`. They should be disposed (currently they are not), but the short-lived process exits anyway.
- `StatusReporter` background task: disposed via `using`
- `RateLimiter` semaphore: disposable via `IAsyncDisposable` if wrapped in `using` (not currently, but the process is short-lived)

No major leaks observed.

### Memory Optimization Opportunities

1. **Reuse HttpClient** with `IHttpClientFactory` (not needed for a CLI)
2. **Stream article fetching** instead of buffering all articles before embedding (possible: embed as URLs complete)
3. **Early chunk filtering**: discard low-quality chunks before embedding to reduce embedding count
4. **Cache embeddings** by content hash to avoid re-embedding seen text (would need persistent storage)

---

## Benchmarking

### Methodology

Measure with the `time` command and verbose logging:

```bash
time openquery -v "What is quantum entanglement?" 2>&1 | tee log.txt
```

Parse the log for timestamps (or add them by modifying the code).

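If you add elapsed-time prefixes to the log lines, phase durations fall out of the deltas. The log format below is an assumption for illustration, not OpenQuery's actual output:

```python
import re

# Hypothetical log format: "[<elapsed-seconds>s] <phase message>".
# OpenQuery does not emit this by default; you would add it yourself.
log = """[0.0s] Generating queries
[3.2s] Searching
[7.3s] Fetching articles
[16.0s] Generating embeddings
[18.8s] Generating answer
[23.4s] Done"""

events = [(float(m[1]), m[2]) for m in re.finditer(r"\[(\d+\.\d+)s\] (.+)", log)]
for (t0, phase), (t1, _) in zip(events, events[1:]):
    print(f"{phase}: {t1 - t0:.1f}s")   # duration of each phase
```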
### Sample Benchmark

**Environment**:
- Linux x64, .NET 10 AOT
- SearxNG in local Docker (localhost:8002)
- OpenRouter API (US East)
- Model: qwen/qwen3.5-flash-02-23

**Run 1**:
```
real    0m23.4s
user    0m1.2s
sys     0m0.3s
```
Log breakdown:
- Query generation: 3.2s
- Searches: 4.1s
- Article fetching: 8.7s (12 URLs)
- Embeddings: 2.8s (45 chunks)
- Final answer: 4.6s (325 tokens)

**Run 2** (cached SearxNG results, same URLs):
```
real    0m15.8s
```
Article fetching was faster (2.3s) because sites were cached or simply responded faster on the second request.

**Run 3** (with `-s`, short answer):
```
real    0m18.2s
```
The final answer was faster (2.1s instead of 4.6s) due to shorter output.

### Benchmarking Tips

1. **Warm up**: The first run is slower (JIT or AOT cold start). Discard the first measurement.
2. **Network variance**: Run multiple times and average.
3. **Control variables**: Same question, same SearxNG instance, same network conditions.
4. **Measure API costs**: Check the OpenRouter dashboard for token counts.
5. **Profile with dotTrace** or `perf` if investigating CPU bottlenecks.

---

## Optimization Strategies

### 1. Tune Concurrent Limits

Edit `SearchTool.cs` where `_options` is created:

```csharp
var _options = new ParallelProcessingOptions
{
    MaxConcurrentArticleFetches = 5,     // ↓ from 10
    MaxConcurrentEmbeddingRequests = 2,  // ↓ from 4
    EmbeddingBatchSize = 300             // ↑ or ↓ (rarely matters)
};
```

**Why tune down?**
- You hit OpenRouter rate limits
- Network bandwidth is saturated
- Too many concurrent fetches overwhelm target sites (scraping etiquette)

**Why tune up?**
- Fast network, powerful CPU, no rate limits
- Many chunks (>500) needing parallel embedding batches

**Monitor**:
- `openquery -v` shows embedding progress: `[Generating embeddings: batch X/Y]`
- If Y=1 (everything fits in one batch), the batch size is fine
- If Y>1 and max concurrency = Y, you're using full parallelism

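The `MaxConcurrent*` settings are bounded-concurrency limits. The underlying pattern is a semaphore gating how many tasks run at once, shown here as an illustrative Python sketch (not OpenQuery's actual C# code):

```python
import asyncio

MAX_CONCURRENT_FETCHES = 5  # analogous to MaxConcurrentArticleFetches

async def fetch(url: str, sem: asyncio.Semaphore) -> str:
    async with sem:                # at most MAX_CONCURRENT_FETCHES inside
        await asyncio.sleep(0.01)  # stand-in for the real HTTP request
        return f"content of {url}"

async def fetch_all(urls: list[str]) -> list[str]:
    sem = asyncio.Semaphore(MAX_CONCURRENT_FETCHES)
    return await asyncio.gather(*(fetch(u, sem) for u in urls))

results = asyncio.run(fetch_all([f"https://example.com/{i}" for i in range(12)]))
print(len(results))  # 12
```

All 12 fetches are started, but only 5 are in flight at any moment; lowering the semaphore count trades latency for politeness toward target sites and APIs.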
### 2. Reduce Data Volume

**Fewer search results**:
```bash
openquery -r 3 "question"  # instead of 5 or 10
```
Effect: Fetches fewer URLs and extracts fewer chunks. Linear reduction in work.

**Fewer queries**:
```bash
openquery -q 1 "question"
```
Effect: One search instead of N. Quality may suffer (less diverse sources).

**Fewer chunks**:
```bash
openquery -c 1 "question"
```
Effect: Only the top chunk goes into context → fewer tokens → faster final answer, but it may miss relevant info.

**Chunk size** (compile-time constant):
Edit `ChunkingService.cs`:
```csharp
private const int MAX_CHUNK_SIZE = 300; // instead of 500
```
Effect: More chunks (more granular ranking), but each chunk is shorter → more chunks to rank, more embeddings to generate. Could increase or decrease total time. Likely more tokens overall (more chunks in context if `-c` is a fixed number).

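A minimal illustration of size-bounded chunking: greedily pack whole words under a character budget. This is a sketch of the idea only; the real `ChunkingService` logic may differ:

```python
MAX_CHUNK_SIZE = 500  # characters, mirroring the constant above

def chunk_text(text: str, max_size: int = MAX_CHUNK_SIZE) -> list[str]:
    """Greedily pack whole words into chunks of at most max_size chars."""
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_size:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks

chunks = chunk_text("lorem " * 300)  # ~1800 chars of text
print(len(chunks), max(len(c) for c in chunks))
```

Halving `max_size` roughly doubles the chunk count, which is exactly the trade-off described above: finer-grained ranking at the cost of more embeddings.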
### 3. Change Embedding Model

Currently hardcoded to `openai/text-embedding-3-small`. Alternatives include:
- `openai/text-embedding-3-large` (higher quality, slower, more expensive)
- `intfloat/multilingual-e5-large` (multilingual, smaller)

Modify the `EmbeddingService` constructor:
```csharp
public EmbeddingService(OpenRouterClient client, string embeddingModel = "your-model")
```

Then pass:
```csharp
var embeddingService = new EmbeddingService(client, "intfloat/multilingual-e5-large");
```

**Impact**: Different dimensionality (1536 vs 1024 vs 4096). Memory scales with the dimension. Quality may vary for non-English queries.

### 4. Caching

**Current**: No caching. Every query hits all APIs.

**Embedding cache** (by text hash):
- Could store in memory: `Dictionary<string, float[]>`
- Or on disk: `~/.cache/openquery/embeddings/`
- Invalidation: embeddings are deterministic per model, so a long-term cache is viable

**Search cache** (by query hash):
- Cache `List<SearxngResult>` for identical queries
- TTL: maybe 1 hour (search results change over time)

**Article cache** (by URL hash):
- Cache the `Article` (text content) per URL
- Invalidation: could check the `Last-Modified` header or use a TTL (1 day)

**Implementation effort**: Medium. Would need a cache abstraction (interface, in-memory + disk options).

**Benefit**: Repeat queries (common in testing, or similar questions) become instant.

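The embedding cache described above is a content-hash lookup in front of the API call. A Python sketch of the shape (a real implementation would live in `EmbeddingService` and call the embedding API on a miss):

```python
import hashlib

class EmbeddingCache:
    """In-memory cache keyed by (model, sha256 of text).

    Embeddings are deterministic per model, so entries never expire.
    """

    def __init__(self) -> None:
        self._store: dict[str, list[float]] = {}

    @staticmethod
    def _key(model: str, text: str) -> str:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return f"{model}:{digest}"

    def get_or_compute(self, model, text, compute):
        key = self._key(model, text)
        if key not in self._store:
            self._store[key] = compute(text)  # only called on a cache miss
        return self._store[key]

calls = 0
def fake_embed(text):  # stand-in for the real embedding API call
    global calls
    calls += 1
    return [float(len(text))]

cache = EmbeddingCache()
cache.get_or_compute("text-embedding-3-small", "hello", fake_embed)
cache.get_or_compute("text-embedding-3-small", "hello", fake_embed)
print(calls)  # 1 — the second lookup is a cache hit
```

Keying on the model name as well as the text hash matters: switching embedding models must not serve stale vectors of the wrong dimensionality.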
### 5. Parallelize More (Aggressive)

**Currently**:
- Searches: unbounded (as many as `--queries`)
- Fetches: max 10
- Embeddings: max 4

Could increase:
- Fetches to 20 or 50 (if the network/CPU can handle it)
- Embeddings to 8-16 (if the OpenRouter rate limit allows)

**Risk**:
- Overwhelming target sites (unethical scraping)
- API rate limits → 429 errors
- Local bandwidth saturation

### 6. Local Models (Self-Hosted)

Replace OpenRouter with a local LLM:
- **Query generation**: Could run a tiny model locally (no API latency)
- **Embeddings**: Could run `all-MiniLM-L6-v2` locally (fast, free after setup)
- **Answer**: Could run Llama 3 8B locally (no cost, but slower than GPT-4/Gemini)

**Benefits**:
- Zero API costs (after hardware)
- No network latency
- Unlimited queries

**Drawbacks**:
- GPU required for decent speed (CPU is very slow)
- Setup complexity (Ollama, llama.cpp, vLLM, etc.)
- Model quality may lag behind commercial APIs

**Integration**: Would need local inference backends (separate project scope).

---

## Scalability Limits

### API Rate Limits

**OpenRouter**:
- Free tier: very limited (a few RPM)
- Paid: varies by model, but typically ~10-30 requests/second
- The embedding API has separate limits

**Mitigation**:
- Reduce concurrency (see tuning)
- Add exponential backoff (already in place for embeddings)
- Batch embedding requests (already done)

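Exponential backoff, as used for the embedding retries, follows this general pattern. This is an illustrative sketch, not the actual Polly configuration:

```python
import random
import time

def retry_with_backoff(op, max_attempts=5, base_delay=0.5):
    """Retry op(), doubling the delay after each failure, with jitter."""
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            if attempt == max_attempts - 1:
                raise                       # out of attempts: propagate
            delay = base_delay * (2 ** attempt) * (1 + random.random())
            time.sleep(min(delay, 30.0))    # cap so retries stay bounded

failures = {"left": 2}
def flaky():
    if failures["left"] > 0:
        failures["left"] -= 1
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = retry_with_backoff(flaky, base_delay=0.01)
print(result)  # ok — succeeded after two retried failures
```

The jitter term spreads retries out so that parallel workers hitting the same 429 do not all retry at the same instant.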
### SearxNG Limits

**Single instance**:
- Can handle ~10-50 QPS depending on hardware
- Upstream search engines may rate limit per instance
- Memory ~100-500MB

**Mitigation**:
- Run multiple SearxNG instances behind a load balancer
- Use different public instances
- Implement client-side rate limiting (currently only per-URL fetches are limited, not searches)

### Network Bandwidth

**Typical data transfer**:
- Searches: 1KB per query × 3 = 3KB
- Articles: 100-500KB per fetch × 15 = 1.5-7.5MB (raw HTML)
- Extracted text: ~10% of HTML size = 150-750KB
- Embeddings: 100 chunks × 1536 × 4 bytes = 600KB (request + response)
- Final answer: 2-10KB

**Total**: ~3-10MB per query

**100 queries/hour**: ~300MB-1GB of data transfer

**Not an issue** for broadband, but it could matter on metered connections.

---

## Scaling with Chunk Count

Let:
- C = number of chunks with valid embeddings
- d = embedding dimension (1536)
- B = embedding batch size (300)
- P = max parallel embedding batches (4)

**Embedding Time** ≈ `O(C/B * 1/P)` (batches divided by parallelism)

**Ranking Time** ≈ `O(C * d)` (a dot product per chunk)

**Context Tokens** (for the final answer) ≈ `C * avg_chunk_tokens` (≈ 500 chars = 125 tokens)

**As C increases**:
- Embedding time: linear in C/B (sublinear if everything fits in one batch)
- Ranking time: linear in C
- Final answer latency: more tokens in context → longer context processing and potentially a longer answer (more relevant chunks to synthesize)

**Practical limit**:
- With defaults, C ~ 50-100 (from 15 articles)
- Could reach C ~ 500-1000 if:
  - `--queries` = 10
  - `--results` = 20 (200 URLs)
  - Many long articles → many chunks each
- At C = 1000:
  - Embeddings: 1000/300 ≈ 4 batches; with 4 running in parallel, total time ≈ one batch duration
  - But OpenRouter may have per-minute limits on embedding requests
  - Ranking: 1000 × 1536 ≈ 1.5M FLOPs → still <0.01s
  - Context tokens: 1000 × 125 = 125K tokens! Many LLMs have 200K context, so it fits, but it is expensive and slow.

**Conclusion**: The current defaults scale comfortably to C ~ 100-200. Beyond that:
- You need larger batch sizes or more parallelism for embeddings
- You may hit embedding API rate limits
- The context token count becomes expensive and may degrade answer quality (LLMs lose focus in very long contexts)

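The batch and token estimates above, as a quick script (the constants mirror the defaults listed in this section):

```python
import math

B = 300                  # embedding batch size
P = 4                    # max parallel embedding batches
TOKENS_PER_CHUNK = 125   # ≈ 500 chars per chunk

def scaling(C: int):
    batches = math.ceil(C / B)
    sequential_steps = math.ceil(batches / P)  # batch waves that cannot overlap
    context_tokens = C * TOKENS_PER_CHUNK      # if every chunk went into context
    return batches, sequential_steps, context_tokens

for C in (100, 1000):
    print(C, scaling(C))
# 100  -> 1 batch,  1 step, 12,500 tokens
# 1000 -> 4 batches, 1 step, 125,000 tokens
```

Note how embedding time barely moves between C=100 and C=1000 (one sequential wave either way), while the context token count grows tenfold; the token budget, not the embedding throughput, is the first wall you hit.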
---

## Profiling

### CPU Profiling

Use `dotnet-trace` or `perf`:

```bash
# Collect a trace for 30 seconds while a query runs
dotnet-trace collect --process-id $(pgrep OpenQuery) --duration 30s -o trace.nettrace

# Analyze with Visual Studio or PerfView
```

Look for:
- Hot methods: `ChunkingService.ChunkText`, `EmbeddingService.GetEmbeddingsAsync`, cosine similarity
- Allocation hotspots

### Memory Profiling

```bash
dotnet-gcdump collect -p <pid>
# Open in VS or dotnet-gcdump analyze
```

Check heap size and object counts (look for large `string` objects from article content).

### Network Profiling

Use `tcpdump` or `wireshark`:
```bash
tcpdump -i any port 8002 or port 443 -w capture.pcap
```

Or simpler: `time` individual curl commands to measure latency components.

---

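The "cosine similarity" hotspot named above is the per-chunk ranking computation. In the app it runs over `System.Numerics.Tensors`; the math itself is just a dot product normalized by the vector magnitudes, shown here as an illustrative Python sketch:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product over the product of magnitudes: O(d) per chunk."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [1.0, 0.0, 1.0]
chunks = {"relevant": [2.0, 0.0, 2.0], "unrelated": [0.0, 3.0, 0.0]}
ranked = sorted(chunks, key=lambda k: cosine_similarity(query, chunks[k]),
                reverse=True)
print(ranked)  # ['relevant', 'unrelated']
```

With d = 1536 this is a few thousand multiply-adds per chunk, which is why ranking stays well under the profiling radar next to HTML parsing.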
## Next Steps

- [Configuration](../configuration.md) - Tune for your environment
- [Troubleshooting](../troubleshooting.md) - Diagnose slow performance
- [Architecture](../architecture.md) - Understand pipeline bottlenecks

---

**Quick Tuning Cheatsheet**

```bash
# Fast & cheap (factual Q&A)
openquery -q 1 -r 3 -c 2 -s "What is X?"

# Thorough (research)
openquery -q 5 -r 10 -c 5 -l "Deep dive on X"

# Custom code edit for concurrency
# In SearchTool.cs:
_options = new ParallelProcessingOptions {
    MaxConcurrentArticleFetches = 20,   // if the network can handle it
    MaxConcurrentEmbeddingRequests = 8  // if the API allows
};
```

699
docs/troubleshooting.md
Normal file
@@ -0,0 +1,699 @@
# Troubleshooting

Solve common issues, errors, and performance problems with OpenQuery.

## 📋 Table of Contents

1. [Common Errors](#common-errors)
2. [Performance Issues](#performance-issues)
3. [Debugging Strategies](#debugging-strategies)
4. [Getting Help](#getting-help)

## Common Errors

### ❌ "API Key is missing"

**Error Message**:
```
[Error] API Key is missing. Set OPENROUTER_API_KEY environment variable or run 'configure -i' to set it up.
```

**Cause**: No API key is available from the environment or the config file.

**Solutions**:

1. **Set environment variable** (temporary):
   ```bash
   export OPENROUTER_API_KEY="sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
   ```

2. **Configure interactively** (persistent):
   ```bash
   openquery configure -i
   # Follow prompts to enter API key
   ```

3. **Check config file**:
   ```bash
   cat ~/.config/openquery/config
   # Should contain: ApiKey=sk-or-...
   ```

4. **Verify environment**:
   ```bash
   echo $OPENROUTER_API_KEY
   # If empty, you didn't export, or exported in a different shell
   ```

---

### ❌ "Network request failed"

**Error Message**:
```
[Error] Network request failed. Details: Name or service not known
```

**Cause**: Cannot reach the OpenRouter or SearxNG API endpoints.

**Solutions**:

1. **Check internet connectivity**:
   ```bash
   ping 8.8.8.8
   curl https://openrouter.ai
   ```

2. **Verify SearxNG is running**:
   ```bash
   curl "http://localhost:8002/search?q=test&format=json"
   # Should return JSON
   ```

   If the connection is refused:
   ```bash
   # Start SearxNG if using Docker
   docker start searxng
   # Or run fresh
   docker run -d --name searxng -p 8002:8080 searxng/searxng:latest
   ```

3. **Check firewall/proxy**:
   ```bash
   # Test OpenRouter API
   curl -H "Authorization: Bearer $OPENROUTER_API_KEY" \
     https://openrouter.ai/api/v1/models
   ```

4. **Test from a different network** (if behind a restrictive firewall)

---

### ❌ "No search results found"

**Error Message**:
```
No search results found.
```

**Cause**: The search queries returned zero results from SearxNG.

**Solutions**:

1. **Test SearxNG manually**:
   ```bash
   curl "http://localhost:8002/search?q=test&format=json" | jq '.results | length'
   # Should be > 0
   ```

2. **Check SearxNG configuration**:
   - If self-hosted: ensure internet access is enabled in `/etc/searxng/settings.yml`
   - Some public instances disable certain engines or have rate limits

3. **Try a different SearxNG instance**:
   ```bash
   export SEARXNG_URL="https://searx.example.com"
   openquery "question"
   ```

4. **Use simpler queries**: Some queries may be too obscure or malformed

5. **Verbose mode to see queries**:
   ```bash
   openquery -v "complex question"
   # See what queries were generated
   ```

---

### ❌ "Found search results but could not extract readable content."

**Cause**: SearxNG returned results, but `ArticleService` failed to extract content from every URL.

**Common Reasons**:
- JavaScript-heavy sites (React, Vue apps) where content is loaded dynamically
- Paywalled sites (NYT, academic journals)
- PDFs or non-HTML content
- Malformed HTML
- Server returned an error (404, 403, 500)
- `robots.txt` blocked the crawler

**Solutions**:
1. **Accept that some sites can't be scraped** - try a different query to get different results
2. **Use `site:reddit.com` or `site:wikipedia.org`** - these are usually scrape-friendly
3. **Increase `--results`** to get more URLs (some will work)
4. **Check verbose output**:
   ```bash
   openquery -v "question"
   # Look for "Warning: Failed to fetch article"
   ```
5. **Try a local SearxNG instance with more engines** - some engines surface different sources

---

### ❌ Rate Limiting (429 Too Many Requests)

**Symptoms**:
```
[Error] Response status code does not indicate success: 429 (Too Many Requests).
```

Or retries being exhausted after Polly's attempts.

**Cause**: Too many concurrent requests to the OpenRouter API.

**Solutions**:

1. **Reduce concurrency** (edit `SearchTool.cs`):
   ```csharp
   var _options = new ParallelProcessingOptions
   {
       MaxConcurrentArticleFetches = 5,     // reduce from 10
       MaxConcurrentEmbeddingRequests = 2,  // reduce from 4
       EmbeddingBatchSize = 150             // reduce from 300
   };
   ```

2. **Add a delay** between embedding batches (custom implementation)

3. **Upgrade your OpenRouter plan** for higher rate limits

4. **Wait and retry** - rate limits reset after the time window

---

### ❌ Slow Performance

**Symptom**: Queries take 60+ seconds when they usually take 20s.

**Diagnosis Steps**:

1. **Run with verbose mode**:
   ```bash
   openquery -v "question"
   ```
   Watch which phase takes longest:
   - Query generation?
   - Searching?
   - Fetching articles?
   - Embeddings?

2. **Check network latency**:
   ```bash
   time curl "https://openrouter.ai/api/v1/models"
   time curl "http://localhost:8002/search?q=test&format=json"
   ```

**Common Causes & Fixes**:

| Phase | Cause | Fix |
|-------|-------|-----|
| Searches | SearxNG overloaded/slow | Check CPU/memory, restart container |
| Fetching | Target sites slow | Reduce `--results` to fewer URLs |
| Embeddings | API rate limited | Reduce concurrency (see above) |
| Answer | Heavy model/load | Switch to a faster model (e.g., Qwen Flash) |

3. **Resource monitoring**:
   ```bash
   htop   # CPU/memory usage
   iftop  # network throughput
   ```

4. **Reduce parameters**:
   ```bash
   openquery -q 2 -r 3 -c 2 "question"  # lighter load
   ```

---

### ❌ Out of Memory

**Symptoms**:
- Process killed by the OOM killer (Linux)
- `System.OutOfMemoryException`
- System becomes unresponsive

**Cause**: Processing too many large articles simultaneously.

**Why**: Each article can be 100KB+ of text split into many chunks, and embeddings cost ~6KB per chunk (1536 floats × 4 bytes). 200 chunks = ~1.2MB of embeddings plus ~100KB of text ≈ 1.3MB. Not huge on its own, but many large articles can create thousands of chunks.

**Solutions**:

1. **Reduce `--results`** (fewer URLs per query):
   ```bash
   openquery -r 3 "question"  # instead of 10
   ```

2. **Reduce `--queries`** (fewer search queries):
   ```bash
   openquery -q 2 "question"
   ```

3. **Fetches are already limited** to 10 concurrent by default, which is reasonable

4. **Check article size**: Some sources (PDFs, long documents) may yield megabytes of text; SmartReader should truncate but may not

---

### ❌ Invalid JSON from Query Generation

**Symptom**: Query generation fails silently and falls back to the original question.

**Cause**: The LLM returned non-JSON (even though instructed). Possible reasons:
- Model not instruction-following
- Output exceeded the context window
- API error in the response

**Detection**: Run with `-v` to see:
```
[Failed to generate queries, falling back to original question. Error: ...]
```

**Solutions**:
- Try a different model (configure to use Gemini or DeepSeek)
- Reduce the `--queries` count (simpler task)
- Tune the system prompt (requires a code change)
- Accept the fallback - the original question often works as the sole query

---

### ❌ Spinner Artifacts in Output

**Symptom**: When redirecting output to a file, you see odd characters like `⠋`, `<60>`, etc.

**Cause**: The spinner uses Unicode Braille characters and ANSI escape codes.

**Fix**: Use `2>/dev/null | sed 's/.\x08//g'` to clean:
```bash
openquery "question" 2>/dev/null | sed 's/.\x08//g' > answer.md
```

Or run with `--verbose` (no spinner, only newline-separated messages):
```bash
openquery -v "question" > answer.txt
```

---

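The cleanup the `sed` one-liner performs (dropping backspace-erased characters) can also be done in a small script. This version additionally strips ANSI escape sequences, which goes slightly beyond what the `sed` command handles:

```python
import re

ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")  # CSI sequences like \x1b[2K
BACKSPACED = re.compile(r".\x08")                    # a char erased by backspace

def clean(raw: str) -> str:
    text = ANSI_ESCAPE.sub("", raw)   # drop ANSI escape codes
    return BACKSPACED.sub("", text)   # drop backspace-erased characters

raw = "\x1b[2K⠋\x08Answer: 42"        # clear-line code, spinner frame, backspace
print(clean(raw))  # Answer: 42
```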
### ❌ "The type or namespace name '...' does not exist" (Build Error)

**Cause**: Missing NuGet package or wrong .NET SDK version.

**Solution**:

1. **Verify .NET SDK 10.0**:
   ```bash
   dotnet --version
   # Should be 10.x
   ```

   If lower: https://dotnet.microsoft.com/download/dotnet/10.0

2. **Restore packages**:
   ```bash
   dotnet restore
   ```

3. **Clean and rebuild**:
   ```bash
   dotnet clean
   dotnet build
   ```

4. **Check OpenQuery.csproj** for package references:
   ```xml
   <PackageReference Include="Polly.Core" Version="8.6.6" />
   <PackageReference Include="Polly.RateLimiting" Version="8.6.6" />
   <PackageReference Include="SmartReader" Version="0.11.0" />
   <PackageReference Include="System.CommandLine" Version="2.0.0-beta4.22272.1" />
   <PackageReference Include="System.Numerics.Tensors" Version="9.0.0" />
   ```

   If restore fails, these packages may not be available for the .NET 10 preview. Consider:
   - Downgrading to .NET 8.0 (if packages are incompatible)
   - Finding package versions compatible with .NET 10

---

### ❌ AOT Compilation Fails

**Error**: `error NETSDK1085: The current .NET SDK does not support targeting .NET 10.0.`

**Cause**: Using a .NET SDK older than 10.0.

**Fix**: Install the .NET SDK 10.0 preview.

**Or**: Disable AOT for development (edit the `.csproj`):
```xml
<!-- Remove or set to false -->
<PublishAot>false</PublishAot>
```

---

## Performance Issues

### Slow First Request

**Expected**: The first query is slower (JIT compilation on the .NET runtime if not AOT, or initial API connections).

If not using AOT:
- Consider publishing with `/p:PublishAot=true` for production distribution
- Development builds use the JIT, which adds 500ms-2s of warmup

**Mitigation**: Accept it as warmup cost, or pre-warm with a dummy query.

---

### High Memory Usage

**Check**:
```bash
ps aux | grep OpenQuery
# Look at RSS (resident set size)
```

**Typical**: 50-200MB (including the .NET runtime, AOT code, and data structures)

**If >500MB**:
- Likely processing very many articles
- Check the `--results` and `--queries` values
- Use `--verbose` to see counts: `[Fetched X search results]`, `[Extracted Y chunks]`

**Reduce**:
- `--queries 2` instead of 10
- `--results 3` instead of 15
- These directly limit the number of URLs to fetch

---

### High CPU Usage

**Cause**:
- SmartReader HTML parsing (CPU-bound)
- Cosine similarity calculations (many chunks, but usually fast)
- Spinner animation (negligible)

**Check**: `htop` → which core is at 100%? If a single core, likely parsing. If all cores, parallel fetching.

**Mitigation**:
- Ensure `MaxConcurrentArticleFetches` is not excessively high (the default of 10 is fine)
- Accept it - CPU spikes are normal during the fetch phase

---

### API Costs Higher Than Expected

**Symptom**: The OpenRouter dashboard shows high token usage.

**Causes**:
1. Using an expensive model (check `OPENROUTER_MODEL`)
2. High `--chunks` → more tokens in context
3. High `--queries` + `--results` → many articles → many embedding tokens (usually cheap)
4. Long answers (many completion tokens), especially with `--long`

**Mitigation**:
- Use `qwen/qwen3.5-flash-02-23` (the cheapest good option)
- Reduce `--chunks` to 2-3
- Use `--short` when a detailed answer is not needed
- Set `MaxTokens` on the request (would require a code change)

---

## Debugging Strategies

### 1. Enable Verbose Mode

Always start with:
```bash
openquery -v "question" 2>&1 | tee debug.log
```

This logs everything:
- Generated queries
- URLs fetched
- Progress counts
- Errors/warnings

**Analyze the log**:
- How many queries were generated? (Should match `--queries`)
- How many search results per query? (Should be ≤ `--results`)
- How many articles were fetched successfully?
- How many chunks were extracted?
- Any warnings?

---

### 2. Isolate Components

**Test SearxNG**:
```bash
curl "http://localhost:8002/search?q=test&format=json" | jq '.results[0]'
```

**Test the OpenRouter API**:
```bash
curl -X POST https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen/qwen3.5-flash-02-23","messages":[{"role":"user","content":"Hello"}]}'
```

**Test article fetching** (with a known-good URL):
```bash
curl -L "https://example.com/article" | head -50
```
Then check whether SmartReader can parse it.

---

### 3. Reduce Scope

Test with minimal parameters to isolate the failing phase:

```bash
# 1 query, 2 results, 1 chunk - should be fast and simple
openquery -q 1 -r 2 -c 1 "simple test question" -v

# If that works, gradually increase:
openquery -q 1 -r 5 -c 1 "simple question"
openquery -q 3 -r 5 -c 1 "simple question"
openquery -q 3 -r 5 -c 3 "simple question"

# Then try the complex question
```

---

### 4. Check Resource Limits

**File descriptors**: Fetching many articles may hit the limit.
```bash
ulimit -n  # usually 1024, should be fine
```

**Memory**: Monitor with `free -h` while running.

**Disk space**: OpenQuery uses little disk, but logs can grow if verbose mode is used repeatedly.

---

### 5. Examine Config File

```bash
cat ~/.config/openquery/config
# Ensure no spaces around '='
# Correct: ApiKey=sk-or-...
# Wrong:   ApiKey = sk-or-... (spaces become part of the value)
```

Reconfigure if needed:
```bash
openquery configure --key "sk-or-..."
```

---

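Why spaces around `=` break things: a plain `key=value` parser splits on the first `=` and keeps everything else verbatim, so the space ends up inside the key and the value. An illustrative sketch (the real parsing lives in OpenQuery's config loader and may differ in detail):

```python
def parse_config(text: str) -> dict[str, str]:
    """Naive key=value parser: split on the first '=', no trimming."""
    entries = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            entries[key] = value   # note: whitespace is NOT stripped
    return entries

good = parse_config("ApiKey=sk-or-abc")
bad = parse_config("ApiKey = sk-or-abc")

print(good.get("ApiKey"))  # sk-or-abc
print(bad.get("ApiKey"))   # None — the key is literally 'ApiKey '
```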
### 6. Clear Cache / Reset

No persistent cache exists, but:
- Restart the SearxNG container: `docker restart searxng`
- Clear the DNS cache if you see network issues: `sudo systemd-resolve --flush-caches`

---

## Getting Help
|
||||||
|
|
||||||
|
### Before Asking
|
||||||
|
|
||||||
|
Gather information:
|
||||||
|
|
||||||
|
1. **OpenQuery version** (commit or build date if available)
|
||||||
|
2. **OS and architecture**: `uname -a` (Linux/macOS) or `systeminfo` (Windows)
|
||||||
|
3. **Full command** you ran
|
||||||
|
4. **Verbose output**: `openquery -v "question" 2>&1 | tee log.txt`
|
||||||
|
5. **Config** (redact API key):
|
||||||
|
```bash
|
||||||
|
sed 's/ApiKey=.*/ApiKey=REDACTED/' ~/.config/openquery/config
|
||||||
|
```
|
||||||
|
6. **SearxNG test**:
|
||||||
|
```bash
|
||||||
|
curl -s "http://localhost:8002/search?q=test&format=json" | jq '.results | length'
|
||||||
|
```
|
||||||
|
7. **OpenRouter test**:
|
||||||
|
```bash
|
||||||
|
curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
|
||||||
|
https://openrouter.ai/api/v1/models | jq '.data[0].id'
|
||||||
|
```
|
||||||
|
|
||||||
|
---

### Where to Ask

1. **GitHub Issues** (if the repository is hosted there):
   - Search existing issues first
   - Provide all the information above
   - Include the log file (or link to a gist)

2. **Community Forum** (if one exists)

3. **Self-Diagnose**:
   - Check `docs/troubleshooting.md` (this file)
   - Check `docs/configuration.md`
   - Check `docs/usage.md`

---

### Example Bug Report

```
Title: OpenQuery hangs on "Fetching article X/Y"

Platform: Ubuntu 22.04, .NET 10.0, OpenQuery built from commit abc123
Command: openquery -v "What is Docker?" 2>&1 | tee log.txt

Verbose output shows:
[...]
[Fetching article 1/15: docker.com]
[Fetching article 2/15: hub.docker.com]
[Fetching article 3/15: docs.docker.com]
# Hangs here indefinitely, no more progress

SearxNG test:
$ curl "http://localhost:8002/search?q=docker&format=json" | jq '.results | length'
15  # SearxNG works

Config:
ApiKey=sk-or-xxxx (redacted)
Model=qwen/qwen3.5-flash-02-23
DefaultQueries=3
DefaultChunks=3
DefaultResults=5

Observation:
- Fetches 3 articles fine, then stalls
- Nothing in log after "Fetching article 3/15"
- Process uses ~150MB memory, CPU 0% (idle)
- Ctrl+C exits immediately

Expected: Should fetch remaining 12 articles (concurrent up to 10)
Actual: Only 3 fetched, then silent hang
```

---

## Known Issues

### Issue: Spinner Characters Not Displaying

Some terminals don't support Braille Unicode patterns.

**Symptoms**: The spinner shows as `?` or boxes.

**Fix**: Use a font with Unicode support, or disable the spinner by setting `TERM=dumb` or using `--verbose`.

---

### Issue: Progress Messages Overwritten

In very fast operations, progress updates may overlap.

**Cause**: `StatusReporter` uses `Console.Write` without a lock in compact mode; concurrent writes from the channel processor and the spinner task could interleave.

**Mitigation**: Unlikely in practice (the channel serializes writes, and the spinner only updates when `_currentMessage` is set). If it becomes a problem, add a lock around the Console operations.

---

### Issue: Articles with No Text Content

Some URLs return articles with an empty `TextContent`.

**Cause**: SmartReader's quality heuristic (`IsReadable`) failed, or the page genuinely has no extractable text (an image, script, or error page).

**Effect**: Those URLs contribute zero chunks.

**Acceptable**: This is part of normal operation; not all URLs yield readable content.

---

### Issue: Duplicate Sources in Answer

The same website may appear multiple times (as different articles).

**Cause**: Different URLs from different search results can come from the same domain but point to different pages.

**Effect**: `[Source 1]` and `[Source 3]` could both be `example.com`. Not necessarily bad - they're different articles.

---
## Performance Tuning Reference

| Setting | Default | Fastest | Most Thorough | Notes |
|---------|---------|---------|---------------|-------|
| `--queries` | 3 | 1 | 8+ | More queries = more searches |
| `--results` | 5 | 2 | 15+ | Fewer = fewer articles to fetch |
| `--chunks` | 3 | 1 | 5+ | More chunks = more context tokens |
| `MaxConcurrentArticleFetches` | 10 | 5 | 20 | Higher = more parallel fetches |
| `MaxConcurrentEmbeddingRequests` | 4 | 2 | 8 | Higher = faster embeddings (may hit rate limits) |
| `EmbeddingBatchSize` | 300 | 100 | 1000 | Larger = fewer API calls, more data per call |

**Start**: The defaults are balanced.

**Adjust if**:
- Slow: Reduce `--results`, `--queries`, or concurrency limits
- Poor quality: Increase `--chunks`, `--results`, `--queries`
- Rate limited: Reduce concurrency limits
- High cost: Use `--short`, reduce `--chunks`, choose a cheaper model
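
A rough way to reason about the first three settings: the number of candidate articles grows multiplicatively with `--queries` and `--results` (actual fetch counts are lower after URL de-duplication). A back-of-envelope sketch:

```bash
# Upper bound on candidate articles per run: queries x results.
queries=3
results=5
echo "Up to $((queries * results)) candidate articles per run"
```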

---

## Next Steps

- [Performance](../performance.md) - Detailed performance analysis
- [Configuration](../configuration.md) - Adjust settings
- [Usage](../usage.md) - Optimize workflow

---

**Quick Diagnostic Checklist**

```bash
# 1. Check API key
echo $OPENROUTER_API_KEY | head -c 10

# 2. Test SearxNG
curl -s "http://localhost:8002/search?q=test&format=json" | jq '.results | length'

# 3. Test OpenRouter
curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  https://openrouter.ai/api/v1/models | jq '.data[0].id'

# 4. Run verbose
openquery -v "test" 2>&1 | grep -E "Fetching|Generated|Found"

# 5. Check resource usage while running
htop

# 6. Reduce scope and retry
openquery -q 1 -r 2 -c 1 "simple test"
```

483
docs/usage.md
Normal file
@@ -0,0 +1,483 @@
# Usage Guide

Complete reference for using the OpenQuery command-line interface.

## 📋 Table of Contents

1. [Basic Usage](#basic-usage)
2. [Command Reference](#command-reference)
3. [Examples](#examples)
4. [Output Format](#output-format)
5. [Tips and Tricks](#tips-and-tricks)
## Basic Usage

### Simplest Form

```bash
openquery "your question here"
```

That's it! OpenQuery will:

1. Generate search queries
2. Search the web
3. Extract relevant content
4. Stream an answer with sources

### Common Pattern

```bash
openquery [OPTIONS] "your question"
```

Quotes around the question are recommended to preserve spaces.
## Command Reference

### Main Command

#### `openquery [options] <question>`

Ask a question and get an AI-powered answer with citations.

**Arguments**:
- `question` (positional, one or more words) - The question to ask

**Options**:

| Option | Aliases | Type | Default | Description |
|--------|---------|------|---------|-------------|
| `--chunks` | `-c` | int | 3 (from config) | Number of top relevant content chunks to include in context |
| `--results` | `-r` | int | 5 (from config) | Number of search results to fetch per generated query |
| `--queries` | `-q` | int | 3 (from config) | Number of search queries to generate from your question |
| `--short` | `-s` | bool | false | Request a concise, to-the-point answer |
| `--long` | `-l` | bool | false | Request a detailed, comprehensive answer |
| `--verbose` | `-v` | bool | false | Show detailed progress information and debug output |

**Behavior**:
- `--short` and `--long` can both be omitted for a balanced answer
- If both `--short` and `--long` are specified, `--long` takes precedence
- Command-line options override configuration file defaults
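
That precedence can be pictured as a simple fallback chain. This is an illustrative shell sketch of the behavior, not the actual implementation:

```bash
# Effective value = CLI flag if given, otherwise the config-file default.
config_queries=3   # e.g. DefaultQueries from the config file
flag_queries=""    # set only when -q/--queries is passed
effective_queries=${flag_queries:-$config_queries}
echo "queries: $effective_queries"

flag_queries=5     # simulate passing '-q 5'
effective_queries=${flag_queries:-$config_queries}
echo "queries: $effective_queries"
```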

#### `openquery configure [options]`

Configure OpenQuery settings (API key, model, defaults).

**Options**:

| Option | Type | Description |
|--------|------|-------------|
| `--interactive` / `-i` | bool | Launch interactive configuration wizard |
| `--key` | string | Set the OpenRouter API key |
| `--model` | string | Set the default model |
| `--queries` | int? | Set default number of queries |
| `--chunks` | int? | Set default number of chunks |
| `--results` | int? | Set default number of results |

**Examples**:

```bash
# Interactive wizard
openquery configure -i

# Set just the API key
openquery configure --key "sk-or-..."

# Set multiple defaults non-interactively
openquery configure --model "deepseek/deepseek-v3.2" --queries 5 --chunks 4
```

**Note**: Options marked `?` are nullable; only the values you provide are updated.
## Examples

### Everyday Queries

**Simple factual question**:
```bash
openquery "What is the speed of light?"
```

**Multi-word question**:
```bash
openquery "How do solar panels work?"
```

**Question with special characters**:
```bash
openquery "What's the weather in New York?"
```
### Customizing Output

**Get a quick answer**:
```bash
openquery -s "Who is the CEO of Tesla?"
```
Output: "Elon Musk is the CEO of Tesla." (minimal explanation)

**Get detailed analysis**:
```bash
openquery -l "Explain how nuclear fusion works"
```
Output: A multi-paragraph explanation with scientific detail

**See everything**:
```bash
openquery -v "What is machine learning?"
```
Output: Shows all progress messages alongside the answer
### Adjusting Search Depth

**Minimal search** (fast, cheap):
```bash
openquery -q 1 -r 2 -c 1 "What time is it in London?"
```
- 1 generated query
- 2 results per query
- 1 context chunk

**Thorough research** (slow, comprehensive):
```bash
openquery -q 8 -r 15 -c 5 "History and applications of cryptography"
```
- 8 diverse queries
- 15 results each
- 5 top chunks

**Balanced (recommended defaults)**:
```bash
openquery "Latest advancements in CRISPR technology"
```
- 3 queries
- 5 results each
- 3 top chunks
### Combining Options

**Verbose custom search**:
```bash
openquery -v -q 5 -r 10 -c 4 "What are the ethical implications of AI?"
```

**Short answer with more context**:
```bash
openquery -s -c 5 "Python vs JavaScript for web development"
```

**Long answer, lots of research**:
```bash
openquery -l -q 10 -r 20 -c 6 "Complete guide to quantum computing"
```
### Practical Use Cases

**News and Current Events**:
```bash
openquery "Latest developments in the Ukraine conflict"
```

**Technical Questions**:
```bash
openquery "How to set up a PostgreSQL replication cluster"
```

**Health Information** (verify with a doctor!):
```bash
openquery "What are the symptoms of vitamin D deficiency?"
```

**Cooking**:
```bash
openquery "How to make authentic Italian pizza dough"
```

**Travel**:
```bash
openquery "Best things to do in Tokyo in spring"
```

**Programming**:
```bash
openquery "Rust vs Go for backend development in 2025"
```
### Configuration Examples

**Set up for the first time**:
```bash
openquery configure -i
# Follow the prompts to enter an API key, choose a model, and set defaults
```

**Switch to a different model**:
```bash
openquery configure --model "google/gemini-3-flash-preview"
```

**Update the default number of queries**:
```bash
openquery configure --queries 5
```

**Set cost-effective defaults**:
```bash
openquery configure --model "qwen/qwen3.5-flash-02-23" --queries 2 --chunks 2 --results 3
```

**Check your configuration**:
```bash
cat ~/.config/openquery/config
```
## Output Format

### Standard Output (Streaming)

The answer streams in real time, like this:

```
⠋ Generating search queries... (spinner with status)
⠹ Searching web...
⠸ Fetching articles...
⠼ Processing embeddings...
⠴ Generating answer...
Assistant: Quantum entanglement is a phenomenon where pairs or groups of
particles interact in ways such that the quantum state of each particle
cannot be described independently of the others, even when separated by
large distances.

[Source 1: Understanding Quantum Mechanics](https://example.com/quantum)
[Source 2: Quantum Physics Overview](https://example.com/physics)
```
### Verbose Mode Output (`-v`)

When `--verbose` is enabled, you see detailed progress:

```
[Generating 3 search queries based on your question...]
[Generated queries:
  1. quantum entanglement definition
  2. how quantum entanglement works
  3. quantum entanglement Bell's theorem
]
[Searching web for 'quantum entanglement definition'...]
[Searching web for 'how quantum entanglement works'...]
[Searching web for 'quantum entanglement Bell's theorem'...]
[Fetched 15 search results total]
[Fetching article 1/12: physicsworld.com]
[Fetching article 2/12: nature.com]
...
[Fetching article 12/12: scientificamerican.com]
[Extracted 48 content chunks]
[Generating embeddings: batch 1/4]
[Generating embeddings: batch 2/4]
[Generating embeddings: batch 3/4]
[Generating embeddings: batch 4/4]
[Ranked chunks by relevance]
[Found top 3 most relevant chunks overall. Generating answer...]

Assistant: Quantum entanglement is a fundamental phenomenon in quantum
mechanics where...
```
### Source Citations

Sources are formatted as markdown links in the answer:

```
[Source 1: Article Title](https://example.com/article)
```

These appear inline where the AI references that source. Multiple sources can be cited in a single paragraph.
### Error Output

Errors are written to stderr, and the process exits with a non-zero status:

```
[Error] API Key is missing. Set OPENROUTER_API_KEY environment variable or run 'configure -i'.
```
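
Because the answer goes to stdout and diagnostics go to stderr, the two streams can be captured separately. A sketch, using a stand-in function in place of a real `openquery` run:

```bash
# Stand-in for an openquery invocation that emits an answer and a diagnostic.
run() { echo "answer text"; echo "[Error] demo diagnostic" >&2; }

ans=$(mktemp); err=$(mktemp)
run > "$ans" 2> "$err"   # answer and errors land in separate files
cat "$err"
```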

## Tips and Tricks

### Speed Tips

1. **Reduce concurrency limits** (edit `SearchTool.cs`) if you are constantly rate-limited
2. **Reduce `--results`** - fewer articles to fetch and process
3. **Reduce `--queries`** - fewer parallel searches
4. **Use a local SearxNG** - minimizes network latency to the search backend
5. **Cache results** - a future enhancement could add caching
### Quality Tips

1. **Increase `--chunks`** to 4-5 for complex topics
2. **Increase `--queries`** to 5-8 for broad exploration
3. **Use `--long`** for deep topics that need elaboration
4. **Check `-v` output** to see which sources were selected
5. **Try different models** - some are better at synthesis, others at facts
### Cost Tips

1. **Use `qwen/qwen3.5-flash-02-23`** - the cheapest good model
2. **Reduce `--chunks` and `--results`** - fewer tokens in context
3. **Use `--short`** - shorter answers use fewer completion tokens
4. **Monitor usage** at the [openrouter.ai](https://openrouter.ai) dashboard
### Workflow Tips

**Iterative deepening**:
```bash
# Start broad
openquery -v "machine learning"

# Identify subtopics from the answer, then dive deeper
openquery "What is transformer architecture in LLMs?"
```

**Compare answers**:
```bash
# Same question with different models
OPENROUTER_MODEL="qwen/qwen3.5-flash-02-23" openquery "question"
OPENROUTER_MODEL="google/gemini-3-flash-preview" openquery "question"
```

**Save answers**:
```bash
openquery "What is Docker?" > answer.md
# answer.md will contain the streamed output (including spinner chars, so filter):
openquery "What is Docker?" 2>/dev/null | sed 's/.\x08//g' > clean-answer.md
```
### Shell Aliases and Functions

Add to `~/.bashrc` or `~/.zshrc`:

```bash
# Short alias
alias oq='openquery'

# With common options
alias oql='openquery -l -q 5 -r 10'  # long, thorough
alias oqs='openquery -s'             # short
alias oqv='openquery -v'             # verbose

# Function to save output cleanly
oqsave() {
  openquery "$@" 2>/dev/null | sed 's/.\x08//g' > "answer-$(date +%Y%m%d-%H%M%S).md"
}
```
### Scripting

```bash
#!/bin/bash
# batch-questions.sh

while IFS= read -r question; do
  echo "## $question" >> research.md
  echo "" >> research.md
  openquery -l "$question" 2>/dev/null | sed 's/.\x08//g' >> research.md
  echo "" >> research.md
done < questions.txt
```
### Chaining with Other Tools

Pipe to `jq` (if you modify OpenQuery to output JSON):
```bash
# Future: openquery --json "question" | jq '.answer'
```

Pipe to `pbcopy` (macOS) or `xclip` (Linux):
```bash
openquery "quick fact" 2>/dev/null | sed 's/.\x08//g' | pbcopy
```

Filter sources:
```bash
openquery "topic" 2>/dev/null | sed 's/.\x08//g' | grep -E '^\[Source'
```
## Keyboard Interrupts

- **Ctrl+C** during processing: cancels the current operation and exits gracefully
- **Ctrl+C** during a streaming answer: stops streaming and shows the partial answer
- **Ctrl+Z** (suspend): not recommended; may leave background tasks running

OpenQuery uses proper cancellation tokens to clean up resources on interrupt.
## Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success - an answer was generated |
| 1 | Error - see the stderr message |
| 2 | Configuration error (missing API key) |

You can check the exit code in shell scripts:

```bash
openquery "question"
if [ $? -eq 0 ]; then
  echo "Success!"
else
  echo "Failed"
fi
```
## Limitations and Workarounds

### Question Length

Very long questions (>2000 chars) may be truncated by the LLM's context window or hit token limits.

**Workaround**: Keep questions concise; ask complex multi-part questions separately.

### Answer Length Limits

The LLM may hit `max_tokens` limits on very complex questions.

**Workaround**: Use the `--long` flag (which already maximizes the allowed tokens) or break the question into sub-questions.
### Rate Limiting

OpenRouter may rate-limit you if you send too many requests too quickly.

**Symptoms**: 429 errors, occasional timeouts.

**Workaround**: The built-in retry (Polly) handles this automatically. For persistent issues:
- Reduce concurrency (edit the code)
- Add delays between queries
- Upgrade your OpenRouter plan
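
For scripted batch runs, a shell-level retry wrapper can add another layer of resilience on top of the built-in retry. A generic sketch; the `retry` helper is not part of OpenQuery:

```bash
# Retry a command up to 3 attempts with a growing delay between tries.
retry() {
  n=0
  until "$@"; do
    n=$((n + 1))
    [ "$n" -ge 3 ] && return 1
    sleep "$n"   # back off: 1s, then 2s
  done
}

# Usage: retry openquery "question"
```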

### SearxNG Timeouts

Large SearxNG responses or slow target sites may time out.

**Workaround**: Reduce `--results` or check the SearxNG logs. There is nothing to change on the OpenQuery side (the HTTP client timeout defaults to ~30s).

---

## Next Steps

- [Architecture](architecture.md) - Understand how OpenQuery works under the hood
- [Configuration](configuration.md) - Fine-tune your setup
- [Troubleshooting](troubleshooting.md) - Solve common problems

---

**Quick Reference Card**

```bash
# Basic
openquery "question"

# Quick fact
openquery -s "question"

# Deep research
openquery -l -q 5 -r 10 -c 4 "question"

# See progress
openquery -v "question"

# Configure
openquery configure -i

# Check config
cat ~/.config/openquery/config
```