docs: add comprehensive documentation with README and detailed guides
- Add user-friendly README.md with quick start guide
- Create docs/ folder with structured technical documentation:
  - installation.md: Build and setup instructions
  - configuration.md: Complete config reference
  - usage.md: CLI usage guide with examples
  - architecture.md: System design and patterns
  - components/: Deep dive into each component (OpenQueryApp, SearchTool, Services, Models)
  - api/: CLI reference, environment variables, programmatic API
  - troubleshooting.md: Common issues and solutions
  - performance.md: Latency, throughput, and optimization
- All documentation fully cross-referenced with internal links
- Covers project overview, architecture, components, APIs, and support

See individual files for complete documentation.
356
docs/configuration.md
Normal file
@@ -0,0 +1,356 @@
# Configuration

Complete guide to configuring OpenQuery for your environment.

## 📋 Table of Contents

1. [Configuration Methods](#configuration-methods)
2. [Configuration File](#configuration-file)
3. [Environment Variables](#environment-variables)
4. [Command-Line Options](#command-line-options)
5. [Configuration Priority](#configuration-priority)
6. [Recommended Settings](#recommended-settings)
7. [Advanced Configuration](#advanced-configuration)
8. [Managing Multiple Configurations](#managing-multiple-configurations)
9. [Configuration Validation](#configuration-validation)

## Configuration Methods

OpenQuery reads settings from three sources and merges them in a fixed order of precedence (see [Configuration Priority](#configuration-priority)):

| Method | Persistence | Use Case |
|--------|-------------|----------|
| Configuration File | Permanent | Default values you use daily |
| Environment Variables | Session/Shell | CI/CD, scripting, temporary overrides |
| Command-Line Options | Per-execution | One-off customizations |

## Configuration File

### Location

OpenQuery follows the [XDG Base Directory](https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html) specification:

- **Linux/macOS**: `~/.config/openquery/config`
- **Windows**: `%APPDATA%\openquery\config` (e.g., `C:\Users\<user>\AppData\Roaming\openquery\config`)

### Format

Simple `key=value` pairs, one per line:

```ini
ApiKey=sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Model=qwen/qwen3.5-flash-02-23
DefaultQueries=3
DefaultChunks=3
DefaultResults=5
```
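
To illustrate the format, the following shell sketch reads one key from a `key=value` file. It is not OpenQuery's own parser: the function name `read_config_key` is hypothetical, and skipping blank lines and `#` comments is an assumption rather than documented behavior.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of OpenQuery): print the value stored for a
# key in a key=value file. Returns non-zero when the key is absent.
read_config_key() {
  local key="$1" file="$2" name value
  # Split each line at the first '='; the rest of the line becomes the value.
  while IFS='=' read -r name value || [ -n "$name" ]; do
    # Assumption: blank lines and lines starting with '#' are ignored.
    case "$name" in ''|'#'*) continue ;; esac
    if [ "$name" = "$key" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done < "$file"
  return 1
}
```

For example, `read_config_key Model ~/.config/openquery/config` would print the configured model name, or print nothing and return a non-zero status if the key is not set.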

### Schema

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `ApiKey` | string | "" | OpenRouter API authentication key |
| `Model` | string | `qwen/qwen3.5-flash-02-23` | Default LLM model to use |
| `DefaultQueries` | int | 3 | Number of search queries to generate |
| `DefaultChunks` | int | 3 | Number of top context chunks to include |
| `DefaultResults` | int | 5 | Number of search results per query |

### Example Configurations

**Minimal** (just API key):

```ini
ApiKey=sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

**Optimized for Research**:

```ini
ApiKey=sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Model=google/gemini-3-flash-preview
DefaultQueries=5
DefaultChunks=4
DefaultResults=10
```

**Cost-Conscious**:

```ini
ApiKey=sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Model=qwen/qwen3.5-flash-02-23
DefaultQueries=2
DefaultChunks=2
DefaultResults=3
```

## Environment Variables

Environment variables override the configuration file and can be set temporarily or permanently in your shell profile.

### Available Variables

| Variable | Purpose | Required | Example |
|----------|---------|----------|---------|
| `OPENROUTER_API_KEY` | OpenRouter API key | **Yes** (unless in config file) | `export OPENROUTER_API_KEY="sk-or-..."` |
| `OPENROUTER_MODEL` | Override default LLM model | No | `export OPENROUTER_MODEL="deepseek/deepseek-v3.2"` |
| `SEARXNG_URL` | URL of SearxNG instance | No (default: `http://localhost:8002`) | `export SEARXNG_URL="https://searx.example.com"` |

### Setting Environment Variables

#### Temporary (Current Session)

```bash
# Linux/macOS
export OPENROUTER_API_KEY="sk-or-..."
export SEARXNG_URL="http://localhost:8002"

# Windows PowerShell
$env:OPENROUTER_API_KEY="sk-or-..."
$env:SEARXNG_URL="http://localhost:8002"
```

#### Permanent (Shell Profile)

**bash** (`~/.bashrc` or `~/.bash_profile`):

```bash
export OPENROUTER_API_KEY="sk-or-..."
export SEARXNG_URL="http://localhost:8002"
```

**zsh** (`~/.zshrc`):

```zsh
export OPENROUTER_API_KEY="sk-or-..."
export SEARXNG_URL="http://localhost:8002"
```

**fish** (`~/.config/fish/config.fish`):

```fish
set -x OPENROUTER_API_KEY "sk-or-..."
set -x SEARXNG_URL "http://localhost:8002"
```

**Windows** (PowerShell; run once to persist the values for the current user):

```powershell
[Environment]::SetEnvironmentVariable("OPENROUTER_API_KEY", "sk-or-...", "User")
[Environment]::SetEnvironmentVariable("SEARXNG_URL", "http://localhost:8002", "User")
```

After editing profile files, restart your terminal or run `source ~/.bashrc` (or equivalent).

### Security Note

Never commit your API key to version control. Keep it in an environment variable or in the config file, which lives outside the project directory (`~/.config/`) and so is not tracked by the repository. The default `.gitignore` excludes common build directories only; if you copy a config file into a repository, add it to `.gitignore` yourself.

## Command-Line Options

Options passed directly to the `openquery` command override both config file and environment variables for that specific execution.

### Main Command Options

```bash
openquery [OPTIONS] <question>
```

| Option | Aliases | Type | Default Source | Description |
|--------|---------|------|----------------|-------------|
| `--chunks` | `-c` | int | Config `DefaultChunks` | Number of top context chunks |
| `--results` | `-r` | int | Config `DefaultResults` | Search results per query |
| `--queries` | `-q` | int | Config `DefaultQueries` | Number of search queries |
| `--short` | `-s` | bool | false | Request concise answer |
| `--long` | `-l` | bool | false | Request detailed answer |
| `--verbose` | `-v` | bool | false | Show detailed progress |

### Configure Command Options

```bash
openquery configure [OPTIONS]
```

| Option | Type | Description |
|--------|------|-------------|
| `--interactive` / `-i` | bool | Launch interactive configuration wizard |
| `--key` | string | Set API key |
| `--model` | string | Set default model |
| `--queries` | int? | Set default queries |
| `--chunks` | int? | Set default chunks |
| `--results` | int? | Set default results |

## Configuration Priority

When OpenQuery needs a value, it checks sources in this order (highest to lowest priority):

1. **Command-line option** (if provided)
2. **Environment variable** (if set)
3. **Configuration file** (if key exists)
4. **Hard-coded default** (if all above missing)

### Examples

**Example 1**: CLI and environment override config

```bash
# config file: DefaultQueries=5
export OPENROUTER_MODEL="deepseek/deepseek-v3.2"
openquery --queries 2 "question" # Uses: queries=2 (CLI), model=deepseek (env), chunks=3 (default)
```

**Example 2**: CLI overrides everything

```bash
export OPENROUTER_MODEL="qwen/qwen3.5-flash-02-23"
# Note: --model is not listed under Main Command Options above; confirm your build accepts it.
openquery --model "google/gemini-3-flash-preview" --chunks 5 "question"
# Uses: model=google (CLI), chunks=5 (CLI), queries=3 (default)
```

**Example 3**: All sources combined

```bash
# config: DefaultChunks=4
# env: OPENROUTER_MODEL="moonshotai/kimi-k2.5", SEARXNG_URL="http://custom:8002"
openquery --queries 6 --short "question"
# Uses: queries=6 (CLI), chunks=4 (config), results=5 (default),
#       model=kimi-k2.5 (env), searxng=custom (env), short=true (CLI)
```
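
The priority order described in this section can be sketched as a "first non-empty value wins" lookup. This is a minimal illustration, not OpenQuery's implementation: `resolve_setting` is a hypothetical helper, and treating an empty string as "not set" is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first non-empty value, checked in priority
# order: CLI option, environment variable, config file, hard-coded default.
resolve_setting() {
  local cli_value="$1" env_value="$2" file_value="$3" default_value="$4"
  if [ -n "$cli_value" ]; then
    printf '%s\n' "$cli_value"
  elif [ -n "$env_value" ]; then
    printf '%s\n' "$env_value"
  elif [ -n "$file_value" ]; then
    printf '%s\n' "$file_value"
  else
    printf '%s\n' "$default_value"
  fi
}

# Model: no CLI value, environment set, config file set -> environment wins.
resolve_setting "" "deepseek/deepseek-v3.2" "qwen/qwen3.5-flash-02-23" "qwen/qwen3.5-flash-02-23"
# Queries: CLI value given, config file says 5 -> CLI wins.
resolve_setting "2" "" "5" "3"
```

The two calls mirror Example 1: the model comes from the environment and the query count from the command line.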

## Recommended Settings

### For Quick Questions (Facts, Definitions)

```bash
openquery -q 2 -r 3 -c 2 "What is the capital of France?"
```

- Few queries (2) for straightforward facts
- Few results (3) to minimize processing
- Few chunks (2) for focused answer

### For Research (Complex Topics)

```bash
openquery -q 5 -r 10 -c 4 -l "Explain the causes of the French Revolution"
```

- More queries (5) for diverse perspectives
- More results (10) for comprehensive coverage
- More chunks (4) for rich context
- Long format for depth

### For Exploration (Broad Topics)

```bash
openquery -q 8 -r 15 -c 5 "What are the latest developments in AI?"
```

- Many queries (8) to explore different angles
- Many results (15) for breadth
- More chunks (5) for extensive context
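
Because `--results` applies per query, the number of search results retrieved grows with the product of the two settings. The sketch below computes that upper bound for the three presets above. It assumes no deduplication and no failed fetches, neither of which this guide specifies, and `max_search_results` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: upper bound on search results for one run,
# assuming every query returns the full number of results.
max_search_results() {
  local queries="$1" results="$2"
  echo $(( queries * results ))
}

max_search_results 2 3    # quick question preset: 6
max_search_results 5 10   # research preset: 50
max_search_results 8 15   # exploration preset: 120
```

Going from the quick preset to the exploration preset therefore raises the worst-case number of results from 6 to 120, which is worth keeping in mind for both latency and cost.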

### Cost Optimization

```bash
openquery configure --model "qwen/qwen3.5-flash-02-23"
# Keep defaults: -q 3 -r 5 -c 3
```

- Qwen Flash is very cost-effective
- Default parameters provide good balance

### Performance Optimization

```bash
# Adjust ParallelProcessingOptions in SearchTool.cs if needed
# Default: MaxConcurrentArticleFetches=10, MaxConcurrentEmbeddingRequests=4
```

- Reduce these values if you see rate limits or memory pressure
- Increase them if you have fast network/API and want more speed

## Advanced Configuration

### Changing Concurrency Limits

Concurrency limits are currently hardcoded in `SearchTool.cs` but can be adjusted:

```csharp
public class ParallelProcessingOptions
{
    public int MaxConcurrentArticleFetches { get; set; } = 10; // ← Change this
    public int MaxConcurrentEmbeddingRequests { get; set; } = 4; // ← Change this
    public int EmbeddingBatchSize { get; set; } = 300; // ← Change this
}
```

To make these configurable, you could:

1. Add fields to `AppConfig`
2. Read from config file
3. Pass through to `SearchTool` constructor

### Custom Embedding Model

The embedding model is hardcoded to `openai/text-embedding-3-small`. To change it, edit the default value of the `embeddingModel` parameter in the `EmbeddingService` constructor:

```csharp
public EmbeddingService(OpenRouterClient client, string embeddingModel = "your-model")
```

Alternatively, it could be made configurable via CLI/config (future enhancement).

### Changing Chunk Size

Chunk size (500 chars) is defined in `ChunkingService.cs`:

```csharp
private const int MAX_CHUNK_SIZE = 500;
```

Modify this constant to change how articles are split. Larger chunks:

- ✅ More context per chunk
- ❌ Fewer chunks for same article
- ❌ Higher token usage in final answer

Smaller chunks:

- ✅ More granular matching
- ❌ May lose context across chunk boundaries
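
To make the trade-off concrete, the sketch below estimates how many chunks one article produces at different chunk sizes. It assumes fixed-size, non-overlapping chunks; the real `ChunkingService` may split differently (for example on sentence boundaries), so treat the numbers as rough estimates. `estimate_chunks` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical estimate: number of chunks for an article of a given length,
# assuming fixed-size, non-overlapping chunks.
estimate_chunks() {
  local article_chars="$1" chunk_size="$2"
  echo $(( (article_chars + chunk_size - 1) / chunk_size ))  # ceiling division
}

estimate_chunks 6000 500    # 12 chunks at the current MAX_CHUNK_SIZE
estimate_chunks 6000 1000   # 6 chunks if the constant were doubled
```

Under these assumptions, doubling the chunk size halves the number of candidate chunks per article while doubling the text each selected chunk contributes to the final prompt.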

### Using a Custom SearxNG Instance

Some SearxNG deployments may require HTTPS, authentication, or custom paths:

```bash
# With authentication (if supported)
export SEARXNG_URL="https://user:pass@searx.example.com:8080"

# With custom path
export SEARXNG_URL="https://searx.example.com/custom-path"
```

Note: Most SearxNG instances don't require auth as they're designed for privacy.

### OpenRouter Settings

OpenRouter supports additional parameters (not yet exposed in OpenQuery):

- `temperature` - Randomness (0-2, default ~1)
- `max_tokens` - Response length limit
- `top_p` - Nucleus sampling
- `frequency_penalty` / `presence_penalty`

These could be added to `ChatCompletionRequest` in future versions.

## Managing Multiple Configurations

OpenQuery currently reads only `~/.config/openquery/config`, so per-project config files are not supported out of the box. You can still keep several config files and symlink the one you need into place. The commands below show how a per-project override could look if support for an `OPENQUERY_CONFIG` environment variable were added; the variable has no effect today:

```bash
# Create project-specific config
cp ~/.config/openquery/config ~/myproject/openquery.config

# Use it temporarily (requires OPENQUERY_CONFIG support, not yet implemented)
OPENQUERY_CONFIG=~/myproject/openquery.config openquery "question"
```

**Note**: Multi-config support would require code changes (reading the config path from the `OPENQUERY_CONFIG` env var).
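
One workaround that needs no code changes is to replace the default config path with a symlink to the file you want. The sketch below is a hypothetical helper, not part of OpenQuery; it assumes OpenQuery follows symlinks when reading its config file, which this guide does not confirm.

```bash
#!/usr/bin/env bash
# Hypothetical helper: point the default config location at another file.
use_openquery_config() {
  local source_file target="$HOME/.config/openquery/config"
  # Resolve to an absolute path so the symlink works from any directory.
  source_file="$(cd "$(dirname "$1")" 2>/dev/null && pwd)/$(basename "$1")"
  if [ ! -f "$source_file" ]; then
    echo "no such file: $1" >&2
    return 1
  fi
  mkdir -p "$(dirname "$target")"
  # Back up a regular (non-symlink) config before replacing it.
  if [ -f "$target" ] && [ ! -L "$target" ]; then
    mv "$target" "$target.bak"
  fi
  ln -sfn "$source_file" "$target"
}
```

For example, `use_openquery_config ~/myproject/openquery.config` would make that file the active configuration until the function is called again with a different path.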

## Configuration Validation

OpenQuery doesn't strictly validate config values. Invalid settings may cause runtime errors:

- `DefaultQueries <= 0` → May cause exceptions or zero queries
- `DefaultChunks <= 0` → May return no context
- `DefaultResults <= 0` → No search results

Validate manually:

```bash
# Inspect the config file contents
cat ~/.config/openquery/config

# Run a query in verbose mode to confirm the settings are picked up
openquery -v "test"
```
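
Since OpenQuery does not validate these values itself, a small pre-flight check can catch the cases listed above. The following is a sketch under assumptions, not an OpenQuery feature: `validate_openquery_config` is a hypothetical helper, and it rejects anything other than a positive integer without leading zeros.

```bash
#!/usr/bin/env bash
# Hypothetical check: report numeric settings that are not positive integers.
# Keys that are absent from the file are skipped, since defaults apply.
validate_openquery_config() {
  local file="$1" status=0 key value
  for key in DefaultQueries DefaultChunks DefaultResults; do
    grep -qE "^${key}=" "$file" || continue
    # If a key appears more than once, check the last occurrence.
    value="$(grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2-)"
    case "$value" in
      ''|*[!0-9]*|0*)
        echo "invalid ${key}: '${value}'" >&2
        status=1
        ;;
    esac
  done
  return "$status"
}
```

Running `validate_openquery_config ~/.config/openquery/config` returns a non-zero status and prints one line per offending key to standard error.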

---

## Next Steps

- [Usage Guide](usage.md) - Learn how to use the CLI
- [Architecture](architecture.md) - Understand the system design
- [Troubleshooting](troubleshooting.md) - Fix common issues