docs: add comprehensive documentation with README and detailed guides

- Add user-friendly README.md with quick start guide
- Create docs/ folder with structured technical documentation:
  - installation.md: Build and setup instructions
  - configuration.md: Complete config reference
  - usage.md: CLI usage guide with examples
  - architecture.md: System design and patterns
  - components/: Deep dive into each component (OpenQueryApp, SearchTool, Services, Models)
  - api/: CLI reference, environment variables, programmatic API
  - troubleshooting.md: Common issues and solutions
  - performance.md: Latency, throughput, and optimization
- All documentation fully cross-referenced with internal links
- Covers project overview, architecture, components, APIs, and support

See individual files for complete documentation.
This commit is contained in:
OpenQuery Documentation
2026-03-19 10:01:58 +01:00
parent b28d8998f7
commit 65ca2401ae
16 changed files with 7073 additions and 0 deletions

docs/troubleshooting.md (699 additions)

# Troubleshooting
Solve common issues, errors, and performance problems with OpenQuery.
## 📋 Table of Contents
1. [Common Errors](#common-errors)
2. [Performance Issues](#performance-issues)
3. [Debugging Strategies](#debugging-strategies)
4. [Getting Help](#getting-help)
## Common Errors
### ❌ "API Key is missing"
**Error Message**:
```
[Error] API Key is missing. Set OPENROUTER_API_KEY environment variable or run 'configure -i' to set it up.
```
**Cause**: No API key available from environment or config file.
**Solutions**:
1. **Set environment variable** (temporary):
```bash
export OPENROUTER_API_KEY="sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
```
2. **Configure interactively** (persistent):
```bash
openquery configure -i
# Follow prompts to enter API key
```
3. **Check config file**:
```bash
cat ~/.config/openquery/config
# Should contain: ApiKey=sk-or-...
```
4. **Verify environment**:
```bash
echo $OPENROUTER_API_KEY
# If empty, the variable was not exported, or was exported in a different shell
```
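The lookup steps above can be folded into one helper. A minimal sketch, assuming the tool's precedence is environment variable first, then config file (inputs are passed explicitly so the logic is easy to exercise; the real tool's lookup may differ):

```shell
# Hypothetical helper: report where the API key would come from.
key_source() {
  env_key="$1"
  config_file="$2"
  if [ -n "$env_key" ]; then
    echo "environment"
  elif grep -q '^ApiKey=' "$config_file" 2>/dev/null; then
    echo "config"
  else
    echo "none"
  fi
}

# Usage with your real environment and config path:
key_source "$OPENROUTER_API_KEY" ~/.config/openquery/config
```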
---
### ❌ "Network request failed"
**Error Message**:
```
[Error] Network request failed. Details: Name or service not known
```
**Cause**: Cannot reach OpenRouter or SearxNG API endpoints.
**Solutions**:
1. **Check internet connectivity**:
```bash
ping 8.8.8.8
curl https://openrouter.ai
```
2. **Verify SearxNG is running**:
```bash
curl "http://localhost:8002/search?q=test&format=json"
# Should return JSON
```
If connection refused:
```bash
# Start SearxNG if using Docker
docker start searxng
# Or run fresh
docker run -d --name searxng -p 8002:8080 searxng/searxng:latest
```
3. **Check firewall/proxy**:
```bash
# Test OpenRouter API
curl -H "Authorization: Bearer $OPENROUTER_API_KEY" \
https://openrouter.ai/api/v1/models
```
4. **Test from different network** (if behind restrictive firewall)
---
### ❌ "No search results found"
**Error Message**:
```
No search results found.
```
**Cause**: Search queries returned zero results from SearxNG.
**Solutions**:
1. **Test SearxNG manually**:
```bash
curl "http://localhost:8002/search?q=test&format=json" | jq '.results | length'
# Should be > 0
```
2. **Check SearxNG configuration**:
- If self-hosted: ensure internet access is enabled in `/etc/searxng/settings.yml`
- Some public instances disable certain engines or have rate limits
3. **Try a different SearxNG instance**:
```bash
export SEARXNG_URL="https://searx.example.com"
openquery "question"
```
4. **Use simpler queries**: Some queries may be too obscure or malformed
5. **Verbose mode to see queries**:
```bash
openquery -v "complex question"
# See what queries were generated
```
---
### ❌ "Found search results but could not extract readable content."
**Cause**: SearxNG returned results but `ArticleService` failed to extract content from all URLs.
**Common Reasons**:
- JavaScript-heavy sites (React, Vue apps) where content loaded dynamically
- Paywalled sites (NYT, academic journals)
- PDFs or non-HTML content
- Malformed HTML
- Server returned error (404, 403, 500)
- `robots.txt` blocked crawler
**Solutions**:
1. **Accept that some sites can't be scraped** - try a different query to get different results
2. **Use site:reddit.com or site:wikipedia.org** - these are usually scrape-friendly
3. **Increase `--results`** to get more URLs (some will work)
4. **Check verbose output**:
```bash
openquery -v "question"
# Look for "Warning: Failed to fetch article"
```
5. **Try a local SearxNG instance with more engines** - some engines fetch different sources
---
### ❌ Rate Limiting (429 Too Many Requests)
**Symptoms**:
```bash
[Error] Response status code does not indicate success: 429 (Too Many Requests).
```
Or retries being exhausted after Polly's retry attempts.
**Cause**: Too many concurrent requests to OpenRouter API.
**Solutions**:
1. **Reduce concurrency** (edit `SearchTool.cs`):
```csharp
var _options = new ParallelProcessingOptions
{
MaxConcurrentArticleFetches = 5, // reduce from 10
MaxConcurrentEmbeddingRequests = 2, // reduce from 4
EmbeddingBatchSize = 150 // reduce from 300
};
```
2. **Add delay** between embedding batches (custom implementation)
3. **Upgrade OpenRouter plan** to higher rate limits
4. **Wait and retry** - rate limits reset after time window
---
### ❌ Slow Performance
**Symptom**: Queries take 60+ seconds when they usually take 20s.
**Diagnosis Steps**:
1. **Run with verbose mode**:
```bash
openquery -v "question"
```
Watch which phase takes longest:
- Query generation?
- Searching?
- Fetching articles?
- Embeddings?
2. **Check network latency**:
```bash
time curl "https://openrouter.ai/api/v1/models"
time curl "http://localhost:8002/search?q=test&format=json"
```
**Common Causes & Fixes**:
| Phase | Cause | Fix |
|-------|-------|-----|
| Searches | SearxNG overloaded/slow | Check CPU/memory, restart container |
| Fetching | Target sites slow | Reduce `--results` to fewer URLs |
| Embeddings | API rate limited | Reduce concurrency (see above) |
| Answer | Heavy model/load | Switch to faster model (e.g., Qwen Flash) |
3. **Resource monitoring**:
```bash
htop # CPU/memory usage
iftop # network throughput
```
4. **Reduce parameters**:
```bash
openquery -q 2 -r 3 -c 2 "question" # lighter load
```
---
### ❌ Out of Memory
**Symptoms**:
- Process killed by OOM killer (Linux)
- `System.OutOfMemoryException`
- System becomes unresponsive
**Cause**: Processing too many large articles simultaneously.
**Why**: Each article can be 100KB+ of text, split into many chunks; each chunk's embedding is ~6KB (1536 floats × 4 bytes). 200 chunks ≈ 1.2MB of embeddings, plus ~100KB of text ≈ 1.3MB. Not huge on its own, but many large articles can produce thousands of chunks.
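The arithmetic above can be reproduced in the shell (the 1536-float dimension is assumed from the text; substitute whatever your embedding model actually returns):

```shell
# Per-chunk embedding size: 1536 floats x 4 bytes each.
bytes_per_chunk=$((1536 * 4))
echo "$bytes_per_chunk"            # → 6144

# 200 chunks of embeddings, in bytes (~1.2MB).
echo $((200 * bytes_per_chunk))    # → 1228800
```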
**Solutions**:
1. **Reduce `--results`** (fewer URLs per query):
```bash
openquery -r 3 "question" # instead of 10
```
2. **Reduce `--queries`** (fewer search queries):
```bash
openquery -q 2 "question"
```
3. **Concurrent fetches are already limited** to 10 by default, which is reasonable
4. **Check article size**: Some sites (PDFs, long documents) may yield megabytes of text; SmartReader should truncate but may not
---
### ❌ Invalid JSON from Query Generation
**Symptom**: Query generation fails silently, falls back to original question.
**Cause**: The LLM returned non-JSON output (even though instructed to). Possible reasons:
- Model not instruction-following
- Output exceeded context window
- API error in response
**Detection**: Run with `-v` to see:
```
[Failed to generate queries, falling back to original question. Error: ...]
```
**Solutions**:
- Try a different model (configure to use Gemini or DeepSeek)
- Reduce `--queries` count (simpler task)
- Tune system prompt (would require code change)
- Accept fallback - the original question often works as sole query
---
### ❌ Spinner Artifacts in Output
**Symptom**: When redirecting output to file, you see weird characters like `⠋`, `<60>`, etc.
**Cause**: Spinner uses Unicode Braille characters and ANSI escape codes.
**Fix**: Use `2>/dev/null | sed 's/.\x08//g'` to clean:
```bash
openquery "question" 2>/dev/null | sed 's/.\x08//g' > answer.md
```
Or run with `--verbose` (no spinner, only newline-separated messages):
```bash
openquery -v "question" > answer.txt
```
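The `sed` filter can be sanity-checked on a synthetic string containing a backspace-overwritten character (GNU sed; BSD sed does not interpret `\x08` in the pattern):

```shell
# 'a' followed by a backspace is stripped; the rest passes through.
printf 'a\x08bhello\n' | sed 's/.\x08//g'
# → bhello
```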
---
### ❌ "The type or namespace name '...' does not exist" (Build Error)
**Cause**: Missing NuGet package or wrong .NET SDK version.
**Solution**:
1. **Verify .NET SDK 10.0**:
```bash
dotnet --version
# Should be 10.x
```
If lower: https://dotnet.microsoft.com/download/dotnet/10.0
2. **Restore packages**:
```bash
dotnet restore
```
3. **Clean and rebuild**:
```bash
dotnet clean
dotnet build
```
4. **Check OpenQuery.csproj** for package references:
```xml
<PackageReference Include="Polly.Core" Version="8.6.6" />
<PackageReference Include="Polly.RateLimiting" Version="8.6.6" />
<PackageReference Include="SmartReader" Version="0.11.0" />
<PackageReference Include="System.CommandLine" Version="2.0.0-beta4.22272.1" />
<PackageReference Include="System.Numerics.Tensors" Version="9.0.0" />
```
If restore fails, these packages may not be available for .NET 10 preview. Consider:
- Downgrade to .NET 8.0 (if packages incompatible)
- Or find package versions compatible with .NET 10
---
### ❌ AOT Compilation Fails
**Error**: `error NETSDK1085: The current .NET SDK does not support targeting .NET 10.0.`
**Cause**: Using .NET SDK older than 10.0.
**Fix**: Install .NET SDK 10.0 preview.
**Or**: Disable AOT for development (edit `.csproj`):
```xml
<!-- Remove or set to false -->
<PublishAot>false</PublishAot>
```
---
## Performance Issues
### Slow First Request
**Expected**: The first query is slower (JIT compilation for the .NET runtime if not AOT, plus initial API connections).
If not using AOT:
- Consider publishing with `/p:PublishAot=true` for production distribution
- Development builds use JIT, which adds 500ms-2s warmup
**Mitigation**: Accept as warmup cost, or pre-warm with dummy query.
---
### High Memory Usage
**Check**:
```bash
ps aux | grep OpenQuery
# Look at RSS (resident set size)
```
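To get just the RSS value for a single PID instead of eyeballing the `ps aux` table, `ps -o rss=` prints only that column (shown here against the current shell's own PID):

```shell
# Print only the resident set size (in KB) of a given PID; $$ = this shell.
ps -o rss= -p $$
```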
**Typical**: 50-200MB (including .NET runtime, AOT code, data structures)
**If >500MB**:
- Likely processing very many articles
- Check `--results` and `--queries` values
- Use `--verbose` to see counts: `[Fetched X search results]`, `[Extracted Y chunks]`
**Reduce**:
- `--queries 2` instead of 10
- `--results 3` instead of 15
- These directly limit number of URLs to fetch
---
### High CPU Usage
**Cause**:
- SmartReader HTML parsing (CPU-bound)
- Cosine similarity calculations (many chunks, but usually fast)
- Spinner animation (negligible)
**Check**: `htop` → which core is at 100%? A single core suggests parsing; all cores suggest parallel fetching.
**Mitigation**:
- Ensure `MaxConcurrentArticleFetches` is not excessively high (the default of 10 is okay)
- Accept it - CPU spikes are normal during the fetch phase
---
### API Costs Higher Than Expected
**Symptom**: OpenRouter dashboard shows high token usage.
**Causes**:
1. Using expensive model (check `OPENROUTER_MODEL`)
2. High `--chunks` → more tokens in context
3. High `--queries` + `--results` → many articles → many embedding tokens (usually cheap)
4. Long answers (many completion tokens) - especially with `--long`
**Mitigation**:
- Use `qwen/qwen3.5-flash-02-23` (cheapest good option)
- Reduce `--chunks` to 2-3
- Use `--short` when detailed answer not needed
- Set `MaxTokens` on the request (this would require a code change)
---
## Debugging Strategies
### 1. Enable Verbose Mode
Always start with:
```bash
openquery -v "question" 2>&1 | tee debug.log
```
Logs everything:
- Generated queries
- URLs fetched
- Progress counts
- Errors/warnings
**Analyze log**:
- How many queries generated? (Should match `--queries`)
- How many search results per query? (Should be ≤ `--results`)
- How many articles fetched successfully?
- How many chunks extracted?
- Any warnings?
---
### 2. Isolate Components
**Test SearxNG**:
```bash
curl "http://localhost:8002/search?q=test&format=json" | jq '.results[0]'
```
**Test OpenRouter API**:
```bash
curl -X POST https://openrouter.ai/api/v1/chat/completions \
-H "Authorization: Bearer $OPENROUTER_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"qwen/qwen3.5-flash-02-23","messages":[{"role":"user","content":"Hello"}]}'
```
**Test Article Fetching** (with known good URL):
```bash
curl -L "https://example.com/article" | head -50
```
Then check if SmartReader can parse.
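As a rough local proxy (not SmartReader itself), you can check whether the fetched HTML contains paragraph tags at all; pages with none rarely yield readable text. The sample file and path here are illustrative:

```shell
# Count <p> tags in a saved page; zero suggests a JS-rendered or non-article page.
cat > /tmp/sample.html <<'EOF'
<html><body><p>First paragraph.</p><p>Second one.</p></body></html>
EOF
grep -o '<p>' /tmp/sample.html | wc -l
# → 2
```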
---
### 3. Reduce Scope
Test with minimal parameters to isolate failing phase:
```bash
# 1 query, 2 results, 1 chunk - should be fast and simple
openquery -q 1 -r 2 -c 1 "simple test question" -v
# If that works, gradually increase:
openquery -q 1 -r 5 -c 1 "simple question"
openquery -q 3 -r 5 -c 1 "simple question"
openquery -q 3 -r 5 -c 3 "simple question"
# Then try complex question
```
---
### 4. Check Resource Limits
**File descriptors**: If fetching many articles, may hit limit.
```bash
ulimit -n # usually 1024, should be fine
```
**Memory**: Monitor with `free -h` while running.
**Disk space**: Not much disk use, but logs could fill if verbose mode used repeatedly.
---
### 5. Examine Config File
```bash
cat ~/.config/openquery/config
# Ensure no spaces around '='
# Correct: ApiKey=sk-or-...
# Wrong: ApiKey = sk-or-... (spaces become part of value)
```
Reconfigure if needed:
```bash
openquery configure --key "sk-or-..."
```
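The no-spaces rule can also be checked mechanically. A hedged sketch (the pattern simply flags any whitespace adjacent to `=`; file path and entries are illustrative):

```shell
# Flag config lines with whitespace around '=' in a sample file.
cat > /tmp/config-sample <<'EOF'
ApiKey=sk-or-good
Model = bad-entry
EOF
grep -nE '[[:space:]]=|=[[:space:]]' /tmp/config-sample
# → 2:Model = bad-entry
```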
---
### 6. Clear Cache / Reset
No persistent cache exists, but:
- Re-start SearxNG container: `docker restart searxng`
- Clear DNS cache if network issues: `sudo systemd-resolve --flush-caches` (or `resolvectl flush-caches` on newer systemd)
---
## Getting Help
### Before Asking
Gather information:
1. **OpenQuery version** (commit or build date if available)
2. **OS and architecture**: `uname -a` (Linux/macOS) or `systeminfo` (Windows)
3. **Full command** you ran
4. **Verbose output**: `openquery -v "question" 2>&1 | tee log.txt`
5. **Config** (redact API key):
```bash
sed 's/ApiKey=.*/ApiKey=REDACTED/' ~/.config/openquery/config
```
6. **SearxNG test**:
```bash
curl -s "http://localhost:8002/search?q=test&format=json" | jq '.results | length'
```
7. **OpenRouter test**:
```bash
curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
https://openrouter.ai/api/v1/models | jq '.data[0].id'
```
---
### Where to Ask
1. **GitHub Issues** (if repository hosted there):
- Search existing issues first
- Provide all info from above
- Include log file (or link to gist)
2. **Community Forum** (if exists)
3. **Self-Diagnose**:
- Check `docs/troubleshooting.md` (this file)
- Check `docs/configuration.md`
- Check `docs/usage.md`
---
### Example Bug Report
```
Title: OpenQuery hangs on "Fetching article X/Y"
Platform: Ubuntu 22.04, .NET 10.0, OpenQuery built from commit abc123
Command: openquery -v "What is Docker?" 2>&1 | tee log.txt
Verbose output shows:
[...]
[Fetching article 1/15: docker.com]
[Fetching article 2/15: hub.docker.com]
[Fetching article 3/15: docs.docker.com]
# Hangs here indefinitely, no more progress
SearxNG test:
$ curl "http://localhost:8002/search?q=docker&format=json" | jq '.results | length'
15 # SearxNG works
Config:
ApiKey=sk-or-xxxx (redacted)
Model=qwen/qwen3.5-flash-02-23
DefaultQueries=3
DefaultChunks=3
DefaultResults=5
Observation:
- Fetches 3 articles fine, then stalls
- Nothing in log after "Fetching article 3/15"
- Process uses ~150MB memory, CPU 0% (idle)
- Ctrl+C exits immediately
Expected: Should fetch remaining 12 articles (concurrent up to 10)
Actual: Only 3 fetched, then silent hang
```
---
## Known Issues
### Issue: Spinner Characters Not Displaying
Some terminals don't support Braille Unicode patterns.
**Symptoms**: Spinner shows as `?` or boxes.
**Fix**: Use a font with Unicode support, or disable the spinner by setting `TERM=dumb` or by using `--verbose`.
---
### Issue: Progress Messages Overwritten
In very fast operations, progress updates may overlap.
**Cause**: `StatusReporter` uses `Console.Write` without lock in compact mode; concurrent writes from channel processor and spinner task could interleave.
**Mitigation**: Unlikely in practice (the channel serializes writes, and the spinner only updates when `_currentMessage` is set). If it becomes a problem, add a lock around Console operations.
---
### Issue: Articles with No Text Content
Some URLs return articles with empty `TextContent`.
**Cause**: SmartReader's quality heuristic (`IsReadable`) failed, or article truly has no text (image, script, error page).
**Effect**: Those URLs contribute zero chunks.
**Acceptable**: Part of normal operation; not all URLs yield readable content.
---
### Issue: Duplicate Sources in Answer
Same website may appear multiple times (different articles).
**Cause**: Different URLs from different search results may be from same domain but different pages.
**Effect**: `[Source 1]` and `[Source 3]` could both be `example.com`. Not necessarily bad - they're different articles.
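To see how often each domain repeats among your sources, a quick one-liner (the URL list is illustrative):

```shell
# Reduce source URLs to their domains and count occurrences.
printf '%s\n' \
  'https://example.com/a' \
  'https://example.com/b' \
  'https://other.org/c' \
| sed -E 's#https?://([^/]+)/.*#\1#' | sort | uniq -c
# two hits for example.com, one for other.org
```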
---
## Performance Tuning Reference
| Setting | Default | Fastest | Most Thorough | Notes |
|---------|---------|---------|---------------|-------|
| `--queries` | 3 | 1 | 8+ | More queries = more searches |
| `--results` | 5 | 2 | 15+ | Fewer = fewer articles to fetch |
| `--chunks` | 3 | 1 | 5+ | More chunks = more context tokens |
| `MaxConcurrentArticleFetches` | 10 | 5 | 20 | Higher = more parallel fetches |
| `MaxConcurrentEmbeddingRequests` | 4 | 2 | 8 | Higher = faster embeddings (may hit rate limits) |
| `EmbeddingBatchSize` | 300 | 100 | 1000 | Larger = fewer API calls, more data per call |
**Start**: Defaults are balanced.
**Adjust if**:
- Slow: Reduce `--results`, `--queries`, or concurrency limits
- Poor quality: Increase `--chunks`, `--results`, `--queries`
- Rate limited: Reduce concurrency limits
- High cost: Use `--short`, reduce `--chunks`, choose cheaper model
---
## Next Steps
- [Performance](../performance.md) - Detailed performance analysis
- [Configuration](../configuration.md) - Adjust settings
- [Usage](../usage.md) - Optimize workflow
---
**Quick Diagnostic Checklist**
```bash
# 1. Check API key
echo $OPENROUTER_API_KEY | head -c 10
# 2. Test SearxNG
curl -s "http://localhost:8002/search?q=test&format=json" | jq '.results | length'
# 3. Test OpenRouter
curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
https://openrouter.ai/api/v1/models | jq '.data[0].id'
# 4. Run verbose
openquery -v "test" 2>&1 | grep -E "Fetching|Generated|Found"
# 5. Check resource usage while running
htop
# 6. Reduce scope and retry
openquery -q 1 -r 2 -c 1 "simple test"
```