# Troubleshooting

Solve common issues, errors, and performance problems with OpenQuery.

## 📋 Table of Contents

1. [Common Errors](#common-errors)
2. [Performance Issues](#performance-issues)
3. [Debugging Strategies](#debugging-strategies)
4. [Getting Help](#getting-help)

## Common Errors

### ❌ "API Key is missing"

**Error Message**:

```
[Error] API Key is missing. Set OPENROUTER_API_KEY environment variable or run 'configure -i' to set it up.
```

**Cause**: No API key available from environment or config file.

**Solutions**:

1. **Set environment variable** (temporary):

   ```bash
   export OPENROUTER_API_KEY="sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
   ```

2. **Configure interactively** (persistent):

   ```bash
   openquery configure -i
   # Follow prompts to enter API key
   ```

3. **Check config file**:

   ```bash
   cat ~/.config/openquery/config
   # Should contain: ApiKey=sk-or-...
   ```

4. **Verify environment**:

   ```bash
   echo $OPENROUTER_API_KEY
   # If empty, you didn't export it, or exported it in the wrong shell
   ```

---

### ❌ "Network request failed"

**Error Message**:

```
[Error] Network request failed. Details: Name or service not known
```

**Cause**: Cannot reach the OpenRouter or SearxNG API endpoints.

**Solutions**:

1. **Check internet connectivity**:

   ```bash
   ping 8.8.8.8
   curl https://openrouter.ai
   ```

2. **Verify SearxNG is running**:

   ```bash
   curl "http://localhost:8002/search?q=test&format=json"
   # Should return JSON
   ```

   If the connection is refused:

   ```bash
   # Start SearxNG if using Docker
   docker start searxng

   # Or run a fresh container
   docker run -d --name searxng -p 8002:8080 searxng/searxng:latest
   ```

3. **Check firewall/proxy**:

   ```bash
   # Test the OpenRouter API
   curl -H "Authorization: Bearer $OPENROUTER_API_KEY" \
     https://openrouter.ai/api/v1/models
   ```

4. **Test from a different network** (if behind a restrictive firewall)

---

### ❌ "No search results found"

**Error Message**:

```
No search results found.
```

**Cause**: The generated search queries returned zero results from SearxNG.

**Solutions**:
1. **Test SearxNG manually**:

   ```bash
   curl "http://localhost:8002/search?q=test&format=json" | jq '.results | length'
   # Should be > 0
   ```

2. **Check SearxNG configuration**:
   - If self-hosted: ensure internet access is enabled in `/etc/searxng/settings.yml`
   - Some public instances disable certain engines or have rate limits

3. **Try a different SearxNG instance**:

   ```bash
   export SEARXNG_URL="https://searx.example.com"
   openquery "question"
   ```

4. **Use simpler queries**: Some queries may be too obscure or malformed.

5. **Run verbose mode to see the queries**:

   ```bash
   openquery -v "complex question"
   # See which queries were generated
   ```

---

### ❌ "Found search results but could not extract readable content."

**Cause**: SearxNG returned results, but `ArticleService` failed to extract content from all of the URLs.

**Common Reasons**:
- JavaScript-heavy sites (React, Vue apps) where content is loaded dynamically
- Paywalled sites (NYT, academic journals)
- PDFs or other non-HTML content
- Malformed HTML
- Server returned an error (404, 403, 500)
- `robots.txt` blocked the crawler

**Solutions**:

1. **Accept that some sites can't be scraped** - try a different query to get different results
2. **Use `site:reddit.com` or `site:wikipedia.org`** - these are usually scrape-friendly
3. **Increase `--results`** to get more URLs (some will work)
4. **Check verbose output**:

   ```bash
   openquery -v "question"
   # Look for "Warning: Failed to fetch article"
   ```

5. **Try a local SearxNG instance with more engines** - different engines surface different sources

---

### ❌ Rate Limiting (429 Too Many Requests)

**Symptoms**:

```bash
[Error] Response status code does not indicate success: 429 (Too Many Requests).
```

Or retries exhausted after Polly's retry attempts.

**Cause**: Too many concurrent requests to the OpenRouter API.

**Solutions**:
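One general mitigation for 429s is simply waiting and retrying with increasing delays; a minimal sketch (hypothetical `retry_backoff` wrapper, not part of OpenQuery) that retries a command with exponential backoff:

```bash
#!/bin/sh
# Retry a command with exponential backoff: 2s, 4s, 8s, ...
# Usage: retry_backoff <command> [args...]
retry_backoff() {
  max=5 n=1 delay=2
  while [ "$n" -le "$max" ]; do
    "$@" && return 0                    # success: stop retrying
    echo "attempt $n failed; sleeping ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2)) n=$((n + 1))
  done
  return 1                              # all attempts exhausted
}

# Usage: retry_backoff openquery "question"
```

This complements the numbered solutions below; it does not remove the rate limit, it just spreads requests out over the reset window.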
1. **Reduce concurrency** (edit `SearchTool.cs`):

   ```csharp
   var options = new ParallelProcessingOptions
   {
       MaxConcurrentArticleFetches = 5,     // reduce from 10
       MaxConcurrentEmbeddingRequests = 2,  // reduce from 4
       EmbeddingBatchSize = 150             // reduce from 300
   };
   ```

2. **Add a delay** between embedding batches (custom implementation)

3. **Upgrade your OpenRouter plan** to get higher rate limits

4. **Wait and retry** - rate limits reset after the time window passes

---

### ❌ Slow Performance

**Symptom**: Queries take 60+ seconds when they usually take ~20s.

**Diagnosis Steps**:

1. **Run with verbose mode**:

   ```bash
   openquery -v "question"
   ```

   Watch which phase takes longest:
   - Query generation?
   - Searching?
   - Fetching articles?
   - Embeddings?

2. **Check network latency**:

   ```bash
   time curl "https://openrouter.ai/api/v1/models"
   time curl "http://localhost:8002/search?q=test&format=json"
   ```

**Common Causes & Fixes**:

| Phase | Cause | Fix |
|-------|-------|-----|
| Searches | SearxNG overloaded/slow | Check CPU/memory, restart the container |
| Fetching | Target sites slow | Reduce `--results` to fetch fewer URLs |
| Embeddings | API rate limited | Reduce concurrency (see above) |
| Answer | Heavy model/load | Switch to a faster model (e.g., Qwen Flash) |

3. **Monitor resources**:

   ```bash
   htop   # CPU/memory usage
   iftop  # network throughput
   ```

4. **Reduce parameters**:

   ```bash
   openquery -q 2 -r 3 -c 2 "question"  # lighter load
   ```

---

### ❌ Out of Memory

**Symptoms**:
- Process killed by the OOM killer (Linux)
- `System.OutOfMemoryException`
- System becomes unresponsive

**Cause**: Processing too many large articles simultaneously.

**Why**: Each article can be 100KB+ of text, split into many chunks, and each chunk's embedding is ~6KB (1536 floats × 4 bytes). 200 chunks = ~1.2MB of embeddings, plus ~100KB of text ≈ 1.3MB. Not huge per article, but many large articles can create thousands of chunks.

**Solutions**:

1. **Reduce `--results`** (fewer URLs per query):

   ```bash
   openquery -r 3 "question"  # instead of 10
   ```
2. **Reduce `--queries`** (fewer search queries):

   ```bash
   openquery -q 2 "question"
   ```

3. **Concurrent fetches are already capped** at 10 by default, which is reasonable

4. **Check article size**: Some sites (PDFs, long documents) may yield megabytes of text; SmartReader should truncate, but may not

---

### ❌ Invalid JSON from Query Generation

**Symptom**: Query generation fails silently and falls back to the original question.

**Cause**: The LLM returned non-JSON output (even though it was instructed otherwise). Possible reasons:
- Model is not instruction-following
- Output exceeded the context window
- API error in the response

**Detection**: Run with `-v` to see:

```
[Failed to generate queries, falling back to original question. Error: ...]
```

**Solutions**:
- Try a different model (configure to use Gemini or DeepSeek)
- Reduce the `--queries` count (a simpler task)
- Tune the system prompt (would require a code change)
- Accept the fallback - the original question often works as the sole query

---

### ❌ Spinner Artifacts in Output

**Symptom**: When redirecting output to a file, you see stray characters like `⠋`, `�`, etc.

**Cause**: The spinner uses Unicode Braille characters and ANSI escape codes.

**Fix**: Pipe through `sed` to clean them:

```bash
openquery "question" 2>/dev/null | sed 's/.\x08//g' > answer.md
```

Or run with `--verbose` (no spinner, only newline-separated messages):

```bash
openquery -v "question" > answer.txt
```

---

### ❌ "The type or namespace name '...' does not exist" (Build Error)

**Cause**: A missing NuGet package or the wrong .NET SDK version.

**Solution**:

1. **Verify .NET SDK 10.0**:

   ```bash
   dotnet --version
   # Should be 10.x
   ```

   If lower: https://dotnet.microsoft.com/download/dotnet/10.0

2. **Restore packages**:

   ```bash
   dotnet restore
   ```

3. **Clean and rebuild**:

   ```bash
   dotnet clean
   dotnet build
   ```

4. **Check `OpenQuery.csproj`** for package references:

   ```xml
   ```

   If restore fails, these packages may not be available for the .NET 10 preview.
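A quick way to see which SDKs and runtimes are actually installed before deciding how to proceed (standard `dotnet` CLI commands; the fallback message is only for machines without the CLI):

```bash
# List every installed .NET SDK (version and install path),
# falling back to a notice when the dotnet CLI is not on PATH.
if command -v dotnet >/dev/null 2>&1; then
  dotnet --list-sdks
  dotnet --list-runtimes   # SDK and runtime versions can differ
else
  echo "dotnet CLI not found on PATH"
fi
```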
Consider:
- Downgrading to .NET 8.0 (if the packages are incompatible)
- Or finding package versions compatible with .NET 10

---

### ❌ AOT Compilation Fails

**Error**: `error NETSDK1085: The current .NET SDK does not support targeting .NET 10.0.`

**Cause**: Using a .NET SDK older than 10.0.

**Fix**: Install the .NET SDK 10.0 preview.

**Or**: Disable AOT for development (edit the `.csproj`):

```xml
<PublishAot>false</PublishAot>
```

---

## Performance Issues

### Slow First Request

**Expected**: The first query is slower (JIT compilation for the .NET runtime if not AOT, plus initial API connections).

If not using AOT:
- Consider publishing with `/p:PublishAot=true` for production distribution
- Development builds use JIT, which adds 500ms-2s of warmup

**Mitigation**: Accept it as a warmup cost, or pre-warm with a dummy query.

---

### High Memory Usage

**Check**:

```bash
ps aux | grep OpenQuery
# Look at RSS (resident set size)
```

**Typical**: 50-200MB (including the .NET runtime, AOT code, and data structures)

**If >500MB**:
- Likely processing very many articles
- Check the `--results` and `--queries` values
- Use `--verbose` to see the counts: `[Fetched X search results]`, `[Extracted Y chunks]`

**Reduce**:
- `--queries 2` instead of 10
- `--results 3` instead of 15
- These directly limit the number of URLs to fetch

---

### High CPU Usage

**Cause**:
- SmartReader HTML parsing (CPU-bound)
- Cosine similarity calculations (many chunks, but usually fast)
- Spinner animation (negligible)

**Check**: `htop` → which core is at 100%? If a single core, it's likely parsing. If all cores, it's the parallel fetch.

**Mitigation**:
- Ensure `MaxConcurrentArticleFetches` is not excessively high (the default of 10 is okay)
- Accept it - CPU spikes are normal during the fetch phase

---

### API Costs Higher Than Expected

**Symptom**: The OpenRouter dashboard shows high token usage.

**Causes**:
1. Using an expensive model (check `OPENROUTER_MODEL`)
2. High `--chunks` → more tokens in the context
3. High `--queries` + `--results` → many articles → many embedding tokens (usually cheap)
4. Long answers (many completion tokens) - especially with `--long`

**Mitigation**:
- Use `qwen/qwen3.5-flash-02-23` (the cheapest good option)
- Reduce `--chunks` to 2-3
- Use `--short` when a detailed answer isn't needed
- Set `MaxTokens` on the request (would need a code change)

---

## Debugging Strategies

### 1. Enable Verbose Mode

Always start with:

```bash
openquery -v "question" 2>&1 | tee debug.log
```

This logs everything:
- Generated queries
- URLs fetched
- Progress counts
- Errors/warnings

**Analyze the log**:
- How many queries were generated? (Should match `--queries`)
- How many search results per query? (Should be ≤ `--results`)
- How many articles were fetched successfully?
- How many chunks were extracted?
- Any warnings?

---

### 2. Isolate Components

**Test SearxNG**:

```bash
curl "http://localhost:8002/search?q=test&format=json" | jq '.results[0]'
```

**Test the OpenRouter API**:

```bash
curl -X POST https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen/qwen3.5-flash-02-23","messages":[{"role":"user","content":"Hello"}]}'
```

**Test article fetching** (with a known good URL):

```bash
curl -L "https://example.com/article" | head -50
```

Then check whether SmartReader can parse it.

---

### 3. Reduce Scope

Test with minimal parameters to isolate the failing phase:

```bash
# 1 query, 2 results, 1 chunk - should be fast and simple
openquery -q 1 -r 2 -c 1 "simple test question" -v

# If that works, gradually increase:
openquery -q 1 -r 5 -c 1 "simple question"
openquery -q 3 -r 5 -c 1 "simple question"
openquery -q 3 -r 5 -c 3 "simple question"

# Then try the complex question
```

---

### 4. Check Resource Limits

**File descriptors**: Fetching many articles may hit the limit.

```bash
ulimit -n
# usually 1024, which should be fine
```

**Memory**: Monitor with `free -h` while running.

**Disk space**: Not much disk is used, but logs could fill up if verbose mode is used repeatedly.
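The memory check above can be automated; a minimal sketch (hypothetical `rss_watch` helper, not part of OpenQuery, assuming a `ps` that supports `-o rss=` as on Linux and macOS) that samples a process's resident set size once per second:

```bash
#!/bin/sh
# Sample a process's RSS (in MB) once per second, for a fixed number
# of samples, stopping early if the process exits.
# Usage: rss_watch <pid> [samples]
rss_watch() {
  pid=$1
  n=${2:-10}
  while [ "$n" -gt 0 ]; do
    # ps exits non-zero once the process is gone
    rss_kb=$(ps -o rss= -p "$pid") || { echo "process $pid exited"; return 0; }
    echo "pid=$pid rss=$((rss_kb / 1024))MB"
    sleep 1
    n=$((n - 1))
  done
}

# Usage: rss_watch "$(pgrep -f OpenQuery | head -n 1)" 30
```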
### 5. Examine the Config File

```bash
cat ~/.config/openquery/config
# Ensure there are no spaces around '='
# Correct: ApiKey=sk-or-...
# Wrong:   ApiKey = sk-or-...  (the spaces become part of the value)
```

Reconfigure if needed:

```bash
openquery configure --key "sk-or-..."
```

---

### 6. Clear Cache / Reset

No persistent cache exists, but:
- Restart the SearxNG container: `docker restart searxng`
- Clear the DNS cache if you have network issues: `sudo systemd-resolve --flush-caches`

---

## Getting Help

### Before Asking

Gather this information:

1. **OpenQuery version** (commit or build date, if available)
2. **OS and architecture**: `uname -a` (Linux/macOS) or `systeminfo` (Windows)
3. **Full command** you ran
4. **Verbose output**: `openquery -v "question" 2>&1 | tee log.txt`
5. **Config** (redact the API key):

   ```bash
   sed 's/ApiKey=.*/ApiKey=REDACTED/' ~/.config/openquery/config
   ```

6. **SearxNG test**:

   ```bash
   curl -s "http://localhost:8002/search?q=test&format=json" | jq '.results | length'
   ```

7. **OpenRouter test**:

   ```bash
   curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
     https://openrouter.ai/api/v1/models | jq '.data[0].id'
   ```

---

### Where to Ask

1. **GitHub Issues** (if the repository is hosted there):
   - Search existing issues first
   - Provide all of the info above
   - Include the log file (or link to a gist)
2. **Community forum** (if one exists)
3. **Self-diagnose**:
   - Check `docs/troubleshooting.md` (this file)
   - Check `docs/configuration.md`
   - Check `docs/usage.md`

---

### Example Bug Report

```
Title: OpenQuery hangs on "Fetching article X/Y"

Platform: Ubuntu 22.04, .NET 10.0, OpenQuery built from commit abc123

Command: openquery -v "What is Docker?" 2>&1 | tee log.txt

Verbose output shows:
[...]
[Fetching article 1/15: docker.com]
[Fetching article 2/15: hub.docker.com]
[Fetching article 3/15: docs.docker.com]
# Hangs here indefinitely, no more progress

SearxNG test:
$ curl "http://localhost:8002/search?q=docker&format=json" | jq '.results | length'
15
# SearxNG works

Config:
ApiKey=sk-or-xxxx (redacted)
Model=qwen/qwen3.5-flash-02-23
DefaultQueries=3
DefaultChunks=3
DefaultResults=5

Observation:
- Fetches 3 articles fine, then stalls
- Nothing in the log after "Fetching article 3/15"
- Process uses ~150MB memory, CPU 0% (idle)
- Ctrl+C exits immediately

Expected: Should fetch the remaining 12 articles (up to 10 concurrently)
Actual: Only 3 fetched, then a silent hang
```

---

## Known Issues

### Issue: Spinner Characters Not Displaying

Some terminals don't support Braille Unicode patterns.

**Symptoms**: The spinner shows as `?` or boxes.

**Fix**: Use a font with Unicode support, disable the spinner by setting `TERM=dumb`, or use `--verbose`.

---

### Issue: Progress Messages Overwritten

In very fast operations, progress updates may overlap.

**Cause**: `StatusReporter` uses `Console.Write` without a lock in compact mode; concurrent writes from the channel processor and the spinner task could interleave.

**Mitigation**: Unlikely in practice (the channel serializes writes, and the spinner only updates when `_currentMessage` is set). If it becomes a problem, add a lock around the console operations.

---

### Issue: Articles with No Text Content

Some URLs return articles with an empty `TextContent`.

**Cause**: SmartReader's quality heuristic (`IsReadable`) failed, or the article truly has no text (an image, script, or error page).

**Effect**: Those URLs contribute zero chunks.

**Acceptable**: Part of normal operation; not all URLs yield readable content.

---

### Issue: Duplicate Sources in Answer

The same website may appear multiple times (as different articles).

**Cause**: Different URLs from different search results may come from the same domain but different pages.

**Effect**: `[Source 1]` and `[Source 3]` could both be `example.com`.
Not necessarily bad - they're different articles.

---

## Performance Tuning Reference

| Setting | Default | Fastest | Most Thorough | Notes |
|---------|---------|---------|---------------|-------|
| `--queries` | 3 | 1 | 8+ | More queries = more searches |
| `--results` | 5 | 2 | 15+ | More results = more articles to fetch |
| `--chunks` | 3 | 1 | 5+ | More chunks = more context tokens |
| `MaxConcurrentArticleFetches` | 10 | 5 | 20 | Higher = more parallel fetches |
| `MaxConcurrentEmbeddingRequests` | 4 | 2 | 8 | Higher = faster embeddings (may hit rate limits) |
| `EmbeddingBatchSize` | 300 | 100 | 1000 | Larger = fewer API calls, more data per call |

**Start**: The defaults are balanced.

**Adjust if**:
- Slow: Reduce `--results`, `--queries`, or the concurrency limits
- Poor quality: Increase `--chunks`, `--results`, `--queries`
- Rate limited: Reduce the concurrency limits
- High cost: Use `--short`, reduce `--chunks`, choose a cheaper model

---

## Next Steps

- [Performance](../performance.md) - Detailed performance analysis
- [Configuration](../configuration.md) - Adjust settings
- [Usage](../usage.md) - Optimize your workflow

---

**Quick Diagnostic Checklist**

```bash
# 1. Check the API key
echo $OPENROUTER_API_KEY | head -c 10

# 2. Test SearxNG
curl -s "http://localhost:8002/search?q=test&format=json" | jq '.results | length'

# 3. Test OpenRouter
curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  https://openrouter.ai/api/v1/models | jq '.data[0].id'

# 4. Run verbose
openquery -v "test" 2>&1 | grep -E "Fetching|Generated|Found"

# 5. Check resource usage while running
htop

# 6. Reduce scope and retry
openquery -q 1 -r 2 -c 1 "simple test"
```
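The checklist above can be bundled into a single helper; a sketch (hypothetical `oq_doctor` function, not part of OpenQuery; adjust the SearxNG URL if your instance differs):

```bash
#!/bin/sh
# Quick environment diagnostics for OpenQuery (hypothetical helper).
# Prints ok/FAIL per check; returns non-zero if any check failed.
oq_doctor() {
  fail=0

  # 1. API key present?
  if [ -n "$OPENROUTER_API_KEY" ]; then
    echo "ok: OPENROUTER_API_KEY is set"
  else
    echo "FAIL: OPENROUTER_API_KEY is empty"; fail=1
  fi

  # 2. SearxNG reachable? (assumes the default local instance on port 8002)
  if curl -sf --max-time 5 "http://localhost:8002/search?q=test&format=json" >/dev/null; then
    echo "ok: SearxNG responded"
  else
    echo "FAIL: SearxNG not reachable on localhost:8002"; fail=1
  fi

  # 3. OpenRouter reachable with this key?
  if curl -sf --max-time 5 -H "Authorization: Bearer $OPENROUTER_API_KEY" \
      https://openrouter.ai/api/v1/models >/dev/null; then
    echo "ok: OpenRouter responded"
  else
    echo "FAIL: OpenRouter not reachable (network, proxy, or bad key)"; fail=1
  fi

  return "$fail"
}

# Usage: oq_doctor; echo "exit status: $?"
```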