OpenQuery Documentation 65ca2401ae docs: add comprehensive documentation with README and detailed guides

- Add user-friendly README.md with quick start guide
- Create docs/ folder with structured technical documentation:
  - installation.md: Build and setup instructions
  - configuration.md: Complete config reference
  - usage.md: CLI usage guide with examples
  - architecture.md: System design and patterns
  - components/: Deep dive into each component (OpenQueryApp, SearchTool, Services, Models)
  - api/: CLI reference, environment variables, programmatic API
  - troubleshooting.md: Common issues and solutions
  - performance.md: Latency, throughput, and optimization
- All documentation fully cross-referenced with internal links
- Covers project overview, architecture, components, APIs, and support

See individual files for complete documentation.

2026-03-19 10:01:58 +01:00

Troubleshooting

Solve common issues, errors, and performance problems with OpenQuery.

📋 Table of Contents

Common Errors
Performance Issues
Debugging Strategies
Getting Help

Common Errors

❌ "API Key is missing"

Error Message:

[Error] API Key is missing. Set OPENROUTER_API_KEY environment variable or run 'configure -i' to set it up.

Cause: No API key available from environment or config file.

Solutions:

Set environment variable (temporary):

export OPENROUTER_API_KEY="sk-or-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

Configure interactively (persistent):

openquery configure -i
# Follow prompts to enter API key

Check config file:

cat ~/.config/openquery/config
# Should contain: ApiKey=sk-or-...

Verify environment:

echo $OPENROUTER_API_KEY
# If empty, you didn't export or exported in wrong shell
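
The checks above can be combined into one small helper. This is a minimal sketch, not OpenQuery's own lookup logic: the function name `api_key_source` is hypothetical, and it assumes the environment variable is consulted before the config file, which the error message suggests but does not confirm.

```shell
# Sketch: report where an API key would be picked up from.
# Assumption: the environment variable takes precedence over the config file.
api_key_source() {
  config=${1:-$HOME/.config/openquery/config}
  if [ -n "${OPENROUTER_API_KEY:-}" ]; then
    echo "env"
  elif [ -f "$config" ] && grep -q '^ApiKey=.' "$config"; then
    # A non-empty ApiKey= line exists in the config file
    echo "config"
  else
    echo "missing"
    return 1
  fi
}
```

Running `api_key_source` with no arguments checks the default config path; pass a different path as the first argument to inspect another file.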

❌ "Network request failed"

Error Message:

[Error] Network request failed. Details: Name or service not known

Cause: Cannot reach OpenRouter or SearxNG API endpoints.

Solutions:

Check internet connectivity:

ping -c 4 8.8.8.8
curl https://openrouter.ai

Verify SearxNG is running:

curl "http://localhost:8002/search?q=test&format=json"
# Should return JSON

If connection refused:

# Start SearxNG if using Docker
docker start searxng
# Or run fresh
docker run -d --name searxng -p 8002:8080 searxng/searxng:latest

Check firewall/proxy:

# Test OpenRouter API
curl -H "Authorization: Bearer $OPENROUTER_API_KEY" \
     https://openrouter.ai/api/v1/models

Test from different network (if behind restrictive firewall)
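
To tell DNS failures (such as "Name or service not known") apart from refused connections or timeouts, curl's exit code can be mapped to a short diagnosis. The sketch below uses curl's documented exit codes; the helper names `explain_curl_exit` and `probe` are hypothetical and not part of OpenQuery.

```shell
# Sketch: translate a curl exit code into a likely cause.
explain_curl_exit() {
  case "$1" in
    0)     echo "ok: endpoint reachable" ;;
    5)     echo "proxy: could not resolve proxy host" ;;
    6)     echo "dns: could not resolve host" ;;
    7)     echo "connect: connection refused or host unreachable" ;;
    28)    echo "timeout: operation timed out" ;;
    35|60) echo "tls: TLS handshake or certificate problem" ;;
    *)     echo "other: curl exit code $1 (see EXIT CODES in 'man curl')" ;;
  esac
}

# Probe a URL with a 10 second limit and explain the outcome.
probe() {
  curl -sS -o /dev/null --max-time 10 "$1" 2>/dev/null
  explain_curl_exit "$?"
}
```

For example, `probe https://openrouter.ai` and `probe "http://localhost:8002/search?q=test&format=json"` show which of the two endpoints is unreachable and why.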

❌ "No search results found"

Error Message:

No search results found.

Cause: Search queries returned zero results from SearxNG.

Solutions:

Test SearxNG manually:

curl "http://localhost:8002/search?q=test&format=json" | jq '.results | length'
# Should be > 0

Check SearxNG configuration:

If self-hosted: ensure internet access is enabled in /etc/searxng/settings.yml
Some public instances disable certain engines or have rate limits

Try a different SearxNG instance:

export SEARXNG_URL="https://searx.example.com"
openquery "question"

Use simpler queries: Some queries may be too obscure or malformed
Use verbose mode to see the generated queries:

openquery -v "complex question"
# See what queries were generated

❌ "Found search results but could not extract readable content."

Cause: SearxNG returned results but ArticleService failed to extract content from all URLs.

Common Reasons:

JavaScript-heavy sites (React, Vue apps) where content is loaded dynamically
Paywalled sites (NYT, academic journals)
PDFs or non-HTML content
Malformed HTML
Server returned error (404, 403, 500)
robots.txt blocked crawler

Solutions:

Accept that some sites can't be scraped - try different query to get different results
Use site:reddit.com or site:wikipedia.org - these are usually scrape-friendly
Increase --results to get more URLs (some will work)
Check verbose output:

openquery -v "question"
# Look for "Warning: Failed to fetch article"

Try a local SearxNG instance with more engines - some engines fetch different sources

❌ Rate Limiting (429 Too Many Requests)

Symptoms:

[Error] Response status code does not indicate success: 429 (Too Many Requests).

Or the request fails after Polly's retry attempts are exhausted.

Cause: Too many concurrent requests to OpenRouter API.

Solutions:

Reduce concurrency (edit SearchTool.cs):

var _options = new ParallelProcessingOptions
{
    MaxConcurrentArticleFetches = 5,  // reduce from 10
    MaxConcurrentEmbeddingRequests = 2,  // reduce from 4
    EmbeddingBatchSize = 150  // reduce from 300
};

Add delay between embedding batches (custom implementation)
Upgrade OpenRouter plan to higher rate limits
Wait and retry - rate limits reset after time window
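
The "wait and retry" advice can be scripted around the CLI without touching the C# code. The wrapper below is a sketch with a hypothetical name, `retry_with_backoff`; it retries on any non-zero exit status, and it assumes that openquery exits non-zero when it hits a 429, which the source does not state.

```shell
# Sketch: run a command, doubling the wait after each failed attempt.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For example, `retry_with_backoff 4 15 openquery "question"` waits 15, 30, and then 45... no fixed schedule is implied by OpenRouter here; choose delays that match your plan's rate-limit window.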

❌ Slow Performance

Symptom: Queries take 60+ seconds when they usually take 20s.

Diagnosis Steps:

Run with verbose mode:

openquery -v "question"

Watch which phase takes longest:

Query generation?
Searching?
Fetching articles?
Embeddings?
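
One way to see which phase dominates is to prefix each verbose line with the elapsed time. This is a minimal sketch: `stamp_lines` is a hypothetical helper, it has one-second resolution, and it assumes verbose messages arrive line by line as each phase progresses.

```shell
# Sketch: prefix every line read from stdin with seconds elapsed since start.
stamp_lines() {
  start=$(date +%s)
  while IFS= read -r line; do
    now=$(date +%s)
    printf '[+%3ds] %s\n' "$((now - start))" "$line"
  done
}
```

Pipe the verbose output through it, for example `openquery -v "question" 2>&1 | stamp_lines`, and look for the largest jump between consecutive timestamps.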

Check network latency:

time curl "https://openrouter.ai/api/v1/models"
time curl "http://localhost:8002/search?q=test&format=json"

Common Causes & Fixes:

| Phase | Cause | Fix |
|-------|-------|-----|
| Searches | SearxNG overloaded/slow | Check CPU/memory, restart container |
| Fetching | Target sites slow | Reduce `--results` to fewer URLs |
| Embeddings | API rate limited | Reduce concurrency (see above) |
| Answer | Heavy model/load | Switch to faster model (e.g., Qwen Flash) |

Resource monitoring:

htop  # CPU/memory usage
iftop  # network throughput

Reduce parameters:

openquery -q 2 -r 3 -c 2 "question"  # lighter load

❌ Out of Memory

Symptoms:

Process killed by OOM killer (Linux)
System.OutOfMemoryException
System becomes unresponsive

Cause: Processing too many large articles simultaneously.

Why: Each article can be 100KB+ of text and is split into many chunks. Each chunk's embedding takes about 6KB (1536 floats × 4 bytes), so 200 chunks need about 1.2MB of embeddings; with ~100KB of text that is roughly 1.3MB in total. That is not much on its own, but many large articles can produce thousands of chunks.
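
The arithmetic behind that estimate is easy to rerun for other chunk counts. The helper below is a sketch with a hypothetical name; it assumes 4-byte floats and a default of 1536 dimensions, matching the figures quoted above, and it counts only the raw vectors, not runtime overhead.

```shell
# Sketch: raw embedding memory in bytes = chunks x dimensions x 4 bytes per float.
embedding_bytes() {
  chunks=$1
  dims=${2:-1536}
  echo $((chunks * dims * 4))
}

embedding_bytes 200    # prints 1228800  (about 1.2MB)
embedding_bytes 5000   # prints 30720000 (about 30MB)
```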

Solutions:

Reduce --results (fewer URLs per query):

openquery -r 3 "question"  # instead of 10

Reduce --queries (fewer search queries):

openquery -q 2 "question"

Fetches already limited to 10 concurrent by default, which is reasonable
Check article size: Some sites (PDFs, long documents) may yield megabytes of text; SmartReader should truncate but may not

❌ Invalid JSON from Query Generation

Symptom: Query generation fails silently, falls back to original question.

Cause: LLM returned non-JSON (even though instructed). Could be:

Model not instruction-following
Output exceeded context window
API error in response

Detection: Run with -v to see:

[Failed to generate queries, falling back to original question. Error: ...]

Solutions:

Try a different model (configure to use Gemini or DeepSeek)
Reduce --queries count (simpler task)
Tune system prompt (would require code change)
Accept fallback - the original question often works as sole query

❌ Spinner Artifacts in Output

Symptom: When redirecting output to file, you see weird characters like ⠋, <EFBFBD>, etc.

Cause: Spinner uses Unicode Braille characters and ANSI escape codes.

Fix: Use 2>/dev/null | sed 's/.\x08//g' to clean:

openquery "question" 2>/dev/null | sed 's/.\x08//g' > answer.md

Or run with --verbose (no spinner, only newline-separated messages):

openquery -v "question" > answer.txt
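
If the one-line sed fix leaves residue, a slightly broader filter can strip ANSI escape sequences, Braille spinner glyphs, backspaces, and carriage returns in one pass. This is a sketch that assumes GNU sed (for the `\xHH` escapes, as in the command above); `strip_spinner` is a hypothetical helper name, and it would also remove any Braille characters that appear in the answer itself.

```shell
# Sketch: remove spinner residue from text read on stdin (GNU sed assumed).
strip_spinner() {
  # 1) ANSI CSI sequences such as ESC[0m or ESC[2K
  # 2) UTF-8 bytes of Braille patterns U+2800..U+28FF (E2 A0..A3 80..BF)
  # 3) leftover backspaces and carriage returns
  LC_ALL=C sed -e 's/\x1b\[[0-9;?]*[A-Za-z]//g' \
               -e 's/\xe2[\xa0-\xa3][\x80-\xbf]//g' \
               -e 's/[\x08\x0d]//g'
}
```

For example: `openquery "question" 2>/dev/null | strip_spinner > answer.md`.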

❌ "The type or namespace name '...' does not exist" (Build Error)

Cause: Missing NuGet package or wrong .NET SDK version.

Solution:

Verify .NET SDK 10.0:

dotnet --version
# Should be 10.x

If lower: https://dotnet.microsoft.com/download/dotnet/10.0
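
The version check can be automated in a build script. The sketch below only compares the major version; `sdk_major` and `require_sdk` are hypothetical helper names, and the second argument exists so the check can be tried against an arbitrary version string.

```shell
# Sketch: extract the major version from a string like 10.0.100-preview.1
sdk_major() {
  echo "${1%%.*}"
}

# Usage: require_sdk REQUIRED_MAJOR [VERSION_STRING]
# Without VERSION_STRING, the output of `dotnet --version` is used.
require_sdk() {
  version=${2:-$(dotnet --version 2>/dev/null)}
  major=$(sdk_major "$version")
  case "$major" in
    ''|*[!0-9]*) echo "could not determine SDK version"; return 2 ;;
  esac
  if [ "$major" -ge "$1" ]; then
    echo "ok: SDK $version"
  else
    echo "too old: SDK $version, need $1.x"
    return 1
  fi
}
```

Calling `require_sdk 10` before `dotnet build` stops early with a clear message instead of a namespace error.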

Restore packages:

dotnet restore

Clean and rebuild:

dotnet clean
dotnet build

Check OpenQuery.csproj for package references:

<PackageReference Include="Polly.Core" Version="8.6.6" />
<PackageReference Include="Polly.RateLimiting" Version="8.6.6" />
<PackageReference Include="SmartReader" Version="0.11.0" />
<PackageReference Include="System.CommandLine" Version="2.0.0-beta4.22272.1" />
<PackageReference Include="System.Numerics.Tensors" Version="9.0.0" />

If restore fails, these packages may not be available for .NET 10 preview. Consider:

Downgrade to .NET 8.0 (if packages incompatible)
Or find package versions compatible with .NET 10

❌ AOT Compilation Fails

Error: error NETSDK1045: The current .NET SDK does not support targeting .NET 10.0.

Cause: Using .NET SDK older than 10.0.

Fix: Install .NET SDK 10.0 preview.

Or: Disable AOT for development (edit .csproj):

<!-- Remove or set to false -->
<PublishAot>false</PublishAot>

Performance Issues

Slow First Request

Expected: The first query is slower because of JIT compilation (when not built with AOT) and the initial API connections.

If not using AOT:

Consider publishing with /p:PublishAot=true for production distribution
Development builds use JIT, which adds 500ms-2s warmup

Mitigation: Accept as warmup cost, or pre-warm with dummy query.

High Memory Usage

Check:

ps aux | grep -i '[o]penquery'
# Look at RSS (resident set size)

Typical: 50-200MB (including .NET runtime, AOT code, data structures)

If >500MB:

Likely processing very many articles
Check --results and --queries values
Use --verbose to see counts: [Fetched X search results], [Extracted Y chunks]

Reduce:

--queries 2 instead of 10
--results 3 instead of 15
These directly limit number of URLs to fetch
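
The 500MB rule of thumb above can be checked from a script. This is a sketch with hypothetical helper names: `rss_kb` relies on `ps -o rss=` reporting kilobytes (as on Linux and macOS), and the process name passed to `pgrep` in the usage note is an assumption about how the binary is named on your system.

```shell
# Sketch: resident set size of a process in kilobytes.
rss_kb() {
  ps -o rss= -p "$1" | tr -d ' '
}

# Usage: classify_rss RSS_IN_KB [LIMIT_IN_MB]   (limit defaults to 500)
classify_rss() {
  mb=$(( $1 / 1024 ))
  if [ "$mb" -gt "${2:-500}" ]; then
    echo "high: ${mb}MB"
  else
    echo "ok: ${mb}MB"
  fi
}
```

While a query is running in another terminal: `classify_rss "$(rss_kb "$(pgrep -n openquery)")"`.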

High CPU Usage

Cause:

SmartReader HTML parsing (CPU-bound)
Cosine similarity calculations (many chunks, but usually fast)
Spinner animation (negligible)

Check: Run htop and see which cores are at 100%. A single busy core suggests HTML parsing; all cores busy suggests parallel fetching.

Mitigation:

Ensure MaxConcurrentArticleFetches not excessively high (default 10 is okay)
Accept - CPU spikes normal during fetch phase

API Costs Higher Than Expected

Symptom: OpenRouter dashboard shows high token usage.

Causes:

Using expensive model (check OPENROUTER_MODEL)
High --chunks → more tokens in context
High --queries + --results → many articles → many embedding tokens (usually cheap)
Long answers (many completion tokens) - especially with --long

Mitigation:

Use qwen/qwen3.5-flash-02-23 (cheapest good option)
Reduce --chunks to 2-3
Use --short when detailed answer not needed
Set MaxTokens in request (would need code change or LLM capabilities)

Debugging Strategies

1. Enable Verbose Mode

Always start with:

openquery -v "question" 2>&1 | tee debug.log

Logs everything:

Generated queries
URLs fetched
Progress counts
Errors/warnings

Analyze log:

How many queries generated? (Should match --queries)
How many search results per query? (Should be ≤ --results)
How many articles fetched successfully?
How many chunks extracted?
Any warnings?

2. Isolate Components

Test SearxNG:

curl "http://localhost:8002/search?q=test&format=json" | jq '.results[0]'

Test OpenRouter API:

curl -X POST https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen/qwen3.5-flash-02-23","messages":[{"role":"user","content":"Hello"}]}'

Test Article Fetching (with known good URL):

curl -L "https://example.com/article" | head -50

Then check if SmartReader can parse.

3. Reduce Scope

Test with minimal parameters to isolate failing phase:

# 1 query, 2 results, 1 chunk - should be fast and simple
openquery -q 1 -r 2 -c 1 "simple test question" -v

# If that works, gradually increase:
openquery -q 1 -r 5 -c 1 "simple question"
openquery -q 3 -r 5 -c 1 "simple question"
openquery -q 3 -r 5 -c 3 "simple question"

# Then try complex question
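
The escalation above can be run as a loop that stops at the first failing combination. This is a sketch: `escalate_scope` and the `OPENQUERY_BIN` override are hypothetical, and it assumes openquery signals failure through a non-zero exit status.

```shell
# Sketch: run the same question with growing -q/-r/-c and report the first failure.
escalate_scope() {
  question=$1
  bin=${OPENQUERY_BIN:-openquery}
  while read -r q r c; do
    # </dev/null keeps the command from consuming the step list on stdin
    if ! "$bin" -q "$q" -r "$r" -c "$c" "$question" >/dev/null 2>&1 </dev/null; then
      echo "first failure at: -q $q -r $r -c $c"
      return 1
    fi
  done <<EOF
1 2 1
1 5 1
3 5 1
3 5 3
EOF
  echo "all steps passed"
}
```

Run `escalate_scope "simple question"`, then rerun the reported combination by hand with `-v` to see what went wrong.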

4. Check Resource Limits

File descriptors: If fetching many articles, may hit limit.

ulimit -n  # usually 1024, should be fine

Memory: Monitor with free -h while running.

Disk space: Not much disk use, but logs could fill if verbose mode used repeatedly.

5. Examine Config File

cat ~/.config/openquery/config
# Ensure no spaces around '='
# Correct: ApiKey=sk-or-...
# Wrong: ApiKey = sk-or-...  (spaces become part of value)
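
Whitespace around `=` is easy to miss by eye, so a small lint can flag it. The sketch below uses a hypothetical helper, `lint_config`, and assumes the plain `Key=Value` line format shown above.

```shell
# Sketch: list config lines that have whitespace directly before or after the first '='.
lint_config() {
  file=${1:-$HOME/.config/openquery/config}
  if grep -nE '^[^=]*[[:space:]]=|^[^=]*=[[:space:]]' "$file"; then
    echo "fix: remove whitespace around '=' on the lines above"
    return 1
  fi
  echo "ok: no whitespace around '='"
}
```

Offending lines are printed with their line numbers. Note that the output includes the full line, so avoid pasting it into a bug report if the `ApiKey` line is among them.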

Reconfigure if needed:

openquery configure --key "sk-or-..."

6. Clear Cache / Reset

No persistent cache exists, but:

Re-start SearxNG container: docker restart searxng
Clear DNS cache if network issues: sudo resolvectl flush-caches (on older systemd releases: sudo systemd-resolve --flush-caches)

Getting Help

Before Asking

Gather information:

OpenQuery version (commit or build date if available)
OS and architecture: uname -a (Linux/macOS) or systeminfo (Windows)
Full command you ran
Verbose output: openquery -v "question" 2>&1 | tee log.txt
Config (redact API key):

sed 's/ApiKey=.*/ApiKey=REDACTED/' ~/.config/openquery/config

SearxNG test:

curl -s "http://localhost:8002/search?q=test&format=json" | jq '.results | length'

OpenRouter test:

curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
     https://openrouter.ai/api/v1/models | jq '.data[0].id'

Where to Ask

GitHub Issues (if repository hosted there):
- Search existing issues first
- Provide all info from above
- Include log file (or link to gist)
Community Forum (if exists)
Self-Diagnose:
- Check docs/troubleshooting.md (this file)
- Check docs/configuration.md
- Check docs/usage.md

Example Bug Report

Title: OpenQuery hangs on "Fetching article X/Y"

Platform: Ubuntu 22.04, .NET 10.0, OpenQuery built from commit abc123
Command: openquery -v "What is Docker?" 2>&1 | tee log.txt

Verbose output shows:
[...]
[Fetching article 1/15: docker.com]
[Fetching article 2/15: hub.docker.com]
[Fetching article 3/15: docs.docker.com]
# Hangs here indefinitely, no more progress

SearxNG test:
$ curl "http://localhost:8002/search?q=docker&format=json" | jq '.results | length'
15  # SearxNG works

Config:
ApiKey=sk-or-xxxx (redacted)
Model=qwen/qwen3.5-flash-02-23
DefaultQueries=3
DefaultChunks=3
DefaultResults=5

Observation:
- Fetches 3 articles fine, then stalls
- Nothing in log after "Fetching article 3/15"
- Process uses ~150MB memory, CPU 0% (idle)
- Ctrl+C exits immediately

Expected: Should fetch remaining 12 articles (concurrent up to 10)
Actual: Only 3 fetched, then silent hang

Known Issues

Issue: Spinner Characters Not Displaying

Some terminals don't support Braille Unicode patterns.

Symptoms: Spinner shows as ? or boxes.

Fix: Use a font with Unicode support, or disable the spinner by setting TERM=dumb or using --verbose.

Issue: Progress Messages Overwritten

In very fast operations, progress updates may overlap.

Cause: StatusReporter uses Console.Write without lock in compact mode; concurrent writes from channel processor and spinner task could interleave.

Mitigation: Unlikely in practice (channel serializes, spinner only updates when _currentMessage set). If problematic, add lock around Console operations.

Issue: Articles with No Text Content

Some URLs return articles with empty TextContent.

Cause: SmartReader's quality heuristic (IsReadable) failed, or article truly has no text (image, script, error page).

Effect: Those URLs contribute zero chunks.

Acceptable: Part of normal operation; not all URLs yield readable content.

Issue: Duplicate Sources in Answer

Same website may appear multiple times (different articles).

Cause: Different URLs from different search results may be from same domain but different pages.

Effect: [Source 1] and [Source 3] could both be example.com. Not necessarily bad - they're different articles.

Performance Tuning Reference

| Setting | Default | Fastest | Most Thorough | Notes |
|---------|---------|---------|---------------|-------|
| `--queries` | 3 | 1 | 8+ | More queries = more searches |
| `--results` | 5 | 2 | 15+ | Fewer = fewer articles to fetch |
| `--chunks` | 3 | 1 | 5+ | More chunks = more context tokens |
| `MaxConcurrentArticleFetches` | 10 | 5 | 20 | Higher = more parallel fetches |
| `MaxConcurrentEmbeddingRequests` | 4 | 2 | 8 | Higher = faster embeddings (may hit rate limits) |
| `EmbeddingBatchSize` | 300 | 100 | 1000 | Larger = fewer API calls, more data per call |

Start: Defaults are balanced.

Adjust if:

Slow: Reduce --results, --queries, or concurrency limits
Poor quality: Increase --chunks, --results, --queries
Rate limited: Reduce concurrency limits
High cost: Use --short, reduce --chunks, choose cheaper model
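
The CLI columns of the table can be wrapped as named presets. This is a sketch: `preset_flags` is a hypothetical helper, the values are the lower bounds from the table, and the concurrency settings are left out because they are changed in code rather than on the command line.

```shell
# Sketch: print the -q/-r/-c flags for a named preset from the tuning table.
preset_flags() {
  case "$1" in
    fastest)  echo "-q 1 -r 2 -c 1" ;;
    default)  echo "-q 3 -r 5 -c 3" ;;
    thorough) echo "-q 8 -r 15 -c 5" ;;
    *) echo "unknown preset: $1" >&2; return 1 ;;
  esac
}
```

Leave the substitution unquoted so the flags split into separate arguments: `openquery $(preset_flags fastest) "question"`.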

Next Steps

Performance - Detailed performance analysis
Configuration - Adjust settings
Usage - Optimize workflow

Quick Diagnostic Checklist

# 1. Check API key
echo "$OPENROUTER_API_KEY" | head -c 10

# 2. Test SearxNG
curl -s "http://localhost:8002/search?q=test&format=json" | jq '.results | length'

# 3. Test OpenRouter
curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
     https://openrouter.ai/api/v1/models | jq '.data[0].id'

# 4. Run verbose
openquery -v "test" 2>&1 | grep -E "Fetching|Generated|Found"

# 5. Check resource usage while running
htop

# 6. Reduce scope and retry
openquery -q 1 -r 2 -c 1 "simple test"
