docs: Add a detailed plan for multi-provider support, including pricing and token tracking strategies.

Commit 4476cc7f15 (parent f2c3e5032d), 2026-03-05 12:41:55 +01:00

PROVIDERS.md (new file, 293 lines)
# Provider Support Plan
## Current Problems
1. **OpenRouter Hardcoded**: Endpoint, headers, and pricing API calls are hardcoded to OpenRouter
2. **Config Ineffective**: SetupTui allows "custom endpoint" but Program.cs ignores it
3. **Token Tracking**: Token usage tracking only works with OpenRouter response headers
4. **Pricing Only for One Provider**: Models list shows pricing, but only when using OpenRouter
---
## Goals
1. Make the system **endpoint-agnostic**
2. Support pricing/token tracking for **multiple providers**
3. Keep **OpenRouter as the default** (familiar)
4. Allow users to configure any OpenAI-compatible endpoint
5. Show pricing/token info **only when available** for each provider
---
## Provider Categories
### Tier 1: Native Support (Built-in)
- OpenRouter (default)
- Ollama (local, no auth)
- Groq (high-speed inference)
- Anthropic (native or via API)
- OpenAI (official API)
### Tier 2: Config-Based Support
- Cerebras
- DeepSeek
- Any OpenAI-compatible endpoint that supports custom headers
### Tier 3: Manual Configuration Required
- Self-hosted endpoints
- Corporate proxies
- Custom middleware layers
---
```csharp
// Example: provider interface
interface IPricingProvider
{
    // Get pricing info from the provider's API
    Task<List<ModelPricing>> GetModelsAsync(string apiKey);

    // Extract token usage from the HTTP response
    Task<TokenUsage> GetTokensFromResponseAsync(HttpResponseMessage response);

    // Add provider-specific headers if needed
    void AddHeaders(HttpRequestMessage request, string apiKey);
}
```
**Supported Implementations:**
- `OpenRouterProvider` (uses `/api/v1/models` + `x-total-tokens`)
- `GroqProvider` (uses Groq's pricing API + response headers)
- `OllamaProvider` (free tier, no pricing lookup, basic token counting)
- `OpenAIProvider` (uses OpenAI's model list + token counting)
- `GenericProvider` (fallback for any OpenAI-compatible endpoint)
**Configuration:**
Store provider selection in `anchor.config.json`:
```json
{
  "apiKey": "your-key",
  "model": "qwen3.5-27b",
  "endpoint": "https://openrouter.ai/api/v1",
  "provider": "openrouter"
}
```
Auto-detect provider from endpoint URL if not specified.
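The auto-detection step could be as simple as matching known endpoint hosts and falling back to a generic provider. A minimal sketch — the `ProviderDetector` name and the exact host list are illustrative assumptions, not existing code:

```csharp
using System;

static class ProviderDetector
{
    // Map known endpoint hosts to provider names; anything
    // unrecognized falls back to the "generic" provider.
    public static string DetectProvider(string endpoint)
    {
        var host = new Uri(endpoint).Host.ToLowerInvariant();
        if (host.Contains("openrouter.ai")) return "openrouter";
        if (host.Contains("api.groq.com")) return "groq";
        if (host.Contains("api.openai.com")) return "openai";
        if (host == "localhost" || host == "127.0.0.1") return "ollama";
        return "generic";
    }
}
```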
---
## Pricing System
### Current State
- Uses OpenRouter's `/api/v1/models` endpoint
- Displays pricing in a table during startup
- Only works when using OpenRouter
### Improved Behavior
**When endpoint matches known provider:**
1. Fetch pricing from that provider's API
2. Display pricing in the startup table
3. Show per-prompt costs in chat output
**When endpoint is generic/unsupported:**
1. Skip API call (no pricing lookup)
2. Display `---` or `$` placeholders
3. Optional: Show "Pricing not available" note
**User Feedback:**
- Show clear messaging: "Pricing data loaded from OpenRouter"
- Show: "Pricing not available for this endpoint" (for unsupported)
- Don't break chat functionality if pricing fails
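Taken together, the improved behavior above might look like this in code. A sketch only: `PricingLoader` is an illustrative name, and the real OpenRouter fetch is injected as a delegate rather than implemented here:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class PricingLoader
{
    // Only call the pricing API when the endpoint is a known provider
    // (OpenRouter here); otherwise the caller renders "---" placeholders.
    public static async Task<List<(string Model, decimal Price)>?> TryLoadAsync(
        string endpoint,
        Func<Task<List<(string, decimal)>>> fetchFromApi)
    {
        if (!endpoint.Contains("openrouter.ai"))
        {
            Console.WriteLine("Pricing not available for this endpoint");
            return null;
        }
        try
        {
            var pricing = await fetchFromApi();
            Console.WriteLine("Pricing data loaded from OpenRouter");
            return pricing;
        }
        catch
        {
            // Don't break chat functionality if pricing fails
            return null;
        }
    }
}
```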
### Pricing Data Format
Store in `ModelPricing` class:
```csharp
class ModelPricing
{
    public string ModelId { get; set; }
    public decimal InputPricePerMTokens { get; set; }
    public decimal OutputPricePerMTokens { get; set; }
    public decimal? CacheCreationPricePerMTokens { get; set; } // if supported
}
```
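With prices expressed per million tokens, the per-prompt cost is a straightforward calculation. A sketch — `CostCalculator` is an illustrative name, not existing code:

```csharp
static class CostCalculator
{
    // Prices are per million tokens, so divide by 1,000,000.
    public static decimal ComputeCost(int inputTokens, int outputTokens,
                                      decimal inputPricePerMTokens,
                                      decimal outputPricePerMTokens)
        => inputTokens * inputPricePerMTokens / 1_000_000m
         + outputTokens * outputPricePerMTokens / 1_000_000m;
}

// Example: 1,000 input tokens at $3/M plus 500 output tokens at $15/M
// CostCalculator.ComputeCost(1000, 500, 3m, 15m) → 0.0105
```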
---
## Token Tracking System
### Current State
- Uses `x-total-tokens` from OpenRouter headers
- Only works with OpenRouter responses
### Multi-Provider Strategy
**OpenRouter:**
- Use `x-total-tokens` header
- Use `x-response-timing` for latency tracking
**Groq:**
- Use `x-groq-tokens` header
- Use `x-groq-response-time` for latency
**OpenAI:**
- Use `x-ai-response-tokens` header (if available)
- Fall back to response body if needed
**Ollama:**
- No official token counting
- Use output length as proxy estimate
- Optional: Show message token estimates
**Generic/Fallback:**
- Parse `total_tokens` from response JSON
- Fall back to character count estimates
- Show placeholder when unavailable
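The generic fallback could first try the OpenAI-style `usage.total_tokens` field in the response body, then fall back to a rough character-count estimate. A sketch — the ~4 characters-per-token ratio and the `EstimateTokens` name are assumptions:

```csharp
using System;
using System.Text.Json;

static class TokenFallback
{
    // Try to parse "usage.total_tokens" from an OpenAI-style response body;
    // if absent or unparseable, estimate ~4 characters per token.
    public static int EstimateTokens(string responseJson, string outputText)
    {
        try
        {
            using var doc = JsonDocument.Parse(responseJson);
            if (doc.RootElement.TryGetProperty("usage", out var usage) &&
                usage.TryGetProperty("total_tokens", out var total))
                return total.GetInt32();
        }
        catch (JsonException) { /* malformed body: fall through to the estimate */ }
        return outputText.Length / 4; // rough heuristic, not an exact count
    }
}
```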
### Integration Points
**During Chat Session:**
1. After each response, extract tokens from response headers
2. Store in `ChatSession.TokensUsed` object
3. Display in status bar: `Tokens: 128/2048 • Cost: $0.002`
**At Session End:**
1. Show summary: `Total tokens: 1,024 | Total cost: $0.015`
2. Write to session log or history file
---
## Implementation Roadmap
### Phase 1: Conditional Pricing (Current Issues First)
- [ ] Check if endpoint is OpenRouter before fetching pricing
- [ ] Skip pricing API call for non-OpenRouter endpoints
- [ ] Show placeholder message if pricing not available
- [ ] **Time estimate:** 2 hours
### Phase 2: Provider Configuration
- [ ] Add `provider` field to `AnchorConfig` model
- [ ] Update `SetupTui` to ask "Which provider?" (openrouter, ollama, groq, etc.)
- [ ] Auto-detect provider from endpoint URL (smart default)
- [ ] Write provider to config file on setup
- [ ] **Time estimate:** 3 hours
### Phase 3: Provider Abstraction
- [ ] Create `IPricingProvider` interface
- [ ] Move existing `PricingProvider` to `OpenRouterProvider`
- [ ] Create `GenericPricingProvider` for fallback
- [ ] Add provider factory: `ProviderFactory.Create(providerName)`
- [ ] **Time estimate:** 5 hours
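The factory from this phase could be a simple switch over the config's `provider` value. A sketch only — the interface members are elided and the concrete provider classes are empty stubs here:

```csharp
interface IPricingProvider { }

class OpenRouterProvider : IPricingProvider { }
class GroqProvider : IPricingProvider { }
class OllamaProvider : IPricingProvider { }
class OpenAIProvider : IPricingProvider { }
class GenericProvider : IPricingProvider { }

static class ProviderFactory
{
    // Map a config "provider" value to an implementation; unknown names
    // fall back to the generic OpenAI-compatible provider.
    public static IPricingProvider Create(string providerName) => providerName switch
    {
        "openrouter" => new OpenRouterProvider(),
        "groq"       => new GroqProvider(),
        "ollama"     => new OllamaProvider(),
        "openai"     => new OpenAIProvider(),
        _            => new GenericProvider(),
    };
}
```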
### Phase 4: Token Tracking Enhancement
- [ ] Create `ITokenTracker` interface
- [ ] Implement token extraction for multiple providers
- [ ] Display token usage in status bar
- [ ] Add per-prompt cost calculation
- [ ] **Time estimate:** 6 hours
### Phase 5: Second Provider Implementation
- [ ] Implement `GroqProvider` (similar to OpenRouter)
- [ ] Test with Groq API
- [ ] Update documentation
- [ ] **Time estimate:** 4 hours
### Phase 6: Future-Proofing (Optional)
- [ ] Add plugin system for custom providers
- [ ] Allow users to define custom pricing rules
- [ ] Support OpenRouter-compatible custom endpoints
- [ ] **Time estimate:** 8+ hours
---
## User Configuration Guide
### Automatic Setup
Run `/setup` in the chat or `anchor setup` in CLI:
```
Which provider are you using?
1) OpenRouter (qwen models)
2) Groq (qwen/gemma models)
3) Ollama (local models)
4) OpenAI (gpt models)
5) Custom endpoint
```
### Manual Configuration
Edit `anchor.config.json` directly:
```json
{
  "apiKey": "your-api-key",
  "model": "qwen3.5-27b",
  "endpoint": "https://api.groq.com/openai/v1",
  "provider": "groq"
}
```
The `provider` field is optional; it is auto-detected from the endpoint if missing.
### Environment Variables
For custom setup:
```
ANCHOR_ENDPOINT=https://api.groq.com/openai/v1
ANCHOR_PROVIDER=groq
ANCHOR_API_KEY=...
ANCHOR_MODEL=qwen3.5-27b
```
---
## Known Limitations
### Tier 1 Providers (Full Support)
**✓ OpenRouter**
- Pricing: ✓ (native API)
- Tokens: ✓ (response headers)
- Cost tracking: ✓
**✓ Groq** (after Phase 4)
- Pricing: ✓ (will add)
- Tokens: ✓ (response headers)
- Cost tracking: ✓
### Tier 2 Providers (Partial Support)
**○ Ollama**
- Pricing: ○ (free, no lookup needed)
- Tokens: ○ (estimated from output)
- Cost tracking: ○ (placeholder)
**○ OpenAI**
- Pricing: ○ (manual pricing display)
- Tokens: ○ (header extraction)
- Cost tracking: ○ (config-based)
### Tier 3 Providers (Basic Support)
**□ Custom Endpoints**
- Pricing: □ (manual only)
- Tokens: □ (fallback parsing)
- Cost tracking: □ (user-defined)
---
## Future Enhancements
1. **Pricing Database**: Maintain own pricing database (like OpenRouter's)
2. **Cost Estimator**: Predict costs before sending message
3. **Usage Alerts**: Warn user when approaching budget limits
4. **Multi-Model Support**: Compare costs between different providers
5. **Plugin System**: Allow community to add new providers
---
## Success Criteria
- ✅ Users can choose from 3+ providers in setup
- ✅ Pricing displays only for supported endpoints
- ✅ Token tracking works for all Tier 1 providers
- ✅ No breaking changes to existing OpenRouter users
- ✅ Clear documentation on what each provider supports
- ✅ Graceful degradation for unsupported features
---
*Last Updated: 2025-12-23*