Provider Support Plan
Current Problems
- OpenRouter Hardcoded: Endpoint, headers, and pricing API calls are hardcoded to OpenRouter
- Config Ineffective: SetupTui allows "custom endpoint" but Program.cs ignores it
- Token Count: Token usage tracking only works with OpenRouter response headers
- Pricing Only for One Provider: Models list shows pricing, but only when using OpenRouter
Goals
- Make the system endpoint-agnostic
- Support pricing/token tracking for multiple providers
- Keep OpenRouter as the default (familiar)
- Allow users to configure any OpenAI-compatible endpoint
- Show pricing/token info only when available for each provider
Provider Categories
Tier 1: Native Support (Built-in)
- OpenRouter (default)
- Ollama (local, no auth)
- Groq (high-speed inference)
- Anthropic (native or via API)
- OpenAI (official API)
Tier 2: Config-Based Support
- Cerebras
- DeepSeek
- Any OpenAI-compatible endpoint that supports custom headers
Tier 3: Manual Configuration Required
- Self-hosted endpoints
- Corporate proxies
- Custom middleware layers
// Example: provider interface
interface IPricingProvider
{
    // Get pricing info from the provider's API
    Task<List<ModelPricing>> GetModelsAsync(string apiKey);

    // Extract token usage from a chat response
    Task<TokenUsage> GetTokensFromResponseAsync(HttpResponseMessage response);

    // Add provider-specific headers if needed
    void AddHeaders(HttpRequestMessage request, string apiKey);
}
Supported Implementations:
- OpenRouterProvider (uses `/api/v1/models` + `x-total-tokens`)
- GroqProvider (uses Groq's pricing API + response headers)
- OllamaProvider (free tier, no pricing lookup, basic token counting)
- OpenAIProvider (uses OpenAI's model list + token counting)
- GenericProvider (fallback for any OpenAI-compatible endpoint)
Configuration:
Store provider selection in anchor.config.json:
{
"apiKey": "your-key",
"model": "qwen3.5-27b",
"endpoint": "https://openrouter.ai/api/v1",
"provider": "openrouter"
}
Auto-detect provider from endpoint URL if not specified.
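A minimal sketch of what that auto-detection could look like. The host substrings below are assumptions based on each provider's public API domains, and `ProviderDetector` is a hypothetical name, not an existing class:

```csharp
using System;

static class ProviderDetector
{
    // Map an endpoint URL to a provider name; "generic" is the fallback
    // for any other OpenAI-compatible endpoint.
    public static string Detect(string endpoint)
    {
        var host = new Uri(endpoint).Host.ToLowerInvariant();
        if (host.Contains("openrouter.ai")) return "openrouter";
        if (host.Contains("api.groq.com")) return "groq";
        if (host.Contains("api.openai.com")) return "openai";
        if (host == "localhost" || host == "127.0.0.1") return "ollama";
        return "generic";
    }
}
```

So `ProviderDetector.Detect("https://openrouter.ai/api/v1")` yields `"openrouter"`, while a corporate proxy URL falls through to `"generic"`.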
Pricing System
Current State
- Uses OpenRouter's `/api/v1/models` endpoint
- Displays pricing in a table during startup
- Only works when using OpenRouter
Improved Behavior
When endpoint matches known provider:
- Fetch pricing from that provider's API
- Display pricing in the startup table
- Show per-prompt costs in chat output
When endpoint is generic/unsupported:
- Skip API call (no pricing lookup)
- Display `---` or `$` placeholders
- Optional: Show "Pricing not available" note
User Feedback:
- Show clear messaging: "Pricing data loaded from OpenRouter"
- Show: "Pricing not available for this endpoint" (for unsupported)
- Don't break chat functionality if pricing fails
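One way to sketch that last requirement: wrap the startup lookup so any failure degrades to "no pricing" rather than crashing the session. `PricingStartup` and the tuple shape are illustrative, not the final types:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class PricingStartup
{
    // Returns pricing rows on success, or null on any failure;
    // the caller renders placeholders instead of a pricing table.
    public static async Task<List<(string Model, decimal InPerM, decimal OutPerM)>> TryLoadPricingAsync(
        Func<Task<List<(string Model, decimal InPerM, decimal OutPerM)>>> fetch)
    {
        try
        {
            return await fetch();
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Pricing not available for this endpoint ({ex.Message})");
            return null;
        }
    }
}
```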
Pricing Data Format
Store in ModelPricing class:
class ModelPricing
{
    string ModelId;
    decimal InputPricePerMTokens;
    decimal OutputPricePerMTokens;
    decimal? CacheCreationPricePerMTokens; // if supported
}
Token Tracking System
Current State
- Uses `x-total-tokens` from OpenRouter headers
- Only works with OpenRouter responses
Multi-Provider Strategy
OpenRouter:
- Use `x-total-tokens` header
- Use `x-response-timing` for latency tracking
Groq:
- Use `x-groq-tokens` header
- Use `x-groq-response-time` for latency
OpenAI:
- Use `x-ai-response-tokens` header (if available)
- Fall back to response body if needed
Ollama:
- No official token counting
- Use output length as proxy estimate
- Optional: Show message token estimates
Generic/Fallback:
- Parse `total_tokens` from response JSON
- Fall back to character count estimates
- Show placeholder when unavailable
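The generic fallback chain above could look like this. It assumes an OpenAI-style body with a `usage.total_tokens` field, and the characters-divided-by-four estimate is a deliberately crude heuristic, not a real tokenizer:

```csharp
using System;
using System.Text.Json;

static class TokenFallback
{
    // Try `usage.total_tokens` from the response body first; if the field
    // is absent or the body is malformed, estimate from output length.
    public static int ExtractTotalTokens(string responseJson, string outputText)
    {
        try
        {
            using var doc = JsonDocument.Parse(responseJson);
            if (doc.RootElement.TryGetProperty("usage", out var usage) &&
                usage.TryGetProperty("total_tokens", out var total))
                return total.GetInt32();
        }
        catch (JsonException) { /* malformed body: fall through to estimate */ }
        return outputText.Length / 4; // rough proxy, surfaced as an estimate
    }
}
```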
Integration Points
During Chat Session:
- After each response, extract tokens from response headers
- Store in `ChatSession.TokensUsed` object
- Display in status bar: `Tokens: 128/2048 • Cost: $0.002`
At Session End:
- Show summary: `Total tokens: 1,024 | Total cost: $0.015`
- Write to session log or history file
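The per-prompt cost behind those numbers follows directly from the `ModelPricing` rates, which are per million tokens. A minimal sketch (the rates in the example are illustrative, not real provider prices):

```csharp
static class CostCalculator
{
    // inputPerM / outputPerM are USD per 1,000,000 tokens, matching
    // ModelPricing.InputPricePerMTokens / OutputPricePerMTokens.
    public static decimal PromptCost(int inputTokens, int outputTokens,
                                     decimal inputPerM, decimal outputPerM) =>
        (inputTokens * inputPerM + outputTokens * outputPerM) / 1_000_000m;
}
```

For example, 128 input tokens at $0.50/M plus 256 output tokens at $1.50/M comes to (64 + 384) / 1,000,000 = $0.000448.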
Implementation Roadmap
Phase 1: Conditional Pricing (Current Issues First)
- Check if endpoint is OpenRouter before fetching pricing
- Skip pricing API call for non-OpenRouter endpoints
- Show placeholder message if pricing not available
- Time estimate: 2 hours
Phase 2: Provider Configuration
- Add `provider` field to `AnchorConfig` model
- Update `SetupTui` to ask "Which provider?" (openrouter, ollama, groq, etc.)
- Auto-detect provider from endpoint URL (smart default)
- Write provider to config file on setup
- Time estimate: 3 hours
Phase 3: Provider Abstraction
- Create `IPricingProvider` interface
- Move existing `PricingProvider` to `OpenRouterProvider`
- Create `GenericPricingProvider` for fallback
- Add provider factory: `ProviderFactory.Create(providerName)`
- Time estimate: 5 hours
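The factory step can be sketched as a switch over the provider name. The provider class names follow this plan and may change during implementation; the stub bodies here stand in for the real implementations:

```csharp
interface IPricingProvider { string Name { get; } }

sealed class OpenRouterProvider : IPricingProvider { public string Name => "openrouter"; }
sealed class GroqProvider : IPricingProvider { public string Name => "groq"; }
sealed class GenericPricingProvider : IPricingProvider { public string Name => "generic"; }

static class ProviderFactory
{
    // Unknown names fall back to the generic OpenAI-compatible provider,
    // so a new or custom endpoint never fails at construction time.
    public static IPricingProvider Create(string providerName) => providerName switch
    {
        "openrouter" => new OpenRouterProvider(),
        "groq"       => new GroqProvider(),
        _            => new GenericPricingProvider(),
    };
}
```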
Phase 4: Token Tracking Enhancement
- Create `ITokenTracker` interface
- Implement token extraction for multiple providers
- Display token usage in status bar
- Add per-prompt cost calculation
- Time estimate: 6 hours
Phase 5: Second Provider Implementation
- Implement `GroqProvider` (similar to OpenRouter)
- Test with Groq API
- Update documentation
- Time estimate: 4 hours
Phase 6: Future-Proofing (Optional)
- Add plugin system for custom providers
- Allow users to define custom pricing rules
- Support OpenRouter-compatible custom endpoints
- Time estimate: 8+ hours
User Configuration Guide
Automatic Setup
Run /setup in the chat or anchor setup in CLI:
Which provider are you using?
1) OpenRouter (qwen models)
2) Groq (qwen/gemma models)
3) Ollama (local models)
4) OpenAI (gpt models)
5) Custom endpoint
Manual Configuration
Edit anchor.config.json directly:
{
"apiKey": "your-api-key",
"model": "qwen3.5-27b",
"endpoint": "https://api.groq.com/openai/v1",
"provider": "groq" // optional, auto-detected if missing
}
Environment Variables
For custom setup:
ANCHOR_ENDPOINT=https://api.groq.com/openai/v1
ANCHOR_PROVIDER=groq
ANCHOR_API_KEY=...
ANCHOR_MODEL=qwen3.5-27b
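A common convention, assumed here, is that these variables override the config file when set. `AnchorEnv.Resolve` is a hypothetical helper illustrating that precedence:

```csharp
using System;

static class AnchorEnv
{
    // Prefer the environment variable when it is set and non-empty;
    // otherwise keep the value loaded from anchor.config.json.
    public static string Resolve(string envName, string configValue)
    {
        var v = Environment.GetEnvironmentVariable(envName);
        return string.IsNullOrEmpty(v) ? configValue : v;
    }
}
```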
Known Limitations
Tier 1 Providers (Full Support)
✓ OpenRouter
- Pricing: ✓ (native API)
- Tokens: ✓ (response headers)
- Cost tracking: ✓
✓ Groq (after Phase 4)
- Pricing: ✓ (will add)
- Tokens: ✓ (response headers)
- Cost tracking: ✓
Tier 2 Providers (Partial Support)
○ Ollama
- Pricing: ○ (free, no lookup needed)
- Tokens: ○ (estimated from output)
- Cost tracking: ○ (placeholder)
○ OpenAI
- Pricing: ○ (manual pricing display)
- Tokens: ○ (header extraction)
- Cost tracking: ○ (config-based)
Tier 3 Providers (Basic Support)
□ Custom Endpoints
- Pricing: □ (manual only)
- Tokens: □ (fallback parsing)
- Cost tracking: □ (user-defined)
Future Enhancements
- Pricing Database: Maintain own pricing database (like OpenRouter's)
- Cost Estimator: Predict costs before sending message
- Usage Alerts: Warn user when approaching budget limits
- Multi-Model Support: Compare costs between different providers
- Plugin System: Allow community to add new providers
Success Criteria
- ✅ Users can choose from 3+ providers in setup
- ✅ Pricing displays only for supported endpoints
- ✅ Token tracking works for all Tier 1 providers
- ✅ No breaking changes to existing OpenRouter users
- ✅ Clear documentation on what each provider supports
- ✅ Graceful degradation for unsupported features
Last Updated: 2025-12-23