docs: Add a detailed plan for multi-provider support, including pricing and token tracking strategies.
# Provider Support Plan

## Current Problems

1. **OpenRouter Hardcoded**: Endpoint, headers, and pricing API calls are hardcoded to OpenRouter
2. **Config Ineffective**: SetupTui allows a "custom endpoint" but Program.cs ignores it
3. **Token Count**: Token usage tracking only works with OpenRouter response headers
4. **Pricing Only for One Provider**: The models list shows pricing, but only when using OpenRouter

---

## Goals

1. Make the system **endpoint-agnostic**
2. Support pricing/token tracking for **multiple providers**
3. Keep **OpenRouter as the default** (familiar)
4. Allow users to configure any OpenAI-compatible endpoint
5. Show pricing/token info **only when available** for each provider

---

## Provider Categories

### Tier 1: Native Support (Built-in)

- OpenRouter (default)
- Ollama (local, no auth)
- Groq (high-speed inference)
- Anthropic (native or via API)
- OpenAI (official API)

### Tier 2: Config-Based Support

- Cerebras
- DeepSeek
- Any OpenAI-compatible endpoint that supports custom headers

### Tier 3: Manual Configuration Required

- Self-hosted endpoints
- Corporate proxies
- Custom middleware layers

---

```csharp
// Example: provider interface (introduced as IPricingProvider in Phase 3)
interface IPricingProvider
{
    // Get pricing info from the provider's API
    Task<List<ModelPricing>> GetModelsAsync(string apiKey);

    // Extract token usage from a response
    Task<TokenUsage> GetTokensFromResponseAsync(HttpResponseMessage response);

    // Add provider-specific headers if needed
    void AddHeaders(HttpRequestMessage request, string apiKey);
}
```

**Supported Implementations:**

- `OpenRouterProvider` (uses `/api/v1/models` + `x-total-tokens`)
- `GroqProvider` (uses Groq's pricing API + response headers)
- `OllamaProvider` (free tier, no pricing lookup, basic token counting)
- `OpenAIProvider` (uses OpenAI's model list + token counting)
- `GenericProvider` (fallback for any OpenAI-compatible endpoint)

**Configuration:**

Store provider selection in `anchor.config.json`:

```json
{
  "apiKey": "your-key",
  "model": "qwen3.5-27b",
  "endpoint": "https://openrouter.ai/api/v1",
  "provider": "openrouter"
}
```

Auto-detect the provider from the endpoint URL if not specified.

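A minimal sketch of what that auto-detection could look like (the helper name and host-matching rules are illustrative, not existing code):

```csharp
using System;

// Hypothetical helper: map known endpoint hosts to provider names,
// falling back to "generic" for anything unrecognized.
static string DetectProvider(string endpoint)
{
    var host = new Uri(endpoint).Host;
    if (host.EndsWith("openrouter.ai")) return "openrouter";
    if (host.EndsWith("groq.com"))      return "groq";
    if (host == "api.openai.com")       return "openai";
    if (host == "localhost" || host == "127.0.0.1") return "ollama";
    return "generic";
}
```

Keeping the match on the URL host (rather than a substring of the whole URL) avoids false positives from path segments.
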

---

## Pricing System

### Current State

- Uses OpenRouter's `/api/v1/models` endpoint
- Displays pricing in a table during startup
- Only works when using OpenRouter

### Improved Behavior

**When the endpoint matches a known provider:**

1. Fetch pricing from that provider's API
2. Display pricing in the startup table
3. Show per-prompt costs in chat output

**When the endpoint is generic/unsupported:**

1. Skip the API call (no pricing lookup)
2. Display `---` or `$` placeholders
3. Optional: show a "Pricing not available" note

**User Feedback:**

- Show clear messaging: "Pricing data loaded from OpenRouter"
- Show "Pricing not available for this endpoint" for unsupported endpoints
- Don't break chat functionality if pricing fails

### Pricing Data Format

Store it in a `ModelPricing` class:

```csharp
class ModelPricing
{
    public string ModelId { get; set; }
    public decimal InputPricePerMTokens { get; set; }
    public decimal OutputPricePerMTokens { get; set; }
    public decimal? CacheCreationPricePerMTokens { get; set; } // if supported
}
```

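With prices stored per million tokens, the per-prompt cost is a single calculation; a sketch (the helper is illustrative):

```csharp
// Sketch: per-prompt cost in USD from per-million-token prices.
static decimal ComputeCost(decimal inputPricePerM, decimal outputPricePerM,
                           long inputTokens, long outputTokens)
{
    return inputTokens  * inputPricePerM  / 1_000_000m
         + outputTokens * outputPricePerM / 1_000_000m;
}
```

For example, 1,000 input tokens at $3/M plus 500 output tokens at $15/M comes to $0.0105. Using `decimal` throughout avoids binary floating-point rounding in money amounts.
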

---

## Token Tracking System

### Current State

- Uses `x-total-tokens` from OpenRouter headers
- Only works with OpenRouter responses

### Multi-Provider Strategy

**OpenRouter:**

- Use the `x-total-tokens` header
- Use `x-response-timing` for latency tracking

**Groq:**

- Use the `x-groq-tokens` header
- Use `x-groq-response-time` for latency
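
Whichever header a provider uses, the extraction step looks the same; a sketch (header names come from the lists in this section and may differ in practice):

```csharp
using System.Linq;
using System.Net.Http;

// Sketch: read a provider-specific token-count header, returning
// null when the header is missing or not a number.
static long? ReadTokenHeader(HttpResponseMessage response, string headerName)
{
    if (response.Headers.TryGetValues(headerName, out var values) &&
        long.TryParse(values.FirstOrDefault(), out var tokens))
        return tokens;
    return null;
}
```

Returning `long?` lets the caller distinguish "header absent" from a real zero and fall through to the next strategy.
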
**OpenAI:**

- Use the `x-ai-response-tokens` header (if available)
- Fall back to the response body if needed

**Ollama:**

- No official token counting
- Use output length as a proxy estimate
- Optional: Show message token estimates

**Generic/Fallback:**

- Parse `total_tokens` from response JSON
- Fall back to character count estimates
- Show placeholder when unavailable
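
The generic fallback could be sketched like this (assuming an OpenAI-style `usage` object in the body; the 4-characters-per-token estimate is a rough heuristic, not a provider guarantee):

```csharp
using System.Text.Json;

// Sketch: pull usage.total_tokens from an OpenAI-compatible body,
// returning null when the field is absent or not a number.
static long? ExtractTotalTokens(string responseJson)
{
    using var doc = JsonDocument.Parse(responseJson);
    if (doc.RootElement.TryGetProperty("usage", out var usage) &&
        usage.TryGetProperty("total_tokens", out var total) &&
        total.ValueKind == JsonValueKind.Number &&
        total.TryGetInt64(out var value))
        return value;
    return null;
}

// Rough character-count fallback (~4 chars per token for English text).
static long EstimateTokens(string text) => text.Length / 4;
```
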
### Integration Points

**During Chat Session:**

1. After each response, extract tokens from response headers
2. Store in `ChatSession.TokensUsed` object
3. Display in status bar: `Tokens: 128/2048 • Cost: $0.002`

**At Session End:**

1. Show summary: `Total tokens: 1,024 | Total cost: $0.015`
2. Write to session log or history file
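
The two display strings above could be produced with culture-invariant formatting so the thousands separator stays stable across locales (helper names are illustrative):

```csharp
using System.Globalization;

// Sketch: status-bar and session-summary lines in the formats above.
static string StatusBar(long used, long max, decimal cost) =>
    string.Format(CultureInfo.InvariantCulture,
        "Tokens: {0}/{1} • Cost: ${2:0.000}", used, max, cost);

static string SessionSummary(long totalTokens, decimal totalCost) =>
    string.Format(CultureInfo.InvariantCulture,
        "Total tokens: {0:N0} | Total cost: ${1:0.000}", totalTokens, totalCost);
```
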

---

## Implementation Roadmap

### Phase 1: Conditional Pricing (Current Issues First)

- [ ] Check if endpoint is OpenRouter before fetching pricing
- [ ] Skip pricing API call for non-OpenRouter endpoints
- [ ] Show placeholder message if pricing not available
- [ ] **Time estimate:** 2 hours

### Phase 2: Provider Configuration

- [ ] Add `provider` field to `AnchorConfig` model
- [ ] Update `SetupTui` to ask "Which provider?" (openrouter, ollama, groq, etc.)
- [ ] Auto-detect provider from endpoint URL (smart default)
- [ ] Write provider to config file on setup
- [ ] **Time estimate:** 3 hours

### Phase 3: Provider Abstraction

- [ ] Create `IPricingProvider` interface
- [ ] Move existing `PricingProvider` to `OpenRouterProvider`
- [ ] Create `GenericPricingProvider` for fallback
- [ ] Add provider factory: `ProviderFactory.Create(providerName)`
- [ ] **Time estimate:** 5 hours
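
The Phase 3 factory could be as small as a switch expression; a sketch (the stub types here stand in for the real provider implementations):

```csharp
// Illustrative stubs standing in for the real provider classes.
interface IPricingProvider { string Name { get; } }
class OpenRouterProvider : IPricingProvider { public string Name => "openrouter"; }
class GenericProvider : IPricingProvider { public string Name => "generic"; }

static class ProviderFactory
{
    // Unknown provider names fall back to the generic
    // OpenAI-compatible provider rather than failing.
    public static IPricingProvider Create(string providerName) => providerName switch
    {
        "openrouter" => new OpenRouterProvider(),
        _            => new GenericProvider(),
    };
}
```
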
### Phase 4: Token Tracking Enhancement

- [ ] Create `ITokenTracker` interface
- [ ] Implement token extraction for multiple providers
- [ ] Display token usage in status bar
- [ ] Add per-prompt cost calculation
- [ ] **Time estimate:** 6 hours

### Phase 5: Second Provider Implementation

- [ ] Implement `GroqProvider` (similar to OpenRouter)
- [ ] Test with Groq API
- [ ] Update documentation
- [ ] **Time estimate:** 4 hours

### Phase 6: Future-Proofing (Optional)

- [ ] Add plugin system for custom providers
- [ ] Allow users to define custom pricing rules
- [ ] Support OpenRouter-compatible custom endpoints
- [ ] **Time estimate:** 8+ hours


---

## User Configuration Guide

### Automatic Setup

Run `/setup` in the chat or `anchor setup` in the CLI:

```
Which provider are you using?
1) OpenRouter (qwen models)
2) Groq (qwen/gemma models)
3) Ollama (local models)
4) OpenAI (gpt models)
5) Custom endpoint
```

### Manual Configuration

Edit `anchor.config.json` directly (`provider` is optional and auto-detected from the endpoint if missing):

```json
{
  "apiKey": "your-api-key",
  "model": "qwen3.5-27b",
  "endpoint": "https://api.groq.com/openai/v1",
  "provider": "groq"
}
```

### Environment Variables

For a custom setup:

```
ANCHOR_ENDPOINT=https://api.groq.com/openai/v1
ANCHOR_PROVIDER=groq
ANCHOR_API_KEY=...
ANCHOR_MODEL=qwen3.5-27b
```
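
Environment variables would typically take precedence over the config file; a sketch (the precedence rule is an assumption, not documented behavior):

```csharp
using System;

// Sketch: an environment variable, when set, overrides the
// corresponding config-file value.
static string Resolve(string envVar, string configValue) =>
    Environment.GetEnvironmentVariable(envVar) ?? configValue;
```
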

---

## Known Limitations

### Tier 1 Providers (Full Support)

**✓ OpenRouter**

- Pricing: ✓ (native API)
- Tokens: ✓ (response headers)
- Cost tracking: ✓

**✓ Groq** (after Phase 5)

- Pricing: ✓ (will add)
- Tokens: ✓ (response headers)
- Cost tracking: ✓

### Tier 2 Providers (Partial Support)

**○ Ollama**

- Pricing: ○ (free, no lookup needed)
- Tokens: ○ (estimated from output)
- Cost tracking: ○ (placeholder)

**○ OpenAI**

- Pricing: ○ (manual pricing display)
- Tokens: ○ (header extraction)
- Cost tracking: ○ (config-based)

### Tier 3 Providers (Basic Support)

**□ Custom Endpoints**

- Pricing: □ (manual only)
- Tokens: □ (fallback parsing)
- Cost tracking: □ (user-defined)


---

## Future Enhancements

1. **Pricing Database**: Maintain our own pricing database (like OpenRouter's)
2. **Cost Estimator**: Predict costs before sending a message
3. **Usage Alerts**: Warn the user when approaching budget limits
4. **Multi-Model Support**: Compare costs between different providers
5. **Plugin System**: Allow the community to add new providers

---

## Success Criteria

- ✅ Users can choose from 3+ providers in setup
- ✅ Pricing displays only for supported endpoints
- ✅ Token tracking works for all Tier 1 providers
- ✅ No breaking changes for existing OpenRouter users
- ✅ Clear documentation on what each provider supports
- ✅ Graceful degradation for unsupported features

---

*Last Updated: 2025-12-23*