# Provider Support Plan

## Current Problems

1. **OpenRouter Hardcoded**: Endpoint, headers, and pricing API calls are hardcoded to OpenRouter
2. **Config Ineffective**: SetupTui allows a "custom endpoint", but Program.cs ignores it
3. **Token Count**: Token usage tracking only works with OpenRouter response headers
4. **Pricing Only for One Provider**: The models list shows pricing, but only when using OpenRouter

---

## Goals

1. Make the system **endpoint-agnostic**
2. Support pricing/token tracking for **multiple providers**
3. Keep **OpenRouter as the default** (familiar)
4. Allow users to configure any OpenAI-compatible endpoint
5. Show pricing/token info **only when available** for each provider

---

## Provider Categories

### Tier 1: Native Support (Built-in)

- OpenRouter (default)
- Ollama (local, no auth)
- Groq (high-speed inference)
- Anthropic (native or via API)
- OpenAI (official API)

### Tier 2: Config-Based Support

- Cerebras
- DeepSeek
- Any OpenAI-compatible endpoint that supports custom headers

### Tier 3: Manual Configuration Required

- Self-hosted endpoints
- Corporate proxies
- Custom middleware layers

---

```csharp
// Example: provider interface
interface IPricingProvider
{
    // Get pricing info from the provider's API
    Task<List<ModelPricing>> GetModelsAsync(string apiKey);

    // Get token usage from a response
    Task<int?> GetTokensFromResponseAsync(HttpResponseMessage response);

    // Add provider-specific headers if needed
    void AddHeaders(HttpRequestMessage request, string apiKey);
}
```

**Supported Implementations:**

- `OpenRouterProvider` (uses `/api/v1/models` + `x-total-tokens`)
- `GroqProvider` (uses Groq's pricing API + response headers)
- `OllamaProvider` (free tier, no pricing lookup, basic token counting)
- `OpenAIProvider` (uses OpenAI's model list + token counting)
- `GenericProvider` (fallback for any OpenAI-compatible endpoint)

**Configuration:**

Store the provider selection in `anchor.config.json`:

```json
{
  "apiKey": "your-key",
  "model": "qwen3.5-27b",
  "endpoint": "https://openrouter.ai/api/v1",
  "provider": "openrouter"
}
```

Auto-detect the provider from the endpoint URL if it is not specified.

---

## Pricing System

### Current State

- Uses OpenRouter's `/api/v1/models` endpoint
- Displays pricing in a table during startup
- Only works when using OpenRouter

### Improved Behavior

**When the endpoint matches a known provider:**

1. Fetch pricing from that provider's API
2. Display pricing in the startup table
3. Show per-prompt costs in chat output

**When the endpoint is generic/unsupported:**

1. Skip the API call (no pricing lookup)
2. Display `---` or `$` placeholders
3. Optional: show a "Pricing not available" note

**User Feedback:**

- Show clear messaging: "Pricing data loaded from OpenRouter"
- Show "Pricing not available for this endpoint" for unsupported endpoints
- Don't break chat functionality if pricing fails

### Pricing Data Format

Store the data in a `ModelPricing` class:

```csharp
class ModelPricing
{
    string ModelId;
    decimal InputPricePerMTokens;
    decimal OutputPricePerMTokens;
    decimal? CacheCreationPricePerMTokens; // if supported
}
```

---

## Token Tracking System

### Current State

- Uses `x-total-tokens` from OpenRouter headers
- Only works with OpenRouter responses

### Multi-Provider Strategy

**OpenRouter:**
- Use the `x-total-tokens` header
- Use `x-response-timing` for latency tracking

**Groq:**
- Use the `x-groq-tokens` header
- Use `x-groq-response-time` for latency

**OpenAI:**
- Use the `x-ai-response-tokens` header (if available)
- Fall back to the response body if needed

**Ollama:**
- No official token counting
- Use the output length as a proxy estimate
- Optional: show message token estimates

**Generic/Fallback:**
- Parse `total_tokens` from the response JSON
- Fall back to character-count estimates
- Show a placeholder when unavailable

### Integration Points

**During Chat Session:**

1. After each response, extract tokens from the response headers
2. Store them in the `ChatSession.TokensUsed` object
3. Display in the status bar: `Tokens: 128/2048 • Cost: $0.002`

**At Session End:**
1. Show summary: `Total tokens: 1,024 | Total cost: $0.015`
2. Write to the session log or history file

---

## Implementation Roadmap

### Phase 1: Conditional Pricing (Current Issues First)

- [ ] Check if the endpoint is OpenRouter before fetching pricing
- [ ] Skip the pricing API call for non-OpenRouter endpoints
- [ ] Show a placeholder message if pricing is not available
- [ ] **Time estimate:** 2 hours

### Phase 2: Provider Configuration

- [ ] Add a `provider` field to the `AnchorConfig` model
- [ ] Update `SetupTui` to ask "Which provider?" (openrouter, ollama, groq, etc.)
- [ ] Auto-detect the provider from the endpoint URL (smart default)
- [ ] Write the provider to the config file on setup
- [ ] **Time estimate:** 3 hours

### Phase 3: Provider Abstraction

- [ ] Create an `IPricingProvider` interface
- [ ] Move the existing `PricingProvider` to `OpenRouterProvider`
- [ ] Create a `GenericPricingProvider` for fallback
- [ ] Add a provider factory: `ProviderFactory.Create(providerName)`
- [ ] **Time estimate:** 5 hours

### Phase 4: Token Tracking Enhancement

- [ ] Create an `ITokenTracker` interface
- [ ] Implement token extraction for multiple providers
- [ ] Display token usage in the status bar
- [ ] Add per-prompt cost calculation
- [ ] **Time estimate:** 6 hours

### Phase 5: Second Provider Implementation

- [ ] Implement `GroqProvider` (similar to OpenRouter)
- [ ] Test with the Groq API
- [ ] Update documentation
- [ ] **Time estimate:** 4 hours

### Phase 6: Future-Proofing (Optional)

- [ ] Add a plugin system for custom providers
- [ ] Allow users to define custom pricing rules
- [ ] Support OpenRouter-compatible custom endpoints
- [ ] **Time estimate:** 8+ hours

---

## User Configuration Guide

### Automatic Setup

Run `/setup` in the chat or `anchor setup` in the CLI:

```
Which provider are you using?
1) OpenRouter (qwen models)
2) Groq (qwen/gemma models)
3) Ollama (local models)
4) OpenAI (gpt models)
5) Custom endpoint
```

### Manual Configuration

Edit `anchor.config.json` directly:

```json
{
  "apiKey": "your-api-key",
  "model": "qwen3.5-27b",
  "endpoint": "https://api.groq.com/openai/v1",
  "provider": "groq"
}
```

The `provider` field is optional; it is auto-detected from the endpoint if missing.

### Environment Variables

For a custom setup:

```
ANCHOR_ENDPOINT=https://api.groq.com/openai/v1
ANCHOR_PROVIDER=groq
ANCHOR_API_KEY=...
ANCHOR_MODEL=qwen3.5-27b
```

---

## Known Limitations

### Tier 1 Providers (Full Support)

**✓ OpenRouter**
- Pricing: ✓ (native API)
- Tokens: ✓ (response headers)
- Cost tracking: ✓

**✓ Groq** (after Phase 4)
- Pricing: ✓ (will add)
- Tokens: ✓ (response headers)
- Cost tracking: ✓

### Tier 2 Providers (Partial Support)

**○ Ollama**
- Pricing: ○ (free, no lookup needed)
- Tokens: ○ (estimated from output)
- Cost tracking: ○ (placeholder)

**○ OpenAI**
- Pricing: ○ (manual pricing display)
- Tokens: ○ (header extraction)
- Cost tracking: ○ (config-based)

### Tier 3 Providers (Basic Support)

**□ Custom Endpoints**
- Pricing: □ (manual only)
- Tokens: □ (fallback parsing)
- Cost tracking: □ (user-defined)

---

## Future Enhancements

1. **Pricing Database**: Maintain our own pricing database (like OpenRouter's)
2. **Cost Estimator**: Predict costs before sending a message
3. **Usage Alerts**: Warn the user when approaching budget limits
4. **Multi-Model Support**: Compare costs between different providers
5. **Plugin System**: Allow the community to add new providers

---

## Success Criteria

- ✅ Users can choose from 3+ providers in setup
- ✅ Pricing displays only for supported endpoints
- ✅ Token tracking works for all Tier 1 providers
- ✅ No breaking changes for existing OpenRouter users
- ✅ Clear documentation on what each provider supports
- ✅ Graceful degradation for unsupported features

---

*Last Updated: 2025-12-23*
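As an appendix, the "auto-detected if missing" behavior described under Phase 2 and Manual Configuration could be sketched as below. This is a sketch, not the implementation: `ProviderDetection`, `DetectProvider`, and the host checks are assumed names that do not exist in the codebase yet.

```csharp
using System;

static class ProviderDetection
{
    // Sketch: map a configured endpoint URL to a provider name.
    // Hypothetical helper; the host substrings are assumptions, not a final list.
    public static string DetectProvider(string endpoint)
    {
        var host = new Uri(endpoint).Host;

        if (host.EndsWith("openrouter.ai")) return "openrouter";
        if (host.EndsWith("groq.com"))      return "groq";
        if (host.EndsWith("openai.com"))    return "openai";
        if (host is "localhost" or "127.0.0.1") return "ollama";

        return "generic"; // fall through to the OpenAI-compatible GenericProvider
    }
}
```

For example, `DetectProvider("https://api.groq.com/openai/v1")` returns `"groq"`, while any unrecognized host falls back to `"generic"` rather than failing.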
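Likewise, the header-first, JSON-fallback strategy from the Token Tracking System section might look like the following sketch. `TokenExtraction` and `ExtractTokensAsync` are hypothetical names; the Groq and OpenAI header names are taken from the strategy table above and should be verified against each provider before Phase 4.

```csharp
using System.Linq;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

static class TokenExtraction
{
    // Sketch: try the provider-specific header first, then a generic JSON fallback.
    public static async Task<int?> ExtractTokensAsync(HttpResponseMessage response, string provider)
    {
        // Header name per provider, as listed in the multi-provider strategy
        var header = provider switch
        {
            "openrouter" => "x-total-tokens",
            "groq"       => "x-groq-tokens",
            "openai"     => "x-ai-response-tokens",
            _            => null
        };

        if (header != null &&
            response.Headers.TryGetValues(header, out var values) &&
            int.TryParse(values.FirstOrDefault(), out var fromHeader))
        {
            return fromHeader;
        }

        // Generic fallback: parse usage.total_tokens from the response body
        try
        {
            var body = await response.Content.ReadAsStringAsync();
            using var doc = JsonDocument.Parse(body);
            if (doc.RootElement.TryGetProperty("usage", out var usage) &&
                usage.TryGetProperty("total_tokens", out var total))
            {
                return total.GetInt32();
            }
        }
        catch (JsonException)
        {
            // Body was not JSON; fall through to "unavailable"
        }

        return null; // caller shows a placeholder instead of breaking chat
    }
}
```

Returning `int?` lets the status bar distinguish "no data" (show a placeholder) from a real count, which matches the graceful-degradation success criterion.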