
Compare commits


3 Commits

16 changed files with 2139 additions and 28 deletions

View File

@@ -7,7 +7,8 @@ internal sealed class AnchorConfig
{
public string ApiKey { get; set; } = "";
public string Model { get; set; } = "qwen/qwen3.5-397b-a17b";
public string Provider { get; set; } = "openrouter";
public string Endpoint { get; set; } = "https://openrouter.ai/api/v1";
// ── Persistence ──────────────────────────────────────────────────────
private static string ConfigPath =>

View File

@@ -5,6 +5,7 @@ namespace AnchorCli.OpenRouter;
/// </summary>
internal sealed class TokenTracker
{
public string Provider { get; set; } = "Unknown";
public long SessionInputTokens { get; private set; }
public long SessionOutputTokens { get; private set; }
public int RequestCount { get; private set; }
@@ -23,7 +24,6 @@ internal sealed class TokenTracker
/// <summary>Fixed USD per API request.</summary>
public decimal RequestPrice { get; set; }
/// <summary>
/// Record usage from one response (may span multiple LLM rounds).
/// </summary>

PROVIDERS.md Normal file
View File

@@ -0,0 +1,293 @@
# Provider Support Plan
## Current Problems
1. **OpenRouter Hardcoded**: Endpoint, headers, and pricing API calls are hardcoded to OpenRouter
2. **Config Ineffective**: SetupTui allows "custom endpoint" but Program.cs ignores it
3. **Token Count**: Token usage tracking only works with OpenRouter response headers
4. **Pricing Only for One Provider**: Models list shows pricing, but only when using OpenRouter
---
## Goals
1. Make the system **endpoint-agnostic**
2. Support pricing/token tracking for **multiple providers**
3. Keep **OpenRouter as the default** (familiar)
4. Allow users to configure any OpenAI-compatible endpoint
5. Show pricing/token info **only when available** for each provider
---
## Provider Categories
### Tier 1: Native Support (Built-in)
- OpenRouter (default)
- Ollama (local, no auth)
- Groq (high-speed inference)
- Anthropic (native or via API)
- OpenAI (official API)
### Tier 2: Config-Based Support
- Cerebras
- DeepSeek
- Any OpenAI-compatible endpoint that supports custom headers
### Tier 3: Manual Configuration Required
- Self-hosted endpoints
- Corporate proxies
- Custom middleware layers
---
```csharp
// Example: provider interface (illustrative sketch)
internal interface IProvider
{
    // Get pricing info from the provider's API
    Task<List<ModelPricing>> GetModelsAsync(string apiKey);

    // Get token usage from a response
    Task<TokenUsage> GetTokensFromResponseAsync(HttpResponseMessage response);

    // Add provider-specific headers if needed
    void AddHeaders(HttpRequestMessage request, string apiKey);
}
```
**Supported Implementations:**
- `OpenRouterProvider` (uses `/api/v1/models` + `x-total-tokens`)
- `GroqProvider` (uses Groq's pricing API + response headers)
- `OllamaProvider` (free tier, no pricing lookup, basic token counting)
- `OpenAIProvider` (uses OpenAI's model list + token counting)
- `GenericProvider` (fallback for any OpenAI-compatible endpoint)
**Configuration:**
Store provider selection in `anchor.config.json`:
```json
{
  "apiKey": "your-key",
  "model": "qwen3.5-27b",
  "endpoint": "https://openrouter.ai/api/v1",
  "provider": "openrouter"
}
```
Auto-detect provider from endpoint URL if not specified.
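The auto-detection can be as simple as substring matching on the endpoint URL (a sketch; `ProviderDetector` is an illustrative helper, and the provider names follow the config example above):

```csharp
using System;

static class ProviderDetector
{
    // Infer the provider name from the endpoint URL; "generic" is the fallback.
    public static string Detect(string? endpoint)
    {
        if (string.IsNullOrWhiteSpace(endpoint)) return "openrouter"; // default provider
        var url = endpoint.ToLowerInvariant();
        if (url.Contains("openrouter")) return "openrouter";
        if (url.Contains("groq")) return "groq";
        if (url.Contains("openai.com")) return "openai";
        if (url.Contains("localhost") || url.Contains("11434")) return "ollama";
        return "generic";
    }
}
```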
---
## Pricing System
### Current State
- Uses OpenRouter's `/api/v1/models` endpoint
- Displays pricing in a table during startup
- Only works when using OpenRouter
### Improved Behavior
**When endpoint matches known provider:**
1. Fetch pricing from that provider's API
2. Display pricing in the startup table
3. Show per-prompt costs in chat output
**When endpoint is generic/unsupported:**
1. Skip API call (no pricing lookup)
2. Display `---` or `$` placeholders
3. Optional: Show "Pricing not available" note
**User Feedback:**
- Show clear messaging: "Pricing data loaded from OpenRouter"
- Show: "Pricing not available for this endpoint" (for unsupported)
- Don't break chat functionality if pricing fails
### Pricing Data Format
Store in `ModelPricing` class:
```csharp
class ModelPricing
{
    public string ModelId { get; set; } = "";
    public decimal InputPricePerMTokens { get; set; }
    public decimal OutputPricePerMTokens { get; set; }
    public decimal? CacheCreationPricePerMTokens { get; set; } // if supported
}
```
---
## Token Tracking System
### Current State
- Uses `x-total-tokens` from OpenRouter headers
- Only works with OpenRouter responses
### Multi-Provider Strategy
**OpenRouter:**
- Use `x-total-tokens` header
- Use `x-response-timing` for latency tracking
**Groq:**
- Use `x-groq-tokens` header
- Use `x-groq-response-time` for latency
**OpenAI:**
- Use `x-ai-response-tokens` header (if available)
- Fall back to response body if needed
**Ollama:**
- No official token counting
- Use output length as proxy estimate
- Optional: Show message token estimates
**Generic/Fallback:**
- Parse `total_tokens` from response JSON
- Fall back to character count estimates
- Show placeholder when unavailable
### Integration Points
**During Chat Session:**
1. After each response, extract tokens from response headers
2. Store in `ChatSession.TokensUsed` object
3. Display in status bar: `Tokens: 128/2048 • Cost: $0.002`
**At Session End:**
1. Show summary: `Total tokens: 1,024 | Total cost: $0.015`
2. Write to session log or history file
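The cost figures shown in the status bar and session summary follow directly from the extracted token counts and the per-million-token prices (a sketch; `CostCalculator` is illustrative, and `requestPrice` covers providers that charge a fixed per-request fee):

```csharp
using System;

static class CostCalculator
{
    // Prices are USD per 1M tokens, as stored by the pricing providers.
    public static decimal CostFor(long inputTokens, long outputTokens,
                                  decimal inputPricePerM, decimal outputPricePerM,
                                  decimal requestPrice = 0m)
    {
        return inputTokens  * inputPricePerM  / 1_000_000m
             + outputTokens * outputPricePerM / 1_000_000m
             + requestPrice; // fixed per-request fee, if any
    }
}
```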
---
## Implementation Roadmap
### Phase 1: Conditional Pricing (Current Issues First)
- [ ] Check if endpoint is OpenRouter before fetching pricing
- [ ] Skip pricing API call for non-OpenRouter endpoints
- [ ] Show placeholder message if pricing not available
- [ ] **Time estimate:** 2 hours
### Phase 2: Provider Configuration
- [ ] Add `provider` field to `AnchorConfig` model
- [ ] Update `SetupTui` to ask "Which provider?" (openrouter, ollama, groq, etc.)
- [ ] Auto-detect provider from endpoint URL (smart default)
- [ ] Write provider to config file on setup
- [ ] **Time estimate:** 3 hours
### Phase 3: Provider Abstraction
- [ ] Create `IPricingProvider` interface
- [ ] Move existing `PricingProvider` to `OpenRouterProvider`
- [ ] Create `GenericPricingProvider` for fallback
- [ ] Add provider factory: `ProviderFactory.Create(providerName)`
- [ ] **Time estimate:** 5 hours
### Phase 4: Token Tracking Enhancement
- [ ] Create `ITokenTracker` interface
- [ ] Implement token extraction for multiple providers
- [ ] Display token usage in status bar
- [ ] Add per-prompt cost calculation
- [ ] **Time estimate:** 6 hours
### Phase 5: Second Provider Implementation
- [ ] Implement `GroqProvider` (similar to OpenRouter)
- [ ] Test with Groq API
- [ ] Update documentation
- [ ] **Time estimate:** 4 hours
### Phase 6: Future-Proofing (Optional)
- [ ] Add plugin system for custom providers
- [ ] Allow users to define custom pricing rules
- [ ] Support OpenRouter-compatible custom endpoints
- [ ] **Time estimate:** 8+ hours
---
## User Configuration Guide
### Automatic Setup
Run `/setup` in the chat or `anchor setup` in CLI:
```
Which provider are you using?
1) OpenRouter (qwen models)
2) Groq (qwen/gemma models)
3) Ollama (local models)
4) OpenAI (gpt models)
5) Custom endpoint
```
### Manual Configuration
Edit `anchor.config.json` directly:
```json
{
  "apiKey": "your-api-key",
  "model": "qwen3.5-27b",
  "endpoint": "https://api.groq.com/openai/v1",
  "provider": "groq"
}
```

The `provider` field is optional; it is auto-detected from the endpoint when omitted. (JSON does not allow comments, so the note lives here rather than in the file.)
### Environment Variables
For custom setup:
```
ANCHOR_ENDPOINT=https://api.groq.com/openai/v1
ANCHOR_PROVIDER=groq
ANCHOR_API_KEY=...
ANCHOR_MODEL=qwen3.5-27b
```
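The precedence rule is: environment variable over config file over built-in default (a sketch; `Resolve` is an illustrative helper, not an existing API):

```csharp
using System;

static class EnvOverrides
{
    // Environment variable wins; otherwise the config value; otherwise the default.
    public static string Resolve(string envName, string? configValue, string fallback)
    {
        var env = Environment.GetEnvironmentVariable(envName);
        if (!string.IsNullOrWhiteSpace(env)) return env;
        if (!string.IsNullOrWhiteSpace(configValue)) return configValue;
        return fallback;
    }
}
```

Usage would look like `EnvOverrides.Resolve("ANCHOR_ENDPOINT", cfg.Endpoint, "https://openrouter.ai/api/v1")`.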
---
## Known Limitations
### Tier 1 Providers (Full Support)
**✓ OpenRouter**
- Pricing: ✓ (native API)
- Tokens: ✓ (response headers)
- Cost tracking: ✓
**✓ Groq** (after Phase 4)
- Pricing: ✓ (will add)
- Tokens: ✓ (response headers)
- Cost tracking: ✓
### Tier 2 Providers (Partial Support)
**○ Ollama**
- Pricing: ○ (free, no lookup needed)
- Tokens: ○ (estimated from output)
- Cost tracking: ○ (placeholder)
**○ OpenAI**
- Pricing: ○ (manual pricing display)
- Tokens: ○ (header extraction)
- Cost tracking: ○ (config-based)
### Tier 3 Providers (Basic Support)
**□ Custom Endpoints**
- Pricing: □ (manual only)
- Tokens: □ (fallback parsing)
- Cost tracking: □ (user-defined)
---
## Future Enhancements
1. **Pricing Database**: Maintain own pricing database (like OpenRouter's)
2. **Cost Estimator**: Predict costs before sending message
3. **Usage Alerts**: Warn user when approaching budget limits
4. **Multi-Model Support**: Compare costs between different providers
5. **Plugin System**: Allow community to add new providers
---
## Success Criteria
- ✅ Users can choose from 3+ providers in setup
- ✅ Pricing displays only for supported endpoints
- ✅ Token tracking works for all Tier 1 providers
- ✅ No breaking changes to existing OpenRouter users
- ✅ Clear documentation on what each provider supports
- ✅ Graceful degradation for unsupported features
---
*Last Updated: 2025-12-23*

View File

@@ -1,4 +1,5 @@
using System.ClientModel;
using AnchorCli.Providers;
using Microsoft.Extensions.AI;
using OpenAI;
using AnchorCli;
@@ -15,10 +16,11 @@ if (args.Length > 0 && args[0].Equals("setup", StringComparison.OrdinalIgnoreCas
}
// ── Config ──────────────────────────────────────────────────────────────
const string endpoint = "https://openrouter.ai/api/v1";
var cfg = AnchorConfig.Load();
string apiKey = cfg.ApiKey;
string model = cfg.Model;
string provider = cfg.Provider ?? "openrouter";
string endpoint = cfg.Endpoint ?? "https://openrouter.ai/api/v1";
if (string.IsNullOrWhiteSpace(apiKey))
{
@@ -26,11 +28,14 @@ if (string.IsNullOrWhiteSpace(apiKey))
return;
}
// ── Fetch model pricing from OpenRouter ─────────────────────────────────
var pricingProvider = new PricingProvider();
var tokenTracker = new TokenTracker();
// ── Create token extractor for this provider ───────────────────────────
var tokenExtractor = ProviderFactory.CreateTokenExtractorForEndpoint(endpoint);
var tokenTracker = new TokenTracker { Provider = tokenExtractor.ProviderName };
// ── Fetch model pricing (only for supported providers) ─────────────────
ModelInfo? modelInfo = null;
if (ProviderFactory.IsOpenRouter(endpoint))
{
await AnsiConsole.Status()
.Spinner(Spinner.Known.BouncingBar)
.SpinnerStyle(Style.Parse("cornflowerblue"))
@@ -38,6 +43,7 @@ await AnsiConsole.Status()
{
try
{
var pricingProvider = new OpenRouterProvider();
modelInfo = await pricingProvider.GetModelInfoAsync(model);
if (modelInfo?.Pricing != null)
{
@@ -48,6 +54,7 @@ await AnsiConsole.Status()
}
catch { /* pricing is best-effort */ }
});
}
// ── Pretty header ───────────────────────────────────────────────────────
AnsiConsole.Write(
@@ -68,9 +75,12 @@ var infoTable = new Table()
.AddColumn(new TableColumn("[dim]Value[/]"));
infoTable.AddRow("[grey]Model[/]", $"[cyan]{Markup.Escape(modelInfo?.Name ?? model)}[/]");
infoTable.AddRow("[grey]Endpoint[/]", $"[blue]OpenRouter[/]");
infoTable.AddRow("[grey]Provider[/]", $"[blue]{tokenExtractor.ProviderName}[/]");
infoTable.AddRow("[grey]Endpoint[/]", $"[dim]{endpoint}[/]");
infoTable.AddRow("[grey]CWD[/]", $"[green]{Markup.Escape(Environment.CurrentDirectory)}[/]");
if (modelInfo?.Pricing != null)
{
var inM = tokenTracker.InputPrice * 1_000_000m;

View File

@@ -0,0 +1,89 @@
using System.Net.Http.Headers;
using System.Text.Json;
namespace AnchorCli.Providers;
/// <summary>
/// Generic token extractor for any OpenAI-compatible endpoint.
/// Tries common header names and JSON body parsing.
/// </summary>
internal sealed class GenericTokenExtractor : ITokenExtractor
{
    public string ProviderName => "Generic";

    public (int inputTokens, int outputTokens)? ExtractTokens(HttpResponseHeaders headers, string? responseBody)
    {
        // Try various common header names
        var headerNames = new[]
        {
            "x-total-tokens",
            "x-ai-response-tokens",
            "x-tokens",
            "x-prompt-tokens",
            "x-completion-tokens"
        };
        foreach (var headerName in headerNames)
        {
            if (headers.TryGetValues(headerName, out var values))
            {
                if (int.TryParse(values.FirstOrDefault(), out var tokens))
                {
                    // Assume all tokens are output if we can't determine split
                    return (0, tokens);
                }
            }
        }

        // Fallback: try parsing from response body JSON
        if (!string.IsNullOrEmpty(responseBody))
        {
            try
            {
                using var doc = JsonDocument.Parse(responseBody);
                var root = doc.RootElement;
                // Try standard OpenAI format: usage.prompt_tokens, usage.completion_tokens
                if (root.TryGetProperty("usage", out var usage))
                {
                    var prompt = usage.TryGetProperty("prompt_tokens", out var p) ? p.GetInt32() : 0;
                    var completion = usage.TryGetProperty("completion_tokens", out var c) ? c.GetInt32() : 0;
                    if (prompt > 0 || completion > 0)
                    {
                        return (prompt, completion);
                    }
                }
            }
            catch
            {
                // Ignore parsing errors
            }
        }
        return null;
    }

    public int? ExtractLatency(HttpResponseHeaders headers)
    {
        // Try various common latency headers
        var headerNames = new[]
        {
            "x-response-time",
            "x-response-timing",
            "x-latency-ms",
            "x-duration-ms"
        };
        foreach (var headerName in headerNames)
        {
            if (headers.TryGetValues(headerName, out var values))
            {
                if (int.TryParse(values.FirstOrDefault(), out var latency))
                {
                    return latency;
                }
            }
        }
        return null;
    }
}

Providers/GroqProvider.cs Normal file
View File

@@ -0,0 +1,61 @@
using System.Net.Http.Headers;
namespace AnchorCli.Providers;
/// <summary>
/// Token extractor for Groq responses.
/// </summary>
internal sealed class GroqTokenExtractor : ITokenExtractor
{
    public string ProviderName => "Groq";

    public (int inputTokens, int outputTokens)? ExtractTokens(HttpResponseHeaders headers, string? responseBody)
    {
        // Groq provides x-groq-tokens header (format: "n;<prompt_tokens>,n;<completion_tokens>")
        if (headers.TryGetValues("x-groq-tokens", out var values))
        {
            var tokenStr = values.FirstOrDefault();
            if (!string.IsNullOrEmpty(tokenStr))
            {
                // Parse format: "n;123,n;45" where first is prompt, second is completion
                var parts = tokenStr.Split(',');
                if (parts.Length >= 2)
                {
                    var inputPart = parts[0].Trim();
                    var outputPart = parts[1].Trim();
                    // Extract numbers after "n;"
                    if (inputPart.StartsWith("n;") && outputPart.StartsWith("n;"))
                    {
                        if (int.TryParse(inputPart[2..], out var input) &&
                            int.TryParse(outputPart[2..], out var output))
                        {
                            return (input, output);
                        }
                    }
                }
            }
        }

        // Fallback: try parsing from response body
        if (!string.IsNullOrEmpty(responseBody))
        {
            // TODO: Parse usage from JSON body if headers aren't available
        }
        return null;
    }

    public int? ExtractLatency(HttpResponseHeaders headers)
    {
        if (headers.TryGetValues("x-groq-response-time", out var values))
        {
            if (int.TryParse(values.FirstOrDefault(), out var latency))
            {
                return latency;
            }
        }
        return null;
    }
}

View File

@@ -0,0 +1,18 @@
using AnchorCli.OpenRouter;
namespace AnchorCli.Providers;
/// <summary>
/// Interface for fetching model pricing information.
/// </summary>
internal interface IPricingProvider
{
    /// <summary>
    /// Fetches pricing info for a specific model.
    /// </summary>
    Task<ModelInfo?> GetModelInfoAsync(string modelId, CancellationToken ct = default);

    /// <summary>
    /// Fetches all available models with pricing.
    /// </summary>
    Task<Dictionary<string, ModelInfo>> GetAllModelsAsync(CancellationToken ct = default);
}

View File

@@ -0,0 +1,25 @@
using System.Net.Http.Headers;
namespace AnchorCli.Providers;
/// <summary>
/// Interface for extracting token usage from provider responses.
/// </summary>
internal interface ITokenExtractor
{
    /// <summary>
    /// Extracts token usage from response headers and/or body.
    /// Returns (inputTokens, outputTokens) or null if unavailable.
    /// </summary>
    (int inputTokens, int outputTokens)? ExtractTokens(HttpResponseHeaders headers, string? responseBody);

    /// <summary>
    /// Gets the latency from response headers (in ms).
    /// </summary>
    int? ExtractLatency(HttpResponseHeaders headers);

    /// <summary>
    /// Gets the provider name for display purposes.
    /// </summary>
    string ProviderName { get; }
}

View File

@@ -0,0 +1,39 @@
using System.Net.Http.Headers;
namespace AnchorCli.Providers;
/// <summary>
/// Token extractor for Ollama responses.
/// Ollama doesn't provide official token counts, so we estimate.
/// </summary>
internal sealed class OllamaTokenExtractor : ITokenExtractor
{
    public string ProviderName => "Ollama";

    public (int inputTokens, int outputTokens)? ExtractTokens(HttpResponseHeaders headers, string? responseBody)
    {
        // Ollama doesn't provide token headers
        return null;
    }

    public int? ExtractLatency(HttpResponseHeaders headers)
    {
        // Ollama doesn't provide latency headers
        return null;
    }

    /// <summary>
    /// Estimates token count from text length (rough approximation).
    /// Assumes ~4 characters per token on average.
    /// </summary>
    public static int EstimateTokens(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return 0;
        }
        // Rough estimate: 4 characters per token
        return text.Length / 4;
    }
}

View File

@@ -0,0 +1,40 @@
using System.Net.Http.Json;
using System.Text.Json;
using AnchorCli.OpenRouter;
namespace AnchorCli.Providers;
/// <summary>
/// Pricing provider for OpenRouter API.
/// </summary>
internal sealed class OpenRouterProvider : IPricingProvider
{
    private const string ModelsUrl = "https://openrouter.ai/api/v1/models";
    private static readonly HttpClient Http = new();
    private Dictionary<string, ModelInfo>? _models;

    static OpenRouterProvider()
    {
        OpenRouterHeaders.ApplyTo(Http);
    }

    public async Task<Dictionary<string, ModelInfo>> GetAllModelsAsync(CancellationToken ct = default)
    {
        if (_models != null) return _models;
        var response = await Http.GetAsync(ModelsUrl, ct);
        response.EnsureSuccessStatusCode();
        var json = await response.Content.ReadAsStringAsync(ct);
        var result = JsonSerializer.Deserialize(json, AppJsonContext.Default.ModelsResponse);
        _models = result?.Data?.ToDictionary(m => m.Id) ?? [];
        return _models;
    }

    public async Task<ModelInfo?> GetModelInfoAsync(string modelId, CancellationToken ct = default)
    {
        var models = await GetAllModelsAsync(ct);
        return models.GetValueOrDefault(modelId);
    }
}

View File

@@ -0,0 +1,42 @@
using System.Net.Http.Headers;
namespace AnchorCli.Providers;
/// <summary>
/// Token extractor for OpenRouter responses.
/// </summary>
internal sealed class OpenRouterTokenExtractor : ITokenExtractor
{
    public string ProviderName => "OpenRouter";

    public (int inputTokens, int outputTokens)? ExtractTokens(HttpResponseHeaders headers, string? responseBody)
    {
        // OpenRouter provides x-total-tokens header
        if (headers.TryGetValues("x-total-tokens", out var values))
        {
            // Note: OpenRouter only provides total tokens, not split
            // We'll estimate split based on typical ratios if needed
            if (long.TryParse(values.FirstOrDefault(), out var total))
            {
                // For now, return total as output (placeholder until we have better splitting)
                // In practice, you'd need to track input separately from the request
                return (0, (int)total);
            }
        }
        return null;
    }

    public int? ExtractLatency(HttpResponseHeaders headers)
    {
        if (headers.TryGetValues("x-response-timing", out var values))
        {
            if (int.TryParse(values.FirstOrDefault(), out var latency))
            {
                return latency;
            }
        }
        return null;
    }
}

View File

@@ -0,0 +1,70 @@
namespace AnchorCli.Providers;
/// <summary>
/// Factory for creating provider instances based on endpoint or provider name.
/// </summary>
internal static class ProviderFactory
{
    /// <summary>
    /// Creates a token extractor based on the provider name.
    /// </summary>
    public static ITokenExtractor CreateTokenExtractor(string providerName)
    {
        return providerName.ToLowerInvariant() switch
        {
            "openrouter" => new OpenRouterTokenExtractor(),
            "groq" => new GroqTokenExtractor(),
            "ollama" => new OllamaTokenExtractor(),
            _ => new GenericTokenExtractor()
        };
    }

    /// <summary>
    /// Creates a token extractor by auto-detecting from the endpoint URL.
    /// </summary>
    public static ITokenExtractor CreateTokenExtractorForEndpoint(string endpoint)
    {
        if (string.IsNullOrEmpty(endpoint))
        {
            return new GenericTokenExtractor();
        }
        var url = endpoint.ToLowerInvariant();
        if (url.Contains("openrouter"))
        {
            return new OpenRouterTokenExtractor();
        }
        if (url.Contains("groq"))
        {
            return new GroqTokenExtractor();
        }
        if (url.Contains("ollama") || url.Contains("localhost") || url.Contains("127.0.0.1"))
        {
            return new OllamaTokenExtractor();
        }
        return new GenericTokenExtractor();
    }

    /// <summary>
    /// Creates a pricing provider based on the provider name.
    /// Only OpenRouter has a pricing API currently.
    /// </summary>
    public static IPricingProvider? CreatePricingProvider(string providerName)
    {
        return providerName.ToLowerInvariant() switch
        {
            "openrouter" => new OpenRouterProvider(),
            _ => null // Other providers don't have pricing APIs yet
        };
    }

    /// <summary>
    /// Determines if an endpoint is OpenRouter.
    /// </summary>
    public static bool IsOpenRouter(string endpoint) =>
        !string.IsNullOrEmpty(endpoint) && endpoint.Contains("openrouter", StringComparison.OrdinalIgnoreCase);
}

View File

@@ -27,10 +27,85 @@ internal static class SetupTui
AnsiConsole.WriteLine();
// ── Provider ────────────────────────────────────────────────────
var providers = new List<(string Value, string Description)>
{
("openrouter", "default, pricing support"),
("groq", "high-speed inference"),
("ollama", "local, no auth required"),
("openai", "official OpenAI API"),
("custom", "generic OpenAI-compatible endpoint")
};
string currentProvider = config.Provider ?? "openrouter";
AnsiConsole.MarkupLine($" Current provider: [cyan]{Markup.Escape(currentProvider)}[/]");
var selectedProviderChoice = AnsiConsole.Prompt(
new SelectionPrompt<(string Value, string Description)>()
.Title(" Select a provider:")
.UseConverter(p => p.Value + (string.IsNullOrEmpty(p.Description) ? "" : $" [dim]({p.Description})[/]"))
.AddChoices(providers));
config.Provider = selectedProviderChoice.Value;
if (config.Provider == "custom")
{
string customEndpoint = AnsiConsole.Prompt(
new TextPrompt<string>(" Enter endpoint URL:")
.DefaultValue(config.Endpoint)
.AllowEmpty());
if (!string.IsNullOrWhiteSpace(customEndpoint))
{
config.Endpoint = customEndpoint.Trim();
}
}
else
{
config.Endpoint = config.Provider.ToLowerInvariant() switch
{
"openrouter" => "https://openrouter.ai/api/v1",
"groq" => "https://api.groq.com/openai/v1",
"ollama" => "http://localhost:11434/v1",
"openai" => "https://api.openai.com/v1",
_ => config.Endpoint
};
}
AnsiConsole.WriteLine();
// ── Model ─────────────────────────────────────────────────────
AnsiConsole.MarkupLine($" Current model: [cyan]{Markup.Escape(config.Model)}[/]");
var models = new List<(string Value, string Description)>
var models = config.Provider.ToLowerInvariant() switch
{
"groq" => new List<(string Value, string Description)>
{
("llama-3.3-70b-versatile", "fast, powerful"),
("llama-3.1-8b-instant", "very fast"),
("mixtral-8x7b-32768", "sparse MoE"),
("gemma2-9b-it", "Google's Gemma"),
("Custom...", "")
},
"ollama" => new List<(string Value, string Description)>
{
("llama3.2", "Meta's Llama 3.2"),
("qwen2.5", "Alibaba Qwen"),
("mistral", "Mistral AI"),
("codellama", "code-focused"),
("Custom...", "")
},
"openai" => new List<(string Value, string Description)>
{
("gpt-4o", "most capable"),
("gpt-4o-mini", "fast, affordable"),
("o1-preview", "reasoning model"),
("Custom...", "")
},
_ => new List<(string Value, string Description)>
{
("qwen/qwen3.5-397b-a17b", "smart, expensive"),
("qwen/qwen3.5-122b-a10b", "faster"),
@@ -38,6 +113,7 @@ internal static class SetupTui
("qwen/qwen3.5-flash-02-23", "cloud, fast"),
("qwen/qwen3.5-plus-02-15", "cloud, smart"),
("Custom...", "")
}
};
string selectedModel = AnsiConsole.Prompt(

File diff suppressed because it is too large

docs/NEW_SYSTEM_DESIGN.md Normal file
View File

@@ -0,0 +1,134 @@
# Advanced AI Agent CLI System Design
This document outlines the architecture for a completely new, built-from-scratch AI Agent Command Line Interface system, inspired by the lessons learned from the `Anchor CLI` refactoring.
## 1. Core Principles
* **Event-Driven UI & Decoupled State:** The UI and display layers communicate exclusively through an asynchronous Event Bus.
* **Explicit Control Flow:** Core agent execution utilizes a Mediator pattern (Request/Response) for predictable, traceable control flow rather than pure event spaghetti.
* **Dependency Injection:** A robust IoC container manages lifecycles and dependencies.
* **Pluggable Architecture:** Everything—from the LLM provider to the UI renderer and memory storage—is an injectable plugin.
* **Stateless Components:** Services maintain minimal internal state. State is managed centrally in a session or context store with immutable snapshots.
* **Test-First Design:** Complete absence of static delegates and global mutable state ensures every component is unit-testable in isolation.
* **Pervasive Cancellation:** Every asynchronous operation accepts a `CancellationToken` for graceful termination.
## 2. High-Level Architecture & Project Structure (AOT-Ready)
The system is structurally divided into three distinct C# projects to enforce decoupling, testability, and future-proof design, while maintaining strict compatibility with **.NET Native AOT** compilation for single-file, zero-dependency distribution on Linux/Windows.
### 2.1 Project: `Anchor.AgentFramework` (Class Library)
The core logic and abstractions. It has **no knowledge** of the console, the file system, or specific LLM SDKs.
* **Contains:** Interfaces (`IEventBus`, `IMediator`, `IAgentAvatar`), Memory Management (`ISessionManager`), Execution Loop (`ChatCoordinator`), and the `ToolRunner`.
* **Responsibilities:** Orchestrating the agent's thought process, managing state, and firing events.
### 2.2 Project: `Anchor.Providers` (Class Library)
The vendor-specific implementations for Language Models.
* **Contains:** `OpenAIAvatar`, `AnthropicAvatar`.
* **Responsibilities:** Translating the framework's semantic requests into vendor-specific API calls (e.g., mapping `ToolResult` to OpenAI's tool response format) via SDKs like `Azure.AI.OpenAI`.
### 2.3 Project: `Anchor.Cli` (Console Application)
The "Hosting Shell" and the physical "Senses/Hands" of the application.
* **Contains:** `Program.cs` (Composition Root), `RichConsoleRenderer`, `ConsoleInputDispatcher`, and concrete Tool implementations (e.g., `FileSystemTool`, `CmdTool`).
* **Responsibilities:** Wiring up Dependency Injection, reading from stdin, rendering UI/spinners to stdout, and executing side-effects on the host OS.
### 2.4 Logical Layers
Across these projects, the system operates in five primary layers:
1. **Hosting & Lifecycle (The Host)**
2. **Event & Messaging Backbone (The Bus)**
3. **State & Memory Management (The Brain)**
4. **I/O & User Interface (The Senses & Voice)**
5. **Execution & Tooling (The Hands)**
### 2.5 Dependency Injection Graph
```text
Anchor.Cli (Composition Root - Program.cs)
├── IEventBus → AsyncEventBus
├── IMemoryStore → VectorMemoryStore / SQLiteMemoryStore
├── ISessionManager → ContextAwareSessionManager
│ └── ICompactionStrategy → SemanticCompactionStrategy
├── IUserInputDispatcher → ConsoleInputDispatcher
├── ICommandRegistry → DynamicCommandRegistry
├── IAgentAvatar (LLM Interface) → AnthropicAvatar / OpenAIAvatar
├── IResponseStreamer → TokenAwareResponseStreamer
├── IUiRenderer → RichConsoleRenderer
│ ├── ISpinnerManager → AsyncSpinnerManager
│ └── IStreamingRenderer → ConsoleStreamingRenderer
└── IToolRegistry → DynamicToolRegistry
└── (Injected Tools: FileSystemTool, CmdTool, WebSearchTool)
```
## 3. Component Details
### 3.1 The Messaging Backbone: `IEventBus` and `IMediator` (AOT Safe)
The system utilizes a dual-messaging approach to prevent "event spaghetti":
* **Publish-Subscribe (Events):** Used for things that *happened* and might have multiple or zero listeners (e.g., UI updates, diagnostics).
* `EventBus.PublishAsync(EventBase @event)`
* **Request-Response (Commands):** Used for linear, required actions with a return value.
* `Mediator.Send(IRequest<TResponse> request)`
> [!WARNING]
> Standard `MediatR` relies heavily on runtime reflection for handler discovery, making it **incompatible with Native AOT**. We must use an AOT-safe source-generated alternative, such as the [Mediator](https://github.com/martinothamar/Mediator) library, or implement a simple, source-generated Event/Command bus internally.
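For illustration, a reflection-free publish-subscribe bus can be hand-rolled in a few lines: handlers register as explicit delegates at the composition root, so nothing needs runtime discovery (a sketch of the internal-bus option, not the Mediator library's API):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Minimal AOT-safe pub/sub: subscriptions are explicit delegates, no reflection.
sealed class AsyncEventBus
{
    private readonly ConcurrentDictionary<Type, List<Func<object, Task>>> _handlers = new();

    public void Subscribe<TEvent>(Func<TEvent, Task> handler)
    {
        var list = _handlers.GetOrAdd(typeof(TEvent), _ => new List<Func<object, Task>>());
        lock (list) list.Add(e => handler((TEvent)e));
    }

    public async Task PublishAsync<TEvent>(TEvent @event)
    {
        if (!_handlers.TryGetValue(typeof(TEvent), out var list)) return;
        Func<object, Task>[] snapshot;
        lock (list) snapshot = list.ToArray(); // copy so publishing tolerates concurrent subscribes
        foreach (var handler in snapshot)
            await handler(@event!); // sequential dispatch; fire-and-forget is a policy choice
    }
}
```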
**Key Events (Pub/Sub):**
* `UserInputReceived`: Triggered when the user hits Enter.
* `LLMStreamDeltaReceived`: Emitted for token-by-token streaming to the UI.
* `ToolExecutionStarted` / `ToolExecutionCompleted`: Emitted for UI spinners and logging.
* `ContextLimitWarning`: High token usage indicator.
**Key Commands (Request/Response):**
* `ExecuteToolCommand`: Sent from the Avatar to the Tool Runner, returns a `ToolResult`.
### 3.2 The Brain: `ISessionManager` & Memory
Instead of just a simple list of messages, the new system uses a multi-tiered memory architecture with thread-safe access.
* **Short-Term Memory (Context Window):** The active conversation. Must yield **Immutable Context Snapshots** to prevent collection modification exceptions when tools/LLM run concurrently with background tasks.
* **Long-Term Memory (Vector DB):** Indexed facts, summaries, and user preferences.
* **ICompactionStrategy:**
Instead of implicitly using an LLM on the critical path, the system uses tiered, deterministic strategies:
1. **Sliding Window:** Automatically drop the oldest user/assistant message pairs.
2. **Tool Output Truncation:** Remove large file reads from old turns.
3. **LLM Summarization (Optional):** As a last resort, explicitly lock state and summarize old context into a "Context Digest".
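The sliding-window tier can be a pure function over an immutable snapshot (a sketch assuming messages are simple role/content pairs; the real `ISessionManager` types may differ):

```csharp
using System.Collections.Generic;
using System.Linq;

// Sliding-window compaction: keep system messages, drop the oldest other turns
// until the message count fits the budget. Deterministic, no LLM on the critical path.
static class SlidingWindow
{
    public static IReadOnlyList<(string Role, string Content)> Compact(
        IReadOnlyList<(string Role, string Content)> messages, int maxMessages)
    {
        if (messages.Count <= maxMessages) return messages;
        var system = messages.Where(m => m.Role == "system").ToList();
        var rest = messages.Where(m => m.Role != "system").ToList();
        var keep = maxMessages - system.Count;
        // Skip the oldest non-system messages, keep the most recent `keep`.
        return system.Concat(rest.Skip(rest.Count - keep)).ToList();
    }
}
```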
### 3.3 The Senses & Voice: Event-Driven CLI UI
The UI is strictly separated from business logic, which is an ideal architecture for a dedicated CLI tool. The `RichConsoleRenderer` only listens to the `IEventBus`.
* **Input Loop:** `IUserInputDispatcher` sits in a loop reading stdin. When input is received, it fires `UserInputReceived`. It captures `Ctrl+C` to trigger a global `CancellationToken`.
* **Output Loop:** `IUiRenderer` subscribes to `LLMStreamDeltaReceived` and renders tokens. It subscribes to `ToolExecutionStarted` and spins up a dedicated UI spinner, preventing async console output from overwriting the active prompt.
* **Headless CLI Mode:** For CI/CD environments or scripting, the system can run non-interactively by simply swapping the `RichConsoleRenderer` with a `BasicLoggingRenderer`—the core agent logic remains untouched.
### 3.4 The Hands: Plugins and Tooling
Tools are no longer hardcoded.
* **IToolRegistry:** Discovers tools at startup. To stay compatible with the Native AOT goal in Section 5, discovery should use explicit or source-generated registration rather than runtime reflection or assembly scanning.
* **Tool Execution:** When the LLM API returns a `tool_calls` stop reason, the `IAgentAvatar` iteratively or concurrently sends an `ExecuteToolCommand` via the Mediator. It directly awaits the results, appends them to the context snapshot, and resumes the LLM generation. This provides explicit, traceable control flow.
* **Cancellation:** Every async method across the entire system accepts a `CancellationToken` to allow graceful termination of infinite loops or runaway processes.
## 4. Execution Flow (Anatomy of a User Turn)
1. **Input:** User types "Find the bug in main.py".
2. **Dispatch:** `ConsoleInputDispatcher` reads it and publishes `UserInputReceived`.
3. **Routing:** Built-in command handler (if applicable) checks if it's a structural command (`/clear`, `/exit`). Otherwise `SessionManager` adds it to the active context.
4. **Inference:** A `ChatCoordinator` service reacts to the updated context and asks the `IAgentAvatar` for a response.
5. **Streaming:** The Avatar calls the Anthropic/OpenAI API. As tokens arrive, it publishes `LLMStreamDeltaReceived`.
6. **Rendering:** `RichConsoleRenderer` receives the deltas and prints them to the terminal.
7. **Tool Request:** The LLM API returns a tool call. The Avatar dispatches an `ExecuteToolCommand` via the Mediator. The EventBus also publishes a `ToolExecutionStarted` event for the UI spinner.
8. **Execution & Feedback:** `ToolRunner` handles the command, runs it safely with the `CancellationToken`, and returns the result back to the Avatar. The Avatar feeds this back to the LLM API automatically.
9. **Completion:** The turn ends. The `SessionManager` checks token bounds and runs compaction if necessary.
## 5. Conclusion (Native AOT Focus)
While `ARCHITECTURE_REFACTOR.md` focuses on migrating a legacy "God Class", this new design assumes a green-field, **AOT-first** approach.
To achieve true Native AOT, we must strictly avoid runtime reflection. This means:
1. Using `CreateSlimBuilder()` instead of `CreateDefaultBuilder()` in `Microsoft.Extensions.Hosting`.
2. Using Source Generators for Dependency Injection setup.
3. Using Source Generators for JSON Serialization (`System.Text.Json.Serialization.JsonSerializableAttribute`).
4. Replacing reflection-heavy libraries like `MediatR` and `Scrutor` with AOT-friendly source-generated alternatives.
By adhering to these constraints, the resulting single-binary Linux executable will have near-instant startup time and a dramatically reduced memory footprint compared to a standard JIT-compiled .NET application.
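As an example of point 3, a `JsonSerializerContext` subclass makes `System.Text.Json` emit (de)serialization code at compile time, so no reflection is needed at runtime (`AgentConfig` is a hypothetical config type used only for illustration):

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed record AgentConfig(string Model, string Endpoint);

// The source generator emits serializers for every [JsonSerializable] type.
[JsonSerializable(typeof(AgentConfig))]
public sealed partial class AgentJsonContext : JsonSerializerContext { }

public static class ConfigJson
{
    public static string Serialize(AgentConfig cfg) =>
        JsonSerializer.Serialize(cfg, AgentJsonContext.Default.AgentConfig);

    public static AgentConfig? Deserialize(string json) =>
        JsonSerializer.Deserialize(json, AgentJsonContext.Default.AgentConfig);
}
```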

docs/ToolConsolidation.md Normal file
View File

@@ -0,0 +1,112 @@
# Tool Consolidation Ideas
This document outlines opportunities to merge similar tools to simplify the API.
## 1. File Write Operations
**Current tools:** `CreateFile`, `InsertAfter`, `AppendToFile`
**Proposed merge:** `WriteToFile`
```csharp
public static string WriteToFile(
string path,
string[] content,
string? mode = "create",
string? anchor = null)
```
**Behavior:**
- `mode="create"` - Creates new file (error if exists)
- `mode="append"` - Appends to EOF (creates if missing)
- `mode="insert"` - Inserts after anchor (requires existing file)
**Benefits:**
- Reduces 3 tools to 1
- Cleaner API for LLM
- Unified error handling
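A sketch of how the merged tool's dispatch could look (error messages and anchor semantics are illustrative, not the final API):

```csharp
using System;
using System.IO;
using System.Linq;

static class FileWriter
{
    // Unified write tool: dispatch on mode, share validation and error handling.
    public static string WriteToFile(string path, string[] content,
                                     string mode = "create", string? anchor = null)
    {
        var text = string.Join(Environment.NewLine, content);
        switch (mode)
        {
            case "create":
                if (File.Exists(path)) return $"Error: {path} already exists";
                Directory.CreateDirectory(Path.GetDirectoryName(Path.GetFullPath(path))!);
                File.WriteAllText(path, text);
                return $"Created {path}";
            case "append":
                File.AppendAllText(path, text + Environment.NewLine); // creates if missing
                return $"Appended to {path}";
            case "insert":
                if (!File.Exists(path)) return $"Error: {path} not found";
                if (string.IsNullOrEmpty(anchor)) return "Error: insert mode requires an anchor";
                var lines = File.ReadAllLines(path).ToList();
                var idx = lines.FindIndex(l => l.Contains(anchor));
                if (idx < 0) return "Error: anchor not found";
                lines.InsertRange(idx + 1, content);
                File.WriteAllLines(path, lines);
                return $"Inserted after anchor in {path}";
            default:
                return $"Error: unknown mode '{mode}'";
        }
    }
}
```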
## 2. File Move Operations
**Current tools:** `RenameFile`, `CopyFile`
**Proposed merge:** `MoveFile`
```csharp
public static string MoveFile(
string sourcePath,
string destinationPath,
bool copy = false)
```
**Behavior:**
- `copy=false` - Moves file (current RenameFile behavior)
- `copy=true` - Copies file (current CopyFile behavior)
**Benefits:**
- 90% identical logic
- Only difference is File.Move vs File.Copy
- Both create parent directories
- Similar error handling patterns
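A sketch showing how little the two paths differ once merged (illustrative only):

```csharp
using System.IO;

static class FileMover
{
    // One tool, one flag: the only branch is File.Copy vs File.Move.
    public static string MoveFile(string sourcePath, string destinationPath, bool copy = false)
    {
        if (!File.Exists(sourcePath)) return $"Error: {sourcePath} not found";
        // Both variants create parent directories, as the current tools do.
        Directory.CreateDirectory(Path.GetDirectoryName(Path.GetFullPath(destinationPath))!);
        if (copy) File.Copy(sourcePath, destinationPath, overwrite: true);
        else File.Move(sourcePath, destinationPath, overwrite: true);
        return copy ? $"Copied to {destinationPath}" : $"Moved to {destinationPath}";
    }
}
```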
## 3. Grep Operations
**Current tools:** `GrepFile`, `GrepRecursive`
**Proposed merge:** `Grep`
```csharp
public static string Grep(
string path,
string pattern,
bool recursive = false,
string? filePattern = null)
```
**Behavior:**
- `recursive=false` - Searches single file (current GrepFile)
- `recursive=true` - Searches directory recursively (current GrepRecursive)
- `filePattern` - Optional glob to filter files when recursive
**Benefits:**
- Very similar logic
- Reduces 2 tools to 1
- Cleaner API for LLM
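A sketch of the unified implementation (regex matching; `filePattern` maps onto the glob accepted by `Directory.EnumerateFiles`):

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

static class GrepTool
{
    // Single entry point: one file, or a recursive walk filtered by glob.
    public static string Grep(string path, string pattern,
                              bool recursive = false, string? filePattern = null)
    {
        var regex = new Regex(pattern);
        var files = recursive
            ? Directory.EnumerateFiles(path, filePattern ?? "*", SearchOption.AllDirectories)
            : new[] { path };
        var sb = new StringBuilder();
        foreach (var file in files)
        {
            var lines = File.ReadAllLines(file);
            for (int i = 0; i < lines.Length; i++)
                if (regex.IsMatch(lines[i]))
                    sb.AppendLine($"{file}:{i + 1}: {lines[i]}"); // 1-based line numbers
        }
        return sb.ToString();
    }
}
```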
## 4. Delete Operations
**Current tools:** `DeleteFile`, `DeleteDir`
**Proposed merge:** `Delete`
```csharp
public static string Delete(
string path,
bool recursive = true)
```
**Behavior:**
- Auto-detects if path is file or directory
- `recursive=true` - Delete directory and all contents
- `recursive=false` - Only matters for directories (error if not empty)
**Benefits:**
- Auto-detects file vs directory
- Similar error handling patterns
- Reduces 2 tools to 1
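A sketch of the auto-detection (illustrative):

```csharp
using System.IO;

static class DeleteTool
{
    // Auto-detect file vs directory; `recursive` only applies to directories.
    public static string Delete(string path, bool recursive = true)
    {
        if (File.Exists(path))
        {
            File.Delete(path);
            return $"Deleted file {path}";
        }
        if (Directory.Exists(path))
        {
            Directory.Delete(path, recursive); // throws if non-empty and recursive=false
            return $"Deleted directory {path}";
        }
        return $"Error: {path} not found";
    }
}
```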
## Summary
These consolidations would reduce the tool count from 17 to 12 tools, making the API simpler and easier for the LLM to use effectively.
**High priority merges:**
1. ✅ File Write Operations (3 → 1)
2. ✅ File Move Operations (2 → 1)
3. ✅ Grep Operations (2 → 1)
4. ✅ Delete Operations (2 → 1)
**Kept separate:**
- `ReadFile` - distinct read-only operation
- `ListDir`, `FindFiles`, `GetFileInfo` - different purposes
- `CreateDir` - simple enough to keep standalone
- `ReplaceLines`, `InsertAfter`, `DeleteRange` - too complex to merge without confusing LLM