AnchorCli/docs/NEW_SYSTEM_DESIGN.md

Advanced AI Agent CLI System Design

This document outlines the architecture for a new AI Agent Command Line Interface (CLI) system built from scratch, informed by lessons learned from the Anchor CLI refactoring.

1. Core Principles

  • Event-Driven UI & Decoupled State: The UI and display layers communicate exclusively through an asynchronous Event Bus.
  • Explicit Control Flow: Core agent execution utilizes a Mediator pattern (Request/Response) for predictable, traceable control flow rather than pure event spaghetti.
  • Dependency Injection: A robust IoC container manages lifecycles and dependencies.
  • Pluggable Architecture: Everything—from the LLM provider to the UI renderer and memory storage—is an injectable plugin.
  • Stateless Components: Services maintain minimal internal state. State is managed centrally in a session or context store with immutable snapshots.
  • Test-First Design: Avoiding static delegates and global mutable state ensures every component is unit-testable in isolation.
  • Pervasive Cancellation: Every asynchronous operation accepts a CancellationToken for graceful termination.

2. High-Level Architecture & Project Structure (AOT-Ready)

The system is structurally divided into three distinct C# projects to enforce decoupling, testability, and future-proof design, while maintaining strict compatibility with .NET Native AOT compilation for single-file, zero-dependency distribution on Linux/Windows.

2.1 Project: Anchor.AgentFramework (Class Library)

The core logic and abstractions. It has no knowledge of the console, the file system, or specific LLM SDKs.

  • Contains: Interfaces (IEventBus, IMediator, IAgentAvatar), Memory Management (ISessionManager), Execution Loop (ChatCoordinator), and the ToolRunner.
  • Responsibilities: Orchestrating the agent's thought process, managing state, and firing events.
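
A sketch of what these core abstractions might look like (signatures and the `ChatMessage` shape are illustrative, not final):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

namespace Anchor.AgentFramework;

// Base type for all pub/sub events on the bus.
public abstract record EventBase;

// Fire-and-forget notifications with zero or more listeners.
public interface IEventBus
{
    Task PublishAsync(EventBase @event, CancellationToken ct = default);
    IDisposable Subscribe<TEvent>(Func<TEvent, CancellationToken, Task> handler)
        where TEvent : EventBase;
}

// Immutable message record; context snapshots are read-only lists of these.
public record ChatMessage(string Role, string Content);

// The LLM-facing abstraction implemented by Anchor.Providers.
public interface IAgentAvatar
{
    // Streams response tokens for an immutable context snapshot.
    IAsyncEnumerable<string> StreamResponseAsync(
        IReadOnlyList<ChatMessage> context, CancellationToken ct);
}
```

Note that every asynchronous member threads a CancellationToken, per the Pervasive Cancellation principle.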

2.2 Project: Anchor.Providers (Class Library)

The vendor-specific implementations for Language Models.

  • Contains: OpenAIAvatar, AnthropicAvatar.
  • Responsibilities: Translating the framework's semantic requests into vendor-specific API calls (e.g., mapping ToolResult to OpenAI's tool response format) via SDKs like Azure.AI.OpenAI.
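
The translation responsibility can be sketched as follows. The `ToolResult` shape and the mapper are illustrative; this sketch uses reflection-based serialization for brevity, whereas the AOT build described in Section 5 would go through a source-generated JsonSerializerContext:

```csharp
using System.Text.Json;

// Framework-side result of a tool run (illustrative shape).
public sealed record ToolResult(string ToolCallId, string Output, bool IsError);

public static class OpenAiMapping
{
    // Translates the semantic ToolResult into the JSON shape OpenAI's
    // chat API expects for a tool-response message.
    public static string ToToolMessageJson(ToolResult result) =>
        JsonSerializer.Serialize(new
        {
            role = "tool",
            tool_call_id = result.ToolCallId,
            content = result.IsError ? "ERROR: " + result.Output : result.Output
        });
}
```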

2.3 Project: Anchor.Cli (Console Application)

The "Hosting Shell" and the physical "Senses/Hands" of the application.

  • Contains: Program.cs (Composition Root), RichConsoleRenderer, ConsoleInputDispatcher, and concrete Tool implementations (e.g., FileSystemTool, CmdTool).
  • Responsibilities: Wiring up Dependency Injection, reading from stdin, rendering UI/spinners to stdout, and executing side-effects on the host OS.

2.4 Logical Layers

Across these projects, the system operates in five primary layers:

  1. Hosting & Lifecycle (The Host)
  2. Event & Messaging Backbone (The Bus)
  3. State & Memory Management (The Brain)
  4. I/O & User Interface (The Senses & Voice)
  5. Execution & Tooling (The Hands)

2.5 Dependency Injection Graph

Anchor.Cli (Composition Root - Program.cs)
│
├── IEventBus → AsyncEventBus
│
├── IMemoryStore → VectorMemoryStore / SQLiteMemoryStore
├── ISessionManager → ContextAwareSessionManager
│   └── ICompactionStrategy → SemanticCompactionStrategy
│
├── IUserInputDispatcher → ConsoleInputDispatcher
├── ICommandRegistry → DynamicCommandRegistry
│
├── IAgentAvatar (LLM Interface) → AnthropicAvatar / OpenAIAvatar
├── IResponseStreamer → TokenAwareResponseStreamer
│
├── IUiRenderer → RichConsoleRenderer
│   ├── ISpinnerManager → AsyncSpinnerManager
│   └── IStreamingRenderer → ConsoleStreamingRenderer
│
└── IToolRegistry → DynamicToolRegistry
    └── (Injected Tools: FileSystemTool, CmdTool, WebSearchTool)
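
Under Native AOT, the safest way to realize this graph is explicit top-down construction ("Pure DI") or a source-generated container; either way, the composition root ends up as plain constructor calls with no runtime reflection. A sketch of one slice of the graph, with stub types standing in for the real implementations:

```csharp
using System;

// Stub types standing in for the real implementations in the graph.
public interface IEventBus { }
public sealed class AsyncEventBus : IEventBus { }

public interface ISpinnerManager { }
public sealed class AsyncSpinnerManager : ISpinnerManager { }

public interface IUiRenderer { IEventBus Bus { get; } }

public sealed class RichConsoleRenderer : IUiRenderer
{
    public IEventBus Bus { get; }
    private readonly ISpinnerManager _spinners;

    public RichConsoleRenderer(IEventBus bus, ISpinnerManager spinners)
    {
        Bus = bus;
        _spinners = spinners;
    }
}

public static class CompositionRoot
{
    // Mirrors the graph: singletons constructed once, top-down —
    // exactly the kind of code a source-generated container would emit.
    public static IUiRenderer Build()
    {
        IEventBus bus = new AsyncEventBus();
        ISpinnerManager spinners = new AsyncSpinnerManager();
        return new RichConsoleRenderer(bus, spinners);
    }
}
```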

3. Component Details

3.1 The Messaging Backbone: IEventBus and IMediator (AOT Safe)

The system utilizes a dual-messaging approach to prevent "event spaghetti":

  • Publish-Subscribe (Events): Used for things that happened and might have multiple or zero listeners (e.g., UI updates, diagnostics).
    • EventBus.PublishAsync(EventBase @event)
  • Request-Response (Commands): Used for linear, required actions with a return value.
    • Mediator.Send(IRequest<TResponse> request)

Warning

Standard MediatR relies heavily on runtime reflection for handler discovery, making it incompatible with Native AOT. We must use an AOT-safe source-generated alternative, such as the Mediator library, or implement a simple, source-generated Event/Command bus internally.
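
A hand-rolled, AOT-safe mediator needs nothing more than a dictionary keyed by request type, with handlers registered explicitly at startup (the same code a source generator would emit). A minimal sketch, with an illustrative command:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public interface IRequest<TResponse> { }

// Minimal request/response mediator: no reflection scanning —
// every handler is registered explicitly (or by generated code).
public sealed class SimpleMediator
{
    private readonly Dictionary<Type, Func<object, CancellationToken, Task<object>>> _handlers = new();

    public void Register<TRequest, TResponse>(
        Func<TRequest, CancellationToken, Task<TResponse>> handler)
        where TRequest : IRequest<TResponse> =>
        _handlers[typeof(TRequest)] =
            async (req, ct) => (await handler((TRequest)req, ct))!;

    public async Task<TResponse> Send<TResponse>(
        IRequest<TResponse> request, CancellationToken ct = default)
    {
        var handler = _handlers[request.GetType()];
        return (TResponse)await handler(request, ct);
    }
}

// Illustrative command mirroring the design above.
public sealed record ExecuteToolCommand(string ToolName, string ArgsJson) : IRequest<string>;
```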

Key Events (Pub/Sub):

  • UserInputReceived: Triggered when the user hits Enter.
  • LLMStreamDeltaReceived: Emitted for token-by-token streaming to the UI.
  • ToolExecutionStarted / ToolExecutionCompleted: Emitted for UI spinners and logging.
  • ContextLimitWarning: High token usage indicator.

Key Commands (Request/Response):

  • ExecuteToolCommand: Sent from the Avatar to the Tool Runner, returns a ToolResult.
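
The events and the command above map naturally onto small immutable records (field names are illustrative):

```csharp
public abstract record EventBase;

// Pub/sub events — past tense, zero or more listeners.
public sealed record UserInputReceived(string Text) : EventBase;
public sealed record LLMStreamDeltaReceived(string Delta) : EventBase;
public sealed record ToolExecutionStarted(string ToolName) : EventBase;
public sealed record ToolExecutionCompleted(string ToolName, bool Success) : EventBase;
public sealed record ContextLimitWarning(int TokensUsed, int TokenBudget) : EventBase;

// Request/response command — imperative, exactly one handler.
public interface IRequest<TResponse> { }
public sealed record ToolResult(string Output, bool IsError);
public sealed record ExecuteToolCommand(string ToolName, string ArgsJson) : IRequest<ToolResult>;
```

Records give value equality and non-destructive `with` mutation for free, which fits the immutable-snapshot discipline in Section 3.2.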

3.2 The Brain: ISessionManager & Memory

Instead of just a simple list of messages, the new system uses a multi-tiered memory architecture with thread-safe access.

  • Short-Term Memory (Context Window): The active conversation. Must yield Immutable Context Snapshots to prevent collection modification exceptions when tools/LLM run concurrently with background tasks.
  • Long-Term Memory (Vector DB): Indexed facts, summaries, and user preferences.
  • ICompactionStrategy: Instead of implicitly using an LLM on the critical path, the system uses tiered, deterministic strategies:
    1. Sliding Window: Automatically drop the oldest user/assistant message pairs.
    2. Tool Output Truncation: Remove large file reads from old turns.
    3. LLM Summarization (Optional): As a last resort, explicitly lock state and summarize old context into a "Context Digest".
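
The first (and cheapest) tier, the sliding window, is deterministic list surgery over an immutable snapshot. A sketch, assuming a simple `ChatMessage(Role, Content)` record:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record ChatMessage(string Role, string Content);

public static class SlidingWindowCompaction
{
    // Keeps the system prompt plus the most recent `maxTurns`
    // user/assistant pairs, returning a NEW list — the input
    // snapshot is never mutated.
    public static IReadOnlyList<ChatMessage> Compact(
        IReadOnlyList<ChatMessage> snapshot, int maxTurns)
    {
        var system = snapshot.Where(m => m.Role == "system");
        var rest = snapshot.Where(m => m.Role != "system").ToList();
        var keep = rest.Skip(Math.Max(0, rest.Count - maxTurns * 2));
        return system.Concat(keep).ToList();
    }
}
```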

3.3 The Senses & Voice: Event-Driven CLI UI

The UI is strictly separated from business logic: the RichConsoleRenderer knows nothing about the agent and only listens to the IEventBus.

  • Input Loop: IUserInputDispatcher sits in a loop reading stdin. When input is received, it fires UserInputReceived. It captures Ctrl+C to trigger a global CancellationToken.
  • Output Loop: IUiRenderer subscribes to LLMStreamDeltaReceived and renders tokens. It subscribes to ToolExecutionStarted and spins up a dedicated UI spinner, preventing async console output from overwriting the active prompt.
  • Headless CLI Mode: For CI/CD environments or scripting, the system can run non-interactively by simply swapping the RichConsoleRenderer with a BasicLoggingRenderer—the core agent logic remains untouched.
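
The output loop can be sketched with a tiny synchronous bus standing in for the real AsyncEventBus (all names besides RichConsoleRenderer are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public abstract record EventBase;
public sealed record LLMStreamDeltaReceived(string Delta) : EventBase;
public sealed record ToolExecutionStarted(string ToolName) : EventBase;

// Tiny synchronous bus — stands in for the real AsyncEventBus.
public sealed class MiniBus
{
    private readonly List<Action<EventBase>> _subs = new();
    public void Subscribe(Action<EventBase> handler) => _subs.Add(handler);
    public void Publish(EventBase e) => _subs.ForEach(s => s(e));
}

// The renderer never calls into the agent — it only reacts to events.
public sealed class RichConsoleRenderer
{
    private readonly StringBuilder _screen = new(); // stands in for stdout
    public RichConsoleRenderer(MiniBus bus) => bus.Subscribe(On);

    private void On(EventBase e)
    {
        switch (e)
        {
            case LLMStreamDeltaReceived d: _screen.Append(d.Delta); break;
            case ToolExecutionStarted t: _screen.Append($"[spinner: {t.ToolName}] "); break;
        }
    }

    public string Screen => _screen.ToString();
}
```

Swapping in a BasicLoggingRenderer for headless mode means subscribing a different handler to the same events; nothing upstream changes.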

3.4 The Hands: Plugins and Tooling

Tools are no longer hardcoded.

  • IToolRegistry: Discovers and registers tools at startup. Because Native AOT rules out runtime reflection and assembly scanning (see Section 5), tools are registered explicitly in the composition root or via a source-generated registry.
  • Tool Execution: When the LLM API returns a tool_calls stop reason, the IAgentAvatar iteratively or concurrently sends an ExecuteToolCommand via the Mediator. It directly awaits the results, appends them to the context snapshot, and resumes the LLM generation. This provides explicit, traceable control flow.
  • Cancellation: Every async method across the entire system accepts a CancellationToken to allow graceful termination of infinite loops or runaway processes.
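
The tool-execution loop boils down to: ask the model, run any requested tool, append the result, repeat until the model produces a final answer. A self-contained sketch with the LLM call and tool runner abstracted as delegates (all type names here are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed record ToolCall(string Name, string ArgsJson);
// A null ToolCall means the model produced its final answer.
public sealed record LlmTurn(string? Text, ToolCall? Call);

public static class ToolLoop
{
    // While the model asks for a tool: execute it via the mediator
    // (abstracted here as `executeTool`), append the result to a
    // copied context, and ask again. Cancellation is checked per turn.
    public static async Task<string> RunAsync(
        Func<IReadOnlyList<string>, CancellationToken, Task<LlmTurn>> callLlm,
        Func<ToolCall, CancellationToken, Task<string>> executeTool,
        IReadOnlyList<string> context,
        CancellationToken ct = default)
    {
        var snapshot = new List<string>(context); // immutable-snapshot discipline
        while (true)
        {
            ct.ThrowIfCancellationRequested();
            var turn = await callLlm(snapshot, ct);
            if (turn.Call is null) return turn.Text ?? "";
            var result = await executeTool(turn.Call, ct);
            snapshot.Add($"tool:{turn.Call.Name}={result}");
        }
    }
}
```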

4. Execution Flow (Anatomy of a User Turn)

  1. Input: User types "Find the bug in main.py".
  2. Dispatch: ConsoleInputDispatcher reads it and publishes UserInputReceived.
  3. Routing: A built-in command handler first checks whether the input is a structural command (/clear, /exit); otherwise the SessionManager adds it to the active context.
  4. Inference: A ChatCoordinator service reacts to the updated context and asks the IAgentAvatar for a response.
  5. Streaming: The Avatar calls the Anthropic/OpenAI API. As tokens arrive, it publishes LLMStreamDeltaReceived.
  6. Rendering: RichConsoleRenderer receives the deltas and prints them to the terminal.
  7. Tool Request: The LLM API returns a tool call. The Avatar dispatches an ExecuteToolCommand via the Mediator. The EventBus also publishes a ToolExecutionStarted event for the UI spinner.
  8. Execution & Feedback: ToolRunner handles the command, runs it safely with the CancellationToken, and returns the result back to the Avatar. The Avatar feeds this back to the LLM API automatically.
  9. Completion: The turn ends. The SessionManager checks token bounds and runs compaction if necessary.

5. Conclusion (Native AOT Focus)

While ARCHITECTURE_REFACTOR.md focuses on migrating a legacy "God Class", this new design assumes a green-field, AOT-first approach. To achieve true Native AOT, we must strictly avoid runtime reflection. This means:

  1. Using the trim-friendly Host.CreateApplicationBuilder() (or Host.CreateEmptyApplicationBuilder() for the leanest host) instead of Host.CreateDefaultBuilder() in Microsoft.Extensions.Hosting.
  2. Using Source Generators for Dependency Injection setup.
  3. Using Source Generators for JSON Serialization (System.Text.Json.Serialization.JsonSerializableAttribute).
  4. Replacing reflection-heavy libraries like MediatR and Scrutor with AOT-friendly source-generated alternatives.
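
Item 3 in practice: the System.Text.Json source generator emits serialization metadata at compile time, so no reflection runs under AOT. A minimal sketch (the record and context names are illustrative):

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed record ToolResult(string ToolCallId, string Output);

// The source generator fills in this partial class with AOT-safe
// (de)serialization code for every type listed in the attributes.
[JsonSerializable(typeof(ToolResult))]
public partial class AgentJsonContext : JsonSerializerContext { }

// Usage:
// var json = JsonSerializer.Serialize(result, AgentJsonContext.Default.ToolResult);
```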

By adhering to these constraints, the resulting single-binary Linux executable will have near-instant startup time and a dramatically reduced memory footprint compared to a standard JIT-compiled .NET application.