docs: Document planned architectural refactor and tool consolidation.

2026-03-05 22:10:54 +01:00
parent c7e7976d9d
commit 112f1f3202
3 changed files with 1347 additions and 0 deletions

docs/NEW_SYSTEM_DESIGN.md Normal file

@@ -0,0 +1,134 @@
# Advanced AI Agent CLI System Design
This document outlines the architecture for a completely new, built-from-scratch AI Agent Command Line Interface system, inspired by the lessons learned from the `Anchor CLI` refactoring.
## 1. Core Principles
* **Event-Driven UI & Decoupled State:** The UI and display layers communicate exclusively through an asynchronous Event Bus.
* **Explicit Control Flow:** Core agent execution utilizes a Mediator pattern (Request/Response) for predictable, traceable control flow rather than pure event spaghetti.
* **Dependency Injection:** A robust IoC container manages lifecycles and dependencies.
* **Pluggable Architecture:** Everything—from the LLM provider to the UI renderer and memory storage—is an injectable plugin.
* **Stateless Components:** Services maintain minimal internal state. State is managed centrally in a session or context store with immutable snapshots.
* **Test-First Design:** Complete absence of static delegates and global mutable state ensures every component is unit-testable in isolation.
* **Pervasive Cancellation:** Every asynchronous operation accepts a `CancellationToken` for graceful termination.
## 2. High-Level Architecture & Project Structure (AOT-Ready)
The system is structurally divided into three distinct C# projects to enforce decoupling, testability, and future-proof design, while maintaining strict compatibility with **.NET Native AOT** compilation for single-file, zero-dependency distribution on Linux/Windows.
### 2.1 Project: `Anchor.AgentFramework` (Class Library)
The core logic and abstractions. It has **no knowledge** of the console, the file system, or specific LLM SDKs.
* **Contains:** Interfaces (`IEventBus`, `IMediator`, `IAgentAvatar`), Memory Management (`ISessionManager`), Execution Loop (`ChatCoordinator`), and the `ToolRunner`.
* **Responsibilities:** Orchestrating the agent's thought process, managing state, and firing events.
### 2.2 Project: `Anchor.Providers` (Class Library)
The vendor-specific implementations for Language Models.
* **Contains:** `OpenAIAvatar`, `AnthropicAvatar`.
* **Responsibilities:** Translating the framework's semantic requests into vendor-specific API calls (e.g., mapping `ToolResult` to OpenAI's tool response format) via SDKs like `Azure.AI.OpenAI`.
### 2.3 Project: `Anchor.Cli` (Console Application)
The "Hosting Shell" and the physical "Senses/Hands" of the application.
* **Contains:** `Program.cs` (Composition Root), `RichConsoleRenderer`, `ConsoleInputDispatcher`, and concrete Tool implementations (e.g., `FileSystemTool`, `CmdTool`).
* **Responsibilities:** Wiring up Dependency Injection, reading from stdin, rendering UI/spinners to stdout, and executing side-effects on the host OS.
### 2.4 Logical Layers
Across these projects, the system operates in five primary layers:
1. **Hosting & Lifecycle (The Host)**
2. **Event & Messaging Backbone (The Bus)**
3. **State & Memory Management (The Brain)**
4. **I/O & User Interface (The Senses & Voice)**
5. **Execution & Tooling (The Hands)**
### 2.5 Dependency Injection Graph
```text
Anchor.Cli (Composition Root - Program.cs)
├── IEventBus → AsyncEventBus
├── IMemoryStore → VectorMemoryStore / SQLiteMemoryStore
├── ISessionManager → ContextAwareSessionManager
│   └── ICompactionStrategy → SemanticCompactionStrategy
├── IUserInputDispatcher → ConsoleInputDispatcher
├── ICommandRegistry → DynamicCommandRegistry
├── IAgentAvatar (LLM Interface) → AnthropicAvatar / OpenAIAvatar
├── IResponseStreamer → TokenAwareResponseStreamer
├── IUiRenderer → RichConsoleRenderer
│   ├── ISpinnerManager → AsyncSpinnerManager
│   └── IStreamingRenderer → ConsoleStreamingRenderer
└── IToolRegistry → DynamicToolRegistry
    └── (Injected Tools: FileSystemTool, CmdTool, WebSearchTool)
```
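As a sketch of how the root of this graph could be composed with `Microsoft.Extensions.DependencyInjection` (only a few abstractions are stubbed in so the snippet stands alone; the real `Program.cs` would register the full graph above):

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Illustrative stand-ins for a few abstractions from the graph above.
public interface IEventBus { }
public sealed class AsyncEventBus : IEventBus { }
public interface ITool { string Name { get; } }
public sealed class FileSystemTool : ITool { public string Name => "file_system"; }
public sealed class CmdTool : ITool { public string Name => "cmd"; }

public static class CompositionRoot
{
    // Mirrors what Program.cs would do before building the host.
    public static ServiceProvider Build()
    {
        var services = new ServiceCollection();
        services.AddSingleton<IEventBus, AsyncEventBus>();
        // Explicit tool registration keeps the container AOT-friendly:
        // no assembly scanning, every type is statically referenced.
        services.AddSingleton<ITool, FileSystemTool>();
        services.AddSingleton<ITool, CmdTool>();
        return services.BuildServiceProvider();
    }
}
```

Because `ITool` is registered twice, consumers such as the tool registry can take `IEnumerable<ITool>` and receive both implementations.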
## 3. Component Details
### 3.1 The Messaging Backbone: `IEventBus` and `IMediator` (AOT Safe)
The system utilizes a dual-messaging approach to prevent "event spaghetti":
* **Publish-Subscribe (Events):** Used for things that *happened* and might have multiple or zero listeners (e.g., UI updates, diagnostics).
* `EventBus.PublishAsync(EventBase @event)`
* **Request-Response (Commands):** Used for linear, required actions with a return value.
* `Mediator.Send(IRequest<TResponse> request)`
> [!WARNING]
> Standard `MediatR` relies heavily on runtime reflection for handler discovery, making it **incompatible with Native AOT**. We must use an AOT-safe source-generated alternative, such as the [Mediator](https://github.com/martinothamar/Mediator) library, or implement a simple, source-generated Event/Command bus internally.
**Key Events (Pub/Sub):**
* `UserInputReceived`: Triggered when the user hits Enter.
* `LLMStreamDeltaReceived`: Emitted for token-by-token streaming to the UI.
* `ToolExecutionStarted` / `ToolExecutionCompleted`: Emitted for UI spinners and logging.
* `ContextLimitWarning`: High token usage indicator.
**Key Commands (Request/Response):**
* `ExecuteToolCommand`: Sent from the Avatar to the Tool Runner, returns a `ToolResult`.
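The two messaging surfaces can be sketched as follows; the exact signatures are assumptions extrapolated from the method names above, not a finalized API:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public abstract record EventBase;
public interface IRequest<TResponse> { }

// Pub/Sub: something *happened*; zero or many subscribers may react.
public interface IEventBus
{
    Task PublishAsync(EventBase @event, CancellationToken ct = default);
    IDisposable Subscribe<TEvent>(Func<TEvent, CancellationToken, Task> handler)
        where TEvent : EventBase;
}

// Request/Response: exactly one handler, and the caller awaits the result.
public interface IMediator
{
    ValueTask<TResponse> Send<TResponse>(IRequest<TResponse> request,
                                         CancellationToken ct = default);
}

// Concrete messages from the lists above.
public sealed record LLMStreamDeltaReceived(string Delta) : EventBase;
public sealed record ToolResult(bool Success, string Output);
public sealed record ExecuteToolCommand(string ToolName, string ArgsJson)
    : IRequest<ToolResult>;
```

Note that events derive from a common base and carry no return value, while commands are typed by their response, which is what makes the control flow traceable.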
### 3.2 The Brain: `ISessionManager` & Memory
Instead of just a simple list of messages, the new system uses a multi-tiered memory architecture with thread-safe access.
* **Short-Term Memory (Context Window):** The active conversation. Must yield **Immutable Context Snapshots** to prevent collection modification exceptions when tools/LLM run concurrently with background tasks.
* **Long-Term Memory (Vector DB):** Indexed facts, summaries, and user preferences.
* **ICompactionStrategy:**
Instead of implicitly using an LLM on the critical path, the system uses tiered, deterministic strategies:
1. **Sliding Window:** Automatically drop the oldest user/assistant message pairs.
2. **Tool Output Truncation:** Remove large file reads from old turns.
3. **LLM Summarization (Optional):** As a last resort, explicitly lock state and summarize old context into a "Context Digest".
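A minimal sketch of tier 1 (sliding window) over an immutable snapshot; the token estimate is a character-count heuristic for illustration, and the type names are assumptions:

```csharp
using System.Collections.Immutable;
using System.Linq;

public sealed record ChatMessage(string Role, string Content);

// Tier 1 compaction: drop the oldest user/assistant pairs until under budget.
public sealed class SlidingWindowCompaction
{
    private readonly int _maxTokens;
    public SlidingWindowCompaction(int maxTokens) => _maxTokens = maxTokens;

    // Rough heuristic: ~4 characters per token, plus per-message overhead.
    private static int EstimateTokens(ChatMessage m) => m.Content.Length / 4 + 4;

    public ImmutableList<ChatMessage> Compact(ImmutableList<ChatMessage> context)
    {
        // ImmutableList returns a new list on every mutation, so any reader
        // holding the previous snapshot is unaffected — no collection
        // modification exceptions under concurrent access.
        while (context.Count > 2 && context.Sum(EstimateTokens) > _maxTokens)
            context = context.RemoveRange(0, 2); // drop oldest user/assistant pair
        return context;
    }
}
```

The guard `context.Count > 2` always preserves the most recent pair, so the agent never loses the turn it is currently answering.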
### 3.3 The Senses & Voice: Event-Driven CLI UI
The UI is strictly separated from business logic, which is an ideal architecture for a dedicated CLI tool. The `RichConsoleRenderer` only listens to the `IEventBus`.
* **Input Loop:** `IUserInputDispatcher` sits in a loop reading stdin. When input is received, it fires `UserInputReceived`. It captures `Ctrl+C` to trigger a global `CancellationToken`.
* **Output Loop:** `IUiRenderer` subscribes to `LLMStreamDeltaReceived` and renders tokens. It subscribes to `ToolExecutionStarted` and spins up a dedicated UI spinner, preventing async console output from overwriting the active prompt.
* **Headless CLI Mode:** For CI/CD environments or scripting, the system can run non-interactively by simply swapping the `RichConsoleRenderer` with a `BasicLoggingRenderer`—the core agent logic remains untouched.
### 3.4 The Hands: Plugins and Tooling
Tools are no longer hardcoded.
* **IToolRegistry:** Discovers the tools registered at startup. Under Native AOT this must rely on explicit registration or source-generated discovery; runtime reflection and assembly scanning are off the table (see Section 5).
* **Tool Execution:** When the LLM API returns a `tool_calls` stop reason, the `IAgentAvatar` iteratively or concurrently sends an `ExecuteToolCommand` via the Mediator. It directly awaits the results, appends them to the context snapshot, and resumes the LLM generation. This provides explicit, traceable control flow.
* **Cancellation:** Every async method across the entire system accepts a `CancellationToken` to allow graceful termination of infinite loops or runaway processes.
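A sketch of the handler shape behind `ExecuteToolCommand`, including the cancellation path (in the real system the handler would be discovered by the source-generated mediator rather than constructed by hand; the `ITool` shape is an assumption):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed record ToolResult(bool Success, string Output);
public sealed record ExecuteToolCommand(string ToolName, string ArgsJson);

public interface ITool
{
    string Name { get; }
    Task<string> InvokeAsync(string argsJson, CancellationToken ct);
}

public sealed class ToolRunner
{
    private readonly Dictionary<string, ITool> _tools = new();

    public ToolRunner(IEnumerable<ITool> tools)
    {
        foreach (var t in tools) _tools[t.Name] = t;
    }

    public async Task<ToolResult> HandleAsync(ExecuteToolCommand cmd, CancellationToken ct)
    {
        if (!_tools.TryGetValue(cmd.ToolName, out var tool))
            return new ToolResult(false, $"Unknown tool: {cmd.ToolName}");
        try
        {
            return new ToolResult(true, await tool.InvokeAsync(cmd.ArgsJson, ct));
        }
        catch (OperationCanceledException)
        {
            // Pervasive cancellation: Ctrl+C mid-tool surfaces as a clean result
            // the Avatar can feed back to the LLM, not an unhandled crash.
            return new ToolResult(false, "Tool execution cancelled.");
        }
    }
}
```

Returning a failed `ToolResult` for unknown tools and cancellations (instead of throwing) keeps the agent loop alive and lets the model recover on the next turn.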
## 4. Execution Flow (Anatomy of a User Turn)
1. **Input:** User types "Find the bug in main.py".
2. **Dispatch:** `ConsoleInputDispatcher` reads it and publishes `UserInputReceived`.
3. **Routing:** A built-in command handler first checks whether the input is a structural command (`/clear`, `/exit`); otherwise the `SessionManager` adds it to the active context.
4. **Inference:** A `ChatCoordinator` service reacts to the updated context and asks the `IAgentAvatar` for a response.
5. **Streaming:** The Avatar calls the Anthropic/OpenAI API. As tokens arrive, it publishes `LLMStreamDeltaReceived`.
6. **Rendering:** `RichConsoleRenderer` receives the deltas and prints them to the terminal.
7. **Tool Request:** The LLM API returns a tool call. The Avatar dispatches an `ExecuteToolCommand` via the Mediator. The EventBus also publishes a `ToolExecutionStarted` event for the UI spinner.
8. **Execution & Feedback:** `ToolRunner` handles the command, runs it safely with the `CancellationToken`, and returns the result back to the Avatar. The Avatar feeds this back to the LLM API automatically.
9. **Completion:** The turn ends. The `SessionManager` checks token bounds and runs compaction if necessary.
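Steps 4–9 boil down to a turn loop in the `ChatCoordinator`. A simplified, self-contained sketch (the interface shapes are assumptions; streaming, events, and compaction are elided):

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Minimal stand-ins so the loop compiles on its own.
public sealed record ToolCall(string Name, string ArgsJson);
public sealed record AvatarResponse(string Text, IReadOnlyList<ToolCall> ToolCalls);
public sealed record ToolResult(bool Success, string Output);

public interface IAgentAvatar
{
    Task<AvatarResponse> StreamResponseAsync(CancellationToken ct);
}

public interface IToolRunner
{
    Task<ToolResult> RunAsync(ToolCall call, CancellationToken ct);
}

public sealed class ChatCoordinator
{
    private readonly IAgentAvatar _avatar;
    private readonly IToolRunner _runner;
    public readonly List<ToolResult> Transcript = new();

    public ChatCoordinator(IAgentAvatar avatar, IToolRunner runner)
        => (_avatar, _runner) = (avatar, runner);

    public async Task<string> RunTurnAsync(CancellationToken ct)
    {
        while (true)
        {
            ct.ThrowIfCancellationRequested();
            var response = await _avatar.StreamResponseAsync(ct);
            if (response.ToolCalls.Count == 0)
                return response.Text; // plain text answer: the turn is complete
            foreach (var call in response.ToolCalls)
                // In the real system each result is appended to the context
                // snapshot and fed back to the LLM on the next iteration.
                Transcript.Add(await _runner.RunAsync(call, ct));
        }
    }
}
```

The loop terminates either when the model answers without tool calls or when the global `CancellationToken` fires.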
## 5. Conclusion (Native AOT Focus)
While `ARCHITECTURE_REFACTOR.md` focuses on migrating a legacy "God Class", this new design assumes a green-field, **AOT-first** approach.
To achieve true Native AOT, we must strictly avoid runtime reflection. This means:
1. Using `CreateSlimBuilder()` instead of `CreateDefaultBuilder()` in `Microsoft.Extensions.Hosting`.
2. Using Source Generators for Dependency Injection setup.
3. Using Source Generators for JSON Serialization (`System.Text.Json.Serialization.JsonSerializableAttribute`).
4. Replacing reflection-heavy libraries like `MediatR` and `Scrutor` with AOT-friendly source-generated alternatives.
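Item 3 in practice: the `System.Text.Json` source generator emits serialization metadata at compile time, so no reflection runs under Native AOT. The `ToolCall` payload type here is a hypothetical example:

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed record ToolCall(string Name, string ArgumentsJson);

// The partial class is completed by the source generator at build time.
[JsonSerializable(typeof(ToolCall))]
public partial class AgentJsonContext : JsonSerializerContext { }

// Usage: pass the generated type info instead of relying on reflection.
// var json = JsonSerializer.Serialize(call, AgentJsonContext.Default.ToolCall);
// var call = JsonSerializer.Deserialize(json, AgentJsonContext.Default.ToolCall);
```

Every type that crosses a serialization boundary (tool arguments, provider payloads, session persistence) needs a `[JsonSerializable]` entry on a context like this.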
By adhering to these constraints, the resulting single-binary Linux executable will have near-instant startup time and a dramatically reduced memory footprint compared to a standard JIT-compiled .NET application.