docs: Document planned architectural refactor and tool consolidation.

2026-03-05 22:10:54 +01:00
parent c7e7976d9d
commit 112f1f3202
3 changed files with 1347 additions and 0 deletions

docs/NEW_SYSTEM_DESIGN.md Normal file

@@ -0,0 +1,134 @@
# Advanced AI Agent CLI System Design
This document outlines the architecture for a completely new, built-from-scratch AI Agent Command Line Interface system, inspired by the lessons learned from the `Anchor CLI` refactoring.
## 1. Core Principles
* **Event-Driven UI & Decoupled State:** The UI and display layers communicate exclusively through an asynchronous Event Bus.
* **Explicit Control Flow:** Core agent execution utilizes a Mediator pattern (Request/Response) for predictable, traceable control flow rather than pure event spaghetti.
* **Dependency Injection:** A robust IoC container manages lifecycles and dependencies.
* **Pluggable Architecture:** Everything—from the LLM provider to the UI renderer and memory storage—is an injectable plugin.
* **Stateless Components:** Services maintain minimal internal state. State is managed centrally in a session or context store with immutable snapshots.
* **Test-First Design:** Complete absence of static delegates and global mutable state ensures every component is unit-testable in isolation.
* **Pervasive Cancellation:** Every asynchronous operation accepts a `CancellationToken` for graceful termination.
## 2. High-Level Architecture & Project Structure (AOT-Ready)
The system is structurally divided into three distinct C# projects to enforce decoupling, testability, and future-proof design, while maintaining strict compatibility with **.NET Native AOT** compilation for single-file, zero-dependency distribution on Linux/Windows.
### 2.1 Project: `Anchor.AgentFramework` (Class Library)
The core logic and abstractions. It has **no knowledge** of the console, the file system, or specific LLM SDKs.
* **Contains:** Interfaces (`IEventBus`, `IMediator`, `IAgentAvatar`), Memory Management (`ISessionManager`), Execution Loop (`ChatCoordinator`), and the `ToolRunner`.
* **Responsibilities:** Orchestrating the agent's thought process, managing state, and firing events.
### 2.2 Project: `Anchor.Providers` (Class Library)
The vendor-specific implementations for Language Models.
* **Contains:** `OpenAIAvatar`, `AnthropicAvatar`.
* **Responsibilities:** Translating the framework's semantic requests into vendor-specific API calls (e.g., mapping `ToolResult` to OpenAI's tool response format) via SDKs like `Azure.AI.OpenAI`.
### 2.3 Project: `Anchor.Cli` (Console Application)
The "Hosting Shell" and the physical "Senses/Hands" of the application.
* **Contains:** `Program.cs` (Composition Root), `RichConsoleRenderer`, `ConsoleInputDispatcher`, and concrete Tool implementations (e.g., `FileSystemTool`, `CmdTool`).
* **Responsibilities:** Wiring up Dependency Injection, reading from stdin, rendering UI/spinners to stdout, and executing side-effects on the host OS.
### 2.4 Logical Layers
Across these projects, the system operates in five primary layers:
1. **Hosting & Lifecycle (The Host)**
2. **Event & Messaging Backbone (The Bus)**
3. **State & Memory Management (The Brain)**
4. **I/O & User Interface (The Senses & Voice)**
5. **Execution & Tooling (The Hands)**
### 2.5 Dependency Injection Graph
```text
Anchor.Cli (Composition Root - Program.cs)
├── IEventBus → AsyncEventBus
├── IMemoryStore → VectorMemoryStore / SQLiteMemoryStore
├── ISessionManager → ContextAwareSessionManager
│   └── ICompactionStrategy → SemanticCompactionStrategy
├── IUserInputDispatcher → ConsoleInputDispatcher
├── ICommandRegistry → DynamicCommandRegistry
├── IAgentAvatar (LLM Interface) → AnthropicAvatar / OpenAIAvatar
├── IResponseStreamer → TokenAwareResponseStreamer
├── IUiRenderer → RichConsoleRenderer
│   ├── ISpinnerManager → AsyncSpinnerManager
│   └── IStreamingRenderer → ConsoleStreamingRenderer
└── IToolRegistry → DynamicToolRegistry
    └── (Injected Tools: FileSystemTool, CmdTool, WebSearchTool)
```
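As a sketch of how the root of this graph could be composed with `Microsoft.Extensions.DependencyInjection` (only a few abstractions are stubbed in so the snippet stands alone; the real `Program.cs` would register the full graph above):

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Illustrative stand-ins for a few abstractions from the graph above.
public interface IEventBus { }
public sealed class AsyncEventBus : IEventBus { }
public interface ITool { string Name { get; } }
public sealed class FileSystemTool : ITool { public string Name => "file_system"; }
public sealed class CmdTool : ITool { public string Name => "cmd"; }

public static class CompositionRoot
{
    // Mirrors what Program.cs would do before building the host.
    public static ServiceProvider Build()
    {
        var services = new ServiceCollection();
        services.AddSingleton<IEventBus, AsyncEventBus>();
        // Explicit tool registration keeps the container AOT-friendly:
        // no assembly scanning, every type is statically referenced.
        services.AddSingleton<ITool, FileSystemTool>();
        services.AddSingleton<ITool, CmdTool>();
        return services.BuildServiceProvider();
    }
}
```

Because `ITool` is registered twice, consumers such as the tool registry can take `IEnumerable<ITool>` and receive both implementations.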
## 3. Component Details
### 3.1 The Messaging Backbone: `IEventBus` and `IMediator` (AOT Safe)
The system utilizes a dual-messaging approach to prevent "event spaghetti":
* **Publish-Subscribe (Events):** Used for things that *happened* and might have multiple or zero listeners (e.g., UI updates, diagnostics).
* `EventBus.PublishAsync(EventBase @event)`
* **Request-Response (Commands):** Used for linear, required actions with a return value.
* `Mediator.Send(IRequest<TResponse> request)`
> [!WARNING]
> Standard `MediatR` relies heavily on runtime reflection for handler discovery, making it **incompatible with Native AOT**. We must use an AOT-safe source-generated alternative, such as the [Mediator](https://github.com/martinothamar/Mediator) library, or implement a simple, source-generated Event/Command bus internally.
**Key Events (Pub/Sub):**
* `UserInputReceived`: Triggered when the user hits Enter.
* `LLMStreamDeltaReceived`: Emitted for token-by-token streaming to the UI.
* `ToolExecutionStarted` / `ToolExecutionCompleted`: Emitted for UI spinners and logging.
* `ContextLimitWarning`: High token usage indicator.
**Key Commands (Request/Response):**
* `ExecuteToolCommand`: Sent from the Avatar to the Tool Runner, returns a `ToolResult`.
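The two messaging surfaces can be sketched as follows; the exact signatures are assumptions extrapolated from the method names above, not a finalized API:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public abstract record EventBase;
public interface IRequest<TResponse> { }

// Pub/Sub: something *happened*; zero or many subscribers may react.
public interface IEventBus
{
    Task PublishAsync(EventBase @event, CancellationToken ct = default);
    IDisposable Subscribe<TEvent>(Func<TEvent, CancellationToken, Task> handler)
        where TEvent : EventBase;
}

// Request/Response: exactly one handler, and the caller awaits the result.
public interface IMediator
{
    ValueTask<TResponse> Send<TResponse>(IRequest<TResponse> request,
                                         CancellationToken ct = default);
}

// Concrete messages from the lists above.
public sealed record LLMStreamDeltaReceived(string Delta) : EventBase;
public sealed record ToolResult(bool Success, string Output);
public sealed record ExecuteToolCommand(string ToolName, string ArgsJson)
    : IRequest<ToolResult>;
```

Note that events derive from a common base and carry no return value, while commands are typed by their response, which is what makes the control flow traceable.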
### 3.2 The Brain: `ISessionManager` & Memory
Instead of just a simple list of messages, the new system uses a multi-tiered memory architecture with thread-safe access.
* **Short-Term Memory (Context Window):** The active conversation. Must yield **Immutable Context Snapshots** to prevent collection modification exceptions when tools/LLM run concurrently with background tasks.
* **Long-Term Memory (Vector DB):** Indexed facts, summaries, and user preferences.
* **ICompactionStrategy:**
Instead of implicitly using an LLM on the critical path, the system uses tiered, deterministic strategies:
1. **Sliding Window:** Automatically drop the oldest user/assistant message pairs.
2. **Tool Output Truncation:** Remove large file reads from old turns.
3. **LLM Summarization (Optional):** As a last resort, explicitly lock state and summarize old context into a "Context Digest".
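A minimal sketch of tier 1 (sliding window) over an immutable snapshot; the token estimate is a character-count heuristic for illustration, and the type names are assumptions:

```csharp
using System.Collections.Immutable;
using System.Linq;

public sealed record ChatMessage(string Role, string Content);

// Tier 1 compaction: drop the oldest user/assistant pairs until under budget.
public sealed class SlidingWindowCompaction
{
    private readonly int _maxTokens;
    public SlidingWindowCompaction(int maxTokens) => _maxTokens = maxTokens;

    // Rough heuristic: ~4 characters per token, plus per-message overhead.
    private static int EstimateTokens(ChatMessage m) => m.Content.Length / 4 + 4;

    public ImmutableList<ChatMessage> Compact(ImmutableList<ChatMessage> context)
    {
        // ImmutableList returns a new list on every mutation, so any reader
        // holding the previous snapshot is unaffected — no collection
        // modification exceptions under concurrent access.
        while (context.Count > 2 && context.Sum(EstimateTokens) > _maxTokens)
            context = context.RemoveRange(0, 2); // drop oldest user/assistant pair
        return context;
    }
}
```

The guard `context.Count > 2` always preserves the most recent pair, so the agent never loses the turn it is currently answering.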
### 3.3 The Senses & Voice: Event-Driven CLI UI
The UI is strictly separated from business logic, which is an ideal architecture for a dedicated CLI tool. The `RichConsoleRenderer` only listens to the `IEventBus`.
* **Input Loop:** `IUserInputDispatcher` sits in a loop reading stdin. When input is received, it fires `UserInputReceived`. It captures `Ctrl+C` to trigger a global `CancellationToken`.
* **Output Loop:** `IUiRenderer` subscribes to `LLMStreamDeltaReceived` and renders tokens. It subscribes to `ToolExecutionStarted` and spins up a dedicated UI spinner, preventing async console output from overwriting the active prompt.
* **Headless CLI Mode:** For CI/CD environments or scripting, the system can run non-interactively by simply swapping the `RichConsoleRenderer` with a `BasicLoggingRenderer`—the core agent logic remains untouched.
### 3.4 The Hands: Plugins and Tooling
Tools are no longer hardcoded.
* **IToolRegistry:** Discovers the tools registered at startup. Under Native AOT this must rely on explicit registration or source-generated discovery; runtime reflection and assembly scanning are off the table (see Section 5).
* **Tool Execution:** When the LLM API returns a `tool_calls` stop reason, the `IAgentAvatar` iteratively or concurrently sends an `ExecuteToolCommand` via the Mediator. It directly awaits the results, appends them to the context snapshot, and resumes the LLM generation. This provides explicit, traceable control flow.
* **Cancellation:** Every async method across the entire system accepts a `CancellationToken` to allow graceful termination of infinite loops or runaway processes.
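A sketch of the handler shape behind `ExecuteToolCommand`, including the cancellation path (in the real system the handler would be discovered by the source-generated mediator rather than constructed by hand; the `ITool` shape is an assumption):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed record ToolResult(bool Success, string Output);
public sealed record ExecuteToolCommand(string ToolName, string ArgsJson);

public interface ITool
{
    string Name { get; }
    Task<string> InvokeAsync(string argsJson, CancellationToken ct);
}

public sealed class ToolRunner
{
    private readonly Dictionary<string, ITool> _tools = new();

    public ToolRunner(IEnumerable<ITool> tools)
    {
        foreach (var t in tools) _tools[t.Name] = t;
    }

    public async Task<ToolResult> HandleAsync(ExecuteToolCommand cmd, CancellationToken ct)
    {
        if (!_tools.TryGetValue(cmd.ToolName, out var tool))
            return new ToolResult(false, $"Unknown tool: {cmd.ToolName}");
        try
        {
            return new ToolResult(true, await tool.InvokeAsync(cmd.ArgsJson, ct));
        }
        catch (OperationCanceledException)
        {
            // Pervasive cancellation: Ctrl+C mid-tool surfaces as a clean result
            // the Avatar can feed back to the LLM, not an unhandled crash.
            return new ToolResult(false, "Tool execution cancelled.");
        }
    }
}
```

Returning a failed `ToolResult` for unknown tools and cancellations (instead of throwing) keeps the agent loop alive and lets the model recover on the next turn.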
## 4. Execution Flow (Anatomy of a User Turn)
1. **Input:** User types "Find the bug in main.py".
2. **Dispatch:** `ConsoleInputDispatcher` reads it and publishes `UserInputReceived`.
3. **Routing:** A built-in command handler first checks whether the input is a structural command (`/clear`, `/exit`); otherwise the `SessionManager` adds it to the active context.
4. **Inference:** A `ChatCoordinator` service reacts to the updated context and asks the `IAgentAvatar` for a response.
5. **Streaming:** The Avatar calls the Anthropic/OpenAI API. As tokens arrive, it publishes `LLMStreamDeltaReceived`.
6. **Rendering:** `RichConsoleRenderer` receives the deltas and prints them to the terminal.
7. **Tool Request:** The LLM API returns a tool call. The Avatar dispatches an `ExecuteToolCommand` via the Mediator. The EventBus also publishes a `ToolExecutionStarted` event for the UI spinner.
8. **Execution & Feedback:** `ToolRunner` handles the command, runs it safely with the `CancellationToken`, and returns the result back to the Avatar. The Avatar feeds this back to the LLM API automatically.
9. **Completion:** The turn ends. The `SessionManager` checks token bounds and runs compaction if necessary.
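Steps 4–9 boil down to a turn loop in the `ChatCoordinator`. A simplified, self-contained sketch (the interface shapes are assumptions; streaming, events, and compaction are elided):

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Minimal stand-ins so the loop compiles on its own.
public sealed record ToolCall(string Name, string ArgsJson);
public sealed record AvatarResponse(string Text, IReadOnlyList<ToolCall> ToolCalls);
public sealed record ToolResult(bool Success, string Output);

public interface IAgentAvatar
{
    Task<AvatarResponse> StreamResponseAsync(CancellationToken ct);
}

public interface IToolRunner
{
    Task<ToolResult> RunAsync(ToolCall call, CancellationToken ct);
}

public sealed class ChatCoordinator
{
    private readonly IAgentAvatar _avatar;
    private readonly IToolRunner _runner;
    public readonly List<ToolResult> Transcript = new();

    public ChatCoordinator(IAgentAvatar avatar, IToolRunner runner)
        => (_avatar, _runner) = (avatar, runner);

    public async Task<string> RunTurnAsync(CancellationToken ct)
    {
        while (true)
        {
            ct.ThrowIfCancellationRequested();
            var response = await _avatar.StreamResponseAsync(ct);
            if (response.ToolCalls.Count == 0)
                return response.Text; // plain text answer: the turn is complete
            foreach (var call in response.ToolCalls)
                // In the real system each result is appended to the context
                // snapshot and fed back to the LLM on the next iteration.
                Transcript.Add(await _runner.RunAsync(call, ct));
        }
    }
}
```

The loop terminates either when the model answers without tool calls or when the global `CancellationToken` fires.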
## 5. Conclusion (Native AOT Focus)
While `ARCHITECTURE_REFACTOR.md` focuses on migrating a legacy "God Class", this new design assumes a green-field, **AOT-first** approach.
To achieve true Native AOT, we must strictly avoid runtime reflection. This means:
1. Using `CreateSlimBuilder()` instead of `CreateDefaultBuilder()` in `Microsoft.Extensions.Hosting`.
2. Using Source Generators for Dependency Injection setup.
3. Using Source Generators for JSON Serialization (`System.Text.Json.Serialization.JsonSerializableAttribute`).
4. Replacing reflection-heavy libraries like `MediatR` and `Scrutor` with AOT-friendly source-generated alternatives.
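Item 3 in practice: the `System.Text.Json` source generator emits serialization metadata at compile time, so no reflection runs under Native AOT. The `ToolCall` payload type here is a hypothetical example:

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed record ToolCall(string Name, string ArgumentsJson);

// The partial class is completed by the source generator at build time.
[JsonSerializable(typeof(ToolCall))]
public partial class AgentJsonContext : JsonSerializerContext { }

// Usage: pass the generated type info instead of relying on reflection.
// var json = JsonSerializer.Serialize(call, AgentJsonContext.Default.ToolCall);
// var call = JsonSerializer.Deserialize(json, AgentJsonContext.Default.ToolCall);
```

Every type that crosses a serialization boundary (tool arguments, provider payloads, session persistence) needs a `[JsonSerializable]` entry on a context like this.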
By adhering to these constraints, the resulting single-binary Linux executable will have near-instant startup time and a dramatically reduced memory footprint compared to a standard JIT-compiled .NET application.