# Toak Project Structure

This document outlines the high-level architecture and directory structure of the Toak project to help contributors navigate the codebase.

## Overview

Toak is a fast, Linux-native dictation application that uses C# Native AOT (Ahead-Of-Time compilation) for minimal latency. It follows a client-daemon architecture: a daemon process manages background application state, while short-lived CLI commands issue control messages via Unix domain sockets.

## Directory Structure

```text
Toak/
├── Api/
│   ├── GroqApiClient.cs             # Client for external transcription and LLM API calls (Groq/Whisper)
│   └── Models/                      # API payload representations
├── Assets/                          # Sound files or other static resources
├── Audio/
│   └── AudioRecorder.cs             # Handles audio capture via system utilities (e.g., pw-record from PipeWire)
├── Commands/
│   ├── ToggleCommand.cs             # Start/stop recording and pass pipe/copy flags
│   ├── DiscardCommand.cs            # Abort the current recording
│   ├── OnboardCommand.cs            # Initial interactive configuration setup
│   ├── ConfigUpdaterCommand.cs      # Direct configuration modifications
│   ├── ShowCommand.cs               # Display current configuration
│   ├── SkillCommand.cs              # CLI controller for discovering and adding Dynamic JSON Skills
│   ├── LatencyTestCommand.cs        # Benchmark tool for API calls
│   ├── HistoryCommand.cs            # CLI interface to query, export, or shred past transcripts
│   └── StatsCommand.cs              # CLI interface to calculate analytics from history
├── Configuration/
│   ├── ConfigManager.cs             # Loads and saves JSON configuration from the user's home folder
│   └── ToakConfig.cs                # Data model for user preferences
├── Core/
│   ├── DaemonService.cs             # The background daemon maintaining the socket server and handling state
│   ├── Logger.cs                    # Logging utility (verbose logging)
│   ├── HistoryManager.cs            # Manages appending to and reading the local history.jsonl
│   ├── HistoryEntry.cs              # The data model for transcription history
│   ├── PromptBuilder.cs             # Constructs the system prompts for the LLM based on user settings
│   ├── StateTracker.cs              # Tracks the current application state (e.g. is recording active?)
│   └── Skills/                      # Data-driven JSON skill integrations
│       ├── SkillDefinition.cs       # JSON model
│       ├── DynamicSkill.cs          # Runtime implementation mapping LLM context to actions
│       └── SkillRegistry.cs         # Loads and detects skills from ~/.config/toak/skills/
├── IO/
│   ├── ClipboardManager.cs          # Cross-session (Wayland/X11) clipboard manipulation (`wl-copy`, `xclip`)
│   ├── TextInjector.cs              # Native keyboard injection handling (`wtype`, `xdotool`)
│   └── Notifications.cs             # System notifications (`notify-send`) and sound playback (`paplay`)
├── Serialization/
│   └── AppJsonSerializerContext.cs  # System.Text.Json source generation context for AOT support
├── docs/                            # Documentation
├── toak.service                     # systemd user service file to run the daemon automatically
├── uninstall.sh                     # Script to completely remove daemon, service, and binaries
└── Program.cs                       # Application entry point using System.CommandLine
```

## Key Architectural Concepts

### The Daemon Process

The `DaemonService` (`toak daemon`) is the heart of Toak. It listens on a Unix domain socket for IPC messages. This allows `toak toggle` to execute almost instantaneously, because all heavy lifting and state management is delegated to an already-running background process.

### Unix Sockets IPC

Client commands communicate with the daemon via Unix domain sockets. For details on the byte payloads used for communication, refer to [PROTOCOL.md](./PROTOCOL.md).

### AOT Compilation

The project relies on Native AOT compilation (`dotnet publish -c Release -r linux-x64 -p:PublishAot=true`) to avoid JIT startup time on CLI executions, making `toak toggle` fast enough to bind seamlessly to hotkeys.
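
Native AOT is also the reason the tree lists `Serialization/AppJsonSerializerContext.cs`: reflection-based `System.Text.Json` serialization is not AOT-safe, so serializable types are registered with a source-generated context instead. The following is a minimal sketch of that pattern, not Toak's actual code. `SketchConfig`, `SketchJsonContext`, and `ConfigJson` are hypothetical names, and the properties shown are assumptions rather than the real `ToakConfig` fields.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical stand-in for a settings model such as ToakConfig; the real
// properties are not listed in this document.
public sealed class SketchConfig
{
    public string Language { get; set; } = "en";
    public bool CopyToClipboard { get; set; }
}

// The System.Text.Json source generator emits serialization code at build time
// for every type registered here, so no runtime reflection is required.
[JsonSerializable(typeof(SketchConfig))]
internal partial class SketchJsonContext : JsonSerializerContext
{
}

public static class ConfigJson
{
    // Serialize through the generated type metadata instead of reflection.
    public static string Save(SketchConfig config) =>
        JsonSerializer.Serialize(config, SketchJsonContext.Default.SketchConfig);

    // Deserialize through the same generated metadata.
    public static SketchConfig? Load(string json) =>
        JsonSerializer.Deserialize(json, SketchJsonContext.Default.SketchConfig);
}
```

Each new type that crosses a JSON boundary (configuration, history entries, API payloads) would need its own `[JsonSerializable]` attribute on the context; forgetting one typically surfaces as a build-time or runtime error under AOT rather than a silent fallback.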
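
The client-daemon exchange described under "The Daemon Process" and "Unix Sockets IPC" can be illustrated with a small, self-contained sketch in which one process plays both roles. Everything specific here is an assumption made for illustration: the temporary socket path, the `IpcSketch` class, and the single-byte `ToggleCommand` value are hypothetical, and the real message format is the one defined in PROTOCOL.md.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class IpcSketch
{
    // Hypothetical command byte; the real values are defined in PROTOCOL.md.
    public const byte ToggleCommand = 0x01;

    // Starts a one-shot "daemon" listener, sends it one command as a "client",
    // and returns the byte the daemon side received.
    public static async Task<byte> RunAsync()
    {
        // Assumed throwaway socket path; the real daemon's path is not shown here.
        string socketPath = Path.Combine(
            Path.GetTempPath(), $"toak-sketch-{Guid.NewGuid():N}.sock");
        var endpoint = new UnixDomainSocketEndPoint(socketPath);

        // Daemon side: bind and listen before any client connects.
        using var listener = new Socket(
            AddressFamily.Unix, SocketType.Stream, ProtocolType.Unspecified);
        listener.Bind(endpoint);
        listener.Listen(1);

        Task<byte> daemon = Task.Run(async () =>
        {
            using Socket connection = await listener.AcceptAsync();
            var buffer = new byte[1];
            int read = await connection.ReceiveAsync(
                new ArraySegment<byte>(buffer), SocketFlags.None);
            return read == 1 ? buffer[0] : (byte)0;
        });

        // Client side: connect, send one control byte, and exit immediately,
        // mirroring a short-lived CLI command.
        using (var client = new Socket(
            AddressFamily.Unix, SocketType.Stream, ProtocolType.Unspecified))
        {
            await client.ConnectAsync(endpoint);
            await client.SendAsync(
                new ArraySegment<byte>(new[] { ToggleCommand }), SocketFlags.None);
        }

        byte received = await daemon;
        File.Delete(socketPath); // The socket file is not removed automatically.
        return received;
    }
}
```

Because the listener is already bound and waiting, the client's work reduces to a connect and a single small write, which is what keeps a hotkey-bound command responsive.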