Toak Project Structure

This document outlines the high-level architecture and directory structure of the Toak project to help contributors navigate the codebase.

Overview

Toak is a fast, Linux-native dictation application written in C# and compiled with Native AOT (ahead-of-time compilation) for minimal startup latency. It uses a client-daemon architecture: a long-running daemon process manages application state, while short-lived CLI commands send control messages to it over a Unix domain socket.

Directory Structure

Toak/
├── Api/
│   ├── GroqApiClient.cs     # Client for external transcription and LLM API calls (Groq/Whisper)
│   └── Models/              # API payload representations
├── Assets/                  # Sound files or other static resources
├── Audio/
│   └── AudioRecorder.cs     # Handles audio capture via system utilities (e.g., ffmpeg/arecord)
├── Commands/
│   ├── ToggleCommand.cs     # Start/stop recording and pass pipe/copy flags
│   ├── DiscardCommand.cs    # Abort the current recording
│   ├── OnboardCommand.cs    # Initial interactive configuration setup
│   ├── ConfigUpdaterCommand.cs # Direct configuration modifications
│   ├── ShowCommand.cs       # Display current configuration
│   └── LatencyTestCommand.cs # Benchmark tool for API calls
├── Configuration/
│   ├── ConfigManager.cs     # Loads and saves JSON configuration from the user's home folder
│   └── ToakConfig.cs        # Data model for user preferences
├── Core/
│   ├── DaemonService.cs     # The background daemon maintaining the socket server and handling states
│   ├── Logger.cs            # Logging utility (verbose logging)
│   ├── PromptBuilder.cs     # Constructs the system prompts for the LLM based on user settings
│   ├── StateTracker.cs      # Tracks the current application state (e.g. is recording active?)
│   └── Skills/              # Modular capabilities (e.g., Terminal mode, Language Translation)
├── IO/
│   ├── ClipboardManager.cs  # Cross-session (Wayland/X11) clipboard manipulation (`wl-copy`, `xclip`)
│   ├── TextInjector.cs      # Native keyboard injection handling (`wtype`, `xdotool`)
│   └── Notifications.cs     # System notifications (`notify-send`) and sound playback (`paplay`)
├── Serialization/
│   └── AppJsonSerializerContext.cs # System.Text.Json source generation context for AOT support
├── docs/                    # Documentation
├── toak.service             # systemd user service file to run the daemon automatically
└── Program.cs               # Application entry point using System.CommandLine

Key Architectural Concepts

The Daemon Process

The DaemonService (started with toak daemon) is the heart of Toak. It listens on a Unix domain socket for IPC messages. As a result, toak toggle returns almost instantly: the command only sends a message, while the already-running daemon handles the heavy lifting and state management.
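The daemon/client split described above can be sketched as a small self-contained program. This is an illustration only, not the actual DaemonService: the one-byte toggle command, the one-byte reply, the temporary socket path, and all type and method names are assumptions made for the example (the real payloads are defined in PROTOCOL.md).

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class DaemonSketch
{
    // Hypothetical command byte; the real wire format lives in PROTOCOL.md.
    public const byte ToggleCommand = 0x01;

    // Serves a fixed number of clients, keeping the recording flag alive between them.
    public static async Task ServeAsync(Socket listener, int maxClients)
    {
        bool isRecording = false; // stand-in for the state the daemon owns
        for (int i = 0; i < maxClients; i++)
        {
            using Socket client = await listener.AcceptAsync();
            var request = new byte[1];
            int read = await client.ReceiveAsync(request.AsMemory(), SocketFlags.None);
            if (read == 1 && request[0] == ToggleCommand)
            {
                isRecording = !isRecording;
            }
            // Reply with the new state (assumed reply format: 1 = recording, 0 = idle).
            var reply = new byte[] { isRecording ? (byte)1 : (byte)0 };
            await client.SendAsync(reply.AsMemory(), SocketFlags.None);
        }
    }

    // What a short-lived CLI command would do: connect, send one byte, read the reply.
    public static async Task<bool> SendToggleAsync(string socketPath)
    {
        using var socket = new Socket(AddressFamily.Unix, SocketType.Stream, ProtocolType.Unspecified);
        await socket.ConnectAsync(new UnixDomainSocketEndPoint(socketPath));
        await socket.SendAsync(new byte[] { ToggleCommand }.AsMemory(), SocketFlags.None);
        var reply = new byte[1];
        int read = await socket.ReceiveAsync(reply.AsMemory(), SocketFlags.None);
        return read == 1 && reply[0] == 1;
    }

    // Binds a listening Unix domain socket at the given path.
    public static Socket Listen(string socketPath)
    {
        File.Delete(socketPath); // remove a stale socket file from an earlier run
        var listener = new Socket(AddressFamily.Unix, SocketType.Stream, ProtocolType.Unspecified);
        listener.Bind(new UnixDomainSocketEndPoint(socketPath));
        listener.Listen(8);
        return listener;
    }

    public static async Task Main()
    {
        // Hypothetical socket location chosen for this demo only.
        string socketPath = Path.Combine(Path.GetTempPath(), $"toak-sketch-{Environment.ProcessId}.sock");
        using Socket listener = Listen(socketPath);
        Task server = ServeAsync(listener, maxClients: 2);

        Console.WriteLine(await SendToggleAsync(socketPath)); // True: recording started
        Console.WriteLine(await SendToggleAsync(socketPath)); // False: recording stopped

        await server;
        File.Delete(socketPath);
    }
}
```

Each client call does nothing beyond a connect, a one-byte write, and a one-byte read, which is why a design like this keeps the CLI side cheap: all state lives in the process that is already running.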

Unix Sockets IPC

Client commands communicate with the daemon via Unix sockets. For details on the byte payloads used for communication, please refer to PROTOCOL.md.

AOT Compilation

The project relies on Native AOT compilation (dotnet publish -c Release -r linux-x64 -p:PublishAot=true) to avoid JIT startup time on every CLI execution, making toak toggle fast enough to bind to a hotkey without noticeable delay.
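Native AOT also explains why Serialization/AppJsonSerializerContext.cs exists: reflection-based JSON serialization is not reliable in AOT builds, so System.Text.Json needs a source-generated context. A minimal sketch of that pattern follows. The SketchConfig type, its properties, and the SketchJsonContext name are hypothetical stand-ins, not the contents of ToakConfig.cs or AppJsonSerializerContext.cs.

```csharp
#nullable enable
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical settings type used only for this example.
public sealed class SketchConfig
{
    public string Language { get; set; } = "en";
    public bool CopyToClipboard { get; set; }
}

// The source generator fills in this partial class at compile time,
// so no runtime reflection is needed to (de)serialize SketchConfig.
[JsonSerializable(typeof(SketchConfig))]
public partial class SketchJsonContext : JsonSerializerContext
{
}

public static class JsonSketch
{
    // Serializes using the generated type metadata instead of reflection.
    public static string Save(SketchConfig config) =>
        JsonSerializer.Serialize(config, SketchJsonContext.Default.SketchConfig);

    // Deserializes using the same generated metadata.
    public static SketchConfig? Load(string json) =>
        JsonSerializer.Deserialize(json, SketchJsonContext.Default.SketchConfig);

    public static void Main()
    {
        string json = Save(new SketchConfig { Language = "de", CopyToClipboard = true });
        Console.WriteLine(json);

        SketchConfig? roundTripped = Load(json);
        Console.WriteLine(roundTripped?.Language);
    }
}
```

Every type that crosses a JSON boundary needs its own JsonSerializable attribute on the context, so adding a new configuration or API model in a setup like this usually means touching the context class as well.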