Toak: High-speed Linux Dictation
Toak is a high-speed, professional-grade dictation tool for Linux. It combines state-of-the-art Speech-to-Text (Whisper via Groq) with LLM refinement (Llama/GPT) to provide a seamless, articulate, and highly configurable dictation experience.
Built with .NET 10 and compiled with Native AOT, Toak runs as a standalone native binary that starts quickly and needs no separate .NET runtime installed.
🚀 Key Features
- Blazing Fast: Uses Groq's API for sub-second Whisper transcription and LLM refinement.
- Native AOT: Compiled to a native Linux binary for instant startup and minimal footprint.
- Intelligent Refinement: Automatically fixes grammar, punctuation, and technical terms while preserving your voice.
- Modular Skills: Actionable "System" commands for translation, terminal execution, professional rewriting, and summarization.
- Multiple Backends: Types directly into your active window (wtype or xdotool), copies to clipboard, or pipes to stdout.
- Beautiful CLI: Interactive onboarding and configuration powered by Spectre.Console.
🛠 Prerequisites
- .NET 10 SDK (for building from source)
- ffmpeg (for audio capture processing)
- Typing Backend: wtype (Wayland) or xdotool (X11)
- Groq API Key: Get one at console.groq.com
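
The prerequisites above can be checked before installing with a small shell helper. This is a sketch rather than anything shipped with Toak: missing_tools is a hypothetical name, and it assumes each tool is on PATH under the command name listed above (dotnet for the .NET SDK).

```shell
#!/bin/sh
# missing_tools: hypothetical helper (not shipped with Toak).
# Prints the name of each given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# dotnet is only needed to build from source; ffmpeg is always needed.
# Only one of wtype/xdotool is required, depending on Wayland vs. X11.
missing_tools dotnet ffmpeg wtype xdotool
```

An empty result means every listed tool was found. Seeing only wtype or only xdotool in the output is expected on a machine that runs a single display server.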
📦 Installation
Toak includes a self-contained installation script that handles the native compilation and setup:
git clone https://github.com/your-repo/toak.git
cd toak
./install.sh
The script will:
- Publish the project as a Native AOT Release binary.
- Install the executable to /usr/bin/toak.
- Install Zsh completions to /usr/share/zsh/site-functions/.
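
For readers who want to see roughly what those three steps involve, here is a dry-run sketch that only prints commands. It is not the contents of install.sh. The linux-x64 runtime identifier, the ./publish output directory, the binary name, and the _toak completion file name are all assumptions made for illustration; only the two destination paths come from the list above.

```shell
#!/bin/sh
# Dry-run sketch of the three steps; NOT the contents of install.sh.
# Assumptions: linux-x64 target, ./publish output directory, a binary
# named toak, and a completion file named _toak (zsh naming convention).
plan_install() {
    echo "dotnet publish -c Release -r linux-x64 -p:PublishAot=true -o ./publish"
    echo "sudo install -Dm755 ./publish/toak /usr/bin/toak"
    echo "sudo install -Dm644 ./_toak /usr/share/zsh/site-functions/_toak"
}

# Print the plan; nothing is built or installed.
plan_install
```

Printing the plan instead of running it keeps the sketch safe to try on any machine; the real script remains the supported way to install.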
🎮 Usage
Core Commands
- toak toggle: The primary command. Run it to start recording; run it again to stop, transcribe, and type/copy the result.
- toak discard: Instantly aborts the current recording without performing any transcription.
- toak onboard: Launches the interactive configuration wizard.
- toak latency-test: Benchmarks your network and API latency to ensure optimal performance.
- toak show: Displays your current configuration in a clean table.
- toak config <key> <value>: Quickly update a specific setting (e.g., toak config whisper whisper-large-v3-turbo).
Flags
- -p, --pipe: Output the finalized text to stdout instead of typing it.
- --copy: Copy the result to the system clipboard.
- -v, --verbose: Enable detailed debug logging.
🤖 Skills System
Toak includes a modular skills system triggered by saying the "System" hotword at the start of your dictation.
| Skill | Hotwords | Description |
|---|---|---|
| Terminal | "System terminal", "System run" | Executes the spoken command in your shell. |
| Translate | "System translate to [language]" | Translates your dictation into the target language. |
| Professional | "System professional", "System formalize" | Rewrites your text to be articulate and formal. |
| Summary | "System summary", "System concise" | Strips fluff and provides a direct, crisp summary. |
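
The hotword table can be read as a prefix match on the transcript. The sketch below illustrates that idea in shell; it is not Toak's implementation (which is written in C#), and skill_for is a hypothetical name. It assumes matching is case-insensitive and that the hotword must appear at the very start of the dictation.

```shell
#!/bin/sh
# skill_for: hypothetical illustration of prefix-based hotword routing.
# Prints the skill name from the table, or "none" for plain dictation.
skill_for() {
    # Lowercase the transcript so matching is case-insensitive.
    text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$text" in
        "system terminal"*|"system run"*)            echo "Terminal" ;;
        "system translate to "*)                     echo "Translate" ;;
        "system professional"*|"system formalize"*)  echo "Professional" ;;
        "system summary"*|"system concise"*)         echo "Summary" ;;
        *)                                           echo "none" ;;
    esac
}

skill_for "System translate to French good morning"   # prints "Translate"
skill_for "Good morning, System"                       # prints "none"
```

The second call shows why position matters in this sketch: the word "System" in the middle or at the end of a sentence is treated as ordinary dictation.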
⚙️ Configuration
Toak's behavior is defined in ToakConfig.cs and can be managed via the CLI. Key settings include:
- WhisperModel: The STT model (default: whisper-large-v3-turbo).
- LlmModel: The refinement model (default: openai/gpt-oss-20b).
- TypingBackend: Choose between wtype (Wayland) or xdotool (X11).
- ModulePunctuation: Toggle automatic grammar and punctuation fixing.
- ModuleTechnicalSanitization: Ensures technical terms like C#, SQL, or API are formatted correctly.
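
If it is unclear which TypingBackend value fits a machine, the session type usually settles it. The helper below is a sketch, not part of Toak: suggest_backend is a hypothetical name, and it assumes the desktop session exports XDG_SESSION_TYPE, as most systemd-based distributions do.

```shell
#!/bin/sh
# suggest_backend: hypothetical helper (not part of Toak).
# Maps a session type to the typing backend named in this README.
# Uses the first argument if given, otherwise $XDG_SESSION_TYPE.
suggest_backend() {
    case "${1:-${XDG_SESSION_TYPE:-}}" in
        wayland) echo "wtype" ;;
        x11)     echo "xdotool" ;;
        *)       echo "unknown" ;;
    esac
}

suggest_backend   # prints wtype, xdotool, or unknown for the current session
```

The CLI key that corresponds to TypingBackend is not listed here, so the suggested value is best applied through toak onboard, or through toak config once the key name is confirmed with toak show.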