diff --git a/README.md b/README.md
new file mode 100644
index 0000000..652234f
--- /dev/null
+++ b/README.md
@@ -0,0 +1,78 @@
+# Hush
+
+A Linux speech-to-text daemon that records audio, transcribes it via the Groq Whisper API, optionally cleans up the text with an LLM, and types the result into whatever window currently has focus.
+
+The intended workflow: bind `hush toggle` to a global hotkey, press to start recording, press again to stop, and the transcribed text appears in your active application.
+
+## Requirements
+
+**Runtime dependencies:**
+
+- [pw-record](https://pipewire.org/) (PipeWire) or `ffmpeg` (PulseAudio) for audio capture
+- [wtype](https://github.com/atx/wtype) (Wayland) or [xdotool](https://github.com/jordansissel/xdotool) (X11) for text injection
+- `notify-send` for desktop notifications
+- A [Groq API key](https://console.groq.com/)
+
+**Build dependencies:**
+
+- [.NET 10.0 SDK](https://dotnet.microsoft.com/download/dotnet/10.0)
+
+## Installation
+
+```bash
+sudo ./install.sh
+```
+
+This builds the project, installs the `hush` binary to `/usr/bin/hush`, sets up zsh tab-completion, and registers `hush.service` as a systemd (or OpenRC) user service.
+
+To uninstall:
+
+```bash
+sudo ./uninstall.sh
+```
+
+## Setup
+
+After installing, run the interactive setup wizard to configure your API key and preferences:
+
+```bash
+hush setup
+```
+
+Config is stored at `~/.config/hush/config` in TOML format.
+
+## Commands
+
+| Command | Description |
+|---|---|
+| `hush toggle` | Start recording if idle, stop and transcribe if recording |
+| `hush start` | Start recording |
+| `hush stop` | Stop recording and transcribe |
+| `hush abort` | Discard the current recording |
+| `hush status` | Show daemon state and recording duration |
+| `hush show` | Show current configuration |
+| `hush latency-test` | Measure STT and LLM round-trip latency |
+| `hush daemon` | Run the daemon in the foreground |
+| `hush setup` | Interactive configuration wizard |
+
+## Building Manually
+
+```bash
+./build.sh
+```
+
+Produces a native AOT binary at `Hush.Cli/bin/Release/net10.0/linux-x64/publish/Hush.Cli`.
+
+## Configuration
+
+Key settings in `~/.config/hush/config`:
+
+| Key | Default | Description |
+|---|---|---|
+| `groq_api_key` | | Groq API key |
+| `whisper_model` | `whisper-large-v3-turbo` | Whisper model to use |
+| `llm_model` | `openai/gpt-oss-20b` | LLM model for text cleanup |
+| `audio_backend` | `pw-record` | `pw-record` or `ffmpeg` |
+| `typing_backend` | `wtype` | `wtype` or `xdotool` |
+| `whisper_language` | | Optional ISO 639-1 language code |
+| `min_recording_duration` | `500` | Minimum recording length in ms |
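
For reference, the settings in the Configuration table could look roughly like this in `~/.config/hush/config`. This is only a sketch built from the documented keys and defaults. It assumes the keys sit at the top level of the TOML file, that string settings are quoted, and that `min_recording_duration` is a plain integer; the README does not confirm any of these, and the API key value is a placeholder.

```toml
# Hypothetical ~/.config/hush/config (layout and value types are assumptions).
groq_api_key = "YOUR_GROQ_API_KEY"        # placeholder, not a real key
whisper_model = "whisper-large-v3-turbo"  # documented default
llm_model = "openai/gpt-oss-20b"          # documented default
audio_backend = "pw-record"               # or "ffmpeg"
typing_backend = "wtype"                  # or "xdotool"
whisper_language = "en"                   # optional ISO 639-1 code; omit to leave unset
min_recording_duration = 500              # milliseconds
```

Running `hush setup` remains the documented way to create this file, and `hush show` prints the configuration currently in effect.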