# Hush

A Linux speech-to-text daemon that records audio, transcribes it via the Groq Whisper API, optionally cleans up the text with an LLM, and types the result into whatever window currently has focus.

The intended workflow is to bind `hush toggle` to a global hotkey: press once to start recording, press again to stop, and the transcribed text is typed into your active application.

## Requirements

**Runtime dependencies:**

- [pw-record](https://pipewire.org/) (PipeWire) or `ffmpeg` (PulseAudio) — audio capture
- [wtype](https://github.com/atx/wtype) (Wayland) or [xdotool](https://github.com/jordansissel/xdotool) (X11) — text injection
- `notify-send` — desktop notifications
- A [Groq API key](https://console.groq.com/)

**Build dependencies:**

- [.NET 10.0 SDK](https://dotnet.microsoft.com/download/dotnet/10.0)
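
Before installing, it can help to confirm which of the tools listed above are actually on your `PATH`. The following is a minimal sketch, not part of Hush itself: the `have` and `check_group` helpers are hypothetical names introduced here, and the script only checks that an executable exists, not that it works with your audio or display setup.

```bash
#!/usr/bin/env bash
# Report which of the tools named in the Requirements section are on PATH.
# `have` and `check_group` are illustrative helpers, not part of Hush.

have() { command -v "$1" >/dev/null 2>&1; }

# Usage: check_group LABEL TOOL [TOOL...]
# Prints the first available tool for the label, or reports the group as missing.
check_group() {
  local label=$1 tool
  shift
  for tool in "$@"; do
    if have "$tool"; then
      printf '%s: %s\n' "$label" "$tool"
      return 0
    fi
  done
  printf '%s: missing (need one of: %s)\n' "$label" "$*"
  return 1
}

status=0
check_group "audio capture" pw-record ffmpeg || status=1
check_group "text injection" wtype xdotool   || status=1
check_group "notifications" notify-send      || status=1
check_group "build" dotnet                   || status=1

if [ "$status" -eq 0 ]; then
  echo "All required tools found."
else
  echo "Some required tools are missing."
fi
```

Each group needs only one of its alternatives, so a PipeWire and Wayland system can pass without `ffmpeg` or `xdotool` installed.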

## Installation

```bash
sudo ./install.sh
```

This builds the project, installs the `hush` binary to `/usr/bin/hush`, sets up zsh tab-completion, and registers `hush.service` as a systemd (or OpenRC) user service.

To uninstall:

```bash
sudo ./uninstall.sh
```

## Setup

After installing, run the interactive setup wizard to configure your API key and preferences:

```bash
hush setup
```

Config is stored at `~/.config/hush/config` in TOML format.

## Commands

| Command | Description |
|---|---|
| `hush toggle` | Start recording if idle, stop and transcribe if recording |
| `hush start` | Start recording |
| `hush stop` | Stop recording and transcribe |
| `hush abort` | Discard the current recording |
| `hush status` | Show daemon state and recording duration |
| `hush show` | Show current configuration |
| `hush latency-test` | Measure STT and LLM round-trip latency |
| `hush daemon` | Run the daemon in the foreground |
| `hush setup` | Interactive configuration wizard |
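
The commands above can also be combined in scripts. As a rough sketch, the hypothetical `dictate_for` function below records for a fixed number of seconds using `hush start` and `hush stop`, and falls back to `hush abort` if the wait fails. It assumes that `hush start` exits with a non-zero status when recording cannot begin, which this README does not confirm.

```bash
# Hypothetical helper: record for a fixed duration, then transcribe.
# Assumes `hush start` returns non-zero on failure (not confirmed here).
dictate_for() {
  local seconds=${1:-5}   # default to 5 seconds if no argument is given

  hush start || return 1

  if sleep "$seconds"; then
    hush stop    # normal path: stop recording and transcribe
  else
    hush abort   # sleep failed (e.g. invalid duration): discard the recording
  fi
}
```

For example, `dictate_for 10` would record for ten seconds and then type the transcription into the focused window.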

## Building Manually

```bash
./build.sh
```

This produces a Native AOT binary at `Hush.Cli/bin/Release/net10.0/linux-x64/publish/Hush.Cli`.

## Configuration

Key settings in `~/.config/hush/config`:

| Key | Default | Description |
|---|---|---|
| `groq_api_key` | (none) | Groq API key (required) |
| `whisper_model` | `whisper-large-v3-turbo` | Whisper model to use |
| `llm_model` | `openai/gpt-oss-20b` | LLM model for text cleanup |
| `audio_backend` | `pw-record` | `pw-record` or `ffmpeg` |
| `typing_backend` | `wtype` | `wtype` or `xdotool` |
| `whisper_language` | (none) | Optional ISO 639-1 language code, e.g. `en` |
| `min_recording_duration` | `500` | Minimum recording length in milliseconds |
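
To make the table concrete, the sketch below writes a sample config to a scratch file and reads two values back. The layout is an assumption: flat top-level TOML keys, quoted strings, and an integer duration. The file that `hush setup` generates may differ. `config_get` is a hypothetical helper that only handles simple `key = value` lines; it is not a TOML parser.

```bash
# Write a sample config to a scratch file (not the real config path).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Assumed layout: flat keys, quoted strings, integer duration.
groq_api_key = "YOUR_GROQ_API_KEY"
whisper_model = "whisper-large-v3-turbo"
llm_model = "openai/gpt-oss-20b"
audio_backend = "pw-record"
typing_backend = "wtype"
whisper_language = "en"          # optional
min_recording_duration = 500
EOF

# Hypothetical helper: print the value of one flat `key = value` entry,
# with surrounding double quotes removed. Usage: config_get KEY FILE
config_get() {
  sed -n "s/^$1[[:space:]]*=[[:space:]]*//p" "$2" | tr -d '"' | head -n 1
}

config_get audio_backend "$sample"
config_get min_recording_duration "$sample"
```

Run `hush show` to see the configuration the daemon actually uses.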