feat: Add Together AI and FFmpeg support, introduce core orchestration, and update documentation and install scripts.
@@ -12,43 +12,45 @@ Toak is designed as a fast, Linux-native dictation application utilizing C# AOT
Toak/
├── Api/
│ ├── GroqApiClient.cs # Client for external transcription and LLM API calls (Groq/Whisper)
│ ├── OpenAiCompatibleClient.cs # Generic OpenAI-compatible client for Groq and Together AI
│ └── Models/ # API payload representations
├── Assets/ # Sound files or other static resources
├── Audio/
│ └── AudioRecorder.cs # Handles audio capture via system utilities (e.g., pw-record from PipeWire)
│ ├── AudioRecorder.cs # Handles audio capture via PipeWire (pw-record)
│ └── FfmpegAudioRecorder.cs # Universal audio capture via ffmpeg
├── Commands/
│ ├── ToggleCommand.cs # Start/stop recording and pass pipe/copy flags
│ ├── DiscardCommand.cs # Abort the current recording
│ ├── OnboardCommand.cs # Initial interactive configuration setup
│ ├── ToggleCommand.cs # Client command to start/stop recording via socket
│ ├── DiscardCommand.cs # Client command to abort current recording
│ ├── OnboardCommand.cs # Interactive configuration setup wizard
│ ├── ConfigUpdaterCommand.cs # Direct configuration modifications
│ ├── ShowCommand.cs # Display current configuration
│ ├── SkillCommand.cs # CLI controller for discovering and adding Dynamic JSON Skills
│ ├── LatencyTestCommand.cs # Benchmark tool for API calls
│ ├── HistoryCommand.cs # CLI interface to query, export, or shred past transcripts
│ └── StatsCommand.cs # CLI interface to calculate analytics from history
│ ├── SkillCommand.cs # CLI controller for managing JSON Skills
│ ├── LatencyTestCommand.cs # Pipeline benchmark tool
│ ├── HistoryCommand.cs # Interface to query past transcriptions
│ └── StatsCommand.cs # Aggregated usage analytics
├── Configuration/
│ ├── ConfigManager.cs # Loads and saves JSON configuration from the user's home folder
│ ├── ConfigManager.cs # Loads/saves JSON configuration
│ └── ToakConfig.cs # Data model for user preferences
├── Core/
│ ├── DaemonService.cs # The background daemon maintaining the socket server and handling states
│ ├── Logger.cs # Logging utility (verbose logging)
│ ├── HistoryManager.cs # Manages appending and reading the local history.jsonl
│ ├── HistoryEntry.cs # The data model for transcription history
│ ├── PromptBuilder.cs # Constructs the system prompts for the LLM based on user settings
│ ├── StateTracker.cs # Tracks the current application state (e.g. is recording active?)
│ ├── DaemonService.cs # Background daemon maintaining the socket server
│ ├── TranscriptionOrchestrator.cs # Coordinates audio recording, STT, LLM, and output
│ ├── Logger.cs # Logging utility
│ ├── HistoryManager.cs # Thread-safe history management (.jsonl)
│ ├── HistoryEntry.cs # Data model for transcription history
│ ├── PromptBuilder.cs # Constructs LLM system prompts
│ ├── StateTracker.cs # Tracks application state and recording PIDs
│ ├── Interfaces/ # Core abstractions (ILlmClient, IAudioRecorder, etc.)
│ └── Skills/ # Data-driven JSON skill integrations
│ ├── SkillDefinition.cs # JSON Model
│ ├── DynamicSkill.cs # Runtime implementation mapping LLM context to actions
│ └── SkillRegistry.cs # Loads and detects skills from ~/.config/toak/skills/
├── IO/
│ ├── ClipboardManager.cs # Cross-session (Wayland/X11) clipboard manipulation (`wl-copy`, `xclip`)
│ ├── TextInjector.cs # Native keyboard injection handling (`wtype`, `xdotool`)
│ └── Notifications.cs # System notifications (`notify-send`) and sound playback (`paplay`)
│ ├── ClipboardManager.cs # Cross-session clipboard manipulation (wl-copy, xclip)
│ ├── TextInjector.cs # Native keyboard injection (wtype, xdotool, ydotool)
│ └── Notifications.cs # System notifications and sound playback
├── Serialization/
│ └── AppJsonSerializerContext.cs # System.Text.Json source generation context for AOT support
│ └── AppJsonSerializerContext.cs # System.Text.Json source generation for AOT
├── bin/ # Compiler output
├── docs/ # Documentation
├── toak.service # systemd user service file to run the daemon automatically
├── uninstall.sh # Script to completely remove daemon, service, and binaries
├── install.sh # Native AOT build and installation script
├── toak.service # systemd user service definition
└── Program.cs # Application entry point using System.CommandLine
```