Voice Mode Is Useless Until It Runs On-Device with WhisperKit
Every AI assistant now has a voice mode. You talk, it transcribes, it responds. The problem is that cloud-based voice transcription adds 500ms-2s of latency, requires internet, and sends your audio to someone else's server. For quick commands like "open the terminal" or "switch to Figma," that latency makes voice slower than just using the keyboard.
The Latency Problem
Voice should be faster than typing. That is the entire value proposition. But when you factor in cloud round-trip time plus the model processing time, saying "create a new file called index.ts" takes longer than pressing Cmd+N and typing the filename. Voice mode only wins for long-form dictation where you would be typing for 30+ seconds.
WhisperKit Changes Everything
WhisperKit runs Whisper models directly on Apple Silicon. No cloud round trip, and no internet connection required once the model is on disk. Transcription happens in under 200ms on an M1 chip. This makes voice input genuinely faster than typing for short commands.
The setup is straightforward: WhisperKit is an open source Swift package that runs Whisper models converted to Core ML format. Inference runs primarily on the Neural Engine and GPU, leaving the CPU largely free for your actual work.
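To give a sense of how little code the transcription step needs, here is a minimal sketch that transcribes one recorded audio file. It assumes the WhisperKit package (github.com/argmaxinc/WhisperKit) is already a dependency of your target. The function name `transcribeFile` is made up for this example, and the exact signatures have changed between WhisperKit releases, so check it against the version you install.

```swift
import Foundation
import WhisperKit

// Hypothetical helper: transcribe a single audio file fully on-device.
// `audioPath` is assumed to point at an existing audio file (e.g. a .wav or .m4a).
func transcribeFile(at audioPath: String) async throws -> String {
    // With no arguments, WhisperKit selects a default model for the device.
    // The model is downloaded the first time, so that first run needs a connection.
    let pipe = try await WhisperKit()

    // Recent releases return an array of results; the explicit type annotation
    // selects that overload. Older releases returned a single optional result.
    let results: [TranscriptionResult] = try await pipe.transcribe(audioPath: audioPath)

    // Join the text of each result into one string.
    return results.map(\.text).joined(separator: " ")
}
```

Creating the `WhisperKit` instance loads the model, so a long-running wrapper would likely build it once at launch and reuse it for every utterance rather than paying that cost per command.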
Free SuperWhisper Alternative
SuperWhisper is a great app but costs $8/month for on-device transcription. With WhisperKit and a thin wrapper, you get the same functionality for free:
- On-device transcription with no cloud dependency
- Low latency suitable for real-time command input
- Privacy-preserving since audio never leaves your Mac
- Customizable hotkey to toggle listening
Voice as an Agent Interface
The real unlock is not dictation; it is using voice as the primary interface for desktop AI agents. Instead of typing prompts, you speak them. Instead of clicking through menus, you describe what you want. On-device transcription makes this practical for the first time because the latency is low enough to feel natural.
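Once you have a transcript, something has to decide what it means. The sketch below shows one simple way to do that: route short commands such as "open the terminal" or "switch to Figma" to direct actions, and pass everything else to the agent as a free-form prompt. The `AgentAction` type, the `route` function, and the prefix rules are all hypothetical and only for illustration; they are not Fazm's or WhisperKit's API.

```swift
import Foundation

// Hypothetical action set; a real agent would define its own.
enum AgentAction: Equatable {
    case openApp(String)
    case switchToApp(String)
    case prompt(String) // unrecognized speech is passed through as a prompt
}

// Maps a raw transcript to an action using simple prefix matching.
func route(transcript: String) -> AgentAction {
    // Trim surrounding whitespace, then the sentence punctuation that
    // speech models often append to short utterances.
    let cleaned = transcript
        .trimmingCharacters(in: .whitespacesAndNewlines)
        .trimmingCharacters(in: CharacterSet(charactersIn: ".!?"))
    let lowered = cleaned.lowercased()

    // Rules are checked in order, so the more specific prefix comes first.
    let rules: [(prefix: String, make: (String) -> AgentAction)] = [
        ("open the ", { .openApp($0) }),
        ("open ", { .openApp($0) }),
        ("switch to ", { .switchToApp($0) }),
    ]

    for rule in rules where lowered.hasPrefix(rule.prefix) {
        // Keep the original casing of the target, e.g. "Figma".
        let target = String(cleaned.dropFirst(rule.prefix.count))
        return rule.make(target)
    }
    return .prompt(cleaned)
}
```

Matching commands locally keeps the short-command path fast, and the fallback case means long or unusual requests still reach the agent unchanged.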
Fazm is an open source macOS AI agent, available on GitHub.