Fazm - Open Source Voice-Controlled AI Agent for macOS
Fazm is a macOS app that lets you control your entire computer with your voice. Press a keyboard shortcut, say what you need, and the agent does it: navigating apps, filling in forms, sending emails, organizing files.
What Makes It Different
- Fully local. Your screen content and voice recordings never leave your Mac. No cloud processing of sensitive data. Read more about why local-first architecture matters.
- No account needed. Download, install, start using. No signup, no API keys, no subscription.
- Open source. MIT licensed. The entire codebase is on GitHub. Read it, fork it, contribute to it.
- Native Swift/SwiftUI. Not an Electron wrapper. A real macOS app that uses ScreenCaptureKit, accessibility APIs, and Apple Silicon acceleration.
How It Works
- Press the hotkey (Option+Space by default)
- Say what you need ("update the CRM with today's call notes", "send Sarah the project summary", "organize my downloads folder")
- Fazm reads your screen, plans the actions, and executes them
- You approve sensitive actions or let routine ones auto-complete
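The approval step above can be pictured as a small gate in front of the executor. The following is only a sketch under stated assumptions, not Fazm's actual code: `Sensitivity`, `PlannedAction`, `Decision`, and `runActions` are hypothetical names invented for this illustration, and how Fazm really classifies an action as sensitive is not described here.

```swift
// Hypothetical types for illustration; none of these names come from the Fazm codebase.
enum Sensitivity { case routine, sensitive }

struct PlannedAction {
    let summary: String
    let sensitivity: Sensitivity
}

enum Decision: Equatable { case executed, skipped }

/// Runs actions in order. Routine actions run immediately; sensitive ones
/// run only if the `approve` callback returns true.
func runActions(_ actions: [PlannedAction],
                approve: (PlannedAction) -> Bool,
                execute: (PlannedAction) -> Void) -> [Decision] {
    return actions.map { action -> Decision in
        if action.sensitivity == .sensitive && !approve(action) {
            return .skipped
        }
        execute(action)
        return .executed
    }
}

// Usage: the user declines every sensitive action, so only the routine one runs.
var executedSummaries: [String] = []
let decisions = runActions(
    [PlannedAction(summary: "Move files into Downloads/Archive", sensitivity: .routine),
     PlannedAction(summary: "Send email to Sarah", sensitivity: .sensitive)],
    approve: { _ in false },
    execute: { executedSummaries.append($0.summary) }
)
```

Passing `approve` as a closure keeps the gate independent of the UI: a confirmation dialog, a voice prompt, or a test stub can all supply the answer.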
The Technical Stack
- ScreenCaptureKit for real-time screen capture
- Accessibility APIs for reliable UI control (no fragile screenshot-based clicking). See DOM vs screenshots explained
- WhisperKit for local voice transcription on Apple Silicon
- Claude or Ollama for action planning (your choice of a cloud LLM or a local one; Ollama runs the model on your own machine)
- Swift concurrency for the async capture-plan-execute pipeline
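To show how the stack above might fit together, here is a minimal sketch of a capture-plan-execute pipeline written with async/await. It is an assumption-laden illustration, not the project's implementation: the `ScreenReader`, `Planner`, and `Executor` protocols, the `Pipeline` struct, and the fake stages are all hypothetical stand-ins for ScreenCaptureKit, the LLM planner, and accessibility-based control.

```swift
// Hypothetical stage protocols; not taken from the Fazm source.
protocol ScreenReader: Sendable {
    func snapshot() async throws -> String
}
protocol Planner: Sendable {
    func plan(request: String, screen: String) async throws -> [String]
}
protocol Executor: Sendable {
    func perform(_ step: String) async throws
}

struct Pipeline {
    let reader: any ScreenReader
    let planner: any Planner
    let executor: any Executor

    /// Captures the screen, asks the planner for steps, then runs them in order.
    /// Returns the number of steps that were planned and performed.
    func handle(_ request: String) async throws -> Int {
        let screen = try await reader.snapshot()
        let steps = try await planner.plan(request: request, screen: screen)
        for step in steps {
            try Task.checkCancellation()  // throws if the surrounding task was cancelled
            try await executor.perform(step)
        }
        return steps.count
    }
}

// Stand-in stages so the sketch runs without any macOS frameworks.
struct FakeReader: ScreenReader {
    func snapshot() async throws -> String { "Finder window: Downloads" }
}
struct FakePlanner: Planner {
    func plan(request: String, screen: String) async throws -> [String] {
        ["open Downloads", "group files by type"]
    }
}
actor RecordingExecutor: Executor {
    private(set) var performed: [String] = []
    func perform(_ step: String) async throws { performed.append(step) }
}

let executor = RecordingExecutor()
let pipeline = Pipeline(reader: FakeReader(), planner: FakePlanner(), executor: executor)
let stepCount = try await pipeline.handle("organize my downloads folder")
let performedSteps = await executor.performed
```

Hiding each stage behind a protocol means a cloud planner and a local one can be swapped without touching the loop, and the actor keeps the executor's recorded state safe under concurrent access.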
Getting Started
git clone https://github.com/m13v/fazm
cd fazm
open Fazm.xcodeproj
# Build and run (Cmd+R)
Or download the latest release from fazm.ai. New to AI agents? Our beginner's guide walks through setup step by step.
Discussed across r/macapps, r/opensource, r/SideProject, r/coolgithubprojects, and r/foss.