Fazm - Open Source Voice-Controlled AI Agent for macOS

Fazm Team · 2 min read
Tags: fazm, open-source, macos, voice-control, announcement

Fazm is a macOS app that lets you control your entire computer with your voice. Press a keyboard shortcut, say what you need, and the agent does it: navigating apps, filling forms, sending emails, organizing files.

What Makes It Different

  • Fully local. Your screen content and voice recordings never leave your Mac. No cloud processing of sensitive data. Read more about why local-first architecture matters.
  • No account needed. Download, install, start using. No signup, no API keys, no subscription.
  • Open source. MIT licensed. The entire codebase is on GitHub. Read it, fork it, contribute to it.
  • Native Swift/SwiftUI. Not an Electron wrapper. A real macOS app that uses ScreenCaptureKit, accessibility APIs, and Apple Silicon acceleration.

How It Works

  1. Press the hotkey (Option+Space by default)
  2. Say what you need ("update the CRM with today's call notes", "send Sarah the project summary", "organize my downloads folder")
  3. Fazm reads your screen, plans the actions, and executes them
  4. You approve sensitive actions or let routine ones auto-complete
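
Step 4 above amounts to a small policy decision made for each planned action. The sketch below is one way such a gate could look, not Fazm's actual code: `ActionKind`, `PlannedAction`, `ApprovalPolicy`, and the choice of which kinds count as sensitive are all hypothetical names and assumptions made for illustration.

```swift
// Hypothetical model of the approval step. None of these types come
// from the Fazm codebase; they only illustrate the idea.

enum ActionKind {
    case click, typeText, moveFile, sendEmail, deleteFile
}

struct PlannedAction {
    let kind: ActionKind
    let summary: String
}

enum Decision: Equatable {
    case autoRun   // routine action, runs without a prompt
    case askUser   // sensitive action, waits for explicit approval
}

struct ApprovalPolicy {
    // Assumption: actions that send data out or destroy data are sensitive.
    var sensitiveKinds: Set<ActionKind> = [.sendEmail, .deleteFile]

    func decide(_ action: PlannedAction) -> Decision {
        sensitiveKinds.contains(action.kind) ? .askUser : .autoRun
    }
}

let policy = ApprovalPolicy()
let plan = [
    PlannedAction(kind: .moveFile, summary: "Move invoice.pdf to Documents/Invoices"),
    PlannedAction(kind: .sendEmail, summary: "Send Sarah the project summary"),
]

for action in plan {
    switch policy.decide(action) {
    case .autoRun: print("auto: \(action.summary)")
    case .askUser: print("needs approval: \(action.summary)")
    }
}
```

Keeping the sensitive set as data rather than hard-coded branches would let a user widen or narrow what runs unattended without touching the decision logic.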

The Technical Stack

  • ScreenCaptureKit for real-time screen capture
  • Accessibility APIs for reliable UI control (no fragile screenshot-based clicking) - see DOM vs screenshots explained
  • WhisperKit for local voice transcription on Apple Silicon
  • Claude or Ollama for action planning (your choice of cloud or local LLM)
  • Swift concurrency for the async capture-plan-execute pipeline
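
To make the last bullet concrete, here is a minimal sketch of a capture-plan-execute pipeline written with async/await. It is an illustration under stated assumptions, not the project's implementation: the `ScreenReader`, `Planner`, and `Executor` protocols and the stub types are hypothetical, and in the real app each stage would presumably be backed by ScreenCaptureKit, an LLM, and the accessibility APIs respectively.

```swift
// Hypothetical stage interfaces (not taken from the Fazm codebase).
protocol ScreenReader: Sendable {
    func snapshot() async throws -> String
}
protocol Planner: Sendable {
    func plan(request: String, screen: String) async throws -> [String]
}
protocol Executor: Sendable {
    func perform(_ step: String) async throws
}

struct Pipeline: Sendable {
    let reader: any ScreenReader
    let planner: any Planner
    let executor: any Executor

    // Runs the stages in order; each await is a suspension point, so the
    // calling thread is not blocked while a stage is working.
    func run(request: String) async throws {
        let screen = try await reader.snapshot()
        let steps = try await planner.plan(request: request, screen: screen)
        for step in steps {
            try Task.checkCancellation()   // stop early if the task was cancelled
            try await executor.perform(step)
        }
    }
}

// Stubs standing in for the real capture, planning, and UI-control stages.
struct StubReader: ScreenReader {
    func snapshot() async throws -> String { "Downloads window with 3 files" }
}
struct StubPlanner: Planner {
    func plan(request: String, screen: String) async throws -> [String] {
        ["open Finder", "sort files by type"]
    }
}
actor RecordingExecutor: Executor {
    private(set) var performed: [String] = []
    func perform(_ step: String) async throws { performed.append(step) }
}

let executor = RecordingExecutor()
let pipeline = Pipeline(reader: StubReader(), planner: StubPlanner(), executor: executor)
try await pipeline.run(request: "organize my downloads folder")
let done = await executor.performed
print(done)
```

Putting each stage behind a protocol keeps the pipeline testable with stubs like these, and using an actor for the executor serializes access to its mutable state.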

Getting Started

git clone https://github.com/m13v/fazm
cd fazm
open Fazm.xcodeproj
# Build and run (Cmd+R)

Or download the latest release from fazm.ai. New to AI agents? Our beginner's guide walks through setup step by step.


Discussed across r/macapps, r/opensource, r/SideProject, r/coolgithubprojects, and r/foss.