Fazm Just Went Live on Show HN - Voice-Controlled AI Agent for macOS

Fazm Team · 2 min read

We launched Fazm on Show HN - a voice-controlled AI agent for macOS that uses accessibility APIs instead of screenshots to interact with your computer.

Why Accessibility APIs Instead of Screenshots

Most desktop AI agents work by taking screenshots and using computer vision to figure out what is on screen. This approach has fundamental problems:

  • It is slow - capturing, uploading, and analyzing a screenshot takes seconds
  • It is fragile - different display resolutions, color themes, and font sizes break recognition
  • It is expensive - every screenshot requires a vision model API call
  • It guesses - the agent does not actually know what a UI element is; it infers it from pixels

Accessibility APIs solve all of these. The operating system already knows what every element on screen is - its type, label, value, and position. The API just exposes this information directly. No guessing, no vision models, no resolution issues.
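To make this concrete, here is a minimal sketch of reading UI element data through the macOS Accessibility API. This is an illustration of the underlying OS mechanism, not Fazm's actual code: it checks whether the process has accessibility permission and, if so, asks the system-wide element for the currently focused element's role and title.

```swift
import ApplicationServices

// Accessibility access must be granted in System Settings > Privacy & Security.
let trusted = AXIsProcessTrusted()
print("trusted=\(trusted)")

if trusted {
    // The system-wide element is the root of the accessibility hierarchy.
    let systemWide = AXUIElementCreateSystemWide()

    // Ask the OS which element currently has focus - no screenshot involved.
    var focusedRef: CFTypeRef?
    let err = AXUIElementCopyAttributeValue(
        systemWide, kAXFocusedUIElementAttribute as CFString, &focusedRef)

    if err == .success, let focused = focusedRef {
        let element = focused as! AXUIElement

        // The OS reports the element's role (type) and title (label) directly.
        var role: CFTypeRef?
        AXUIElementCopyAttributeValue(element, kAXRoleAttribute as CFString, &role)
        var title: CFTypeRef?
        AXUIElementCopyAttributeValue(element, kAXTitleAttribute as CFString, &title)

        print("focused role=\(role ?? "?" as CFTypeRef), title=\(title ?? "?" as CFTypeRef)")
    }
}
```

The same API also exposes each element's value and on-screen position, and lets a trusted process perform actions (press a button, set a text field) - which is what makes screenshot-free agents possible.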

Voice Control Changes the Interaction Model

Typing commands to an AI agent is already faster than doing things manually. But speaking commands goes a step further - you can automate tasks while your hands are busy with something else.

"Fazm, send the latest invoice to the client" while you are reviewing code. "Fazm, find the last email from the design team" while you are eating lunch. The voice interface makes the agent feel like an assistant rather than a tool.

Open Source

Fazm is fully open source. The codebase is on GitHub and anyone can inspect, modify, or extend it. We believe AI agents that control your computer should be transparent about what they do and how they do it.

No telemetry, no cloud dependency for core features, no black boxes.
