Fazm: Open Source macOS AI Agent on GitHub
Fazm is an open source macOS AI agent that controls your desktop through the native Accessibility API. The full source code lives on GitHub at github.com/m13v/fazm, where anyone can inspect, fork, and contribute to the project.
This post covers what Fazm does, how its architecture works, and how to get it running on your Mac from the GitHub repository.
What Fazm Does
Fazm reads the state of your desktop, identifies which applications are open, and takes actions on your behalf. Unlike browser-only automation tools, Fazm operates at the OS level. It can interact with any macOS application: Finder, Xcode, Terminal, Slack, email clients, creative tools, and everything in between.
The key differentiator is that Fazm uses the macOS Accessibility API rather than pixel-matching or screen recording. This means it reads the actual UI element tree (buttons, text fields, menus, labels) rather than trying to interpret screenshots. The result is faster, more reliable automation that does not break when you change your wallpaper or resize a window.
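To make the difference from screenshot interpretation concrete, here is a minimal sketch in Rust of looking up a control in a structured element tree. The `UiElement` type, `find_by_role_and_title`, and `sample_window` are hypothetical names invented for illustration and are not taken from the Fazm codebase; only the role strings (`AXWindow`, `AXButton`, `AXTextField`) follow the macOS accessibility naming.

```rust
// Hypothetical model of an accessibility tree node; not Fazm's actual types.
#[derive(Debug, Clone)]
struct UiElement {
    role: String,          // e.g. "AXButton", following macOS role naming
    title: Option<String>, // visible label, if the element has one
    children: Vec<UiElement>,
}

impl UiElement {
    fn new(role: &str, title: Option<&str>, children: Vec<UiElement>) -> Self {
        UiElement {
            role: role.to_string(),
            title: title.map(str::to_string),
            children,
        }
    }
}

// Depth-first search for the first element matching both role and title.
fn find_by_role_and_title<'a>(
    root: &'a UiElement,
    role: &str,
    title: &str,
) -> Option<&'a UiElement> {
    if root.role == role && root.title.as_deref() == Some(title) {
        return Some(root);
    }
    root.children
        .iter()
        .find_map(|child| find_by_role_and_title(child, role, title))
}

// Builds a tiny example tree: a window containing a text field and a button.
fn sample_window() -> UiElement {
    UiElement::new(
        "AXWindow",
        Some("Compose"),
        vec![
            UiElement::new("AXTextField", Some("Subject"), vec![]),
            UiElement::new("AXButton", Some("Send"), vec![]),
        ],
    )
}
```

Calling `find_by_role_and_title(&sample_window(), "AXButton", "Send")` returns the button node regardless of where it is drawn, which is why a lookup like this is unaffected by wallpaper or window size.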
Architecture Overview
Fazm's architecture has four main layers, each with a distinct responsibility:
<svg viewBox="0 0 700 340" xmlns="http://www.w3.org/2000/svg" style={{width:'100%',height:'auto'}}>
<defs>
<marker id="arrow" markerWidth="10" markerHeight="7" refX="10" refY="3.5" orient="auto">
<polygon points="0 0, 10 3.5, 0 7" fill="#14b8a6"/>
</marker>
</defs>
<rect x="20" y="20" width="660" height="60" rx="8" fill="#0d1117" stroke="#14b8a6" strokeWidth="2"/>
<text x="350" y="45" textAnchor="middle" fill="#2dd4bf" fontSize="14" fontWeight="bold">LLM Layer (Claude / OpenAI)</text>
<text x="350" y="62" textAnchor="middle" fill="#8b949e" fontSize="11">Interprets user intent, plans multi-step actions, generates tool calls</text>
<line x1="350" y1="80" x2="350" y2="110" stroke="#14b8a6" strokeWidth="2" markerEnd="url(#arrow)"/>
<rect x="20" y="110" width="660" height="60" rx="8" fill="#0d1117" stroke="#14b8a6" strokeWidth="2"/>
<text x="350" y="135" textAnchor="middle" fill="#2dd4bf" fontSize="14" fontWeight="bold">Agent Orchestrator (Swift)</text>
<text x="350" y="152" textAnchor="middle" fill="#8b949e" fontSize="11">Session management, memory, tool routing, permission checks</text>
<line x1="350" y1="170" x2="350" y2="200" stroke="#14b8a6" strokeWidth="2" markerEnd="url(#arrow)"/>
<rect x="20" y="200" width="320" height="60" rx="8" fill="#0d1117" stroke="#14b8a6" strokeWidth="2"/>
<text x="180" y="225" textAnchor="middle" fill="#2dd4bf" fontSize="14" fontWeight="bold">Accessibility Bridge (Rust)</text>
<text x="180" y="242" textAnchor="middle" fill="#8b949e" fontSize="11">AXUIElement traversal, element queries</text>
<rect x="360" y="200" width="320" height="60" rx="8" fill="#0d1117" stroke="#14b8a6" strokeWidth="2"/>
<text x="520" y="225" textAnchor="middle" fill="#2dd4bf" fontSize="14" fontWeight="bold">Action Executor</text>
<text x="520" y="242" textAnchor="middle" fill="#8b949e" fontSize="11">Clicks, keystrokes, drag, scroll, text input</text>
<line x1="180" y1="260" x2="180" y2="290" stroke="#14b8a6" strokeWidth="2" markerEnd="url(#arrow)"/>
<line x1="520" y1="260" x2="520" y2="290" stroke="#14b8a6" strokeWidth="2" markerEnd="url(#arrow)"/>
<rect x="20" y="290" width="660" height="40" rx="8" fill="#0d1117" stroke="#0d9488" strokeWidth="1.5" strokeDasharray="6,3"/>
<text x="350" y="315" textAnchor="middle" fill="#8b949e" fontSize="13">macOS Accessibility API (AXUIElement, CGEvent, NSWorkspace)</text>
</svg>
The Rust layer is critical. It wraps Apple's AXUIElement C API in a safe, performant interface that the Swift orchestrator calls through FFI. This is the same API that screen readers like VoiceOver use, which means Fazm gets first-class access to every UI element without any hacks or private APIs.
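As a rough illustration of that FFI boundary, the sketch below shows a Rust function with a C-compatible signature that a Swift caller could reach through a C header. The function name `bridge_role_is_clickable`, the helper `role_is_clickable`, and the choice of "clickable" roles are assumptions made for this example; they are not Fazm's actual exports.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Hypothetical entry point; the name and behavior are illustrative only.
// `extern "C"` selects the C calling convention. When building a library for
// Swift you would also mark it `#[no_mangle]` (written `#[unsafe(no_mangle)]`
// in the 2024 edition) so the symbol keeps this exact name.
pub extern "C" fn bridge_role_is_clickable(role: *const c_char) -> bool {
    // A null pointer from the caller is treated as "not clickable".
    if role.is_null() {
        return false;
    }
    // SAFETY: the caller promises `role` points to a valid NUL-terminated string.
    let role = unsafe { CStr::from_ptr(role) };
    match role.to_str() {
        Ok(r) => matches!(r, "AXButton" | "AXCheckBox" | "AXMenuItem"),
        Err(_) => false, // bytes were not valid UTF-8
    }
}

// Rust-side helper that exercises the C entry point the way a foreign caller would.
fn role_is_clickable(role: &str) -> bool {
    match CString::new(role) {
        Ok(c_role) => bridge_role_is_clickable(c_role.as_ptr()),
        Err(_) => false, // input contained an interior NUL byte
    }
}
```

Keeping pointer validation and UTF-8 checks on the Rust side means the caller only ever sees a plain boolean, which is one way to keep the `unsafe` surface small.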
GitHub Repository Structure
The Fazm GitHub repository is organized into clear modules:
| Directory | Language | Purpose |
|---|---|---|
| fazm/ | Swift | Main macOS app, menu bar UI, agent orchestrator |
| accessibility/ | Rust | AXUIElement bindings, tree traversal, element caching |
| core/ | Swift | Shared types, LLM client, tool definitions |
| memory/ | Swift | Persistent agent memory, session context |
| actions/ | Swift | Desktop action execution (click, type, scroll) |
Key Files to Start With
If you are exploring the codebase for the first time, these files give you the best overview:
- fazm/App.swift is the entry point. It sets up the menu bar app and initializes the agent.
- core/Agent.swift contains the main agent loop: observe the screen, send context to the LLM, parse tool calls, execute actions, repeat.
- accessibility/src/lib.rs is the Rust FFI bridge that reads the accessibility tree.
- memory/MemoryStore.swift handles persistent memory so the agent learns your preferences across sessions.
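The observe, plan, execute, repeat cycle described for the agent loop can be sketched as follows. This is an illustration in Rust with invented names (`ToolCall`, `Planner`, `Desktop`, `run_agent`, `ScriptedPlanner`, `FakeDesktop`); the real loop is Swift code in core/Agent.swift and may be structured quite differently. A scripted planner stands in for the LLM so the loop can run without network access.

```rust
// All names below are hypothetical; this is not Fazm's actual agent code.

// A tool call the planner asks the agent to execute.
#[derive(Debug, Clone, PartialEq)]
enum ToolCall {
    Click { target: String },
    TypeText { text: String },
    Done,
}

// Stand-in for the LLM: given the observed state and history, pick the next action.
trait Planner {
    fn next_action(&mut self, observation: &str, history: &[ToolCall]) -> ToolCall;
}

// Stand-in for the desktop: it can be observed and acted upon.
trait Desktop {
    fn observe(&self) -> String;
    fn execute(&mut self, call: &ToolCall);
}

// Observe, plan, execute, repeat, until the planner returns Done
// or the step budget runs out. Returns the executed calls in order.
fn run_agent(
    planner: &mut dyn Planner,
    desktop: &mut dyn Desktop,
    max_steps: usize,
) -> Vec<ToolCall> {
    let mut history = Vec::new();
    for _ in 0..max_steps {
        let observation = desktop.observe();
        let call = planner.next_action(&observation, &history);
        if call == ToolCall::Done {
            break;
        }
        desktop.execute(&call);
        history.push(call);
    }
    history
}

// Scripted planner that replays a fixed list of calls, then reports Done.
struct ScriptedPlanner {
    script: Vec<ToolCall>,
}

impl Planner for ScriptedPlanner {
    fn next_action(&mut self, _observation: &str, history: &[ToolCall]) -> ToolCall {
        self.script.get(history.len()).cloned().unwrap_or(ToolCall::Done)
    }
}

// In-memory desktop that records each executed call instead of touching the UI.
struct FakeDesktop {
    log: Vec<String>,
}

impl Desktop for FakeDesktop {
    fn observe(&self) -> String {
        format!("{} actions performed so far", self.log.len())
    }
    fn execute(&mut self, call: &ToolCall) {
        self.log.push(format!("{:?}", call));
    }
}
```

The `max_steps` budget is a design choice of this sketch: it guarantees the loop terminates even if the planner never returns `Done`.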
Getting Started from GitHub
Prerequisites
You need macOS 14 (Sonoma) or later, Xcode 15+, and Rust installed via rustup.
Clone and Build
```bash
git clone https://github.com/m13v/fazm.git
cd fazm
# Build the Rust accessibility bridge first
cargo build --release --manifest-path accessibility/Cargo.toml
# Then open the app project in Xcode
open fazm.xcodeproj
```
Build and run from Xcode. On first launch, macOS will prompt you to grant Accessibility permissions in System Settings > Privacy & Security > Accessibility.
Grant Permissions
Fazm requires exactly one macOS permission: Accessibility access. This is the same permission granted to tools like BetterTouchTool, Raycast, and Hammerspoon. Without it, the agent cannot read the UI tree or perform actions.
No screen recording permission is needed. Fazm does not capture screenshots or record video. It reads the structured accessibility tree, which is far more efficient and privacy-preserving.
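For readers curious how a process can check the Accessibility permission described above, here is a hedged sketch in Rust. `AXIsProcessTrusted` is Apple's C function for this check; the wrapper `accessibility_trusted` and the helper `permission_hint` are names made up for this example, and nothing here is claimed to match how Fazm performs the check. The binding is compiled only on macOS and stubbed to `false` elsewhere.

```rust
// Binding to Apple's trust check. It exists only on macOS, so it is
// compiled there and replaced by a stub on other platforms.
// Note: the 2024 edition requires writing this block as `unsafe extern "C"`.
#[cfg(target_os = "macos")]
#[link(name = "ApplicationServices", kind = "framework")]
extern "C" {
    // Declared as u8 because the C return type `Boolean` is an unsigned char.
    fn AXIsProcessTrusted() -> u8;
}

// Returns true when the current process has been granted Accessibility access.
#[cfg(target_os = "macos")]
fn accessibility_trusted() -> bool {
    // SAFETY: the function takes no arguments and has no preconditions.
    unsafe { AXIsProcessTrusted() != 0 }
}

// Stub for non-macOS builds: there is no Accessibility permission to hold.
#[cfg(not(target_os = "macos"))]
fn accessibility_trusted() -> bool {
    false
}

// Hypothetical helper that turns the check into a message for the user.
fn permission_hint() -> &'static str {
    if accessibility_trusted() {
        "Accessibility access granted."
    } else {
        "Grant access in System Settings > Privacy & Security > Accessibility."
    }
}
```

An app would typically run a check like this at launch and show the hint until the user has enabled the permission.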
How Fazm Compares to Other Approaches
| Feature | Fazm (Accessibility API) | Screenshot-based agents | Browser-only agents |
|---|---|---|---|
| Works with all macOS apps | Yes | Yes | No, browser only |
| Reads actual UI elements | Yes | No, uses OCR | Partial, DOM only |
| Breaks on theme/resolution change | No | Often | No |
| Requires screen recording | No | Yes | No |
| Open source on GitHub | Yes | Varies | Varies |
| Speed (element lookup) | Under 50ms | 500ms-2s | Under 100ms |
| Privacy | Reads UI tree only | Captures full screen | Reads page content |
Contributing on GitHub
The project uses standard GitHub workflows. Fork the repository, create a branch, make changes, and open a pull request. Contributions are most impactful in these areas:
- New tool definitions in core/Tools/ for teaching the agent new capabilities
- Accessibility improvements in the Rust layer for better element matching
- Memory system enhancements for smarter context retention across sessions
- Documentation and example workflows
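To give a feel for what a tool definition involves, the sketch below models a tool as a name, a description the LLM could read, and a function that runs with named arguments, plus a registry that routes calls by name. It is written in Rust for illustration only: the `Tool` trait, the `OpenApp` tool, and `ToolRegistry` are hypothetical, and the actual definitions in core/Tools/ are Swift and may look different.

```rust
use std::collections::HashMap;

// Hypothetical shape of a tool; not Fazm's actual tool interface.
trait Tool {
    fn name(&self) -> &str;
    fn description(&self) -> &str; // text an LLM would see when choosing a tool
    fn run(&self, args: &HashMap<String, String>) -> Result<String, String>;
}

// Example tool that only reports what it would do.
struct OpenApp;

impl Tool for OpenApp {
    fn name(&self) -> &str {
        "open_app"
    }
    fn description(&self) -> &str {
        "Bring the named application to the front"
    }
    fn run(&self, args: &HashMap<String, String>) -> Result<String, String> {
        let app = args.get("app").ok_or("missing required argument: app")?;
        // A real tool would act on the desktop here; this one only reports intent.
        Ok(format!("would open {}", app))
    }
}

// Routes a tool call to the registered tool with the matching name.
struct ToolRegistry {
    tools: HashMap<String, Box<dyn Tool>>,
}

impl ToolRegistry {
    fn new() -> Self {
        ToolRegistry { tools: HashMap::new() }
    }

    fn register(&mut self, tool: Box<dyn Tool>) {
        self.tools.insert(tool.name().to_string(), tool);
    }

    fn dispatch(&self, name: &str, args: &HashMap<String, String>) -> Result<String, String> {
        match self.tools.get(name) {
            Some(tool) => tool.run(args),
            None => Err(format!("unknown tool: {}", name)),
        }
    }
}
```

Returning `Result` from `run` lets the caller feed a failure message back to the planner instead of aborting the session.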
Issues are tracked on GitHub. If you find a bug or have a feature request, open an issue with reproduction steps.
Why Open Source Matters for Desktop AI Agents
Desktop AI agents have deep access to your computer. They can read your screen, click buttons, type text, and navigate between applications. That level of access demands transparency.
With Fazm being fully open source on GitHub, you can verify exactly what the agent does before granting it permissions. You can audit the code that handles your data, confirm that nothing is sent to external servers without your knowledge, and modify the behavior to match your security requirements.
Closed-source desktop agents ask you to trust a binary. Open source asks you to verify. For something that controls your computer, verification is the right default.
What People Build With Fazm
Common workflows automated with Fazm include:
- Email triage: scanning inbox, categorizing messages, drafting replies
- Development workflows: running builds, reading error logs, applying fixes across files
- Data entry: filling forms, copying data between applications, updating spreadsheets
- Research: opening links, extracting information, organizing notes
- System maintenance: clearing caches, managing files, updating configurations
The agent handles these through natural language. Describe what you want done, and Fazm figures out the sequence of UI interactions to accomplish it.
Getting Help
- GitHub Issues: github.com/m13v/fazm/issues for bugs and feature requests
- GitHub Discussions: for questions and community conversation
- Documentation: the README in the GitHub repository covers setup, configuration, and common workflows
Fazm is an open source macOS AI agent, with its full source available on GitHub.