The open source macOS AI agent that bundles five MCP servers inside a signed .app and drives any Mac app through the real accessibility tree
Most "open source macOS AI agent" projects are Ollama wrappers, browser-only demos, or developer frameworks that expect you to know Python and CDP. Fazm is MIT-licensed, ships as a consumer .dmg, and the agent inside it drives any AX-exposing Mac app through a native binary called mcp-server-macos-use that lives inside the app bundle. Every claim below has a source path.
What the top search results for "open source macOS AI agent" actually ship
If you searched for this keyword and spent an evening in the top five results, you would find three categories: an Ollama wrapper (local LLM chat, no agent loop), a developer framework (Python package, bring your own glue code), or a browser-only computer-use demo (Chrome via CDP, no native Mac apps). None of them install as a consumer .app whose agent drives Mail, Notes, Figma, or Xcode the moment you launch it.
Sources: acp-bridge/src/index.ts line 1266 (BUILTIN_MCP_NAMES), Desktop/Sources/BundledSkills/ (17 files), Desktop/Sources/SkillInstaller.swift (auto-discovery), and the repo README at fazm.ai/gh.
The shape of the install
5 MCP servers · 17 skills · 0 pip installs
A fresh install of Fazm on a clean Mac takes one drag-and-drop into /Applications, one permission grant for accessibility, and one optional grant for screen recording. The five MCP servers, the 17 skills, the Node runtime, and the Python venv for the Google Workspace server are already inside the bundle.
A framework install vs. a consumer install
This is the core difference between most "open source macOS AI agent" projects and Fazm. Same goal. Completely different install burden. Same license family.
Install flow for the two categories
```bash
# typical "open source mac ai agent" framework
git clone https://github.com/someone/mac-agent-framework
cd mac-agent-framework

# install the runtime
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# paste your keys
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...

# configure the model router
vim config/models.yaml

# run the agent loop
python -m agent.cli --profile mac

# now write glue code to call Chrome via CDP,
# or screenshot + vision-model the screen,
# because the framework does not ship with
# accessibility-API bindings out of the box.
```
Anchor fact: the five MCP servers the ACP bridge registers
This is the single most uncopyable thing about Fazm. Cloning the visual design is easy. Bundling five compiled MCP servers, signing them, notarizing them, and wiring them into an ACP bridge so an LLM can call any of them from a consumer app is the work.
“mcp-server-macos-use (native macOS accessibility automation)”
acp-bridge/src/index.ts comment at line 1056, right above the registerMcpServer call
Clone the repo and open acp-bridge/src/index.ts: the Set literal at line 1266 contains exactly five names (fazm_tools, playwright, macos-use, whatsapp, google-workspace). Anything after that comes from ~/.fazm/mcp-servers.json and is user-added.
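As a rough illustration of what that Set and a membership check could look like, here is a minimal sketch. Only the identifier BUILTIN_MCP_NAMES and the five names come from this page; the `splitServers` helper is hypothetical and is not taken from the repo.

```typescript
// The five names are the ones this page cites; everything else is illustrative.
const BUILTIN_MCP_NAMES: ReadonlySet<string> = new Set([
  "fazm_tools",
  "playwright",
  "macos-use",
  "whatsapp",
  "google-workspace",
]);

// Hypothetical helper: split a list of server names into built-in and user-added.
function splitServers(names: string[]): { builtin: string[]; userAdded: string[] } {
  const builtin: string[] = [];
  const userAdded: string[] = [];
  for (const serverName of names) {
    (BUILTIN_MCP_NAMES.has(serverName) ? builtin : userAdded).push(serverName);
  }
  return { builtin, userAdded };
}

// "my-custom-server" stands in for an entry a user might add; it is not a real server.
console.log(splitServers(["macos-use", "my-custom-server", "playwright"]));
```

A Set makes the built-in check a constant-time lookup, which is presumably why the source uses one, though the page does not say.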
What the bundle actually contains
Not a diagram. A real directory listing from an installed copy. Two native MCP binaries sit alongside the main executable and the bundled Node runtime. The Python-based Google Workspace MCP lives under Contents/Resources/google-workspace-mcp/ with its own .venv.
Tool pipeline: three routes, one decision hub
When a prompt comes in, the Claude Code subprocess picks the right MCP server for the surface. Native Mac apps route through the accessibility tree. Browser-only flows route through Playwright. Google REST surfaces route through the Google Workspace server. The decision is tool-level, not model-level.
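In Fazm the model makes this choice by selecting a tool, so there is no hardcoded switch to point at. Purely to make the routes concrete, the sketch below encodes the same outcome as a lookup table. The `Surface` type and `pickServer` function are hypothetical; only the server names come from this page.

```typescript
// Hypothetical classification of where a task has to run.
type Surface = "native-ax-app" | "browser" | "google-rest" | "whatsapp";

// Illustrative mapping from surface to the MCP server this page associates with it.
// This mirrors the described outcome; it is not how the bridge is implemented.
const SERVER_FOR_SURFACE: Record<Surface, string> = {
  "native-ax-app": "macos-use",
  browser: "playwright",
  "google-rest": "google-workspace",
  whatsapp: "whatsapp",
};

function pickServer(surface: Surface): string {
  return SERVER_FOR_SURFACE[surface];
}

console.log(pickServer("native-ax-app"));
console.log(pickServer("browser"));
```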
How a Fazm prompt turns into an action
Apps that work through the generic macos-use MCP
Any Mac app that exposes the accessibility tree is reachable through the same five generic tools: click_and_traverse, type_and_traverse, scroll_and_traverse, press_key_and_traverse, refresh_traversal. No per-app adapter code. The list below is not exhaustive, it is just a sample of apps that work out of the box on a fresh install.
Electron apps with partial AX trees, pre-notarized legacy apps, and apps that disable accessibility on purpose are the soft spots. For those, the agent routes to playwright, a dedicated MCP (like whatsapp), or shell tools.
Framework vs. consumer .app, side by side
The split below is not a caricature. It is a summary of what you actually get when you install the top five search results for this keyword and what you actually get when you install Fazm.
| Feature | Typical open source macOS AI agent project | Fazm |
|---|---|---|
| Distribution shape | git clone + venv + API keys | Signed, notarized .dmg. Drag to /Applications, launch. |
| License | Mixed: MIT, AGPL, proprietary core with OSS wrapper | MIT. Full desktop source public at fazm.ai/gh. |
| How it perceives a Mac app | Screenshots + vision model, or CDP for Chrome only | Real macOS accessibility tree through mcp-server-macos-use |
| App coverage | Browser only, or a hardcoded allow-list | Any app that exposes AX: Mail, Notes, Finder, Xcode, Notion, Linear, Slack, Figma, Catalyst, SwiftUI, AppKit |
| MCP servers out of the box | BYO | Five: fazm_tools, playwright, macos-use, whatsapp, google-workspace |
| Agent policy | Black-box system prompt buried in code | 17 markdown files in ~/.claude/skills/. Open, edit, delete. |
| Model backend | Model-locked (usually OpenAI) | Claude Code subprocess; use your Anthropic account or subscription |
| User needed | Python dev with Xcode and homebrew set up | Anyone who can drag an icon to /Applications |
Install, launch, first action: five steps
None of these steps require a terminal. The ACP bridge, the skill installer, and the MCP registration all run automatically when the app starts.
Download the signed .dmg
From fazm.ai (or fazm.ai/gh for the source tree). This is the consumer path. Drag the app to /Applications like any other Mac app. No homebrew, no git, no pip.
Grant accessibility and screen recording
The onboarding chat walks you through System Settings once. Without accessibility, macos-use cannot call AXUIElement APIs. Without screen recording, the optional observer buffer cannot fill.
First launch auto-installs 17 skills
SkillInstaller.swift walks Desktop/Sources/BundledSkills/*.skill.md inside the bundle, SHA-256 diffs each file against ~/.claude/skills/<name>/SKILL.md, and copies anything new or changed. Your local edits to unchanged skills survive.
The ACP bridge registers five MCP servers
On the first session/new, acp-bridge/src/index.ts registers: fazm_tools, playwright, macos-use, whatsapp, google-workspace. The macos-use server only loads if the binary exists at Contents/MacOS/mcp-server-macos-use, which it always does on a clean install.
Hit Cmd+\ and ask the agent to drive a real app
The floating bar is the user surface. Type or speak a task. The Claude Code subprocess picks the right tool, which means picking macos-use for anything AX-backed, playwright for browser flows, google-workspace for Gmail/Drive/Calendar REST.
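The conditional registration in step four (a native server loads only if its binary exists) can be sketched roughly as follows. The `McpServerEntry` shape and the `nativeServers` function are assumptions made for illustration; the two binary paths are the ones this page cites, and the real bridge code may be structured differently.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// Hypothetical shape for a server entry; the bridge's real types are not shown on this page.
interface McpServerEntry {
  name: string;
  command: string;
}

// Build entries for the two native servers, skipping any whose binary is missing.
// `exists` is injectable so the logic can be exercised without a real app bundle.
function nativeServers(
  bundleRoot: string,
  exists: (path: string) => boolean = existsSync,
): McpServerEntry[] {
  const candidates: McpServerEntry[] = [
    { name: "macos-use", command: join(bundleRoot, "Contents/MacOS/mcp-server-macos-use") },
    { name: "whatsapp", command: join(bundleRoot, "Contents/MacOS/whatsapp-mcp") },
  ];
  return candidates.filter((entry) => exists(entry.command));
}

// On a machine without Fazm installed this prints an empty array.
console.log(nativeServers("/Applications/Fazm.app"));
```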
Try the accessibility-native agent yourself
Download the signed .dmg, grant accessibility, and ask it to do something in any Mac app. If the app exposes AX, the agent can touch it.
Install Fazm →
The 17 skills that ship with the app
These are flat markdown files with YAML frontmatter. Each one lands at ~/.claude/skills/<name>/SKILL.md on first launch. You can open any of them in a text editor and read what your agent knows how to do.
The installer SHA-256 diffs each file on every launch and only overwrites when the bundled version changed. Your edits to unchanged skills survive an update. Delete one and it will reinstall on next launch unless you also remove it from the bundle and rebuild.
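One way to get the behavior described here (reinstall when missing, overwrite only when the bundle changed, leave user edits alone otherwise) is sketched below. The real logic lives in SkillInstaller.swift and is written in Swift; this TypeScript version is an illustration, and the idea of remembering the hash of the last bundled version installed (`lastInstalledHash`) is an assumption, since the page does not say how edits are told apart from bundle changes.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of a file's contents, hex-encoded.
function sha256(contents: string): string {
  return createHash("sha256").update(contents, "utf8").digest("hex");
}

type Action = "install" | "overwrite" | "keep";

// bundled:           contents of the skill inside the app bundle
// installed:         contents currently on disk, or null if the file is missing
// lastInstalledHash: hash of the bundled version copied last time (assumed bookkeeping), or null
function decide(
  bundled: string,
  installed: string | null,
  lastInstalledHash: string | null,
): Action {
  if (installed === null) return "install"; // deleted or never installed
  const bundledHash = sha256(bundled);
  if (sha256(installed) === bundledHash) return "keep"; // already identical
  if (lastInstalledHash === bundledHash) return "keep"; // bundle unchanged, so the diff is a user edit
  return "overwrite"; // bundle changed since the last install
}

console.log(decide("v1", "v1 with my edits", sha256("v1")));
console.log(decide("v2", "v1 with my edits", sha256("v1")));
```

Without some record of the previously installed version, a plain bundle-versus-disk comparison cannot distinguish a user edit from a bundle update, which is why the sketch carries the extra hash.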
What you get on a fresh install
Six things most "open source macOS AI agent" projects leave as an exercise for the reader. Fazm puts them inside the .app bundle so the person running it does not have to care.
Five native MCP servers, already inside the bundle
fazm_tools, playwright, macos-use, whatsapp, google-workspace. No npx, no pip, no Docker. The ACP bridge registers them on session/new.
mcp-server-macos-use is a real binary
Sits at Contents/MacOS/mcp-server-macos-use. Code-signed, notarized, shipped on every release. Drives AX directly.
No screenshots in the critical path
The tool returns a structured role+label tree. Vision only kicks in when the model explicitly asks for it, not by default.
Works with any AX-exposing app
Native SwiftUI, AppKit, Catalyst apps all work through the same generic macos-use tool set: click, type, scroll, press, refresh.
17 skills you can audit in a text editor
Auto-installed to ~/.claude/skills/<name>/SKILL.md from the app bundle on first launch. SHA-256 diffed, user edits preserved.
Consumer installer, not a framework
Signed .dmg, onboarding chat, one-click permission grants. The person running it never opens Terminal.
The thing nobody else says
Open source is not the interesting claim here. The interesting claim is that it arrives as a consumer .app.
The Mac AI agent space has plenty of MIT code. It has far less MIT code that a non-developer can install and use on the same afternoon. Fazm draws the line at the bundle: if something has to exist for the agent to work, it ships inside Fazm.app. The Node runtime, the Python venv, the Playwright CLI, the mcp-server-macos-use binary, the whatsapp-mcp binary, the Google Workspace server, the 17 skills, the ACP bridge, the protobuf definitions. No runtime fetch, no pip install, no brew install. That is what it takes to hand this kind of tool to someone who does not want to open Terminal.
The accessibility-API piece is the second-most important decision, but it is the first one you feel. Ask the agent to rename three tabs in Safari, then to rename three files in Finder, then to open a Notion page and fill a row. All three run through the same mcp-server-macos-use binary. A vision-only agent would need a different recipe for each.
Frequently asked questions
Is Fazm actually open source, or is it 'source available'?
MIT license. The desktop source tree (Desktop/, acp-bridge/, dmg-assets/) is public at fazm.ai/gh. You can read it, fork it, build it, ship your own signed version. The README makes this explicit: 'Free to start. Fully open source. Fully local.'
What is actually inside the Fazm.app bundle that makes it different from other open source macOS AI agents?
Five pre-integrated MCP servers. Two are native binaries, three are runtime-bundled: mcp-server-macos-use and whatsapp-mcp sit at Contents/MacOS/, Node sits in the same directory and runs playwright plus the internal fazm_tools server, and Contents/Resources/google-workspace-mcp/ holds a bundled Python venv for the Google Workspace server. The ACP bridge registers them by name (fazm_tools, playwright, macos-use, whatsapp, google-workspace) in BUILTIN_MCP_NAMES at acp-bridge/src/index.ts line 1266.
Why does 'uses accessibility APIs' matter if other agents can just screenshot the screen?
Three reasons that matter in practice. Speed: AX tree extraction is a single syscall chain that returns structured text in tens of milliseconds. A screenshot round-trip through a vision model is hundreds of milliseconds to seconds. Accuracy: the AX tree gives role, label, value, and bounding box for every actionable element. The vision model has to infer all of that from pixels, which is where most 'it clicked the wrong button' failures come from. Coverage: accessibility APIs work on any AX-exposing Mac app (SwiftUI, AppKit, Catalyst, and even many Electron apps that implement AX). A vision-only agent is effectively browser-only because browsers are the only surface where the screenshot is predictable.
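To make the accuracy point concrete, here is a small sketch of what working with a role/label/value/bounding-box tree looks like. The `AXNode` shape, the sample tree, and both helper functions are hypothetical; the page does not show the actual output format of mcp-server-macos-use.

```typescript
// Hypothetical node shape carrying the four fields named above.
interface AXNode {
  role: string; // e.g. "AXButton"
  label: string;
  value?: string;
  frame: { x: number; y: number; width: number; height: number };
  children: AXNode[];
}

// Depth-first search for the first node matching role and label.
function findNode(root: AXNode, role: string, label: string): AXNode | undefined {
  if (root.role === role && root.label === label) return root;
  for (const child of root.children) {
    const hit = findNode(child, role, label);
    if (hit) return hit;
  }
  return undefined;
}

// Click target: the center of the element's bounding box.
function center(node: AXNode): { x: number; y: number } {
  return {
    x: node.frame.x + node.frame.width / 2,
    y: node.frame.y + node.frame.height / 2,
  };
}

// Made-up tree for illustration only.
const sampleTree: AXNode = {
  role: "AXWindow",
  label: "New Message",
  frame: { x: 0, y: 0, width: 800, height: 600 },
  children: [
    { role: "AXButton", label: "Cancel", frame: { x: 600, y: 540, width: 80, height: 30 }, children: [] },
    { role: "AXButton", label: "Send", frame: { x: 700, y: 540, width: 80, height: 30 }, children: [] },
  ],
};

const send = findNode(sampleTree, "AXButton", "Send");
if (send) console.log(center(send));
```

The lookup is an exact string match on role and label, so there is nothing to infer from pixels; a vision model would have to guess which of the two similar-looking buttons is "Send".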
What is mcp-server-macos-use and where does it come from?
It is a native MCP server that exposes the macOS accessibility tree as tools an LLM can call: click_and_traverse, type_and_traverse, scroll_and_traverse, press_key_and_traverse, refresh_traversal. Fazm bundles the compiled binary inside the .app at Contents/MacOS/mcp-server-macos-use and registers it as the MCP server named 'macos-use' in acp-bridge/src/index.ts. The tool names on the server prefix to mcp__macos-use__click_and_traverse and friends, which is what the Claude Code subprocess actually calls when it needs to drive a Mac app.
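The naming pattern is easy to reproduce. The sketch below builds the prefixed names from the server name and the five tool names given above; the `qualifiedToolName` helper is hypothetical and only mirrors the `mcp__macos-use__click_and_traverse` form quoted here.

```typescript
// The five generic tools this page lists for the macos-use server.
const MACOS_USE_TOOLS = [
  "click_and_traverse",
  "type_and_traverse",
  "scroll_and_traverse",
  "press_key_and_traverse",
  "refresh_traversal",
] as const;

// Hypothetical helper that mirrors the mcp__<server>__<tool> pattern quoted above.
function qualifiedToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

const qualified = MACOS_USE_TOOLS.map((tool) => qualifiedToolName("macos-use", tool));
console.log(qualified);
```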
How does Fazm avoid the usual 'open source Mac AI agent' install pain?
The .app bundle is self-contained. It ships its own Node runtime (Contents/MacOS/Node), its own Python venv for the Google Workspace MCP (Contents/Resources/google-workspace-mcp/.venv/), its own Playwright CLI (bundled @playwright/mcp), its own mcp-server-macos-use, and its own whatsapp-mcp. There is no brew install step, no python -m venv step, no npm install step. The only thing the user has to grant is macOS permissions: accessibility and (optionally) screen recording.
Does Fazm run the LLM locally on my Mac?
No. The 'local' part is the agent loop, the tools, the SQLite database, the permission state, the bundled MCP servers, and your screen recording buffer. Inference is cloud: the Claude Code subprocess calls api.anthropic.com using your Anthropic account or subscription, and the optional observer calls Gemini for multimodal video analysis. If you want pure local inference, the Claude Code side can be pointed at a compatible endpoint, but the consumer default is cloud models with local tools.
What are the 17 bundled skills and why does the SHA-256 check matter?
The 17 markdown files in Desktop/Sources/BundledSkills/ are: ai-browser-profile, canvas-design, deep-research, doc-coauthoring, docx, find-skills, frontend-design, google-workspace-setup, pdf, pptx, social-autoposter, social-autoposter-setup, telegram, travel-planner, video-edit, web-scraping, xlsx. Each is a plain markdown file with YAML frontmatter that Claude Code reads at session start. On launch, SkillInstaller.swift runs a SHA-256 checksum compare between the bundled version and whatever is currently at ~/.claude/skills/<name>/SKILL.md. It only overwrites when the bundled version actually changed, so if you edit a skill and the bundle has not changed in the next update, your edits survive. You can delete any of the 17 and it will come back on the next launch unless you also delete the bundle file and rebuild.
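If you want to check your own machine against that list, the mapping from skill name to install path can be sketched like this. The skill names and the ~/.claude/skills/<name>/SKILL.md destination come from this page; the `<name>.skill.md` bundled file name is inferred from the `*.skill.md` pattern, and both helper functions are hypothetical.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// The 17 skill names as listed on this page.
const BUNDLED_SKILLS: string[] = [
  "ai-browser-profile", "canvas-design", "deep-research", "doc-coauthoring",
  "docx", "find-skills", "frontend-design", "google-workspace-setup",
  "pdf", "pptx", "social-autoposter", "social-autoposter-setup",
  "telegram", "travel-planner", "video-edit", "web-scraping", "xlsx",
];

// Assumed bundled file name, inferred from the *.skill.md pattern.
function bundledFileName(skill: string): string {
  return `${skill}.skill.md`;
}

// Destination the page describes: ~/.claude/skills/<name>/SKILL.md
function installPath(skill: string, home: string = homedir()): string {
  return join(home, ".claude", "skills", skill, "SKILL.md");
}

for (const skill of BUNDLED_SKILLS) {
  console.log(`${bundledFileName(skill)} -> ${installPath(skill)}`);
}
```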
What happens if an app does not expose accessibility APIs?
Fazm falls back by tool, not by hallucinating. Browser work routes to the bundled playwright MCP, which gives CDP-level control of Chrome. Google Workspace work (Gmail, Drive, Calendar, Sheets, Docs, Meet) routes to the bundled google-workspace MCP, which is REST-based and does not need AX. WhatsApp routes to the bundled whatsapp-mcp, which knows the WhatsApp Catalyst app layout specifically. For anything else with no AX and no API, the agent uses shell tools (also bundled) or explains the limit in chat. It does not pretend to click when it cannot.
How do I verify the BUILTIN_MCP_NAMES claim myself?
Clone the source from fazm.ai/gh. Open acp-bridge/src/index.ts. Search for BUILTIN_MCP_NAMES. You will find a Set literal with exactly five strings: 'fazm_tools', 'playwright', 'macos-use', 'whatsapp', 'google-workspace'. That is the full list of servers the bridge registers by default. User-added MCP servers from ~/.fazm/mcp-servers.json are appended after, but those five are the built-in floor.
Is this comparable to Anthropic's computer use, Simular, or Manus?
All three are prompt-driven and mostly screenshot-based. Anthropic's computer use is a model capability, not a shipped Mac app; you need to build the harness. Simular is browser-focused. Manus is a hosted service. Fazm is the only option in the category that is (a) MIT-licensed open source, (b) shipped as a signed consumer .app, (c) AX-tree-native by default, and (d) able to touch any Mac app that exposes AX, not just the browser. The tradeoff is that the observer's task-discovery path uses Gemini and the agent path uses Claude, so you bring your own inference provider.
Where can I read the exact lines of code behind these claims?
acp-bridge/src/index.ts:63 and :64 define the binary paths. Lines 1056 to 1074 register macos-use and whatsapp as MCP servers if their binaries exist. Line 1266 defines BUILTIN_MCP_NAMES. Desktop/Sources/SkillInstaller.swift contains the SHA-256 diff logic and the copy loop. Desktop/Sources/BundledSkills/ is a flat directory of 17 *.skill.md files. GeminiAnalysisService.swift line 69 is where the 60-minute observer window is defined. The README at repo root confirms the MIT license and the 'fully open source' position. Nothing on this page is hidden.
An open source Mac AI agent you can install without a terminal
Signed .dmg. MIT license. Five MCP servers inside the bundle. Drives any AX-exposing Mac app, not just the browser. 17 skills auto-installed and editable.
Download Fazm