APRIL 2026 / WHAT THE INSTALL ACTUALLY DOES

Open source Mac AI agents, April 2026: the ones that work the second you launch them

Every roundup this month sorts the same six projects by GitHub stars or by which automation primitive they use. The split that actually decides whether a Mac AI agent is useful by Friday is the one nobody covers: how much of the work is already done when the .app finishes copying. Fazm ships 17 capability skills inside the signed bundle and auto-installs them on first launch through 160 lines of Swift in SkillInstaller.swift. Most other open source Mac AI agents hand you a CLI and a README. This guide walks through what is bundled, where the files live, and how the install pass actually decides what to copy, update, or delete.

Matthew Diakonov, Fazm

Published April 24, 2026 · 11 min read

Written from Fazm v2.4.0 source, SkillInstaller.swift, and the April 2026 changelog

17 *.skill.md files in the signed .app

SHA-256 checksum compare on every launch

obsoleteSkills array deletes removed skills

Installed at ~/.claude/skills/, reusable from Claude Code

MIT licensed at github.com/mediar-ai/fazm

Mac AI agents that work on first launch

What April 2026 actually shipped, in 24 seconds

17 skills inside the signed Fazm.app bundle

SkillInstaller copies them to ~/.claude/skills

SHA-256 update on every launch, no overwrite if matched

AXUIElement drives any Mac app, screenshots are the fallback

Install once, available to every Claude Code session


The split nobody covers: what arrives ready

The popular guides this month treat "open source Mac AI agent" as one category and rank everything in it by GitHub stars. That sort mixes products that look almost nothing alike. Goose runs in a terminal and expects you to wire your own tools. UI-TARS Desktop screenshots your screen and asks a vision model where to click. OpenHands runs in a Docker container and writes code. Fazm runs as a signed macOS app with native accessibility binaries and 17 pre-bundled capability skills. They occupy the same shelf and they do not solve the same problem.

The more honest split is what the .app can do before you read the docs. Two camps:

Toolkits. The project hands you an agent loop and a small set of generic tools (shell, file system, web search). Capability for any specific job is something you assemble. Goose 1.2, OpenHands, browser-use 0.7, and most Python frameworks live here.
Consumer apps. The .app ships with native binaries that drive your operating system, plus a library of skills that already know how to do common multi-step jobs. The first launch grants permissions and installs everything that follows. Fazm 2.4.0 is the open source example of this shape.

What actually ships inside Fazm.app

The signed bundle is a regular macOS .app. If you crack it open in Finder via Show Package Contents, the layout is roughly this. Three things matter for the "works on first launch" story: a native binary that drives any app through accessibility, a Python MCP server for Google Workspace, and a folder of seventeen markdown skill files.

Inside the signed Fazm.app on disk

The seventeen bundled skills

Each is a markdown file with YAML frontmatter (name, description, when_to_use) and a body that teaches the agent how to do the thing. The folder is the source of truth. Adding or changing a skill is a file edit; no Swift code change required. Removing a skill from machines that already installed it additionally takes an entry in the obsoleteSkills array.
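To make the file shape concrete, here is a minimal Python sketch that splits a skill file into its frontmatter fields and its markdown body. The example content is hypothetical and is not copied from the Fazm repo. The parser assumes flat `key: value` lines only, which real YAML does not guarantee, and the helper name `split_frontmatter` is invented for illustration.

```python
from typing import Dict, Tuple


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a skill file into (frontmatter fields, markdown body).

    Assumption: the frontmatter holds only flat `key: value` lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return {}, text  # unterminated frontmatter: treat everything as body
    fields: Dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return fields, body


# Hypothetical skill content, written only for this example.
EXAMPLE = """---
name: pdf
description: Read, fill, and convert PDF files
when_to_use: The user mentions a PDF or a form
---
# PDF skill

Steps the agent should follow go here.
"""

fields, body = split_frontmatter(EXAMPLE)
print(fields["name"], "->", fields["when_to_use"])
```

A production reader would more likely hand the frontmatter to a full YAML parser; the hand-rolled split keeps the sketch free of dependencies.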

pdf.skill.md
docx.skill.md
xlsx.skill.md
pptx.skill.md
video-edit.skill.md
deep-research.skill.md
frontend-design.skill.md
canvas-design.skill.md
doc-coauthoring.skill.md
web-scraping.skill.md
travel-planner.skill.md
social-autoposter.skill.md
social-autoposter-setup.skill.md
ai-browser-profile.skill.md
google-workspace-setup.skill.md
telegram.skill.md
find-skills.skill.md

Source listing taken from Desktop/Sources/Resources/BundledSkills/ in the fazm repo, MIT licensed.

The 160 lines that put them on your machine

The install pass lives at Desktop/Sources/SkillInstaller.swift. It runs on every launch, so a relaunch after an update picks up any skill that changed. The whole file is under 300 lines; the install logic is about 160. Two passes matter. The first discovers what is in the bundle.
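The discovery pass can be pictured with a short Python sketch. This is not the Swift from SkillInstaller.swift. It only assumes what the text states: every file ending in .skill.md in the bundled folder is a skill, and the part before that suffix is the skill name. The function name `discover_bundled_skills` is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Dict

SUFFIX = ".skill.md"


def discover_bundled_skills(bundle_dir: Path) -> Dict[str, Path]:
    """Map skill name -> bundled file for every *.skill.md in bundle_dir."""
    skills: Dict[str, Path] = {}
    for path in sorted(bundle_dir.glob("*" + SUFFIX)):
        if path.is_file():
            name = path.name[: -len(SUFFIX)]  # "pdf.skill.md" -> "pdf"
            skills[name] = path
    return skills


# Demo against a throwaway folder standing in for BundledSkills/.
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp)
    for filename in ("pdf.skill.md", "video-edit.skill.md", "README.md"):
        (bundle / filename).write_text("placeholder")
    found = sorted(discover_bundled_skills(bundle))

print(found)
```

Because the folder listing drives everything, dropping a new file into the folder is enough for it to be picked up on the next pass.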

Desktop/Sources/SkillInstaller.swift

The second pass copies, updates, or skips each skill based on a SHA-256 compare. If the bundled hash matches the hash of the installed copy, nothing happens. If they diverge, the file at ~/.claude/skills/<name>/SKILL.md is replaced. If the destination does not exist yet, the directory is created and the bundled file is copied in.
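The three outcomes described above (install, update, skip) fit in a few lines. The Python below is a sketch of that decision under the stated rules, not the app's Swift code. The function names and the returned status strings are assumptions made for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def install_skill(name: str, bundled: Path, skills_root: Path) -> str:
    """Copy, update, or skip one skill; return what happened."""
    dest = skills_root / name / "SKILL.md"
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(bundled, dest)
        return "installed"
    if sha256_of(bundled) == sha256_of(dest):
        return "skipped"  # digests match: leave the installed copy alone
    shutil.copyfile(bundled, dest)
    return "updated"


# Demo: first launch, second launch, then a launch after the bundle changed.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    bundled = root / "pdf.skill.md"
    bundled.write_text("version 1")
    skills_root = root / "skills"  # stands in for ~/.claude/skills
    results = [install_skill("pdf", bundled, skills_root)]
    results.append(install_skill("pdf", bundled, skills_root))
    bundled.write_text("version 2")
    results.append(install_skill("pdf", bundled, skills_root))

print(results)  # ['installed', 'skipped', 'updated']
```

One consequence of comparing content hashes rather than version numbers: a local edit to an installed SKILL.md also makes the digests diverge, so under these rules it would be replaced by the bundled copy on the next launch.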

Desktop/Sources/SkillInstaller.swift

What the install actually does, on a fresh Mac

On a Mac with no existing ~/.claude/skills/ directory, the first launch produces a log line like this. After the second launch, all seventeen are skipped because the digests match. After an update that ships a newer version of one of the skills, that single file is "updated" and the rest are still "skipped".

~/Library/Logs/Fazm/fazm.log

What you can do at minute one, by skill

The descriptions below are paraphrased from each skill's YAML frontmatter as it ships in the April 2026 build. Every entry below is a markdown file you can read at ~/.claude/skills/<name>/SKILL.md after the first launch.

pdf, docx, xlsx, pptx

Read, write, edit, fill, and convert Office documents and PDFs without leaving the chat. The four document skills cover form fills, table extraction, slide rebuilds, and PDF redaction.

deep-research

Multi-source synthesis with citation tracking. Web fetch, dedup, summarize, and emit a verifiable brief with sources inline.

video-edit

Trim, transcribe, edit, and stitch long videos into short story-driven clips using ffmpeg under the hood.

frontend-design and canvas-design

Design system aware UI generation and static visual artifacts (posters, infographics, landing pages) using design tokens.

web-scraping

Playwright-driven dynamic scraping, auth flows, pagination, and structured extraction with screenshot fallback.

google-workspace-setup

Personal Google Cloud OAuth app provisioning. Project creation, API enablement, consent screen, and credential generation.

social-autoposter

Reddit, X, LinkedIn, and Moltbook posting and tracking. Original posts, comments, engagement stats.

ai-browser-profile, telegram, travel-planner, find-skills

Personal browser data lookup, Telegram messaging, full trip itineraries with budget, and a discovery tool that surfaces other skills when you ask 'how do I do X?'.

doc-coauthoring

Structured workflow for writing technical specs, proposals, and decision docs collaboratively rather than in one shot.

Side by side with the rest of April 2026

The matrix below is what the install actually buys you, by project. The point is not to declare a winner; the four projects solve different problems. The point is that "ranked by GitHub stars" is the wrong axis when the install footprints differ this much.

Feature	UI-TARS Desktop v1.6.0, Goose 1.2, OpenHands	Fazm 2.4.0
How you install	Mostly: clone repo, install deps, configure env, run CLI or container	Drag .app to Applications, grant accessibility once
What it can do at minute one	Generic shell, file system, browser, or vision-language click-and-type	PDFs, Word, Excel, slide decks, video edit, deep research, web scraping, social posting, plus accessibility-driven control of any Mac app
Skill layer	Bring your own; most have a tools/ folder you populate	17 *.skill.md files bundled, SHA-256 updates on launch
Drives native Mac apps	UI-TARS via screenshots; Goose and OpenHands via shell only	Yes, via AXUIElement accessibility tree
Extends with custom MCP servers	Goose and OpenHands yes; UI-TARS no	Yes, ~/.fazm/mcp-servers.json since 2.4.0
License	Apache or MIT depending on project	MIT, full source at github.com/mediar-ai/fazm
Code signing and notarization	Mostly unsigned; you grant exceptions manually	Signed and notarized in the Codemagic build, no Gatekeeper unidentified-developer warning

Numbers that matter, with the source for each

None of these are guesses. Every figure below points to a file in the repo or to a published changelog entry from April 2026.

17 skills bundled in the signed Fazm.app

160 lines of Swift install logic in SkillInstaller.swift

5 MCP servers shipped in the bundle

0 config steps after granting accessibility

17 / 17

“The skills directory is what every other open source Mac AI agent expects you to populate yourself. Bundling it inside the signed .app is the simplest, most boring way to make a consumer install actually consumer-grade.”

SkillInstaller.swift, Desktop/Sources/SkillInstaller.swift in github.com/mediar-ai/fazm

From double-click to first useful task

For Fazm 2.4.0 specifically, this is the sequence. The same flow on a CLI-style open source Mac AI agent is usually a longer list, with a handful of optional steps that turn into mandatory ones on the first run.

Drag Fazm.app into /Applications

The release artifact is a notarized .dmg from the v*-macos tag. Codemagic signs and notarizes; Gatekeeper does not show a warning.

Launch and grant accessibility permission

AXIsProcessTrustedWithOptions, called with the prompt option, shows the system dialog that links to the right pane in System Settings. Toggle Fazm on and the app verifies the trust state via an event tap.

SkillInstaller copies 17 skills into ~/.claude/skills/

Every *.skill.md in BundledSkills/ is hashed and copied. A toast confirms the count. Subsequent launches are SHA-256 no-ops unless an update changed something.

Five bundled MCP servers boot

fazm_tools, playwright, macos-use, whatsapp, google-workspace. Defined at acp-bridge/src/index.ts line 1266 and started by the agent loop on session start.

Type a request

Fazm chooses a skill, opens or focuses an app through accessibility, and runs the multi-step job. No additional install required for any of the seventeen capability domains.

What this means for picking one this month

Three honest recommendations and one skip, depending on what you want to do. Nothing in this list is exclusive; you can run more than one at the same time and they do not fight each other.

If you have one evening

Install Fazm if you want native Mac app control plus PDFs, slide decks, video edits, and deep research on first launch.
Install UI-TARS Desktop alongside it if you target apps that block accessibility (some Electron apps, some games).
Install Goose 1.2 if you live in a terminal and you would rather compose your own tools.
Skip OpenHands unless you genuinely want a Docker-isolated coding agent. It is well built; it just does not solve the Mac control problem.

The reason the picks differ is the same reason the install footprints differ. A vision-only project optimizes for visual surfaces. A terminal-only project optimizes for shells. A consumer .app with a bundled skill layer optimizes for "the user double-clicks and immediately gets a job done". Pick the one whose first-launch surface matches what you actually want to do.


Frequently asked questions

What does 'open source Mac AI agent' actually mean in April 2026?

Three different kinds of product fall under that label and they look nothing alike. There are command-line agents like Goose 1.2 and OpenHands that run in a terminal and depend on you wiring up tools yourself. There are vision-language agents like UI-TARS Desktop v1.6.0 that screenshot your screen and ask a vision model where to click. There are accessibility-tree agents like Fazm and the standalone macos-use MCP server that read AXUIElement data and act on the live UI tree. Same label, three different surfaces. Pick the surface first, then the project, not the other way around.

Which open source Mac AI agents are worth installing as of April 2026?

Four are worth a serious look this month. Fazm (mediar-ai/fazm, MIT) is the consumer macOS app that bundles a Swift accessibility-API server plus 17 capability skills inside the signed .app, so it does PDFs, Word docs, spreadsheets, slide decks, video edits, and deep research on first launch. UI-TARS Desktop (bytedance/ui-tars-desktop) shipped v1.6.0 on 2026-04-04 with smaller dark-mode OCR and is the strongest vision-language alternative when accessibility data is missing. Goose 1.2 (block/goose) shipped 2026-04-10 with automatic MCP server discovery and is best if you want a CLI-driven agent loop. OpenHands (All-Hands-AI/OpenHands) is the right pick if you want a full autonomous coding agent and you do not mind running a Docker container.

What does Fazm bundle inside the signed .app, and where does it actually live?

Inside the signed bundle, Contents/Resources/BundledSkills/ contains seventeen *.skill.md files: pdf, docx, xlsx, pptx, video-edit, deep-research, frontend-design, canvas-design, doc-coauthoring, web-scraping, travel-planner, social-autoposter, social-autoposter-setup, ai-browser-profile, google-workspace-setup, telegram, find-skills. Each is a markdown file with YAML frontmatter and a body that teaches the agent how to do the thing. They are installed into ~/.claude/skills/ on first launch by SkillInstaller.swift and they live next to two native Mach-O binaries at Contents/MacOS/mcp-server-macos-use and Contents/MacOS/whatsapp-mcp, plus a Python MCP server bundled under Contents/Resources/google-workspace-mcp/. The full set is what arrives when you double-click the .app the first time.

How does Fazm decide whether to overwrite an installed skill on relaunch?

SkillInstaller.swift hashes both the bundled file and the file already installed at ~/.claude/skills/<name>/SKILL.md using SHA-256, and only overwrites if the digests differ. If the hashes match, the skill is logged as skipped and left alone. If the bundled file changed (an app update shipped a newer version of the same skill), the installed copy is replaced. Skills that are no longer bundled are deleted from your machine on launch if they are listed in the obsoleteSkills array, which currently contains hindsight-memory. The pass tracks what it installed, updated, and skipped, and a toast is shown if anything was updated.
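The cleanup half of that pass can be sketched the same way. The Python below assumes an obsolete skill is removed by deleting its whole folder under the skills directory. The source only says the skill is deleted, so the folder-level removal and the function name `remove_obsolete` are assumptions.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Iterable, List

# The one entry the article names; the real array lives in SkillInstaller.swift.
OBSOLETE_SKILLS = ["hindsight-memory"]


def remove_obsolete(skills_root: Path, obsolete: Iterable[str]) -> List[str]:
    """Delete each listed skill folder if present; return the names removed."""
    removed: List[str] = []
    for name in obsolete:
        target = skills_root / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed


# Demo: one obsolete skill and one current skill in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("hindsight-memory", "pdf"):
        (root / name).mkdir()
        (root / name / "SKILL.md").write_text("placeholder")
    removed = remove_obsolete(root, OBSOLETE_SKILLS)
    remaining = sorted(p.name for p in root.iterdir())

print(removed, remaining)  # ['hindsight-memory'] ['pdf']
```

An explicit list means only skills the app once shipped are ever deleted; anything else you keep in the same directory is left untouched.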

How is bundling skills different from what other open source Mac AI agents do?

Goose, OpenHands, and most CLI agents ship with a small set of generic tools (file system, shell, web search) and expect you to assemble the rest. UI-TARS ships with a vision model and assumes the model will figure out the task from pixels. Fazm ships with native binaries that drive any Mac app through the accessibility tree plus seventeen pre-written skills that teach the agent how to do specific multi-step jobs (parse a PDF, draft a slide deck, scrape a site, edit a video). The category split is roughly 'agent loop with no domain knowledge' on one side and 'agent loop with domain skills already loaded' on the other. Both are valid; they suit different users.

Are the bundled skills locked to Fazm, or can I use them outside the app?

They install into ~/.claude/skills/ which is the standard Claude Code skills directory. Once installed, the same skill files are visible to any Claude Code session you run on your Mac, not just to Fazm. That means you can launch Fazm once to get the skill set on disk, then keep using the skills from a Claude Code terminal session even if Fazm is closed. Removing a skill from Fazm's BundledSkills folder and relaunching the app will delete it from ~/.claude/skills/ if you list it in obsoleteSkills, so the install logic is reversible.
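Since the installed skills are plain files, checking what is on disk takes no Fazm code at all. A small Python sketch, assuming only the ~/.claude/skills/<name>/SKILL.md layout described above (the function name `installed_skills` is hypothetical):

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def installed_skills(skills_root: Optional[Path] = None) -> List[str]:
    """Return names of folders under skills_root that contain a SKILL.md."""
    if skills_root is None:
        skills_root = Path.home() / ".claude" / "skills"
    if not skills_root.is_dir():
        return []
    return sorted(p.parent.name for p in skills_root.glob("*/SKILL.md"))


# Demo against a throwaway directory instead of the real home folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("xlsx", "pdf"):
        (root / name).mkdir()
        (root / name / "SKILL.md").write_text("placeholder")
    (root / "notes").mkdir()  # no SKILL.md inside, so it is not listed
    demo = installed_skills(root)

print(demo)  # ['pdf', 'xlsx']
```

Calling `installed_skills()` with no argument reads the real directory, which is a quick way to confirm the first launch did its job.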

Does Fazm use screenshots or accessibility APIs to drive Mac apps?

Accessibility APIs by default. The native binary Contents/MacOS/mcp-server-macos-use uses AXUIElement, AXUIElementCopyAttributeValue, and kAXFocusedWindowAttribute to read the live UI tree as text: roles, titles, values, focus state, frames, children. That costs almost nothing per call and survives dark mode, DPI changes, and theme switches. Screen capture is only invoked when visual reasoning is genuinely required, like reading a PDF figure or interpreting a design canvas. The behavior matches what the README at github.com/mediar-ai/mcp-server-macos-use documents, and the same binary is reusable from Claude Desktop, Cline, Zed, or any other MCP-aware host.
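To see why a text tree is cheap compared with a screenshot, consider a toy model of it. The Python below does not call any macOS API. The `UINode` type and the output format are hypothetical, chosen only to show how roles, titles, values, and focus state flatten into a few short lines a model can read.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UINode:
    """Hypothetical stand-in for one element of an accessibility tree."""
    role: str
    title: Optional[str] = None
    value: Optional[str] = None
    focused: bool = False
    children: List["UINode"] = field(default_factory=list)


def render(node: UINode, depth: int = 0) -> List[str]:
    """Flatten the tree into indented text lines, one per element."""
    parts = [node.role]
    if node.title:
        parts.append(f'"{node.title}"')
    if node.value:
        parts.append(f"value={node.value!r}")
    if node.focused:
        parts.append("[focused]")
    lines = ["  " * depth + " ".join(parts)]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


window = UINode(
    "AXWindow",
    title="Untitled",
    children=[
        UINode("AXTextField", value="hello", focused=True),
        UINode("AXButton", title="Save"),
    ],
)
tree_lines = render(window)
print("\n".join(tree_lines))
```

A whole window comes out as a handful of short strings, and none of them change when the theme or display scale does.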

What changed for open source Mac AI agents between March and April 2026?

The shape of the install changed more than the model layer did. UI-TARS Desktop v1.6.0 landed on 2026-04-04 with a smaller vision encoder. Goose 1.2 shipped on 2026-04-10 with automatic MCP server discovery. browser-use 0.7 shipped on 2026-04-08 with native MCP transport. Fazm 2.4.0 shipped on 2026-04-20 with custom MCP server support, opening ~/.fazm/mcp-servers.json to any third-party server you drop in. The shared move is that MCP went from a Claude Desktop curiosity to the default extension protocol across every host. Bundled skills inside a signed .app is one expression of that move: the agent loop is portable, the capability layer travels with the app.

If I want one open source Mac AI agent installed by tonight, which one is it?

If you want the agent to do PDFs, slide decks, deep research, and screen automation without a setup weekend, install Fazm. The 17 bundled skills land in ~/.claude/skills/ on first launch, the macOS accessibility binary is already signed and notarized, and the only setup step is granting accessibility permission in System Settings. If you want a vision-language alternative because the apps you target hide their accessibility tree, install UI-TARS Desktop. If you live in a terminal and you want to compose your own tools, install Goose 1.2. The three are not substitutes; they are different tools for different surfaces.

Is Fazm itself open source, including the skill installer?

Yes. The Fazm desktop app is MIT licensed at github.com/mediar-ai/fazm. SkillInstaller.swift lives at Desktop/Sources/SkillInstaller.swift in that repo. The macos-use MCP server is MIT at github.com/mediar-ai/mcp-server-macos-use. The seventeen bundled skill markdown files live at Desktop/Sources/Resources/BundledSkills/ in the same repo, each with its own license header. You can fork the repo, change which skills ship, rebuild the .app with build.sh, and distribute your own customized signed bundle.