
Build a macOS app with Claude Code: the five repo disciplines that make a real Mac app Claude-buildable

Most posts on this query are weekend stories: someone vibe-coded a backup tool in two hours and shipped a screenshot. They are fine. They are not what you need if you want Claude Code to continuously edit a real Mac app that ships to real users. That job needs five specific repo disciplines, and Fazm's open source desktop repo demonstrates each. The shape of the project, not the model, is what makes the loop safe.

Fazm

Published April 18, 2026 · 14 min read


Every claim grep-able against the open source Fazm desktop repo

Five disciplines, exact file paths and line numbers

Pattern that survives multiple Claude Code agents in parallel

Build a macOS app with Claude Code

Five repo disciplines, one shipping example

SPM, no .xcodeproj for the model to corrupt

./run.sh is the only build command, ever

/tmp/fazm-dev-status is the lifecycle source of truth

DistributedNotificationCenter triggers test non-visual features

Accessibility tree, not screenshots, verifies UI


The numbers that govern whether a Mac app is actually Claude-buildable

5 Repo disciplines that make sustained Claude Code work safe

1 Build entry point (./run.sh) and one lock (/tmp/fazm-build.lock)

296 Line in FazmApp.swift where the test-trigger handler starts

0 .xcodeproj files in the repo (it is all Package.swift)

The "build a macOS app with Claude Code" SERPs all lead with model choice and prompt tricks. The four numbers above are the ones that actually decide whether Claude can drive your repo for hours without breaking anything.

0 .xcodeproj

“executableTarget(name: 'Fazm', dependencies: [PostHog, Sentry, GRDB, Sparkle, MarkdownUI, Firebase, SessionReplay, Highlightr, BrowserProfileLight])”

Desktop/Package.swift, the entire app target

Discipline #1: Swift Package Manager, no .xcodeproj anywhere

This is the single highest-leverage decision. An .xcodeproj is a bundle whose project.pbxproj is technically plain text, but it is a generated property list full of cross-referenced object IDs that are not safe to hand-edit. Add a file, rename a target, wire a new dep, and you are doing line surgery on pbxproj that breaks the next time Xcode rewrites the file. Claude Code corrupts it on almost every attempt. Package.swift is just Swift. The model reads it like any other source file.

Desktop/Package.swift

Fazm builds a real signed, notarized .app bundle out of this. The run.sh script wraps swift build, copies resources into a Foo.app/Contents structure, signs with Developer ID, and launches. Codemagic uses a different flavor of the same flow for the production builds that go to users. Xcode never owns the project file. Claude Code does.

Discipline #2: one command, one lock, every time

Fazm's CLAUDE.md is blunt about this: ./run.sh is the ONLY command you ever run. Not swift build. Not xcodebuild. Not open from build/. The script does kill, build, sign, install, launch, in that exact order. The first thing it does is acquire a file lock so parallel agents serialize automatically.

run.sh, lines 1 to 12

The lock lives at /tmp/fazm-build.lock as a directory (mkdir is atomic on macOS, so it is a free mutex). The wait timeout is 300 seconds. If a second agent calls run.sh while the first is mid-build, it blocks until the first releases. No build corruption, no partial app bundles, no race-condition crashes on launch.
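The run.sh listing itself is not reproduced on this page. As a minimal sketch of the mkdir-based lock described above, assuming hypothetical function names (acquire_lock, release_lock), a hypothetical owner.pid marker file, and a one-second polling interval, none of which are taken from scripts/fazm-lock.sh:

```shell
# Sketch of a directory-based build lock. Function names, the owner.pid file,
# and the polling interval are illustrative assumptions, not the Fazm code.

LOCK_DIR="${LOCK_DIR:-/tmp/fazm-build.lock}"   # lock path named in the article
LOCK_TIMEOUT="${LOCK_TIMEOUT:-300}"            # wait timeout in seconds

acquire_lock() {
  local waited=0
  # mkdir either creates the directory or fails, so only one caller can win.
  until mkdir "$LOCK_DIR" 2>/dev/null; do
    if [ "$waited" -ge "$LOCK_TIMEOUT" ]; then
      echo "timed out after ${waited}s waiting for $LOCK_DIR" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  # Record the holder so a human can see who owns the lock.
  echo "$$" > "$LOCK_DIR/owner.pid"
}

release_lock() {
  rm -f "$LOCK_DIR/owner.pid"
  rmdir "$LOCK_DIR"
}
```

A build script could call `acquire_lock || exit 1` followed by `trap release_lock EXIT` before doing any work. This sketch does not handle stale locks left behind by a crashed holder.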

Discipline #3: a status file is the only allowed lifecycle reading

CLAUDE.md says: NEVER use pgrep, ps aux, or log file checks to determine whether to build/kill/restart. Use the status file. That file is /tmp/fazm-dev-status, written by run.sh and the watchdog. One line, four possible states. The decision tree is simple enough that a model gets it right every time.
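How run.sh and the watchdog write the file is not shown on this page. One plausible sketch of the writer side, assuming the `<state> <pid> <timestamp>` line format this guide describes, a hypothetical helper named write_status, and a write-then-rename step added here so a reader never sees a half-written line:

```shell
# Sketch: publish a lifecycle state as a single line.
# write_status and the temp-file-plus-rename step are assumptions, not a
# description of how the Fazm scripts actually write the file.

STATUS_FILE="${STATUS_FILE:-/tmp/fazm-dev-status}"

write_status() {
  local state="$1" pid="${2:-$$}" tmp
  # Create the temp file next to the target so the rename stays on one filesystem.
  tmp="$(mktemp "${STATUS_FILE}.XXXXXX")"
  printf '%s %s %s\n' "$state" "$pid" "$(date +%s)" > "$tmp"
  # Renaming over the old file replaces it in one step.
  mv -f "$tmp" "$STATUS_FILE"
}
```

For example, a build script might call `write_status building` before compiling and `write_status running "$app_pid"` after launch, where `app_pid` is whatever PID the script captured.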

Lifecycle inspection: the wrong way and the right way

# Racy. Lies during the build/launch transition.
# Different agents will read different states.
if pgrep -f "Fazm Dev" > /dev/null; then
  echo "running"
else
  ps aux | grep -i fazm
  tail -n 50 /private/tmp/fazm-dev.log
fi


Two agents reading the same file at the same time get the same answer. Two agents running pgrep do not.
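The status-file side of this comparison did not survive on this page. A minimal sketch of what it could look like, assuming the one-line `<state> <pid> <timestamp>` format this guide describes; the helper names fazm_state and next_action, the "unknown" fallback, and the mapping from state to action are illustrative assumptions:

```shell
# Sketch: read the lifecycle state from the status file instead of the
# process table. Helper names and the state-to-action mapping are assumptions.

STATUS_FILE="${STATUS_FILE:-/tmp/fazm-dev-status}"

fazm_state() {
  local state="" pid ts
  if [ -r "$STATUS_FILE" ]; then
    # The file holds a single line: <state> <pid> <timestamp>
    read -r state pid ts < "$STATUS_FILE" || true
  fi
  case "$state" in
    building|running|exited|failed) echo "$state" ;;
    *) echo "unknown" ;;   # missing, empty, or unrecognized file
  esac
}

next_action() {
  case "$(fazm_state)" in
    building) echo "wait" ;;    # another agent is mid-build
    running)  echo "test" ;;    # app is up, safe to send triggers
    *)        echo "build" ;;   # exited, failed, or unknown: call ./run.sh
  esac
}
```

Every agent that calls `next_action` at the same moment reads the same line and reaches the same decision, which is the property the process-table approach lacks.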

Discipline #4: every non-visual feature gets a distributed-notification trigger

Claude Code cannot click buttons. It can fire a DistributedNotificationCenter notification from a one-line xcrun swift -e invocation. So every non-visual feature in Fazm registers a com.fazm.* notification name and exercises itself when the notification arrives. The model writes the feature, writes the trigger, fires the trigger, reads the log, and is done.

Desktop/Sources/FazmApp.swift, lines 295 to 305

Notice the comment on the line above the registration. That is the literal one-liner Claude Code copies and runs to fire the trigger. Every feature follows the pattern. Replay tutorial? com.omi.replayTutorial. Send a chat query without UI? com.fazm.testQuery with a "text" key in userInfo. Inspect the floating bar state? com.fazm.control with command "getState", which writes JSON to /tmp/fazm-control-state.json.

Claude Code → xcrun swift -e → DistributedNotificationCenter → live app

Discipline #5: verify UI through the accessibility tree, not screenshots

For visual changes, Claude Code needs to confirm the new view actually rendered. The naive path is a screenshot plus an OCR or vision model. That is slow and lossy. Fazm uses the same accessibility-API engine it sells: AXUIElementCreateApplication on the running app, walked recursively, dumped to a structured text tree of role / label / value / coordinates. Verifying "the button now reads Send" is a text grep, not a vision pass.

Verifying a UI change after a Claude Code edit

The agent screenshots the window, hands the PNG to a vision model or OCR pipeline, parses the result, and compares against expected text. Slow per check, lossy on small text, totally blind to disabled states or hidden values.

Per-check latency dominated by image inference
Loses element identities (which button is which)
Coordinates are pixel-relative, not stable across resolutions
Requires a vision-capable model in the loop
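With the accessibility tree, the same check reduces to a text match. A minimal sketch, assuming the tree has already been dumped to a text file with one element per line in a hypothetical `role | label | value | x,y` layout (the real Fazm dump format is not shown in this guide) and a hypothetical helper named ax_has:

```shell
# Sketch: assert that a dumped accessibility tree contains an element with a
# given role and label. The line layout and the ax_has helper are assumptions.

ax_has() {
  local dump="$1" role="$2" label="$3"
  # Fields are compared literally, so labels are not treated as patterns.
  awk -F' [|] ' -v r="$role" -v l="$label" '
    { gsub(/^[ \t]+/, "", $1) }        # strip tree indentation before the role
    $1 == r && $2 == l { found = 1 }
    END { exit found ? 0 : 1 }
  ' "$dump"
}
```

A check such as `ax_has /tmp/ax-tree.txt AXButton Send` (the path is a placeholder) succeeds only if a button labelled Send is present, with no image inference involved.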

All five disciplines, condensed

Each one is grep-able against the public Fazm desktop repo. None of them are theory. They are the working setup that ships the app to users every week.

1. Swift Package Manager, no .xcodeproj

Desktop/Package.swift declares an executableTarget named Fazm with PostHog, Sentry, GRDB, Sparkle, swift-markdown-ui, Firebase, and macos-session-replay. Every dep change is a text edit a model can make. No pbxproj line surgery.

2. One command, one lock

./run.sh is the only build entry point. It sources scripts/fazm-lock.sh and acquires /tmp/fazm-build.lock at line 9. Parallel agents serialize automatically; one agent's build never stomps another.

3. Status file as lifecycle source of truth

/tmp/fazm-dev-status holds one line: <state> <pid> <ts>, where state is building, running, exited, or failed. Before any build/kill/test decision, every agent reads this file first. No pgrep, no ps aux guessing.

4. DistributedNotificationCenter test hooks

Every non-visual feature exposes a com.fazm.* notification name. The model fires the trigger from xcrun swift -e in one line, the app exercises the feature end to end, and results land in /tmp/fazm-control-state.json or the dev log.

5. Accessibility-API verification

UI verification reads the AXUIElement tree (role, label, value, coordinates) as structured text, not pixels. Claude Code confirms 'the button now reads Send' in milliseconds. Same engine Fazm sells, used to test Fazm.

What a single Claude Code edit-to-verify cycle looks like inside Fazm

The model edits a Swift file, calls run.sh, the lock serializes against any other agent, the build runs, the status file flips from building to running, the model fires a distributed notification to exercise the change, and reads the log to confirm. End to end, no human in Xcode.

Claude Code → run.sh → status file → app → notification → log
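The step where the status file flips from building to running is the point at which an agent has to wait. A sketch of that wait, assuming a hypothetical helper named wait_for_state, with a polling interval, default timeout, and return codes chosen for illustration:

```shell
# Sketch: poll the status file until it reports the wanted state.
# wait_for_state, the 1-second poll, and the return codes are assumptions.

STATUS_FILE="${STATUS_FILE:-/tmp/fazm-dev-status}"

wait_for_state() {
  local want="$1" timeout="${2:-120}" waited=0 state
  while :; do
    state=""
    if [ -r "$STATUS_FILE" ]; then
      # The first field of the single status line is the state.
      read -r state _ < "$STATUS_FILE" || true
    fi
    if [ "$state" = "$want" ]; then
      return 0
    elif [ "$state" = "failed" ]; then
      return 2    # a failed build will not reach another state on its own
    elif [ "$waited" -ge "$timeout" ]; then
      return 1    # gave up waiting
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

An agent could call `wait_for_state running 300` after invoking ./run.sh and only fire its test trigger once the call returns 0.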

The exact triggers Claude Code fires against the live app

These are not pseudocode. Each one runs on a real macOS terminal and exercises a real feature in the running Fazm Dev app. The model writes them, runs them, parses the response.

Three real test triggers from the Fazm CLAUDE.md
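The trigger listing is not reproduced on this page. As a sketch of the general shape, the helper below builds the same Swift one-liner this guide quotes for com.fazm.testQuery. The helper names trigger_source and fire_trigger are hypothetical, and the function only supports zero or one string key in userInfo:

```shell
# Sketch: build and fire a DistributedNotificationCenter trigger from the shell.
# trigger_source and fire_trigger are hypothetical helper names. Values that
# contain quotes or backslashes would need escaping, which this sketch skips.

trigger_source() {
  local name="$1" key="${2:-}" value="${3:-}" info="nil"
  if [ -n "$key" ]; then
    info="[\"$key\": \"$value\"]"
  fi
  printf '%s' "import Foundation; DistributedNotificationCenter.default().postNotificationName(.init(\"$name\"), object: nil, userInfo: $info, deliverImmediately: true); RunLoop.current.run(until: Date(timeIntervalSinceNow: 1.0))"
}

fire_trigger() {
  # Needs macOS with the Swift toolchain installed; xcrun does not exist elsewhere.
  xcrun swift -e "$(trigger_source "$@")"
}
```

On a Mac with the dev app running, `fire_trigger com.fazm.testQuery text "hello"` would post the chat-query trigger. Calling `trigger_source` alone prints the Swift source so it can be inspected without firing anything.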

The files Claude Code actually reads when it builds Fazm

No .xcodeproj. No .pbxproj. Plain text everywhere.

Desktop/Package.swift
run.sh
scripts/fazm-lock.sh
Desktop/Sources/FazmApp.swift
CLAUDE.md
/tmp/fazm-build.lock
/tmp/fazm-dev-status
/private/tmp/fazm-dev.log
/tmp/fazm-control-state.json
Desktop/Sources/Chat/ACPBridge.swift
Desktop/Sources/AppState.swift
Desktop/CHANGELOG.json

Discipline-driven Mac app vs the vibe-coded weekend app

The "I shipped a Mac app with Claude Code in a weekend" posts are real. They also depend on a single agent, a clean machine, and a willingness to babysit. The Fazm pattern is what you reach for when you want the loop to keep running while you sleep.

Feature	Weekend vibe-coded Mac app	Fazm pattern (discipline-driven)
Project format	Usually .xcodeproj	Package.swift, plain text
Build entry points	Many (xcodebuild, swift build, Xcode UI)	1 (./run.sh)
Multi-agent safe	No (races on first conflict)	Yes (file lock + status file)
Non-visual feature testing	Manual click-through	DistributedNotificationCenter triggers
UI verification	Screenshot + vision model	Accessibility tree (text)
Survives long Claude Code sessions	Usually drifts in 2 to 3 hours	Yes
Real production shipping	Usually a personal .app on Desktop	Codemagic, Sparkle auto-update, signed DMG

Three things the "build a Mac app with Claude Code" posts skip

Not instead of those posts. Beside them. They prove the model can. The disciplines below are what make the loop safe to leave running.

.xcodeproj is the silent killer

Every "I built a Mac app with Claude Code" post that uses Xcode hits this within hours. The model edits pbxproj, Xcode rewrites it, the diff explodes. Switch to Package.swift on day one and this entire failure mode disappears.

Tests need a trigger surface

You cannot ask Claude Code to "click the button and check the result" reliably. You can ask it to fire a distributed notification and read the log. Build the trigger surface as you build the feature, not after.

Multi-agent is the actual workload

A single agent in a single window is the demo. Real Claude Code work is two or three agents in parallel on the same repo. Without a build lock and a status file you discover this the hard way: a half-built .app launching against a half-killed previous instance.

What gets easier the day you switch your Mac app to this pattern

Adding a new dependency

One edit to Package.swift, run.sh resolves and builds. No 'edit the project, then add to target, then check Embed & Sign' Xcode dance.

Renaming a file

rg + sd + git mv. SPM picks up the new path automatically. No pbxproj patching.

Adding a feature

Write the SwiftUI view, register a com.fazm.* trigger, fire the trigger from a one-liner, watch the log. Done.

Reviewing what changed

Plain Swift diff. No 'why did 80 lines of pbxproj IDs shuffle?' noise in PRs.

Running multiple agents

First agent acquires /tmp/fazm-build.lock. Second one waits, then proceeds. No collisions, no half-built bundles.

Shipping to users

Same Package.swift drives Codemagic. The CI build is just run.sh in a different costume. What works locally works in CI.

See the pattern in production

Fazm is open source. The five disciplines in this guide are visible in the public repo: Desktop/Package.swift, run.sh, scripts/fazm-lock.sh, Desktop/Sources/FazmApp.swift line 296, and CLAUDE.md. Clone it, read it, lift the parts you want into your own Mac app. The Fazm consumer app itself is free to download and runs on real accessibility APIs, the same engine that lets the model verify its own UI changes.


Frequently asked questions

Can Claude Code actually build a real macOS app, not just a toy?

Yes, and Fazm is the existence proof. The Fazm desktop app is a Swift Package Manager binary signed and notarized through Codemagic, shipped via Sparkle auto-update, with PostHog, Sentry, GRDB, Firebase, and macOS session replay all in production. The whole repo is structured so Claude Code can build, test, and verify it without a human in Xcode. The pattern is not 'use Claude Code instead of Xcode'; it is 'restructure the repo so Claude Code can drive the same build Xcode would have driven.'

Why use Swift Package Manager instead of an .xcodeproj?

An .xcodeproj bundle stores the project in project.pbxproj, a generated property list whose internal object IDs Claude Code cannot reliably edit. Adding a file, renaming a target, or wiring a new dependency means hand-editing pbxproj line by line, which corrupts under any non-trivial diff. Package.swift is plain Swift code. Claude Code reads it like any other source file. Fazm's Desktop/Package.swift declares an executableTarget named Fazm with PostHog, Sentry, GRDB, Sparkle, swift-markdown-ui, Firebase, and macos-session-replay as dependencies. Every change to that file is a normal text edit a model can make safely.

What is the 'one command' rule and why does it matter?

Fazm's CLAUDE.md says: ./run.sh is the ONLY command you ever run. Not xcrun swift build, not xcodebuild, not open. The script handles killing the previous instance, building the ACP bridge, building the Swift app, copying resources, signing, and launching. There is exactly one entry point and exactly one lock at /tmp/fazm-build.lock. With multiple Claude Code agents running in parallel (which becomes normal once you have tried it), this is the only way to keep them from clobbering each other's builds.

How does Claude Code verify a non-visual change without clicking through the UI?

Through DistributedNotificationCenter test triggers. Fazm registers handlers like com.fazm.testQuery, com.fazm.testTutorial, and com.fazm.control inside FazmApp.swift (line 296 onward). Claude Code fires them from a single line: xcrun swift -e 'import Foundation; DistributedNotificationCenter.default().postNotificationName(.init("com.fazm.testQuery"), object: nil, userInfo: ["text": "hello"], deliverImmediately: true); RunLoop.current.run(until: Date(timeIntervalSinceNow: 1.0))'. The app receives the notification, exercises the feature end to end, and writes results to /tmp/fazm-control-state.json or /private/tmp/fazm-dev.log. No GUI automation needed.

What is /tmp/fazm-dev-status and why is it required?

It is the single source of truth for the app lifecycle. One line in the format <state> <pid> <unix_timestamp>, where state is one of building, running, exited, failed. Before Claude Code decides to build, kill, or send a test trigger, it reads this file. Without it, parallel agents resort to pgrep and ps aux, which are racy and produce contradictory readings during the build/launch transition. The file is the contract that lets multiple agents share one running app without stepping on each other.

Does Fazm use accessibility APIs instead of screenshots?

Yes, and it is what makes the testing story work. Verifying a UI change in a screenshot-driven loop means OCR-ing a PNG, which is slow and lossy. Fazm uses macOS accessibility APIs (AXUIElementCreateApplication and friends) to read element identities, labels, values, and coordinates as structured text. Claude Code can verify 'the button now reads Send' by walking that tree in milliseconds, no vision model required. The same engine that makes Fazm's product work also makes Fazm itself testable by Claude Code.

What about Xcode previews and Interface Builder?

Fazm does not use Interface Builder. The entire UI is SwiftUI. Previews still work in Xcode if you open the package, but the build/test loop does not depend on them. The model writes SwiftUI views, run.sh builds and launches the app, distributed notifications drive the new view, and a screenshot or accessibility tree confirms it rendered correctly. Previews are a developer convenience, not part of the agent's verification path.

How do you keep multiple Claude Code agents from breaking each other?

Three mechanisms. First, /tmp/fazm-build.lock is a directory-based file lock acquired at the top of run.sh by sourcing scripts/fazm-lock.sh. Second, /tmp/fazm-dev-status records who is doing what. Third, the activity tail of /private/tmp/fazm-dev.log marks recent test traffic so a second agent will wait rather than rebuild over the first. CLAUDE.md tells every agent: do not delete the lock manually, do not kill the app externally, always read the status file first.

Is Fazm itself open source so I can read this myself?

Yes. github.com/mediar-ai/fazm is the public repository. Every file referenced in this guide (Desktop/Package.swift, run.sh, scripts/fazm-lock.sh, Desktop/Sources/FazmApp.swift, CLAUDE.md) is browsable. The patterns described here are not theory; they are the working production setup that ships Fazm to thousands of users via Codemagic-signed DMGs and Sparkle updates.

Why not just use Cursor or Xcode's built-in AI?

Cursor is great in the editor but does not give you a long-running multi-agent loop driving build, test, and verification autonomously. Xcode's Swift Assist is single-file completion. The Fazm pattern is for the case where you want Claude Code to operate on the whole repo, run tests, fix what it broke, and keep going while you do other things. The disciplines in this guide are what make that loop safe to leave running.

The shape of the project, not the model, is what makes the loop safe

The honest read on "build a macOS app with Claude Code" in April 2026 is that the model is no longer the bottleneck. Sonnet 4.6 and Opus 4.7 happily write SwiftUI views, wire up Combine pipelines, and reason about Sparkle update channels. What breaks is the project around them: an .xcodeproj whose pbxproj the model corrupts, a build command that races against the next agent, a UI verification path that waits on a vision model.

The five disciplines in this guide remove every one of those failure modes. SPM instead of .xcodeproj kills project-file corruption. One command and one lock kill build collisions. A status file kills lifecycle ambiguity. DistributedNotificationCenter triggers kill the GUI-clicking dependency. Accessibility-tree verification kills the screenshot loop. None of it is exotic. All of it is grep-able in the Fazm repo today.

5 disciplines. Try Fazm.