FOLDER SCOPE / ATOMIC LOCK / IDLE WINDOW

Folder rules are not enough. Here is the 144-line lock we ship.

Every guide on parallel Claude Code agents tells you the same thing: give the frontend agent components/, give the backend agent api/, give the test agent __tests__/. That advice is correct for source files. It is also incomplete. On a real shipping project, the artifacts that actually collide are not in any folder: the build binary, the running app process, the append-only test log. This page is the working production answer we use to keep multiple Claude Code agents from breaking each other on the Fazm codebase, with the exact lock script and the rule set our CLAUDE.md enforces.

Matthew Diakonov
9 min read
From the working production setup, open source:

  scripts/fazm-lock.sh, 144 lines, in the public repo
  mkdir-based atomic lock at /tmp/fazm-build.lock
  Stale PID detection via kill -0
  30-second idle window against the live app log
  Status file at /tmp/fazm-dev-status as the lifecycle source of truth

DIRECT ANSWER, VERIFIED 2026-05-12

Two layers. Layer one is folder scope: each Claude Code agent owns a directory and only writes there. Layer two is an atomic file-system lock for the shared artifacts that do not live in any one folder (the binary, the log, the running process). Fazm uses a mkdir-based lock directory at /tmp/fazm-build.lock containing the holder PID, plus a 30-second idle-window check against the live app log so a fresh build never wipes out another agent's in-flight test. The lock script is open source at github.com/mediar-ai/fazm, file path scripts/fazm-lock.sh.

THE PART NOBODY MENTIONS

Folder ownership solves only the easy half of the problem

The official guidance is fine as far as it goes. Two teammates editing the same file leads to overwrites, so you split the work so that each teammate owns a disjoint set of files. The frontend agent gets src/components/, the backend agent gets src/api/, the test agent writes inside src/__tests__/. No agent writes outside its domain without an explicit handoff. That contract, written into the project's CLAUDE.md, is the cheapest and highest-leverage thing you can do.

Now look at any real project shipping a binary. The Fazm desktop app, for instance, has exactly one build output (build/Fazm Dev.app), exactly one running process (a single instance of Fazm Dev matched by pkill -f), exactly one bundle identifier (com.fazm.desktop-dev), exactly one append-only log at /private/tmp/fazm-dev.log, and exactly one status file at /tmp/fazm-dev-status. None of those are in any subdirectory of the source tree. A folder partition cannot represent them. If agent A is mid-test against the running app and agent B starts a fresh build, the first thing agent B's script does is pkill -f "Fazm Dev.app", which kills agent A's app mid-query. Agent A's run is now meaningless.

This is the gap. The folder rule is sound. It does not extend to runtime artifacts. You need a second mechanism for those, and that mechanism cannot just be "wait for the lock"; it also has to respect that the singleton might be in use right now even if no build is in progress.

THE ANCHOR FACT

What the working lock actually looks like

The lock script is 144 lines. Most of those lines are timestamps, logging, and the activity-window helper. The acquire loop itself is short, and the part that matters is which atomic primitive it uses. Compare the shape most "lock file" blog posts suggest with what scripts/fazm-lock.sh in the Fazm repo actually does:

Naive vs production

# What most blog posts suggest
echo $$ > /tmp/build.lock
trap 'rm -f /tmp/build.lock' EXIT

# Problems:
# - 'echo > file' is not atomic; two procs can both win
# - No way to know if the holder is still alive
# - No respect for a running app mid-test
# - Other agents must rm -f manually when it gets stuck
Roughly 3x the lines of the naive version, in exchange for actual correctness.

Three things make this work where the naive version fails. First, mkdir is atomic; only one caller returns success. Second, the holder PID lets a future caller distinguish a live agent from a dead one. Third, the idle check (fazm_app_is_idle) keeps a fresh build from killing an in-flight test by examining the app's own log for activity in the last 30 seconds.
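The shape of that loop can be sketched in a few lines of POSIX shell. This is a simplified illustration, not the shipped 144-line script: the app_is_idle stub stands in for fazm_app_is_idle, and the function and variable names are mine, not the repo's.

```shell
#!/bin/sh
# Illustrative sketch of the acquire loop; not the shipped fazm-lock.sh.
LOCK=/tmp/fazm-build.lock

app_is_idle() { true; }   # stub: the real check tails the app log (next section)

acquire_lock() {
  while :; do
    if mkdir "$LOCK" 2>/dev/null; then          # atomic: exactly one caller wins
      echo $$ > "$LOCK/pid"                     # record the holder for stale detection
      if app_is_idle; then
        trap 'rm -rf "$LOCK"' EXIT              # auto-release when this shell exits
        return 0
      fi
      rm -rf "$LOCK"                            # app is mid-test: give the lock back
    else
      holder=$(cat "$LOCK/pid" 2>/dev/null)
      if [ -n "$holder" ] && ! kill -0 "$holder" 2>/dev/null; then
        rm -rf "$LOCK"                          # holder is dead: reclaim and retry now
        continue
      fi
    fi
    sleep 5                                     # lock held or app busy: park and retry
  done
}
```

Note the ordering: the PID file is written only after the mkdir win, so a reader that sees the directory but no pid file has simply raced the holder by a few milliseconds, and kill -0 on an empty holder is skipped by the -n guard.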

THE IDLE WINDOW

Reading the live app log instead of guessing

The idle-window check is the load-bearing piece. The naive question "is anything happening in this app?" has many wrong answers (pgrep returns yes any time the process exists; ps aux returns yes even when the app is paused). The right question is "did any user-driven or agent-driven activity log a line in the last N seconds?" The Fazm script answers that by tailing the last 2000 lines of /private/tmp/fazm-dev.log and grepping for a small set of patterns that only fire on real activity.

fazm_app_is_idle, in practice

The patterns we look for are deliberately user-facing: Chat query started, Test query received, Control command received, PushToTalkManager: started listening, floating_bar_query_sent, floating_bar_ptt_started, chat_agent_query_completed. Background noise (GeminiAnalysis cron lines, PostHog batching, layout resize events) is intentionally excluded; matching those would make every agent wait forever.
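A hedged sketch of what such a check can look like. Assumptions to flag: the real fazm_app_is_idle may parse its log differently; here I assume each activity line carries an HH:MM:SS timestamp, as in the "Chat query started @ 18:42:17" example elsewhere on this page, and I punt on the midnight wrap by treating it as idle.

```shell
#!/bin/sh
# Sketch of an idle-window check over an app log. Assumes activity lines
# contain an HH:MM:SS timestamp; the real script's log format may differ.
FAZM_IDLE_WINDOW_SEC=${FAZM_IDLE_WINDOW_SEC:-30}
LOG=/private/tmp/fazm-dev.log
PATTERNS='Chat query started|Test query received|Control command received|PushToTalkManager: started listening|floating_bar_query_sent|floating_bar_ptt_started|chat_agent_query_completed'

to_secs() {  # HH:MM:SS -> seconds since midnight (strip leading zeros for arithmetic)
  h=${1%%:*}; r=${1#*:}; m=${r%%:*}; s=${r#*:}
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

app_is_idle() {
  last=$(tail -n 2000 "$LOG" 2>/dev/null | grep -E "$PATTERNS" | tail -n 1)
  [ -z "$last" ] && return 0                    # no activity lines at all: idle
  ts=$(printf '%s\n' "$last" | grep -oE '[0-9][0-9]:[0-9][0-9]:[0-9][0-9]' | tail -n 1)
  [ -z "$ts" ] && return 0                      # no parsable timestamp: treat as idle
  age=$(( $(to_secs "$(date +%H:%M:%S)") - $(to_secs "$ts") ))
  [ "$age" -lt 0 ] && return 0                  # crossed midnight: assume idle
  [ "$age" -ge "$FAZM_IDLE_WINDOW_SEC" ]        # idle only if the window has elapsed
}
```

The design point survives any parsing detail: the function answers "was there user-visible activity recently," not "does the process exist," which is why background noise is excluded from PATTERNS.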

WHAT IS UNDER CONTENTION

The artifacts that cannot be partitioned by folder

The build output

build/Fazm Dev.app is one bundle on disk. Two concurrent builds writing into it produce a corrupt bundle that fails Codemagic notarization. The lock serializes the build phase so only one writer touches it at a time.

The running app process

Exactly one Fazm Dev process is allowed at a time; the second pkill -f 'Fazm Dev.app' kills whichever instance the previous agent just launched. Without the idle window, every parallel rebuild is a kill -9 to whichever agent was testing.

The append-only log

/private/tmp/fazm-dev.log is never truncated between runs (other agents may be tailing it). It is the source of truth for fazm_app_is_idle, so it is also the source of truth that decides whether the next agent can start.

The status file

/tmp/fazm-dev-status holds <state> <pid> <ts>. Every agent reads it before deciding to build, kill, or test. State transitions (building, running, exited, failed) are exclusive; the script writes them inside its own lock, so the file is never half-written.
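One way to guarantee readers never see a torn line is write-then-rename. Whether run.sh writes the file directly (it already holds the lock) or renames a temp file is an implementation detail of the repo, so treat this as a sketch of the invariant, not the shipped code:

```shell
#!/bin/sh
# Sketch: publish a status transition so a concurrent reader never sees
# a half-written line. Names are illustrative, not copied from run.sh.
STATUS=/tmp/fazm-dev-status

write_status() {  # usage: write_status building|running|exited|failed
  printf '%s %s %s\n' "$1" "$$" "$(date +%s)" > "$STATUS.tmp" &&
    mv "$STATUS.tmp" "$STATUS"    # rename is atomic within one filesystem
}
```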

Bundle identifier and defaults

com.fazm.desktop-dev is the namespace for cfprefsd keys (auth_isSignedIn, hasCompleted, currentTier). Two builds racing against the same defaults database produce non-deterministic auth state. The lock keeps the writes serial.

Crash detection flag files

Each user has a .fazm_running file in ~/Library/Application Support/Fazm/users/<uid>/. The dev relaunch path deletes these. Two parallel run.sh invocations would compete on the same flag files and one would always lose.

A CONCRETE COLLISION

Two Claude Code agents arrive ten seconds apart

Below is what the lock does when agent A is mid-test and agent B fires off a separate task that requires a fresh build. Without the lock, agent B's first command (pkill) would have killed agent A's app mid-query. With the lock plus the idle window, agent B simply parks for a few seconds, then proceeds when the live session goes quiet.

Two parallel agents, one shared app

  1. Agent A: mkdir /tmp/fazm-build.lock succeeds; pid=A written into the lock dir.
  2. App log: "Chat query started @ 18:42:17" (agent A's test is in flight).
  3. Agent B, 10 seconds later: mkdir fails with EEXIST; holder A's PID is still alive, so no stale cleanup.
  4. Agent B: tails the last 2000 log lines and scans for activity; the last match is 4 seconds old, inside the 30-second idle window.
  5. Agent B: sleeps 5 seconds and retries.
  6. Agent A: run.sh exits and releases the lock.
  7. Agent B: mkdir succeeds; pid=B written. Agent B proceeds.

INSIDE fazm_acquire_lock

The full acquire flow as the script runs it

One loop iteration:

  1. mkdir the lock dir: the atomic primitive.
  2. Win? Write the PID into it. EEXIST means you lost.
  3. Check idle: tail the app log for activity in the last 30 seconds.
  4. Busy? Release the lock (rm -rf), sleep 5, retry.
  5. Idle? Set a trap on EXIT so the lock auto-releases.
  6. Return 0: the caller runs run.sh.

The release path is just rm -rf "$FAZM_LOCK_FILE", but it is wrapped in a guard: the function only releases the lock if the PID written inside matches $$. If you somehow source the script in a shell that did not acquire the lock, you cannot accidentally drop another agent's hold.
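In sketch form, with the guard made explicit (the function name is illustrative; $FAZM_LOCK_FILE is the lock-directory variable the paragraph above names):

```shell
#!/bin/sh
# Sketch of the guarded release: only the acquiring shell may drop the lock.
fazm_release_lock() {
  holder=$(cat "$FAZM_LOCK_FILE/pid" 2>/dev/null)
  if [ "$holder" = "$$" ]; then
    rm -rf "$FAZM_LOCK_FILE"      # we wrote this PID, so the lock is ours to drop
  fi                              # otherwise: someone else's hold, leave it alone
}
```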

THE OTHER FILE THAT MATTERS

/tmp/fazm-dev-status, read first, every time

The lock decides who gets to build; the status file decides whether you need to build at all. CLAUDE.md tells every agent to read this file before any build, test, or kill decision. It contains one line. The decision tree it drives is short:

reading /tmp/fazm-dev-status
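As a sketch, with the state names from this page (the real decision logic in run.sh may differ in detail):

```shell
#!/bin/sh
# Sketch of the decision tree driven by /tmp/fazm-dev-status.
STATUS=/tmp/fazm-dev-status

decide() {
  [ -f "$STATUS" ] || { echo build; return; }   # no status yet: free to build
  read -r state pid _ < "$STATUS"
  case "$state" in
    building) echo wait ;;                      # another agent is mid-build
    running)
      if kill -0 "$pid" 2>/dev/null; then
        echo reuse                              # app alive: notify it, skip the build
      else
        echo build                              # status file is stale: PID is dead
      fi ;;
    *) echo build ;;                            # exited, failed, or unknown
  esac
}
```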

This is the part most parallel-agent setups miss. The lock prevents collisions when you do need to build, but the cheapest way to coordinate is to not build at all. If the app is already running and your task is to test a feature inside it, send a distributed notification (Fazm registers com.fazm.testQuery as a programmatic test hook) and skip the entire build pipeline. No lock, no pkill, no killed app.

At a glance:

  /tmp/fazm-build.lock (directory): mkdir is atomic on every filesystem
  PID file inside the lock dir: kill -0 $holder_pid for stale detection
  FAZM_IDLE_WINDOW_SEC=30: scan tail -2000 of /private/tmp/fazm-dev.log
  Activity patterns: Chat query started, Test query received, floating_bar_ptt_started, chat_agent_query_completed
  /tmp/fazm-dev-status: lifecycle source of truth
  scripts/fazm-lock.sh: 144 lines

PUTTING IT TOGETHER

The contract every parallel Claude Code agent on this repo follows

The whole point of writing rules into a CLAUDE.md is that every fresh Claude Code session reads them before doing anything. The Fazm CLAUDE.md says, in essence, four things, and these four lines are what make parallel work on the codebase actually possible:

  1. Just run ./run.sh. The script handles locking, waiting, status updates, and lifecycle. Do not call swift build or pkill or rm by hand.
  2. Never delete /tmp/fazm-build.lock manually. The lock is a directory managed by the scripts. Manually deleting it defeats the whole concurrency system. Stale locks are auto-detected by kill -0 and cleaned up by the next agent that needs the lock.
  3. Never kill the app externally. run.sh stops the old app as part of its flow. Killing it from outside (pkill -f from your shell) orphans the lock and corrupts the status file.
  4. Read /tmp/fazm-dev-status before any build/test decision. If it says running and the PID is alive, do not rebuild; send a distributed notification instead.

The user's global ~/.claude/CLAUDE.md adds the matching guardrail for source files: if you see build errors in files you did not edit, do not try to fix them; wait 30 seconds and retry the build, because another agent is likely mid-edit and will fix its own errors. That rule, plus the file-system lock, plus the idle window, plus the status file, is the whole multi-agent coordination story on this codebase. Nothing fancier is needed.

Want a walkthrough of the Fazm lock setup on a call?

Twenty minutes to look at scripts/fazm-lock.sh, the status file plumbing, and how the CLAUDE.md contract drops parallel-agent collisions to near zero on a real shipping macOS app.

Parallel Claude Code agents and file ownership, in detail

How do parallel Claude Code agents avoid clobbering each other on shared files?

Two layers. Layer one is folder scope. The frontend agent owns components/, the backend agent owns api/, the test agent owns __tests__/. Most file edits never collide because the agents are working in different directories. Layer two is an atomic file-system lock for shared artifacts that do not live in any one folder (the build output, the running process, the append-only test log). Fazm uses a directory-based lock at /tmp/fazm-build.lock created with mkdir, which is atomic on every filesystem. The holder writes its PID into the directory so a second agent can detect a dead holder with kill -0 and reclaim a stale lock. CLAUDE.md tells each agent: do not delete the lock by hand, do not kill the app externally, always read the status file first.

What is wrong with just giving each subagent a folder and calling it done?

It works for source files. It does not work for the artifacts that exist outside any folder. A native macOS app has one binary in build/, one bundle ID, one running process, one append-only log at /private/tmp/fazm-dev.log, and one status file. None of those belong to a subdirectory of src/. If two agents both call run.sh, one will compile while the other is mid-test; the test agent's logged-in app will get killed by the next pkill in line 75 of run.sh, mid-query. The folder partition cannot represent that. You need a real lock that serializes the actions on the shared singleton, plus a separate signal for whether the running app is currently busy.

Why is the lock a directory, not a file?

Because mkdir is atomic on every filesystem and every macOS version that Fazm ships to. The classic 'echo $$ > /tmp/foo.lock' pattern is not atomic on its own. Two processes can both succeed in writing to it, and the last one wins; whichever process reads next sees a single PID and assumes it owns the lock, but actually two of them are running concurrently. mkdir cannot lose this race: only one mkdir call returns success, the rest return EEXIST. Inside /tmp/fazm-build.lock you will find a 'pid' file with the owner's process ID and a 'script' file with the name of the script that grabbed it (so the failure message can tell you which run.sh is holding things up).
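A self-contained demonstration of the difference (any POSIX shell; the path is a throwaway):

```shell
#!/bin/sh
# mkdir loses no races: the second attempt fails with EEXIST.
LOCK="/tmp/demo-lock.$$"

mkdir "$LOCK" && echo "first mkdir: won"
mkdir "$LOCK" 2>/dev/null || echo "second mkdir: lost (EEXIST)"

# Contrast: redirection happily "succeeds" for every caller.
echo 111 > "$LOCK/pidfile"
echo 222 > "$LOCK/pidfile" && echo "second echo: also succeeded, last writer wins"
cat "$LOCK/pidfile"   # prints 222: the first writer's claim is gone

rm -rf "$LOCK"
```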

What is the idle window check, and why does it exist?

Folder ownership and a binary lock together still are not enough. Consider this case: agent A finished its build, the app is running, and the user is mid-conversation with the app while a test runs. Agent B arrives, acquires the lock, runs run.sh, which kills the existing app at line 75 (pkill -f 'Fazm Dev.app'). The user's in-flight test dies mid-query. The fix is fazm_app_is_idle, which scans the last 2000 lines of /private/tmp/fazm-dev.log for activity patterns (Chat query started, Test query received, Control command received, floating_bar_query_sent, floating_bar_ptt_started, chat_agent_query_completed). If any of those landed in the last FAZM_IDLE_WINDOW_SEC seconds (default 30), agent B releases the lock it just acquired, sleeps 5 seconds, and retries. The app gets to finish its work before the next rebuild.

How does Fazm detect a stale lock from a dead agent?

When mkdir on /tmp/fazm-build.lock fails, the script reads the holder PID from the lock directory and runs kill -0 $holder_pid. If that returns nonzero, the holding process is dead. The script logs 'Removing stale lock from dead process' along with the script name that grabbed it, deletes the lock directory, and retries the mkdir on the next loop iteration. This handles the common case where a previous Claude Code session was Ctrl-C'd at the wrong moment, or a SIGKILL came from somewhere unexpected. No human ever has to rm -rf the lock; the next agent that arrives sees the dead PID and cleans up.

What status does a third agent check before deciding to build?

/tmp/fazm-dev-status holds one line in the format '<state> <pid> <ts>'. State is one of 'building', 'running', 'exited', or 'failed'. The agent reads the file, parses the state, and acts: if 'running', it checks 'kill -0 $pid' to confirm the listed app is actually alive (the file can lie if the app was force-killed). If alive, the agent does not need to build; it can send a distributed notification straight to the running app (com.fazm.testQuery is the production hook). If the file shows 'building', the agent waits. If it says 'exited' or is missing, the agent is free to run run.sh. Crucially, the agent does not use pgrep or ps aux to figure this out; those answer a different question (is any process matching this name alive) than the one that matters (is the app I want to test currently in the state I expect).

What does this look like in the global Claude Code rule set?

Two lines in /Users/matthewdi/fazm/CLAUDE.md spell it out. The first tells every agent: just run ./run.sh, it handles everything; if another agent holds the lock, it waits up to 5 minutes. The second tells every agent: never manually delete /tmp/fazm-build.lock, never run rm -rf on it; the lock is a directory managed by the scripts and manually deleting it defeats the concurrency system. Combined with the Multiple Agents in Parallel section in the user's global ~/.claude/CLAUDE.md (build errors you didn't cause, wait 30 seconds and retry), this gives a complete contract: scoped files for source edits, atomic locks for shared artifacts, idle window for running processes, and explicit don'ts for the failure modes that look tempting in the moment.

Why is this specific to a macOS app and not a generic answer?

It is not. The same shape applies to any project where the artifact under contention is not a file: a shared dev server, a Docker container, a port, a database, a queue. The folder partition handles source edits; the file-system lock handles the singleton artifact; the idle window handles the case where the singleton is currently being used by another agent's in-flight task. The specific binaries change (it is sips and pkill on macOS, it would be docker and curl on a server project) but the three-layer pattern is the same. Folder, lock, idle.

Is the lock script open source so I can read it myself?

Yes. The repository is github.com/mediar-ai/fazm. The lock implementation is at scripts/fazm-lock.sh (144 lines, 5687 bytes). The status file logic is inside run.sh. The CLAUDE.md file at the repo root is the contract every Claude Code agent reads when it picks up work in that repo. You can clone the repo, point your own Claude Code session at it, and the rules apply automatically. Nothing here is theory or a stylized example; this is the working production setup that ships Fazm to thousands of macOS users.

What happens if a parallel agent ignores the rules and deletes the lock by hand?

Two builds run concurrently. They both write to build/Fazm Dev.app, they both run pkill -f 'Fazm Dev.app' (so each kills the other's just-launched app), they both append to /private/tmp/fazm-dev.log (the log interleaves and stops being readable), and at some point Codemagic notarization fails because the bundle signature does not match the binary it expects. The cost is roughly a 15-minute rebuild plus a few minutes of confusion reading the interleaved log. That is why CLAUDE.md is explicit about never deleting the lock manually: the cost of doing so falls on whoever has to debug the resulting mess, not on the agent that took the shortcut.
