Five layers, not one file

AI coding agent spec files, five layers a shipping repo uses

Most guides on this topic answer one question: should you write a CLAUDE.md, an AGENTS.md, or a .cursorrules file? That picks one file and calls it done. In a real production repo the spec layer is five kinds of file in different roles, loaded at different times, for different reasons. Below is what the layered stack actually looks like, with line counts from the Fazm Mac agent's open MIT source so you can verify the numbers yourself.

Matthew Diakonov, Written with AI

Published May 11, 2026 · 9 min read

Direct answer, verified 2026-05-11

A real production AI coding agent reads five layers of spec files: (1) a project-root context file, CLAUDE.md or AGENTS.md or .cursorrules, that is always loaded, (2) named procedure files under .claude/skills/ or .claude/commands/ that load on demand when the agent matches a request to their description, (3) runtime status files the agent reads on disk before acting (build state, lock state, app pid), (4) a persistent memory directory for cross-session notes, separate from the policy you author, and (5) a settings.json that the harness reads for permissions, hooks, env, and model choice. CLAUDE.md alone is the most talked-about layer and the least sufficient one.

Why the single-file frame is incomplete

The current crop of writeups treats CLAUDE.md, AGENTS.md, and Copilot Instructions as three competing formats. Pick one, write a good one, ship. The framing comes from the early Claude Code days when a single project file was genuinely the whole spec surface. That frame is still useful for a tiny repo. It breaks the moment you have more than one named procedure to teach the agent, more than one ephemeral state that affects the next decision, or more than one machine the same project runs on.

The break happens for a reason. CLAUDE.md is always loaded into context. Every token in it pays rent on every conversation, every subagent, every cron. If you inline a 400-line release procedure into CLAUDE.md, every chat about an unrelated bugfix carries those 400 lines for no reason. The frontmatter-described procedure file fixes that by deferring the load. If you inline runtime state into CLAUDE.md ("the app is currently running at pid 49281"), the file is wrong five seconds later. The status file fixes that by being read fresh.

The frame that actually fits production is: one always-loaded brief, a library of on-demand procedures, a small number of runtime state files, a memory dir the agent writes to itself, and a settings file the harness reads. Same building blocks every major coding agent gives you, used together rather than interchangeably.

The five layers

Each layer answers a different question. Together they describe everything a coding agent needs to know about a repo and the machine running it.

Root context

CLAUDE.md, AGENTS.md, or .cursorrules at the repo root. Always loaded. Project shape, hard rules, conventions, where things live. Should be short. Fazm's is 311 lines.

Named procedures

.claude/skills/<name>/SKILL.md or .claude/commands/<name>.md. Each one has a frontmatter description; the agent loads it on demand when the description matches the request. Fazm has 47 of these, 9,767 lines total.

Runtime status

Small files the agent reads to know live state before acting. /tmp/fazm-dev-status holds <state> <pid> <ts>; the agent checks it before every build or kill. Fixes a class of bug context alone cannot.

Memory directory

Where the agent writes notes across sessions, separate from CLAUDE.md. User role, feedback, project state, references. Claude Code keeps it under ~/.claude/projects/<id>/memory/ with a MEMORY.md index.

Settings

.claude/settings.json (committed) plus .claude/settings.local.json (gitignored). Permissions, hooks, env vars, model choice. Not context for the agent; instructions for the harness that runs it.

What the numbers actually look like in a shipping repo

The Fazm desktop app is an open-source Mac agent. The repo is at github.com/mediar-ai/fazm, MIT licensed, so you can clone it and re-run the counts. As of 2026-05-11 the spec-layer counts are:

311 lines in fazm/CLAUDE.md

47 SKILL.md files under .claude/skills/

9,767 total lines across the skill files

5 layers in the real stack

Generated with wc -l on the working tree. The 311-line CLAUDE.md covers project shape and the hard rules an agent needs across every task. The 47 procedure files cover concrete workflows (release, codesign, sentry triage, db migration, build-fix, test-local, user-issue-triage, etc.) that load only when the description matches. Both numbers grow slowly; neither required a stop-the-world rewrite at any point.
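If you would rather re-run the counts from a script than from wc -l, a minimal sketch could look like the following. It assumes the root file is named CLAUDE.md and that skills follow the .claude/skills/<name>/SKILL.md layout described above; the function names are made up for illustration, and lines are only a rough proxy for tokens.

```python
from pathlib import Path


def count_lines(path: Path) -> int:
    """Count newline characters, which is what `wc -l` reports."""
    return path.read_bytes().count(b"\n")


def spec_layer_counts(repo: Path) -> dict:
    """Return line counts for the root context file and the skill files."""
    root_file = repo / "CLAUDE.md"
    skill_files = sorted((repo / ".claude" / "skills").glob("*/SKILL.md"))
    root_lines = count_lines(root_file) if root_file.is_file() else 0
    skill_lines = sum(count_lines(p) for p in skill_files)
    return {
        "root_lines": root_lines,
        "skill_files": len(skill_files),
        "skill_lines": skill_lines,
        # How many times larger the always-loaded file would become if every
        # procedure were inlined into it (measured in lines, not tokens).
        "inline_ratio": (root_lines + skill_lines) / root_lines if root_lines else None,
    }


if __name__ == "__main__":
    print(spec_layer_counts(Path(".")))
```

With the counts reported here (311 and 9,767), the ratio comes out at roughly 32.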

Layer 1, root context

The always-loaded file. Claude Code reads it from ~/.claude/CLAUDE.md (your global rules), ./CLAUDE.md (the repo root), and any nested CLAUDE.md under a subdirectory. AGENTS.md is the vendor-neutral equivalent that Cursor, Codex, Gemini CLI, Aider, Windsurf, Warp, and others all read. Same idea. The right contents are project shape, conventions, hard rules, common commands, and pointers to deeper files. The wrong contents are long procedures, large reference tables, and anything that changes between sessions.

A good test for what belongs in this file: would a new contributor (human or agent) need this on day one of every task. If yes, it goes in. If it is only relevant to one workflow, it belongs in a procedure file. The Fazm CLAUDE.md picks this trade-off by mostly enumerating sections (Inbox Pipelines, Routines, Logs & Debugging, Release Pipeline) with the high-level shape, then pointing at deeper skill files for the long stuff.

Layer 2, named procedures

The on-demand library. Each file is a small markdown doc with a frontmatter header that includes a name and a description. The agent loads the file when the user's request matches the description. The trick that makes this work is the description itself: it lists the actual phrases or intents that should trigger the procedure, so the agent picks it without having to guess. Here is the frontmatter of the test-local skill from the Fazm repo, so you can see what a real one looks like:

.claude/skills/test-local/SKILL.md

The body of a procedure file is whatever the agent needs to execute the named workflow correctly: command catalog, common pitfalls, expected outputs, decision tree. Procedures stay long because they only load when matched. The Fazm .claude/skills/test-local/SKILL.md is 210 lines; the .claude/skills/user-logs/SKILL.md is 210; the .claude/skills/sentry-release/SKILL.md is in the same range. None of them ride on every conversation.
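To make the split between the cheap part (name and description) and the expensive part (the body) concrete, here is a small sketch. The skill text in it is hypothetical and is not the contents of Fazm's test-local skill; the parser handles only simple one-line key: value pairs, not full YAML, and the helper names are invented for this example. How a given agent actually matches a request to a description is up to that agent; this only shows which fields would need to be visible for the match to happen.

```python
def parse_frontmatter(text: str) -> dict:
    """Parse a leading `---` delimited block of simple `key: value` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    return {}  # no closing delimiter: treat as missing frontmatter


# Hypothetical skill file, NOT copied from any real repository.
EXAMPLE_SKILL = """---
name: example-test-local
description: "Use when the user says 'test locally', 'run the dev build', or 'smoke test'."
---
# Steps
(long procedure body goes here)
"""


def skill_index(skill_texts: dict) -> list:
    """Build the always-visible part: one (name, description) pair per skill."""
    index = []
    for text in skill_texts.values():
        meta = parse_frontmatter(text)
        if "name" in meta and "description" in meta:
            index.append((meta["name"], meta["description"]))
    return index


if __name__ == "__main__":
    for name, description in skill_index({"example/SKILL.md": EXAMPLE_SKILL}):
        print(f"{name}: {description}")
```

The design point is that only the index needs to be in view at all times; the body after the second --- can stay on disk until a request matches.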

Layer 3, runtime status files

The newest layer, and the one missing from almost every writeup on this topic. A runtime status file records something the agent needs to know that did not exist when CLAUDE.md was written: is the app currently building, is the dev process up, what pid is the floating bar, did the last build fail and why. The agent reads the file fresh before acting, which means the context never carries stale facts about the live process.

In the Fazm repo the file lives at /tmp/fazm-dev-status. It holds one line of the form state pid unix_timestamp where state is building, running, exited, or failed. The CLAUDE.md tells the agent: always read this file first before running ./run.sh or killing a process; never use pgrep, ps aux, or log tails to guess. The decision tree is short enough to fit in the brief, but the live value changes too often to live anywhere except on disk.

A real decision a coding agent makes from a status file

The pattern generalizes beyond Mac apps. A web project might write /tmp/project-build-status with the last vercel deploy id. A monorepo might write a per-package lock so multiple agents can coordinate. Anything ephemeral the agent needs to read goes here. The contract is just "a small file the agent can stat and parse in one shell command," and the value is that CLAUDE.md stops accumulating stale text about live state.
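A minimal sketch of reading such a file, assuming the one-line state pid unix_timestamp format described above with an integer timestamp. The decision table in next_action is hypothetical and only illustrates the shape of the check; the actual rules for Fazm live in its CLAUDE.md.

```python
from pathlib import Path
from typing import NamedTuple, Optional

STATES = {"building", "running", "exited", "failed"}


class DevStatus(NamedTuple):
    state: str
    pid: int
    timestamp: int


def read_status(path: Path) -> Optional[DevStatus]:
    """Parse `<state> <pid> <unix_timestamp>`; None if absent or malformed."""
    try:
        parts = path.read_text().split()
    except OSError:
        return None
    if len(parts) != 3 or parts[0] not in STATES:
        return None
    try:
        return DevStatus(parts[0], int(parts[1]), int(parts[2]))
    except ValueError:
        return None


def next_action(status: Optional[DevStatus]) -> str:
    """Hypothetical decision table for what to do before a build."""
    if status is None:
        return "start"   # nothing recorded, so launching is safe
    if status.state == "building":
        return "wait"    # do not kill a build in progress
    if status.state == "running":
        return "reuse"   # the app is already up
    return "start"       # exited or failed: launch again


if __name__ == "__main__":
    print(next_action(read_status(Path("/tmp/fazm-dev-status"))))
```

Returning None for a missing or malformed file keeps the caller from acting on a half-written line.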

Layer 4, the memory directory

CLAUDE.md is the policy you wrote for the agent. The memory directory is the notebook the agent keeps about you. Claude Code uses ~/.claude/projects/<project-path>/memory/ with a MEMORY.md index file and one markdown file per memory. Each memory file has a small frontmatter (name, description, type) and a body. The types are user (who is working with me), feedback (corrections to apply going forward), project (initiatives, why things are happening), and reference (where to look in external systems).

The reason this is separate from CLAUDE.md is authorship and decay. CLAUDE.md is hand-authored and stable; you change it deliberately. Memories are agent-authored and decay fast. A memory written six months ago about "the auth flow lives in AuthService.swift line 735" should be re-verified before acting on it, because the file may have changed. The split lets the always-loaded brief stay durable while the volatile notes live somewhere the agent can audit and rewrite without touching policy.
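One way to act on the decay point is to list memory files that have not been touched for a while and re-verify those first. The sketch below is an assumption-heavy illustration: it uses file modification time as a stand-in for "when the memory was written", treats every .md file other than MEMORY.md as one memory, and the function name and threshold are invented for the example.

```python
import time
from pathlib import Path
from typing import Optional

SECONDS_PER_DAY = 86400


def stale_memories(memory_dir: Path, max_age_days: float,
                   now: Optional[float] = None) -> list:
    """Return names of memory files last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        p.name
        for p in memory_dir.glob("*.md")
        # MEMORY.md is the index, not a memory entry, so skip it.
        if p.name != "MEMORY.md" and p.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    # Replace with the real memory path for your project.
    print(stale_memories(Path("memory"), max_age_days=180))
```

A listing like this does not say a memory is wrong, only that it is old enough to check against the current code before relying on it.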

Layer 5, settings.json

The harness configuration. Not context the agent reads, but configuration that the runtime running the agent enforces. .claude/settings.json is committed and shared with the team; .claude/settings.local.json is gitignored for personal overrides. The file holds tool permissions (which commands run without prompting), hooks (shell scripts that fire on events like PreToolUse, PostToolUse, Stop), env vars, the default model, and a handful of behavioral knobs.

A common confusion to flag: instructions like "always run tests after editing X" do not belong in CLAUDE.md as a plain-text rule. The agent will mostly comply, but not deterministically. They belong in settings.json as a PostToolUse hook that fires on every Edit and runs the test command. CLAUDE.md tells the agent what to do when it has a choice; settings.json tells the harness what to do regardless of the agent. The split is policy versus enforcement.
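As a rough illustration of the policy-versus-enforcement split, the sketch below builds a settings structure with one permission rule and one PostToolUse hook and prints it as JSON. The key names reflect the Claude Code settings and hooks schema as I understand it and should be checked against the current documentation before committing; ./run-tests.sh is a hypothetical script name.

```python
import json


def build_settings(test_command: str) -> dict:
    """Build a settings dict with one allow rule and one PostToolUse hook."""
    return {
        "permissions": {
            # Let the harness run the test command without prompting.
            "allow": [f"Bash({test_command})"],
        },
        "hooks": {
            "PostToolUse": [
                {
                    "matcher": "Edit",  # run after every Edit tool call
                    "hooks": [{"type": "command", "command": test_command}],
                }
            ]
        },
    }


if __name__ == "__main__":
    # "./run-tests.sh" is a placeholder; use your project's test command.
    print(json.dumps(build_settings("./run-tests.sh"), indent=2))
```

The printed JSON is what would go into .claude/settings.json; the plain-text rule in CLAUDE.md can then be dropped, since the harness runs the command whether or not the agent remembers to.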

A five-minute audit of any repo

The fastest way to know whether a codebase is set up for AI coding agents at production scale is to walk the five layers. Each one is a single shell command. If a layer is missing, that is fine; what matters is which layers are there and which are carrying weight they shouldn't.

1. Find the root context
Look for CLAUDE.md, AGENTS.md, .cursorrules, or .windsurfrules at the repo root.
2. Find the procedures
ls .claude/skills/ and .claude/commands/. Count the files. The total line count tells you how much weight has been moved out of CLAUDE.md.
3. Grep for status files
Search the root context for /tmp/, .lock, -status, or similar runtime paths. These are the files agents read before acting.
4. Check for a memory dir
ls ~/.claude/projects/<id>/memory/ or any other agent's memory path. The presence of a MEMORY.md index is the tell.
5. Open settings.json
.claude/settings.json (and .claude/settings.local.json if present). Permissions, hooks, model, env. This is what the runtime enforces.
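The five checks above can also be bundled into one script. This is a sketch under a few assumptions: the file names and search strings are the ones listed in the steps, the memory directory is passed in explicitly because it lives outside the repo and its <id> cannot be derived here, and audit_layers is a name made up for the example.

```python
from pathlib import Path
from typing import Optional

ROOT_CONTEXT_NAMES = ("CLAUDE.md", "AGENTS.md", ".cursorrules", ".windsurfrules")
STATUS_HINTS = ("/tmp/", ".lock", "-status")


def audit_layers(repo: Path, memory_dir: Optional[Path] = None) -> dict:
    """Report which of the five spec layers are present for a repo."""
    # Layer 1: root context files at the repo root.
    root_files = [n for n in ROOT_CONTEXT_NAMES if (repo / n).is_file()]
    # Layer 2: named procedures.
    skills = list((repo / ".claude" / "skills").glob("*/SKILL.md"))
    commands = list((repo / ".claude" / "commands").glob("*.md"))
    # Layer 3: lines in the root context that mention runtime paths.
    status_mentions = [
        line.strip()
        for name in root_files
        for line in (repo / name).read_text(errors="replace").splitlines()
        if any(hint in line for hint in STATUS_HINTS)
    ]
    # Layer 4: a MEMORY.md index in the (externally located) memory dir.
    has_memory = memory_dir is not None and (memory_dir / "MEMORY.md").is_file()
    # Layer 5: harness settings.
    settings = [
        n for n in ("settings.json", "settings.local.json")
        if (repo / ".claude" / n).is_file()
    ]
    return {
        "root_context": root_files,
        "procedure_count": len(skills) + len(commands),
        "status_mentions": status_mentions,
        "has_memory_index": has_memory,
        "settings_files": settings,
    }


if __name__ == "__main__":
    for layer, found in audit_layers(Path(".")).items():
        print(f"{layer}: {found}")
```

An empty entry in the report is not a failure by itself; it only shows which layers exist so you can judge which ones are carrying weight they should not.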

How to actually start writing them

If you are starting from zero, write CLAUDE.md (or AGENTS.md, if your team uses more than one agent) first. Keep it under 200 lines. Cover project shape, conventions, hard rules, common commands, and pointers to deeper files. Resist the urge to inline procedures. The goal is the brief a new contributor reads on day one.

Add SKILL.md (or commands/) files the second time you find yourself explaining the same workflow to the agent. Each file gets a frontmatter description that lists the phrases that should trigger it. The procedure body can be long because it only loads when matched. Anti-pattern: a CLAUDE.md that grows past 400 lines because every procedure was inlined.

Add runtime status files when you notice the agent guessing about live state from logs or ps output. Pick the smallest file that answers the question (one line with state, pid, timestamp is usually enough). Tell CLAUDE.md to read it before acting.

Let memory accrue on its own; do not pre-write entries. Periodically (every few weeks) read MEMORY.md and prune anything that turned out wrong or stale. The signal that the system is working is that the agent stops needing to re-learn things across sessions.

Touch settings.json last. Start with permissions (the commands you trust the agent to run without prompting). Add hooks when you notice a rule the agent keeps forgetting; the hook makes compliance deterministic rather than aspirational. A clean settings.json with five entries is better than a messy one with fifty.


Common questions about AI coding agent spec files

Is CLAUDE.md the same thing as AGENTS.md?

Close, not identical. AGENTS.md is a vendor-neutral format that Claude Code, Cursor, Codex, Gemini CLI, Aider, Windsurf, Warp, RooCode, Zed, and others have all agreed to read. CLAUDE.md is Anthropic's project memory file that Claude Code reads from three locations (your home, the repo root, and any subdirectory) and merges. The practical rule: if your team uses more than one agent, start with AGENTS.md so every tool inherits the same brief, then add a CLAUDE.md only for instructions that are genuinely Claude-specific. The format inside is plain markdown either way, no special schema.

Where does the context file actually need to live?

Claude Code looks in three paths and merges them in order: ~/.claude/CLAUDE.md (your global rules, applies to every project on the machine), ./CLAUDE.md at the repo root (shared with the team via git), and any nested CLAUDE.md under a subdirectory (scoped instructions for that area, e.g. a Backend/CLAUDE.md that only loads when the agent is editing files there). AGENTS.md follows the same pattern, vendor-neutrally. For other agents, check their docs; Cursor uses .cursor/rules/*.mdc, Windsurf uses .windsurfrules, Aider uses .aider.conf.yml plus an optional CONVENTIONS.md.

What goes in a SKILL.md file that doesn't already belong in CLAUDE.md?

CLAUDE.md is the always-loaded brief: project shape, hard rules, conventions, where things live. A SKILL.md is a named procedure the agent loads on demand when its description matches the user's request. The pattern works because the global brief stays cheap (one file, always in context) while the long procedures (release flow, sentry triage, codesign debug, db migration) live in separate files that only load when needed. In the Fazm repo, CLAUDE.md is 311 lines and the 47 SKILL.md files under .claude/skills/ total 9,767 lines. Inlining the procedures into CLAUDE.md would multiply the always-on context roughly 30-fold (measured in lines) for every conversation.

What's a runtime status file and why do I need one?

A small file on disk that records the live state of the app or build so the agent can read it before deciding to act. In Fazm the file is /tmp/fazm-dev-status. It holds one line: <state> <pid> <unix_timestamp>, where state is building, running, exited, or failed. The CLAUDE.md tells the agent to always read this file first before running ./run.sh or killing a process. Without it, the agent guesses from pgrep and ps aux, gets the answer wrong about a third of the time (stale pids, multiple instances, races with other agents), and ends up killing builds in progress. The status-file pattern fixes a problem context files alone cannot.

Do I need a memory directory if I already have CLAUDE.md?

Yes, for any non-trivial project. CLAUDE.md is fixed text you wrote. A memory directory is where the agent writes notes across conversations: user role, feedback that came up during a session, project state that changed, references to external systems. Claude Code uses ~/.claude/projects/<project-id>/memory/ with an index file (MEMORY.md) plus one markdown file per memory. The split matters: CLAUDE.md is policy you author for the agent; memory is observations the agent authors for itself. Both load on session start.

Where does settings.json fit in?

It controls permissions and behavior, not project knowledge. .claude/settings.json (and the gitignored .claude/settings.local.json for personal overrides) declares which tools are allowed without prompting, which hooks run on which events, env vars, the model to use, and so on. It is not loaded as context the agent reads; it is loaded by the harness that runs the agent. The simplest split: if a fact is something the agent should think about, it goes in CLAUDE.md. If a fact is something the runtime should enforce, it goes in settings.json.

How do I avoid bloating CLAUDE.md until it eats the whole context window?

Three habits, in order of importance. First, push long procedures into separate SKILL.md files under .claude/skills/ and let the agent load them on demand by their description. Second, push large reference tables (SQL schemas, API specs, command catalogs) into the codebase itself and point CLAUDE.md at the file path instead of inlining the table. Third, keep CLAUDE.md to project shape (what lives where, hard rules, conventions, common debug commands) and leave the topic-specific depth for procedure files. In Fazm, CLAUDE.md is 311 lines; almost every section ends with a pointer to a deeper file rather than the full content.

Does Fazm itself use these files when it runs as a desktop agent on my Mac?

Different surface. Fazm is the agent that runs on your machine and drives your apps; it does not read your project's CLAUDE.md the way Claude Code does, because most Fazm tasks aren't coding tasks. The files matter when you are using Fazm (or any AI coding agent like Claude Code, Cursor, Codex) inside a repo to write code. Fazm's own repo has the full five-layer stack because Fazm is itself developed by an agent loop, and that loop reads CLAUDE.md plus 47 SKILL.md files plus the runtime status file every session. The repo is open source under the MIT license at github.com/mediar-ai/fazm, so you can grep it yourself.
