From the source of one shipping macOS agent
AI agent context checkpoint, the two-layer pointer most guides skip
The published material on this topic treats a checkpoint like a save file in a video game: serialize the agent’s state to disk, deserialize it on restore. That works when the workflow runtime owns orchestration. It breaks the first day you build on top of a frontier-model SDK that already owns the conversation transcript. This is what the real shape looks like, with file paths from one open-source macOS agent you can read yourself.
Direct answer (verified 2026-05-18)
A discrete point in an agent’s run from which the agent can resume or branch later, with the same context and tool state it had at that point. In an agent built on top of a frontier-model SDK (Claude Code, Codex via ACP, the OpenAI Agents SDK), the checkpoint is two layers, not one: the SDK’s own append-only JSONL transcript on disk, plus a stable client-side pointer that owns the chain of every upstream session ID the conversation has ever held. Resume always walks the chain. Forking from a checkpoint is a native SDK call (session/fork in ACP), not a state copy.
Verified against the open-source Fazm implementation at github.com/m13v/fazm: bridge handler acp-bridge/src/index.ts line 4098, chain helpers Desktop/Sources/Providers/ChatProvider.swift lines 622 to 674.
The mental model most articles import from the wrong domain
The phrase “context checkpoint” gets its connotation from two places. One is ML training, where you periodically write model weights and optimizer state to disk so the run survives a crash. The other is workflow orchestration, where a runtime like LangGraph or Microsoft Agent Framework serializes its state graph at each pause point so it can resume after a failure. Both are about the runtime owning the thing it serializes.
A frontier-model agent built on a vendor SDK is a different shape. The SDK owns the model session, the agent loop, the tool dispatch, the on-disk transcript, and the resume protocol. Your host code does not control those internals. If your host writes a checkpoint blob of “the agent’s state” you are duplicating something the SDK already maintains, and you will diverge from it within one rate limit. The right move is to use the SDK’s own on-disk transcript as the durable representation, and have your host write down only what the SDK does not: a stable identity that survives upstream rollover.
That is the two-layer pointer this page is about. Layer one is the SDK’s transcript file on disk. Layer two is a chain of upstream session IDs your host maintains so it can always tell the SDK which transcript to load.
The two-layer pointer
The diagram below is what a single checkpoint actually points at, in a live ACP-based agent. The center is the host’s stable conversation identity. On the left are the things that can roll the upstream session ID forward at any time. On the right are the durable artifacts a resume or fork eventually addresses.
[Diagram: one conversation, many upstream session IDs, one transcript on disk]
Anything that arrives on the left re-emerges from the hub as a fresh upstream session ID, which gets appended to the chain. The transcript on the right is immutable per session ID, so each chain entry addresses its own JSONL file. To resume, the host hands the SDK the current head of the chain; to walk further back in history, the host can hand the SDK any older entry. The chain is the only piece of state that has to be yours.
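A minimal sketch of what "resume walks the chain" could look like on the host side. The names `ConversationPointer`, `resumeFromChain`, and the `canLoad` callback are hypothetical and do not come from Fazm or ACP. `canLoad` stands in for whatever a real resume attempt against the SDK would report, and falling back to older entries is one possible reading of walking the chain.

```typescript
// Hypothetical host-side pointer: one stable ID, plus every upstream
// session ID the conversation has held (oldest first, head last).
interface ConversationPointer {
  conversationId: string;
  chain: string[];
}

// Walk the chain newest-first and return the first upstream session ID
// that can still be loaded. `canLoad` is an assumption standing in for
// a real resume attempt against the SDK.
function resumeFromChain(
  pointer: ConversationPointer,
  canLoad: (sessionId: string) => boolean,
): string | undefined {
  for (const candidate of [...pointer.chain].reverse()) {
    if (canLoad(candidate)) return candidate;
  }
  return undefined; // nothing resumable: the caller starts a fresh session
}

// Usage: the head ("sess-c") cannot be loaded, so resume falls back one entry.
const pointer: ConversationPointer = {
  conversationId: "conv-1",
  chain: ["sess-a", "sess-b", "sess-c"],
};
const onDisk = new Set(["sess-a", "sess-b"]);
const resumed = resumeFromChain(pointer, (id) => onDisk.has(id));
// resumed === "sess-b"
```

The stable `conversationId` never appears in the lookup. It only identifies which chain to read, which is why it can stay constant while upstream IDs churn.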
One-layer versus two-layer, side by side
Here is how the same conversation plays out under the one-layer pattern the first time the upstream session ID rolls. The one-layer pattern is what almost every tutorial on AI agent context checkpointing actually implements, even when the prose talks about robustness.
What a checkpoint actually points at
In the one-layer pattern, the host treats the upstream session ID as the conversation's identity. Messages are stamped with it, resume hands it back to the SDK, and fork copies the prompt list into a fresh session because the SDK does not expose branching for a session it has already abandoned. The first time the upstream side rolls the ID (rate limit, bridge restart, OAuth rotation, credit cap, TTL), the host's pointer goes stale. Pre-rollover messages stamped with the old ID are stranded in the message table, the resume call lands on a session the SDK has never heard of, and the model wakes up with no memory of the conversation it was in the middle of.
- Session ID is treated as durable, but is actually transient
- Resume after rollover lands on a stale ID and re-creates an empty session
- Fork is a copy: re-tokenize the whole history, no cache hit
- Pre-rollover messages stranded in the message store
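The stranded-message failure in the list above can be reproduced in a few lines. This is an illustrative sketch only: `StoredMessage`, `loadOneLayer`, and `loadTwoLayer` are hypothetical names, and the in-memory array stands in for a real message table.

```typescript
// Hypothetical message store row: each message is stamped with the
// upstream session ID that was current when it was written.
interface StoredMessage {
  upstreamSessionId: string;
  text: string;
}

const store: StoredMessage[] = [
  { upstreamSessionId: "sess-a", text: "first prompt" },
  { upstreamSessionId: "sess-a", text: "first reply" },
  // Upstream rolls sess-a -> sess-b here (rate limit, restart, TTL).
  { upstreamSessionId: "sess-b", text: "second prompt" },
];

// One-layer: only the current upstream ID is known, so every message
// stamped with an earlier ID is invisible to the lookup.
function loadOneLayer(currentId: string): StoredMessage[] {
  return store.filter((m) => m.upstreamSessionId === currentId);
}

// Two-layer: the chain lists every ID the conversation has held, so the
// lookup covers messages from before and after the rollover.
function loadTwoLayer(chain: string[]): StoredMessage[] {
  const ids = new Set(chain);
  return store.filter((m) => ids.has(m.upstreamSessionId));
}

const oneLayer = loadOneLayer("sess-b"); // 1 message, 2 stranded
const twoLayer = loadTwoLayer(["sess-a", "sess-b"]); // all 3 messages
```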
What happens when you click fork
This is the lifecycle of one fork operation through the Fazm stack, from the button press in SwiftUI to the new branch landing in the chat. Every step here is a function you can open in the public repo.
Fork lifecycle, end to end
- UI click: SwiftUI calls ChatProvider.forkSession(key:) at ChatProvider.swift line 1735.
- Bridge call: Provider awaits acpBridge.forkSession(fromKey:toKey:) over the JSON-lines protocol.
- Lookup source: Bridge resolves fromKey to the live ACP sessionId via the sessions Map.
- session/fork: One JSON-RPC call to the upstream ACP server with sessionId, cwd, mcpServers.
- Branch returned: Upstream replies with a new sessionId pointing at the branch's transcript.
- Re-register: Bridge unregisters the source key and registers the new sessionId under it.
- Replay model: session/set_model is re-issued so the branch inherits the source's model.
- Notify UI: Bridge emits session_forked; ChatProvider clears the per-key feed and renders.
Notice what does not happen. The host never serializes the conversation. The host never copies the prompt history into a new session. The host never reaches into the JSONL file. The branch is created by handing the upstream side a sessionId and letting the SDK manage the new on-disk transcript. The source’s JSONL is still there afterwards, recoverable via Conversation History because that view simply scans ~/.claude/projects for sessions on the current cwd.
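The lifecycle above can be sketched as a single async function against a stubbed upstream. This is not the Fazm bridge code: `UpstreamRequest`, `SessionEntry`, `forkInPlace`, and the fake upstream are hypothetical, `mcpServers` is left empty for brevity, and the `modelId` parameter name on `session/set_model` is an assumption.

```typescript
// Hypothetical stand-in for the bridge's JSON-RPC client.
type UpstreamRequest = (
  method: string,
  params: Record<string, unknown>,
) => Promise<unknown>;

interface SessionEntry {
  sessionId: string;
  cwd: string;
  model: string;
}

// UI key -> live upstream session, mirroring the "sessions Map" step.
const sessions = new Map<string, SessionEntry>();

async function forkInPlace(
  key: string,
  request: UpstreamRequest,
  emit: (event: { type: "session_forked"; key: string; sessionId: string }) => void,
): Promise<string> {
  // Lookup source: resolve the UI key to the live upstream session.
  const source = sessions.get(key);
  if (!source) throw new Error(`no live session for key ${key}`);

  // session/fork: one call; the upstream side creates the branch transcript.
  const result = (await request("session/fork", {
    sessionId: source.sessionId,
    cwd: source.cwd,
    mcpServers: [],
  })) as { sessionId: string };

  // Re-register: the same UI key now points at the branch.
  sessions.set(key, { ...source, sessionId: result.sessionId });

  // Replay model so the branch inherits the source's model
  // (parameter name `modelId` is an assumption).
  await request("session/set_model", {
    sessionId: result.sessionId,
    modelId: source.model,
  });

  // Notify UI.
  emit({ type: "session_forked", key, sessionId: result.sessionId });
  return result.sessionId;
}

// Usage with a fake upstream that records which methods were called.
const calls: string[] = [];
const fakeUpstream: UpstreamRequest = async (method) => {
  calls.push(method);
  return method === "session/fork" ? { sessionId: "sess-branch" } : {};
};
sessions.set("main", { sessionId: "sess-source", cwd: "/tmp/proj", model: "model-x" });
const forked = forkInPlace("main", fakeUpstream, () => {});
```

Nothing in the function reads or writes conversation content. The only state the host touches is the key-to-session mapping.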
The chain, written out
The chain itself is small. Five helpers in one file. The interesting one is appendToSessionChain at line 634 of Desktop/Sources/Providers/ChatProvider.swift. It de-duplicates older entries, no-ops if the new ID is already at the head, and bounds the chain at sixteen via sessionChainMaxSize at line 622. Every write to the primary acpSessionId_* UserDefaults key goes through persistSessionId (line 663), which sets the head AND appends to the chain in one call, so there is no path where the host writes a new head and forgets to extend the chain.
    // Desktop/Sources/Providers/ChatProvider.swift
    private static let sessionChainMaxSize = 16

    private static func appendToSessionChain(_ id: String, storageKey: String) {
        let trimmed = id.trimmingCharacters(in: .whitespaces)
        guard !trimmed.isEmpty else { return }
        let chainK = chainKey(forStorageKey: storageKey)
        var chain = UserDefaults.standard.stringArray(forKey: chainK) ?? []
        if chain.last == trimmed { return }
        chain.removeAll { $0 == trimmed }
        chain.append(trimmed)
        if chain.count > sessionChainMaxSize {
            chain = Array(chain.suffix(sessionChainMaxSize))
        }
        UserDefaults.standard.set(chain, forKey: chainK)
    }

    private static func persistSessionId(_ id: String, storageKey: String) {
        guard !id.isEmpty else { return }
        UserDefaults.standard.set(id, forKey: storageKey)
        appendToSessionChain(id, storageKey: storageKey)
    }

The chain is reset only by clearSessionId (line 671), which is called from New Chat, sign-out, and explicit pop-out close paths. Never from a transient failure. A rate limit is not a new conversation, it is a glitch in an old one, and resetting the chain on it is the entire bug this pattern exists to prevent.
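For readers not working in Swift, the append logic can be restated as a pure TypeScript function. This is a port written for illustration, not code from the repo: it returns a new array instead of writing to UserDefaults, and the constant name `SESSION_CHAIN_MAX_SIZE` is chosen here.

```typescript
const SESSION_CHAIN_MAX_SIZE = 16;

// Pure-function port of the append logic: returns the new chain rather
// than persisting it. Note that trim() strips all whitespace including
// newlines, which is slightly broader than Swift's `.whitespaces`.
function appendToSessionChain(chain: readonly string[], id: string): string[] {
  const trimmed = id.trim();
  if (trimmed === "") return [...chain];
  if (chain[chain.length - 1] === trimmed) return [...chain]; // already the head
  const next = chain.filter((entry) => entry !== trimmed); // drop older duplicate
  next.push(trimmed);
  return next.slice(-SESSION_CHAIN_MAX_SIZE); // keep only the newest entries
}

// Usage: a repeated ID moves to the head; re-appending the head is a no-op.
let demoChain: string[] = [];
for (const id of ["a", "b", "a", "a"]) {
  demoChain = appendToSessionChain(demoChain, id);
}
// demoChain is ["b", "a"]
```

Keeping the function pure makes the three behaviours (de-duplication, head no-op, cap) easy to test without a storage layer.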
The fork call, written out
And here is the actual JSON-RPC call that branches the upstream session. From acp-bridge/src/index.ts line 4136, inside handleForkSession:
    // acp-bridge/src/index.ts
    const result = (await acpRequest("session/fork", {
      sessionId: sourceEntry.sessionId,
      cwd,
      mcpServers: buildMcpServers(mode, cwd, msg.toSessionKey),
    })) as { sessionId: string };

    // In-place fork: re-bind the same UI key to the new sessionId.
    // Side-by-side fork: leave the source under fromSessionKey, register the new one under toSessionKey.
    if (inPlace) {
      unregisterSession(msg.fromSessionKey);
    }
    registerSession(msg.toSessionKey, { sessionId: result.sessionId, cwd, model });

Two valid modes, called out in the inline comments at lines 4109 to 4117. The in-place fork replaces the source under the same UI key (this is the “fork the current chat” button most users interact with). The side-by-side fork registers the branch under a different key while leaving the source live, which is the primitive a future “open fork in new pop-out” UX is built on.
In both cases the source’s sessionId is left alive on disk. The bridge does not delete the JSONL, the upstream does not delete the JSONL, and the Conversation History view that scans ~/.claude/projects/<encoded-cwd> will still surface it next time the user opens the picker. A fork creates a branch, it does not erase the trunk.
Branching from a non-head position
ACP exposes session/fork only at the head of a live session, which is enough for “branch from current point” but not for “branch from a turn three positions back.” That second pattern is what an edit-and-resubmit UX needs: the user clicks an old prompt, edits the text, sends it, and expects everything after that turn to be replaced with a fresh response.
The Fazm implementation lives at truncateForEdit in ChatProvider.swift around line 1800. It drops the edited message and every later message from the in-memory list, the persisted store, and the visible chat history, tears down the upstream ACP session for the affected key, and clears the streaming state. The caller then re-sends the edited text through the normal send path. The next prompt the model sees was issued against a fresh session, with the truncated conversation reconstructed via the priorContext preamble path.
There is a real fidelity caveat here, called out in the function’s comment at lines 1796 to 1799: the preamble is text-only (no tool calls, no thinking blocks) and capped to roughly 20 turns. A tool-heavy conversation that gets truncated and resumed will diverge somewhat from one that ran continuously, because the model sees a summary of past tool work rather than the raw turn-by-turn record. That cost is the price of having a non-head branching primitive at all: the SDK does not natively support resuming mid-transcript, so something has to be discarded.
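The truncate-then-replay shape described above can be sketched as two small functions. This is not the Fazm `truncateForEdit` implementation: `ChatMessage`, `truncateAt`, `buildPreamble`, and `PREAMBLE_MAX_TURNS` are hypothetical, and for simplicity the cap counts messages rather than user/assistant pairs.

```typescript
interface ChatMessage {
  id: string;
  role: "user" | "assistant";
  text: string; // visible text only
  toolCalls?: unknown[]; // present in the live session, dropped from the preamble
}

// Assumption mirroring the "roughly 20 turns" cap described in the text.
const PREAMBLE_MAX_TURNS = 20;

// Drop the edited message and every later message.
function truncateAt(messages: ChatMessage[], editedId: string): ChatMessage[] {
  const index = messages.findIndex((m) => m.id === editedId);
  return index === -1 ? messages.slice() : messages.slice(0, index);
}

// Build a text-only preamble from the most recent kept messages.
// Tool calls are deliberately omitted, which is the fidelity loss.
function buildPreamble(kept: ChatMessage[]): string {
  return kept
    .slice(-PREAMBLE_MAX_TURNS)
    .map((m) => `${m.role}: ${m.text}`)
    .join("\n");
}

// Usage: the user edits message m3, so m3 and m4 are discarded.
const chatHistory: ChatMessage[] = [
  { id: "m1", role: "user", text: "list files" },
  { id: "m2", role: "assistant", text: "3 files found", toolCalls: [{ name: "ls" }] },
  { id: "m3", role: "user", text: "delete them" },
  { id: "m4", role: "assistant", text: "done" },
];
const keptMessages = truncateAt(chatHistory, "m3");
const preamble = buildPreamble(keptMessages);
// preamble === "user: list files\nassistant: 3 files found"
```

The `ls` tool call on m2 does not survive into the preamble. The fresh session sees only that the assistant said "3 files found", not how it found them.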
What this means if you are building on Claude Code, Codex, or any agent SDK
One: do not write a state-blob checkpointer. The SDK already has one, on disk, and you cannot beat its fidelity because it is fed from the inside of the tool-dispatch loop. Use the JSONL transcript as your durable representation. Address it by upstream session ID.
Two: own a stable client-side identity for each conversation. Do not treat the upstream ID as identity. Maintain an append-only chain of every upstream ID the conversation has held. Bound the chain so it does not grow without limit on a long-lived conversation. Reset only on explicit user actions.
Three: branch via the SDK’s native primitive. For ACP that is session/fork, accessed through the Claude Agent SDK as unstable_forkSession. For Codex via ACP it is the same call against codex-acp. Do not copy prompt history into a new session, because you will pay the prompt-caching penalty on every fork and your branches will be perceptibly slower than the source.
Four: accept that mid-transcript branching is a separate problem. The SDK does not natively branch from arbitrary positions in the JSONL, only from the head of a live session. If you need edit-and-resubmit, you are buying truncation plus a capped preamble replay, and you should be honest with the user about the fidelity tradeoff (especially on tool-heavy conversations).
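The reset rule in point two can be made explicit as a small reducer. The following is a sketch under assumptions: `HostRecord`, `HostEvent`, `applyHostEvent`, and the event names are hypothetical, and the cap of 16 simply mirrors the Fazm value.

```typescript
// Hypothetical host-side record: a stable identity plus the bounded chain.
interface HostRecord {
  conversationId: string; // never changes for the life of the conversation
  chain: string[]; // upstream session IDs, head last
}

type HostEvent =
  | { kind: "upstreamRolled"; newSessionId: string } // rate limit, restart, TTL
  | { kind: "transientError" } // must not reset the chain
  | { kind: "newChat" }
  | { kind: "signOut" };

const MAX_CHAIN = 16;

function applyHostEvent(record: HostRecord, event: HostEvent): HostRecord {
  switch (event.kind) {
    case "upstreamRolled": {
      // Append the new head, dropping any older duplicate, then bound the chain.
      const chain = record.chain.filter((id) => id !== event.newSessionId);
      chain.push(event.newSessionId);
      return { ...record, chain: chain.slice(-MAX_CHAIN) };
    }
    case "transientError":
      return record; // a glitch in an old conversation, not a new one
    case "newChat":
    case "signOut":
      return { ...record, chain: [] }; // explicit user action: reset
  }
}

// Usage: an error leaves the chain alone, a rollover extends it.
let hostRecord: HostRecord = { conversationId: "conv-1", chain: ["sess-a"] };
hostRecord = applyHostEvent(hostRecord, { kind: "transientError" });
hostRecord = applyHostEvent(hostRecord, { kind: "upstreamRolled", newSessionId: "sess-b" });
// hostRecord.chain is ["sess-a", "sess-b"]
```

Encoding the events as a closed union means a new failure mode has to be classified as either a rollover or a reset before the code compiles, rather than defaulting to a reset by accident.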
Frequently asked questions
What is an AI agent context checkpoint?
A discrete point in an agent's run from which the agent can resume or branch later with the same context and tool state it had at that point. In a workflow runtime like Microsoft Agent Framework or LangGraph, a checkpoint is a serialized blob: the in-memory state of the workflow, written to a checkpointer (file, Redis, SQLite). In a frontier-model agent built on top of an SDK like Claude Code or Codex via ACP, the checkpoint is not a blob you own. It is the SDK's own on-disk conversation transcript plus a pointer to a specific position in it, and resuming or branching is done by handing the SDK that pointer back, not by reconstituting state yourself.
Why is a single session ID not enough as a checkpoint pointer?
Because the upstream session ID is a transient handle the SDK can lose at any time. A rate limit, a credit cap, a bridge process killed by macOS, an OAuth token rotation, or a session simply expiring on a TTL: any of these will roll the ID forward to a new one. If the only thing you wrote down was the old ID, your checkpoint is now pointing at a session the SDK has already abandoned. The fix is structural: the conversation has its own stable identity in your client, and the upstream ID is one item in an append-only chain of every upstream ID that conversation has ever owned. Resume always walks the chain.
How is forking from a checkpoint different from copying the prompt history into a new chat?
A copy gives you a fresh upstream session that has never seen any of the prior turns. The new session has to re-process every prompt and tool result as a preamble, which means you pay the prompt-caching penalty (nothing is cached), you may run out of context window if the conversation was long, and you lose any cached intermediate computation. A real fork via the SDK (`session/fork` in ACP, the unstable `unstable_forkSession` in the Claude Agent SDK) creates a new session that branches at the source's last position. The model sees the conversation as continuous, the cache stays warm, and the source session is left intact on disk so you have two live branches you can return to independently.
Where does the durable transcript actually live for Claude Code via ACP?
On disk, in `~/.claude/projects/<encoded-cwd>/<sessionId>.jsonl`. The encoded-cwd is your absolute working directory with every non-alphanumeric character replaced by a hyphen, so `/Users/me/proj` becomes `-Users-me-proj`. The JSONL is append-only as the conversation runs and contains every prompt, every tool call, every tool result, and every assistant response in order. The Claude Agent SDK addresses sessions by this path. If you move the file into a different encoded-cwd directory, `session/resume` under the new cwd will find it and replay everything (see `migrateJsonlForCwdChange` in the Fazm bridge for the production version of that migration). For Codex, the path shape is different (date-bucketed under `~/.codex/sessions`, not cwd-addressed), but the principle is the same.
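The encoding rule described above is small enough to write down. This sketch implements the rule exactly as stated in the text (every non-alphanumeric character becomes a hyphen); `encodeCwd` and `transcriptPath` are hypothetical helper names, and the `.claude` directory is passed in rather than resolved from the environment to keep the example dependency-free.

```typescript
// Encode an absolute cwd as the text describes: every non-alphanumeric
// character is replaced by a hyphen.
function encodeCwd(cwd: string): string {
  return cwd.replace(/[^a-zA-Z0-9]/g, "-");
}

// Build the transcript path for a given session.
// `claudeHome` is the absolute path of the `.claude` directory.
function transcriptPath(claudeHome: string, cwd: string, sessionId: string): string {
  return `${claudeHome}/projects/${encodeCwd(cwd)}/${sessionId}.jsonl`;
}

const examplePath = transcriptPath("/Users/me/.claude", "/Users/me/proj", "abc123");
// "/Users/me/.claude/projects/-Users-me-proj/abc123.jsonl"
```

Under this rule the encoding is lossy: `/a/b-c` and `/a/b/c` both encode to `-a-b-c`, so the encoded directory name cannot be decoded back to a unique cwd.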
When does a checkpoint event actually get created, implicitly?
Every time the SDK appends a turn to the JSONL. Every prompt you send, every tool the model calls, every result that comes back, and every response chunk that streams is a checkpoint event, in the sense that you can resume from it once that line has landed on disk. There is no explicit `flush` you need to call. Branching from any of those points is not free, though: ACP exposes `session/fork` only at the head of a live session, not at arbitrary mid-transcript positions. If you want to branch from a turn three positions back, the path is to truncate the host's in-memory and persisted message state at that turn, then re-send. The Fazm implementation of that truncation lives at `truncateForEdit` in `Desktop/Sources/Providers/ChatProvider.swift` around line 1800, with the fidelity caveat that the resumed conversation rebuilds context via a capped priorContext preamble rather than the full per-turn detail.
What does the explicit fork-from-current-position button actually do?
One JSON-RPC call to `session/fork` against the live ACP session. The bridge sends `{ sessionId: <source>, cwd: <same>, mcpServers: <same> }` and gets back a new `{ sessionId: <branch> }`. The bridge then unregisters the source key (so the same UI key now points at the branch) and registers the new sessionId. The source session is not destroyed: its JSONL is still on disk under `~/.claude/projects/...` and is reachable through the Conversation History list, which scans that directory. Whatever model was set on the source is re-applied to the branch via `session/set_model`. The user sees an instant new chat that knows everything the original knew.
Why cap the session-ID chain at 16 entries?
Because each rollover is a discrete event, not continuous. Even a heavy day with multiple rate limits, a bridge OOM, and an OAuth rotation will rarely produce more than three entries in a single conversation's chain. Sixteen is the headroom for a multi-day conversation that survives the worst plausible week without unbounded growth in UserDefaults. The chain lives in UserDefaults in Fazm because the upstream IDs are short, the lookup is O(N) at send time, and UserDefaults is already where the head ID is stored, so co-locating is the simplest correct thing. If your storage layer is something other than UserDefaults the cap can be larger, but you still want one. An unbounded chain becomes a slow query in the resume path on long-lived conversations.
What is the difference between checkpoint, snapshot, save state, and fork in this domain?
Checkpoint and snapshot are usually used interchangeably and refer to the durable representation of an agent at a moment in time. Save state is a more general term that sometimes includes ephemeral in-memory state the agent can lose without consequence. Fork is the action of branching from a checkpoint into a new live execution that shares history up to the branch point but diverges after. In a frontier-model agent via an SDK, the checkpoint is the SDK transcript on disk plus the chain pointer, and fork is the SDK's native branching primitive (`session/fork` in ACP). In a workflow runtime, the checkpoint is a serialized blob and fork would mean deserializing twice and running both copies, which most workflow runtimes do not expose as a first-class operation.
How does this compare to LangGraph's checkpointer or Microsoft Agent Framework's checkpoint storage?
LangGraph's checkpointer (SqliteSaver, PostgresSaver, RedisSaver) writes the workflow's StateGraph snapshot per superstep. Microsoft Agent Framework workflows expose `add_checkpointing(file_storage)` and write a binary checkpoint at each pause point. Both are workflow-runtime patterns: the checkpoint is your state graph, not the model session. They work because the runtime owns the orchestration loop, and the model is one node in the graph. With a frontier-model SDK that owns its own agent loop (which is the case for Claude Code, Codex via ACP, and most computer-use agents), the model session is not yours to serialize. The checkpoint pattern shifts: the SDK transcript on disk is the source of truth, and your job is to keep a stable pointer to it that survives upstream rollover. The two patterns are not in conflict, they sit at different layers.
If the SDK owns the transcript, what does the host actually have to write down?
Two things. One: a stable conversation identity that lives in your local database and never changes for the life of the conversation. The Fazm version is a UUID per window or popout, persisted alongside the chat row. Two: an append-only chain of every upstream session ID that conversation has ever held in this bridge mode, capped (16 in Fazm), reset only on explicit user actions like New Chat or sign-out. The helpers are `appendToSessionChain`, `loadSessionChain`, `persistSessionId`, and `clearSessionId` in `ChatProvider.swift` lines 634 to 674. Messages themselves are stamped with the current upstream ID (not the stable conversation ID) so the chain mechanism stays purely additive and the message store does not need to know rollovers exist.
What about the model context window itself, does fork inherit cache?
Yes, when fork is done through the SDK's native primitive. Because `session/fork` creates the branch at the source's last position in the upstream session, the prefix is shared and prompt caching applies. The first prompt sent into the branch sees the same cached prefix the source's next prompt would have seen. This is the reason a real fork is dramatically faster than a copy on long conversations: a 100,000-token conversation forked via `session/fork` reuses the cached prefix, while a copy would re-tokenize and re-process the entire history with no cache hit. The win shows up most clearly on the first reply latency in the branch.
Where can I read the actual implementation?
All MIT-licensed at github.com/m13v/fazm. The bridge-side fork handler is in `acp-bridge/src/index.ts`, function `handleForkSession` around line 4098, calling the upstream JSON-RPC `session/fork` at line 4136 and emitting a `session_forked` event back to Swift at line 4162. The Swift entry points are `forkSession(key:)` and `forkSession(fromKey:toKey:)` in `Desktop/Sources/Providers/ChatProvider.swift` around lines 1735 and 1752. The session-ID chain helpers, including `sessionChainMaxSize = 16`, are in the same file lines 622 to 674. The cwd-migration helper that moves a JSONL transcript between encoded-cwd directories is at `acp-bridge/src/index.ts` line 2614, function `migrateJsonlForCwdChange`. The truncate-and-resubmit path that lets you branch from a non-head position is `truncateForEdit` around line 1800 in `ChatProvider.swift`.