Field guide

How to verify what an AI agent actually did

An agent ran on your machine. It edited files, ran commands, maybe clicked around your browser. At the end it wrote a tidy paragraph saying it is done. The question this guide answers: how do you know that paragraph is true?

Matthew Diakonov
9 min read

The short version, verified 2026-05-15

You verify an AI agent by reading its action record (every tool call it made and the result that came back) and cross-checking that record against the real end state (git diff, the actual file, the live UI). You do not verify it by trusting the summary it writes at the end. The summary is a hypothesis. The tool-call record is evidence. Two things quietly destroy that evidence before you get to read it: auto-compacting and a scrolling terminal. The rest of this page is about keeping the record intact and legible enough to actually read.

The summary is the one thing you must not trust

Search for this topic and most of what comes back is about something else. One half is cloud-agent identity: cryptographic agent IDs, delegation receipts, policy engines that allow or deny an action at an API boundary before it reaches a system. The other half is pre-production evaluation: trajectory evaluators, test suites, scoring an agent on a benchmark before you ship it. Both are real disciplines. Neither is the problem you have at 6pm on a Tuesday.

Your problem is smaller and more concrete. An agent already ran, on your own laptop, in your own logged-in session. There was no policy engine in front of it and no benchmark involved. It did things, and now it has handed you a paragraph: "I updated the config, migrated the schema, and the tests pass." The paragraph reads as a report. It is not one. It is generated text. The model wrote "the tests pass" because that is the plausible continuation of the conversation, not because it re-ran the tests and read the exit code a second time.

When the summary and reality diverge, you get what people have started calling a ghost action: the agent sounded confident, but the write never landed, the migration silently no-op'd, the click hit the wrong element. The whole job of verification is catching ghost actions. And you cannot catch them by reading the summary, because the summary is exactly where the ghost lives. You catch them by reading the record of what the agent actually did, call by call.

Two models: gate before, or verify after

There are two ways to keep an agent honest. You can gate it: make it stop and ask before every action. Or you can let it run and verify the record afterward. The common advice leans hard on the first one. In daily use, the first one quietly fails.

Gate before: approve every action

The agent pauses on every tool call and waits for you to click allow. In theory you are in control of each step. In practice:

  • After 30 prompts you stop reading and click allow on reflex
  • A 40-minute run turns into 40 minutes of babysitting
  • You approve the intent, never the outcome, so ghost actions still pass
  • Doesn't scale to a multi-window or overnight workflow

This is why Fazm auto-approves permissions instead of prompting. When the agent sends a session/request_permission request, the ACP bridge selects allow without asking you. That code is in acp-bridge/src/index.ts around line 1204, and it is deliberate. The bet is that verification belongs after the run, not scattered across 200 interruptions during it. But that bet only pays off if the record you verify against is genuinely good.
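What does that auto-approve look like in code? A minimal sketch, not Fazm's real handler: the type names and response shape below are assumptions modeled on the Agent Client Protocol, where a permission request offers the client a set of options and the client answers with one optionId.

// Hypothetical shapes; the real handler is in acp-bridge/src/index.ts (~line 1204).
interface PermissionOption {
  optionId: string;
  kind: "allow_once" | "allow_always" | "reject_once" | "reject_always";
}

interface PermissionRequest {
  options: PermissionOption[];
}

// Auto-approve: pick an allow-style option without ever prompting the user.
function handlePermissionRequest(req: PermissionRequest): { optionId: string } {
  const allow = req.options.find((o) => o.kind.startsWith("allow"));
  // If no allow option is offered, fall back to the first one rather than hang.
  return { optionId: (allow ?? req.options[0]).optionId };
}

The real engineering question, then, is not the gate. It is: what does a verifiable action record look like?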

What a verifiable action record looks like

A raw terminal gives you a record, technically. Every tool call scrolls past as text. But it is one undifferentiated stream: intent, output, model chatter, and progress spinners all blur together, and pairing "the agent asked to edit this file" with "here is what the edit returned" means scanning by eye. That is a record you can read in principle and will not read in practice.

Fazm wraps the same Claude Code agent loop through ACP, but it splits each tool call into two events that share one identity. The translation lives in acp-bridge/src/acp-translate.ts, lines 63 to 118. When the agent invokes a tool, the bridge emits a tool_use event carrying the call id, the tool name, and the raw input arguments the agent chose. When the result comes back, the bridge emits a tool_result_display event carrying the same id, the name, and the actual output (truncated at 2000 characters so a giant blob cannot bury the rest). One action, two records, joined by toolUseId: the intent and the outcome.

The two paired events, per action

// intent: what the agent asked to do
{ "type": "tool_use",
  "callId": "toolu_01ABC...",
  "name": "Edit",
  "input": { "file_path": "config.ts", "...": "..." } }

// outcome: what actually came back
{ "type": "tool_result_display",
  "toolUseId": "toolu_01ABC...",   // same id
  "name": "Edit",
  "output": "Applied 1 edit to config.ts" }

Source: acp-translate.ts cases tool_call and tool_call_update. The Swift side (ACPBridge.swift, the toolUse and toolResultDisplay cases near line 200) renders each pair as its own card in the chat window.

That pairing is the whole point. Verification is a comparison operation, and a comparison needs two operands. With the intent and the outcome captured separately and tied by one id, you can walk the session and ask, for every action: did the thing it asked for produce the result it expected? Here is that flow across the layers.

One action, from agent to a record you can read

Claude Code agent → ACP bridge → Fazm window

  • tool_call: name + rawInput (the intent)
  • tool_use event renders the intent card
  • tool_call_update: status + result content
  • tool_result_display renders the outcome card
  • you read both cards, paired by toolUseId
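Once the session is over, that pairing is also checkable by machine. A minimal sketch, assuming you can get the session's events as an array: the event shapes follow the two records shown above, and findOrphanedCalls is a hypothetical helper, not part of Fazm.

// Flag intents that never got an outcome: a tool_use with no matching result.
type SessionEvent =
  | { type: "tool_use"; callId: string; name: string; input: unknown }
  | { type: "tool_result_display"; toolUseId: string; name: string; output: string };

function findOrphanedCalls(events: SessionEvent[]): string[] {
  const pending = new Map<string, string>(); // callId -> tool name
  for (const e of events) {
    if (e.type === "tool_use") pending.set(e.callId, e.name);
    else pending.delete(e.toolUseId);
  }
  // Whatever remains asked to act but never reported back.
  return [...pending].map(([id, name]) => `${name} (${id})`);
}

An empty result does not mean every action succeeded; it only means every intent has an outcome you can read.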

Two things silently destroy the record

A good record is not enough if it does not survive to the moment you read it. Two ordinary features of agent clients quietly shred it.

The first is auto-compacting. When a long session nears the context limit, many clients compact: they replace a stretch of real messages, tool calls and their results included, with a model-written summary, then keep going. The session continues fine. But the actual tool_result for an edit you care about, 40 turns back, no longer exists in the window. It was summarized into a sentence. You now cannot verify that action against anything, because the evidence was compressed into the same kind of generated text you were trying not to trust. Fazm does not auto-compact. The full history, every paired event, stays live in the window for its entire lifetime, so a three-hour session still has a complete record at the end.

The second is the scrolling terminal. Even when the bytes are technically all there, a flat stream is not a readable record. If you run Claude Code in a bare terminal it does keep a session transcript on disk as JSONL under ~/.claude/projects/, which is a genuine, complete record. But verifying from it means reading raw JSON line by line, or writing a script to do it. Structured cards in a window are the same information arranged so a human can actually scan it: collapse the calls that obviously succeeded, stop on the one that looks wrong, read its input and its output side by side.
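If you do need to verify from the on-disk transcript, a short script beats eyeballing raw JSON. A sketch under one loud assumption: the JSONL schema here, each line a record whose message.content array may hold tool_use blocks, is an inference about the transcript format, not a documented contract, so check one line of your own file first.

import { readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Print every tool call recorded in a Claude Code session transcript.
function listToolCalls(transcriptPath: string): void {
  for (const line of readFileSync(transcriptPath, "utf8").split("\n")) {
    if (!line.trim()) continue;
    const content = JSON.parse(line)?.message?.content;
    if (!Array.isArray(content)) continue;
    for (const block of content) {
      if (block.type === "tool_use") {
        console.log(`${block.name}: ${JSON.stringify(block.input)}`);
      }
    }
  }
}

// Fill in the project and session file for your own run.
listToolCalls(join(homedir(), ".claude", "projects", "<project>", "<session>.jsonl"));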

The check the record cannot do for you

Even a perfect, uncompacted, perfectly paired record has a ceiling. It tells you what the agent asked for and what each tool returned. It does not tell you the end state is correct. A tool_result_display that says "file written" is the tool reporting its own success. It is not proof the file holds what you wanted, or that the agent did not also touch four other files you never mentioned.

So the last move of verification always leaves the record and looks at reality. For code that means git diff HEAD to see every file actually touched, then opening the ones the agent claims it changed. For a browser or desktop action it means looking at the live UI. This is where Fazm's approach to computer use matters for verification specifically: its desktop and browser tools read state through the macOS accessibility tree, not screenshots. When the agent reports "clicked Submit and the form saved," you can check that against the real control and field state, instead of against a picture the agent took of itself and then described. A screenshot the agent narrates is just another summary.

A verification pass you can run after any session

Put it together and verifying a run is a short, repeatable pass. Worth doing after any session longer than half an hour, or any session where the agent had write access to more than one place.

After-run verification pass

  • Skim the tool-call cards top to bottom. Collapse the obvious successes, stop on anything whose output does not match its input.
  • For each action that matters, read the intent card and the outcome card as a pair. The agent's prose between them is not evidence.
  • Run git diff HEAD --stat to map every file actually touched. Flag anything outside the scope you asked for (scriptable; see the sketch after this list).
  • Open the files the agent claims it changed. Confirm the change is the one you wanted, not just a change.
  • For any browser or desktop action, look at the live UI. Confirm the real end state, not the agent's description of it.
  • If something is off, fork the chat from before the bad action and re-run, instead of arguing with a session whose context is now polluted.
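The third step of that pass is the easiest to script. A minimal sketch, not part of Fazm: it shells out to git and flags touched files outside the prefixes you consider in scope, where the scope list is whatever you actually asked the agent to work on.

import { execSync } from "node:child_process";

// List every changed file that falls outside the allowed scope.
function filesOutsideScope(allowedPrefixes: string[]): string[] {
  const touched = execSync("git diff HEAD --name-only", { encoding: "utf8" })
    .split("\n")
    .filter(Boolean);
  return touched.filter(
    (file) => !allowedPrefixes.some((prefix) => file.startsWith(prefix))
  );
}

// Example: the task was scoped to src/config; anything else deserves a look.
console.log(filesOutsideScope(["src/config/"]));

Anything the script prints is not automatically wrong, but it is exactly the kind of change the summary never mentions.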

When gating before is still the right call

Verify-after is the right default for everyday work on your own machine, where almost everything is reversible: a bad edit is a git checkout away, a bad file is in the Trash. It is not the right model when an action is irreversible or hits something you do not own.

Deleting production data, sending email to real recipients, moving money, force-pushing to a shared branch, running anything with credentials that reach beyond your laptop: for those, reading the record afterward is too late, because there is no afterward you can undo. The honest position is that verification has two regimes. For reversible local work, verify-after with a strong record wins, and a per-call prompt is mostly theater. For irreversible or outward-facing actions, you want a real gate, and ideally you want the agent to not have those credentials in scope at all.

Verifying is also only half of staying in control. The other half is being able to stop a run that has clearly gone wrong before it finishes. That is a separate mechanism, the interrupt path and the stuck-tool watchdog, covered in the companion guide on the containment-action gap on the desktop.

Want to see a real agent run, then verified, end to end?

Book 20 minutes and we will walk a live Fazm session, then read its tool-call record together and cross-check the end state.

Frequently asked questions

What does verifying an AI agent actually mean when it runs on my own machine?

It means confirming that the actions the agent claims it took match the actions it actually took, and that those actions match the end state you wanted. On your own Mac there is no IAM layer, no policy engine, and no cryptographic agent identity in front of the work. The agent ran as a process in your logged-in session and already did things. Verification is therefore a reading task done after the run: you read the record of every tool call, then you cross-check that record against the real artifacts (the files on disk, the diff, the live UI). It is not the cloud-governance problem that most articles on this describe, where verification happens at an API boundary before the action reaches a system.

Why can't I just trust the summary the agent writes at the end?

Because the summary is generated text, not evidence. The model writes "I updated the config and the tests pass" because that is the plausible next token given the conversation, not because it re-checked the disk. When the summary and reality diverge you get what people call a ghost action: the agent sounded confident, but the write never landed, the test was never run, the click hit the wrong element. The summary is a hypothesis about what happened. The tool-call record is the closest thing you have to what happened. Verification means reading the second one, not the first.

How does Fazm record each action so I can verify it?

Every tool call the agent makes crosses the ACP bridge as two events that share one toolUseId. The first is tool_use, which carries the call id, the tool name, and the raw input arguments the agent chose. The second is tool_result_display, which carries the same id, the name, and the actual output that came back (truncated at 2000 characters). That translation lives in acp-bridge/src/acp-translate.ts, lines 63 to 118. The Swift side renders each pair as its own card in the chat window (ACPBridge.swift defines the toolUse and toolResultDisplay cases around line 200). So for every action you can line up what the agent asked to do against what actually returned, instead of reading one undifferentiated wall of terminal output.

Does auto-compacting really lose verification data?

Yes, and that is the part most people miss. When a long session approaches the context limit, many agent clients silently compact: they replace a stretch of real messages, including tool calls and their results, with a model-written summary. After that point the original tool_result for an action 40 turns back is gone from the live context. You cannot verify against it because it no longer exists in the window. Fazm does not auto-compact. The full chat history, every tool_use and tool_result_display, stays live in context for the lifetime of the window, so the record you would verify against is still complete at the end of a three-hour session.

Fazm auto-approves every tool permission. Isn't that the opposite of safe?

Fazm does auto-approve. When the agent sends session/request_permission, the bridge selects allow without prompting you (index.ts, around line 1204, matching the bypassPermissions behavior). That is a deliberate design choice, not an oversight. A per-call approval prompt either trains you to rubber-stamp (you click allow 200 times and stop reading) or it stalls you (you babysit a 40-minute run). Fazm bets the other way: let the agent run, but make the record of what it did so legible and so durable that verifying after the fact is fast and honest. The safety does not disappear, it moves from a gate you click to a record you read. For the cases where gate-before is genuinely the right call, see the body of this guide.

What is the one check the in-app record can't do for me?

The record tells you what the agent asked for and what the tool returned. It cannot tell you that the end state is correct. A tool_result that says "file written" is the tool reporting success, not proof the file contains what you wanted. So the last step of verification always lives outside the record: run git diff HEAD to see every file actually touched, open the file the agent claims it changed, and look at the live UI for a browser or desktop action. Fazm's desktop and browser tools read state through the macOS accessibility tree rather than screenshots, so when the agent reports "clicked Submit" you can check that against the real control state instead of a picture the agent took of itself.

Can I still verify a session after I restart my Mac?

Yes. Fazm persists sessions. After a restart the app auto-restores every chat window with its full conversation history, including the tool_use and tool_result_display cards, intact. So a verification pass is not a now-or-never thing. You can close the lid, come back the next morning, and the record of what the agent did the day before is still there to read, paired and uncompacted.

Fazm is open source. The bridge code referenced here, acp-translate.ts, index.ts, and ACPBridge.swift, is readable in full at github.com/m13v/fazm. The record this guide tells you to verify against is itself auditable, which is the point.
