Building an Agent Journal That Catches Its Own Lies by Tracking Prediction Errors

Fazm Team · 3 min read


The most interesting feature in desktop agent development right now is not better task execution. It is the journal - a system that catches the agent's own lies by tracking prediction errors.

What Is a Prediction Error Journal?

Every time an agent takes an action, it implicitly predicts an outcome. "I will click this button and a dialog will appear." "I will run this command and the build will succeed." "I will edit this file and the test will pass."

A prediction error journal records these predictions, then records what actually happened, and stores the delta. Over time, this builds a dataset of where the agent consistently gets things wrong.
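As a sketch, a single journal entry might capture exactly those three pieces: the prediction, the observation, and whether they matched. The field names here are illustrative, not Fazm's actual schema:

```python
from dataclasses import dataclass, asdict
import json
import time

@dataclass
class JournalEntry:
    """One prediction-error record: what was expected vs. what happened."""
    action: str       # the action taken, e.g. "run `make build`"
    prediction: str   # the outcome the agent expected
    observation: str  # the outcome actually observed
    matched: bool     # True if prediction and observation agree

    def to_json(self) -> str:
        # Timestamp each record so drift over time is visible on review.
        return json.dumps({"ts": time.time(), **asdict(self)})

entry = JournalEntry(
    action="run `make build`",
    prediction="build succeeds with exit code 0",
    observation="build failed with exit code 2",
    matched=False,
)
```

Appending these records as one JSON object per line keeps the journal trivially greppable and easy to aggregate later.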

Why Agents Lie

AI agents do not deliberately deceive. But they routinely report success when they have failed, describe UI states they cannot actually see, and claim to have completed tasks they only partially finished. This is not malice - it is pattern completion. The model predicts the most likely next token, and "success" is almost always more likely than "I failed and here is exactly how."

The prediction error journal catches these discrepancies by comparing what the agent said would happen against observable ground truth - screenshots, file diffs, process exit codes, and system state.
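Exit codes are the cheapest of those ground-truth sources. A hypothetical checker (not Fazm's implementation) might compare the agent's predicted exit code against the real one:

```python
import subprocess

def run_and_verify(cmd: list[str], predicted_exit: int = 0) -> dict:
    """Run a command and record the delta between predicted and actual exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return {
        "cmd": cmd,
        "predicted_exit": predicted_exit,
        "actual_exit": result.returncode,
        "matched": result.returncode == predicted_exit,
    }

# The agent predicted success; ground truth says otherwise.
report = run_and_verify(["false"])  # `false` always exits non-zero
```

The same pattern extends to the other signals mentioned above: hash a file before and after a claimed save, or diff a screenshot region against the UI state the agent described.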

The Feedback Loop

The real value is not just catching lies after the fact. It is feeding the error patterns back into the agent's context. When the journal shows that the agent consistently overestimates the success rate of a particular type of action, you can add that as a warning in the system prompt.

For example, if the journal reveals that the agent claims "file saved successfully" 100% of the time but the file actually changes only 85% of the time, you add a rule: "Always verify file changes with a diff after saving."
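Mining the journal for that kind of gap can be as simple as grouping entries by action type and measuring how often prediction and reality disagree. A minimal sketch, assuming entries are dicts with `action_type` and `matched` fields:

```python
from collections import defaultdict

def mismatch_rates(entries: list[dict]) -> dict[str, float]:
    """Fraction of entries per action type where prediction != reality."""
    counts = defaultdict(lambda: [0, 0])  # action_type -> [mismatches, total]
    for e in entries:
        counts[e["action_type"]][1] += 1
        if not e["matched"]:
            counts[e["action_type"]][0] += 1
    return {a: miss / total for a, (miss, total) in counts.items()}

entries = [
    {"action_type": "save_file", "matched": True},
    {"action_type": "save_file", "matched": True},
    {"action_type": "save_file", "matched": True},
    {"action_type": "save_file", "matched": False},  # claimed saved, diff was empty
]
rates = mismatch_rates(entries)  # {"save_file": 0.25}
```

Action types whose mismatch rate crosses a threshold are the ones that earn a verification rule in the system prompt.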

Practical Implementation

A minimal prediction error journal needs three things:

  • Pre-action prediction: What the agent expects to happen
  • Post-action observation: What actually happened (screenshot, file state, exit code)
  • Delta recording: The difference between prediction and reality

Store these as structured data and review them periodically. The patterns that emerge will tell you exactly where your agent needs guardrails.
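One way to wire the three pieces together, purely as a sketch with a hypothetical API, is an append-only JSONL log with a predict/observe pair of calls around each action:

```python
import json
import time

class PredictionJournal:
    """Append-only journal: record a prediction, then the observed outcome."""

    def __init__(self, path: str):
        self.path = path
        self._pending: dict[str, dict] = {}

    def predict(self, action_id: str, action: str, prediction: str) -> None:
        """Pre-action: store what the agent expects to happen."""
        self._pending[action_id] = {"action": action, "prediction": prediction}

    def observe(self, action_id: str, observation: str, matched: bool) -> None:
        """Post-action: record reality and the delta, then persist one JSON line."""
        entry = self._pending.pop(action_id)
        entry.update({"observation": observation, "matched": matched,
                      "ts": time.time()})
        with open(self.path, "a") as f:
            f.write(json.dumps(entry) + "\n")
```

Because every record lands on its own line, periodic review is a one-liner with `grep '"matched": false'` or a short script over the file.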

More on This Topic

Fazm is an open source macOS AI agent, available on GitHub.
