How to Monitor What Your AI Agent Is Actually Doing
You ship an AI agent that automates a workflow. The logs say it completed successfully - all tool calls returned 200, every step executed in order, no exceptions thrown. But the user reports that nothing actually happened. The form was not submitted. The email was not sent. The file was not saved.
This is the observability gap in AI agents. Traditional logging captures what the agent intended to do, not what actually happened on screen.
Logs Lie
When an agent calls a click action at coordinates (300, 450), the tool call succeeds. The click was dispatched. But if the page layout shifted and there is nothing interactive at those coordinates, the click hits empty space. The log entry looks identical to a successful click.
Same thing with text input. The agent types into what it thinks is a search box, but focus was on a different element. The tool call log shows that the text was typed. The actual result is the right text in the wrong field.
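The click case can be reproduced with a toy model. This is only a sketch: `Page`, `dispatch_click`, and the log format are hypothetical names made up for illustration, not part of any real automation library. It dispatches the same click before and after a layout shift and shows that the two log entries come out identical even though only one click landed on the button.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy page model (hypothetical): element name -> (x, y, width, height)."""
    elements: dict = field(default_factory=dict)

    def element_at(self, x, y):
        """Return the name of the element under (x, y), or None for empty space."""
        for name, (ex, ey, w, h) in self.elements.items():
            if ex <= x < ex + w and ey <= y < ey + h:
                return name
        return None


def dispatch_click(page, x, y, log):
    """Dispatch a click and log it the way a typical tool-call logger would."""
    hit = page.element_at(x, y)  # what the click actually landed on
    # The log records the intent and the dispatch status, but not `hit`.
    log.append({"action": "click", "x": x, "y": y, "status": "ok"})
    return hit


log = []
before_shift = Page({"submit": (280, 430, 100, 40)})
after_shift = Page({"submit": (280, 530, 100, 40)})  # layout moved down 100px

hit_1 = dispatch_click(before_shift, 300, 450, log)  # lands on the button
hit_2 = dispatch_click(after_shift, 300, 450, log)   # lands on empty space

print(hit_1, hit_2, log[0] == log[1])
```

Nothing in `log` distinguishes the two runs. The only place the difference exists is in what was on screen at the moment of the click.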
Screen Recording Fills the Gap
The fix is dead simple - record the screen while the agent runs. Not screenshots at each step, but continuous video. When something goes wrong, you scrub through the recording and see exactly what happened. The button was not visible. The page was still loading. A popup blocked the target element.
For macOS agents, ScreenCaptureKit makes this lightweight. Recording at low resolution and frame rate - 720p at 5fps is enough for debugging - keeps the performance overhead small. Keep the last hour of recordings and garbage-collect anything older.
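The retention rule is easy to sketch. The example below assumes, purely for illustration, that the recorder writes short `.mp4` segments into a single directory and that file modification time is a good enough proxy for when a segment was recorded. The function name `prune_recordings` and the directory layout are assumptions, not something ScreenCaptureKit provides.

```python
import time
from pathlib import Path


def prune_recordings(directory, max_age_seconds=3600, now=None):
    """Delete .mp4 segments last modified before the retention window.

    Hypothetical helper: assumes one directory of segment files.
    Returns the names of the deleted files, sorted.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    deleted = []
    for segment in Path(directory).glob("*.mp4"):
        if segment.stat().st_mtime < cutoff:
            segment.unlink()
            deleted.append(segment.name)
    return sorted(deleted)
```

Calling this on a timer, or each time a new segment is closed, keeps disk usage bounded. Recording in short segments rather than one long file is what makes deleting only the old footage possible.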
Structured Video Logs
The next level is syncing your tool call logs with video timestamps. Each log entry gets a frame number so you can jump directly to the moment a specific action executed. When a user reports a failure, you pull up the tool call log, click on the failed step, and the video jumps to that exact moment.
This is not optional for production agents. If your agent touches a UI and you cannot see what it saw, you are debugging blind.
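Mapping a log entry to a frame number is simple arithmetic once the tool call timestamps and the recording start time come from the same clock. The sketch below assumes a constant frame rate and uses hypothetical names (`ToolCall`, `frame_for`, `annotate`). A real recorder may drop or duplicate frames, in which case the presentation timestamps stored in the video are the safer reference.

```python
import math
from dataclasses import dataclass


@dataclass
class ToolCall:
    """Hypothetical log record for one agent action."""
    step: int
    action: str
    timestamp: float  # seconds, on the same clock as the recording start


def frame_for(timestamp, recording_start, fps=5):
    """Return the index of the frame being captured at `timestamp`."""
    if timestamp < recording_start:
        raise ValueError("tool call happened before the recording started")
    return math.floor((timestamp - recording_start) * fps)


def annotate(calls, recording_start, fps=5):
    """Attach a frame number and a video offset to each log entry."""
    return [
        {
            "step": c.step,
            "action": c.action,
            "frame": frame_for(c.timestamp, recording_start, fps),
            "video_offset_s": round(c.timestamp - recording_start, 3),
        }
        for c in calls
    ]


calls = [
    ToolCall(1, "click", 1000.0),
    ToolCall(2, "type", 1002.5),
    ToolCall(3, "click", 1010.3),
]
entries = annotate(calls, recording_start=1000.0, fps=5)
```

A log viewer can then seek the video player to `video_offset_s` when a step is clicked, and use `frame` when frame-accurate stepping is needed.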
Fazm is an open source macOS AI agent, available on GitHub.