Building Screen Recording Tools for AI Agent Session Replay
Recording what an AI agent does on screen sounds straightforward. Capture frames, encode video, done. But when we built screen recording for agent session replay, the cursor was the hardest part to get right.
The Cursor Problem
AI agents don't move the mouse like humans do. They teleport the cursor from one position to the next: click here, jump 800 pixels, click there. When you play back the recording, it looks jarring and robotic, and viewers can't follow what happened.
Smooth cursor interpolation fixes this. Instead of showing the raw cursor positions, you generate intermediate frames where the cursor glides between click targets. A cubic Bézier curve between points looks natural. Linear interpolation looks like a robot pretending to be human.
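Here is a minimal sketch of that interpolation in Python. It is an illustration rather than the code used in any particular recorder: the function names, the placement of the control points at one third and two thirds of the way along the path, and the `bow` factor that pushes them sideways are all assumptions chosen to produce a gentle arc.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


def cursor_path(start: Point, end: Point, steps: int, bow: float = 0.15) -> List[Point]:
    """Return steps + 1 cursor positions gliding from start to end.

    The two control points sit 1/3 and 2/3 of the way along the straight
    line, offset sideways by `bow` times the travel distance (an assumed
    value) so the path arcs slightly instead of running dead straight.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    dx, dy = end[0] - start[0], end[1] - start[1]
    # Vector perpendicular to the direction of travel, scaled by bow.
    nx, ny = -dy * bow, dx * bow
    c1 = (start[0] + dx / 3 + nx, start[1] + dy / 3 + ny)
    c2 = (start[0] + 2 * dx / 3 + nx, start[1] + 2 * dy / 3 + ny)
    return [cubic_bezier(start, c1, c2, end, i / steps) for i in range(steps + 1)]
```

Calling `cursor_path((0, 0), (800, 0), 8)` yields nine positions for an 800-pixel jump, starting and ending exactly on the two click targets. Setting `bow` to 0 collapses the curve back to a straight line.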
The tricky part is timing. Too fast and the cursor zips around unintelligibly. Too slow and a 30-second agent session becomes a 5-minute video. We found that 200 to 300 ms of travel time between clicks hits the sweet spot.
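To make the timing concrete, the sketch below turns a travel time into a number of interpolated frames. Only the 200 to 300 ms range comes from the text above. Scaling the duration with distance, the 1000-pixel reference distance, the 30 fps default, and the smoothstep easing are all assumptions added for illustration.

```python
import math


def travel_time_ms(distance_px: float,
                   min_ms: float = 200.0,
                   max_ms: float = 300.0,
                   full_distance_px: float = 1000.0) -> float:
    """Pick a travel time that grows with distance, clamped to [min_ms, max_ms].

    full_distance_px is an assumed distance at which the maximum is reached.
    """
    fraction = min(max(distance_px / full_distance_px, 0.0), 1.0)
    return min_ms + (max_ms - min_ms) * fraction


def frame_count(duration_ms: float, fps: int = 30) -> int:
    """Number of interpolated frames needed to cover duration_ms at fps."""
    return max(1, math.ceil(duration_ms * fps / 1000.0))


def ease_in_out(t: float) -> float:
    """Smoothstep easing: slow start, fast middle, slow stop, for t in [0, 1]."""
    return t * t * (3.0 - 2.0 * t)
```

At 30 fps, a 250 ms glide works out to `frame_count(250)`, which is 8 frames, so even a long jump adds only a handful of frames to the video. Passing each frame's progress through `ease_in_out` before evaluating the curve makes the cursor accelerate and decelerate instead of moving at constant speed.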
Frame Capture on macOS
ScreenCaptureKit on macOS gives you high-performance frame capture with minimal CPU overhead. The key is capturing at a lower frame rate during idle periods and ramping up during active agent interactions. 5 fps while waiting, 30 fps during mouse movement and typing.
This keeps file sizes reasonable without missing important actions.
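The rate switching can be sketched as a small decision function. This is platform-neutral Python, not ScreenCaptureKit code: the names are hypothetical, and the one-second window that counts as "active" after the last input event is an assumed value. On macOS the resulting interval would typically be applied through the stream configuration's `minimumFrameInterval`, which is not shown here.

```python
from typing import Optional

IDLE_FPS = 5      # capture rate while the agent is waiting
ACTIVE_FPS = 30   # capture rate during mouse movement and typing


def target_fps(now_s: float,
               last_input_s: Optional[float],
               active_window_s: float = 1.0) -> int:
    """Choose a capture rate from the time since the last input event.

    last_input_s is the timestamp of the most recent mouse or keyboard
    event, or None if there has been none. active_window_s is an assumed
    grace period during which capture stays at the higher rate.
    """
    if last_input_s is None:
        return IDLE_FPS
    if now_s - last_input_s <= active_window_s:
        return ACTIVE_FPS
    return IDLE_FPS


def frame_interval_s(fps: int) -> float:
    """Seconds between frames at the given rate."""
    return 1.0 / fps
```

Keeping the high rate for a short window after the last event avoids flapping between rates while the agent types, at the cost of a few extra frames once it stops.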
Why Session Replay Matters
Debugging AI agents without session replay is like debugging code without logs. You need to see exactly what the agent saw, where it clicked, and what happened next. When an agent fails on step 7 of a 12-step workflow, the recording tells you why in seconds instead of minutes.
For demo videos, smooth cursor movement is the difference between "this looks like a bot" and "this looks like magic."
Fazm is an open source macOS AI agent, available on GitHub.