Browser Agents Need Human Checkpoints - Read Autonomously, Write With Confirmation
Giving AI agents access to your real browser is powerful and dangerous in equal measure. The permission model that actually works in practice is simple: reading is autonomous, writing requires confirmation.
The Read/Write Permission Split
Let the agent browse freely. It can read pages, extract data, navigate between tabs, and gather information without asking permission. This is the high-volume, low-risk work that agents excel at.
But the moment the agent needs to submit a form, send a message, make a purchase, or post content, it stops and asks. Anything that sends data externally gets a human checkpoint. No exceptions.
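This sounds obvious, but many browser agent implementations either grant full autonomy (risky) or require confirmation for everything (unusable). The read/write split sits between the two: the agent stays fast on the bulk of its work, and you only get interrupted when something is about to leave the browser.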
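As a rough illustration of the split, here is a minimal Python sketch. The names `Effect`, `Action`, and `run_action` are made up for this example and do not come from any particular agent framework, and the `confirm` callback stands in for whatever prompt the human actually sees.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Effect(Enum):
    READ = "read"    # only observes state
    WRITE = "write"  # sends data externally


@dataclass(frozen=True)
class Action:
    name: str       # hypothetical action name, e.g. "submit_form"
    effect: Effect
    target: str     # URL or element description shown to the human


def run_action(
    action: Action,
    execute: Callable[[Action], str],
    confirm: Callable[[Action], bool],
) -> str:
    """Run reads immediately; run writes only after the human approves."""
    if action.effect is Effect.WRITE and not confirm(action):
        return f"blocked: {action.name}"
    return execute(action)


def demo() -> list[str]:
    def execute(action: Action) -> str:
        return f"done: {action.name}"

    def deny_all(action: Action) -> bool:
        # Stand-in for a human who declines every request.
        return False

    read = Action("extract_text", Effect.READ, "https://example.com")
    write = Action("submit_form", Effect.WRITE, "https://example.com/checkout")
    return [
        run_action(read, execute, deny_all),
        run_action(write, execute, deny_all),
    ]
```

With a human who declines everything, the read still goes through and the write is blocked, which is the behavior the split is meant to guarantee.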
Persistent Sessions Over Reconnecting
One practical lesson from running browser agents daily: persistent sessions are dramatically more reliable than reconnecting to a new browser instance each time.
When you reconnect, you lose cookies, session state, and login status. The agent spends the first few minutes of every task logging back in, navigating to the right page, and restoring state. With persistent sessions, it picks up exactly where it left off.
This also means the agent can maintain context across tasks. It knows what tabs are open, what pages it has already visited, and what data it has already extracted.
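The idea can be sketched in a few lines of Python. This is an assumption-laden toy: `SessionState` and `SessionStore` are hypothetical names, and the state is a plain JSON file rather than a real browser profile. In an actual browser setup the equivalent is usually reusing a profile directory (Playwright's `launch_persistent_context`, for example, takes a user data directory).

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class SessionState:
    cookies: dict[str, str] = field(default_factory=dict)
    open_tabs: list[str] = field(default_factory=list)
    visited: list[str] = field(default_factory=list)


class SessionStore:
    """Hypothetical store that keeps agent session state between tasks."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def load(self) -> SessionState:
        # A missing file means a first run: start with an empty session.
        if not self.path.exists():
            return SessionState()
        return SessionState(**json.loads(self.path.read_text()))

    def save(self, state: SessionState) -> None:
        self.path.write_text(json.dumps(asdict(state)))


def demo() -> SessionState:
    with tempfile.TemporaryDirectory() as tmp:
        store = SessionStore(Path(tmp) / "session.json")

        # Task 1: log in once and record what was opened.
        state = store.load()
        state.cookies["session_id"] = "abc123"
        state.open_tabs.append("https://example.com/dashboard")
        state.visited.append("https://example.com/login")
        store.save(state)

        # Task 2: a later run resumes from the saved state, no re-login.
        return store.load()
```

The second `load()` returns the cookies, tabs, and history written by the first task, so the agent does not have to rebuild them.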
Why This Matters for Desktop Agents
Browser automation is one component of a broader desktop agent workflow. The agent might gather information from a web app, process it locally, then update a spreadsheet or send an email.
The permission model needs to be consistent across all of these surfaces. Reading data from any source is autonomous. Taking actions that affect the outside world requires confirmation. This principle scales from browsers to desktop apps to API calls.
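One way to keep the rule consistent is to route every surface through the same check. The sketch below is illustrative only: the `Surface` values and operation names are invented for this example, and treating `GET` as a read assumes the API follows HTTP semantics where GET has no side effects. Anything not explicitly listed as a read is gated.

```python
from enum import Enum


class Surface(Enum):
    BROWSER = "browser"
    DESKTOP = "desktop"
    API = "api"


# Hypothetical operation names; only these are treated as side-effect free.
READ_ONLY_OPERATIONS = {
    (Surface.BROWSER, "read_page"),
    (Surface.BROWSER, "switch_tab"),
    (Surface.DESKTOP, "read_file"),
    (Surface.API, "GET"),  # assumes the API honors safe-method semantics
}


def needs_confirmation(surface: Surface, operation: str) -> bool:
    """Same rule on every surface: anything not known to be a read is gated."""
    return (surface, operation) not in READ_ONLY_OPERATIONS
```

Defaulting unknown operations to "needs confirmation" means a new tool or surface starts out gated until someone classifies it as a read.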
The Trust Gradient
As you build trust with an agent over time, you can selectively expand its autonomous capabilities. Maybe after a month of confirming every email send, you let it handle routine replies autonomously. The key is that you are making that decision consciously, not by default.
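A small sketch of what "consciously, not by default" could look like in code. `TrustPolicy` and the action names are hypothetical. The point of the design is that past approvals are only counted for review, and autonomy changes only when the user explicitly calls `grant`.

```python
from dataclasses import dataclass, field


@dataclass
class TrustPolicy:
    """Hypothetical record of write actions the user has chosen to automate."""

    granted: set[str] = field(default_factory=set)
    approvals: dict[str, int] = field(default_factory=dict)

    def record_approval(self, action_kind: str) -> None:
        # Approvals are counted for review; they never grant autonomy.
        self.approvals[action_kind] = self.approvals.get(action_kind, 0) + 1

    def grant(self, action_kind: str) -> None:
        # Only an explicit user decision makes an action autonomous.
        self.granted.add(action_kind)

    def revoke(self, action_kind: str) -> None:
        self.granted.discard(action_kind)

    def needs_confirmation(self, action_kind: str) -> bool:
        return action_kind not in self.granted
```

Under this sketch, a month of approved routine replies changes nothing until the user grants that one action kind, and the grant can be revoked at any time.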
Fazm is an open source macOS AI agent, available on GitHub.