Perplexity's Computer Agent Controls a Browser - But Your Workflow Is More Than One App
Perplexity's computer agent is impressive. It can navigate websites, fill out forms, extract data, and complete multi-step browser workflows autonomously. For tasks that live entirely inside a browser, it works well.
But your actual workflow does not live entirely inside a browser.
The Browser Boundary
Think about what you did in the last hour. You probably switched between a browser, a messaging app, a text editor, a terminal, maybe a design tool, maybe a spreadsheet. A real workflow crosses application boundaries constantly.
A browser-only agent cannot open your email client to check a thread. It cannot switch to Figma to grab a design spec. It cannot run a terminal command to deploy code. It cannot update a native spreadsheet application. Every time the workflow leaves the browser, the agent stops and you take over manually.
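To make the boundary concrete, here is a minimal Python sketch that models a workflow as a list of steps and counts how many separate times you would have to take over. The `Step` class, the `manual_stretches` function, and the sample workflow are all hypothetical, made up for illustration; they are not part of any agent's real API.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Step:
    app: str     # application the step happens in (illustrative labels)
    action: str  # what gets done there


def manual_stretches(steps, agent_apps):
    """Count contiguous runs of steps that fall outside the agent's reach.

    Each run is one moment where the agent stops and you take over.
    """
    covered = [step.app in agent_apps for step in steps]
    # groupby collapses neighbouring equal flags into a single run.
    return sum(1 for is_covered, _ in groupby(covered) if not is_covered)


# Hypothetical workflow assembled from the examples in the text.
WORKFLOW = [
    Step("mail", "check a thread"),
    Step("browser", "look up a dashboard"),
    Step("figma", "grab a design spec"),
    Step("browser", "fill out a form"),
    Step("terminal", "deploy code"),
    Step("spreadsheet", "update a native spreadsheet"),
]

browser_only = manual_stretches(WORKFLOW, {"browser"})
whole_desktop = manual_stretches(WORKFLOW, {step.app for step in WORKFLOW})
print(browser_only, whole_desktop)  # 3 0
```

In this toy workflow the browser-only agent hands control back three times, while an agent that can reach every app never does. The numbers depend entirely on the made-up step list, so treat them as an illustration of the idea rather than a measurement.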
Desktop Agents Work Across Everything
Desktop agents interact with your Mac through accessibility APIs, which expose the interface elements of nearly any application rather than just web pages. The same agent can read a Slack message, open a browser to check a dashboard, switch to the terminal to run a command, and paste the result back into Slack. No application boundary stops it.
This is not just a convenience difference - it changes what you can automate. Browser-only agents can automate browser tasks. Desktop agents can automate workflows. The distinction matters because most real work involves multiple applications talking to each other through you as the intermediary.
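The Slack-to-terminal round trip described above can be sketched as a chain of steps that pass a shared context along, so the agent rather than you carries information between apps. This is a rough Python sketch under loud assumptions: every handler is a hypothetical stub returning canned data, and nothing here calls a real accessibility API, Slack, a browser, or a shell.

```python
from typing import Callable, Dict, List, Optional, Tuple

Context = Dict[str, str]
Handler = Callable[[Context], Context]


# Hypothetical stubs: a real agent would drive each app's UI here.
def read_slack_message(ctx: Context) -> Context:
    return {**ctx, "request": "Is the deploy healthy?"}


def check_dashboard(ctx: Context) -> Context:
    return {**ctx, "dashboard_status": "green"}


def run_terminal_command(ctx: Context) -> Context:
    return {**ctx, "command_output": "3 pods running"}


def reply_in_slack(ctx: Context) -> Context:
    # Uses results gathered in the other apps: the cross-app connection.
    reply = (
        f"Dashboard: {ctx['dashboard_status']}; "
        f"terminal: {ctx['command_output']}"
    )
    return {**ctx, "reply": reply}


WORKFLOW: List[Tuple[str, Handler]] = [
    ("Slack", read_slack_message),
    ("Browser", check_dashboard),
    ("Terminal", run_terminal_command),
    ("Slack", reply_in_slack),
]


def run_workflow(
    workflow: List[Tuple[str, Handler]],
    ctx: Optional[Context] = None,
) -> Tuple[Context, List[str]]:
    """Run each step in order, threading the context through every app."""
    ctx = dict(ctx or {})
    visited: List[str] = []
    for app, handler in workflow:
        visited.append(app)
        ctx = handler(ctx)
    return ctx, visited


result, visited = run_workflow(WORKFLOW)
print(" -> ".join(visited))
print(result["reply"])
```

The point of the design is the shared context: each step can read what earlier steps in other apps produced, which is the job you normally do by copying and pasting between windows.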
The Workflow Is the Unit
Individual app automation is useful but limited. The real time savings come from automating the connections between apps - the copy-paste, the context-switching, the "let me check this in another tool before updating this one" loops. These cross-app workflows are where you spend most of your coordination time.
An agent that controls your entire desktop can own the full workflow end to end. An agent trapped in a browser can only own fragments of it.
Fazm is an open source macOS AI agent, available on GitHub.