Claude Can Control Your Entire Desktop Through Accessibility APIs

Fazm Team · 3 min read

Browser-based AI agents can automate web apps. But most knowledge work happens across multiple native applications - Xcode, Figma, Excel, Slack, Terminal, Preview, and dozens of others. A browser agent cannot touch any of them.

Desktop-level AI agents use OS accessibility APIs to control every application on your machine. Not just browsers. Everything.

How It Works

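macOS exposes an accessibility API that describes the visible elements of running applications as a tree. Buttons, text fields, menus, sliders, tables - all represented as structured data with roles, labels, positions, and available actions. Coverage depends on the app: standard system controls are exposed automatically, while custom-drawn interfaces expose only what their developers implemented.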

An AI agent with accessibility access can read the contents of any window, click any button, type into any text field, and navigate any menu. It works the same way screen readers work, except instead of reading the interface to a human, it reads the interface to an LLM that can decide what to do next.
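To make the idea concrete, here is a minimal sketch of what an agent might do with such a tree. It uses a toy in-memory model rather than the real macOS API: the `Element` class and the `walk`, `describe`, and `find` helpers are hypothetical names chosen for illustration. The role and action strings (`AXWindow`, `AXButton`, `AXTextField`, `AXPress`) follow the naming macOS uses for accessibility roles and actions.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Element:
    """Toy stand-in for one node of an accessibility tree (hypothetical)."""
    role: str                                            # e.g. "AXButton"
    title: str = ""                                      # human-readable label
    value: str = ""                                      # current contents, if any
    actions: List[str] = field(default_factory=list)     # e.g. ["AXPress"]
    children: List["Element"] = field(default_factory=list)


def walk(root: Element, depth: int = 0) -> Iterator[Tuple[int, Element]]:
    """Yield (depth, element) pairs in depth-first order."""
    yield depth, root
    for child in root.children:
        yield from walk(child, depth + 1)


def describe(root: Element) -> str:
    """Flatten the tree into indented text that could be placed in an LLM prompt."""
    lines = []
    for depth, el in walk(root):
        parts = [el.role]
        if el.title:
            parts.append(f'"{el.title}"')
        if el.value:
            parts.append(f"value={el.value!r}")
        if el.actions:
            parts.append("actions=" + ",".join(el.actions))
        lines.append("  " * depth + " ".join(parts))
    return "\n".join(lines)


def find(root: Element, role: str, title: str) -> Optional[Element]:
    """Return the first element matching role and title, or None."""
    for _, el in walk(root):
        if el.role == role and el.title == title:
            return el
    return None


# A made-up mail compose window with one field and one button.
window = Element("AXWindow", "New Message", children=[
    Element("AXTextField", "To", value="sam@example.com"),
    Element("AXButton", "Send", actions=["AXPress"]),
])

print(describe(window))
```

The text produced by `describe` is what the model would read in place of pixels, and `find` is how a decision such as "press Send" could be mapped back to a concrete element.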

What This Enables

Full desktop control means an agent can work across applications the way you do. Copy data from a spreadsheet, paste it into a presentation, adjust the formatting, switch to email, attach the file, and send it. Each step involves a different app, and the agent handles all of them through the same accessibility interface.

It also means the agent can automate apps that have no API, no plugin system, and no scripting support. If it has a UI, the agent can use it. Legacy enterprise software, proprietary tools, apps that were never designed for automation - all become accessible.

The Permission Model

This level of control requires explicit permission. On macOS, you grant accessibility access per application in System Settings, under Privacy & Security > Accessibility. The OS enforces this: no app can use the accessibility API to control other apps without the user's explicit approval.
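A program can check whether it has been granted this permission before attempting any control. The sketch below assumes the PyObjC bindings (the `pyobjc-framework-ApplicationServices` package) are installed and that they expose the system function `AXIsProcessTrusted` through the `ApplicationServices` module. The `accessibility_trusted` wrapper is a hypothetical helper, and it returns `None` wherever the bindings are unavailable.

```python
from typing import Optional


def accessibility_trusted() -> Optional[bool]:
    """Report whether this process may use the macOS accessibility API.

    Returns True or False on macOS with PyObjC installed, and None when
    the bindings cannot be imported (for example on another OS).
    """
    try:
        # Assumption: PyObjC exposes the HIServices function under this module.
        from ApplicationServices import AXIsProcessTrusted
    except ImportError:
        return None
    return bool(AXIsProcessTrusted())


status = accessibility_trusted()
if status is None:
    print("Accessibility bindings are not available on this system.")
elif status:
    print("Accessibility access granted.")
else:
    print("Not granted: enable this app under Privacy & Security > Accessibility.")
```

When a script like this is launched from a terminal, macOS typically attributes the grant to the terminal application rather than to the script itself, so that is the entry to look for in the list.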

This is a better security model than many alternatives. Browser extensions often request broad permissions in an install prompt that is easy to click through. Desktop accessibility access is a deliberate, visible, per-app grant. You know exactly which application has been granted control, and you can revoke it at any time.

Beyond Simple Automation

The real power is not just automating repetitive clicks. It is giving an AI agent the same interface you have. The agent can observe what is on screen, understand the context, make decisions, and take action - across your entire desktop environment. It turns your Mac into a platform the agent can operate, not just a terminal it can type into.
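The observe, decide, act cycle described above can be sketched as a small loop. Everything here is a simplified assumption: the screen is a plain dictionary, `decide` is a hard-coded stand-in for an LLM call, and `run_agent` is a hypothetical name, not Fazm's implementation.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Tuple[str, str, str]   # (verb, target label, argument)
Screen = Dict[str, str]         # element label -> current value


def observe(screen: Screen) -> str:
    """Render the current screen state as text for the decision step."""
    return "\n".join(f"{label}: {value!r}" for label, value in sorted(screen.items()))


def act(screen: Screen, action: Action) -> None:
    """Apply one action to the toy screen."""
    verb, target, arg = action
    if verb == "type":
        screen[target] = arg
    elif verb == "press":
        screen[target] = "pressed"
    else:
        raise ValueError(f"unknown verb: {verb}")


def decide(observation: str) -> Optional[Action]:
    """Stand-in for an LLM: fill the subject, then press Send, then stop."""
    if "Subject: ''" in observation:
        return ("type", "Subject", "Q3 report")
    if "Send: 'idle'" in observation:
        return ("press", "Send", "")
    return None  # nothing left to do


def run_agent(screen: Screen,
              policy: Callable[[str], Optional[Action]],
              max_steps: int = 10) -> List[Action]:
    """Loop: observe, ask the policy for an action, act; stop when it returns None."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = policy(observe(screen))
        if action is None:
            break
        act(screen, action)
        history.append(action)
    return history


screen = {"Subject": "", "Send": "idle"}
steps = run_agent(screen, decide)
print(steps)
```

The `max_steps` bound is a deliberate choice in this sketch: because the policy decides after every observation, a cap keeps a confused policy from acting on the desktop indefinitely.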

Fazm is an open source macOS AI agent, available on GitHub.
