Giving Claude Code Eyes and Hands with macOS Accessibility APIs
Claude Code is powerful inside the terminal, but most real work happens in native apps: Figma, Slack, browsers, email clients, spreadsheets. To bridge that gap, it needs a way to see and interact with the full macOS desktop.
That is exactly what macOS accessibility APIs do.
How Accessibility APIs Work as Agent Eyes
macOS applications expose their UI through the accessibility tree, a structured representation of the buttons, text fields, menus, and other elements on screen. Screen readers rely on the same system, and it turns out to suit AI agents well too.
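To make the idea of a tree concrete, here is a minimal sketch in Python. It does not call macOS; `AXNode` and `dump_tree` are hypothetical names invented for illustration, and the sample window is made up. The role strings (`AXWindow`, `AXTextField`, `AXButton`) follow the naming macOS uses for accessibility roles.

```python
from dataclasses import dataclass, field

@dataclass
class AXNode:
    """Simplified stand-in for one accessibility element (hypothetical type)."""
    role: str                      # e.g. "AXButton", "AXTextField"
    title: str = ""
    value: str = ""
    children: list = field(default_factory=list)

def dump_tree(node, depth=0):
    """Return one indented line per element, visiting the tree depth-first."""
    label = node.role
    if node.title:
        label += f' "{node.title}"'
    if node.value:
        label += f" = {node.value!r}"
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump_tree(child, depth + 1))
    return lines

# Made-up window, standing in for what a real app would expose.
window = AXNode("AXWindow", title="New Expense", children=[
    AXNode("AXTextField", title="Amount", value="42.50"),
    AXNode("AXButton", title="Submit"),
])

print("\n".join(dump_tree(window)))
```

An agent working from a dump like this sees roles, labels, and values directly, with no image interpretation step in between.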
When you connect Claude Code to an MCP server that wraps these APIs, it can:
- Read any app's UI state - see what is on screen without screenshots
- Click buttons and fill forms - interact with apps programmatically
- Navigate menus and dialogs - traverse the full accessibility tree
- Get structured data - not pixels, but actual text content and element types
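The capabilities above boil down to two steps: locate an element by role and label, then act on it. The sketch below simulates both against an in-memory tree of plain dictionaries. Everything in it (`find_element`, `fill_field`, `press_button`, the sample form) is hypothetical; a real server would instead go through the macOS accessibility functions such as `AXUIElementCopyAttributeValue` and `AXUIElementPerformAction`.

```python
# In-memory stand-in for an accessibility tree; a real server would read
# this from the macOS accessibility API instead.
form = {
    "role": "AXWindow", "title": "New Expense", "children": [
        {"role": "AXTextField", "title": "Amount", "value": ""},
        {"role": "AXButton", "title": "Submit", "pressed": False},
    ],
}

def find_element(node, role, title):
    """Depth-first search for the first element matching role and title."""
    if node.get("role") == role and node.get("title") == title:
        return node
    for child in node.get("children", []):
        match = find_element(child, role, title)
        if match is not None:
            return match
    return None

def fill_field(tree, title, text):
    """Set the value of the text field with the given title."""
    target = find_element(tree, "AXTextField", title)
    if target is None:
        raise LookupError(f"no text field titled {title!r}")
    target["value"] = text

def press_button(tree, title):
    """Mark the button as pressed (simulated; stands in for a real press action)."""
    target = find_element(tree, "AXButton", title)
    if target is None:
        raise LookupError(f"no button titled {title!r}")
    target["pressed"] = True

fill_field(form, "Amount", "42.50")
press_button(form, "Submit")
```

Because elements are addressed by role and title rather than by screen coordinates, the same calls keep working if the window moves or is resized.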
The MCP Server Approach
The mcp-server-macos-use project exposes macOS accessibility APIs as an MCP server. Claude Code connects to it like any other MCP tool, gaining the ability to traverse and interact with any application.
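Under the hood, an MCP client invokes a server tool with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request by hand to show its shape. The tool name `click_element` and its arguments are assumptions made for illustration; the actual tools exposed by mcp-server-macos-use may be named and parameterized differently, so check the project's own tool listing.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per call

def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# Hypothetical tool name and arguments, not taken from the real server.
request = tool_call("click_element", {
    "app": "Finder",
    "role": "AXButton",
    "title": "New Folder",
})

print(json.dumps(request, indent=2))
```

In practice Claude Code constructs and sends these messages itself once the server is registered; the sketch only makes the wire format visible.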
The key advantage over screenshot-based approaches is speed and reliability. Instead of taking a screenshot, sending it to a vision model, and hoping it interprets the pixels correctly, the agent reads the exact element hierarchy from the accessibility APIs. A button is a button, not a cluster of pixels that looks like one.
What This Enables
With accessibility APIs connected through MCP, Claude Code becomes a true desktop agent. It can file expenses in your company's internal tool, reorganize files in Finder, manage browser tabs, or fill out forms in native apps.
The limitation is that the app needs to implement accessibility properly. Most major macOS apps do, but some custom or Electron apps have incomplete accessibility trees.
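One way an agent could notice an incomplete tree is to measure how many interactive elements carry a usable label. The sketch below is a rough heuristic under stated assumptions: the tree is a plain dictionary, `label_coverage` and `looks_usable` are hypothetical helpers, and the 0.8 threshold is an arbitrary illustrative value, not a measured one.

```python
# Roles treated as interactive for this check; the list is illustrative.
INTERACTIVE_ROLES = {"AXButton", "AXTextField", "AXCheckBox", "AXMenuItem"}

def label_coverage(node):
    """Return (labeled, total) counts over interactive elements in the subtree."""
    labeled = total = 0
    if node.get("role") in INTERACTIVE_ROLES:
        total = 1
        if node.get("title") or node.get("description"):
            labeled = 1
    for child in node.get("children", []):
        child_labeled, child_total = label_coverage(child)
        labeled += child_labeled
        total += child_total
    return labeled, total

def looks_usable(tree, threshold=0.8):
    """True if enough interactive elements are labeled (threshold is a guess)."""
    labeled, total = label_coverage(tree)
    return total > 0 and labeled / total >= threshold

# Made-up app where two of three interactive elements have no label.
sparse_app = {"role": "AXWindow", "children": [
    {"role": "AXButton", "title": "Save"},
    {"role": "AXButton"},
    {"role": "AXGroup", "children": [{"role": "AXTextField"}]},
]}

print(label_coverage(sparse_app), looks_usable(sparse_app))
```

A low score is a signal that the agent will struggle to tell elements apart in that app, whatever it decides to do next.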
This is not about replacing your mouse. It is about letting an AI agent do the repetitive desktop tasks that take up your morning.
Fazm is an open source macOS AI agent, available on GitHub.