Giving Claude Code Eyes and Hands with macOS Accessibility APIs

Fazm Team · 2 min read
claude-code, accessibility-api, mcp, macos, desktop-agent, automation

Claude Code is powerful inside the terminal. But a lot of everyday work happens in desktop apps: Figma, Slack, browsers, email clients, spreadsheets. To bridge that gap, Claude Code needs a way to see and interact with the full macOS desktop.

That is exactly what macOS accessibility APIs do.

How Accessibility APIs Work as Agent Eyes

macOS applications expose their UI through an accessibility tree, a structured representation of the buttons, text fields, menus, and other elements on screen. This is the same system that screen readers such as VoiceOver use, and it turns out to be a good fit for AI agents too.
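To make the idea of a tree concrete, here is a minimal Python sketch of what an agent might see. It is a toy model, not the real macOS API: the `AXNode` class, the `dump` helper, and the "Expense Report" window are all made up for illustration. Only the role strings (`AXWindow`, `AXGroup`, `AXTextField`, `AXButton`) follow the naming macOS uses for accessibility roles.

```python
from dataclasses import dataclass, field


@dataclass
class AXNode:
    """Illustrative stand-in for one accessibility element (not a real macOS type)."""
    role: str                      # e.g. "AXButton", "AXTextField"
    title: str = ""                # human-readable label, if the app provides one
    value: str = ""                # current text content, if any
    children: list["AXNode"] = field(default_factory=list)


def dump(node: AXNode, depth: int = 0) -> list[str]:
    """Return one indented line per element, visiting the tree depth-first."""
    label = f"{node.role} {node.title!r}" if node.title else node.role
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


# Hypothetical window: a text field and a button inside a group.
window = AXNode("AXWindow", "Expense Report", children=[
    AXNode("AXGroup", children=[
        AXNode("AXTextField", "Amount", value="42.00"),
        AXNode("AXButton", "Submit"),
    ]),
])

print("\n".join(dump(window)))
```

The output is a short indented outline of roles and labels, which is far more compact for a model to read than a screenshot of the same window.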

When you connect Claude Code to an MCP server that wraps these APIs, it can:

  • Read any app's UI state - see what is on screen without screenshots
  • Click buttons and fill forms - interact with apps programmatically
  • Navigate menus and dialogs - traverse the full accessibility tree
  • Get structured data - not pixels, but actual text content and element types

The MCP Server Approach

The mcp-server-macos-use project exposes macOS accessibility APIs as an MCP server. Claude Code connects to it like any other MCP server, gaining the ability to traverse and interact with applications that expose an accessibility tree.
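Under the hood, an MCP client invokes a server's tools with JSON-RPC 2.0 messages using the `tools/call` method. The sketch below builds one such message in Python. The tool name `click_element` and its arguments are hypothetical placeholders. They are not taken from mcp-server-macos-use, so check the server's own tool listing for the real names and schema.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 'tools/call' message, the shape MCP uses to invoke a tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "click_element" and its arguments are hypothetical, not the server's real tool schema.
request = tool_call_request(
    1,
    "click_element",
    {"app": "Finder", "role": "AXButton", "title": "Back"},
)
print(request)
```

In practice Claude Code constructs and sends these messages for you. The point is only that each desktop action ends up as a small structured request rather than a guess about pixel coordinates.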

The key advantages over screenshot-based approaches are speed and reliability. Instead of taking a screenshot, sending it to a vision model, and hoping it interprets the pixels correctly, the agent gets the exact element hierarchy from the accessibility APIs. A button is reported as a button, not as a cluster of pixels that looks like one.
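A rough sketch of why structure helps: with roles and titles available, finding the right control is an exact match rather than an act of visual interpretation. The dictionary layout and the `walk` and `find` helpers below are assumptions made for this example, and a real server may return elements in a different shape.

```python
from typing import Iterator, Optional


def walk(node: dict) -> Iterator[dict]:
    """Yield every element in the tree, depth-first."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def find(root: dict, role: str, title: str) -> Optional[dict]:
    """Return the first element whose role and title both match exactly, or None."""
    for element in walk(root):
        if element.get("role") == role and element.get("title") == title:
            return element
    return None


# Hypothetical snapshot of a dialog. The window and a button share the title "Save",
# so the role is what tells them apart.
dialog = {
    "role": "AXWindow", "title": "Save",
    "children": [
        {"role": "AXButton", "title": "Cancel"},
        {"role": "AXButton", "title": "Save"},
    ],
}

target = find(dialog, "AXButton", "Save")
print(target)
```

Asking for a control that does not exist returns `None`, so the agent can report a clear failure instead of clicking on whatever looks closest.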

What This Enables

With accessibility APIs connected through MCP, Claude Code becomes a true desktop agent. It can file expenses in your company's internal tool, reorganize files in Finder, manage browser tabs, or fill out forms in native apps.

The limitation is that the app needs to implement accessibility properly. Most major macOS apps do, but some custom or Electron apps have incomplete accessibility trees.
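One way to spot an incomplete tree before relying on it is to look for interactive elements that carry no label at all. The sketch below is a heuristic under stated assumptions: the dictionary layout, the `unlabeled` helper, and the choice of which roles count as interactive are all illustrative, not part of any real tool.

```python
# Roles treated as interactive for this check (an illustrative, incomplete set).
INTERACTIVE_ROLES = {"AXButton", "AXTextField", "AXCheckBox", "AXPopUpButton"}


def unlabeled(node: dict, path: str = "") -> list[str]:
    """Return paths of interactive elements that expose neither a title nor a description."""
    here = f"{path}/{node['role']}"
    found = []
    if node["role"] in INTERACTIVE_ROLES and not (node.get("title") or node.get("description")):
        found.append(here)
    for child in node.get("children", []):
        found.extend(unlabeled(child, here))
    return found


# Hypothetical snapshot: one labeled button and one icon-only button with no label.
snapshot = {
    "role": "AXWindow", "title": "Settings",
    "children": [
        {"role": "AXButton", "title": "Apply"},
        {"role": "AXGroup", "children": [{"role": "AXButton"}]},
    ],
}

print(unlabeled(snapshot))
```

A long list of unlabeled controls suggests the app will be hard to drive through accessibility alone, and that a screenshot-based fallback may be needed for it.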

This is not about replacing your mouse. It is about letting an AI agent do the repetitive desktop tasks that take up your morning.


Fazm is an open source macOS AI agent, available on GitHub.
