Open Source MCP Server for macOS Accessibility Tree Control

Fazm Team · 2 min read
Tags: mcp, accessibility-api, macos, open-source, desktop-agent

We open sourced an MCP server that uses the macOS accessibility API to traverse the UI tree, take screenshots, and click elements. This gives AI agents the ability to control any native macOS application - not just browsers.

How It Works

The MCP server sits between an AI agent (like Claude) and macOS. When the agent needs to interact with an app, the flow is:

  1. Traverse - the server walks the accessibility tree of the frontmost application, returning a structured list of every UI element: buttons, text fields, menus, labels
  2. Screenshot - it captures what is on screen so the agent can see the current state visually
  3. Click - the agent picks an element by reference and the server clicks it using accessibility APIs
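The traverse and click steps can be sketched in a few lines of Python. This is a minimal model, not the server's actual code: `UIElement`, `traverse`, `click_point`, and the `e0`, `e1`, ... reference scheme are all hypothetical names, and the tree is built by hand rather than read from the macOS accessibility API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class UIElement:
    """Hypothetical stand-in for one node of an accessibility tree."""
    role: str                                  # e.g. "AXButton", "AXTextField"
    label: str                                 # the name the OS reports
    frame: Tuple[float, float, float, float]   # x, y, width, height in points
    children: List["UIElement"] = field(default_factory=list)


def traverse(root: UIElement) -> Dict[str, UIElement]:
    """Walk the tree depth-first and give every element a short reference."""
    refs: Dict[str, UIElement] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        refs[f"e{len(refs)}"] = node
        # Push children reversed so they are visited in their original order.
        stack.extend(reversed(node.children))
    return refs


def click_point(refs: Dict[str, UIElement], ref: str) -> Tuple[float, float]:
    """Return the centre of the referenced element, where a click would land."""
    x, y, w, h = refs[ref].frame
    return (x + w / 2, y + h / 2)


# A hand-built tree standing in for the frontmost application's window.
window = UIElement("AXWindow", "Settings", (0, 0, 800, 600), [
    UIElement("AXTextField", "Search", (20, 20, 300, 24)),
    UIElement("AXButton", "Save", (700, 550, 80, 30)),
])

refs = traverse(window)
for ref, element in refs.items():
    print(ref, element.role, element.label)

print(click_point(refs, "e2"))  # centre of the "Save" button: (740.0, 565.0)
```

The agent only ever handles the short references. Resolving a reference back to a position happens on the server side, which is why the agent never has to estimate where something is on screen.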

This is fundamentally different from pixel-based approaches. Instead of trying to find a button by matching pixels in a screenshot, the agent knows exactly where every element is, what it is called, and what type it is.

Why Accessibility APIs Beat Screenshots

Screenshot-based agents guess where to click. Accessibility-based agents know:

  • Element type - is it a button, a text field, a checkbox?
  • Element label - what does the OS call it?
  • Element state - is it enabled, focused, selected?
  • Exact coordinates - no pixel matching needed

This means higher reliability, fewer hallucinated clicks, and the ability to interact with elements that are visually identical but functionally different.
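To illustrate that last point, here is a rough sketch of how role, label, and state can separate elements that would look alike in a screenshot. The `ElementInfo` record and the `find_element` helper are assumptions made for this example, not part of the server's API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ElementInfo:
    """Hypothetical flattened record for one accessibility element."""
    ref: str
    role: str
    label: str
    enabled: bool


def find_element(elements: List[ElementInfo], role: str, label: str) -> Optional[ElementInfo]:
    """Return the single enabled element with this role and label.

    Returns None when nothing matches or when the match is ambiguous,
    so the caller never has to guess between candidates.
    """
    matches = [
        e for e in elements
        if e.role == role and e.label == label and e.enabled
    ]
    return matches[0] if len(matches) == 1 else None


# Two trash-icon buttons that could look identical on screen, plus a
# checkbox that happens to share a label with one of them.
elements = [
    ElementInfo("e3", "AXButton", "Delete account", enabled=False),
    ElementInfo("e4", "AXButton", "Delete draft", enabled=True),
    ElementInfo("e5", "AXCheckBox", "Delete draft", enabled=True),
]

print(find_element(elements, "AXButton", "Delete draft"))    # the e4 record
print(find_element(elements, "AXButton", "Delete account"))  # None: disabled
```

Returning `None` on an ambiguous or disabled match is a deliberate choice in this sketch: the agent can re-traverse or ask for clarification instead of clicking the wrong control.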

The Open Source Angle

Because the server is open source, anyone can extend it: add support for specific apps, build custom traversal logic, or integrate it into your own agent framework. The code handles the hard parts - TCC permissions, AXUIElement traversal, coordinate mapping - so you can focus on what your agent should do, not how to make it click things.
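As one example of what coordinate mapping can involve, the sketch below converts an element frame into pixel coordinates inside a screenshot. It assumes that frames are reported in screen points with a top-left origin and that the screenshot is captured at a known scale factor (2.0 on a typical Retina display). The function name and parameters are hypothetical, and the server's real mapping may differ.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height


def to_screenshot_pixels(
    frame: Rect,
    capture_origin: Tuple[float, float],
    scale: float,
) -> Tuple[int, int, int, int]:
    """Map a frame in screen points to pixels inside a captured image.

    capture_origin is the top-left corner of the captured region in
    screen points; scale is the number of pixels per point (assumed).
    """
    x, y, w, h = frame
    origin_x, origin_y = capture_origin
    return (
        round((x - origin_x) * scale),
        round((y - origin_y) * scale),
        round(w * scale),
        round(h * scale),
    )


# A button at (700, 550) in screen points, inside a window whose capture
# starts at (100, 50), on a display with 2 pixels per point.
print(to_screenshot_pixels((700, 550, 80, 30), (100, 50), 2.0))
# (1200, 1000, 160, 60)
```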

Getting Started

The server runs locally and communicates over the Model Context Protocol (MCP). Point your agent at it, grant it accessibility permission in System Settings (Privacy & Security > Accessibility), and your agent can control any app on your Mac.
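Under the hood, MCP messages are JSON-RPC 2.0, and a client invokes a server tool with the `tools/call` method. The sketch below builds such a request by hand to show its shape. The tool name `click_element` and its `ref` argument are assumptions for illustration, so check the server's own tool list for the real names.

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a single JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "click_element" and "ref" are hypothetical names, not the server's API.
line = tool_call_request(1, "click_element", {"ref": "e2"})
print(line)
```

In practice an MCP client such as Claude or an MCP SDK builds and sends these messages for you, so you rarely write them by hand.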

Fazm is an open source macOS AI agent, available on GitHub.

