How an MCP Server Lets Claude Control Any Mac App

Fazm Team · 2 min read

mcp-server-macos-use is an open source MCP (Model Context Protocol) server that lets Claude control Mac applications through the macOS accessibility APIs. In practice, it turns Claude into a desktop agent that can read and operate the apps you already have open.

How It Works

The MCP server connects to macOS accessibility APIs - the same system that powers VoiceOver and screen readers. It reads the accessibility tree of any running application, which gives Claude a structured view of every button, text field, menu item, and UI element on screen.
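To make "structured view" concrete, here is a minimal sketch of what an accessibility tree might look like once it has been read into plain data. The `AXNode` class, its fields, and the sample window are illustrative assumptions, not the server's actual types or output format; only the role names (`AXWindow`, `AXButton`, `AXTextField`) are standard macOS accessibility roles.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class AXNode:
    """Hypothetical, simplified stand-in for one accessibility element."""
    role: str                                         # e.g. "AXButton"
    title: str = ""                                   # human-readable label
    frame: Tuple[int, int, int, int] = (0, 0, 0, 0)   # x, y, width, height
    children: List["AXNode"] = field(default_factory=list)


def render_tree(node: AXNode, depth: int = 0) -> List[str]:
    """Flatten the tree into indented text lines, one per element."""
    lines = [f'{"  " * depth}{node.role} "{node.title}" @ {node.frame}']
    for child in node.children:
        lines.extend(render_tree(child, depth + 1))
    return lines


# Made-up sample window, for illustration only.
window = AXNode("AXWindow", "Untitled", (0, 0, 800, 600), [
    AXNode("AXTextField", "Search", (20, 20, 300, 24)),
    AXNode("AXButton", "Save", (700, 20, 80, 24)),
])

tree_lines = render_tree(window)
print("\n".join(tree_lines))
```

A compact text rendering like this is the kind of thing a language model can read directly, with each element's role, label, and position on one line.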

From there, Claude can:

  • Read what's on screen - get the full accessibility tree with element labels, roles, and positions
  • Click any element - target buttons, links, and controls by their accessibility reference
  • Type text - fill in forms, write in text editors, enter search queries
  • Navigate menus - open dropdown menus, select options, use keyboard shortcuts
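Under the hood, an MCP client invokes server tools by sending JSON-RPC 2.0 requests with the `tools/call` method. The sketch below builds such requests for the actions listed above. The tool names (`read_tree`, `click`, `type_text`) and their argument shapes are hypothetical placeholders; the real names are defined by the server and may differ.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical tool names and arguments, for illustration only.
requests = [
    tool_call("read_tree", {"app": "TextEdit"}),
    tool_call("click", {"element": "e12"}),
    tool_call("type_text", {"text": "hello"}),
]

for request in requests:
    print(json.dumps(request))
```

Claude never writes these messages by hand; the MCP client generates them when the model decides to use a tool.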

Why Accessibility APIs Beat Screenshots

Most computer-use agents rely on screenshots and vision models to understand what's on screen. That approach tends to be slow, expensive, and error-prone: every step spends tokens on image processing, and the model still has to correctly pick out a tiny button from raw pixels.

Accessibility APIs give you structured data directly. For each element an app exposes, the agent gets its role, its label, and its position, so it knows what the element is, where it is, and what it does. No guessing, no pixel matching, no vision model overhead.
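As a rough illustration of the difference, locating a button in structured data is a plain lookup by role and label rather than an image-recognition problem. The dictionary layout and the `find` and `center` helpers below are assumptions made for this sketch, not the server's real data format.

```python
from typing import Iterator, Optional, Tuple


def walk(node: dict) -> Iterator[dict]:
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def find(tree: dict, role: str, title: str) -> Optional[dict]:
    """Return the first element matching role and title, or None."""
    return next(
        (n for n in walk(tree) if n["role"] == role and n.get("title") == title),
        None,
    )


def center(frame: Tuple[int, int, int, int]) -> Tuple[int, int]:
    """Compute the midpoint of an (x, y, width, height) frame."""
    x, y, width, height = frame
    return (x + width // 2, y + height // 2)


# Made-up tree, for illustration only.
tree = {
    "role": "AXWindow", "title": "Untitled", "frame": (0, 0, 800, 600),
    "children": [
        {"role": "AXTextField", "title": "Search", "frame": (20, 20, 300, 24)},
        {"role": "AXButton", "title": "Save", "frame": (700, 20, 80, 24)},
    ],
}

save_button = find(tree, "AXButton", "Save")
click_point = center(save_button["frame"])
print(click_point)  # (740, 32)
```

The lookup is exact and costs no image tokens, which is the core of the argument for accessibility data over screenshots.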

Getting Started

Install the MCP server, grant it accessibility permissions in System Settings (Privacy & Security, then Accessibility), and add it to your Claude Code configuration. Once connected, Claude can interact with the apps on your Mac - Finder, Safari, Slack, your IDE, or anything else that exposes its UI through the accessibility APIs.
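For the configuration step, MCP clients such as Claude Code commonly register servers under an `mcpServers` key in a JSON file. The sketch below generates an entry of that shape. The server label `macos-use` and the binary path are placeholders chosen for this example; check the repo's README for the actual install location and any required arguments.

```python
import json

# Placeholder path: replace with wherever the server binary actually lives.
SERVER_BINARY = "/path/to/mcp-server-macos-use"

# Assumed layout: the common `mcpServers` shape used by MCP client configs.
config = {
    "mcpServers": {
        "macos-use": {              # arbitrary label chosen for this example
            "command": SERVER_BINARY,
            "args": [],
        }
    }
}

config_json = json.dumps(config, indent=2)
print(config_json)
```

Printing the JSON rather than writing a file keeps the sketch side-effect free; you would paste the output into your own configuration file.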

The entire project is open source, so you can inspect exactly what it does and modify it for your workflow.

Fazm is an open source macOS AI agent, available on GitHub.
