
Using MCP to Let AI Agents Control macOS via Accessibility APIs

Fazm Team · 2 min read
mcp · macos · accessibility · ghost-os · automation

The Model Context Protocol gives AI agents a standardized way to use tools. On macOS, the most powerful tool available is the accessibility API: the same system that powers VoiceOver and other assistive technologies. Wrap that API in an MCP server and any LLM-based agent can control your entire desktop.

How It Works

An MCP server exposes macOS accessibility functions as callable tools. The agent can query the UI tree of any running application, read element properties like labels and roles, click buttons, type into fields, select menu items, and navigate between windows. All through structured API calls, not pixel-level interactions.
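To make the shape of this concrete, here is a minimal sketch of the tool descriptors such a server might advertise in its `tools/list` response. The tool names, descriptions, and schemas are illustrative assumptions, not Fazm's actual API:

```python
# Hypothetical accessibility tools an MCP server could expose.
# Names and schemas are illustrative, not the real Fazm tool set.
ACCESSIBILITY_TOOLS = [
    {
        "name": "query_ui_tree",
        "description": "Return the accessibility tree of a running app",
        "inputSchema": {
            "type": "object",
            "properties": {"bundle_id": {"type": "string"}},
            "required": ["bundle_id"],
        },
    },
    {
        "name": "click_element",
        "description": "Perform a press action on a UI element",
        "inputSchema": {
            "type": "object",
            "properties": {"element_id": {"type": "string"}},
            "required": ["element_id"],
        },
    },
    {
        "name": "type_text",
        "description": "Type text into a text field",
        "inputSchema": {
            "type": "object",
            "properties": {
                "element_id": {"type": "string"},
                "text": {"type": "string"},
            },
            "required": ["element_id", "text"],
        },
    },
]

def list_tools() -> dict:
    """Shape of an MCP tools/list result: a list of tool descriptors."""
    return {"tools": ACCESSIBILITY_TOOLS}
```

Each descriptor gives the model a name, a human-readable description, and a JSON Schema for arguments, which is all the agent needs to decide when and how to call a tool.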

The key advantage is that accessibility APIs provide semantic information. The agent doesn't just see a rectangle at coordinates (340, 220); it sees "Send button, enabled, in Mail compose window." This makes tool selection reliable because the agent understands what elements actually are, not just where they appear on screen.
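A small sketch of what that semantic view looks like as data. The field names loosely mirror AX attributes like AXRole and AXTitle, but this is an illustrative model, not the real accessibility data structure:

```python
from dataclasses import dataclass

@dataclass
class UIElement:
    """Semantic view of a UI element, roughly as the accessibility
    API reports it. Illustrative model, not the actual AX types."""
    role: str      # e.g. "AXButton"
    title: str     # e.g. "Send"
    enabled: bool
    window: str    # title of the containing window

    def describe(self) -> str:
        """Render the element the way an agent would reason about it."""
        state = "enabled" if self.enabled else "disabled"
        kind = self.role.removeprefix("AX").lower()
        return f"{self.title} {kind}, {state}, in {self.window}"

send = UIElement(role="AXButton", title="Send",
                 enabled=True, window="Mail compose window")
print(send.describe())  # Send button, enabled, in Mail compose window
```

Compare that description with a bounding box of raw pixels: the semantic record survives window moves, resizes, and theme changes, which is why structured calls beat pixel-level interaction.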

Voice as the Input Layer

Once the agent can control macOS through MCP, voice becomes the natural input. You speak a command, the agent maps it to a sequence of MCP tool calls, and the tools execute against the accessibility API. The entire chain runs locally - speech-to-text on Apple Silicon, LLM inference for planning, and MCP execution for actions.

This creates a hands-free automation layer for your entire desktop. "Move yesterday's downloads into the project folder" becomes a voice command that triggers Finder operations through accessibility APIs. "Reply to Sarah's last email with the updated timeline" chains Mail operations together.
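In the real system the LLM produces the plan; as a sketch of its output shape, here is a hand-written stand-in that maps one of the commands above to a sequence of MCP tool calls. Tool names and element IDs are hypothetical:

```python
def plan_for(command: str) -> list[dict]:
    """Stand-in for the LLM planner: returns the ordered MCP tool
    calls (name + arguments) the agent would issue. In practice the
    model generates this; tool names and IDs here are illustrative."""
    if command == "Reply to Sarah's last email with the updated timeline":
        return [
            {"name": "query_ui_tree",
             "arguments": {"bundle_id": "com.apple.mail"}},
            {"name": "click_element",
             "arguments": {"element_id": "reply-button"}},
            {"name": "type_text",
             "arguments": {"element_id": "compose-body",
                           "text": "Here's the updated timeline."}},
            {"name": "click_element",
             "arguments": {"element_id": "send-button"}},
        ]
    return []  # unrecognized command: no plan
```

The point is the contract, not the logic: the planner emits structured tool calls, and the MCP server executes each one against the accessibility API.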

Why MCP Matters Here

Without MCP, every agent builds its own tool interface. With MCP, the accessibility layer is a shared resource that any compatible agent can use. Build the server once, and it works with Claude, local models, or whatever comes next. The protocol handles the plumbing so the agent can focus on understanding what you want.

Fazm is an open-source macOS AI agent, available on GitHub.
