14 Releases of an MCP Server for macOS Accessibility: What We Learned
Fourteen releases into building an MCP server for macOS accessibility control, the lessons are clear. The hard parts are not where you expect them. Apple's documentation covers the happy path. Production behavior is something you can only learn by shipping.
The API Surface Is Deceptively Small
Apple's Accessibility API surface area is manageable. A dozen functions in ApplicationServices.framework let you query elements, read attributes, perform actions. The basics - clicking a button, reading a text field, navigating a menu - work in an afternoon.
Production reliability takes months.
Every app implements accessibility differently. SwiftUI apps expose a clean, predictable element tree. AppKit apps vary wildly depending on how old they are and whether the developer ever used VoiceOver. Electron apps are a grab bag - some expose almost nothing, others expose detailed DOM-mapped trees. Chrome-based apps have their own accessibility layer that sometimes contradicts the macOS system view.
There is no way to know how an app behaves until you run automation against it. No documentation tells you that app X does not properly implement AXUIElementPerformAction for menu items, or that app Y returns stale element references after window resizing.
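Since these behaviors are only discovered by running against real apps, one way to keep track of them is a per-app quirk table that the server consults before choosing a code path. The sketch below is an illustration only: the flag names and bundle identifiers are hypothetical placeholders, not findings about any real app.

```swift
import Foundation

// Hypothetical flags for behaviors discovered by running automation against real apps.
struct AppQuirks: OptionSet {
    let rawValue: Int
    static let menuPressUnreliable = AppQuirks(rawValue: 1 << 0)
    static let staleRefsAfterResize = AppQuirks(rawValue: 1 << 1)
    static let sparseTree = AppQuirks(rawValue: 1 << 2)
}

// Placeholder bundle identifiers; replace with apps you have actually tested.
let knownQuirks: [String: AppQuirks] = [
    "com.example.AppX": [.menuPressUnreliable],
    "com.example.AppY": [.staleRefsAfterResize, .sparseTree],
]

// Unknown apps get no quirks, so callers fall through to the default code path.
func quirks(forBundleID bundleID: String) -> AppQuirks {
    return knownQuirks[bundleID] ?? []
}
```

A caller would check something like quirks(forBundleID: id).contains(.menuPressUnreliable) before deciding whether to press a menu item directly or fall back to another approach.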
The Sandbox Problem
One issue that trips up almost every new developer building macOS automation tools: the Accessibility API does not work inside an App Sandbox.
When App Sandbox is enabled, the accessibility permission prompt never appears, the app cannot be manually added in System Settings > Privacy & Security > Accessibility, and AXIsProcessTrusted() always returns false. AX calls then fail quietly: nothing crashes, and nothing happens in the target app.
This means distributing an automation-focused MCP server through the Mac App Store is either impossible or requires significant entitlement exceptions that Apple rarely grants. Most production automation tools ship as direct downloads, sideloaded with a Developer ID signature or unsigned.
// Always check this before making any AX calls
import ApplicationServices
func checkAccessibilityPermission() -> Bool {
    let options = [kAXTrustedCheckOptionPrompt.takeUnretainedValue() as String: true]
    return AXIsProcessTrustedWithOptions(options as CFDictionary)
}
If this returns false, every subsequent AX call will silently fail. The error codes are not always informative.
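Because a sandboxed build never shows the permission prompt, it can help to detect the sandbox at startup and say so explicitly instead of leaving the user to guess. The sketch below relies on a heuristic, not a documented API: sandboxed processes are commonly observed to have an APP_SANDBOX_CONTAINER_ID environment variable. The function names and message wording are hypothetical.

```swift
import Foundation

// Heuristic (an assumption, not documented API): sandboxed processes are
// commonly observed to carry APP_SANDBOX_CONTAINER_ID in their environment.
func isLikelySandboxed(
    environment: [String: String] = ProcessInfo.processInfo.environment
) -> Bool {
    return environment["APP_SANDBOX_CONTAINER_ID"] != nil
}

// Returns a human- and agent-readable explanation, or nil if AX calls should work.
func startupDiagnostic(trusted: Bool, sandboxed: Bool) -> String? {
    if sandboxed {
        return "Running inside App Sandbox: accessibility permission cannot be granted. "
            + "Use a non-sandboxed build."
    }
    if !trusted {
        return "Accessibility permission not granted. Enable it in "
            + "System Settings > Privacy & Security > Accessibility."
    }
    return nil
}
```

At launch this could be called as startupDiagnostic(trusted: checkAccessibilityPermission(), sandboxed: isLikelySandboxed()), with any non-nil result surfaced before the first tool call.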
What Broke Between Versions
v0.1.8: Finder sidebar navigation broke after a macOS update. Apple changed how Finder exposes sidebar items in what appeared to be a minor point release. No documentation, no changelog mention. We found out from user bug reports. The fix required reading the element tree differently based on the macOS version at runtime - an approach we now use in several places.
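Runtime version branching of this kind can be sketched as a small strategy selector. The strategy names and the cutoff version below are hypothetical placeholders; the post does not say which macOS release changed the Finder sidebar or how the two traversals differ.

```swift
import Foundation

// Hypothetical strategy names; the real traversal code is not shown in this post.
enum SidebarStrategy: Equatable {
    case legacyTraversal
    case currentTraversal
}

// Placeholder cutoff, not the release that actually changed Finder.
let sidebarCutoff = OperatingSystemVersion(majorVersion: 14, minorVersion: 0, patchVersion: 0)

// Compares versions component by component using tuple ordering.
func isAtLeast(_ v: OperatingSystemVersion, _ cutoff: OperatingSystemVersion) -> Bool {
    return (v.majorVersion, v.minorVersion, v.patchVersion)
        >= (cutoff.majorVersion, cutoff.minorVersion, cutoff.patchVersion)
}

// Defaults to the running system's version; tests can pass an explicit one.
func sidebarStrategy(
    for version: OperatingSystemVersion = ProcessInfo.processInfo.operatingSystemVersion
) -> SidebarStrategy {
    return isAtLeast(version, sidebarCutoff) ? .currentTraversal : .legacyTraversal
}
```

Taking the version as a parameter keeps the branch testable without needing several macOS installs. ProcessInfo also offers isOperatingSystemAtLeast(_:) if you only ever need to check the running system.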
v0.1.11: A memory leak that grew to 2GB.
AXUIElement references are Core Foundation objects. They need explicit CFRelease() calls or proper bridging to Swift's ARC. We were retaining element references in an LRU cache without releasing them. Over a few hours of heavy use, the server would grow to 2GB. The fix was implementing proper reference counting and setting a maximum cache size of 200 elements with TTL-based eviction.
import ApplicationServices
import Foundation
// Wrong: retains element without tracking
var cachedElement: AXUIElement?
// Right: explicit lifecycle management
class ElementCache {
    private var cache: [(key: String, element: AXUIElement, expiry: Date)] = []
    private let maxSize = 200
    private let ttl: TimeInterval = 0.5
    func set(_ element: AXUIElement, forKey key: String) {
        // Evict expired entries, plus any older entry stored under the same key
        let now = Date()
        cache.removeAll { $0.expiry < now || $0.key == key }
        // Evict the oldest entry if still at capacity
        if cache.count >= maxSize {
            cache.removeFirst()
        }
        cache.append((key: key, element: element, expiry: now.addingTimeInterval(ttl)))
    }
}
v0.1.13: Menu bar race condition.
Clicking a menu item sometimes had no effect. The menu would expand, but the click on the child item would register against the parent before the menu had finished opening. The root cause is that AXUIElementPerformAction(element, kAXPressAction) is asynchronous in practice - the UI update happens on the main thread, but the accessibility event fires before the animation completes.
The fix was adding a small delay after menu expansion - 50ms in most cases. Not elegant, but necessary. We tried using AXObserver callbacks to wait for the UI state to settle, but the events are not reliable enough for timing-critical interactions.
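The shape of that workaround can be sketched as a small helper that expands, waits, then presses. The helper name and its closure parameters are hypothetical. In a real server the closures would wrap calls such as AXUIElementPerformAction(menu, kAXPressAction as CFString); they are kept abstract here so the timing logic stands on its own.

```swift
import Foundation

// Runs `expand`, waits for the UI to settle, then runs `press`.
// Returns nil if expansion itself failed, so the caller can report that separately.
func pressAfterSettling<R>(
    settleDelay: TimeInterval = 0.05,  // 50ms, the default described above
    expand: () -> Bool,
    press: () -> R
) -> R? {
    guard expand() else { return nil }
    // Blocks the calling thread; call this off the main thread.
    Thread.sleep(forTimeInterval: settleDelay)
    return press()
}
```

Keeping the delay as a parameter makes it straightforward to raise it for apps with slower menu animations without touching the call sites that are happy with 50ms.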
The 500ms Cache TTL
The accessibility tree changes constantly. Every user interaction, every window resize, every background refresh can invalidate element references. Cache too aggressively and stale references cause crashes. Cache too conservatively and traversal is slow.
We benchmarked multiple TTL values across a set of common apps and settled on 500ms as a reasonable default. It is long enough that typical command sequences complete without re-traversal, but short enough that a user interaction does not leave the cache stale for long.
For high-frequency automation - keyboard shortcut sequences, rapid form filling - 500ms is too long and can cause "element no longer exists" errors. For these cases, we added a forceRefresh option that bypasses the cache entirely.
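A read path with a 500ms TTL and a forceRefresh escape hatch might look like the sketch below. The TTLCache type, its method names, and the injectable clock are assumptions for illustration, not the server's actual code. It also assumes a forced refresh overwrites the cached entry, which the post does not specify.

```swift
import Foundation

final class TTLCache<Value> {
    private var entries: [String: (value: Value, expiry: Date)] = [:]
    private let ttl: TimeInterval
    private let now: () -> Date  // injectable clock, so tests need not sleep

    init(ttl: TimeInterval = 0.5, now: @escaping () -> Date = { Date() }) {
        self.ttl = ttl
        self.now = now
    }

    // Returns the cached value if it is still fresh; otherwise calls `load`
    // and caches the result. `forceRefresh` skips the lookup entirely.
    func value(
        forKey key: String,
        forceRefresh: Bool = false,
        load: () throws -> Value
    ) rethrows -> Value {
        let current = now()
        if !forceRefresh, let entry = entries[key], entry.expiry > current {
            return entry.value
        }
        let fresh = try load()
        entries[key] = (value: fresh, expiry: current.addingTimeInterval(ttl))
        return fresh
    }
}
```

Here `load` stands in for the expensive tree traversal. This sketch is not thread-safe; a server handling concurrent requests would need a lock or an actor around it.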
Error Messages for Agents, Not Humans
The most underrated lesson: error messages need to serve agents, not humans.
When an element cannot be found, the naive message is "Button not found." That is useless for an LLM trying to recover. The useful message is: "Button 'Submit' not found in window 'Settings'. Window currently contains: Cancel, Apply, Reset, Help."
The agent now knows three things: the target element does not exist, what window it was looking in, and what elements are actually present. It can reason about whether to click one of the alternatives, ask for clarification, or report failure with a specific explanation.
func findButton(label: String, inWindow window: AXUIElement) throws -> AXUIElement {
    // getChildren and getAttribute are the server's own helpers, not shown here
    let children = try getChildren(of: window)
    let buttons = children.filter { element in
        (try? getAttribute(.role, from: element) as? String) == kAXButtonRole
    }
    guard let button = buttons.first(where: {
        (try? getAttribute(.title, from: $0) as? String) == label
    }) else {
        // Name the window and list what is actually there so the agent can recover
        let windowTitle = (try? getAttribute(.title, from: window) as? String) ?? "unknown"
        let availableLabels = buttons.compactMap {
            try? getAttribute(.title, from: $0) as? String
        }.joined(separator: ", ")
        throw AutomationError.elementNotFound(
            "Button '\(label)' not found in window '\(windowTitle)'. " +
            "Window currently contains: \(availableLabels)"
        )
    }
    return button
}
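The snippet above relies on an AutomationError type that the post does not show. A minimal sketch of what it might look like follows, together with a hypothetical notFoundMessage helper that builds the agent-oriented message from its parts; both are assumptions for illustration.

```swift
import Foundation

// Assumed shape: the snippet above only shows that the case carries a message.
enum AutomationError: Error, Equatable {
    case elementNotFound(String)
}

// Builds a message naming the target, the window searched, and the alternatives present.
func notFoundMessage(
    role: String,
    label: String,
    window: String,
    available: [String]
) -> String {
    let alternatives: String
    if available.isEmpty {
        alternatives = "Window contains no \(role.lowercased()) elements."
    } else {
        let list = available.joined(separator: ", ")
        alternatives = "Window currently contains: \(list)."
    }
    return "\(role) '\(label)' not found in window '\(window)'. \(alternatives)"
}
```

Handling the empty case separately avoids a message that ends in a bare colon, which gives an agent nothing to act on.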
Lessons for MCP Server Builders
Test with real apps, not synthetic UIs. Your test harness might have perfect accessibility support. Safari, Slack, VS Code, and Finder each have their own quirks that only appear in production use.
Ship often. Each release surfaces edge cases you did not know existed. Users run your server against apps you have never tested. Frequent releases build trust and surface issues faster than any internal testing regimen.
Assume the environment will change. macOS updates, app updates, and system preference changes can all break accessibility behavior silently. Version checks and graceful degradation matter more than they would in a more stable API environment.
The accessibility API is powerful but undersupported by most app developers. As more AI agents rely on it, there is growing pressure for apps to improve their accessibility implementations - which ultimately benefits screen reader users, keyboard users, and automation tools alike.
Fazm is an open source macOS AI agent, available on GitHub.