How to Set Up a Desktop AI Agent on Mac, Including the Third Permission Nobody Tells You About
Every tutorial online gives you the same answer: download the app, grant Screen Recording, you are done. That works for screenshot agents. It does not work for any agent that actually reads the UI tree. And on macOS 26 there is a silent failure mode on top of that where the Accessibility permission appears granted in System Settings but the agent cannot act. This page covers the real setup flow, the third permission, and the live probe Fazm uses to recover when TCC goes stale.
The two-permission answer is wrong
The generic setup guide for a Mac desktop AI agent is: download, open System Settings, toggle Screen Recording, toggle Microphone, relaunch. That is correct for agents that work by taking a screenshot, sending the image to a vision model, and clicking on pixel coordinates the model returns.
It is not correct for agents that read the real UI tree. Fazm, Agent! by Anthropic, and Manus all need a third permission called Accessibility. It has its own pane under Privacy & Security, separate from the Screen Recording pane, and granting Screen Recording does not grant Accessibility. Without it, every AXUIElement call the agent makes fails and returns no data.
The reason most setup guides skip Accessibility is that they are written for the screenshot class of agents. Those do not need the UI tree because they are operating on pixels. If you are picking an agent, the permission list tells you which architecture you are getting.
What each permission actually unlocks
The three permissions map to three different pipelines inside the agent. One permission, one pipeline. Missing one quietly disables that pipeline.
macOS permission to agent capability
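The one-permission, one-pipeline idea can be modelled in a few lines. This is only an illustrative sketch in Python (Fazm itself is a Swift app): the pipeline descriptions are taken from the setup steps on this page, while the `PIPELINES` table and the `disabled_pipelines` function are hypothetical names, not part of Fazm.

```python
# Hypothetical model of the permission-to-pipeline mapping described above.
# The pipeline descriptions come from this guide; the names are illustrative.
PIPELINES = {
    "Microphone": "voice dictation",
    "Screen Recording": "visual context",
    "Accessibility": "UI tree reading and synthetic input",
}


def disabled_pipelines(granted):
    """Return the pipelines that stay dark for a given set of granted permissions."""
    return [
        pipeline
        for permission, pipeline in PIPELINES.items()
        if permission not in granted
    ]


# The "two-permission" setup leaves the Accessibility pipeline disabled.
print(disabled_pipelines({"Microphone", "Screen Recording"}))
```

Nothing in this model raises an error when a permission is missing, which mirrors the point above: the pipeline is simply absent.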
The setup flow, in order
Order matters here. Granting Accessibility before the app has been launched once produces a TCC entry with no app icon, and some macOS builds get confused about matching the entry to the app bundle later. The safe order is launch first, grant second, relaunch third.
Five steps, start to finish
- Install: open the .dmg and drag the app into Applications
- Launch: let macOS quarantine-check once
- Grant: Microphone, Screen Recording, Accessibility
- Restart: quit and reopen after Accessibility
- Verify: read a real UI element
Detailed step-by-step
Download the signed .dmg and drag to Applications
Fazm is distributed as a notarized, code-signed .dmg. macOS will verify the signature on first open. No terminal commands required.
Launch the app once before touching System Settings
This registers the app's bundle ID with TCC so that when you later grant Accessibility, macOS has an actual bundle to bind the permission to. Granting before first launch sometimes creates orphan entries.
Grant Microphone in Privacy & Security, Microphone
Needed only if you want voice dictation. The app requests it the first time you tap the mic button, but you can pre-grant it here.
Grant Screen Recording in Privacy & Security, Screen Recording
Needed for the visual context channel. macOS will require you to quit and reopen the app after this toggle. Do that.
Grant Accessibility in Privacy & Security, Accessibility
This is the one most setup guides forget. Without it, UI tree reading and synthetic input both fail. The app prompts you with a deep link to this pane the first time it needs the permission.
Quit and reopen the app
macOS caches accessibility permission state per process. The freshly granted permission does not take effect in the currently running instance. A clean relaunch is the fastest way to flush the cache.
Test with a non-screenshot task
Ask the agent to read the title of the frontmost window, or to list the buttons in a Finder window. If it can only describe what the window looks like visually, Accessibility is not granted or has gone stale.
Skip the permission gauntlet
Fazm detects all three permission states on launch, walks you through the exact System Settings panes to open, and auto-reruns the check when you return from System Settings. Free to start, fully open source.
Download Fazm →
The hidden third failure mode: TCC cache drift on macOS 26
Here is the thing the setup guides never cover. On macOS 26 Tahoe, Apple moved accessibility permission checks behind a per-process cache. The first time a process calls AXIsProcessTrusted(), the result is cached for the lifetime of that process. That sounds like a performance optimization and it is, but it interacts badly with app updates, app re-signs, macOS upgrades, and any event that invalidates the TCC entry.
When the cache goes stale you end up in a state where the toggle in System Settings reads ON, the TCC database on disk has your app marked as allowed, and the running process still thinks it is not trusted. Every AX call fails silently. Most agents misreport this as "permission denied" and send you back to System Settings. The pane already looks correct, so you conclude the agent is broken.
The fix is to bypass the cache. Fazm does it with a probe that creates a live CGEvent tap. Event tap creation hits the live TCC database, not the cache. If the tap succeeds, the permission is actually granted and the cache is lying.
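The decision the probe makes can be written as a small truth table. The sketch below is in Python for brevity and is not Fazm's Swift code: the two booleans stand in for the cached `AXIsProcessTrusted()` answer and the outcome of creating a listen-only event tap, and the `AccessibilityState` enum and `classify` function are hypothetical names.

```python
from enum import Enum


class AccessibilityState(Enum):
    GRANTED = "granted"
    STALE_CACHE = "granted, but the per-process cache is stale"
    NOT_GRANTED = "not granted"


def classify(cached_trusted, tap_created):
    """Combine the cached answer with the live event-tap probe.

    cached_trusted: what AXIsProcessTrusted() reported in this process.
    tap_created:    whether creating a listen-only event tap succeeded.
    """
    if cached_trusted:
        return AccessibilityState.GRANTED
    if tap_created:
        # The live check succeeded, so the cached "no" is out of date.
        return AccessibilityState.STALE_CACHE
    return AccessibilityState.NOT_GRANTED
```

Only the middle case calls for a relaunch rather than a trip back to System Settings.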
What a stale-permission session actually looks like
If you want to see the failure mode without triggering it on your own machine, here is the sequence of log lines Fazm writes when it detects and recovers from a stale TCC entry. Every line has a real emitter in AppState.swift.
The numbers behind the recovery
Fazm does not loop forever. The retry window is deliberately small so the user gets a fast answer either way: three retries spaced 5 seconds apart, then a prompt to quit and reopen. These constants live in AppState.swift at the top of the accessibility block.
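A bounded retry of this kind might look like the sketch below. It is written in Python as an illustration, not copied from AppState.swift. The two values (3 retries, 5 seconds apart) are the ones quoted in the FAQ on this page; the constant names, the `probe` callable, and the choice to wait before each probe are assumptions.

```python
import time

# Values quoted in this guide's FAQ: 3 retries, 5 seconds apart.
MAX_RETRIES = 3
RETRY_INTERVAL_SECONDS = 5.0


def recover_or_give_up(probe, sleep=time.sleep):
    """Re-run `probe` up to MAX_RETRIES times.

    probe: hypothetical zero-argument callable, True once AX calls work again.
    Returns True if access recovered, False if the user should be asked
    to quit and reopen the app.
    """
    for _ in range(MAX_RETRIES):
        sleep(RETRY_INTERVAL_SECONDS)
        if probe():
            return True
    return False
```

Under these assumptions the worst case is 3 × 5 = 15 seconds of waiting before the restart prompt appears. The `sleep` parameter is injectable so the loop can be exercised without real delays.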
Why Accessibility, not Screen Recording, is the critical permission
Screen Recording gives the agent a picture. Accessibility gives the agent a sentence. Those are not the same thing.
When Fazm reads the UI with the Accessibility API, it gets a structured tree where every element has a role (button, text field, menu item), a title ("Send", "To:", "Save As..."), a position, a size, and a list of valid actions it can perform. A screenshot has none of that. A screenshot has pixels. The model has to re-recognize the UI from those pixels every single turn.
Accessibility API
Structured UI tree. Query by role and title. Call AXUIElementPerformAction directly. Deterministic, 5-50ms round trip.
Screen Recording
Raw pixels via ScreenCaptureKit. Useful for canvas apps (Figma, games) and visual context, but the model has to re-interpret the UI every frame.
Microphone
Optional. Enables voice dictation via Whisper. The agent runs fine without it if you prefer typing.
Apple Events (bonus)
Fazm's entitlements request com.apple.security.automation.apple-events too. Per-target-app AppleScript. Separate permission from Accessibility.
The live recovery flow
When the retry timer finishes without recovering, Fazm does not just log an error. The app shows a modal with a Quit and Reopen button, spawns a detached sh subprocess that sleeps 1 second then runs open <bundlePath>, then terminates the current process. The new process starts with a fresh TCC cache.
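The relaunch trick is easy to reproduce in any language. Here is a hedged Python sketch of the same idea: a detached `sh` that sleeps, then runs `open` on the bundle. The function names and the example path `/Applications/Fazm.app` are assumptions for illustration; the actual Swift implementation is not shown on this page.

```python
import subprocess


def relaunch_command(bundle_path, delay_seconds=1):
    """Build the argv for a shell that waits, then reopens the app bundle.

    The bundle path is passed as a positional argument ($0) instead of being
    pasted into the script, so spaces in the path need no extra quoting.
    """
    script = f'sleep {int(delay_seconds)} && open "$0"'
    return ["/bin/sh", "-c", script, bundle_path]


def spawn_relaunch(bundle_path):
    """Start the relaunch helper in its own session so it outlives this process."""
    return subprocess.Popen(
        relaunch_command(bundle_path),
        start_new_session=True,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


# Hypothetical install location, shown only to illustrate the argv shape.
print(relaunch_command("/Applications/Fazm.app"))
```

The one-second delay gives the current process time to exit before `open` runs, so macOS starts a new process instead of re-activating the old one.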
Stale-cache recovery
- AX call fails: apiDisabled or cannotComplete
- Probe event tap: bypass the per-process cache
- Retry 3x: 5 seconds apart
- Prompt restart: modal with Quit button
- Spawn sh relaunch: sleep 1 && open app
- Fresh process: cache is clean
When to suspect a stale permission instead of a broken agent
There are four signals that tell you the permission is stale, not missing, so you can skip re-granting and go straight to a relaunch.
- The toggle in System Settings, Privacy and Security, Accessibility is already on for the app.
- The agent worked yesterday and stopped working after you installed an app update or upgraded macOS.
- Other accessibility-aware apps (Raycast, BetterTouchTool) also started behaving oddly around the same time.
- The agent's logs show AXError.cannotComplete or AXError.apiDisabled instead of "permission denied".
In all four cases the answer is: quit the app, reopen it. If that does not work, toggle Accessibility off and on. If that still does not work, remove and re-add the app in the Accessibility pane.
17 skills installed on first launch
One thing that differentiates a consumer desktop agent setup from a developer framework setup is what happens between clicking Done and the first useful task. Fazm ships 17 bundled Claude skills (pdf, docx, xlsx, pptx, video-edit, frontend-design, canvas-design, doc-coauthoring, deep-research, travel-planner, web-scraping, social-autoposter, ai-browser-profile, google-workspace-setup, telegram, find-skills, plus onboarding helpers) and installs them to ~/.claude/skills/ on first launch with a SHA-256 checksum update loop.
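The page does not describe how the checksum update loop is structured, so the following is only a guess at the general technique, sketched in Python: hash each bundled file with SHA-256, compare against the installed copy, and copy only when the digests differ. The function names and the directory layout are assumptions.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def sync_skills(bundled_dir, installed_dir):
    """Copy every bundled file whose checksum differs from the installed copy.

    Returns the relative paths that were installed or updated.
    """
    bundled_dir, installed_dir = Path(bundled_dir), Path(installed_dir)
    changed = []
    for source in sorted(p for p in bundled_dir.rglob("*") if p.is_file()):
        relative = source.relative_to(bundled_dir)
        target = installed_dir / relative
        if target.is_file() and sha256_of(target) == sha256_of(source):
            continue  # Already up to date.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
        changed.append(relative.as_posix())
    return changed
```

Comparing digests rather than timestamps means a second run with nothing changed copies nothing. Note that this sketch overwrites an installed file whenever it differs from the bundled one, including when the difference is a local edit.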
That matters for setup because the first real task you give the agent does not have to be a generic one. You can open the onboarding flow, pick a skill ("help me plan a trip", "convert this .docx to PDF"), and it already has a working runtime for that task. No pip install. No MCP server wiring. The setup is done when the permissions are green.
Setup FAQ
What permissions do I actually need to run a desktop AI agent on Mac?
Three, not two. Microphone if you want voice control, Screen Recording for on-screen context, and Accessibility so the agent can read the UI tree and post input events. Most setup guides only mention the first two because Screen Recording is enough for screenshot-based agents. Real accessibility-driven agents like Fazm need the Accessibility toggle too, which is a separate System Settings pane at Privacy and Security, Accessibility.
Why does my agent stop working after a macOS or app update?
macOS 26 (Tahoe) caches accessibility permission state per process. After an app update or re-sign, the TCC database says you are granted but the cached answer in AXIsProcessTrusted() can still say no. The agent then silently refuses to act. The fix is a fresh quit and reopen, or a live probe that bypasses the cache. Fazm does this automatically via a CGEvent tap probe in AppState.swift.
How do I know whether my setup is actually working?
Run a task that requires reading a real UI element, not a screenshot. For example, ask the agent to read the title of the frontmost window, or to click a named button in Finder. If it can only describe the screen visually but cannot enumerate elements or read text field values, your accessibility permission is missing or stale.
What is TCC and why does it matter for setup?
TCC stands for Transparency, Consent, and Control. It is macOS's privacy database that tracks which apps have which permissions. Each permission (Accessibility, Screen Recording, Microphone) is a separate entry keyed by the app's code signature. When the signature changes, such as after an update from a different signing key, or after a macOS upgrade, TCC can quietly drop the permission and the agent starts failing without any visible error.
Does the Apple Events permission do the same thing as Accessibility?
No. Apple Events permission lets one app send AppleScript commands to a specific other app. It is per-target-app. Accessibility permission lets the agent read and write the UI tree of every running app at once. Fazm uses both: Apple Events for high-level commands to specific apps like Mail or Safari, Accessibility for generic UI reading and CGEvent-based input simulation.
How does Fazm detect that a permission silently went stale?
It calls AXIsProcessTrusted() first, then falls back to CGEvent.tapCreate() with listenOnly on mouseMoved events. If AXIsProcessTrusted() says no but tap creation succeeds, the permission is actually granted and the cache is lying. If AX calls return AXError.cannotComplete it also tries Finder (bundle ID com.apple.finder) as a control target to rule out app-specific AX incompatibility. This logic is in probeAccessibilityViaEventTap() and confirmAccessibilityBrokenViaFinder() in Desktop/Sources/AppState.swift.
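The Finder control check boils down to comparing two outcomes. A minimal Python sketch of that comparison follows. The error names match the AXError cases mentioned on this page, but the `diagnose` function, its string results, and the choice to treat `apiDisabled` as broken without consulting the control are assumptions, not Fazm's actual code.

```python
def diagnose(target_error, finder_error):
    """Interpret an AX failure using Finder as a control target.

    Each argument is the AXError name from a call against that app,
    or None if the call succeeded.
    """
    if target_error is None:
        return "ok"
    if target_error == "cannotComplete" and finder_error is None:
        # Finder answers, so Accessibility works; the target app is the problem.
        return "app-specific incompatibility"
    return "accessibility broken"
```

The control target matters because `cannotComplete` alone is ambiguous: it can mean the permission is stale, or simply that the target app does not answer AX requests.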
What do I do if System Settings shows the toggle is on but the agent still fails?
Three options. One, quit and reopen the app (fastest, works 90% of the time). Two, toggle the Accessibility switch off and back on while the app is running. Three, remove and re-add the app using the plus and minus buttons in System Settings, Privacy and Security, Accessibility. Fazm will prompt you with a Quit and Reopen button after 3 failed retries spaced 5 seconds apart.
Is Fazm a developer framework or a consumer app?
Consumer app. You download a signed macOS .dmg, drag it to Applications, open it, and step through onboarding. No terminal, no Python environment, no API key required to start. It bundles Claude skills that install to ~/.claude/skills on first launch so existing Claude Code users get their skill library for free. Source is public under Desktop/Sources if you want to audit.
Ready to set it up?
Three toggles, one relaunch, a live TCC probe if anything goes sideways. Fully open source so you can audit the probe logic before you grant anything.
Download Fazm for Mac