A local AI that can use your other apps, not just run on your Mac
Search results for "local host AI" all stop at the same place: here is how to download Ollama, here is an M-series quant, here is your localhost:11434 endpoint. What happens after you have a running model on your host? Can it read the email you have open? Can it click the calendar button? The honest answer in every one of those tutorials is no, not out of the box. Fazm is built around the other half. Its testAccessibilityPermission function runs a three-stage probe of the macOS accessibility stack so the model on your local host can actually reach the apps running next to it.
“testAccessibilityPermission (AppState.swift, lines 431-463) runs AXUIElementCopyAttributeValue on the frontmost app's PID, treats apiDisabled as unambiguous failure, on cannotComplete falls back to confirmAccessibilityBrokenViaFinder (lines 468-485), and finally to probeAccessibilityViaEventTap (lines 490-504) which calls CGEvent.tapCreate to bypass the stale per-process TCC cache on macOS 26 Tahoe. Retry every 5 seconds via schedulAccessibilityRetry. On confirmed failure, an NSAlert asks the user to Quit & Reopen, and relaunchApp runs /bin/sh -c 'sleep 1 && open <bundlePath>' before terminating.”
Desktop/Sources/AppState.swift, Fazm open source
What the top SERP results actually give you
Every one of these is useful. Every one of them stops at model-on-disk. None of them answers the question: can this local AI use the app I have open right now?
The difference a real accessibility tree makes
Screenshot-based agents paint every screen, send it upstream, and ask a vision model to point at the button. Accessibility-based agents ask macOS for the widget tree the OS already maintains for VoiceOver, and target elements by role and description. The difference shows up in latency, cost, and correctness.
What the local host asks the OS
The Swift function that decides whether the local host can act
The function itself is short, and the only thing the model's agent loop needs to know is the boolean that falls out of it. Everything Fazm does beyond chat (clicking, typing, reading apps) gates on this check returning true.
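Condensed, the decision flow looks like this. This is a sketch reconstructed from the behavior described in this article, not the verbatim Fazm source; the helper names mirror the stages detailed below, and their bodies here are placeholders.

```swift
enum ProbeResult { case healthy, apiDisabled, ambiguous }

// Placeholder stubs for the three probes detailed in the next section.
func frontmostAXProbe() -> ProbeResult { .healthy }
func finderIsRunningAndFails() -> Bool { false }
func trustedViaEventTap() -> Bool { true }

// The skeleton of the check: one boolean out, three probes behind it.
func accessibilityLooksHealthy() -> Bool {
    switch frontmostAXProbe() {                     // stage 1: live AX call
    case .healthy:
        return true
    case .apiDisabled:
        return false                                // unambiguous failure
    case .ambiguous:
        if finderIsRunningAndFails() { return false } // stage 2: Finder tie-break
        return trustedViaEventTap()                 // stage 3: live TCC check
    }
}
```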
The five-stage recovery path when macOS gets confused
A one-shot call to AXIsProcessTrusted is what most tutorials write. That is not enough on modern macOS because TCC lies. Fazm wraps the probe in a five-stage loop that survives the common failure modes the user never thinks about.
Stage 1. Live AX call against the frontmost app
testAccessibilityPermission grabs the current NSWorkspace frontmostApplication, wraps its PID with AXUIElementCreateApplication, and calls AXUIElementCopyAttributeValue(appElement, kAXFocusedWindowAttribute, ...). Four return codes count as healthy: success, noValue, notImplemented, attributeUnsupported. AppState.swift lines 433-447.
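A minimal sketch of that stage-1 probe, using the AX calls named above (function name here is illustrative, not the Fazm source):

```swift
import AppKit
import ApplicationServices

// Probe the frontmost app's focused window. "Healthy" means the app
// answered at all, even if it has nothing focused or does not support
// the attribute.
func probeFrontmostApp() -> Bool {
    guard let front = NSWorkspace.shared.frontmostApplication else { return false }
    let element = AXUIElementCreateApplication(front.processIdentifier)
    var focused: CFTypeRef?
    let err = AXUIElementCopyAttributeValue(element,
                                            kAXFocusedWindowAttribute as CFString,
                                            &focused)
    return err == .success || err == .noValue
        || err == .notImplemented || err == .attributeUnsupported
}
```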
Stage 2. Disambiguate cannotComplete against Finder
If the AX call returns cannotComplete, the permission might be broken, or the frontmost app might just not implement AX (Qt apps, OpenGL apps, PyMOL). Fazm re-runs the same AX probe against com.apple.finder, a known AX-compliant app. Only if Finder also fails is the permission declared truly broken. AppState.swift lines 468-485.
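A sketch of the Finder tie-break, assuming the behavior described above (the helper name is illustrative):

```swift
import AppKit
import ApplicationServices

// Re-run the same AX probe against Finder, a known AX-compliant app.
// Returns true only when the permission itself is provably broken.
func permissionTrulyBroken() -> Bool {
    guard let finder = NSRunningApplication
        .runningApplications(withBundleIdentifier: "com.apple.finder")
        .first else { return false } // Finder not running: stay ambiguous
    let element = AXUIElementCreateApplication(finder.processIdentifier)
    var value: CFTypeRef?
    let err = AXUIElementCopyAttributeValue(element,
                                            kAXFocusedWindowAttribute as CFString,
                                            &value)
    // If even Finder cannot answer, blame the permission, not the app.
    return err == .cannotComplete || err == .apiDisabled
}
```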
Stage 3. CGEvent tap probe for macOS 26 Tahoe TCC cache
If Finder is not running, or as a final tie-break, Fazm calls CGEvent.tapCreate with .cgSessionEventTap and .listenOnly. Unlike AXIsProcessTrusted, tap creation checks the live TCC database, bypassing the per-process cache that goes stale on macOS 26 when the user toggles the Accessibility switch. AppState.swift lines 490-504.
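The tap probe can be sketched like this (reconstructed from the article's description, not the Fazm source):

```swift
import CoreGraphics

// Creating an event tap forces a live TCC lookup, so a non-nil port
// means the process is genuinely trusted right now.
func trustedViaEventTap() -> Bool {
    guard let port = CGEvent.tapCreate(
        tap: .cgSessionEventTap,
        place: .headInsertEventTap,
        options: .listenOnly,
        eventsOfInterest: CGEventMask(1 << CGEventType.keyDown.rawValue),
        callback: { _, _, event, _ in Unmanaged.passUnretained(event) },
        userInfo: nil
    ) else { return false }
    // Only the yes/no matters: tear the tap down immediately.
    CGEvent.tapEnable(tap: port, enable: false)
    CFMachPortInvalidate(port)
    return true
}
```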
Stage 4. Retry timer, not a one-shot
The check does not run once and give up. A Timer fires every 5 seconds with a maxAccessibilityRetries budget. On every failed tick the state flips to isAccessibilityBroken, the floating UI shows a hint, and on retry success it clears automatically. AppState.swift lines 301-400.
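The retry loop has roughly this shape. A sketch only: the class and property names besides isAccessibilityBroken are illustrative, and `probe` stands in for the three-stage check.

```swift
import Foundation

// A 5-second retry timer with a bounded budget (6 ticks ≈ 30 seconds).
final class AccessibilityRetry {
    private(set) var isAccessibilityBroken = false
    private var retriesLeft: Int
    private var timer: Timer?

    init(budget: Int = 6) { retriesLeft = budget }

    func start(probe: @escaping () -> Bool, onGiveUp: @escaping () -> Void) {
        timer = Timer.scheduledTimer(withTimeInterval: 5, repeats: true) { [weak self] t in
            guard let self = self else { return }
            if probe() {
                self.isAccessibilityBroken = false  // clears the floating-UI hint
                t.invalidate()
            } else {
                self.isAccessibilityBroken = true
                self.retriesLeft -= 1
                if self.retriesLeft == 0 { t.invalidate(); onGiveUp() }
            }
        }
    }
}
```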
Stage 5. Forced relaunch when macOS lies
On macOS Sequoia and Tahoe there is a failure mode where AXIsProcessTrustedWithOptions returns true but AX calls still fail. Fazm catches this, pops an NSAlert titled Accessibility Permission Needs Restart, and when the user clicks Quit & Reopen it spawns open "<bundlePath>" via /bin/sh and terminates. AppState.swift lines 403-429.
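The relaunch itself can be sketched in a few lines, following the shell command the article quotes (this is an assumption-laden reconstruction, not the Fazm source):

```swift
import AppKit

// A detached shell re-opens the bundle after the old process, and its
// stale per-process TCC cache, has exited.
func relaunchApp() {
    let bundlePath = Bundle.main.bundlePath
    let task = Process()
    task.executableURL = URL(fileURLWithPath: "/bin/sh")
    task.arguments = ["-c", "sleep 1 && open \"\(bundlePath)\""]
    try? task.run()          // fire and forget; the sleep outlives us
    NSApp.terminate(nil)
}
```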
Why a CGEvent tap is the only probe that does not lie
AXIsProcessTrusted uses the per-process TCC cache. On macOS 26 Tahoe, that cache can report you as trusted while the actual AX calls keep failing. Creating a CGEventTap forces the kernel to re-read live TCC state for the calling process. If the tap handle comes back non-nil, you are genuinely trusted right now. Fazm invalidates it immediately; it only cares about the yes or no.
What the local host can actually do once the probe returns true
Every capability below is a direct consequence of reading the Accessibility tree instead of pixels. The OS already maintains this graph for VoiceOver users, so the agent is piggybacking on twenty years of Apple's work rather than asking a vision model to recreate it.
Read the focused window tree
AXUIElementCopyAttributeValue(kAXFocusedWindowAttribute) returns the actual widget hierarchy with text, roles, and bounds. That is not a screenshot; it is the accessibility graph the OS already maintains for VoiceOver.
Click a button by role, not by pixel
A click targets an element with role AXButton and a specific AXDescription. Resolution-independent, layout-change-resilient, works in dark mode, works on a dragged window.
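A sketch of what a role-based click looks like against the AX tree. The traversal here is simplified for illustration and is not the Fazm source:

```swift
import ApplicationServices

// Depth-first search for an AXButton with a matching AXDescription,
// then press it via the AX action API instead of synthesizing a click.
func pressButton(in element: AXUIElement, described desc: String) -> Bool {
    var kids: CFTypeRef?
    guard AXUIElementCopyAttributeValue(element,
                                        kAXChildrenAttribute as CFString,
                                        &kids) == .success,
          let children = kids as? [AXUIElement] else { return false }
    for child in children {
        var role: CFTypeRef?, label: CFTypeRef?
        AXUIElementCopyAttributeValue(child, kAXRoleAttribute as CFString, &role)
        AXUIElementCopyAttributeValue(child, kAXDescriptionAttribute as CFString, &label)
        if role as? String == kAXButtonRole, label as? String == desc {
            return AXUIElementPerformAction(child, kAXPressAction as CFString) == .success
        }
        if pressButton(in: child, described: desc) { return true }
    }
    return false
}
```

Because the target is (role, description) rather than (x, y), the same call keeps working after a resize, a theme change, or a window drag.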
Read a field without OCR
Text fields expose AXValue directly. No Tesseract, no GPT-4V round trip, no screenshot upload. The OS hands you the string the user is looking at.
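Reading that value is a single AX call (illustrative helper, not the Fazm source):

```swift
import ApplicationServices

// The OS hands back the string directly; no capture, no OCR.
func stringValue(of field: AXUIElement) -> String? {
    var value: CFTypeRef?
    guard AXUIElementCopyAttributeValue(field,
                                        kAXValueAttribute as CFString,
                                        &value) == .success else { return nil }
    return value as? String
}
```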
Inspect Finder, Xcode, Numbers
Apple's own apps are fully AX-instrumented. The agent reads cells, rows, selected items, the open folder path, the currently opened file.
Fall back to capture only when useful
If an app is Qt or Electron with a gnarled AX tree, ScreenCaptureManager.captureAppWindow grabs just that window (via CGWindowListCreateImage with .optionIncludingWindow) and hands it to a vision model. That is the exception, not the default.
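The window-scoped capture can be sketched as below. The window ID would come from CGWindowListCopyWindowInfo; this is an illustrative fragment, not ScreenCaptureManager itself:

```swift
import CoreGraphics

// Capture exactly one window, not the whole screen. A null rect means
// "use the window's own bounds".
func captureWindow(_ windowID: CGWindowID) -> CGImage? {
    CGWindowListCreateImage(.null,
                            .optionIncludingWindow,
                            windowID,
                            [.boundsIgnoreFraming])
}
```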
Survive macOS 26 Tahoe's new TCC
The per-process TCC cache in Tahoe can flag you as trusted while actual AX calls return cannotComplete. Fazm catches the mismatch and self-heals with a relaunch instead of silently doing nothing.
What the health check looks like on a real Mac
When Fazm boots, it logs a line per probe stage to /tmp/fazm.log. Grep for ACCESSIBILITY_CHECK and you can watch the state machine make up its mind.
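On a real install you would grep /tmp/fazm.log directly; the snippet below fakes a log so the filter is runnable anywhere. The exact line format is an assumption for illustration, not the verbatim Fazm output.

```shell
# Simulate a few per-stage probe lines, then filter the way you would
# filter the real log: grep ACCESSIBILITY_CHECK /tmp/fazm.log
cat > /tmp/fazm-demo.log <<'EOF'
12:00:01 ACCESSIBILITY_CHECK stage=1 target=frontmost result=cannotComplete
12:00:01 ACCESSIBILITY_CHECK stage=2 target=com.apple.finder result=success
12:00:01 INFO floating control bar attached
12:00:01 ACCESSIBILITY_CHECK verdict=healthy
EOF

grep ACCESSIBILITY_CHECK /tmp/fazm-demo.log
```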
Fazm's local host versus a classic local LLM stack
This is not a model benchmark. Both sides can run any model you point them at. The difference is what happens between the model and the rest of your Mac.
| Feature | Ollama / LM Studio / Foundry Local | Fazm |
|---|---|---|
| What you run on your local host | A Docker stack or a Python venv running an LLM server | A signed consumer Mac app that hosts the agent on your box |
| How the local AI reads other apps | Typically nothing. The model sits in a REPL or a web UI | macOS Accessibility API (AXUIElement*) reading the real widget tree |
| Vision fallback | N/A or full-screen screenshot on every step | ScreenCaptureKit / CGWindowListCreateImage, only when you ask |
| Scope of apps the AI can reach | None by default; browser-only via Playwright if wired up | Any running app on your Mac, not just the browser |
| Permission discovery | AXIsProcessTrusted() one-liner that lies on macOS 26 | 3-stage probe: AX call + Finder tie-break + CGEventTap |
| Recovery when macOS TCC cache is stale | You restart the terminal and hope | NSAlert + relaunch-via-open so the new process reads fresh TCC |
| Shape of the deliverable | README, Makefile, brew install, pip install, docker compose up | DMG, signed and notarized, App Store style install |
Two failure modes Fazm refuses to swallow
A local host AI is useful only to the degree you can trust it to tell you when it has stopped working. Two states that every tutorial glosses over:
Permission stuck after toggling
How Fazm recovers
- Retry the AX call every 5 seconds for 30 seconds
- If still stuck, show an NSAlert with Quit & Reopen
- On Quit & Reopen, spawn /bin/sh -c 'sleep 1 && open <bundlePath>'
- New process reads fresh TCC, probe passes, agent works
App does not implement AX at all
How Fazm recovers
- AX call returns cannotComplete for this app only
- Finder re-check passes, so permission is fine
- Fall back to CGWindowListCreateImage to capture the window
- Route pixels to a vision model for that one action, AX still handles the rest
Put the local host AI on your Mac
Fazm runs on your machine, talks to your apps through the accessibility layer Apple built for VoiceOver, and asks for screen capture only when AX is not enough. Free, open source, signed and notarized.
Download Fazm →
Local host AI, answered against the source
What does 'local host AI' actually mean, and how is that different from a local model?
A local model means a neural net running on your hardware (Ollama, LM Studio, MLX, Foundry Local). A local host AI is the model plus everything around it: permissions, IO, and the ability to reach other apps on the same machine. If your model is local but it cannot read what is in Mail or type into Keynote, then your local host AI is a fancy chat window. Fazm's answer is that the local model calls a Mac-native host that talks to the OS through Accessibility APIs.
Why use macOS Accessibility APIs instead of screenshots like computer-use models do?
Three reasons. First, the AX tree is a structured string graph; screenshots are pixels that have to be re-read by a vision model on every turn. Second, the AX tree already has roles (AXButton, AXTextField, AXStaticText) so the agent can target elements semantically instead of guessing what a button looks like. Third, it is cheap: one AX call is a local function, a screenshot is a PNG plus an inference round trip. Fazm uses screenshots only as a fallback for apps that do not expose AX properly (Qt, some Electron builds).
Where in the Fazm source can I see this permission probe?
Desktop/Sources/AppState.swift in the open-source Fazm repo. Function names: testAccessibilityPermission (lines 433-463), confirmAccessibilityBrokenViaFinder (lines 468-485), probeAccessibilityViaEventTap (lines 490-504). The public retry timer that wraps them all is schedulAccessibilityRetry (lines 376-400). Everything is MIT licensed.
What is the macOS 26 Tahoe cache bug Fazm works around?
On Tahoe, each process has a cached view of which TCC permissions it owns. When the user toggles Accessibility for Fazm in System Settings while Fazm is running, AXIsProcessTrustedWithOptions starts returning true immediately, but the live AX calls continue returning cannotComplete or apiDisabled until Fazm restarts. Fazm sidesteps the cache entirely by testing with a real AX call and, if that is ambiguous, a CGEvent.tapCreate call which checks the live TCC database. If both come back bad it asks the user to relaunch.
Does Fazm send anything from my screen to a cloud?
By default, no. The AX tree never leaves your Mac. When you explicitly ask for screen context, ScreenCaptureManager uses CGWindowListCreateImage to capture the target window and feeds it into the model. You can see the captures in /tmp/fazm.log. If you pair Fazm with a local-inference backend like Ollama or LM Studio, even that round trip stays on your host.
Which apps does this actually work with?
Any app that implements AX. That includes Apple's whole family (Finder, Mail, Notes, Calendar, Messages, Xcode, Numbers, Keynote, Pages, System Settings), Safari and Chrome (including extension UIs), native Electron apps when they remember to set accessibility flags, Slack, Discord, Notion, Linear, Figma, Zoom, VS Code, Terminal, iTerm, and so on. The floating control bar detects the frontmost app and pulls its window tree regardless. Qt and raw SDL apps are the usual misses.
How is this different from just running Ollama plus a Python script?
Ollama gives you a model endpoint on localhost:11434. It does not know anything about your running apps. You could wire pyatspi or pyobjc on top, deal with the AX framework in Python, negotiate TCC permissions per process, and build a retry loop for macOS 26 Tahoe. Fazm does all of that in Swift and ships it as a signed DMG. The model half is interchangeable; Fazm talks to Claude by default but it can point at any local endpoint. The host half is the unique work.
What happens if I deny Accessibility permission?
Fazm keeps running but the agent loses its hands. You can still chat and the app can still render screen captures, but it cannot click, type, or read widget text. A floating hint appears in the control bar and a settings pane has a one-click tccutil reset flow that triggers an NSAlert and relaunches the app cleanly. The code path is resetAccessibilityPermissionDirect in AppState.swift line 544.
Is this open source so I can audit the claims?
Yes. Fazm's desktop app is MIT-licensed at github.com/mediar-ai/fazm. Every file and line number cited on this page exists in that repo as of 2026-04-19. The three functions that form the three-stage permission probe are all in Desktop/Sources/AppState.swift. The screen capture fallback is in Desktop/Sources/FloatingControlBar/ScreenCaptureManager.swift. The floating UI that reacts to the isAccessibilityBroken flag is next to it.