Local AI chatbot, the second definition every guide skips
The top search results for this keyword agree that "local" means the model weights run on your Mac. Ollama, LM Studio, Jan, Locally AI, Enclave AI, GPT4All. Every one of those is a good answer to the question "can an LLM run on my laptop." None of them answer the more useful question: can a chatbot read what is on my screen right now without bouncing pixels through a vision model. Fazm does. It calls `AXUIElementCopyAttributeValue` on whatever app you have focused and walks the macOS accessibility tree, so the chatbot sees your Mail, your Numbers sheet, your Cursor file, or your Figma canvas as structured text instead of a JPEG. The one line of Swift that does it is pinned below.
“A Fazm query does not start with a screenshot. It starts with AXUIElementCopyAttributeValue(appElement, kAXFocusedWindowAttribute as CFString, &focusedWindow) at AppState.swift line 441. That one Apple framework call returns a handle to the real focused window of whichever app you are in; descending that handle yields a tree of roles, titles, values, and children. The chatbot sees structured UI, not pixels.”
Desktop/Sources/AppState.swift, line 441
The anchor fact: how Fazm reads your screen without looking at it
Every other Mac chatbot that advertises "sees your screen" does one of two things. Either it prompts you to paste text into its window, or it takes a screenshot and feeds the JPEG to a vision-capable model. Fazm does neither by default. It reaches into the focused app through the same OS API that VoiceOver uses. The core of it is two Apple framework calls: `AXUIElementCreateApplication` and `AXUIElementCopyAttributeValue`.
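A minimal reconstruction of that call sequence, built only from the identifiers this guide quotes (it is not Fazm's actual source, just the shape of the probe the article describes):

```swift
import AppKit
import ApplicationServices

// `frontApp` is the app you were in before invoking the chat bar,
// not the chat bar itself.
guard let frontApp = NSWorkspace.shared.frontmostApplication else { exit(1) }

// Build a live AX handle to that app.
let appElement = AXUIElementCreateApplication(frontApp.processIdentifier)

// The one call this article is built around: ask the app for its
// focused window, as an accessibility element rather than pixels.
var focusedWindow: CFTypeRef?
let result = AXUIElementCopyAttributeValue(
    appElement,
    kAXFocusedWindowAttribute as CFString,
    &focusedWindow
)

if result == .success, let window = focusedWindow {
    // From here, kAXChildrenAttribute, kAXTitleAttribute, and
    // kAXValueAttribute give you the tree as structured data.
    print("Got focused window element: \(window)")
}
```

Running this requires macOS and a one-time Accessibility grant in System Settings; without the grant, the call returns an AXError instead of `.success`.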
`frontmostApplication` is the app you were in a moment before you invoked Fazm, not Fazm itself. `AXUIElementCreateApplication` builds a live handle to that app. `AXUIElementCopyAttributeValue` with `kAXFocusedWindowAttribute` returns a handle to the specific window currently focused. From there a chatbot can descend the accessibility hierarchy and pull the value of the text field under the cursor, the title of the selected row in a list, or the label on the button you are hovering over, all as plain strings. Nothing is rendered. Nothing is captured.
The AX read from Swift is half the story. The other half is a native binary called `mcp-server-macos-use` that ships inside `/Applications/Fazm.app/Contents/MacOS/`. It is a Model Context Protocol server that exposes a structured interface over the same Accessibility APIs to Claude. Every tool call prefixed with `mcp__macos-use__` routes through this binary, which walks the AX tree in Rust and returns structured results in milliseconds. You can verify the file with `ls /Applications/Fazm.app/Contents/MacOS` on any Mac with Fazm installed.
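Both claims are checkable from a terminal on any Mac with Fazm installed; `file` simply confirms the server is a compiled executable rather than a script (assuming the standard install location named above):

```shell
# List the native executables shipped inside the app bundle.
ls /Applications/Fazm.app/Contents/MacOS

# Confirm the MCP server is a real Mach-O binary, not a wrapper script.
file /Applications/Fazm.app/Contents/MacOS/mcp-server-macos-use
```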
The two meanings of "local" in a chatbot, side by side
Both are legitimate. Both solve real problems. They are almost always framed as the same thing by the SERP, and they are not. Picking the right one depends on whether your priority is offline privacy or on-machine usefulness.
Local inference vs local context
Local inference: the language model runs on your Mac. Zero network calls, zero cloud bill, full offline operation. The chatbot is a text window; it sees only what you type or paste into it. If you want to ask about a Mail thread you are reading, you copy the text into the chat first. Good fit for notes, brainstorming, sensitive documents, and flights with no WiFi.
- Model weights sit on your disk
- Fully offline, zero cloud round trips
- Chat window cannot see other apps
- You copy and paste what you want it to know
Local context: the chatbot process runs on your Mac and reads your live desktop through OS APIs, while the model itself is remote. It sees the focused app without copy-paste, can act on other apps through a bundled MCP server, and keeps every conversation in a local SQLite file. Good fit for "summarize this thread," "reply to this email," and "click that for me."
- Model is cloud-hosted, reached over your own session
- Focused app read through Accessibility APIs, no paste step
- Bundled MCP server can act on other apps
- Conversations stored in a local SQLite file
How a single Fazm query flows through your Mac
The diagram is literal. Four local inputs converge on the ACP bridge, which is the Node subprocess Fazm spawns. The bridge negotiates with Claude, but every input on the left lives on your machine. The cloud model only ever sees the distilled structured-context packet, not the raw pixels or permissions.
Fazm chat query: local inputs to model output
Six dimensions of "local" in a chatbot
The word hides more than one decision. A chatbot can be local on one axis and remote on another. The SERP currently treats only the first axis as "local," but the other five matter just as much for a real Mac workflow.
Local model weights
Ollama, LM Studio, Jan, Locally AI, Enclave, GPT4All. The model binary sits on your disk and every token is generated on your Mac. Solves privacy, works offline, costs zero marginal per message.
Local screen context
Fazm. A chatbot window that can read the focused app through Accessibility APIs, control other apps through a bundled MCP server, and store conversations in a local SQLite file. The model is remote, everything else is on your machine.
Local conversation history
Both categories store chat logs in a local database and let you search them offline. Ollama and Fazm both write to SQLite-style stores, never to a cloud sync service.
Local tool execution
The chatbot runs commands, reads files, and manipulates apps from your machine, not a cloud sandbox. This is where the Accessibility API story matters most, and where the local-inference crowd has historically been weakest.
Local permissions model
macOS gates Accessibility, Screen Recording, and Automation per-app in System Settings. A proper local chatbot lives inside that permissions model and asks for exactly the grants it needs, nothing more.
Local knowledge graph
Fazm ships an on-disk knowledge graph of files, people, and past observations. The observer agent writes to it continuously; the chatbot queries it without leaving your machine.
Local inference vs local context on specific workflows
Concrete tasks a Mac user actually does, scored against the two definitions of local. Not "Fazm beats Ollama." Ollama wins every offline row. The point is that the shape of the problem decides the shape of the tool.
| Feature | Local inference (Ollama / LM Studio / Locally AI) | Local context (Fazm) |
|---|---|---|
| Works offline with no internet | Yes, model runs on your CPU or M-series GPU | No, Claude is cloud-hosted |
| Reads your focused Mail message without copy-paste | No, chat window cannot see other apps | Yes, via kAXFocusedWindowAttribute |
| Clicks a button in another app on your behalf | No, no tool channel into the OS | Yes, via bundled mcp-server-macos-use |
| Answers 'what is in my clipboard' without paste | No, requires manual paste | Yes, AX-adjacent NSPasteboard read |
| Uses your Claude Pro or Max subscription allowance | No, no Claude integration | Yes, via OAuth bridge (ACPBridge.swift line 350) |
| Zero marginal cost per message | Yes, unconditionally | Yes, if on Claude allowance or Haiku trial |
| Works on a flight with airplane mode on | Full chat, no limits | Stored messages only, no new responses |
| Stores every conversation on disk in SQLite | Varies by app; most also use local stores | Yes, ~/Library/Application Support/Fazm/fazm.db |
| Surfaces conversation history to a screen reader | Varies by app, usually yes | Yes, native SwiftUI views with AX labels |
| Bundled native MCP server inside .app | No (not the same problem space) | Yes, Contents/MacOS/mcp-server-macos-use |
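The clipboard row in the table above refers to a plain AppKit read, adjacent to but separate from the AX tree. A minimal sketch of the idea (an illustration, not Fazm's actual implementation):

```swift
import AppKit

// Read the current clipboard text directly, no paste step required.
let pasteboard = NSPasteboard.general
if let text = pasteboard.string(forType: .string) {
    print("Clipboard contains \(text.count) characters of text")
} else {
    print("Clipboard has no plain-text content")
}
```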
What happens, step by step, when you ask Fazm about your screen
Six steps from keystroke to answer. Every step is either a local API call or a subprocess on your Mac. Only step five leaves your machine, and what leaves is a structured string, not a screenshot.
1. You type a question into the Fazm floating bar
The bar is a tiny always-on-top window. It is not your focused app; your focused app is whatever you were working in a moment ago, tracked via `FloatingControlBarManager.shared.lastActiveAppPID`.
2. Swift builds an AX handle to that app, not to itself
`let appElement = AXUIElementCreateApplication(frontApp.processIdentifier)` (AppState.swift line 439). The PID is the app you were last in, so Fazm reads the real thing you are working on, not its own chat UI.
3. Fazm pulls the focused window element over the AX bridge
`AXUIElementCopyAttributeValue(appElement, kAXFocusedWindowAttribute as CFString, &focusedWindow)` (AppState.swift line 441). This is the one line that lets a chatbot see a Mail message, a Numbers cell, a Figma layer, or a Cursor file tab without a screenshot.
4. The bundled MCP server traverses the AX tree on demand
The `mcp-server-macos-use` binary sitting at `Contents/MacOS/mcp-server-macos-use` is wired as a Claude tool (acp-bridge/src/index.ts line 1063). When the model needs a button coordinate, a text-field value, or a menu path, it calls an MCP tool that walks the accessibility hierarchy in Rust, not a vision model that guesses at pixels.
5. Only the structured snippet ships to Claude
What the model sees is a short JSON blob: role, title, children, coordinates. Not a 1568-pixel JPEG. Token cost drops by an order of magnitude compared to a vision-first path, and the model can reason about the UI as a tree instead of as a picture.
6. Screenshot fallback exists but is the last resort
`Desktop/Sources/Providers/ChatToolExecutor.swift` lines 895-927 implement `capture_screenshot` with mode 'screen' or 'window'. It is reachable, just not the default. AX tree reads happen first because they are faster, cheaper, and more precise for structured UI.
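Steps 3 and 4 both amount to a walk of the accessibility hierarchy. A hedged Swift sketch of that traversal (the shipped binary does the equivalent in Rust; this is an illustration of the pattern, not Fazm's code):

```swift
import ApplicationServices

/// Recursively collect indented role/title lines from an AX element.
/// Depth-limited so a deep window hierarchy cannot blow the stack.
func walkAXTree(_ element: AXUIElement, depth: Int = 0, maxDepth: Int = 4) -> [String] {
    guard depth <= maxDepth else { return [] }

    func copyAttr(_ name: String) -> CFTypeRef? {
        var value: CFTypeRef?
        AXUIElementCopyAttributeValue(element, name as CFString, &value)
        return value
    }

    let role = copyAttr(kAXRoleAttribute) as? String ?? "?"
    let title = copyAttr(kAXTitleAttribute) as? String ?? ""
    var lines = [String(repeating: "  ", count: depth) + "\(role) \(title)"]

    // kAXChildrenAttribute yields the element's subtree.
    if let children = copyAttr(kAXChildrenAttribute) as? [AXUIElement] {
        for child in children {
            lines += walkAXTree(child, depth: depth + 1, maxDepth: maxDepth)
        }
    }
    return lines
}
```

The joined output of those lines is the kind of compact, structured string that ships to the model in step 5, in place of a screenshot.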
The local wiring, in numbers
Four line numbers. One native binary. One AX call. That is the whole local-context stack.
Verify the local-context story with three greps
If any of the above sounds too neat, these three commands prove the wiring exists. Run them against a fresh clone of the Fazm desktop repo or against the bundled source in `/Applications/Fazm.app`.
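The commands are not reproduced inline here; based on the searches and paths this guide cites elsewhere, they are likely:

```shell
# 1. The focused-window probe exists in the Swift source.
rg -n kAXFocusedWindowAttribute Desktop/Sources

# 2. The MCP server is wired up in the ACP bridge.
rg -n macos-use acp-bridge/src

# 3. The native binary ships inside the installed app bundle.
ls /Applications/Fazm.app/Contents/MacOS
```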
Three consequences of choosing local context over local inference
Picking the second definition of local changes the daily experience of the chatbot more than the model choice does. These are the three things a user actually notices within the first week.
Copy-paste disappears
The chatbot already sees the Mail message, the Numbers cell, the Figma frame. You ask "summarize this thread" without any setup. The structured AX read covers what used to be the paste step in a local-inference workflow.
Action replaces suggestion
Because the mcp-server-macos-use binary can click, type, and navigate in any native app, the chatbot moves from "here is a draft" to "I sent the reply, here is the message ID." The offline chatbot has no comparable action channel.
Permissions become the UI
Screen Recording, Accessibility, Automation. Each one is a capability the chatbot unlocks. An offline chatbot has nothing to ask for. A local-context chatbot guides you through a one-time System Settings grant and from there has any-app reach.
How the "local AI chatbot" SERP currently maps
Every product the top results list, tagged by its definition of local. Fazm is the only entry on the context axis.
The chatbot that reads your live Mac, not a screenshot of it
Fazm is a macOS-native AI chatbot that treats your Mac as the context, not as an afterthought. It reaches into the focused app through macOS Accessibility APIs, controls other apps through a bundled native MCP server, and stores every conversation in a local SQLite database on your disk. The cloud model is Claude over your own OAuth session. Everything else is local.
Download Fazm →
Frequently asked questions
What does 'local AI chatbot' actually mean in April 2026?
The SERP treats it as a single thing: a chat app that runs the language model on your Mac instead of calling a cloud API. Ollama, LM Studio, Jan, Enclave AI, Locally AI, Chatbox, and GPT4All all fit that definition. That is local inference. There is a second, underserved meaning: a chatbot that runs locally on your machine and reads your live desktop context through operating-system APIs, so it can actually help with what you are doing in whatever app you have focused. Fazm is the second kind. It uses a cloud model but its desktop integration is deeply local: structured accessibility reads of the focused window, clipboard, running apps, and a bundled native mcp-server-macos-use binary inside the .app bundle.
Is Fazm a local AI chatbot if the model runs in the cloud?
Depends on which definition you use. Model-locality: no, Fazm talks to Claude (or a bundled Anthropic key during the free trial). Context-locality: yes, more aggressively than the on-device-inference apps. The chatbot lives in a floating bar on your Mac, reads the focused app through Accessibility APIs before every query, controls other apps through the bundled mcp-server-macos-use MCP server, and stores every message in a local SQLite database. The screen content never leaves your computer as pixels; it leaves as a tiny structured JSON snippet (role, title, value, children), and only when the model needs it.
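No exact schema for that snippet is published in this guide; a hypothetical example of the shape it describes (role, title, value, children, plus coordinates) might look like:

```json
{
  "role": "AXWindow",
  "title": "Re: Q3 budget",
  "children": [
    { "role": "AXTextField", "title": "To:", "value": "dana@example.com" },
    { "role": "AXButton", "title": "Send", "position": [912, 64] }
  ]
}
```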
How does Fazm read the focused app without a screenshot?
Through macOS Accessibility APIs. The code is in `Desktop/Sources/AppState.swift` line 441: `AXUIElementCopyAttributeValue(appElement, kAXFocusedWindowAttribute as CFString, &focusedWindow)`. `appElement` is built from `NSWorkspace.shared.frontmostApplication`, so the read targets whatever app you are currently in, not Fazm's own window. From the focused window element Fazm can walk the AX tree, pull button labels, menu items, list rows, text-field values, and selected text, all as structured strings. No pixel capture, no OCR, no VLM.
What is the native binary inside Fazm's app bundle that does the UI reads?
A Rust binary called `mcp-server-macos-use`, wired up in `acp-bridge/src/index.ts` line 1063. The TypeScript ACP bridge registers it as an MCP server named `macos-use` when the binary exists in `Contents/MacOS/mcp-server-macos-use`. From there every Claude tool call that opens with `mcp__macos-use__` runs against the live accessibility tree. You can verify the binary yourself with `ls /Applications/Fazm.app/Contents/MacOS` on any machine with Fazm installed.
Why use Accessibility APIs instead of screenshots for a chatbot?
Three reasons, all practical. One, latency: an AX tree read is milliseconds, a screenshot-plus-vision round trip is seconds. Two, precision: AX returns role=Button title='Send' directly, vision has to infer the same thing from pixels and often misreads icon-only buttons. Three, any-app coverage: AX works the same way in Mail, Numbers, Finder, Cursor, Xcode, Slack, and 99 percent of native macOS apps, whereas screenshot parsing depends on visible pixels and breaks on minimized windows, overlays, and off-screen content. The tradeoff is Accessibility permission, which the user grants once in System Settings.
How does Fazm compare to Ollama, LM Studio, Locally AI, and Enclave AI?
Those are all high-quality local-inference apps. They run the model on your Mac, keep conversations in a chat window, and never call a cloud API. Their constraint is that the chatbot only sees what you paste into it. Fazm's constraint is inverted: it uses a cloud model but sees your whole Mac. If your question is 'can I use an LLM without internet,' pick Ollama or Locally AI. If your question is 'can I ask an LLM what is on my screen right now and have it actually do something about it,' Fazm is the short list. Different tools, different problems, both valid.
Does Fazm work with any app or only specific ones?
Any AX-compliant native macOS app, which is the overwhelming majority of apps you use day to day: Finder, Mail, Notes, Safari, Chrome, Firefox, Slack, Discord, Telegram, Messages, WhatsApp, Spotify, Zoom, Numbers, Pages, Keynote, Xcode, Cursor, VS Code, Figma, Excel, Word, Outlook, Notion, Linear, Arc, Affinity, Sketch, Terminal, iTerm2, System Settings. The short list of things that are harder: Qt apps, OpenGL game canvases, and some Python tooling, which is why `Desktop/Sources/AppState.swift` lines 454-458 have a specific `.cannotComplete` fallback that double-checks against Finder before assuming the permission itself is broken.
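That double-check can be sketched like this (an illustration of the pattern the answer describes, not the shipped code; the function name is hypothetical):

```swift
import AppKit
import ApplicationServices

/// Distinguish "this app doesn't speak AX" from "the permission is broken".
/// If the focused app fails with .cannotComplete, probe Finder, which is
/// always AX-compliant: if Finder also fails, the grant itself is the problem.
func axPermissionLooksBroken(afterError error: AXError) -> Bool {
    guard error == .cannotComplete else { return false }
    guard let finder = NSRunningApplication
        .runningApplications(withBundleIdentifier: "com.apple.finder")
        .first else { return false }

    let finderElement = AXUIElementCreateApplication(finder.processIdentifier)
    var window: CFTypeRef?
    let probe = AXUIElementCopyAttributeValue(
        finderElement, kAXFocusedWindowAttribute as CFString, &window)

    // Finder failing too suggests Accessibility was never granted.
    return probe != .success
}
```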
Is Fazm open source and can I verify these file paths?
The Fazm desktop app is fully open source. Every file path in this guide is a real path in the repository. The relevant files: `Desktop/Sources/AppState.swift` for the AX permission and focused-window probe, `Desktop/Sources/Providers/ChatToolExecutor.swift` for the capture_screenshot tool that is used only when a pure AX read is not enough, and `acp-bridge/src/index.ts` for the MCP server wiring. You can `rg -n kAXFocusedWindowAttribute Desktop/Sources` and `rg -n macos-use acp-bridge/src` yourself.
What if I want both a local model and local screen context?
Nothing in Fazm's architecture blocks a local model. The ACP bridge spawns a Node subprocess and hands it a Claude OAuth session today, but the same bridge could hand it an Ollama endpoint instead. The harder problem is the Accessibility integration, which is what Fazm is actually providing. Running a llama on your Mac is a solved problem in 2026. Having a chatbot that understands what is on your screen without bouncing pixels to a vision model is not, and is the part that the top 'local AI chatbot' SERP results skip entirely.
Where is the capture_screenshot tool and when does Fazm fall back to pixels?
`Desktop/Sources/Providers/ChatToolExecutor.swift` lines 895-927. The tool accepts mode='screen' or mode='window' and returns a base64 JPEG. It is wired explicitly as a fallback: when the user asks 'what does this look like' or when Claude decides it needs visual confirmation of a chart, icon, or rendered document, it calls `capture_screenshot`. For 'what is in the to-address field of this email', 'which row is selected', 'what is the subtotal shown here', Fazm stays on the Accessibility path because it is faster and more precise.
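Not Fazm's implementation, but a minimal sketch of what a 'screen'-mode capture returning base64 JPEG could look like (the function name and quality default are assumptions; requires the Screen Recording permission):

```swift
import AppKit
import CoreGraphics

/// Capture the main display and return it as a base64-encoded JPEG,
/// the shape of payload a vision fallback would ship to the model.
func captureMainScreenAsBase64JPEG(quality: Double = 0.7) -> String? {
    guard let cgImage = CGDisplayCreateImage(CGMainDisplayID()) else { return nil }
    let rep = NSBitmapImageRep(cgImage: cgImage)
    guard let jpeg = rep.representation(
        using: .jpeg,
        properties: [.compressionFactor: quality]
    ) else { return nil }
    return jpeg.base64EncodedString()
}
```

Even at moderate quality, that payload is orders of magnitude larger than the structured AX snippet, which is the cost argument for keeping it a last resort.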
How is this different from the accessibility features inside other chatbots?
Most 'accessible chatbot' guides on the web are about making the chatbot itself readable by screen readers (ARIA labels, semantic HTML, keyboard navigation). That is a separate topic. Fazm uses the same operating-system Accessibility APIs in reverse: instead of exposing Fazm to VoiceOver, Fazm uses macOS AX to read every other app on your Mac. Both are valid uses of the AX stack, but the SEO keyword 'local ai chatbot' is about the second kind, even though the SERP does not currently reflect that.
Does Fazm need internet at all?
For chat responses, yes. It calls Claude over HTTPS. For everything else (reading your screen, controlling apps, storing your messages, searching your local knowledge graph, running bundled skills), it is fully local. The conversation database is a SQLite file at `~/Library/Application Support/Fazm/fazm.db`. The AX tree reads go through `/System/Library/Frameworks/ApplicationServices.framework`. The mcp-server-macos-use binary runs as a child process inside the app bundle. An offline Mac will still remember everything, just cannot generate new responses.
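If Fazm is installed, the conversation store can be inspected with the stock sqlite3 CLI. The table layout is not documented in this guide, so `.tables` is the safe first query:

```shell
# Open the local conversation database read-only and list its tables.
sqlite3 -readonly "$HOME/Library/Application Support/Fazm/fazm.db" ".tables"
```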
What every "local AI chatbot" article stops short of saying
The offline inference story is covered exhaustively. There is a good guide for every model format, every quantization, every Apple Silicon trick. If you want to run a llama on your Mac in April 2026, the SERP is ready for you.
What the SERP has not caught up to is that the most useful kind of local chatbot is the one that reads your live desktop. Accessibility APIs have been sitting in macOS since 2001. The one Apple framework call at `AppState.swift` line 441, plus a native `mcp-server-macos-use` binary shipped inside the app bundle, is what turns a chat window into something that knows what you are working on. The model can be anywhere. The context has to be local.
If that framing is right for your workflow, Fazm is free to start, fully open source, and every file path in this guide is something you can verify yourself.