A verification recipe, not a marketing claim

Verifying local AI privacy on macOS, the actual commands

A local-first claim is a testable claim. The bytes either leave your Mac or they do not, and the place to settle that argument is your own machine, not a privacy policy page. This guide is four checks you can run in about ten minutes. The worked example is Fazm because the source is MIT and every outbound endpoint is grep-able in three Swift files, but the recipe is the same for any AI app on macOS.

Matthew Diakonov
9 min read

Direct answer, verified 2026-05-04

Run sudo nettop -p $(pgrep AppName) -P -t external -L 0 while you use the feature you want to verify. Every cloud round-trip shows up as a live outbound socket. If the only rows you see are to localhost, 127.0.0.1, or hosts you explicitly configured, the local claim holds. If api.openai.com, api.anthropic.com, or any unrelated provider lights up while you are using a feature advertised as local, it does not.

What you are actually testing

Three different things get bundled into the phrase "local AI privacy," and they need to be untangled before the test is meaningful. First, the model: do the weights live on your Mac, or does the prompt go to a model provider over TLS? Second, the input pipeline: do the screen capture, accessibility tree, and microphone feed stay in process memory, or does any of it get uploaded for processing (transcription, vision, summarization)? Third, the metadata: does the app phone home with usage telemetry, error reports, or feature analytics, even if the model and inputs are local?

A truly local app holds all three. A "local-friendly" app (which is what most ship as today) holds the input pipeline and lets you point the model at whatever endpoint you want. A local-in-marketing-only app says all the right things and still ships your transcription audio to a third party. The four checks below tell you which of the three you have, and they take about ten minutes end to end.

The four checks, in order


1. Read the source if you can

If the app is open source, grep for hardcoded URLs in the codebase. Five seconds gets you the host list. Closed source: skip to step 2.


2. Watch nettop while you actually use the feature

Launch the app, run nettop scoped to its PID, then exercise the feature you care about. External sockets appear in real time. The order matters: open nettop first, app second, click third.


3. Block-and-test with the firewall

Drop the app's outbound traffic with Little Snitch, LuLu, or pf. Try the feature again. If it still works, it really is local. If it errors, you now know what is cloud-dependent.


4. Inspect the model setting, then confirm

If the app has a custom-endpoint or model-picker setting, point it at 127.0.0.1 (Ollama, LM Studio, llama-server). Re-run nettop. The previous external row should disappear and a loopback row should take its place.

Check 1, the source-code grep

For an open-source AI app this is the cheapest possible verifier. Clone the repo, point ripgrep at the source directory, and ask for every URL the binary could possibly talk to. The output is small (most apps have between three and fifteen hardcoded hosts) and you can categorize each one in seconds: model provider, voice provider, analytics, update server, OAuth.
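The shape of that command, sketched against a throwaway stand-in tree (the file and hostnames below are invented for the demo, not Fazm's real list):

```shell
# Build a tiny fake "source tree" with two invented hostnames, then extract
# every URL host the way you would against a real repo.
mkdir -p /tmp/urlgrep-demo
cat > /tmp/urlgrep-demo/Example.swift <<'EOF'
let voiceURL = URL(string: "https://api.example-voice.com/v1/listen")!
let ttsURL   = URL(string: "https://api.example-tts.io/v1/speak")!
EOF

# -r recurse, -h drop filenames, -o print only the match, -E extended regex
grep -rhoE 'https?://[A-Za-z0-9.-]+' /tmp/urlgrep-demo | sort -u
# -> https://api.example-tts.io
# -> https://api.example-voice.com
```

With ripgrep the equivalent is the same pattern via rg -o --no-filename over the source directory; either way, the sorted unique output is the complete menu of hosts the binary can name statically.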

Here is the actual run against the Fazm desktop source. The five hostnames in the output are the entire externally-talking surface of the app. The LLM endpoint is not hardcoded, which is the next interesting fact: it comes from the ANTHROPIC_BASE_URL environment variable, set by Swift in ACPBridge.swift:468-469 when the user fills in the Custom API Endpoint field.

grep the Fazm source for hardcoded URLs

Five rows, three vendors. Two of those vendors (Deepgram for voice, ElevenLabs for TTS) are switchable: when the corresponding feature is off, the app makes no connection to them at all. The first vendor is Fazm's own backend, which exists only to vend short-lived API keys for the other two. The LLM endpoint is the fourth, and it is configurable. That is the entire menu. Anything not in this list is not happening, regardless of what the marketing copy says.

Check 2, the live nettop trace

For a closed-source app, this is the only first-class verifier you have, and it is also the one a non-technical user can run. nettop ships with macOS, takes one command, and shows live per-process external sockets with bytes-in and bytes-out. The trick is to scope it to a single PID so the noise drops to zero, and to start it before you launch the app, so you see the connection-time hosts (analytics SDKs love to phone home in the first two seconds).
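The trace-first order can be sketched as a terminal session (Fazm stands in as the example app name; substitute your own, and note these commands are macOS-specific and need sudo, so treat this as a transcript rather than a script):

```shell
# 1. In one terminal: a wide trace of all external sockets, started BEFORE
#    the app so launch-time phone-homes are captured. -L 0 logs indefinitely.
sudo nettop -t external -P -L 0

# 2. In another terminal: launch the app and exercise the feature under test.
open -a Fazm

# 3. Once the process exists, narrow the trace to its PID to drop the noise.
sudo nettop -p "$(pgrep -x Fazm | head -1)" -P -t external -L 0
```

Step 3 is the command from the direct answer at the top; step 1 is the extra pass that catches anything the app does in its first two seconds.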

Below is what the Fazm process looks like during a normal voice-driven chat session, with the default Anthropic endpoint and Deepgram voice transcription on. Two external rows. Both to model providers. Nothing to ad networks, nothing to analytics, nothing unexpected.

nettop, default Fazm install

Check 3, the firewall disproof

nettop confirms the positive case (you see the host you expected and nothing else). The firewall is the test for the negative case (the feature breaks the way you predicted). Open Little Snitch (or LuLu, which is free), set a deny rule for every outbound connection from the binary except 127.0.0.1, and try the feature again. The error message is the answer: if it says "could not reach api.deepgram.com" you have learned that voice transcription is cloud-dependent. If it says "rate limit on api.anthropic.com" you have learned the LLM endpoint is still there.

The firewall test is asymmetric. An app that keeps working with all outbound blocked is genuinely local for that feature. An app that breaks gives you a clean error message that names the specific cloud dependency, and the dependency is what you actually wanted to know. Either way you walk out with a true statement, which is more than you started with.
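If you prefer the built-in packet filter to a GUI firewall, the same deny-everything-but-loopback experiment can be run with pf, with one caveat: pf matches on addresses and ports, not processes, so the rule below is machine-wide, and you should quit other network apps first. The file path and anchor name are illustrative:

```
# /etc/pf.anchors/local-only (illustrative path) -- while loaded, deny all
# outbound TCP except loopback, for every process on the machine.
block out quick inet proto tcp from any to ! 127.0.0.0/8
```

Reference the anchor from /etc/pf.conf, load with sudo pfctl -e -f /etc/pf.conf, exercise the feature, then restore with sudo pfctl -d. The error the app throws while the rule is live is the same clean dependency report described above.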

Check 4, redirect the model and re-run nettop

The last check is the one most guides skip, because it requires the app to have a configurable model endpoint. When it does, this is the single most useful five-minute experiment you can run. Stand up an Ollama or LM Studio server on localhost (default port 11434 for Ollama, 1234 for LM Studio). For an Anthropic-shaped agent like Fazm, slot a translator like LiteLLM in front so the local server accepts the Anthropic Messages format. Paste the local address into the Custom API Endpoint field (http://127.0.0.1:11434 for Ollama directly, or the translator's own port if you added one), restart the chat panel, and re-run nettop.
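One way to put the translator in place, sketched as a LiteLLM proxy config. The model name and alias are examples, not requirements, and whether your LiteLLM version exposes an Anthropic-format route is something to confirm against its own docs:

```yaml
# config.yaml -- expose a local Ollama model behind a translating proxy.
# "local-llama" and "ollama/llama3" are illustrative names.
model_list:
  - model_name: local-llama
    litellm_params:
      model: ollama/llama3
      api_base: http://127.0.0.1:11434
```

Start it with litellm --config config.yaml --port 4000; if you route through the translator, the address you paste into the app is the translator's (http://127.0.0.1:4000), since the translator is what speaks the Anthropic-shaped dialect.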

The Anthropic row should disappear from the external table. A new row should appear in nettop -t loopback, to 127.0.0.1, with bytes-out matching the size of your prompt. That is the local-LLM path, verified end to end. The agent process, the tool calls, the file system access, and the model call are now all on your machine.

nettop, Custom API Endpoint pointed at localhost

What you are actually proving

1. The binary on your Mac: the AI app process.

2. The outbound sockets: every host shows up in nettop.

3. The model endpoint: configurable and verifiable.

4. Your data: stays where the trace says it stays.

The Fazm anchor: two lines of Swift, one Settings field

The reason Fazm reads cleanly under this recipe is that the entire model-routing decision is concentrated in one place. The Swift side reads a UserDefault and exports it to the agent subprocess as an environment variable:

// Desktop/Sources/Chat/ACPBridge.swift, lines 468-469
if let customEndpoint = defaults.string(forKey: "customApiEndpoint"),
   !customEndpoint.isEmpty {
  env["ANTHROPIC_BASE_URL"] = customEndpoint
}

The Settings page that drives that default contains one hint, also verbatim from the source:

Route API calls through a custom endpoint (e.g. local LLM bridge, corporate proxy, or GitHub Copilot bridge). Leave empty to use the default Anthropic API.

That is the entire surface area for the LLM-routing decision. One field, two lines, one env var, one process tree. Once you know which two lines they are, the verification recipe collapses to: did the field get exported correctly, and did nettop confirm that the only model traffic now goes where you said? Both questions take less than a minute to answer once you are set up.

The voice and TTS endpoints are not yet pluggable in the same way (Deepgram and ElevenLabs are hardcoded as of this writing), so the honest version of "local Fazm" today is: local agent, local model if you point it at one, cloud voice and cloud TTS unless you turn those features off. Knowing exactly which row is which in nettop is what makes the claim honest.
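The "did the field get exported" half can be spot-checked from the shell by reading the subprocess's environment. A stand-in sketch, using sleep in place of the real agent process; against Fazm you would target the agent subprocess's PID instead:

```shell
# Spawn a stand-in "agent" with the variable set, the way the Swift side
# exports it to the real subprocess, then read it back out of the live process.
ANTHROPIC_BASE_URL=http://127.0.0.1:4000 sleep 30 &
agent_pid=$!

# BSD-style ps: e = show the environment, ww = unlimited width.
ps eww "$agent_pid" | tr ' ' '\n' | grep '^ANTHROPIC_BASE_URL='
# -> ANTHROPIC_BASE_URL=http://127.0.0.1:4000

kill "$agent_pid"
```

If the grep comes back empty against the real agent process, the field never made it into the environment and the app is still talking to the default endpoint, which is exactly what the nettop half of the check would then show.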

What this recipe will not catch

A few honest limits. nettop sees TLS-wrapped connections, so it tells you the host but not the body. An app that talks only to a provider you trust can still be sending more than you would like; the way to bound that is to know what the API actually does (does the Deepgram streaming endpoint require sending audio? yes; does a Sentry crash report require sending the stack trace? yes; does a PostHog autocapture event require sending the screen? no, but most apps configure it that way anyway). The grep step is what tells you which APIs are in play; the API docs tell you what each one carries.

nettop also does not see UDP listens or in-memory IPC. For a normal AI app this does not matter (everything interesting goes over TCP/443), but for a tool that does its own networking (peer-to-peer, mDNS, on-LAN discovery) you would want to add lsof -iUDP -p $(pgrep AppName) to the toolkit. And nettop does not record history; if you want a long-running record, Little Snitch's Network Monitor or LuLu's rules log will keep one for you.

Want a walkthrough on your own machine?

Bring an app you want to audit and we will run the four checks together over a call, on your Mac, in real time.

FAQ

Why nettop and not Wireshark or Little Snitch?

Because nettop ships with macOS, runs without a kext, and shows live per-process external sockets, which is exactly the question. Little Snitch is a great long-term monitor but you have to install and license it. Wireshark sees TLS-wrapped bytes which tells you the host but not the body, same as nettop, and adds a learning curve. The point of this guide is the cheapest thing that answers the question: what hostname is this binary talking to right now. Once you have that, all three tools tell you the same story.

If everything is over TLS, can I really trust the hostname?

You can trust it as far as DNS and the cert chain are willing to lie, which on a standard Mac is not very far. If api.deepgram.com resolves and the connection succeeds, your bytes are on their way to Deepgram's IP block. There are exotic ways to fake that (hosts file, MITM proxy, custom CA) and a normal user is not in any of those situations. The bigger trust question is: does the app phone home before you grant permission, and does it hide traffic during quiet periods. Both of those show up in nettop the moment you launch the app, before you click anything. Watch the first ten seconds.

What does an honest local-AI-friendly app actually look like in nettop?

Two shapes are common. Either you see no external rows at all while the AI features run, only LISTEN and 127.0.0.1 entries (this is the case for an app pointed at Ollama or LM Studio on localhost). Or you see exactly one external row, to the host you configured, and nothing else (this is the case for an app pointed at a corporate proxy or a private gateway). What you should not see is several unrelated external rows, especially to ad/analytics SDKs or to a model provider you did not choose.

How does Fazm fit into this verification, since it is not running fully local out of the box?

Fazm is the worked example because the source is MIT and every endpoint is grep-able in a few seconds. The honest read of the default install is: the agent loop, file access, browser drive, and macOS accessibility calls all run on your Mac, but the LLM call goes to api.anthropic.com, voice transcription goes to api.deepgram.com, and TTS goes to api.elevenlabs.io. Those are visible in three Swift files (KeyService.swift, TranscriptionService.swift, ChatToolExecutor.swift). The Settings page exposes a Custom API Endpoint field that overrides the LLM destination via the ANTHROPIC_BASE_URL env var (the assignment is at ACPBridge.swift:468-469). Point that at a localhost LLM bridge and the Anthropic row in nettop disappears, replaced by 127.0.0.1. That is the verification path.

Does turning on macOS firewall block-all actually prove anything?

It proves the negative case loud and clear. If the app advertises a local feature and that feature stops working when you flip System Settings, Network, Firewall to block all incoming, plus a Little Snitch deny rule for the binary, then the feature is not local. The forward direction is not as clean: an app that still works with the firewall blocking everything inbound just means it talks outbound, which most cloud-using apps do. So use firewall block-all to disprove a claim, and use nettop to confirm a positive one. They are different gates.

What about telemetry, crash reports, update checks?

Those are the easiest to forgive and the most often missed. A reasonable bar: a local-first app may phone home for app-update checks (one HEAD request to a release manifest, on launch and once per day), and may submit anonymized crash reports if the user opted in. It should not phone home with screen contents, accessibility tree dumps, or microphone audio. nettop will not tell you what is in the body, so you fall back to the source. For an open-source app, grep for fetch, URLSession.shared, and the names of any analytics SDKs (PostHog, Sentry, Amplitude, Mixpanel, Segment). For a closed-source app, the firewall-block test plus a careful read of the privacy policy is what you have.
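The SDK grep from that answer, sketched against a stand-in tree (the file is invented for the demo; run the same pattern over a real repo's source directory):

```shell
# A throwaway tree with one invented file that imports an analytics SDK.
mkdir -p /tmp/sdk-demo
printf 'import PostHog\n' > /tmp/sdk-demo/Analytics.swift

# -r recurse, -l list matching files only, -E extended regex of SDK names.
grep -rlE 'PostHog|Sentry|Amplitude|Mixpanel|Segment' /tmp/sdk-demo
# -> /tmp/sdk-demo/Analytics.swift
```

Every file this prints is a place where telemetry could originate; an empty result against the full source tree is the strong version of "no analytics SDK in the binary."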

I am not technical. What is the lazy version of this?

Three steps. One, install Little Snitch's free Network Monitor or use the built-in Activity Monitor, Network tab. Two, launch the app, use the AI feature you care about, and watch which hostnames or IP addresses light up next to the app's process row. Three, if you see something other than your own machine and the model provider you expected, the local claim is incomplete. You do not need to read source for this. You do need to actually run the app while you watch.

Why does this matter for a small business owner who is not paranoid?

Because the answer changes what the AI is allowed to look at. If the screen capture, accessibility tree, or microphone feed is going to a third party, then customer data on your screen, invoice numbers in QuickBooks, and the names you say out loud are all leaving the building. Most privacy policies say the right things, and most are honest, but the policy is not the verifier. nettop and the source code are. Five minutes of verification on your specific machine is worth more than a thousand-word policy page.