Reframing the meme · Accessibility APIs, not screenshots · Source-verified, April 20 2026

Your FBI agent is fired. Meet the AI agent you actually hired.

The TikTok joke about an exhausted FBI agent assigned to your computer is funny because everyone agrees, intuitively, that an invisible non-consenting watcher is bad. There is, in fact, going to be an agent on your computer in 2026. The question is whether you chose it, whether you can see it, and whether you can audit how it reads your screen. This guide walks through the literal code in Fazm that answers those three questions: a glowing halo injected at z-index 2147483647, a status pill that reads “Browser controlled by Fazm · Feel free to switch tabs or use other apps”, and an accessibility-API tree that returns text instead of pixels.

Matthew Diakonov, Written with AI

Published April 20, 2026 · 11 min read

Download Fazm for Mac


Injects a literal pill 'Browser controlled by Fazm · Feel free to switch tabs or use other apps' on every controlled page (browser-overlay-init.js:58)

Sees your Mac via the AXUIElement accessibility tree, the same one VoiceOver reads, not via screenshots (acp-bridge/src/index.ts:1056)

Source-required onboarding line: 'Everything is open-source at github.com/mediar-ai/fazm — your data is stored locally and you own it all' (ChatPrompts.swift:308)

The agent on your computer should be one you can see

Not invisible. Not non-consenting. Not a screenshot guesser.

Halo on every page Fazm controls

z-index 2147483647 so nothing can hide it

Reads the AXUIElement tree, not screenshots

Permissions you grant one at a time

Open source on github.com/mediar-ai/fazm


2147483647: z-index of the Fazm browser overlay (max 32-bit signed int, browser-overlay-init.js:23)

4: pulsing wing gradients pinned to the viewport edges (top, bottom, left, right)

5: MCP servers Fazm spawns at launch (fazm_tools, playwright, macos-use, whatsapp, google-workspace)

0: network ports the bundled Python MCP listens on (it speaks stdio)

The meme, in one sentence

On TikTok, “the FBI agent assigned to my computer” is a friendly fiction: somewhere in a bullpen in Quantico, an exhausted civil servant has to read your group chats, watch your tabs, and sigh through your search history. The joke works because everyone, regardless of politics, agrees on the thing being mocked: an invisible watcher you did not pick, who has full access, and who is accountable to nobody you know.

For most of computing history that fiction was harmless because there was no real version of it on the consumer desktop. Spyware, RATs, parental controls, MDM agents, and antivirus drivers existed, but all of them were edge cases, and none was part of an average Mac user's daily life. In 2026 that changes. There is going to be an agent on your computer. It will read your screen, click buttons, type into forms, and run code on your behalf.

The question is not whether the agent will exist. It will. The question is whether the version of it you live with will look like the meme, or like its inverse. This page is about the inverse.

z-index 2147483647

“The anchor fact no competitor page mentions. When Fazm drives a browser tab, the page itself wears a name tag. acp-bridge/browser-overlay-init.js (lines 16-67) injects an element with id='fazm-overlay' that paints four pulsing wing gradients (#fazm-w-top, #fazm-w-bottom, #fazm-w-left, #fazm-w-right) on the four edges of the viewport, four floating blobs in the corners, and a centered pill whose innerHTML on line 58 is the literal UTF-8 string 'Browser controlled by Fazm · Feel free to switch tabs or use other apps'. Both the canvas (line 23) and the pill (line 43) carry z-index: 2147483647, the maximum 32-bit signed integer, with !important. The script is re-injected on every page load and DOMContentLoaded by the patched Playwright extensionContextFactory.js (scripts/patch-playwright-overlay.cjs), because addInitScript() does not work in CDP extension mode. There is no version of Fazm that quietly drives your browser without painting itself onto it.”

/Users/matthewdi/fazm/acp-bridge/browser-overlay-init.js:58 + /Users/matthewdi/fazm/acp-bridge/src/index.ts:1037

The literal lines that flip the meme

The whole architecture of consent is one JavaScript file. If the file is on disk, the agent paints itself onto every page it touches. If you delete it, Fazm refuses to drive the browser. There is no toggle, no telemetry feature flag, no A/B test.

acp-bridge/browser-overlay-init.js (excerpt)

Four parts of the chrome that name the operator

The overlay is not subtle by accident. Every part of it exists so a person glancing at their screen can answer one question in under a second: is something else driving this right now?

Four pulsing wing gradients on the viewport edges

browser-overlay-init.js lines 25-28 paint #fazm-w-top, #fazm-w-bottom, #fazm-w-left, and #fazm-w-right as fixed-position radial gradients in cyan, blue, indigo, and purple. Animations 'fazm-pulse-top' through 'fazm-pulse-right' (lines 34-37) breathe each wing on a 4 to 5 second cycle so the chrome of the page itself looks like it is alive.

A centered status pill that names the operator

Line 58 of browser-overlay-init.js sets the pill content to the literal UTF-8 string 'Browser controlled by Fazm · Feel free to switch tabs or use other apps'. The pill is fixed at 50% / 50%, has a small spinning ring, and is built so the user can ignore it and keep working.

z-index 2147483647 so nothing can hide it

Both #fazm-canvas2 (line 23) and #fazm-pill3 (line 43) carry z-index: 2147483647 with the !important flag. That is the maximum 32-bit signed integer, which is the highest z-index browsers typically honor, so a page can match the value but not exceed it. The !important flag means the page's ordinary style rules cannot override the declaration.
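A minimal sketch of the arithmetic and the kind of style rule this section describes. The constant is the value quoted above; the helper name `overlayLayerCss` and its property list are assumptions for illustration, not a copy of browser-overlay-init.js.

```typescript
// 2^31 - 1: the largest 32-bit signed integer, the value quoted above.
const MAX_Z_INDEX: number = Math.pow(2, 31) - 1;

// Hypothetical helper: builds one CSS rule for a full-viewport layer.
// The property list is an assumption for illustration, not the real file.
function overlayLayerCss(selector: string): string {
  const declarations: Array<[string, string]> = [
    ["position", "fixed"],
    ["inset", "0"],
    ["pointer-events", "none"], // assumption: clicks pass through to the page
    ["z-index", String(MAX_Z_INDEX)],
  ];
  const body = declarations
    .map(([property, value]) => `  ${property}: ${value} !important;`)
    .join("\n");
  return `${selector} {\n${body}\n}`;
}

console.log(MAX_Z_INDEX); // 2147483647
console.log(overlayLayerCss("#fazm-overlay"));
```

Every declaration gets `!important` so that a normal page rule targeting the same selector loses the cascade.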

Injected on every page load, by Playwright's --init-page

acp-bridge/src/index.ts line 1037 passes '--init-page' pointing at browser-overlay-init-page.js to the Playwright MCP. The patch in scripts/patch-playwright-overlay.cjs ensures the script is re-injected on every page 'load' and 'domcontentloaded', because addInitScript does not work in Chrome extension mode.
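The re-injection idea can be sketched in a few lines. This is not the code in scripts/patch-playwright-overlay.cjs; `PageLike` is a cut-down stand-in for the two Playwright `Page` members the sketch relies on (`on` and `evaluate`), and `keepOverlayInjected` and the placeholder overlay script are hypothetical.

```typescript
// Minimal shape of the two Playwright Page members this sketch uses.
interface PageLike {
  on(event: "load" | "domcontentloaded", handler: () => void): void;
  evaluate(script: string): Promise<unknown>;
}

// Hypothetical helper: re-run the overlay script whenever the page (re)loads.
function keepOverlayInjected(page: PageLike, overlaySource: string): void {
  const inject = (): void => {
    // Ignore failures from pages that navigate away mid-injection.
    page.evaluate(overlaySource).catch(() => undefined);
  };
  page.on("domcontentloaded", inject);
  page.on("load", inject);
}

// Placeholder script (not the real overlay). It is idempotent because both
// events fire on each navigation, so it may run twice per page.
const idempotentOverlay = `(() => {
  if (document.getElementById('fazm-overlay')) return;
  const el = document.createElement('div');
  el.id = 'fazm-overlay';
  document.documentElement.appendChild(el);
})()`;
```

Listening on both events trades a redundant call for earlier paint: the overlay appears at `domcontentloaded` and is restored at `load` if anything removed it in between.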

How a chosen agent perceives your Mac

The screenshot route is a black box: the agent receives pixels, asks a vision model what it sees, and you have no audit trail. The accessibility route is text the whole way through. Each box on the right is a tree of labelled nodes you could open in a JSON viewer and read.
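To make "text the whole way through" concrete, here is a hedged sketch of how a tree of labelled nodes might be flattened into readable lines. The `AxNode` shape mirrors the roles, labels, and frames mentioned on this page, but the field names, the output format, and the sample window are assumptions, not the macos-use server's actual output.

```typescript
// Illustrative node shape: a role, an optional label, and a frame.
interface AxNode {
  role: string;
  label?: string;
  frame: { x: number; y: number; width: number; height: number };
  children?: AxNode[];
}

// Render the tree as indented text lines that can be logged and read back.
function renderTree(node: AxNode, depth: number = 0): string[] {
  const indent = "  ".repeat(depth);
  const label = node.label !== undefined ? ` "${node.label}"` : "";
  const { x, y, width, height } = node.frame;
  const lines = [`${indent}${node.role}${label} @ (${x},${y} ${width}x${height})`];
  for (const child of node.children || []) {
    lines.push(...renderTree(child, depth + 1));
  }
  return lines;
}

// Invented sample data, for illustration only.
const sampleWindow: AxNode = {
  role: "Window",
  label: "Messages",
  frame: { x: 0, y: 0, width: 800, height: 600 },
  children: [
    { role: "TextField", label: "Message", frame: { x: 20, y: 540, width: 680, height: 40 } },
    { role: "Button", label: "Send message", frame: { x: 710, y: 540, width: 70, height: 40 } },
  ],
};

console.log(renderTree(sampleWindow).join("\n"));
```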

From your prompt to a labelled tree, never to opaque pixels

The meme, beside its inverse

Most search results for this keyword skip the comparison this section makes. It starts with the meme's side: the invisible watcher.

Invisible watcher vs. chosen agent

An imagined invisible watcher who has full access, who you did not choose, and who is accountable to nobody you can name. Funny because everyone agrees it would be bad if it were real.

No on-screen indicator
No menu bar icon
No permission dialog
Reads everything by default
No way for you to audit what was 'seen'
Closed source, by definition

Surveillance vs. consensual automation, side by side

The same axes the SERP avoids: announcement, perception model, consent, source openness, data residency, tool surface, and how to stop it.

Feature	The 'FBI agent' meme	Fazm on your Mac
Does the agent announce itself when it is in control?	No. The whole conceit of the meme is invisibility. Real spyware, RATs, keyloggers, and stalkerware are designed to leave no on-screen footprint, no menu bar icon, no permission dialog. The user has no way to know the agent is active.	Yes, on every page Fazm controls. acp-bridge/browser-overlay-init.js injects an element with id='fazm-overlay' that paints four pulsing wing gradients pinned to the viewport edges (#fazm-w-top, #fazm-w-bottom, #fazm-w-left, #fazm-w-right) and a centered pill with the literal text 'Browser controlled by Fazm · Feel free to switch tabs or use other apps' (line 58). The overlay sits at z-index 2147483647, the maximum 32-bit signed integer, so nothing on the page can hide it.
How does it 'see' your screen?	Screenshots, fed to a vision model that guesses x,y coordinates. The user cannot audit what was 'seen' because pixels are opaque. The model can read a private message that happens to be on screen as easily as a button label.	Through macOS Accessibility APIs and the AXUIElement tree. The bundled mcp-server-macos-use binary (registered at acp-bridge/src/index.ts line 1056) reads the same accessibility tree VoiceOver uses: structured nodes with roles, labels, and frames. For browser tabs, browser_snapshot returns a labelled DOM with [ref=eN] tokens. The agent reads text, not pixels.
Did you grant permission?	No. The premise is non-consensual. Even when consent exists for an installed antivirus or remote-access tool, the user usually has no granular per-action control: the agent already has root, and uses it.	Yes, from a system-level macOS permission dialog you actively approved. Onboarding asks one permission at a time (microphone → accessibility → screen_recording, ChatPrompts.swift line 326), with a one-sentence explanation per prompt. You can revoke any of them in System Settings → Privacy & Security at any time.
Is the source code public?	No. Investigative or surveillance code is closed by definition. You cannot diff today's binary against yesterday's, and you cannot read the prompt the agent was given.	Yes. The onboarding script literally instructs the agent to send a trust-building message before requesting permissions (ChatPrompts.swift line 307-308): 'Everything is open-source at github.com/mediar-ai/fazm — your data is stored locally and you own it all.' The overlay file, the MCP server registrations, and the system prompt are all readable on disk.
Where does your data live?	On a remote server, by design. The whole point of an investigative or surveillance agent is to ship what it sees off-device.	On your Mac. The Python MCP server speaks stdio, not a network socket; nothing listens on a port. Conversation history, knowledge graph, and indexed files sit in the user's local data directory. The bundled venv at Contents/Resources/google-workspace-mcp/.venv is hermetic.
How many tools can the agent reach?	Whatever the operator wired in. You do not get to know.	Five MCP servers spawned at launch: fazm_tools, playwright (your real Chrome via the MCP Bridge extension), macos-use (any Mac app via accessibility APIs), whatsapp (Catalyst app via accessibility), and google-workspace (Python, Gmail/Calendar/Drive/Docs). The full list is enumerated at acp-bridge/src/index.ts line 1266.
Can you stop it mid-action?	No. Stopping a covert agent requires identifying that it exists first.	Yes. The floating bar is always reachable. Closing the chat window or quitting Fazm stops the agent immediately. The Playwright extension token can be revoked from chrome://extensions in one click.
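The `[ref=eN]` tokens in the table are what make a browser snapshot addressable as text. The sketch below assumes a simple one-element-per-line layout ending in a `[ref=eN]` token; the sample lines, the `SnapshotElement` type, and both function names are hypothetical, and a real browser_snapshot may carry more structure than this.

```typescript
interface SnapshotElement {
  role: string;
  name: string;
  ref: string;
}

// Assumed line shape: - role "accessible name" [ref=eN]
const ELEMENT_LINE = /^\s*-\s*(\w+)\s+"([^"]*)"\s+\[ref=(e\d+)\]\s*$/;

// Parse every line that matches the assumed shape; skip the rest.
function parseSnapshot(text: string): SnapshotElement[] {
  const elements: SnapshotElement[] = [];
  for (const line of text.split("\n")) {
    const match = ELEMENT_LINE.exec(line);
    if (match) {
      elements.push({ role: match[1], name: match[2], ref: match[3] });
    }
  }
  return elements;
}

// Look up the ref for an element by role and exact name.
function findRef(elements: SnapshotElement[], role: string, name: string): string | undefined {
  const hit = elements.find((el) => el.role === role && el.name === name);
  return hit ? hit.ref : undefined;
}

// Invented sample snapshot, for illustration only.
const sampleSnapshot = [
  '- textbox "Search" [ref=e4]',
  '- button "Send message" [ref=e12]',
  '- link "Privacy" [ref=e31]',
].join("\n");

console.log(findRef(parseSnapshot(sampleSnapshot), "button", "Send message"));
```

The point of the token is that the agent can then act on `e12` by name rather than on a guessed x,y coordinate.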

What actually happens the first time you let Fazm onto your Mac

You install Fazm. macOS asks for permissions, one at a time.

Onboarding follows the order: microphone → accessibility → screen_recording (ChatPrompts.swift line 326). Each permission gets a one-sentence reason and a 'Why?' button that the agent must answer if you tap it. The agent is told to never nag (line 327).

Before the first permission, the agent says where the code lives.

ChatPrompts.swift lines 307-308 instruct the agent to send this exact line: 'Everything is open-source at github.com/mediar-ai/fazm — your data is stored locally and you own it all.' This message is required before any sensitive permission request.

When Fazm starts driving the browser, every tab gets a halo.

The Playwright MCP is launched with --init-page browser-overlay-init-page.js (acp-bridge/src/index.ts line 1037). Every page Fazm visits gets injected with the wing gradients, the four corner blobs, and the centered pill. You always know when the agent is in control.

When Fazm reads a Mac app, it asks the AXUIElement tree, not the pixels.

The mcp-server-macos-use binary, bundled inside Contents/MacOS, is registered at acp-bridge/src/index.ts line 1056. It exposes the same accessibility nodes VoiceOver reads. Roles, labels, and frames travel as text. You can read a transcript of what the agent saw.

If you close Fazm, the agent stops.

There is no daemon staying behind. The bridge process and the five MCP servers (fazm_tools, playwright, macos-use, whatsapp, google-workspace, enumerated at acp-bridge/src/index.ts line 1266) exit with the app. The Playwright extension token can be revoked from chrome://extensions.
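One way a parent process can guarantee that its helpers exit with it is to keep every child handle in a registry and signal them all on quit. The sketch below uses the five server names listed above; `ServerRegistry`, `ChildLike`, and the spawn callback are hypothetical and do not reproduce acp-bridge/src/index.ts.

```typescript
// The five server names come from this page; everything else is hypothetical.
const SERVER_NAMES = ["fazm_tools", "playwright", "macos-use", "whatsapp", "google-workspace"] as const;
type ServerName = (typeof SERVER_NAMES)[number];

// Minimal child handle: Node's ChildProcess offers a compatible kill().
interface ChildLike {
  kill(): boolean;
}

class ServerRegistry {
  private readonly children = new Map<ServerName, ChildLike>();

  // Start a server once; a second call for the same name is a no-op.
  start(name: ServerName, spawn: (name: ServerName) => ChildLike): void {
    if (this.children.has(name)) return;
    this.children.set(name, spawn(name));
  }

  // Signal every tracked child and forget it. Returns how many accepted the signal.
  stopAll(): number {
    let signalled = 0;
    this.children.forEach((child) => {
      if (child.kill()) signalled += 1;
    });
    this.children.clear();
    return signalled;
  }

  get size(): number {
    return this.children.size;
  }
}
```

In a real bridge the spawn callback would presumably wrap a process launch with stdio pipes, and `stopAll` would run from the app's quit handler; both of those are assumptions here.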

The two MCP servers that read your screen as text

The Playwright MCP returns labelled DOM nodes. The macos-use MCP returns the AXUIElement tree. Both are text. You can pipe them to a file. You can read them. You can diff yesterday’s perception against today’s.
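Because both outputs are text, comparing two of them needs nothing more than a line comparison. The sketch below is a deliberately simple set-based diff (it ignores ordering and duplicate lines), and the function name and sample snapshots are invented for illustration.

```typescript
interface SnapshotDiff {
  added: string[];
  removed: string[];
}

// Set-based line diff: reports lines present in only one of the two snapshots.
function diffSnapshots(before: string, after: string): SnapshotDiff {
  const beforeLines = new Set(before.split("\n"));
  const afterLines = new Set(after.split("\n"));
  return {
    added: Array.from(afterLines).filter((line) => !beforeLines.has(line)),
    removed: Array.from(beforeLines).filter((line) => !afterLines.has(line)),
  };
}

// Invented sample snapshots, for illustration only.
const yesterday = ['Button "Send"', 'TextField "Message"'].join("\n");
const today = ['Button "Send message"', 'TextField "Message"'].join("\n");

console.log(diffSnapshots(yesterday, today));
```

A screenshot pair offers no equivalent: two images can differ in thousands of pixels without telling you which label changed.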

acp-bridge/src/index.ts (excerpt)

What a chosen agent looks like, in chips

Browser controlled by Fazm
Feel free to switch tabs
z-index 2147483647
#fazm-overlay
AXUIElement tree
VoiceOver-grade accessibility
[ref=eN] DOM tokens
macOS permission dialog
github.com/mediar-ai/fazm
stdio, not a port
PYTHONDONTWRITEBYTECODE=1
Playwright MCP --init-page
Five MCP servers
Local data only

The sentence the agent must say before it can ask for anything

Most onboarding flows make trust an afterthought, a footer link, a long Terms doc nobody reads. Fazm hard-codes a one-line trust statement into the system prompt at the exact moment it matters: the moment before you grant your first sensitive permission.
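The page describes this ordering as an instruction in the system prompt. Purely as an illustration of the rule itself (trust statement first, then permissions one at a time in a fixed order), here is a small state machine; `OnboardingGate` and its methods are hypothetical and are not taken from ChatPrompts.swift.

```typescript
// Order quoted on this page: microphone, then accessibility, then screen_recording.
const PERMISSION_ORDER = ["microphone", "accessibility", "screen_recording"] as const;
type Permission = (typeof PERMISSION_ORDER)[number];

class OnboardingGate {
  private trustMessageSent = false;
  private nextIndex = 0;

  // Mark the trust statement as delivered to the user.
  sendTrustMessage(): void {
    this.trustMessageSent = true;
  }

  // Returns the next permission to request, or null once all have been requested.
  // Throws if the trust statement has not been sent yet.
  requestNext(): Permission | null {
    if (!this.trustMessageSent) {
      throw new Error("Trust statement must be sent before any permission request");
    }
    if (this.nextIndex >= PERMISSION_ORDER.length) return null;
    const permission = PERMISSION_ORDER[this.nextIndex];
    this.nextIndex += 1;
    return permission;
  }
}
```

A prompt instruction asks the model to follow the order; a gate like this would make skipping the statement an error instead of a judgment call.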

Desktop/Sources/Chat/ChatPrompts.swift (excerpt)

Verify the claims yourself in three commands

None of the facts on this page require trust. They are filesystem state on your Mac. A bundled accessibility-API binary, a bridge log, and an empty list of listening ports are all you need.
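For the "empty list of listening ports" check, one hedged approach is to capture the text printed by `lsof -nP -iTCP -sTCP:LISTEN` and filter it by process name. The function name, the process names, and the sample output below are invented for illustration; the real process name on your Mac may differ.

```typescript
// Input: the text printed by `lsof -nP -iTCP -sTCP:LISTEN`.
// Returns the rows whose COMMAND column starts with the given prefix.
// A prefix match is used because lsof may truncate long command names.
function listeningLinesFor(lsofOutput: string, commandPrefix: string): string[] {
  const wanted = commandPrefix.toLowerCase();
  return lsofOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith("COMMAND"))
    .filter((line) => line.split(/\s+/)[0].toLowerCase().startsWith(wanted));
}

// Invented sample output, for illustration only.
const sampleLsofOutput = [
  "COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME",
  "rapportd  501 me     8u   IPv4 0x1    0t0      TCP *:49152 (LISTEN)",
].join("\n");

// An empty result means no process with that name holds a listening TCP socket.
console.log(listeningLinesFor(sampleLsofOutput, "python").length === 0);
```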

zsh

The contract, in plain words

You will always know when Fazm is in control of a browser tab, because the tab will have a halo and a pill on it. You will always be able to read the text of what Fazm saw, because Fazm reads accessibility trees, not screenshots. You will always be able to stop Fazm by quitting the menu bar app, because the bridge process and its five MCP servers exit with the app.

That is what good looks like for an agent on your computer in 2026. The meme had it right about what bad looks like.

What to demand from any 'AI agent on your computer' product

A visible on-screen indicator the moment the agent is in control
An accessibility-API perception layer, not a screenshot pipeline
macOS permissions requested one at a time, with a per-permission reason
An open-source repository you can read before granting permissions
A local-first data path, with no listening network ports
A way to stop the agent in one click (quit, revoke token, kill process)
A logged history of every tool call the agent made on your behalf

Want to see the halo, the AX tree, and the kill switch live on your Mac?

Twenty minutes, screen-shared. We open browser-overlay-init.js together, watch the pill paint itself on a live tab, and pull a real AXUIElement tree out of macos-use.

Frequently asked questions

Is there really an FBI agent assigned to my computer?

No. 'The FBI agent assigned to my computer' is a TikTok-era meme: a friendly fiction in which an exhausted federal employee has to watch your search history all day. The Bureau is not literally watching your webcam, and the only way they could is via a court order plus targeted exploitation, not a default subscription. Quora and Clario have written this answer at length. The interesting question for 2026 is the inverse: there genuinely is going to be an agent on your computer, in the form of an LLM-driven assistant, and the question is whether you choose it, see it, and audit it. Fazm is one answer to that question.

How is Fazm different from any other AI 'agent on my computer'?

Three concrete differences, all visible in the source. First, when Fazm is driving the browser the page itself wears a halo: acp-bridge/browser-overlay-init.js injects four pulsing wing gradients (#fazm-w-top through -right) and a centered pill that reads 'Browser controlled by Fazm · Feel free to switch tabs or use other apps' (line 58), at z-index 2147483647 so nothing can hide it. Second, the way Fazm 'sees' your Mac is by reading the macOS Accessibility tree (the same one VoiceOver uses) via the bundled mcp-server-macos-use (registered at acp-bridge/src/index.ts line 1056), and by reading labelled DOM nodes with [ref=eN] tokens via Playwright; not by taking screenshots and asking a vision model to guess pixel coordinates. Third, the onboarding flow is required to send 'Everything is open-source at github.com/mediar-ai/fazm — your data is stored locally and you own it all.' (ChatPrompts.swift line 308) before it can ask for any permission. Other agents do not have to do any of that.

What does 'reads your screen via accessibility APIs' actually mean compared to taking a screenshot?

An accessibility-API read returns a structured text tree: every interactive element has a role (Button, TextField, Group), an optional label ('Send message'), a position rect, and sometimes an editable value. That tree is the same one screen readers use to speak the screen aloud. A screenshot is a 2D array of RGB pixels; everything else (what is a button, what does it say, where is the cursor) has to be guessed by a vision model. Reading the tree is faster, more reliable across UI redesigns, and auditable in a way pixels are not. You can save the tree to a log file and read it back. You cannot meaningfully audit what a vision model 'saw' in a screenshot.

If Fazm can read my screen, how is that not the surveillance the FBI agent meme is making fun of?

Three guardrails that surveillance does not have. One, it requires a macOS permission you actively granted in System Settings, which you can revoke at any time. The OS itself enforces the gate, not Fazm. Two, the only process reading the tree is one you can terminate by quitting the app, and the Python MCP server speaks stdio rather than listening on a port (verifiable with lsof). Three, the data goes to local conversation history and to whatever AI provider you have configured for your turn; the agent's instructions to be transparent about what it is doing are baked into the system prompt at session start (Desktop/Sources/Chat/ChatPrompts.swift). Surveillance is invisible, persistent, network-bound, and deny-by-default for the user. Fazm is the inverse on every axis.

Why does Fazm paint a glowing halo on every webpage when it takes over the browser?

Because anything else would feel like the meme. The halo is not branding for its own sake; it is a hard requirement of consensual automation. acp-bridge/browser-overlay-init.js paints four edge-pinned radial gradients (#fazm-w-top, #fazm-w-bottom, #fazm-w-left, #fazm-w-right) on a 4 to 5 second pulse and a centered pill. The element id is 'fazm-overlay', and the canvas inside it carries z-index 2147483647 with !important, so a hostile or buggy site cannot simply stack a higher layer over it. If you cannot see the halo on a page, Fazm is not driving that page. That is the contract.

What stops a malicious site from hiding the Fazm overlay?

z-index 2147483647 (the maximum 32-bit signed integer, applied to both #fazm-canvas2 on line 23 and #fazm-pill3 on line 43 of browser-overlay-init.js) plus !important on every layout property in the pill block. A site that wanted to draw on top would need to match the same value, and the page's ordinary style rules cannot override an !important declaration, which makes that very hard to do silently. The overlay is also re-injected on every page load and DOMContentLoaded event by the patched Playwright extensionContextFactory.js (scripts/patch-playwright-overlay.cjs), so navigating to a new page or reloading cannot strip it.

Where does Fazm send my data?

Local first. The macos-use server speaks stdio (no port), the bundled Python MCP for Google Workspace also speaks stdio (no port), and conversations and the knowledge graph live on disk in your user data directory. When the agent makes an LLM call, that request goes to whichever model provider you have configured. There is no analytics-driven 'send a copy of every screenshot to a vendor server' pipeline, and there is no copy of every screenshot to send, because the agent reads the accessibility tree instead.

Is Fazm open source?

Yes. The onboarding script literally requires the agent to say so before requesting any permissions: ChatPrompts.swift line 308 instructs the agent to send 'Everything is open-source at github.com/mediar-ai/fazm — your data is stored locally and you own it all.' as a trust-building message before the first sensitive prompt. You can read the overlay file, the MCP server registrations, the build script that bundles the Python interpreter, and the Swift floating bar code. None of it is obfuscated.

Can I tell when Fazm is reading something I did not ask it to read?

Yes. Two surfaces. First, the Fazm chat panel logs every tool call: each browser_snapshot, each macOS accessibility query, each Python invocation appears as an inline tool call card you can expand. Second, the on-page overlay tells you the browser is under control; absence of overlay means absence of browser control. For native Mac app reads, the agent's chat reply contains the exact tool calls and their text-form arguments, because the macos-use server returns text. You cannot get the same audit trail out of any screenshot-based agent.
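A text-first audit trail can be as plain as an append-only list of tool calls. The sketch below is an assumption about what such a record might contain; the `ToolCallRecord` fields and the `ToolCallLog` class are hypothetical and do not describe Fazm's actual chat panel or storage format.

```typescript
interface ToolCallRecord {
  timestamp: string; // ISO 8601
  server: string;
  tool: string;
  args: Record<string, unknown>;
}

class ToolCallLog {
  private readonly records: ToolCallRecord[] = [];

  // Append one call; `now` is injectable so the log is easy to test.
  record(server: string, tool: string, args: Record<string, unknown>, now: Date = new Date()): ToolCallRecord {
    const entry: ToolCallRecord = { timestamp: now.toISOString(), server, tool, args };
    this.records.push(entry);
    return entry;
  }

  // All calls made through one server, in the order they happened.
  byServer(server: string): ToolCallRecord[] {
    return this.records.filter((entry) => entry.server === server);
  }

  // One JSON object per line, ready to append to a file.
  toJsonLines(): string {
    return this.records.map((entry) => JSON.stringify(entry)).join("\n");
  }
}

const auditLog = new ToolCallLog();
auditLog.record("playwright", "browser_snapshot", {});
console.log(auditLog.toJsonLines());
```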

How do I stop Fazm mid-action?

Quit the app from the menu bar. The bridge process and the five MCP servers exit with the app (acp-bridge/src/index.ts line 1266 enumerates them). If you only want to disconnect the browser, open chrome://extensions and revoke the Playwright MCP Bridge token, which kills the CDP attach instantly. There is no covert resident process that survives quitting; that is the whole point of building this on plain macOS app architecture rather than a kernel extension.

Why is this in a guide about an FBI meme?

Because the meme is the right cultural anchor for explaining what AI agents on the desktop should look like. The TikTok 'FBI agent assigned to my computer' is funny because everyone agrees, intuitively, that an invisible non-consenting watcher is bad. Fazm is the easiest possible product to position against that intuition: a Mac app you actively install, that asks for permissions one at a time, that paints itself onto every page it touches, that reads your UI through APIs you can audit, and that ships its source code. If your computer is going to have an agent on it (and it is) the meme is the test for what good looks like.

Try the inverse of the meme

Fazm is a free download for macOS. Source on GitHub. Halo on every page. Accessibility tree, not screenshots. Permissions one at a time, granted by you, revocable by you.

Download Fazm for Mac


More on agents that read your screen the right way

Why accessibility APIs beat screenshots for AI desktop agents

What an accessibility tree actually is, for non-developers

Python automation in a browser, for people who do not have Python
