VISIBLE OVERLAY / YOUR REAL CHROME / YOUR SESSIONS

AI browser automation you can actually see, running inside the Chrome you are already signed into

Every product on the first page for this keyword puts the automation in a cloud browser or a separate Chromium window. Fazm runs it inside the exact Chrome.app you opened this morning, keeps your logins and browser fingerprint intact, and drops a visible glowing overlay on every page the agent touches so you always know when the AI has the wheel. The overlay reads "Browser controlled by Fazm · Feel free to switch tabs or use other apps" and uses pointer-events:none so it never blocks your own clicks.

Matthew Diakonov
10 min read
Written from the Fazm source tree
Runs in your real Chrome
Visible overlay on every page
pointer-events:none
Accessibility tree, not screenshots
Signed macOS app

WHY THIS KEYWORD IS MISLEADING

The default picture of AI browser automation is wrong

If you google this exact phrase you will find ten variations on one answer: a cloud-hosted agent that spins up a fresh Chromium somewhere, runs a task, and shows you a video of it afterwards. That works great for scraping a competitor's pricing page. It is a bad fit for the tasks most people actually want automated.

The tasks people want automated involve logged-in state. Paying a bill in a bank portal. Replying to a support ticket in Help Scout. Updating a CRM record. Applying to a job on LinkedIn with the resume already saved in autofill. A fresh cloud Chromium does not have any of that. It has to log in from scratch, which means passing your credentials to the cloud, solving whatever 2FA challenge comes up, and surviving the browser fingerprint heuristics that flag headless sessions.

The Fazm answer to this is the boring one: do not use a fresh browser at all. Drive the one the user is already running, with their cookies, their saved passwords, their fingerprint, their extensions, their trust history. Then put a visible overlay on top of it so the user knows when they are watching the AI and when they are watching themselves.

The default pattern vs the Fazm pattern

An agent runs in a cloud-hosted or locally launched Chromium that has none of your saved state. To do a task on a site you are already logged into, the agent either asks for your credentials and logs in fresh (so 2FA, captchas, and fingerprint heuristics all fire at the worst time), or it cannot do the task at all.

  • Fresh browser, no cookies
  • New fingerprint, often flagged as bot
  • Re-login for every logged-in site
  • No visual signal when the agent is driving
  • Task dies the moment it leaves the tab

THE ANCHOR FACT

One CSS rule decides the whole UX

The overlay is a single DOM element injected at the top of every page the agent controls. It has id fazm-overlay, lives at z-index 2147483647 (the maximum 32-bit signed integer, so no page element can claim a higher z-index), and uses pointer-events:none everywhere so your clicks, hovers, and scroll events pass through to the page underneath. That combination is the entire trick: visible and undismissable, yet completely non-blocking.

acp-bridge/browser-overlay-init.js (lines 16 to 59, abridged)

The status pill text ("Browser controlled by Fazm · Feel free to switch tabs or use other apps") is important enough to be hardcoded, not templated. It is explicitly giving the user permission to ignore the agent and do something else. On a sandbox browser this would be irrelevant; on your own Chrome it is the difference between trust and paranoia.
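To make the rule concrete, here is a minimal sketch of how an overlay like this could be assembled as a script string for Playwright's page.evaluate. It is not the contents of browser-overlay-init.js. The id, the z-index, the pointer-events rule, and the pill text are the values quoted above; the function name buildOverlayScript and the positioning CSS are assumptions made for illustration, and the gradient wings are left out.

```typescript
// Sketch only, not the Fazm source. Values taken from the article: the id,
// the z-index, pointer-events:none, and the pill text. Everything else
// (function name, positioning CSS) is an illustrative assumption.
const OVERLAY_ID = "fazm-overlay";
const OVERLAY_Z_INDEX = 2 ** 31 - 1; // 2147483647, the largest 32-bit signed integer
const PILL_TEXT =
  "Browser controlled by Fazm · Feel free to switch tabs or use other apps";

function buildOverlayScript(): string {
  const rootCss = [
    "position:fixed",
    "inset:0",
    `z-index:${OVERLAY_Z_INDEX}`,
    "pointer-events:none", // clicks, hovers, and scrolls fall through to the page
  ].join(";");
  const pillCss = [
    "position:absolute",
    "top:12px",
    "left:50%",
    "transform:translateX(-50%)",
    "pointer-events:none", // children must not intercept input either
  ].join(";");

  // The returned source runs inside the page, so it is allowed to use `document`.
  return `(() => {
    if (document.getElementById(${JSON.stringify(OVERLAY_ID)})) return;
    const root = document.createElement("div");
    root.id = ${JSON.stringify(OVERLAY_ID)};
    root.style.cssText = ${JSON.stringify(rootCss)};
    const pill = document.createElement("div");
    pill.textContent = ${JSON.stringify(PILL_TEXT)};
    pill.style.cssText = ${JSON.stringify(pillCss)};
    root.appendChild(pill);
    document.documentElement.appendChild(root);
  })()`;
}
```

Building the overlay as a string keeps the bridge side free of DOM types, and the early return makes repeated injection on the same page harmless.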

HOW THE OVERLAY GETS THERE

From the Mac app to your browser tab

SEQUENCE · overlay injection path

Participants: Fazm app, acp-bridge, Playwright MCP, Chrome extension, Your tab

Messages, in order:

  • start browser task
  • --extension --image-responses omit
  • CDP connect via local token
  • attach to existing tab
  • page.evaluate(overlayScript)
  • YAML snapshot (ref=e1, e2, ...)
  • structured page state
  • next action decided by agent

WHERE THE NUMBERS LIVE

A few specific numbers from the source

None of these are invented. Each one is a literal value that appears in a file on disk in the Fazm source tree. If you downloaded the open-source repo at github.com/mediar-ai/fazm and grepped for any of the tokens below, you would find them.

2147483647: z-index of the overlay (INT32_MAX)
5: built-in MCP servers registered as peers
1033: line where --image-responses omit is set
4: animated gradient wings around each page

Line numbers are from the current main branch. 1266 is the BUILTIN_MCP_NAMES Set in acp-bridge/src/index.ts; 1033 is the Playwright flag line in the same file; the overlay z-index appears twice in acp-bridge/browser-overlay-init.js.

THE ARCHITECTURE

Your English sentence to a click in your browser

Where the browser leg sits inside Fazm

You typing
Claude Agent SDK
acp-bridge
playwright
macos-use
whatsapp
google-workspace
fazm_tools

WIRING THE EXTENSION

The extension mode switch, in ten lines

The bit that decides whether Fazm drives your real Chrome or spins up a fresh one is genuinely small. The whole thing is an env-var gated playwrightArgs.push("--extension") plus a pass-through of the extension token so the local Playwright and the Chrome extension can authenticate each other.
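A minimal sketch of what such a switch could look like, written as a pure function so it is easy to test. The names PLAYWRIGHT_USE_EXTENSION, --extension, and --image-responses omit come from this page. The function name buildPlaywrightLaunch and the token variable PLAYWRIGHT_MCP_EXTENSION_TOKEN are assumptions for illustration, not a claim about what index.ts actually contains.

```typescript
// Sketch only, not the Fazm source. The token variable name below is an
// assumption; the article only says a token is passed through.
type Env = Record<string, string | undefined>;

interface PlaywrightLaunch {
  args: string[];              // CLI args for the Playwright MCP process
  env: Record<string, string>; // extra environment for that process
}

function buildPlaywrightLaunch(env: Env): PlaywrightLaunch {
  const playwrightArgs: string[] = [];
  const childEnv: Record<string, string> = {};

  if (env.PLAYWRIGHT_USE_EXTENSION === "true") {
    // Attach to the Chrome the user is already running, via the bridge extension.
    playwrightArgs.push("--extension");
    // Hypothetical variable name for the pass-through auth token.
    const token = env.PLAYWRIGHT_MCP_EXTENSION_TOKEN;
    if (token) {
      childEnv.PLAYWRIGHT_MCP_EXTENSION_TOKEN = token;
    }
  }

  // Strip inline images so the model works from accessibility snapshots.
  playwrightArgs.push("--image-responses", "omit");
  return { args: playwrightArgs, env: childEnv };
}
```

Calling buildPlaywrightLaunch with an empty object yields only the image flag, which corresponds to the fallback case where Playwright launches its own Chromium.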

acp-bridge/src/index.ts (lines 1028 to 1054, abridged)

A GROUND-TRUTH MOMENT

What the startup log looks like when Accessibility works

On boot, Fazm does not just call AXIsProcessTrusted. That function returns cached TCC state that can be stale for hours after a macOS update or app re-sign. Instead it calls the real accessibility API against the frontmost app and against Finder, and logs the result. Here is what the log looks like on a healthy machine.

~/Library/Logs/Fazm/app.log

WHAT THIS BUYS YOU

The practical consequences

Logged-in sites just work

Because the browser is your browser, the agent inherits every session you already have. No credential forwarding, no fresh login, no surprise 2FA prompt five minutes into a task.

Cloudflare Turnstile and bot checks pass transparently

Your Chrome has a history. It has passed thousands of Turnstile and reCAPTCHA checks over time. The agent inherits that reputation. A headless Chromium does not.

Shared control, same tab

The overlay is pointer-events:none. The agent clicks what it clicks, you click what you click, and neither blocks the other. You can take over mid-task and hand it back.

Task can exit the browser

When the workflow ends with a desktop action (rename a file, change a macOS setting, send a WhatsApp message) the agent switches to the macos-use MCP and keeps going via the real Accessibility API.

No pixel clicks

Playwright is launched with --image-responses omit. The agent navigates a YAML accessibility snapshot with numbered refs (ref=e1, e2, ...), not screenshots. The model context stays small and deterministic.

Visible, always

The overlay sits at INT32_MAX z-index with a status pill reading 'Browser controlled by Fazm'. If the wings are not on the page, the agent is not driving. That is a real-time trust signal.

HEAD TO HEAD

Fazm vs a typical cloud AI browser automation service

Feature | Typical cloud AI browser agent | Fazm
Whose Chrome runs the task | A fresh Chromium on a vendor server | The Chrome.app on your Mac, already running
Logged-in sites | Needs your credentials forwarded; 2FA at task start | Inherited from your real profile; nothing to forward
Bot-detection / Turnstile | Often flagged; user solves challenges | Uses your existing fingerprint and history
Visibility of what the agent is doing | Replay video after the fact | Live glowing overlay + status pill on every page
Blocking the user while agent runs | Usually separate session; cannot share tab | pointer-events:none, you can click through at any time
What the model sees of the page | Usually screenshots; expensive context | YAML accessibility snapshot, ref=e1 style
Works outside the browser | No, browser-only | Yes, same agent jumps to macos-use for native Mac apps
Shape of the product | SaaS or developer framework | Signed notarized consumer Mac app, English in a floating bar

ONE-TIME SETUP

What installing this looks like

The extension-mode setup is a short checklist you run through once. Everything downstream is plain English typed into a floating bar.

ONE-TIME INSTALL

  • Download and install the signed Fazm Mac app from fazm.ai
  • Grant Accessibility and Screen Recording permissions (two System Settings panes)
  • Install Playwright MCP Bridge from the Chrome Web Store (ID mmlmfjhmonkocbjadbfplnigmagldckm)
  • Paste the bridge auth token shown in the Fazm setup window into the extension options
  • Click 'Test connection' to confirm the local Playwright can reach the extension
  • After this, every English sentence that touches the web routes through your real Chrome

WHO THIS IS FOR

Who should use AI browser automation like this

If you need to scrape a million product pages or run thousands of parallel sessions, use a cloud tool. That is what they are for. Browser automation at that scale is a server problem, and a Fazm-style consumer Mac app is not the right vehicle for it.

If you are a single person whose most annoying tasks look like "reply to these five Slack threads, pull the attachments into Drive, file the invoices in our accounting tool, and send Sam the final link", the Fazm shape is the one that fits. One session. Your logins. Your Mac. You watch it happen. When it gets stuck you take over without restarting.

The spectrum runs from robotic process automation (cloud, scripted, high-throughput, zero trust) to personal AI agent (local, reasoning, one operator, full trust). Fazm sits firmly at the personal-agent end of that spectrum, and the visible overlay is how it earns the trust.

SOURCE FILE INDEX

Everything on this page, verifiable in five files

acp-bridge/browser-overlay-init.js

Full overlay DOM and CSS, status pill text, z-index, pointer-events, four animated wings.

acp-bridge/browser-overlay-init-page.js

Playwright init-page entry point that hooks page load events and calls page.evaluate with the overlay script.

acp-bridge/src/index.ts

Line 1029 is the --extension switch. Line 1033 is --image-responses omit. Line 1266 is the BUILTIN_MCP_NAMES Set with five entries.

Desktop/Sources/BrowserExtensionSetup.swift

Line 220 contains the Chrome Web Store URL for the Playwright MCP Bridge extension. Token generation and connection probe live in the same file.

Desktop/Sources/AppState.swift

Line 433 is testAccessibilityPermission. Lines 468 to 485 are the Finder fallback. The CGEvent tap probe is below that.

THE UNCOPYABLE PART

Why this page is not easy to clone

Almost every other result on the SERP is reviewing a product from the outside. The reviewers have not read the source, because it belongs to the vendor and sits behind an API. The Fazm app is open source at github.com/mediar-ai/fazm, so everything on this page points at a line of code you can read yourself.

You can go and see 2147483647 literally written in acp-bridge/browser-overlay-init.js. You can grep for BUILTIN_MCP_NAMES and find the set of five. You can run the Mac app yourself and look at a real page with the overlay on it. That verifiability is the moat, not any single feature.

Want the agent to share a tab with you instead of a cloud browser?

Fifteen minutes. I will show you the overlay injecting onto a live page, the YAML snapshot with ref=e1 in it, and the agent handing control back mid-task.

Book a call

Frequently asked questions

How does Fazm attach to my actual Chrome instead of launching a headless one?

Fazm ships Playwright MCP in extension mode. When PLAYWRIGHT_USE_EXTENSION is set to true, the bridge appends --extension to the Playwright args (see acp-bridge/src/index.ts lines 1029 to 1031). That flag tells Playwright to connect to a Chrome Web Store extension called Playwright MCP Bridge, with extension ID mmlmfjhmonkocbjadbfplnigmagldckm. Once the extension is installed and you paste the one-time auth token from the Fazm setup window, Playwright drives the Chrome you are already running, with your existing profile, cookies, SSO sessions, and browser fingerprint intact. If the extension is not present, Playwright falls back to launching its own fresh Chromium.

What is the visible overlay and why does Fazm add it?

Every page the agent controls gets a full-viewport element with id fazm-overlay injected at z-index 2147483647 (the maximum 32-bit signed int, so no page element can claim a higher z-index). It renders four soft radial gradient wings at the edges of the page and a small centered status pill that reads 'Browser controlled by Fazm · Feel free to switch tabs or use other apps'. The overlay uses pointer-events:none, so it is visible but never intercepts clicks. You can see the exact CSS and DOM in acp-bridge/browser-overlay-init.js lines 16 to 68. The overlay exists because the user needs to know when the AI is driving so they can stop it if it starts doing something unexpected, and because on a real Chrome with real logins the psychological cost of not knowing is higher than on a fresh sandbox.

How is this different from Browser-Use, Operator, Chrome Auto Browse, or Fellou?

Those are variations on the same shape: the agent gets a screenshot or a DOM snapshot from a browser the user does not own (cloud-hosted Chromium, or a Chrome profile the vendor runs). Fazm is the inversion. The browser the agent drives is the exact Chrome.app you double-clicked this morning, in your own macOS user account, with your own extensions and saved passwords. The agent gets accessibility snapshots, not screenshots, and those snapshots are saved to /tmp/playwright-mcp as YAML files with numbered refs like [ref=e1]. Inline base64 image responses are stripped out at the MCP layer by launching Playwright with --image-responses omit (acp-bridge/src/index.ts line 1033), so the model never eats a screenshot even when a cloud competitor would have sent one.
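To show why numbered refs keep the context small, here is a minimal sketch that pulls role, name, and ref out of snapshot-style lines. The two example lines are made up for illustration, the line shape (a role, an optional quoted name, then [ref=eN]) is an assumption about the snapshot format, and parseSnapshotRefs is a hypothetical helper, not something from the Fazm tree.

```typescript
// Sketch only. The snapshot lines are invented examples, and the line shape
// they follow is an assumption about the snapshot format.
interface SnapshotNode {
  role: string; // e.g. "button"
  name: string; // accessible name, empty when the line has none
  ref: string;  // e.g. "e2"
}

function parseSnapshotRefs(snapshot: string): SnapshotNode[] {
  const nodes: SnapshotNode[] = [];
  // Matches lines such as:  - button "Submit" [ref=e2]
  const linePattern = /^\s*-\s+(\w+)(?:\s+"([^"]*)")?.*\[ref=(e\d+)\]/;
  for (const rawLine of snapshot.split("\n")) {
    const match = linePattern.exec(rawLine);
    if (!match) continue; // lines without a ref are skipped
    const [, role = "", name = "", ref = ""] = match;
    nodes.push({ role, name, ref });
  }
  return nodes;
}

const exampleSnapshot = [
  '- textbox "Email" [ref=e1]',
  '- button "Submit" [ref=e2]',
].join("\n");

const exampleNodes = parseSnapshotRefs(exampleSnapshot);
```

With refs in hand, an action can be expressed as "click e2" instead of a pixel coordinate, which is what keeps the agent's decisions deterministic across window sizes.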

If the overlay covers the page, how do I still click things myself?

The overlay uses pointer-events:none on the root and all of its children. CSS pointer-events:none makes the element completely transparent to the mouse: hovers, clicks, text selection, scroll wheels all go straight through to whatever is under it. So you see the glow and the status pill, but every click you make lands on the actual page. That is the whole point. The agent and the user can share the same tab in real time. You can watch the AI fill a form, take over mid-field to fix an autocomplete, then hand it back by typing into Fazm again.

What happens when the task leaves the browser? Does Fazm stop working?

No. Fazm registers five MCP servers as peers, not just Playwright. The list is hardcoded as a Set in acp-bridge/src/index.ts at line 1266 with exactly these names: fazm_tools, playwright, macos-use, whatsapp, google-workspace. When the task needs to rename a file in Finder, open a macOS setting, or paste something into the native WhatsApp app, the agent switches to macos-use or whatsapp, which drive those apps through the real macOS Accessibility API (AXUIElement, kAXFocusedWindowAttribute, etc.) rather than pixel guessing. So AI browser automation here is a subset of something bigger, and the same English sentence can start in Chrome and finish in Finder without a seam.
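A minimal sketch of how a set like that can be used to route a tool call to its server. The five names are the ones listed above. The helper builtinServerFor and the tool names in the usage note are hypothetical, and the mcp__server__tool naming shape is assumed here rather than taken from the Fazm source.

```typescript
// The five names come from the article; the routing helper is hypothetical.
const BUILTIN_MCP_NAMES = new Set<string>([
  "fazm_tools",
  "playwright",
  "macos-use",
  "whatsapp",
  "google-workspace",
]);

// Assumes tool names shaped like "mcp__<server>__<tool>".
function builtinServerFor(toolName: string): string | undefined {
  const [prefix, server] = toolName.split("__");
  if (prefix !== "mcp" || server === undefined) return undefined;
  return BUILTIN_MCP_NAMES.has(server) ? server : undefined;
}
```

Under these assumptions, builtinServerFor("mcp__playwright__browser_click") returns "playwright", while a tool from an unregistered server returns undefined.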

Is Fazm a developer tool or a consumer app?

A consumer app. There is no npm install, no API key to generate, no Python script to run, no docker compose up. You download a signed notarized Mac app, grant Accessibility and Screen Recording permissions, optionally install the Playwright MCP Bridge extension from the Chrome Web Store, and type English into a floating bar. Under the hood it runs the Claude Agent SDK against the five bundled MCP servers, but that is an implementation detail the user does not touch.

How does Fazm verify the Accessibility permission is actually working?

At boot it runs a real round-trip through the accessibility API, not just AXIsProcessTrusted, which can return stale cached values on macOS 26 Tahoe. The function is testAccessibilityPermission in Desktop/Sources/AppState.swift, starting at line 433. It grabs the frontmost app, calls AXUIElementCreateApplication on its PID, then tries to read kAXFocusedWindowAttribute back. Success, noValue, and attributeUnsupported all count as 'working'. If it gets cannotComplete, it re-runs against Finder specifically (lines 468 to 485) to disambiguate a real broken permission from an app that simply does not implement AX. If Finder also fails, the code falls back to creating a CGEvent tap as a last-resort probe of the live TCC database. That three-stage check is why the agent can trust the tree.
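The decision logic of that three-stage check can be modeled in a few lines. This is a sketch of the control flow only, written in TypeScript rather than Swift so it matches the other examples on this page. The real probes call macOS APIs; here they are injected as plain functions. Treating any first-probe result other than the four named above as a failure is my assumption, as are the type and function names.

```typescript
// Sketch of the control flow only. Result names mirror the cases described
// above; "otherError" and its handling are assumptions.
type AxResult =
  | "success"
  | "noValue"
  | "attributeUnsupported"
  | "cannotComplete"
  | "otherError";

interface AccessibilityProbes {
  frontmostApp: () => AxResult; // read the focused window of the frontmost app
  finder: () => AxResult;       // same read, against Finder
  eventTap: () => boolean;      // true if an event tap could be created
}

const WORKING_RESULTS: ReadonlySet<AxResult> = new Set<AxResult>([
  "success",
  "noValue",
  "attributeUnsupported",
]);

function accessibilityWorks(probes: AccessibilityProbes): boolean {
  // Stage 1: the frontmost app.
  const first = probes.frontmostApp();
  if (WORKING_RESULTS.has(first)) return true;
  if (first !== "cannotComplete") return false; // assumption: hard failure

  // Stage 2: cannotComplete is ambiguous, so ask an app known to implement AX.
  if (WORKING_RESULTS.has(probes.finder())) return true;

  // Stage 3: last resort, probe the live permission state with an event tap.
  return probes.eventTap();
}
```

Injecting the probes keeps the ordering testable without touching the system permission database.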

Why use the accessibility tree at all if Playwright already has a DOM?

Because the agent also needs to work outside the browser. In Chrome, Playwright hands back its own structural snapshot. But Slack, Finder, Mail, Settings, Notes, Xcode, VSCode, and Catalyst apps like WhatsApp do not expose a DOM the agent can query. They do publish to the macOS Accessibility API (the same surface screen readers use), which exposes a typed tree of roles, labels, bounds, and children for every UI element. By using the accessibility tree as the canonical ground truth for any surface, Fazm gets one mental model that spans Chrome plus the rest of your Mac, instead of a browser-only model that quits at the URL bar.

Does the overlay ever get stripped by a site that sanitizes DOM?

The overlay is injected via Playwright's page.evaluate on every page load, not via addInitScript (which does not work reliably on CDP-connected contexts in extension mode). See the comment block at the top of acp-bridge/browser-overlay-init.js. That means if a single-page app re-renders its body, the overlay re-injects on the next load event. If the page is a hardened origin that aggressively rewrites its own DOM, the overlay can be cosmetically displaced, but the agent's control path is independent of it. The overlay is a transparency signal, not a load-bearing component.
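A minimal sketch of that re-injection pattern. PageLike is a hand-written structural type covering only the two Page methods used here (on("load") and evaluate with a string), so the example runs without importing Playwright. attachOverlay is a hypothetical name, and swallowing injection errors is my reading of "not a load-bearing component", not something confirmed by the source.

```typescript
// Sketch only. PageLike mirrors the two Playwright Page methods this needs.
interface PageLike {
  on(event: "load", handler: () => void): unknown;
  evaluate(script: string): Promise<unknown>;
}

function attachOverlay(page: PageLike, overlayScript: string): void {
  const inject = (): void => {
    // A failed injection must never break the agent's control path,
    // so errors are deliberately ignored here (assumption).
    page.evaluate(overlayScript).catch(() => undefined);
  };
  page.on("load", inject); // re-inject after every page load
  inject();                // cover the page that is already open
}
```

Because the handler is registered on the load event rather than installed as an init script, it works the same way on a tab the bridge attached to after the fact.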

What does the one-time Chrome extension setup actually install?

It installs the Playwright MCP Bridge extension from the Chrome Web Store, at chromewebstore.google.com/detail/playwright-mcp-bridge/mmlmfjhmonkocbjadbfplnigmagldckm. This is an open-source bridge that lets a local Playwright process drive the Chrome window over a localhost channel. Once installed, you paste a bridge token into its options page (Fazm generates the token in BrowserExtensionSetup.swift and stores it under UserDefaults key 'playwrightExtensionToken'), then the setup window runs a connection probe to confirm the local Playwright can reach the extension. After that, any command you give Fazm that involves a URL routes through this extension into your real Chrome, not a separate Chromium.