VISIBLE AUTOMATION / YOUR REAL CHROME / OVERLAY ON EVERY TAB

Browser automation for Chrome, but you can see it happen.

Every playbook on this topic hands you a ChromeDriver binary, a Puppeteer script, or a headless Playwright harness and assumes the browser runs on a server nobody is looking at. Fazm runs in your real Chrome, on your real Mac, while you are using it, and paints a visible animated overlay on every page the agent touches. The pill in the center of the screen reads "Browser controlled by Fazm · Feel free to switch tabs or use other apps". That sentence is in the source, at acp-bridge/browser-overlay-init.js.

Matthew Diakonov
8 min read
Written from the Fazm source tree
Attaches to your real Chrome
Visible overlay on every tab
One Chrome extension, one token
Inherits all your logins and cookies
Consumer Mac app, no CLI

THE SILENCE PROBLEM

Most Chrome automation is invisible, and that is the bug

Selenium drives a fresh Chrome with a plain "Chrome is being controlled by automated test software" banner and a clean profile. Puppeteer and headless Playwright run offscreen by default. ChromeDriver steals window focus every time it clicks. All of them were designed for CI: a remote browser inside a container nobody is watching. When you run any of them on your laptop, the tab is alive but there is no signal that tells you what is yours and what is the bot's.

Fazm runs in the opposite setup. The browser is your browser, already logged in, with the tabs you opened. The AI drives one of them. You need to know which.

The quiet agent vs. the visible agent

A headless or newly launched Chrome runs on the side. If it is visible at all, you get a generic automation banner across the top and a focus-stealing cursor that interrupts anything you are doing.

  • Fresh profile, no cookies or logins
  • Generic automation banner or nothing
  • Steals window focus on every click
  • No per-tab indicator
  • You cannot safely switch to another tab

  • fazm-overlay (DOM id)
  • fazm-pill3 (status pill)
  • fazm-w-top, fazm-w-bottom
  • fazm-w-left, fazm-w-right
  • fazm-blob1..4 (corner glows)
  • z-index: 2147483647
  • pointer-events: none
  • --init-page flag
  • page.context().addInitScript()
  • page.evaluate(overlayScript)
  • PLAYWRIGHT_MCP_EXTENSION_TOKEN
  • acp-bridge/src/index.ts:1037
  • @playwright/mcp --extension
  • Chrome Web Store ID mmlmfjhmon...

THE ANCHOR FACT

The overlay is one file on disk

This is the part no other guide on this topic can write, because no other Chrome automation tool ships a visible human-facing overlay. Fazm's lives in a single JavaScript file inside the app bundle. When the agent spawns Playwright MCP, it points the --init-page flag at a small loader for this file (acp-bridge/browser-overlay-init-page.cjs), and the overlay script runs on every new page. The script builds a div, stamps a unique id on it, and drops it at the highest possible z-index with pointer-events disabled.

acp-bridge/browser-overlay-init.js

Three things in that snippet are worth slowing down on. The dot between the two halves of the pill is a middle dot, not a hyphen, because the pill is a status line and deserves typographic care even though no reader will notice. The pointer-events:none on every layer is mandatory: if the overlay blocked a click, the agent's next browser_click call would land on the overlay instead of the button it meant to press. And the z-index of 2147483647 is the maximum 32-bit signed integer, the largest number any site's own CSS can use. Fazm picks the ceiling so nothing on the page can cover the status pill by accident.
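The three properties above can be sketched in a few lines. This is a minimal illustration, not the contents of browser-overlay-init.js: the helper names (layerStyle, buildOverlayHtml) and the positioning rules are assumptions, and only the ids, the pill text, the 28 pixel pill height, the z-index, and pointer-events:none come from this article.

```typescript
// Hypothetical sketch: helper names and positioning rules are illustrative only.
// Taken from the article: ids, pill text, pill height, z-index, pointer-events:none.

// 2^31 - 1 = 2147483647, the largest signed 32-bit integer.
const MAX_Z_INDEX = 2 ** 31 - 1;

// \u00B7 is the middle dot that separates the two halves of the pill text.
const PILL_TEXT =
  "Browser controlled by Fazm \u00B7 Feel free to switch tabs or use other apps";

// Style shared by every layer: sit above the page and never intercept a click.
function layerStyle(extra: string = ""): string {
  return `position:fixed;z-index:${MAX_Z_INDEX};pointer-events:none;${extra}`;
}

// Returns overlay markup as a string: one parent div with the status pill inside.
// The wings and blobs are omitted to keep the sketch short.
function buildOverlayHtml(): string {
  const pillStyle = layerStyle(
    "top:50%;left:50%;transform:translate(-50%,-50%);height:28px;"
  );
  const pill = `<div id="fazm-pill3" style="${pillStyle}">${PILL_TEXT}</div>`;
  return `<div id="fazm-overlay" style="${layerStyle("inset:0;")}">${pill}</div>`;
}
```

In a page context, markup like this could be mounted with document.body.insertAdjacentHTML("beforeend", buildOverlayHtml()). Because both layers carry pointer-events:none, a click at any coordinate reaches the page element underneath.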

HOW IT ATTACHES

The path from spoken prompt to painted overlay

When you ask Fazm to do something in Chrome, the request flows through a small number of components. The shape is worth seeing before the individual parts.

Inputs converge on one Chrome window

Voice / text prompt
System context
Memory
Fazm agent
Playwright MCP
Your real Chrome
Overlay init-page

What the stack actually does

Participants: You, Fazm app, Playwright MCP, Chrome extension, Real Chrome

1. speak a task
2. spawn with --extension flag
3. attach using auth token
4. open CDP channel
5. --init-page runs overlay JS
6. you see the glow + pill
7. browser_snapshot (accessibility tree)
8. snapshot yml on disk
9. browser_click [ref=e17]

WHAT YOU SEE

Anatomy of the overlay

The overlay is not a single blob. It is eleven elements layered inside one parent div, each doing a small job. None of them block clicks. None of them reach the model.

Four edge wings

fazm-w-top, fazm-w-bottom, fazm-w-left, fazm-w-right. Each is a radial-gradient ellipse that pulses between 0.7 and 1.0 opacity on a 4 to 5 second cycle. Together they give the viewport a soft breathing glow that frames whatever the agent is looking at.

Four drifting blobs

fazm-blob1 through fazm-blob4 sit in the four corners. Each one is 35 to 40 percent of the viewport in size, blurred to 60 pixels, and runs its own 6 to 8 second translate+scale loop on a staggered delay. They read as ambient motion, not UI.

One central pill

fazm-pill3 is a 28 pixel tall capsule floating dead center. It contains a spinner and the line "Browser controlled by Fazm · Feel free to switch tabs or use other apps". It is the only text on the overlay, and the only part most users consciously notice.

Shared z-index: 2147483647

Every overlay element uses the max signed 32-bit integer as its z-index. No site stylesheet can climb above it by accident, so the pill is never covered by a cookie banner or chat widget on top of the page.

pointer-events: none everywhere

Both the canvas and the pill disable pointer events. A browser_click call from the agent passes straight through to whatever button is underneath. The overlay is a skin, not a chrome.

Zero impact on snapshots

Playwright MCP's default browser_snapshot reads the page's accessibility tree and DOM, not pixels. The overlay is a sibling node in the body and the agent ignores it. It only shows up in screenshots, which are rare.

HOW IT GETS INJECTED

One line in one bridge turns it on

The bridge that spawns Playwright MCP is called the ACP bridge, short for Agent Client Protocol bridge. It lives at acp-bridge/src/index.ts inside the Fazm app bundle. The relevant block is short.

acp-bridge/src/index.ts

Three design choices in that block are load-bearing. First, the overlay only turns on when PLAYWRIGHT_USE_EXTENSION is set to true, which is the user-visible Settings toggle playwrightUseExtension. If someone opts out of extension mode, there is no Chrome to paint on, so the overlay stays out of the arg list. Second, snapshots get written to disk under /tmp/playwright-mcp instead of inlining giant base64 payloads into the model's context. Third, the init-page path is guarded with existsSync so a missing overlay file logs a warning instead of crashing the bridge.
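The three choices can be sketched as a small argument builder. This is a hedged illustration, not the code in acp-bridge/src/index.ts: the function name, the injected fileExists and warn callbacks, and the use of an --output-dir flag to keep snapshots on disk are assumptions. The --extension and --init-page flags, the PLAYWRIGHT_USE_EXTENSION variable, and the /tmp/playwright-mcp directory are the ones named in this article.

```typescript
// Hypothetical sketch of the arg-building logic; names are illustrative.
interface BridgeEnv {
  PLAYWRIGHT_USE_EXTENSION?: string;
}

function buildPlaywrightArgs(
  env: BridgeEnv,
  initPagePath: string,
  fileExists: (path: string) => boolean, // stands in for fs.existsSync
  warn: (message: string) => void
): string[] {
  // Assumed flag: write snapshots to disk instead of inlining them
  // into the model's context.
  const args: string[] = ["--output-dir", "/tmp/playwright-mcp"];

  // The overlay is only wired up in extension mode.
  if (env.PLAYWRIGHT_USE_EXTENSION === "true") {
    args.push("--extension");

    // A missing overlay file logs a warning; it must not crash the bridge.
    if (fileExists(initPagePath)) {
      args.push("--init-page", initPagePath);
    } else {
      warn(`overlay init page not found at ${initPagePath}; starting without overlay`);
    }
  }
  return args;
}
```

Passing fileExists and warn as parameters keeps the sketch free of Node imports; in a real bridge they would presumably be existsSync from the fs module and the bridge's own logger.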

fazm-dev.log at startup

THE UX CLAIM IN ONE LINE

“Feel free to switch tabs or use other apps.”

The pill says this because CDP extension mode does not lock your window. Unlike ChromeDriver, which hijacks focus on every click, the extension runs the agent's commands in the target tab's own event loop. You can keep typing in another tab. You can move your cursor into Figma. You can read Slack. The overlay is the promise that nothing will jump at you.

COMPARED TO THE CLASSICS

Where Chrome automation differs, tool by tool

Feature | Selenium / Puppeteer / ChromeDriver | Fazm (consumer Mac app)
Runs in your real Chrome profile | No, launches a fresh profile | Yes, attaches via extension over CDP
Inherits logins, cookies, 2FA state | No, every run starts clean | Yes, all sessions are yours
Visible per-tab indicator | Generic yellow banner or nothing | Animated glow + centered status pill
Steals window focus on clicks | Yes, every click grabs focus | No, CDP runs in the tab's event loop
Perceives page via pixels or DOM | DOM, but via WebDriver protocol | DOM + accessibility tree, not pixels
Setup surface | Language runtime, driver binary, scripts | Install app, add extension, paste token
Trips Cloudflare Turnstile | Often, because the profile is fresh | No, because the fingerprint is yours

ONBOARDING

The four phases between download and first drive

The setup window is defined in Desktop/Sources/BrowserExtensionSetup.swift as a state machine with four phases. No command line. No text file to edit. Each phase polls for its own completion condition before the next one unlocks.

BrowserExtensionSetup phases

1

Welcome

Explains what extension mode does in two lines. A primary button advances to Connect. A Skip for now button exists for users who want voice-only features without the browser.

2

Connect

Four sub-steps: check Chrome is installed at /Applications/Google Chrome.app, open the Chrome Web Store listing for Playwright MCP Bridge, open the extension's status.html to copy the token, paste the token into the text field. A timer polls every two seconds to move the checkmarks automatically.

3

Verify

Fazm spawns Playwright MCP with the pasted token, issues a no-op browser_tabs call, and shows a spinner until the server responds. On success, the overlay is injected on your current active tab immediately, as a demo.

4

Done

The token is saved to UserDefaults under the key playwrightExtensionToken. From that point on, the Settings page shows a masked token (first eight characters plus dots) and a Reconfigure button.
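The four phases can be sketched as a small state machine. The real one is Swift, in BrowserExtensionSetup.swift; this sketch uses TypeScript purely for illustration, and the signal names, the transition function, and the number of dots in the masked token are assumptions not taken from the source.

```typescript
// Hypothetical sketch of the four-phase setup flow; not the Swift implementation.
type Phase = "welcome" | "connect" | "verify" | "done";

// Completion conditions a polling timer might observe (names are illustrative).
interface SetupSignals {
  userAdvanced: boolean;       // primary button pressed on Welcome
  chromeInstalled: boolean;    // Chrome found in /Applications
  extensionInstalled: boolean; // Playwright MCP Bridge added
  tokenPasted: boolean;        // token present in the text field
  serverResponded: boolean;    // the no-op browser_tabs call returned
}

// Each phase advances only when its own completion condition holds.
function nextPhase(phase: Phase, s: SetupSignals): Phase {
  switch (phase) {
    case "welcome":
      return s.userAdvanced ? "connect" : "welcome";
    case "connect":
      return s.chromeInstalled && s.extensionInstalled && s.tokenPasted
        ? "verify"
        : "connect";
    case "verify":
      return s.serverResponded ? "done" : "verify";
    case "done":
      return "done";
  }
}

// Masks a token as its first eight characters plus dots.
// The dot count (eight) is an assumption.
function maskToken(token: string): string {
  return token.slice(0, 8) + "........";
}
```

A timer that fires every two seconds could call nextPhase with freshly gathered signals and update the checkmarks whenever the returned phase differs from the current one.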

MEASURED AT SIGHT

The numbers that define the overlay

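z-index used: 2147483647
overlay DOM ids: fazm-overlay, fazm-pill3, fazm-w-top, fazm-w-bottom, fazm-w-left, fazm-w-right, fazm-blob1..4
pill height: 28 px
wing pulse cycle: 4 to 5 s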
pointer-events: none

If the overlay blocked a click, the agent's next browser_click call would land on the overlay instead of the button it meant to press. pointer-events:none is mandatory.

acp-bridge/browser-overlay-init.js comment block

BEYOND THE BROWSER

The part this topic cannot contain

Every guide in this corner of the internet stops at the browser. Fazm does not, because the same agent that drives Chrome also drives Finder, Mail, and WhatsApp. That chain is the reason the overlay exists at all. If the AI can cross from a tab to a native window in the middle of a task, the human watching needs a cue for which one is currently being driven. The overlay handles Chrome. For native apps, Fazm uses a different MCP server, mcp-server-macos-use, which talks to the macOS accessibility API, not to pixels.

Feature | Pure browser automation | Fazm surface area
Drive a Chrome tab | Yes | Yes, via @playwright/mcp --extension
Drive Finder, Mail, Settings | No | Yes, via mcp-server-macos-use (AX API)
Send a WhatsApp message | No | Yes, via whatsapp-mcp (native Catalyst app)
Read Gmail, Drive, Calendar | No | Yes, via google-workspace MCP (API, not a browser)
Cross from browser to desktop in one turn | Not possible | One ACP session hosts all four servers

Want to see the overlay in action on your Mac?

Book a 15 minute call and we will walk through extension-mode setup, a live Chrome task, and cross-app chaining into Finder or Mail.

FREQUENT QUESTIONS

Things people ask before they install

What actually appears on a Chrome tab when Fazm drives it?

A semi-transparent animated overlay with four soft glow wings on the edges of the viewport, four drifting color blobs in the corners, and a small pill in the center of the screen reading 'Browser controlled by Fazm · Feel free to switch tabs or use other apps'. The overlay sits at z-index 2147483647 with pointer-events:none, so it never blocks a click and the agent underneath still perceives the page exactly as it is. The source lives in acp-bridge/browser-overlay-init.js in the Fazm Mac app bundle.

Why bother showing an overlay at all? Other Chrome automation tools don't.

Because other Chrome automation tools were built for CI servers, not for a human sitting in front of the same Mac. Selenium, Puppeteer, and headless Playwright all assume the browser is unattended. Fazm runs on your personal machine while you work, so the question 'is the AI doing something right now, or is that me?' matters. The overlay is the cheapest possible answer. It also tells you something other tools cannot: the pill literally says feel free to switch tabs, because extension-mode attachment does not lock the window.

How is the overlay injected if Chrome is running under the user's real profile?

Playwright MCP exposes a --init-page flag that takes a JavaScript file and runs it at page load on every tab. Fazm's bridge passes acp-bridge/browser-overlay-init-page.cjs, which reads overlay-init.js from disk and hands it to both page.context().addInitScript() and page.evaluate(). In extension mode (CDP attach), addInitScript is a no-op, so page.evaluate is what actually mounts the DOM. The wiring is at acp-bridge/src/index.ts line 1037.
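The double call described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of browser-overlay-init-page.cjs: the function name installOverlay and the PageLike and ContextLike interfaces are hypothetical, structural stand-ins for Playwright's Page and BrowserContext types. The two calls themselves, addInitScript with a content option and evaluate with a script string, are part of Playwright's API.

```typescript
// Structural subset of Playwright's BrowserContext and Page, declared locally
// so the sketch is self-contained. Real code would import the Playwright types.
interface ContextLike {
  addInitScript(script: { content: string }): Promise<void>;
}
interface PageLike {
  context(): ContextLike;
  evaluate(expression: string): Promise<unknown>;
}

// Hypothetical helper: hands the overlay source to both injection paths.
async function installOverlay(page: PageLike, overlayScript: string): Promise<void> {
  // Registers the script for future documents in this context.
  // Per the article, this has no effect when attached over CDP in extension mode.
  await page.context().addInitScript({ content: overlayScript });

  // Runs the script in the page that is already open, which is what
  // actually mounts the overlay DOM in extension mode.
  await page.evaluate(overlayScript);
}
```

Calling both paths means the same loader can work whether or not Playwright owns the browser context, at the cost of the overlay script having to tolerate running twice on one page.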

Does the overlay interfere with what the AI sees or with the page itself?

No. The overlay is a single <div id='fazm-overlay'> absolutely positioned above the page with pointer-events:none. Playwright MCP's browser_snapshot tool reads the accessibility tree and DOM of the underlying page, not a screenshot of pixels, so the visual overlay is invisible to the model in the default code path. It only appears in screenshots, which the agent takes sparingly.

Why is Chrome specifically required? Can Fazm drive Safari, Arc, or Brave?

The extension is a Chrome Web Store listing (Playwright MCP Bridge, ID mmlmfjhmonkocbjadbfplnigmagldckm), and the MCP server attaches over CDP, which is a Chromium-family protocol. In practice it works with Chrome and any Chromium browser that accepts the same extension and exposes CDP. Safari does not expose CDP so extension mode cannot target it. For non-browser Mac apps, Fazm uses a different server, mcp-server-macos-use, which drives apps through the native accessibility API instead.

Does the AI still have access to my cookies, logins, and 2FA state?

Yes, that is the whole point of extension mode. Because the MCP server attaches to your real Chrome window over CDP rather than launching a fresh headless instance, the agent inherits every cookie, session, extension, and logged-in tab you already have. Cloudflare Turnstile, Google sign-in, and bank 2FA all pass transparently because the browser fingerprint is yours. This is the opposite of how Selenium and ChromeDriver work, where every run starts with a clean profile.

Can I switch tabs or work in other apps while the overlay is showing?

Yes. That sentence is literally printed on the overlay pill. Because the Playwright MCP extension attaches over CDP, it does not take focus away from whatever tab or app you are currently using. The agent acts on the tab it was given, and its script runs in the background; you can type into a different tab or switch to VS Code without breaking the automation. This is the single biggest UX difference from ChromeDriver, which steals focus every time it clicks.

Is Fazm a developer framework or something my parents could install?

Consumer. Fazm is a signed, notarized Mac app you download from fazm.ai. No terminal, no pip install, no Node setup, no chromedriver binary on your PATH. The onboarding flow walks you through installing Chrome, adding the Playwright MCP Bridge extension, and pasting a one-time auth token. After that you talk to it by voice and it drives Chrome. The onboarding window is defined in BrowserExtensionSetup.swift and runs as four phases: welcome, connect, verify, done.

How does Fazm know which tab to drive?

Playwright MCP maintains a list of tabs it has attached to via the extension. Fazm's system prompt tells the agent to call browser_tabs list before navigating, check if the user already has the target site open (matched by domain), and switch to that tab with browser_tabs select instead of opening a new one. This keeps the overlay from popping onto unrelated tabs and keeps the browser tidy. The rule is in Desktop/Sources/Chat/ChatPrompts.swift around line 73.
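The tab-reuse rule can be sketched as a lookup over the tab list. In Fazm the rule lives in the system prompt and the agent applies it through browser_tabs calls, so this code is only an illustration: the Tab shape, the function names, and the choice to compare hostnames with a leading www. stripped are assumptions about what "matched by domain" means.

```typescript
// Hypothetical shape of one entry in a tab listing.
interface Tab {
  index: number;
  url: string;
}

// Extracts a lowercased host from an absolute URL, dropping any port
// and a leading "www.". Returns null for URLs without a "//" authority part.
function hostOf(url: string): string | null {
  const match = /^[a-z][a-z0-9+.-]*:\/\/([^\/?#:]+)/i.exec(url);
  const host = match ? match[1] : undefined;
  return host ? host.toLowerCase().replace(/^www\./, "") : null;
}

// Returns the first open tab on the same host as the target, or null
// when a new tab would be needed.
function pickExistingTab(tabs: Tab[], targetUrl: string): Tab | null {
  const target = hostOf(targetUrl);
  if (target === null) return null;
  for (const tab of tabs) {
    if (hostOf(tab.url) === target) return tab;
  }
  return null;
}
```

A non-null result corresponds to the agent switching to that tab with browser_tabs select; a null result corresponds to opening the site in a new tab.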

What happens if I close Chrome while the agent is in the middle of something?

The CDP connection drops and Playwright MCP returns an error on the next tool call. The agent sees the error in its tool result, and its system prompt tells it to stop and ask you what to do rather than silently retry. Fazm logs Chrome diagnostics on every startup: version, running process count, port 9222 status, singleton lock, and extension count, all visible in the acp-bridge log. Search for the line 'Browser diagnostics: chrome=' in /tmp/fazm-dev.log to see them.