Browser automation for Chrome, but you can see it happen.
Every playbook on this topic hands you a ChromeDriver binary, a Puppeteer script, or a headless Playwright harness and assumes the browser runs on a server nobody is looking at. Fazm runs in your real Chrome, on your real Mac, while you are using it, and paints a visible animated overlay on every page the agent touches. The pill in the center of the screen reads “Browser controlled by Fazm · Feel free to switch tabs or use other apps.” That sentence is in the source, at acp-bridge/browser-overlay-init.js.
THE SILENCE PROBLEM
Most Chrome automation is invisible, and that is the bug
Selenium drives a fresh Chrome with a plain “Chrome is being controlled by automated test software” banner and a clean profile. Puppeteer and headless Playwright run offscreen by default. ChromeDriver steals window focus every time it clicks. All of them were designed for CI: a remote browser inside a container nobody is watching. When you run any of them on your laptop, the tab is alive but there is no signal that tells you what is yours and what is the bot's.
Fazm runs in the opposite setup. The browser is your browser, already logged in, with the tabs you opened. The AI drives one of them. You need to know which.
The quiet agent vs. the visible agent
A headless or newly-launched Chrome runs on the side. If it is visible at all, you get a generic automation banner across the top and a focus-stealing cursor that interrupts anything you are doing.
- Fresh profile, no cookies or logins
- Generic automation banner or nothing
- Steals window focus on every click
- No per-tab indicator
- You cannot safely switch to another tab
THE ANCHOR FACT
The overlay is one file on disk
This is the part no other guide on this topic can write, because no other Chrome automation tool ships a visible human-facing overlay. Fazm's lives in a single JavaScript file inside the app bundle. When the agent spawns Playwright MCP, it passes this file via the --init-page flag, and Playwright runs it on every new page. The script builds a div, stamps a unique id on it, and drops it at the highest possible z-index with pointer-events disabled.
Three things in that snippet are worth slowing down on. The dot between the two halves of the pill is a middle dot, not a hyphen, because the pill is a status line and deserves typographic care even though no reader will notice. The pointer-events:none on every layer is mandatory: if the overlay blocked a click, the agent's next browser_click call would land on the overlay instead of the button it meant to press. And the z-index of 2147483647 is the maximum 32-bit signed integer, the largest number any site's own CSS can use. Fazm picks the ceiling so nothing on the page can cover the status pill by accident.
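A minimal sketch of the kind of script described above, not the actual contents of browser-overlay-init.js. The id fazm-overlay, the pill text, the z-index, and pointer-events:none come from this page. The function names and the layout properties (position, inset) are assumptions made for illustration.

```javascript
// Illustrative sketch only; the shipped overlay file may differ.
const MAX_Z_INDEX = 2 ** 31 - 1; // 2147483647, the largest signed 32-bit integer
const PILL_TEXT =
  "Browser controlled by Fazm \u00B7 Feel free to switch tabs or use other apps";

// Inline style for the full-viewport root layer.
// position/inset are assumed; z-index and pointer-events follow the text.
function overlayRootStyle() {
  return [
    "position:fixed",
    "inset:0",
    `z-index:${MAX_Z_INDEX}`,
    "pointer-events:none",
  ].join(";");
}

// Mounts the overlay once. `doc` is the page's `document`.
function mountOverlay(doc, id = "fazm-overlay") {
  const existing = doc.getElementById(id);
  if (existing) return existing; // already mounted, do nothing

  const root = doc.createElement("div");
  root.id = id;
  root.style.cssText = overlayRootStyle();

  const pill = doc.createElement("div");
  pill.textContent = PILL_TEXT;
  pill.style.cssText = "pointer-events:none"; // clicks pass through the pill too
  root.appendChild(pill);

  doc.body.appendChild(root);
  return root;
}
```

In a page, this would be called as mountOverlay(document). The early return keeps a second injection on the same page from stacking a duplicate overlay.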
HOW IT ATTACHES
The path from spoken prompt to painted overlay
When you ask Fazm to do something in Chrome, the request flows through a small number of components. The shape is worth seeing before the individual parts.
Inputs converge on one Chrome window
What the stack actually does
WHAT YOU SEE
Anatomy of the overlay
The overlay is not a single blob. It is eleven elements layered inside one parent div, each doing a small job. None of them block clicks. None of them reach the model.
Four edge wings
fazm-w-top, fazm-w-bottom, fazm-w-left, fazm-w-right. Each is a radial-gradient ellipse that pulses between 0.7 and 1.0 opacity on a 4 to 5 second cycle. Together they give the viewport a soft breathing glow that frames whatever the agent is looking at.
Four drifting blobs
fazm-blob1 through fazm-blob4 sit in the four corners. Each one spans 35 to 40 percent of the viewport, is blurred by 60 pixels, and runs its own 6 to 8 second translate+scale loop on a staggered delay. They read as ambient motion, not UI.
One central pill
fazm-pill3 is a 28 pixel tall capsule floating dead center. It contains a spinner and the line Browser controlled by Fazm · Feel free to switch tabs or use other apps. It is the only text on the overlay, and the only part most users consciously notice.
Shared z-index: 2147483647
Every overlay element uses the max signed 32-bit integer as its z-index. No site stylesheet can climb above it by accident, so the pill is never covered by a cookie banner or chat widget on top of the page.
pointer-events: none everywhere
Both the canvas and the pill disable pointer events. A browser_click call from the agent passes straight through to whatever button is underneath. The overlay is a skin, not a chrome.
Zero impact on snapshots
Playwright MCP's default browser_snapshot reads the page's accessibility tree and DOM, not pixels. The overlay is a sibling node in the body and the agent ignores it. It only shows up in screenshots, which are rare.
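The wing and blob layers described above could be styled roughly as follows. This is a sketch under stated assumptions: the element ids, the 0.7 to 1.0 opacity range, the 60 pixel blur, and the timing ranges come from this page. The keyframe name fazm-pulse, the easing, and the exact per-element durations and delays are hypothetical.

```javascript
// Ids for wings and blobs come from the text above.
const WING_IDS = ["fazm-w-top", "fazm-w-bottom", "fazm-w-left", "fazm-w-right"];
const BLOB_IDS = ["fazm-blob1", "fazm-blob2", "fazm-blob3", "fazm-blob4"];

// Hypothetical keyframe name; opacity breathes between 0.7 and 1.0.
const PULSE_KEYFRAMES =
  "@keyframes fazm-pulse{0%,100%{opacity:0.7}50%{opacity:1}}";

// One CSS rule per wing, alternating 4 and 5 second cycles (assumed split).
function wingRules() {
  return WING_IDS.map((id, i) => {
    const seconds = 4 + (i % 2);
    return `#${id}{pointer-events:none;animation:fazm-pulse ${seconds}s ease-in-out infinite}`;
  });
}

// One CSS rule per blob: blurred, with 6 to 8 second loops on staggered delays.
function blobRules() {
  return BLOB_IDS.map((id, i) => {
    const seconds = 6 + (i % 3);
    return `#${id}{pointer-events:none;filter:blur(60px);animation-duration:${seconds}s;animation-delay:${i}s}`;
  });
}

// Full stylesheet text: the keyframes plus one rule per layer.
function overlayStylesheet() {
  return [PULSE_KEYFRAMES, ...wingRules(), ...blobRules()].join("\n");
}
```

Every generated rule carries pointer-events:none, which mirrors the claim that no layer intercepts a click.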
HOW IT GETS INJECTED
One line in one bridge turns it on
The bridge that spawns Playwright MCP is called the ACP bridge, short for Agent Client Protocol bridge. It lives at acp-bridge/src/index.ts inside the Fazm app bundle. The relevant block is short.
Three design choices in that block are load-bearing. First, the overlay only turns on when PLAYWRIGHT_USE_EXTENSION is set to true, which is the user-visible Settings toggle playwrightUseExtension. If someone opts out of extension mode, there is no Chrome to paint on, so the overlay stays out of the arg list. Second, snapshots get written to disk under /tmp/playwright-mcp instead of inlining giant base64 payloads into the model's context. Third, the init-page path is guarded with existsSync so a missing overlay file logs a warning instead of crashing the bridge.
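The three choices can be sketched as one argument-building function. This is not the code in acp-bridge/src/index.ts. The PLAYWRIGHT_USE_EXTENSION variable, the --extension and --init-page flags, and the /tmp/playwright-mcp directory are named on this page. The function name, the injected fileExists and warn helpers, and the use of Playwright MCP's --output-dir flag to route files to disk are assumptions.

```javascript
// Sketch of the argument list; names other than the flags are hypothetical.
function buildPlaywrightMcpArgs({ env, initPagePath, fileExists, warn }) {
  const args = ["@playwright/mcp"];

  // Assumption: --output-dir is the mechanism that keeps large payloads on
  // disk instead of inlining them into the model's context.
  args.push("--output-dir", "/tmp/playwright-mcp");

  // The overlay is only wired up in extension mode.
  if (env.PLAYWRIGHT_USE_EXTENSION === "true") {
    args.push("--extension");

    // Guard the path so a missing file warns instead of crashing the bridge.
    if (fileExists(initPagePath)) {
      args.push("--init-page", initPagePath);
    } else {
      warn(`overlay init script not found: ${initPagePath}`);
    }
  }
  return args;
}
```

In a Node bridge, fileExists would be existsSync from node:fs and warn would be the bridge's logger. Passing them in keeps the function easy to check without touching the filesystem.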
THE UX CLAIM IN ONE LINE
“Feel free to switch tabs or use other apps.”
The pill says this because CDP extension mode does not lock your window. Unlike ChromeDriver, which hijacks focus on every click, the extension runs the agent's commands in the target tab's own event loop. You can keep typing in another tab. You can move your cursor into Figma. You can read Slack. The overlay is the promise that nothing will jump at you.
COMPARED TO THE CLASSICS
Where Chrome automation differs, tool by tool
| Feature | Selenium / Puppeteer / ChromeDriver | Fazm (consumer Mac app) |
|---|---|---|
| Runs in your real Chrome profile | No, launches a fresh profile | Yes, attaches via extension over CDP |
| Inherits logins, cookies, 2FA state | No, every run starts clean | Yes, all sessions are yours |
| Visible per-tab indicator | Generic yellow banner or nothing | Animated glow + centered status pill |
| Steals window focus on clicks | Yes, every click grabs focus | No, CDP runs in the tab's event loop |
| Perceives page via pixels or DOM | DOM, but via WebDriver protocol | DOM + accessibility tree, not pixels |
| Setup surface | Language runtime, driver binary, scripts | Install app, add extension, paste token |
| Trips Cloudflare Turnstile | Often, because the profile is fresh | No, because the fingerprint is yours |
ONBOARDING
The four phases between download and first drive
The setup window is defined in Desktop/Sources/BrowserExtensionSetup.swift as a state machine with four phases. No command line. No text file to edit. Each phase polls for its own completion condition before the next one unlocks.
BrowserExtensionSetup phases
Welcome
Explain what extension mode does in two lines. A primary button advances to Connect. A Skip for now button exists for users who want voice-only features without the browser.
Connect
Four sub-steps: check Chrome is installed at /Applications/Google Chrome.app, open the Chrome Web Store listing for Playwright MCP Bridge, open the extension's status.html to copy the token, paste the token into the text field. A timer polls every two seconds to move the checkmarks automatically.
Verify
Fazm spawns Playwright MCP with the pasted token, issues a no-op browser_tabs call, and shows a spinner until the server responds. On success, the overlay is injected on your current active tab immediately, as a demo.
Done
Token saved to UserDefaults under the key playwrightExtensionToken. The Settings page from that point on shows a masked token (first eight characters plus dots) and a Reconfigure button.
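The four phases and the masked token can be sketched as below. The real implementation is Swift, in BrowserExtensionSetup.swift. This JavaScript version only illustrates the shape. The phase names come from this page. The function names and the number of dots in the mask are assumptions.

```javascript
// Phase order as described above.
const PHASES = ["welcome", "connect", "verify", "done"];

// Returns the next phase once the current phase's completion condition holds.
// `isComplete` stands in for the per-phase polling check.
function nextPhase(current, isComplete) {
  const i = PHASES.indexOf(current);
  if (i === -1) throw new Error(`unknown phase: ${current}`);
  if (i === PHASES.length - 1 || !isComplete(current)) return current;
  return PHASES[i + 1];
}

// Shows the first eight characters and hides the rest behind dots.
// The dot count is an assumption.
function maskToken(token, visible = 8, dots = 8) {
  return token.slice(0, visible) + "\u2022".repeat(dots);
}
```

A timer, such as the two second poll in the Connect phase, would call nextPhase on each tick and re-render when the returned phase changes.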
MEASURED AT SIGHT
The numbers that define the overlay
“If the overlay blocked a click, the agent's next browser_click call would land on the overlay instead of the button it meant to press. pointer-events:none is mandatory.”
acp-bridge/browser-overlay-init.js comment block
BEYOND THE BROWSER
The part this topic cannot contain
Every guide in this corner of the internet stops at the browser. Fazm does not, because the same agent that drives Chrome also drives Finder, Mail, and WhatsApp. That chain is the reason the overlay exists at all. If the AI can cross from a tab to a native window in the middle of a task, the human watching needs a cue for which one is currently being driven. The overlay handles Chrome. For native apps, Fazm uses a different MCP server, mcp-server-macos-use, which talks to the macOS accessibility API, not to pixels.
| Feature | Pure browser automation | Fazm surface area |
|---|---|---|
| Drive a Chrome tab | Yes | Yes, via @playwright/mcp --extension |
| Drive Finder, Mail, Settings | No | Yes, via mcp-server-macos-use (AX API) |
| Send a WhatsApp message | No | Yes, via whatsapp-mcp (native Catalyst app) |
| Read Gmail, Drive, Calendar | No | Yes, via google-workspace MCP (API, not a browser) |
| Cross from browser to desktop in one turn | Not possible | One ACP session hosts all four servers |
Want to see the overlay in action on your Mac?
Book a 15 minute call and we will walk through extension-mode setup, a live Chrome task, and cross-app chaining into Finder or Mail.
Book a call →
FREQUENT QUESTIONS
Things people ask before they install
What actually appears on a Chrome tab when Fazm drives it?
A semi-transparent animated overlay with four soft glow wings on the edges of the viewport, four drifting color blobs in the corners, and a small pill in the center of the screen reading 'Browser controlled by Fazm · Feel free to switch tabs or use other apps'. The overlay sits at z-index 2147483647 with pointer-events:none, so it never blocks a click and the agent underneath still perceives the page exactly as it is. The source lives in acp-bridge/browser-overlay-init.js in the Fazm Mac app bundle.
Why bother showing an overlay at all? Other Chrome automation tools don't.
Because other Chrome automation tools were built for CI servers, not for a human sitting in front of the same Mac. Selenium, Puppeteer, and headless Playwright all assume the browser is unattended. Fazm runs on your personal machine while you work, so the question 'is the AI doing something right now, or is that me?' matters. The overlay is the cheapest possible answer. It also tells you something other tools cannot: the pill literally says feel free to switch tabs, because extension-mode attachment does not lock the window.
How is the overlay injected if Chrome is running under the user's real profile?
Playwright MCP exposes a --init-page flag that takes a JavaScript file and runs it at page load on every tab. Fazm's bridge passes acp-bridge/browser-overlay-init-page.cjs, which reads browser-overlay-init.js from disk and hands it to both page.context().addInitScript() and page.evaluate(). In extension mode (CDP attach), addInitScript is a no-op, so page.evaluate is what actually mounts the DOM. The wiring is at acp-bridge/src/index.ts line 1037.
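The double hand-off described in this answer might look like the helper below. It is a sketch, not the contents of browser-overlay-init-page.cjs. The helper name installOverlay is hypothetical, and how the init-page module exports its entry point is left out because this page does not show it. The two Playwright calls, addInitScript and evaluate, both accept a script string.

```javascript
// Hypothetical helper. `page` is a Playwright Page and `overlaySource` is the
// text of the overlay script, already read from disk by the caller.
async function installOverlay(page, overlaySource) {
  // Registers the script for future documents in this browser context.
  await page.context().addInitScript(overlaySource);

  // Runs the script in the page that is already loaded. Per the answer above,
  // this is the call that takes effect when attached over CDP.
  await page.evaluate(overlaySource);
}
```

Calling both covers the two cases with one code path: pages that load later, and the tab that was already open when the agent attached.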
Does the overlay interfere with what the AI sees or with the page itself?
No. The overlay is a single <div id='fazm-overlay'> absolutely positioned above the page with pointer-events:none. Playwright MCP's browser_snapshot tool reads the accessibility tree and DOM of the underlying page, not a screenshot of pixels, so the visual overlay is invisible to the model in the default code path. It only appears in screenshots, which the agent takes sparingly.
Why is Chrome specifically required? Can Fazm drive Safari, Arc, or Brave?
The extension is a Chrome Web Store listing (Playwright MCP Bridge, ID mmlmfjhmonkocbjadbfplnigmagldckm), and the MCP server attaches over CDP, which is a Chromium-family protocol. In practice it works with Chrome and any Chromium browser that accepts the same extension and exposes CDP. Safari does not expose CDP so extension mode cannot target it. For non-browser Mac apps, Fazm uses a different server, mcp-server-macos-use, which drives apps through the native accessibility API instead.
Does the AI still have access to my cookies, logins, and 2FA state?
Yes, that is the whole point of extension mode. Because the MCP server attaches to your real Chrome window over CDP rather than launching a fresh headless instance, the agent inherits every cookie, session, extension, and logged-in tab you already have. Cloudflare Turnstile, Google sign-in, and bank 2FA all pass transparently because the browser fingerprint is yours. This is the opposite of how Selenium and ChromeDriver work, where every run starts with a clean profile.
Can I switch tabs or work in other apps while the overlay is showing?
Yes. That sentence is literally printed on the overlay pill. Because the Playwright MCP extension attaches over CDP, it does not take focus away from whatever tab or app you are currently using. The agent acts on the tab it was given, and its script runs in the background; you can type into a different tab or switch to VS Code without breaking the automation. This is the single biggest UX difference from ChromeDriver, which steals focus every time it clicks.
Is Fazm a developer framework or something my parents could install?
Consumer. Fazm is a signed, notarized Mac app you download from fazm.ai. No terminal, no pip install, no Node setup, no chromedriver binary on your PATH. The onboarding flow walks you through installing Chrome, adding the Playwright MCP Bridge extension, and pasting a one-time auth token. After that you talk to it by voice and it drives Chrome. The onboarding window is defined in BrowserExtensionSetup.swift and runs as four phases: welcome, connect, verify, done.
How does Fazm know which tab to drive?
Playwright MCP maintains a list of tabs it has attached to via the extension. Fazm's system prompt tells the agent to call browser_tabs list before navigating, check if the user already has the target site open (matched by domain), and switch to that tab with browser_tabs select instead of opening a new one. This keeps the overlay from popping onto unrelated tabs and keeps the browser tidy. The rule is in Desktop/Sources/Chat/ChatPrompts.swift around line 73.
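The reuse rule amounts to a small decision over the tab list. In Fazm the rule lives in the system prompt and the agent applies it, so the sketch below is only an illustration of the logic. The function names, the tab object shape, and the choice to compare hostnames with a leading www. stripped are assumptions.

```javascript
// Normalised hostname for a URL, or null if the URL cannot be parsed.
function hostOf(url) {
  try {
    return new URL(url).hostname.replace(/^www\./, "");
  } catch {
    return null;
  }
}

// Index of the first open tab on the target's domain, or -1 if none matches.
function findTabForTarget(tabs, targetUrl) {
  const target = hostOf(targetUrl);
  if (target === null) return -1;
  return tabs.findIndex((tab) => hostOf(tab.url) === target);
}

// Chooses between selecting an existing tab and opening a new one.
function planTabAction(tabs, targetUrl) {
  const index = findTabForTarget(tabs, targetUrl);
  return index >= 0
    ? { action: "select", index }
    : { action: "new", url: targetUrl };
}
```

For example, with a GitHub tab already open at index 1, planTabAction(tabs, "https://github.com/fazm") returns a select action for index 1 rather than a new tab.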
What happens if I close Chrome while the agent is in the middle of something?
The CDP connection drops and Playwright MCP returns an error on the next tool call. The agent sees the error in its tool result, and its system prompt tells it to stop and ask you what to do rather than silently retry. Fazm logs Chrome diagnostics on every startup: version, running process count, port 9222 status, singleton lock, and extension count, all visible in the acp-bridge log. Search for the line 'Browser diagnostics: chrome=' in /tmp/fazm-dev.log to see them.
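The log search can be done with a few lines of code. The marker string is the one quoted above. The function name is hypothetical, and the sketch makes no assumption about what follows the marker on each line.

```javascript
// Marker quoted in the answer above.
const DIAGNOSTICS_MARKER = "Browser diagnostics: chrome=";

// Returns every log line that contains the diagnostics marker.
function findDiagnosticsLines(logText) {
  return logText
    .split(/\r?\n/)
    .filter((line) => line.includes(DIAGNOSTICS_MARKER));
}
```

In Node, the input would be the contents of /tmp/fazm-dev.log read as UTF-8 text, and the last returned line corresponds to the most recent startup.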