AI browser automation you can actually see, running inside the Chrome you are already signed into
Every product on the first page for this keyword puts the automation in a cloud browser or a separate Chromium window. Fazm runs it inside the exact Chrome.app you opened this morning, keeps your logins and browser fingerprint intact, and drops a visible glowing overlay on every page the agent touches so you always know when the AI has the wheel. The overlay reads "Browser controlled by Fazm · Feel free to switch tabs or use other apps" and uses pointer-events:none, so it never blocks your own clicks.
WHY THIS KEYWORD IS MISLEADING
The default picture of AI browser automation is wrong
If you google this exact phrase you will find ten variations on one answer: a cloud-hosted agent that spins up a fresh Chromium somewhere, runs a task, and shows you a video of it afterwards. That works great for scraping a competitor's pricing page. It is a bad fit for the tasks most people actually want automated.
The tasks people want automated involve logged-in state. Paying a bill in a bank portal. Replying to a support ticket in Help Scout. Updating a CRM record. Applying to a job on LinkedIn with the resume already saved in autofill. A fresh cloud Chromium does not have any of that. It has to log in from scratch, which means passing your credentials to the cloud, solving whatever 2FA challenge comes up, and surviving the browser fingerprint heuristics that flag headless sessions.
The Fazm answer to this is the boring one: do not use a fresh browser at all. Drive the one the user is already running, with their cookies, their saved passwords, their fingerprint, their extensions, their trust history. Then put a visible overlay on top of it so the user knows when they are watching the AI and when they are watching themselves.
The default pattern vs the Fazm pattern
An agent runs in a cloud-hosted or locally launched Chromium that has none of your saved state. To do a task on a site you are already logged into, the agent either asks for your credentials and logs in fresh (so 2FA, captchas, and fingerprint heuristics all fire at the worst time), or it cannot do the task at all.
- Fresh browser, no cookies
- New fingerprint, often flagged as bot
- Re-login for every logged-in site
- No visual signal when the agent is driving
- Task dies the moment it leaves the tab
THE ANCHOR FACT
One CSS rule decides the whole UX
The overlay is a single DOM element injected at the top of every page the agent controls. It has id fazm-overlay, lives at z-index 2147483647 (the maximum 32-bit signed integer, so page content cannot claim a higher z-index), and uses pointer-events:none everywhere so your clicks, hovers, and scroll events pass through to the page underneath. That combination is the entire trick: always visible, impossible to dismiss by accident, and completely non-blocking.
The status pill text ("Browser controlled by Fazm · Feel free to switch tabs or use other apps") is important enough to be hardcoded, not templated. It is explicitly giving the user permission to ignore the agent and do something else. On a sandbox browser this would be irrelevant; on your own Chrome it is the difference between trust and paranoia.
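As a rough illustration of the rule described above, the sketch below builds the overlay CSS from the three values quoted on this page (the id, the z-index, and pointer-events:none). The helper name overlayCss and the position/inset declarations are assumptions made for the example; the real stylesheet lives in acp-bridge/browser-overlay-init.js and may differ.

```typescript
// Values quoted on this page. Helper names are illustrative, not Fazm's own.
const OVERLAY_ID = "fazm-overlay";
const MAX_Z_INDEX = 2 ** 31 - 1; // 2147483647, the largest 32-bit signed integer
const PILL_TEXT =
  "Browser controlled by Fazm · Feel free to switch tabs or use other apps";

// Build the CSS for the overlay root and its children.
// "position: fixed" and "inset: 0" are assumptions for a full-viewport element.
function overlayCss(): string {
  return [
    `#${OVERLAY_ID} {`,
    "  position: fixed;",
    "  inset: 0;",
    `  z-index: ${MAX_Z_INDEX};`,
    "  pointer-events: none;",
    "}",
    // Children opt out of hit-testing too, so the status pill never eats a click.
    `#${OVERLAY_ID} * { pointer-events: none; }`,
  ].join("\n");
}

console.log(overlayCss());
console.log(PILL_TEXT);
```

Keeping the z-index as a computed constant rather than a magic number makes it obvious why that particular value was chosen.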
HOW THE OVERLAY GETS THERE
From the Mac app to your browser tab
SEQUENCE · overlay injection path
WHERE THE NUMBERS LIVE
A few specific numbers from the source
None of these are invented. Each one is a literal value that appears in a file on disk in the Fazm source tree. If you downloaded the open-source repo at github.com/mediar-ai/fazm and grepped for any of the tokens below, you would find them.
Line numbers are from the current main branch. 1266 is the BUILTIN_MCP_NAMES Set in acp-bridge/src/index.ts; 1033 is the Playwright flag line in the same file; the overlay z-index appears twice in acp-bridge/browser-overlay-init.js.
THE ARCHITECTURE
Your English sentence to a click in your browser
Where the browser leg sits inside Fazm
WIRING THE EXTENSION
The extension mode switch, in ten lines
The bit that decides whether Fazm drives your real Chrome or spins up a fresh one is genuinely small. The whole thing is an env-var gated playwrightArgs.push("--extension") plus a pass-through of the extension token so the local Playwright and the Chrome extension can authenticate each other.
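A minimal sketch of that switch might look like the following. PLAYWRIGHT_USE_EXTENSION, --extension, and --image-responses omit are named on this page; the function name buildPlaywrightLaunch, the return shape, and the token variable name PLAYWRIGHT_MCP_EXTENSION_TOKEN are assumptions for illustration, not a copy of acp-bridge/src/index.ts.

```typescript
type Env = Record<string, string | undefined>;

interface PlaywrightLaunch {
  args: string[];              // CLI args for the Playwright MCP process
  env: Record<string, string>; // extra environment passed to that process
}

// Hypothetical helper: decide between "drive the real Chrome" and "fresh Chromium".
function buildPlaywrightLaunch(env: Env): PlaywrightLaunch {
  // Snapshots only: keep inline base64 images out of the model context.
  const playwrightArgs: string[] = ["--image-responses", "omit"];
  const childEnv: Record<string, string> = {};

  if (env["PLAYWRIGHT_USE_EXTENSION"] === "true") {
    // Connect to the Playwright MCP Bridge extension instead of launching a browser.
    playwrightArgs.push("--extension");
    // Assumed variable name: pass the auth token through so both sides can authenticate.
    const token = env["PLAYWRIGHT_MCP_EXTENSION_TOKEN"];
    if (token) childEnv["PLAYWRIGHT_MCP_EXTENSION_TOKEN"] = token;
  }
  return { args: playwrightArgs, env: childEnv };
}

console.log(buildPlaywrightLaunch({ PLAYWRIGHT_USE_EXTENSION: "true" }).args);
```

Taking the environment as a parameter instead of reading process.env directly keeps the decision easy to test in isolation.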
A GROUND-TRUTH MOMENT
What the startup log looks like when Accessibility works
On boot, Fazm does not just call AXIsProcessTrusted. That function returns cached TCC state that can be stale for hours after a macOS update or app re-sign. Instead it calls the real accessibility API against the frontmost app and against Finder, and logs the result. Here is what the log looks like on a healthy machine.
WHAT THIS BUYS YOU
The practical consequences
Logged-in sites just work
Because the browser is your browser, the agent inherits every session you already have. No credential forwarding, no fresh login, no surprise 2FA prompt five minutes into a task.
Cloudflare Turnstile and bot checks pass transparently
Your Chrome has a history. It has passed thousands of Turnstile and reCAPTCHA checks over time. The agent inherits that reputation. A headless Chromium does not.
Shared control, same tab
The overlay is pointer-events:none. The agent clicks what it clicks, you click what you click, and neither blocks the other. You can take over mid-task and hand it back.
Task can exit the browser
When the workflow ends with a desktop action (rename a file, change a macOS setting, send a WhatsApp message) the agent switches to the macos-use MCP and keeps going via the real Accessibility API.
No pixel clicks
Playwright is launched with --image-responses omit. The agent navigates a YAML accessibility snapshot with numbered refs (ref=e1, e2, ...), not screenshots. The model context stays small and deterministic.
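To make the ref-based navigation concrete, here is a small sketch that pulls numbered refs out of a snapshot. The sample snapshot is invented for illustration and only approximates the YAML shape described here; the parser name parseSnapshot and the SnapshotNode type are hypothetical.

```typescript
interface SnapshotNode {
  role: string;
  name: string;
  ref: string;
}

// Matches lines such as:  - button "Sign in" [ref=e3]
const NODE_LINE = /^\s*-\s+(\w+)(?:\s+"([^"]*)")?.*\[ref=(e\d+)\]/;

// Collect every line of the snapshot that carries a [ref=eN] marker.
function parseSnapshot(snapshot: string): SnapshotNode[] {
  const nodes: SnapshotNode[] = [];
  for (const line of snapshot.split("\n")) {
    const m = NODE_LINE.exec(line);
    if (!m) continue;
    const role = m[1];
    const name = m[2];
    const ref = m[3];
    if (role && ref) nodes.push({ role, name: name ?? "", ref });
  }
  return nodes;
}

// Illustrative snapshot, not captured from a real page.
const SAMPLE = [
  "- generic [ref=e1]:",
  '  - textbox "Email" [ref=e2]',
  '  - button "Sign in" [ref=e3]',
].join("\n");

const signIn = parseSnapshot(SAMPLE).find(
  (n) => n.role === "button" && n.name === "Sign in",
);
console.log(signIn?.ref);
```

The point of the refs is that the model can say "click e3" instead of guessing pixel coordinates from an image.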
Visible, always
The overlay sits at INT32_MAX z-index with a status pill reading 'Browser controlled by Fazm'. If the glowing wings are not at the edges of the page, the agent is not driving. That is a real-time trust signal.
HEAD TO HEAD
Fazm vs a typical cloud AI browser automation service
| Feature | Typical cloud AI browser agent | Fazm |
|---|---|---|
| Whose Chrome runs the task | A fresh Chromium on a vendor server | The Chrome.app on your Mac, already running |
| Logged-in sites | Needs your credentials forwarded; 2FA at task start | Inherited from your real profile; nothing to forward |
| Bot-detection / Turnstile | Often flagged; user solves challenges | Uses your existing fingerprint and history |
| Visibility of what the agent is doing | Replay video after the fact | Live glowing overlay + status pill on every page |
| Blocking the user while agent runs | Usually separate session; cannot share tab | pointer-events:none, you can click through at any time |
| What the model sees of the page | Usually screenshots; expensive context | YAML accessibility snapshot, ref=e1 style |
| Works outside the browser | No, browser-only | Yes, same agent jumps to macos-use for native Mac apps |
| Shape of the product | SaaS or developer framework | Signed notarized consumer Mac app, English in a floating bar |
ONE-TIME SETUP
What installing this looks like
The extension-mode setup is a short checklist you complete once. Everything downstream is plain English typed into a floating bar.
ONE-TIME INSTALL
- Download and install the signed Fazm Mac app from fazm.ai
- Grant Accessibility and Screen Recording permissions (two System Settings panes)
- Install Playwright MCP Bridge from the Chrome Web Store (ID mmlmfjhmonkocbjadbfplnigmagldckm)
- Paste the bridge auth token shown in the Fazm setup window into the extension options
- Click 'Test connection' to confirm the local Playwright can reach the extension
- After this, every English sentence that touches the web routes through your real Chrome
WHO THIS IS FOR
Who should use AI browser automation like this
If you need to scrape a million product pages or run thousands of parallel sessions, use a cloud tool. That is what they are for. Browser automation at that scale is a server problem, and a Fazm-style consumer Mac app is not the right vehicle for it.
If you are a single person whose most annoying tasks look like "reply to these five Slack threads, pull the attachments into Drive, file the invoices in our accounting tool, and send Sam the final link", the Fazm shape is the one that fits. One session. Your logins. Your Mac. You watch it happen. When it gets stuck you take over without restarting.
The spectrum runs from robotic process automation (cloud, scripted, high-throughput, zero trust) to personal AI agent (local, reasoning, one operator, full trust). Fazm sits firmly at the personal-agent end of that spectrum, and the visible overlay is how it earns the trust.
SOURCE FILE INDEX
Everything on this page, verifiable in four files
Full overlay DOM and CSS, status pill text, z-index, pointer-events, four animated wings.
Playwright init-page entry point that hooks page load events and calls page.evaluate with the overlay script.
In acp-bridge/src/index.ts, line 1029 is the --extension switch, line 1033 is --image-responses omit, and line 1266 is the BUILTIN_MCP_NAMES Set with five entries.
In BrowserExtensionSetup.swift, line 220 contains the Chrome Web Store URL for the Playwright MCP Bridge extension. Token generation and the connection probe live in the same file.
In Desktop/Sources/AppState.swift, line 433 is testAccessibilityPermission and lines 468 to 485 are the Finder fallback. The CGEvent tap probe is below that.
THE UNCOPYABLE PART
Why this page is not easy to clone
Almost every other result on the SERP is reviewing a product from the outside. They have not read the source, because the source is theirs and it is behind an API. The Fazm app is open source at github.com/mediar-ai/fazm, so everything on this page points at a line of code you can read yourself.
You can go and see 2147483647 literally written in acp-bridge/browser-overlay-init.js. You can grep for BUILTIN_MCP_NAMES and find the set of five. You can run the Mac app yourself and look at a real page with the overlay on it. That verifiability is the moat, not any single feature.
Want the agent to share a tab with you instead of a cloud browser?
Fifteen minutes. I will show you the overlay injecting onto a live page, the YAML snapshot with ref=e1 in it, and the agent handing control back mid-task.
Book a call →
Frequently asked questions
How does Fazm attach to my actual Chrome instead of launching a headless one?
Fazm ships Playwright MCP in extension mode. When PLAYWRIGHT_USE_EXTENSION is set to true, the bridge appends --extension to the Playwright args (see acp-bridge/src/index.ts lines 1029 to 1031). That flag tells Playwright to connect to a Chrome Web Store extension called Playwright MCP Bridge, with extension ID mmlmfjhmonkocbjadbfplnigmagldckm. Once the extension is installed and you paste the one-time auth token from the Fazm setup window, Playwright drives the Chrome you are already running, with your existing profile, cookies, SSO sessions, and browser fingerprint intact. If the extension is not present, Playwright falls back to launching its own fresh Chromium.
What is the visible overlay and why does Fazm add it?
Every page the agent controls gets a full-viewport element with id fazm-overlay injected at z-index 2147483647 (the maximum 32-bit signed int, so nothing else can stack above it). It renders four soft radial gradient wings at the edges of the page and a small centered status pill that reads 'Browser controlled by Fazm · Feel free to switch tabs or use other apps'. The overlay uses pointer-events:none, so it is visible but never intercepts clicks. You can see the exact CSS and DOM in acp-bridge/browser-overlay-init.js lines 16 to 68. The overlay exists because the user needs to know when the AI is driving so they can stop it if it starts doing something unexpected, and because on a real Chrome with real logins the psychological cost of not knowing is higher than on a fresh sandbox.
How is this different from Browser-Use, Operator, Chrome Auto Browse, or Fellou?
Those are variations on the same shape: the agent gets a screenshot or a DOM snapshot from a browser the user does not own (a cloud-hosted Chromium, or a Chrome profile the vendor runs). Fazm is the inversion. The browser the agent drives is the exact Chrome.app you double-clicked this morning, in your own macOS user account, with your own extensions and saved passwords. The agent gets accessibility snapshots, not screenshots, and those snapshots are saved to /tmp/playwright-mcp as YAML files with numbered refs like [ref=e1]. Inline base64 image responses are stripped out at the MCP layer by launching Playwright with --image-responses omit (acp-bridge/src/index.ts line 1033), so the model never receives a screenshot even where a cloud competitor would have sent one.
If the overlay covers the page, how do I still click things myself?
The overlay uses pointer-events:none on the root and all of its children. CSS pointer-events:none makes the element completely transparent to the mouse: hovers, clicks, text selection, scroll wheels all go straight through to whatever is under it. So you see the glow and the status pill, but every click you make lands on the actual page. That is the whole point. The agent and the user can share the same tab in real time. You can watch the AI fill a form, take over mid-field to fix an autocomplete, then hand it back by typing into Fazm again.
What happens when the task leaves the browser? Does Fazm stop working?
No. Fazm registers five MCP servers as peers, not just Playwright. The list is hardcoded as a Set in acp-bridge/src/index.ts at line 1266 with exactly these names: fazm_tools, playwright, macos-use, whatsapp, google-workspace. When the task needs to rename a file in Finder, open a macOS setting, or paste something into the native WhatsApp app, the agent switches to macos-use or whatsapp, which drive those apps through the real macOS Accessibility API (AXUIElement, kAXFocusedWindowAttribute, etc.) rather than pixel guessing. So AI browser automation here is a subset of something bigger, and the same English sentence can start in Chrome and finish in Finder without a seam.
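The peer-server idea can be sketched as a Set lookup. The five names come from this page; the helper functions and the "mcp__<server>__<tool>" tool-name format are assumptions used only to show how a tool call could be attributed to one of the five servers.

```typescript
// The five names listed on this page for BUILTIN_MCP_NAMES.
const BUILTIN_MCP_NAMES: ReadonlySet<string> = new Set([
  "fazm_tools",
  "playwright",
  "macos-use",
  "whatsapp",
  "google-workspace",
]);

// Hypothetical helper: is this server one of the bundled peers?
function isBuiltinServer(name: string): boolean {
  return BUILTIN_MCP_NAMES.has(name);
}

// Assumed tool-name shape: "mcp__<server>__<tool>".
// Returns undefined when the name does not follow that shape.
function serverOf(toolName: string): string | undefined {
  const parts = toolName.split("__");
  return parts.length >= 3 && parts[0] === "mcp" ? parts[1] : undefined;
}

// A task can start in the browser and finish in a native app:
for (const tool of ["mcp__playwright__browser_click", "mcp__macos-use__click"]) {
  const server = serverOf(tool);
  console.log(tool, "->", server, server ? isBuiltinServer(server) : false);
}
```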
Is Fazm a developer tool or a consumer app?
A consumer app. There is no npm install, no API key to generate, no Python script to run, no docker compose up. You download a signed notarized Mac app, grant Accessibility and Screen Recording permissions, optionally install the Playwright MCP Bridge extension from the Chrome Web Store, and type English into a floating bar. Under the hood it runs the Claude Agent SDK against the five bundled MCP servers, but that is an implementation detail the user does not touch.
How does Fazm verify the Accessibility permission is actually working?
At boot it runs a real round-trip through the accessibility API, not just AXIsProcessTrusted which caches stale values on macOS 26 Tahoe. The function is testAccessibilityPermission in Desktop/Sources/AppState.swift, starting at line 433. It grabs the frontmost app, calls AXUIElementCreateApplication on its PID, then tries to read kAXFocusedWindowAttribute back. Success, noValue, and attributeUnsupported all count as 'working'. If it gets cannotComplete, it re-runs against Finder specifically (lines 468 to 485) to disambiguate a real broken permission from an app that simply does not implement AX. If Finder also fails, the code falls back to creating a CGEvent tap as a last-resort probe of the live TCC database. That three-stage check is why the agent can trust the tree.
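The real check is Swift code in AppState.swift; the sketch below only models its decision logic in TypeScript so the three stages are easy to follow. The type and function names are hypothetical, the probes are injected as plain callbacks, and two details are assumptions: that Finder's result is judged by the same "working" set, and that any first-stage error other than cannotComplete counts as broken.

```typescript
// Outcome of reading kAXFocusedWindowAttribute from an app (subset of AX error codes).
type AxResult =
  | "success"
  | "noValue"
  | "attributeUnsupported"
  | "cannotComplete"
  | "apiDisabled";

interface Probes {
  frontmostApp: () => AxResult; // stage 1: frontmost application
  finder: () => AxResult;       // stage 2: Finder, which is known to implement AX
  eventTap: () => boolean;      // stage 3: true if a CGEvent tap could be created
}

// Results that prove the API answered, even if there was nothing to return.
const WORKING: ReadonlySet<AxResult> = new Set<AxResult>([
  "success",
  "noValue",
  "attributeUnsupported",
]);

function accessibilityWorks(p: Probes): boolean {
  const first = p.frontmostApp();
  if (WORKING.has(first)) return true;
  // Assumption: only cannotComplete is ambiguous enough to justify a retry.
  if (first !== "cannotComplete") return false;
  // The frontmost app may simply not implement AX, so ask Finder instead.
  if (WORKING.has(p.finder())) return true;
  // Last resort: probe the live permission state with an event tap.
  return p.eventTap();
}

console.log(
  accessibilityWorks({
    frontmostApp: () => "cannotComplete",
    finder: () => "success",
    eventTap: () => false,
  }),
);
```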
Why use the accessibility tree at all if Playwright already has a DOM?
Because the agent also needs to work outside the browser. In Chrome, Playwright hands back its own structural snapshot. But Gmail, Slack, Finder, Mail, Settings, Notes, Xcode, VSCode, and Catalyst apps like WhatsApp do not have a DOM the agent can query. They do publish to the macOS Accessibility API (the same surface screen readers use), which exposes a typed tree of roles, labels, bounds, and children for every UI element. By using the accessibility tree as the canonical ground truth for any surface, Fazm gets one mental model that spans Chrome plus the rest of your Mac, instead of a browser-only model that quits at the URL bar.
Does the overlay ever get stripped by a site that sanitizes DOM?
The overlay is injected via Playwright's page.evaluate on every page load, not via addInitScript (which does not work reliably on CDP-connected contexts in extension mode). See the comment block at the top of acp-bridge/browser-overlay-init.js. That means if a single-page app re-renders its body, the overlay re-injects on the next load event. If the page is a hardened origin that aggressively rewrites its own DOM, the overlay can be cosmetically displaced, but the agent's control path is independent of it. The overlay is a transparency signal, not a load-bearing component.
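The re-injection pattern described above could be sketched like this. PageLike is a hypothetical structural type covering just the two Playwright Page methods the sketch needs (on("load", ...) and evaluate with a script string), so the example runs without Playwright installed; the script body is a simplified stand-in for the real overlay.

```typescript
// Minimal structural view of a Playwright Page; only what this sketch uses.
interface PageLike {
  on(event: "load", listener: () => void): unknown;
  evaluate(script: string): Promise<unknown>;
}

// Simplified stand-in for the overlay script. It is idempotent: a second run
// on the same document finds the element and returns early.
const OVERLAY_SCRIPT = `(() => {
  if (document.getElementById("fazm-overlay")) return;
  const root = document.createElement("div");
  root.id = "fazm-overlay";
  root.style.cssText =
    "position:fixed;inset:0;z-index:2147483647;pointer-events:none";
  document.documentElement.appendChild(root);
})()`;

function attachOverlay(page: PageLike): void {
  const inject = (): void => {
    // The overlay is cosmetic, so an injection failure is swallowed and
    // never interrupts the agent's control path.
    page.evaluate(OVERLAY_SCRIPT).catch(() => undefined);
  };
  inject();                // cover the document that is already loaded
  page.on("load", inject); // and every document loaded afterwards
}
```

Swallowing the rejection reflects the point made above: the overlay is a transparency signal, so its failure should not take the task down with it.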
What does the one-time Chrome extension setup actually install?
It installs the Playwright MCP Bridge extension from the Chrome Web Store, at chromewebstore.google.com/detail/playwright-mcp-bridge/mmlmfjhmonkocbjadbfplnigmagldckm. This is an open-source bridge that lets a local Playwright process drive the Chrome window over a localhost channel. Once installed, you paste a bridge token into its options page (Fazm generates the token in BrowserExtensionSetup.swift and stores it under UserDefaults key 'playwrightExtensionToken'), then the setup window runs a connection probe to confirm the local Playwright can reach the extension. After that, any command you give Fazm that involves a URL routes through this extension into your real Chrome, not a separate Chromium.
ADJACENT ANGLES
Related guides
Browser automation tool
Why Fazm ships the browser as one of five peer MCP tools, not the whole product. Same source tree, different angle.
Browser automation extension
The Playwright MCP Bridge extension, how the one-time token hand-off works, and why Fazm did not write its own.
Automation web browser
The case against building a standalone 'AI browser' and for driving the one you already use every day.