Guide

n8n browser automation is a browser-shaped hole. Here is the cross-surface alternative.

Every SERP result for this keyword points you at a browser node: Browserless, Puppeteer, Playwright, Browser Use, BrowserAct, Anchor, Airtop. They all automate one surface. On a Mac, the work that breaks your workflow lives in the other 19 surfaces: Mail, Settings, WhatsApp Desktop, Numbers, Finder, the Keychain, the sandboxed Save dialog. Fazm ships one process that spawns two MCP servers side by side: Playwright for the browser, and a native accessibility-tree server for everything else.

Matthew Diakonov
9 min read
4.8 from 600+ Mac users
One app install, zero n8n hosting
Works across Chrome and every native macOS app
Accessibility-tree based, not screenshot guessing

The top results are all answering the wrong question

Open the SERP for “n8n browser automation” and every result is a variation of the same answer: install this community node, or plug in this cloud browser service. The Browserless docs. The Anchor docs. The Browser Use GitHub repo. The Puppeteer node. The Playwright node. BrowserAct. Airtop. Socket's security report on n8n-nodes-browser-use. They differ on price, concurrency, and how much the LLM decides, but they all have the same product shape: a browser-in-a-box that you pipe into your n8n graph.

The question they all dodge: what happens when the workflow needs to reach something that is not a web page. A Save dialog with a sandboxed file picker. A WhatsApp message in the Catalyst app. A click on the Wi-Fi pane in System Settings. An item in the Finder sidebar. A Numbers spreadsheet that lives only on disk. A cell in the native iMessage window. On macOS, these are not edge cases. They are half of the work.

You can wire two services together to cover both surfaces, and people do. Zapier Webhooks forwarding to a self-hosted Mac that runs AppleScript. n8n HTTP Request nodes calling out to a custom daemon on your laptop. It works, until the two services disagree about timing or state, which they always do. The bridge between surfaces is where those workflows rot. Fazm makes the bridge an implementation detail inside one process.

Architecture

One Node process, five MCP servers

The acp-bridge process is the hub. Each MCP server is a child process it supervises. The agent on the left picks between them per tool call, inside the same chat turn.

[Diagram, from acp-bridge/src/index.ts: your chat input and the Claude model on the left feed into acp-bridge, which fans out to playwright, macos-use, whatsapp, google-workspace, and fazm_tools.]

Anchor fact

The exact line that makes cross-surface workflows possible

Inside the Fazm bridge, the set of built-in MCP servers is declared as a single JavaScript Set. Both playwright and macos-use are members of it. They are spawned by the same parent process, share the same agent context, and are routed per call by the live system prompt.

acp-bridge/src/index.ts

This is not a wrapper or a compatibility layer. Both servers speak the standard Model Context Protocol. The agent chooses between them using the routing rules below.
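
The source listing for this declaration is not reproduced on this page. As a rough sketch only, assuming the Set holds the five built-in server names this guide lists further down, it might look like the following. `isBuiltin` is a hypothetical helper added for illustration, not a function taken from the Fazm source.

```typescript
// Sketch only: the five names come from this guide's list of built-in servers.
// The real declaration in acp-bridge/src/index.ts may differ in detail.
const BUILTIN_MCP_NAMES: ReadonlySet<string> = new Set([
  "fazm_tools",
  "playwright",
  "macos-use",
  "whatsapp",
  "google-workspace",
]);

// Hypothetical helper: true when a server name is one of the built-ins.
function isBuiltin(name: string): boolean {
  return BUILTIN_MCP_NAMES.has(name);
}
```

The point of a single Set is that the browser server and the native server are peers: neither is a plugin of the other.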

The router is a paragraph in the system prompt

There is no glue code that decides which MCP handles which tool call. The model does, by reading five bullet points in its system prompt. This lives in Swift source in the Fazm Mac app at Desktop/Sources/Chat/ChatPrompts.swift lines 56 to 65 and is the actual prompt the model reads every session.

Desktop/Sources/Chat/ChatPrompts.swift
1 turn

The agent picks per tool call, not per workflow. A screenshot lands on fazm_tools, a dashboard click lands on playwright, a Mail.app click lands on macos-use, a text message lands on whatsapp, all inside one chat turn.

How Fazm composes five MCP servers without a graph editor

Fazm versus any n8n browser node

Browser-only tools are optimized for one surface. Fazm is optimized for the workflow boundary between surfaces, which is where real Mac work lives.

Feature comparison: n8n + any browser node versus Fazm

Where the automation can physically reach
n8n + any browser node: Whatever your browser node can render. Everything else is out of scope. If a workflow needs to click a button in Mail or accept a sandboxed Save dialog, you are writing a second automation in a second tool.
Fazm: Anywhere pixels render on macOS. Both inside Chrome AND every native app: Mail, Finder, Settings, WhatsApp Desktop, Numbers, Calendar, Messages, Slack client, Notion desktop. Routed per-call by the system prompt at ChatPrompts.swift lines 56 to 65.

How a browser action and a desktop action share state
n8n + any browser node: They do not, unless you bolt on a second service. You end up with a workflow like: n8n browser node scrapes a page, pipes JSON into an HTTP webhook, that webhook talks to a second VM running macOS with AppleScript to drive the native app. Then you maintain two services.
Fazm: They share the same chat session inside one acp-bridge Node process. Both MCP servers (playwright and macos-use) are pushed onto the same servers array at acp-bridge/src/index.ts lines 1049 and 1058. Intermediate values live in the agent's context, not a JSON payload bus.

Login state on day one
n8n + any browser node: None. Browserless, Anchor Browser, Browser Use Cloud, and Puppeteer all spawn a fresh headless Chromium with a blank profile. Every run re-logs in, which breaks on 2FA, CAPTCHAs, SSO with WebAuthn, and Turnstile.
Fazm: Your real running Chrome profile. The bridge passes --extension to Playwright MCP when PLAYWRIGHT_USE_EXTENSION is set (acp-bridge/src/index.ts line 1030), attaching over CDP to the Chrome you already use. Cookies, 2FA, Cloudflare trust, WebAuthn creds all exist on turn one.

How the agent reads the page or the app
n8n + any browser node: Most 'AI browser' n8n nodes (Browser Use, Airtop, BrowserAct) still screenshot the viewport and ask the model to guess coordinates. Every CSS redesign breaks the guesses. Nothing reads native macOS apps at all.
Fazm: Accessibility tree, not pixels. Playwright's browser_snapshot returns a YAML tree with [ref=e_] tokens; macos-use returns the macOS accessibility hierarchy as text with x/y/w/h per element. The model passes a text ref to click and type, not a guessed pixel coordinate.

What you run and host
n8n + any browser node: A workflow engine. You self-host n8n or pay for cloud, plug in a browser node, build the graph in the editor, handle credentials, manage the executor, and wire up error branches. It is a developer tool, not a user tool.
Fazm: A consumer Mac app. Open Fazm, type into the floating bar. No Docker, no VPS, no self-hosted queue, no webhook ingress, no cron. The app owns the process tree.

How new tool types get added
n8n + any browser node: Install a community node, or wait for an official integration. Community nodes live in npm, are approved per-instance, and you still have to read their docs to learn their auth model.
Fazm: Drop an MCP server into ~/.fazm/mcp-servers.json. The bridge reads it at session start (acp-bridge/src/index.ts lines 1104 to 1115) and merges user servers next to the built-in five. No rebuild, no redeploy.

Surface of failure when a site changes
n8n + any browser node: CSS selector breaks. You open the n8n editor, re-record, re-test, push. This is the recurring maintenance bill everyone on r/n8n complains about.
Fazm: Usually nothing. The ref token is regenerated on the next browser_snapshot. If macOS updates shift a window, macos-use re-reads the accessibility tree on the next refresh_traversal. No brittle selectors.
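
To make the "text ref, not guessed pixel" idea concrete, here is a small sketch that pulls ref tokens out of a snapshot. The sample text is invented for illustration and only approximates what browser_snapshot returns; `extractRefs` and `RefEntry` are hypothetical names, not part of Playwright MCP or Fazm.

```typescript
// Hypothetical shape for one element found in a snapshot.
interface RefEntry {
  ref: string;   // e.g. "e12"
  label: string; // the snapshot line with list dash and ref token stripped
}

// Collect every [ref=...] token together with the text on its line.
function extractRefs(snapshot: string): RefEntry[] {
  const entries: RefEntry[] = [];
  for (const line of snapshot.split("\n")) {
    const match = /\[ref=([A-Za-z0-9_]+)\]/.exec(line);
    if (match) {
      const label = line.replace(match[0], "").replace(/^[\s-]+/, "").trim();
      entries.push({ ref: match[1], label });
    }
  }
  return entries;
}

// Invented sample, shaped loosely like an accessibility snapshot.
const sample = [
  '- heading "Invoices" [ref=e3]',
  '- link "Acme Co" [ref=e7]',
  '- button "Download PDF" [ref=e12]',
].join("\n");

// The agent would pass this ref to a click tool instead of a coordinate.
const downloadRef = extractRefs(sample).find((e) => e.label.includes("Download PDF"))?.ref;
```

Because the lookup keys on the element's role and name rather than its position, a redesign that moves the button leaves the lookup intact; only the ref value changes on the next snapshot.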

What one cross-surface chat turn looks like

Here is a real example of a workflow whose first half every n8n browser node can handle, and whose second half none of them can handle without a second service on your Mac.

Grab a Stripe invoice, save to disk, send on WhatsApp

1. You type one message into Fazm's floating bar

'Find the last Stripe invoice for Acme Co, save the PDF to ~/Documents, then WhatsApp it to the AP contact with a short cover note.' This is one sentence in the Chat page. No graph editor, no credentials panel, no run button.

2. Fazm picks the right MCP per call

The router at ChatPrompts.swift lines 56 to 65 is in the live system prompt. Line 57 routes WhatsApp to whatsapp_* tools. Line 59 routes desktop apps to macos-use. Line 61 routes browser actions to playwright. One system prompt, three surfaces.

3. playwright MCP drives the Stripe Dashboard

browser_snapshot returns a YAML accessibility tree with [ref=e_] tokens. The agent clicks the invoice, clicks Download PDF, waits for the file. Your real Chrome profile is attached, so the 2FA session and the org switcher are already on the right account.

4. fazm_tools writes the PDF to disk

The download appears in ~/Downloads. An execute_sql + file_move call relocates it to ~/Documents/stripe-acme-{date}.pdf. This is one tool call inside the same session, not a webhook to another service.

5. whatsapp MCP sends the file

whatsapp_search locates the AP contact. whatsapp_open_chat opens the thread. whatsapp_get_active_chat verifies the right chat before sending. whatsapp_send_message attaches the PDF and posts the cover note. All inside the same chat turn.

6. macos-use would take over if any step needed a native click

If the contact card were in the native Contacts app instead of WhatsApp's search, macos-use_open_application_and_traverse would list elements with coordinates, macos-use_click_and_traverse would click them, and the session would continue without ever leaving Fazm.
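
The ordering in the WhatsApp step (verify the active chat, then send) can be written down as a check over a list of tool calls. This is a sketch for illustration: the tool names come from the steps above, the run is abridged, and the `ToolCall` shape and `sendIsVerified` function are hypothetical, not part of Fazm.

```typescript
// Hypothetical record of one tool call in a chat turn.
interface ToolCall {
  server: string;
  tool: string;
}

// An abridged version of the run described above, as data.
const run: ToolCall[] = [
  { server: "playwright", tool: "browser_snapshot" },
  { server: "whatsapp", tool: "whatsapp_search" },
  { server: "whatsapp", tool: "whatsapp_open_chat" },
  { server: "whatsapp", tool: "whatsapp_get_active_chat" },
  { server: "whatsapp", tool: "whatsapp_send_message" },
];

// True when every send happens after an active-chat check that follows
// the most recent whatsapp_open_chat.
function sendIsVerified(calls: ToolCall[]): boolean {
  let verified = false;
  for (const call of calls) {
    if (call.tool === "whatsapp_open_chat") verified = false;
    if (call.tool === "whatsapp_get_active_chat") verified = true;
    if (call.tool === "whatsapp_send_message" && !verified) return false;
  }
  return true;
}
```

In Fazm the model makes this choice by following its prompt rather than by running a checker, so treat the function as a way to reason about the sequence, not as a description of the implementation.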

The bridge log for that run

Three MCP servers, four tool calls, one process, zero webhooks. This is the shape of a Fazm workflow at runtime.

acp-bridge stderr (/tmp/fazm-dev.log)

The five built-in MCP servers

Every Fazm install ships with these five. They are the set referenced by BUILTIN_MCP_NAMES at acp-bridge/src/index.ts line 1266. User-supplied servers from ~/.fazm/mcp-servers.json are merged in next to them at session start.
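
A minimal sketch of that merge step follows. The schema of ~/.fazm/mcp-servers.json is not shown in this guide, so the layout below (a top-level `mcpServers` object keyed by server name) is an assumption borrowed from other MCP clients, as is the rule that a user entry never overrides a built-in. `ServerSpec` and `mergeUserServers` are hypothetical names.

```typescript
// Hypothetical launch description for one MCP server.
interface ServerSpec {
  command: string;
  args?: string[];
}

// ASSUMPTION: the file looks like
// { "mcpServers": { "<name>": { "command": "...", "args": ["..."] } } }
function mergeUserServers(
  builtin: Record<string, ServerSpec>,
  userJson: string,
): Record<string, ServerSpec> {
  const merged: Record<string, ServerSpec> = { ...builtin };
  let parsed: unknown;
  try {
    parsed = JSON.parse(userJson);
  } catch (err) {
    return merged; // unreadable file: keep the built-ins only
  }
  const servers =
    (parsed as { mcpServers?: Record<string, ServerSpec> } | null)?.mcpServers ?? {};
  for (const [name, spec] of Object.entries(servers)) {
    // ASSUMPTION: built-in names are reserved; malformed entries are skipped.
    if (!(name in merged) && typeof spec?.command === "string") {
      merged[name] = spec;
    }
  }
  return merged;
}
```

Taking the file contents as a string keeps the function pure; the caller decides where the text comes from and what to do when the file is missing.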

fazm_tools

In-process tool bus for SQL over the local chat database, screenshots, profile lookups, and ask_followup quick-reply buttons. First entry in BUILTIN_MCP_NAMES at acp-bridge/src/index.ts:1266.

playwright

The Node MCP binary from @playwright/mcp, spawned with --extension when the Playwright MCP Bridge token is configured. Attaches to your real Chrome over CDP. Covers every website in one surface.
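
As a sketch of that conditional flag, assuming the switch is the PLAYWRIGHT_USE_EXTENSION environment variable mentioned in the comparison above and that any non-empty value counts as "set". `playwrightMcpArgs` is a hypothetical function name; how the bridge actually launches the binary is not shown in this guide.

```typescript
// Build the argument list for the Playwright MCP child process.
// ASSUMPTION: any non-empty PLAYWRIGHT_USE_EXTENSION value enables the flag.
function playwrightMcpArgs(env: Record<string, string | undefined>): string[] {
  const args = ["@playwright/mcp"];
  if (env["PLAYWRIGHT_USE_EXTENSION"]) {
    // Attach to the already-running Chrome instead of launching a fresh profile.
    args.push("--extension");
  }
  return args;
}
```

In a Node process you would pass `process.env` as the argument and hand the result to whatever launcher you use.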

macos-use

mcp-server-macos-use, a native binary bundled inside the Fazm .app under Contents/MacOS. Reads the macOS accessibility tree and drives any app: Finder, Settings, Mail, Numbers, Calendar, Notes, Photos, Safari, Xcode.

whatsapp

Bundled whatsapp-mcp binary that drives the native WhatsApp Catalyst app via accessibility APIs. Search a chat, open it, verify the active chat, send a message, read the last n messages. No Web QR, no re-pairing.

google-workspace

Bundled Python MCP that talks directly to Gmail, Drive, Calendar, Sheets, Docs over Google Workspace APIs. Uses your own Google Cloud OAuth app from ~/google_workspace_mcp/client_secret.json. No n8n credential store, no shared secret.

Numbers from the shipping build

Read from acp-bridge/src/index.ts and ChatPrompts.swift in the current Fazm repo.

5 built-in MCP servers
2 that cross the browser/desktop boundary (playwright and macos-use)
1 Node process supervising them
0 n8n instances required
1266: line in acp-bridge/src/index.ts where BUILTIN_MCP_NAMES is declared
10 lines: length of the router paragraph at ChatPrompts.swift 56 to 65
20 native Mac surfaces n8n cannot reach, listed below

Where n8n browser nodes stop, and macos-use starts

Every surface below is driven by mcp__macos-use__* tools via the macOS Accessibility API, inside the same Fazm chat turn as your browser actions.
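
If you want to see which server a given call landed on, the namespaced form `mcp__<server>__<tool>` is easy to split. This is a sketch under the assumption that every tool follows the same pattern as the mcp__macos-use__* prefix above; the full tool names in the example are constructed from that pattern for illustration, and `parseToolName` and `countByServer` are hypothetical helpers.

```typescript
// Split a namespaced tool name of the form mcp__<server>__<tool>.
function parseToolName(name: string): { server: string; tool: string } | null {
  const match = /^mcp__(.+?)__(.+)$/.exec(name);
  return match ? { server: match[1], tool: match[2] } : null;
}

// Tally how many calls went to each server; unrecognized names are ignored.
function countByServer(names: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const name of names) {
    const parsed = parseToolName(name);
    if (parsed) counts[parsed.server] = (counts[parsed.server] ?? 0) + 1;
  }
  return counts;
}

// Illustrative names only, built from the prefix pattern.
const tally = countByServer([
  "mcp__playwright__browser_snapshot",
  "mcp__macos-use__click_and_traverse",
  "mcp__macos-use__refresh_traversal",
]);
```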

Mail.app, Calendar.app, WhatsApp Desktop, iMessage, Finder, System Settings, Numbers, Pages, Keynote, Notes.app, Reminders, Photos, Keychain Access, Preview, Terminal, Xcode, Sandboxed Save dialogs, Sandboxed Open dialogs, Notification Center, Menu bar apps

And the n8n-adjacent tools this pattern replaces:

Browserless node, Anchor Browser node, Browser Use community node, Puppeteer community node, Playwright community node, BrowserAct node, Airtop workflow, Browse AI scraper, AppleScript + HTTP Request node, Custom macOS self-hosted runner, Zapier Webhooks bridge, Make.com HTTP module, Selenium Grid VM, Headless Chrome Lambda

When n8n is still the right answer

I am not the right person to tell you to rip out a working workflow. Fazm is not an n8n clone. It does not schedule cron jobs from a server, it does not speak native Postgres triggers, it does not have an SFTP node, it does not run on a VPS without a human in the loop. If that is what you are using n8n for, keep it.

Use Fazm for the part of your stack where a human on a Mac is the fastest path, where you need real cookies and real 2FA state, where the next step after a web interaction is a click inside a native app, or where the cost of spinning up a cloud headless browser is higher than the value of what you are automating.

A clean split: keep n8n for server-to-server glue, and let Fazm own the surface where you actually sit.

See one chat turn cross Chrome, Mail, and WhatsApp live

Book a 20 minute call and we will run a cross-surface workflow on a real Mac together, no slide deck, no recorded demo.


Frequently asked questions

Does Fazm replace n8n completely?

No. n8n is a general workflow engine with hundreds of non-browser integrations (Postgres triggers, S3, SFTP, Kafka, RSS, cron). Fazm replaces the browser-automation slice of n8n, plus every desktop surface n8n cannot reach on a Mac. If your workflow is 'pull an RSS feed every hour and post to Discord', keep n8n. If it is 'find X in the browser, then act on it in Mail, Finder, or a native app', Fazm handles both ends.

Which n8n browser node is closest to what Fazm does?

Browser Use is the closest conceptually because it uses an LLM agent loop against the browser. Fazm differs on three concrete things. First, it attaches to your real Chrome over CDP via --extension (acp-bridge/src/index.ts line 1030), not a cloud Chromium, so logins exist on turn one. Second, it reads the accessibility tree via Playwright's browser_snapshot, not screenshots, so CSS redesigns do not break refs. Third, it also runs a native macOS accessibility MCP server alongside the browser, so workflows are not confined to one surface.

How does Fazm pick between the browser and a native app?

The system prompt at Desktop/Sources/Chat/ChatPrompts.swift lines 56 to 65 is the live router. Line 59 routes desktop apps to macos-use. Line 61 routes web pages to playwright. Line 57 routes WhatsApp specifically to the whatsapp MCP. The model reads this every session and chooses per tool call. There is no graph editor.

Can I host Fazm on a server to run scheduled browser automations like n8n does?

Fazm is a Mac desktop app, not a server runtime. Scheduled runs live on your Mac under the Fazm Scheduled Tasks feature, which wakes the app and re-runs a saved chat turn on a cron. The trade-off is that your Mac has to be on. The upside is that every run has your real cookies, your real Keychain, your real WhatsApp session, and your real file system, without you uploading any of it to a shared host.

How does Fazm compare to Browserless + n8n for scraping?

For pure scraping of public pages where login is not a factor, Browserless wins on throughput. It is purpose-built for concurrent headless sessions. Fazm wins when the target is behind a login, behind 2FA, behind Cloudflare Turnstile, or when the result must end up inside a native Mac app. Fazm also wins on setup cost: one Mac app install versus a Browserless account plus an n8n instance plus a credentials pipeline.

What does 'one process, two MCP servers' actually buy me?

Shared session context. The agent in Fazm keeps state across tool calls inside a single chat turn: the PDF it just downloaded from Stripe is a file path in its working memory. When the next tool call is whatsapp_send_message, it hands that path straight to the WhatsApp MCP. In an n8n workflow, the equivalent is to marshal the file across a browser node, a Write Binary File node, an IF branch, an HTTP Request, and a second automation on a remote Mac to drive WhatsApp Desktop. Five moving parts versus one chat turn.

Is macos-use a screenshot tool?

No. macos-use uses the macOS Accessibility API (ApplicationServices / AXUIElement) to read a text tree of on-screen elements, with exact x, y, w, h per element. Clicks auto-center at (x + w/2, y + h/2). The tool also returns a reference screenshot PNG for the model to sanity-check, but clicking is by coordinate derived from the accessibility tree, not by pixel matching.
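
The centering rule is simple arithmetic. A minimal sketch, with `AXFrame` and `clickPoint` as hypothetical names for illustration:

```typescript
// Frame of one accessibility element, in the x/y/w/h form described above.
interface AXFrame {
  x: number;
  y: number;
  w: number;
  h: number;
}

// Center of the element: (x + w/2, y + h/2).
function clickPoint(frame: AXFrame): { x: number; y: number } {
  return { x: frame.x + frame.w / 2, y: frame.y + frame.h / 2 };
}

// A button at x=100, y=40 that is 80 wide and 20 tall is clicked at (140, 50).
const center = clickPoint({ x: 100, y: 40, w: 80, h: 20 });
```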

Do I need to write any YAML or JSON to use it?

No. The only config file is optional: ~/.fazm/mcp-servers.json for adding user-provided MCP servers. The built-in five (fazm_tools, playwright, macos-use, whatsapp, google-workspace) are spawned by the app on first run. Everything else is a sentence in the chat window.

Can I hand a Fazm chat turn back to my n8n instance as a webhook trigger?

Yes. fazm_tools ships a generic HTTP call tool. At the end of a Fazm chat turn, the agent can POST to an n8n webhook with a JSON body, which then triggers whatever downstream nodes you still want to keep in n8n. Treat Fazm as the 'last mile' that reaches surfaces n8n cannot, and keep n8n as the long-running scheduler if you already have one.
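
For orientation, this is roughly what such a hand-off request looks like from the n8n side. It is a sketch, not Fazm's tool: the webhook URL and payload fields are placeholders, and `buildWebhookRequest` is a hypothetical helper that only assembles the request.

```typescript
// Hypothetical container for a ready-to-send webhook request.
interface WebhookRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Assemble a JSON POST for an n8n Webhook node. Nothing is sent here.
function buildWebhookRequest(
  webhookUrl: string,
  payload: Record<string, unknown>,
): WebhookRequest {
  return {
    url: webhookUrl,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

// Placeholder URL and fields; substitute your own webhook path and data.
const handoff = buildWebhookRequest("https://n8n.example.com/webhook/fazm-done", {
  status: "sent",
  file: "stripe-acme.pdf",
});
```

To send it from a script, pass the two parts to the standard `fetch(handoff.url, handoff.init)` call. Keeping construction separate from sending makes the payload easy to inspect before anything leaves your Mac.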

fazm. AI Computer Agent for macOS
© 2026 fazm. All rights reserved.
