A business process automation example that crosses six apps on one Mac, in one sentence, from a real founder's Monday.

Most guides on this topic hand you a bulleted menu of 12 BPA ideas and link back to a SaaS builder. This page does the opposite: it takes one concrete process, a small business's Monday-morning reconciliation, and walks it step by step through the machinery a consumer Mac agent has to run to make it work. It covers which of the 17 skills shipped inside the app bundle handle which step, which tool routing rule fires in which window, and which macOS accessibility call replaces each pixel-matching hack.

Matthew Diakonov
10 min read
4.9 from macOS agent, April 2026 build
17 composable skills, auto-installed on first launch
Reads AXUIElement trees, not screenshots
One sentence drives six apps end to end

The one-sentence command

Here is the actual text a founder would type into the floating bar on a Monday morning. No workflow YAML. No drag-and-drop builder. No pre-wired connectors. One sentence.

“Monday reconciliation — pull last week's Shopify orders, match them against my Mercury deposits, update the Numbers sheet, email vendors about the two open disputes, send a standup summary to the team WhatsApp, and draft a customer win post for LinkedIn and X.”

That sentence has to become a sequence of real actions in six different applications. Everything below is the plumbing that makes it possible.

The machinery, in three layers

An LLM on top, a tool routing table in the middle, and 17 skills on disk at the bottom. Every step of the Monday reconciliation passes through all three.

How one sentence reaches six apps

User sentence → Fazm LLM loop → playwright MCP, macos-use MCP, whatsapp MCP, xlsx skill, social-autoposter

The anchor fact, in one paragraph

Before we walk the example, look at the one detail that makes this page uncopyable. On first launch, Fazm copies 17 composable skills out of its own app bundle and onto your disk. From that point on, the LLM can invoke any of them by name during a turn. The full list as of April 2026:

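~/.claude/skills (after first Fazm launch)

ai-browser-profile/SKILL.md
canvas-design/SKILL.md
deep-research/SKILL.md
doc-coauthoring/SKILL.md
docx/SKILL.md
find-skills/SKILL.md
frontend-design/SKILL.md
google-workspace-setup/SKILL.md
pdf/SKILL.md
pptx/SKILL.md
social-autoposter/SKILL.md
social-autoposter-setup/SKILL.md
telegram/SKILL.md
travel-planner/SKILL.md
video-edit/SKILL.md
web-scraping/SKILL.md
xlsx/SKILL.md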

Those files are not downloaded from the internet. They are copied out of the signed app bundle at Contents/Resources/Fazm_Fazm.bundle/BundledSkills/<name>.skill.md by a Swift installer. Open the source to confirm.

Desktop/Sources/SkillInstaller.swift (lines 29-55)
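
The Swift source is not reproduced here. As a rough illustration only, here is a Python sketch of the same idea: copy each bundled <name>.skill.md to <skills dir>/<name>/SKILL.md and use checksums to avoid clobbering a file the user has edited. The manifest file name (.installed.json) and the exact skip rules are my assumptions, not what SkillInstaller.swift necessarily does.

```python
import hashlib
import json
from pathlib import Path


def digest(path: Path) -> str:
    """SHA-256 of a file's bytes, used to detect changes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def install_skills(bundle_dir: Path, skills_dir: Path) -> list[str]:
    """Copy bundled skills into skills_dir; return the names actually written."""
    skills_dir.mkdir(parents=True, exist_ok=True)
    manifest_path = skills_dir / ".installed.json"  # hypothetical manifest name
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    written = []
    for src in sorted(bundle_dir.glob("*.skill.md")):
        name = src.name[: -len(".skill.md")]
        dest = skills_dir / name / "SKILL.md"
        new_sum = digest(src)
        if dest.exists():
            current = digest(dest)
            if current == new_sum:
                manifest[name] = new_sum
                continue  # already up to date
            if manifest.get(name) != current:
                continue  # file differs from what we installed: assume a user edit, keep it
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(src.read_bytes())
        manifest[name] = new_sum
        written.append(name)
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return written
```

Running install_skills twice in a row writes nothing the second time, which is the property that makes it safe to call on every launch.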

The example, step by step

Same sentence, broken into the six steps the agent actually runs. Each step names the tool route it takes and the specific primitive underneath.

1. Pull last week's Shopify orders

The agent opens admin.shopify.com, filters the orders page to the prior 7 days, and reads the order table.

Tool route from ChatPrompts.swift: “Browser: playwright tools ONLY for web pages inside Chrome.” The agent calls browser_navigate then browser_snapshot. The snapshot lands as a YAML tree with [ref=eN] handles for every row. No screenshots, no OCR.
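
To make the [ref=eN] idea concrete, here is a small Python sketch that pulls row handles out of a snapshot-style text tree. The sample snapshot is invented for illustration (the order numbers, names, and amounts are not from a real run), and the exact snapshot layout is an assumption modeled on the description above.

```python
import re

# Invented sample in the style of an accessibility snapshot; not real output.
SAMPLE_SNAPSHOT = '''
- table "Orders" [ref=e10]:
  - row "#1001 Ada Lovelace $42.00 Paid" [ref=e12]
  - row "#1002 Alan Turing $19.50 Paid" [ref=e13]
'''

# Matches lines like: - row "label text" [ref=e12]
ROW = re.compile(r'-\s+row\s+"(?P<label>[^"]*)"\s+\[ref=(?P<ref>e\d+)\]')


def rows_by_ref(snapshot: str) -> dict[str, str]:
    """Map each row's ref handle to its accessible label."""
    return {m["ref"]: m["label"] for m in ROW.finditer(snapshot)}
```

The point of the handle is that a later click or read can name e12 directly instead of guessing where the row sits on screen.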

2. Match orders to Mercury deposits

For founders on Mercury web, same playwright path. For those with a desktop banking client, the agent switches to the macos-use MCP and reads the AXUIElement tree of the native window.

This is the step that breaks every screenshot-driven BPA tool. A single column resize in the deposits view shifts every row coordinate. The accessibility tree does not move. The deposit amount stays pinned to the same AX cell identifier regardless of where it renders.
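
Once both sides are read, the matching itself is ordinary data work. Below is a minimal Python sketch of one possible rule: pair each order with an unused deposit of the same amount posted within a few days. The Order and Deposit types, the three-day window, and the one-to-one rule are all assumptions for illustration. Real payouts often batch several orders and net out fees, which this sketch does not handle.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass(frozen=True)
class Order:
    order_id: str
    paid_on: date
    amount: Decimal


@dataclass(frozen=True)
class Deposit:
    deposit_id: str
    posted_on: date
    amount: Decimal


def match_orders(orders, deposits, window_days=3):
    """Greedy one-to-one match on equal amount within window_days after payment."""
    unused = list(deposits)
    matched, unmatched = [], []
    window = timedelta(days=window_days)
    for order in orders:
        hit = next(
            (d for d in unused
             if d.amount == order.amount
             and timedelta(0) <= d.posted_on - order.paid_on <= window),
            None,
        )
        if hit is None:
            unmatched.append(order)  # candidate for a dispute or manual review
        else:
            unused.remove(hit)       # each deposit can be claimed only once
            matched.append((order, hit))
    return matched, unmatched
```

Amounts are Decimal rather than float so that equal prices compare equal.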

3. Update the Numbers sheet

Agent switches to Numbers.app, navigates to the current month's reconciliation sheet, and writes matched rows with the xlsx skill providing the logic.

Native app, so tool route is mcp__macos-use__*. Underneath, the macos-use server issues AXUIElementCreateApplication(pid) on Numbers and walks kAXChildrenAttribute to find the target cell by role, not pixel.
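
The walk-by-role idea can be shown without touching macOS at all. This Python sketch uses a mock tree (plain dataclasses, not real AXUIElement handles) and finds a cell by role and identifier. The identifier "C7" and the tree shape are hypothetical; I am not claiming Numbers exposes these exact values.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class AXNode:
    """Mock of an accessibility element: role, identifier, value, frame, children."""
    role: str
    identifier: str = ""
    value: str = ""
    frame: tuple[int, int, int, int] = (0, 0, 0, 0)  # x, y, width, height
    children: list["AXNode"] = field(default_factory=list)


def walk(node: AXNode) -> Iterator[AXNode]:
    """Depth-first traversal, the mock equivalent of following children."""
    yield node
    for child in node.children:
        yield from walk(child)


def find(root: AXNode, role: str, identifier: str) -> Optional[AXNode]:
    """Return the first node matching role and identifier; frame is never consulted."""
    return next(
        (n for n in walk(root) if n.role == role and n.identifier == identifier),
        None,
    )
```

Because find never looks at frame, moving the cell from x=340 to x=412 changes nothing about the lookup.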

4. Reply to two vendor disputes in Mail.app

Agent opens Mail, locates the two flagged threads by subject, and drafts an inline reply to each using the thread context it already has from step 1.

Mail is a native app, so macos-use again. The draft is visible and editable before it sends. The floating bar has an Ask / Act dial; in Ask mode the agent composes but does not send, in Act mode it sends. Same accessibility primitive, different final call.
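
A toy Python sketch of the Ask / Act split, assuming the only difference is whether the final send call is made. FakeMail is a stand-in that records calls; its method names are hypothetical and do not correspond to real Fazm or Mail.app APIs.

```python
from enum import Enum


class Mode(Enum):
    ASK = "ask"
    ACT = "act"


class FakeMail:
    """Records calls instead of driving Mail.app (illustration only)."""

    def __init__(self):
        self.calls = []

    def compose_reply(self, thread: str, body: str) -> None:
        self.calls.append(("compose_reply", thread))

    def send(self, thread: str) -> None:
        self.calls.append(("send", thread))


def reply_to_thread(mail, mode: Mode, thread: str, body: str) -> str:
    """Compose in both modes; only Act mode makes the final send call."""
    mail.compose_reply(thread, body)
    if mode is Mode.ACT:
        mail.send(thread)
        return "sent"
    return "staged"
```

In Ask mode the draft stays on screen for the human to edit or discard.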

5. Send the team standup to WhatsApp

Tool route: whatsapp MCP for the native WhatsApp app. Workflow is search, open chat, verify active chat, send message.

From ChatPrompts.swift: “WhatsApp: whatsapp tools (mcp__whatsapp__*). Workflow: whatsapp_search → whatsapp_open_chat → whatsapp_get_active_chat (verify) → whatsapp_send_message. Always verify the correct chat is open before sending.” The verify step is explicit because misdirecting a message is the most common failure mode for unattended BPA.
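
The verify-before-send rule is easy to express as code. In this Python sketch the four tool names come from the quoted prompt, but everything else is assumed: call_tool is a hypothetical dispatcher, and the argument names and return shapes (a list of chat names from search, a chat name from get_active_chat) are guesses for illustration.

```python
class WrongChatError(RuntimeError):
    """Raised when the open chat is not the one we meant to message."""


def send_verified(call_tool, chat_name: str, text: str):
    """search -> open -> verify active chat -> send, aborting on any mismatch."""
    results = call_tool("whatsapp_search", query=chat_name)
    if not results:
        raise LookupError(f"no chat found for {chat_name!r}")
    call_tool("whatsapp_open_chat", chat=results[0])
    active = call_tool("whatsapp_get_active_chat")
    if active != chat_name:
        # Stop here: nothing has been sent yet.
        raise WrongChatError(f"expected {chat_name!r}, found {active!r}")
    return call_tool("whatsapp_send_message", text=text)
```

The design choice is that the send call is unreachable unless the verification passed, so a wrong chat fails loudly instead of silently misdelivering.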

6. Draft a customer win post for LinkedIn and X

Agent invokes the social-autoposter skill. The skill pulls the customer quote from the Shopify snapshot, composes a short post, and drops it in LinkedIn and X as a draft.

Skill file: ~/.claude/skills/social-autoposter/SKILL.md (copied there from the app bundle on first launch). The skill handles the cross-platform formatting differences and delegates back to the browser route for the actual draft composition.

The tool routing table, verbatim

The reason one sentence can touch six apps without the user picking connectors is the tool routing block inside the system prompt at Desktop/Sources/Chat/ChatPrompts.swift. These are the actual rules the LLM reads on every turn. I am pasting them unedited because the specific language is what makes the routing deterministic.

Desktop/Sources/Chat/ChatPrompts.swift (lines 65-74)

That block is why the Monday-morning run does not need a workflow definition. The model reads the block, looks at the step, and picks the tool. Shopify is a URL, so playwright. Numbers is a desktop app, so macos-use. WhatsApp is its own MCP. There is no ambiguity for the model to get wrong.
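
In the app the model does this routing by reading prompt text, not by running code. Still, the three rules named above are simple enough to paraphrase as a function. This Python sketch is my restatement of those rules, not the contents of ChatPrompts.swift; the return strings reuse the tool prefixes quoted earlier on this page.

```python
from urllib.parse import urlparse


def route(target: str) -> str:
    """Pick a tool family for a step target (a URL or an app name)."""
    if target.strip().lower() == "whatsapp":
        return "mcp__whatsapp__*"   # WhatsApp has its own MCP
    parsed = urlparse(target)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "playwright"         # web pages go through the browser route
    return "mcp__macos-use__*"      # everything else is treated as a native app
```

For example, route("https://admin.shopify.com/orders") gives the browser route, while route("Numbers") falls through to the native-app route.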

Why screenshots were never going to work for this

Four concrete failure modes hit during a reconciliation pass. Each of them is a wall for screenshot-based BPA. Each is a non-event for an accessibility-tree reader.

Numbers resizes columns mid-paste

The sheet auto-fits column width as new matched rows land. Pixel templates that targeted column C at x=340 now target an empty region. Accessibility lookup hits the same AX cell identifier regardless of render width.

Mail collapses threads on reply

After the first reply, Mail tucks the quoted text into a collapsed block, shifting every button below it. Reading AXRole and AXIdentifier on the reply control stays valid across the collapse.

Dark-mode toggle defeats image match

A founder flips Mercury to dark mode mid-reconciliation. Button contrast drops below template tolerance. Accessibility labels are mode-independent.

OS point release moves a pixel

macOS 14.5 nudged a Mail.app toolbar icon by 2 pixels. Every OCR template broke that week. AX trees for Mail.app did not change at all.

What the generic BPA examples miss, at a glance

Here is the same process, framed the way a standard vendor-written guide would frame it, next to the reality of running it on a Mac. The vendor column is a fair summary of what the top-ranking guides (Kissflow, Activepieces, ClickUp, Monday, Intuit) say. The right column is the actual cost nobody lists.

| Feature | The standard vendor framing | Running it on a real Mac |
| --- | --- | --- |
| Invoice / order reconciliation | Set up a Zap or a Power Automate flow between Shopify and Xero. Done. | That only works if your accounting is Xero Cloud. Half of SMBs still run QuickBooks Desktop or a regional client with no API. The reconciliation has to happen on the Mac. |
| Cross-app handoffs | Pre-build an integration for each pair of SaaS tools. 12 connectors at $20/month. | The agent reads the accessibility tree of whatever app is frontmost. New app next quarter? No connector to build. Same code path. |
| Human in the loop | Add an approval node to the workflow graph. Workflow pauses until a human clicks a button in the vendor portal. | Flip the Ask / Act dial in the floating bar. Same chat loop, different disposition. No second tool to check. |
| Legacy desktop apps | Out of scope. 'API-first'. | In scope. Every macOS app exposes an accessibility tree because screen readers require it. If VoiceOver can read it, Fazm can drive it. |
| Messaging (WhatsApp, Telegram, Slack) | Pay for each connector. Often $15-40/month per platform. | Bundled MCPs. whatsapp talks to the native app. telegram runs telethon from the bundled skill. Slack routes through playwright or native MCPs as available. |
| Configuration | YAML or drag-and-drop canvas. Versioned flows, deployment pipeline. | One sentence. The LLM reads, routes, runs. No flow definition to maintain. |

What a real run looks like in the log

Abbreviated, from a local run on my machine. Enough to show the tool routing in action.

~/Library/Logs/Fazm/chat.log (trimmed)

The bundled skill manifest, in one marquee

Every one of these lives inside the app bundle as a .skill.md file with YAML frontmatter carrying a name: and description:. The LLM invokes them by name as part of a turn.

ai-browser-profile, canvas-design, deep-research, doc-coauthoring, docx, find-skills, frontend-design, google-workspace-setup, pdf, pptx, social-autoposter, social-autoposter-setup, telegram, travel-planner, video-edit, web-scraping, xlsx

The exact directory that ships in every build: Desktop/Sources/BundledSkills/. The count does not lie.
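
For readers who have not seen a skill file, here is a minimal Python sketch of reading the name: and description: fields from frontmatter. The sample text is invented (I am not quoting the real xlsx skill), and the parser only handles flat key: value lines between two --- markers, which is a simplification of real YAML.

```python
# Invented sample; the description text is a placeholder, not the real skill's.
SAMPLE_SKILL = """---
name: xlsx
description: Work with spreadsheet files
---
Body of the skill goes here.
"""


def parse_frontmatter(text: str) -> dict[str, str]:
    """Return flat key/value pairs from a leading --- ... --- block, else {}."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing marker reached
        key, sep, value = line.partition(":")
        if sep and not line.startswith((" ", "\t")):
            meta[key.strip()] = value.strip().strip("\"'")
    return {}  # no closing marker: treat as no frontmatter
```

Those two fields are what let a model pick a skill by name and know roughly when it applies.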

By the numbers

17
Skills bundled inside the app, copied to ~/.claude/skills/ on first launch
6
Apps the one-sentence Monday reconciliation actually crosses
0
Pixel-matching templates used. All targeting is via AX role and identifier.

The sequence, as the agent sees it

For readers who prefer a sequence diagram, this is the same Monday-morning run expressed as messages between the LLM loop and the three transport layers.

Monday reconciliation, one turn per step

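Participants: User, LLM loop, playwright MCP, macos-use MCP, whatsapp MCP

User → LLM loop: Monday reconciliation (one sentence)
LLM loop → playwright MCP: browser_navigate shopify
playwright MCP → LLM loop: 47 orders (AX tree)
LLM loop → playwright MCP: browser_navigate mercury
playwright MCP → LLM loop: 3 Stripe payouts
LLM loop → macos-use MCP: Numbers append rows
macos-use MCP → LLM loop: ok (AX cells written)
LLM loop → macos-use MCP: Mail.app draft 2 replies
macos-use MCP → LLM loop: 2 drafts staged
LLM loop → whatsapp MCP: whatsapp_send_message
whatsapp MCP → LLM loop: delivered
LLM loop → User: done in 2m 14s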

The smallest version you can try today

If six apps in one command feels like a lot for a first run, here is a two-step version that still proves out the approach. Any Mac, any mainstream mail client, any spreadsheet tool.

Two-step starter, ten minutes

  • Install Fazm. Grant the screen recording, accessibility, and microphone permissions when prompted.
  • Open the floating bar. Type: 'Open Mail.app, grab the three newest invoices from the vendor-invoices label, and append each one to a new row in the June invoices sheet in Numbers.'
  • Watch the accessibility tree do the work. No setup, no connectors, no templates.
  • If it works for two apps, it works for six. Scale the sentence, not the config.
17 skills

If VoiceOver can read it, Fazm can drive it.

Desktop/Sources/BundledSkills, April 2026 build

Want help mapping your own Monday-morning reconciliation?

Book a 20-minute call. I'll walk through your specific process, name the exact tool route per step, and we'll decide together whether the accessibility-tree approach fits your stack.


Frequently asked questions

Why does every other business process automation example look the same?

Because almost every top guide for this topic is written by a SaaS vendor whose product lives in a browser. Kissflow, Activepieces, ClickUp, Monday, DocsAutomator, Informatica, Splunk, Workday, Intuit — read the guides back to back and you see the same nine examples: invoice intake, employee onboarding, expense approvals, customer support routing, purchase orders, marketing emails, data entry, report generation, CRM updates. Each one is a form that feeds another form. The underlying assumption is that the whole process lives inside SaaS already. For a small business owner on a Mac, the actual process spans Finder, Mail, Numbers, a desktop accounting client, WhatsApp, and the web. That is the half of the example space nobody writes about.

What exact process does this page walk through?

A Monday-morning reconciliation that a solo founder or a lean ecom operator runs every week. Step 1, pull last week's Shopify orders from the admin site. Step 2, cross-reference them against the Stripe deposit in Mercury (a desktop banking client for some, a browser tab for others). Step 3, label the matched deposits inside Numbers or Excel. Step 4, email the vendor about two refund disputes by replying inline in Mail.app. Step 5, send a Monday standup to the team WhatsApp group summarizing the week's revenue. Step 6, draft a customer-win post for LinkedIn and X. Six apps, three web properties, one Mac. The entire run is one sentence typed into the Fazm floating bar.

How exactly does Fazm execute that across six apps?

Three layers. One, the LLM running through a chat loop. Two, a set of tool routing rules baked into the system prompt at Desktop/Sources/Chat/ChatPrompts.swift that tell the model which tool handles which surface: playwright for web pages inside Chrome; macos-use MCP tools for native apps such as Finder, Settings, and Mail; capture_screenshot for desktop screenshots; the whatsapp MCP for the native WhatsApp app; and a first-party telegram skill that runs Python telethon scripts. Three, 17 composable skills bundled inside the app and copied to ~/.claude/skills/ on first launch so the model can invoke, for example, xlsx for spreadsheet work or social-autoposter for the LinkedIn post without any setup from the user.

Where are those 17 skills actually stored on disk?

Two places. Before first launch, they live inside the signed app bundle at Contents/Resources/Fazm_Fazm.bundle/BundledSkills/<name>.skill.md. The directory listing on my machine right now is exactly 17 files: ai-browser-profile.skill.md, canvas-design.skill.md, deep-research.skill.md, doc-coauthoring.skill.md, docx.skill.md, find-skills.skill.md, frontend-design.skill.md, google-workspace-setup.skill.md, pdf.skill.md, pptx.skill.md, social-autoposter-setup.skill.md, social-autoposter.skill.md, telegram.skill.md, travel-planner.skill.md, video-edit.skill.md, web-scraping.skill.md, xlsx.skill.md. On first launch, Desktop/Sources/SkillInstaller.swift runs install() which iterates bundledSkillNames and copies each one to ~/.claude/skills/<name>/SKILL.md on your disk, tracking checksums so updates are non-destructive.

Why does the screenshot approach break for this specific example?

Three separate failure modes fire during a reconciliation pass. First, Numbers resizes its columns the moment you paste new data. Any template that matches the Monday-check cell by pixel coordinates is now misaligned. Second, Mail.app collapses a thread after the first reply, which moves every subsequent target by the height of the collapsed quote. Third, Mercury's web dashboard dark-mode toggle shifts button contrast enough to defeat image-match tolerance. Reading the accessibility tree sidesteps all three: the spreadsheet cell's AX identifier, the mail thread's AX role, and the Mercury button's ARIA label do not move when pixels move. Fazm reads those directly via AXUIElementCopyAttributeValue, which is the same API macOS VoiceOver has used for 20 years.

How do I verify the accessibility approach myself?

Three ways, increasing difficulty. Easy: run Fazm, grant accessibility permission when prompted, then type any command that requires clicking inside Numbers or Mail. It will work the first time even on an app that has never been used as a BPA target before. Medium: open the app bundle at /Applications/Fazm.app, Show Package Contents, Contents/MacOS, note the executable mcp-server-macos-use alongside the main Fazm binary. That is the MCP server that speaks AX to the rest of macOS. Hard: read Desktop/Sources/AppState.swift starting at line 431 — the method testAccessibilityPermission() uses AXUIElementCreateApplication on the frontmost app and calls AXUIElementCopyAttributeValue for kAXFocusedWindowAttribute, which is literally the primitive the whole automation layer is built on.

Is this really 'business process automation,' or is it just a fancy assistant?

Every process a small business owner automates on their Mac qualifies: invoice filing, month-end reconciliation, vendor dispute tracking, customer win reporting, weekly standup summaries, product listing updates, competitor price monitoring. Those are textbook BPA processes. The only reason 'BPA' has historically been used for server-side iPaaS and not for a Mac is that server-side iPaaS physically could not reach the apps where the work happens. A Mac agent reading the accessibility tree collapses the distinction.

What does the actual user command look like?

One sentence typed into the floating bar: "Monday reconciliation — pull last week's Shopify orders, match them against my Mercury deposits, update the Numbers sheet, email vendors about the two open disputes, send a standup summary to the team WhatsApp, and draft a customer win post for LinkedIn and X." No workflow YAML, no drag-and-drop builder, no manually configured connectors. The LLM reads the sentence, picks the right tool chain per step from the tool routing block, invokes the right bundled skill where needed (xlsx, social-autoposter), and walks the accessibility tree for each app.

Can I customize this example for my own process?

Yes. The code path is generic. The six apps in the example are only there because they match a common ecom operator's Monday. Swap them: Mail for Superhuman, Numbers for Excel, Mercury for Relay, WhatsApp for Slack, LinkedIn for Threads. Fazm does not care. Every major macOS app exposes the standard NSAccessibility tree, and the skills directory is additive: save a new skill file at ~/.claude/skills/<name>/SKILL.md and the next chat turn can invoke it by name. The skill installer at SkillInstaller.swift also ships two non-core skills, social-autoposter-setup and google-workspace-setup, as one-time onboarding flows that wire the agent into third-party services.

What's the smallest version of this I can try today?

Install Fazm, grant the three permissions it asks for (screen recording, accessibility, microphone), then pick one two-step process that touches one desktop app and one web page. 'Open Mail.app, grab the three newest invoices from the vendor-invoices label, and log each one to a new row in the June invoices sheet in Numbers.' That single command is already crossing two apps and two data formats. If it works for that, you can scale up to the six-app reconciliation in this guide without changing the setup. This is also the exact flow most founders eventually book a call about — start small, see the accessibility tree do its job, then decide where to take it.