The business process automation consultant playbook, and the markdown file that compresses half of it

A typical BPA engagement hands you two artifacts: a process map and an RPA implementation. The map is talk, the build is labor, and most SMB projects pay a consultant to bridge the two with UiPath or Power Automate. This guide is about when that bridge still makes sense, and when a .skill.md file plus the macOS accessibility tree is already enough.

Matthew Diakonov
11 min read
Runs on macOS 14+
Reads any app via AXUIElement
Skills are portable markdown
Open source
50%+ of engagement hours are implementation, not discovery

Most clients reach ROI inside 6 to 12 months of a BPA engagement. For an SMB running a process on one Mac, that timeline collapses when the implementation phase collapses.

Industry benchmark across top BPA consultancies

What you are actually paying a consultant for

The top search results for this keyword (ProsperSpark, Bain, Grantbot, much. Consulting, BHW Group) describe the deliverable the same way: interviews, a process map, tool selection among UiPath, Automation Anywhere, and Power Automate, then a build. The process map is cheap to produce and looks much the same at every firm. The build is expensive because every RPA platform wants its own runtime artifact, and those artifacts do not look like English.

That second phase is where billable hours concentrate. A two-month engagement for a single end-to-end process typically sits in the five-figure range. Most of that is the developer re-encoding your process into a .xaml workflow, a .atmx bot, or a JSON flow that reproduces what the map already said.

$150/hr: low-end BPA consultant rate
$60k: upper-end SMB engagement
24 attempts: accessibility retry ceiling
5 s: AX re-probe interval

The two artifacts, side by side

The contrast is not about cost. It is about how much translation sits between the thing you describe and the thing your computer runs. Move that translation layer into the model and the handoff shrinks.

The consultant handoff vs the skill file

A process map (Visio, Lucid) plus a custom RPA workflow encoded in UiPath .xaml, Power Automate JSON, or Automation Anywhere .atmx. The map lives in one file, the implementation in another, and the two drift every time the business changes.

  • Two artifacts, different formats
  • Developer required to edit either one
  • Breaks when the UI refreshes
  • Locked to the vendor's runtime

How work flows in each model

Inputs: interviews, shadowing, existing SOPs
Middle: skill file
Targets: native apps, legacy desktop, browser flows

In the consultant model, discovery inputs flow into a Visio artifact on the left, then to a separate RPA runtime on the right. Fazm merges the middle: one skill file is both the documented process and the runtime.

The anchor fact: how Fazm reads your apps

Consultants usually reach for UiPath or Power Automate because those tools ship recorders that capture clicks and replay them. Those recorders rely on a mix of UIAutomation bindings and image matching. On macOS, Fazm uses the native accessibility API that every screen reader already uses. The core probe lives in Desktop/Sources/AppState.swift: it calls `AXUIElementCreateApplication(frontApp.processIdentifier)`, then `AXUIElementCopyAttributeValue(appElement, kAXFocusedWindowAttribute, &focusedWindow)` against the frontmost app.

Two details matter here. First, `AXUIElementCreateApplication(pid)` works against any running macOS app, not just the browser. If VoiceOver can read it, Fazm can drive it. Second, the Finder tie-breaker: macOS sometimes reports the TCC permission as granted while AX calls still fail, typically after an OS update. When that happens, Fazm uses Finder as a known-good reference app and re-probes every 5 seconds for up to 24 attempts, so the agent does not silently stall.
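
The retry behaviour can be sketched without touching the macOS APIs. What follows is a minimal Python model, not Fazm's Swift code: `probe` stands in for whatever call checks that the AX tree is readable, and the function name, parameters, and injected `sleep` are illustrative assumptions. Only the 5-second interval and the 24-attempt ceiling come from the article.

```python
import time
from typing import Callable, Optional


def wait_for_accessibility(
    probe: Callable[[], bool],
    interval_s: float = 5.0,   # re-probe interval quoted in the article
    max_attempts: int = 24,    # retry ceiling quoted in the article
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Return the 1-based attempt on which `probe` succeeded, or None if it never did.

    `probe` is a hypothetical stand-in for the real accessibility check.
    """
    for attempt in range(1, max_attempts + 1):
        if probe():
            return attempt
        if attempt < max_attempts:
            # Wait before the next probe, but not after the final one.
            sleep(interval_s)
    return None
```

With the defaults, a probe that never succeeds is abandoned after 24 calls and 23 waits of 5 seconds, so the caller gets an explicit `None` rather than a silent stall. Injecting `sleep` keeps the loop testable without real delays.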

Apps where the accessibility tree already works

Mail, Finder, Safari, Chrome, Numbers, Excel, Word, Preview, QuickBooks, Notion, Figma, Slack, Terminal, Xcode, Photoshop, Shopify (browser), Stripe (browser), Salesforce (browser), Linear, Messages

Any macOS app that ships with VoiceOver support, which covers effectively all mainstream apps, exposes an AX tree. That is the set Fazm can drive.

How to run the self-serve version of the engagement

A consultant will run four phases on an SMB project: discover, map, build, test. Fazm skips none of them; it flattens the build phase into editing a file.

Four steps, one afternoon

1. Describe the process in plain English

Open a new file. Write what you do today, step by step, as if you were handing it to a new hire. No diagrams. No tool names. Just the sequence of apps touched and the fields read or written.

2. Save it as a skill file

Drop it in ~/.claude/skills/<slug>/SKILL.md with a minimal YAML header (name, description). That is the exact location Fazm's SkillInstaller installs to, and the discovery path ChatProvider reads from at runtime.

3. Run it once and watch

Trigger the skill from the floating bar. Fazm walks the AX tree of the frontmost app, finds the buttons and fields you named, and executes. Watch where it stalls and edit the markdown. No UiPath Studio, no recompile.

4. Ship it to your team

The skill is a single portable file. Copy it, commit it to a repo, share it in Slack. A teammate can drop it into their ~/.claude/skills/ directory and run the same process on their Mac.
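
Step 2 can be sketched in a few lines. This is an illustration only, not part of Fazm: `write_skill` is a hypothetical helper, and it assumes the header values are plain single-line strings with no characters that YAML would need quoted. The `<slug>/SKILL.md` layout and the `name` and `description` keys are the ones named in step 2.

```python
from pathlib import Path


def write_skill(skills_root: Path, slug: str, name: str,
                description: str, body: str) -> Path:
    """Write <skills_root>/<slug>/SKILL.md with a minimal YAML header.

    Hypothetical helper: assumes `name` and `description` are simple
    one-line strings that need no YAML quoting.
    """
    skill_dir = skills_root / slug
    skill_dir.mkdir(parents=True, exist_ok=True)
    header = f"---\nname: {name}\ndescription: {description}\n---\n\n"
    path = skill_dir / "SKILL.md"
    # The body is the plain-English procedure from step 1.
    path.write_text(header + body.strip() + "\n", encoding="utf-8")
    return path
```

Calling it with `Path.home() / ".claude" / "skills"` as `skills_root` would place the file in the directory step 2 names. The body is the procedure you wrote in step 1, unchanged.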

Installing and running a skill

The tool-call logs are literal. Fazm names the AX role and label it is acting on, so when something misbehaves the log tells you whether the model misread the step or the app changed its tree.

What fits in a skill file, what still needs a human

Single-Mac processes

Month-end close, invoice intake, vendor reconciliation, weekly ops reports. Anything one operator does on one machine.

Legacy desktop apps

QuickBooks Desktop, older ERP clients, practice management software with no API. If VoiceOver can read it, Fazm can drive it.

Mail-to-app workflows

Pull an attachment from Mail, parse it in Preview or Numbers, push it to a native app.

Cross-team change management

Re-designing a process that touches 40 people across departments. Still needs a human to run interviews and get buy-in.

Compliance-signed workflows

SOX/HIPAA-bound processes where you need a signed artifact and an auditor's sign-off. Consultants still lead here.

SMB quick wins

If the process lives on one Mac and touches four or fewer native apps, a skill file usually beats a consultant engagement on time and cost.

When you should still hire a consultant

Fazm does not replace the insight work. If the answer is "re-design the process entirely" rather than "automate the current process faster," you need a human who can interview stakeholders, spot political blockers, and publish a signed SOP the finance team will accept. The rule of thumb: if the process would still be confusing after you automated it, the problem is the process, not the execution speed, and a consultant earns their hours. For everything else, write the skill file first and see how far it gets.

Consultant-led RPA vs self-served Fazm skill

Feature | Consultant + RPA | Fazm skill.md
Deliverable format | Visio/Lucid map + UiPath .xaml or Power Automate JSON | Single markdown file with YAML header
Who edits it | Developer with RPA Studio experience | Anyone who can write an SOP
Runtime | RPA platform license, cloud orchestrator, separate agent | Your Mac, no orchestrator
Reads legacy desktop apps | Via UIAutomation bindings, brittle | Via AXUIElement, same tree as VoiceOver
Breaks on UI refresh | Frequently, pixel-based matchers fail | Rare, roles survive redesigns
Time to first working run | 4 to 8 weeks | Hours to a day
Cost for one SMB process | $15k to $60k | Free, Fazm app + the skill file
Portable across machines | No, tied to platform tenant | Yes, drop the .md file into ~/.claude/skills/
Needs change management | Yes, always | Only if process is cross-team

A fair comparison, not an attack. For org-wide redesigns a consultant is still the right tool.

The numbers that decide hire vs self-serve

Rough decision rule: count the apps in the process, count the operators, and estimate the rate the consultant would bill you. If the first two numbers are small, the third doesn't justify itself.

4 apps or fewer: self-serve is usually faster
1 operator: skip the change management phase
$150 to $300/hr typical rate: 2 billable hours cost more than a skill file
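
The rule of thumb can be written down as a small check. This is a sketch of the heuristic, not a pricing model: the function name is hypothetical, and the default thresholds (four apps, one operator) are the figures the article uses elsewhere for "one Mac, four or fewer native apps".

```python
def prefers_skill_file(app_count: int, operator_count: int,
                       max_apps: int = 4, max_operators: int = 1) -> bool:
    """Return True when the process is small enough to try a skill file first.

    Thresholds default to the article's rule of thumb; adjust them to taste.
    """
    return app_count <= max_apps and operator_count <= max_operators
```

A month-end close that touches three apps on one Mac passes the check. A process with forty operators fails it regardless of app count, which matches the case where a consultant still earns their hours.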

Before you scope a BPA consulting engagement

Walk us through the process live. If it fits in a skill file, we will show you. If it needs a consultant, we will say so.


Questions buyers ask before hiring a BPA consultant

What does a business process automation consultant actually do week by week?

The canonical engagement has two phases. Weeks one to three are discovery: interviews, shadowing, and a process map that gets handed back as a Visio/Lucid diagram with SOP-style step notes. Weeks three to ten are implementation: a consultant configures an RPA platform (UiPath Studio, Automation Anywhere, Power Automate Desktop) to match that map, then runs UAT. Every top-5 result for this keyword (ProsperSpark, Bain, Grantbot, much. Consulting, BHW Group) sells some version of those two phases bundled together. The handoff between them is where most of the billable hours live.

Why does the process map end up as a separate artifact from the automation?

Because traditional RPA tools can only execute what a developer has hard-wired. UiPath Studio needs a .xaml workflow, Power Automate needs a JSON flow, Automation Anywhere needs a .atmx bot. None of these formats are close to plain English, so the map produced in discovery gets manually translated into a runtime artifact during implementation. That translation is labor. Fazm skips it because its runtime artifact is a markdown file an agent reads, not a developer-authored workflow.

What is the anchor mechanism that lets Fazm skip the RPA-developer step?

Two pieces. The first is `SkillInstaller` inside the Fazm Mac app: it scans `Resources/BundledSkills/*.skill.md`, hashes each file with SHA-256, and installs to `~/.claude/skills/`. A skill is just markdown with YAML frontmatter describing what the process does. The second is the live accessibility tree. When a skill runs, Fazm calls `AXUIElementCreateApplication(frontApp.processIdentifier)` and `AXUIElementCopyAttributeValue(appElement, kAXFocusedWindowAttribute, &focusedWindow)` against the frontmost Mac app. The agent reads the same structured tree a screen reader reads, then clicks, types, and navigates by role and label, not by pixel.
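
The install step can be modelled in Python to show how such a hash check might behave. This is not the Swift `SkillInstaller`. Two things here are assumptions: that the SHA-256 hash is used to skip files whose installed copy is unchanged, and that each `<slug>.skill.md` lands at `<slug>/SKILL.md` under the skills directory. The function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def install_bundled_skills(bundle_dir: Path, skills_root: Path) -> List[str]:
    """Copy each *.skill.md to <skills_root>/<slug>/SKILL.md if missing or changed.

    Returns the slugs that were installed or updated. Assumed behaviour,
    not a description of Fazm's actual installer.
    """
    installed = []
    for src in sorted(bundle_dir.glob("*.skill.md")):
        slug = src.name[: -len(".skill.md")]
        dest = skills_root / slug / "SKILL.md"
        if dest.exists() and sha256_of(dest) == sha256_of(src):
            continue  # installed copy is byte-identical, nothing to do
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dest)
        installed.append(slug)
    return installed
```

Comparing digests rather than modification times makes a second run a no-op, and an edited bundle file is picked up on the next run.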

How is this different from screenshot-based RPA like UiPath image matching?

Screenshot matching reads a PNG, runs template matching or OCR, and guesses coordinates. A minor UI refresh, dark mode toggle, resolution change, or DPI shift can break every template. Walking the accessibility tree with `AXUIElementCopyAttributeValue` instead yields structured data: roles (`AXButton`, `AXTextField`), labels, enabled/disabled flags, and hierarchy. Resolution changes do not matter. A button renamed from Submit to Save still keeps its `AXRole`. A model consuming this tree burns fewer tokens than one re-OCRing every frame.
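
To make the contrast concrete, here is a toy model of role-and-label lookup. The nested dictionaries are a hypothetical stand-in for an accessibility tree (a real one is read through the AX API, not built by hand), and `walk` and `find_element` are illustrative names.

```python
from typing import Iterator, Optional


def walk(node: dict) -> Iterator[dict]:
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def find_element(tree: dict, role: str,
                 label: Optional[str] = None) -> Optional[dict]:
    """Return the first node with the given role (and label, if one is given)."""
    for node in walk(tree):
        if node.get("role") == role and (label is None or node.get("label") == label):
            return node
    return None


# Hypothetical window: no coordinates or pixels anywhere in the data.
WINDOW = {
    "role": "AXWindow", "label": "Invoice",
    "children": [
        {"role": "AXTextField", "label": "Amount", "enabled": True},
        {"role": "AXGroup", "label": "", "children": [
            {"role": "AXButton", "label": "Save", "enabled": True},
        ]},
    ],
}
```

Looking up `("AXButton", "Save")` succeeds wherever the button sits on screen. Looking up `("AXButton", "Submit")` returns `None` after a rename. A role-only lookup still finds the button, which is the property the paragraph above relies on.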

Which part of the consultant's job does this not replace?

Discovery against a large, cross-functional org with real change-management risk. If you have forty people in three departments with legacy SOPs nobody agrees on, you still need a human to run interviews, map the current state against the ideal state, and get buy-in. What a skill file replaces is the implementation handoff phase for processes that live on one person's Mac: month-end close, invoice intake, vendor reconciliation, listing management, weekly reports. Those are the engagements SMBs overpay consultants for because the true cost is in the RPA build, not the insight.

Do I need to know code to author a Fazm skill file?

No. A skill.md is English plus a YAML header. The app ships example files in `Resources/BundledSkills/`, such as `telegram.skill.md`, which reads a notification, queries a SQLite DB or the macos-use MCP, and replies. The model fills in the AX tree calls at runtime. If you can write a standard operating procedure, you can write a skill.
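
To show how little structure the format asks for, here is a sketch that splits a skill file into its header and its English body. It is not Fazm's loader and not a YAML parser: `split_frontmatter` is a hypothetical helper that assumes a flat header of `key: value` lines between two `---` delimiters.

```python
from typing import Dict, Tuple


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split '---'-delimited header lines from the markdown body.

    Assumes flat `key: value` pairs; nested YAML is out of scope here.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no header at all
    meta: Dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the procedure.
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return meta, body
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # header never closed: treat it all as body
```

Everything after the second `---` is ordinary prose, which is why anyone who can write an SOP can edit it.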

What does a consultant engagement actually cost for an SMB?

Public ranges: ProsperSpark-style workflow consultants run $150 to $300 per hour, Bain and other tier-one firms start at six figures, and a mid-range boutique engagement for one end-to-end process (discovery plus UiPath/Power Automate build) sits in the $15k to $60k band for a two-month build. That is the reference point for deciding whether to hire or self-serve. If the process lives on one Mac, touches four or fewer native apps, and the user can describe it clearly, a skill file is the faster path.
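
A quick back-of-envelope check on those ranges: dividing the engagement band by the hourly band gives the billable hours each figure implies. This combines two separately quoted ranges, so treat it as rough arithmetic under that assumption rather than a quote from any firm. The function and constant names are illustrative.

```python
def implied_hours(engagement_cost: float, hourly_rate: float) -> float:
    """Billable hours implied by a total cost at a given hourly rate."""
    return engagement_cost / hourly_rate


LOW_RATE, HIGH_RATE = 150, 300          # $/hr range quoted above
LOW_COST, HIGH_COST = 15_000, 60_000    # engagement band quoted above

fewest = implied_hours(LOW_COST, HIGH_RATE)   # 50.0 hours
most = implied_hours(HIGH_COST, LOW_RATE)     # 400.0 hours
```

Somewhere between 50 and 400 billable hours for one process is the number to weigh against an afternoon spent writing a skill file.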

Can I verify the accessibility mechanism myself?

Yes. Look at `AppState.swift` in the Fazm desktop source: the accessibility probe calls `AXUIElementCreateApplication(frontApp.processIdentifier)`, then copies `kAXFocusedWindowAttribute` and `kAXChildrenAttribute`. When TCC reports the permission is granted but the app cannot read the tree (common after macOS updates), it runs Finder as a known-good tie-breaker and retries every 5 seconds up to 24 attempts. That retry logic is visible in the open-source repo at github.com/mediar-ai/fazm.

What workflows still need a consultant even with Fazm?

Anything that requires an org-wide policy change, anything with SOX or HIPAA compliance that needs a signed workflow artifact, anything touching enterprise systems that still live behind a VPN-gated thin client from 2011, and anything where the win depends on re-designing the underlying business process rather than faster execution of the current one. Fazm automates the process the consultant would have documented. It does not replace the person who figures out whether that process should exist.