Automation business process, the part every guide skips: when the agent shares a desk with a live human

Every top-ranking guide for this keyword (IBM, Red Hat, UiPath, ServiceNow, Salesforce, Informatica, Atlassian, Acronis, Wikipedia) describes business process automation as a batch workflow that fires unattended. That framing is correct for server-side iPaaS. It falls apart the minute the automation runs on the same Mac the human is sitting at, because now the question is not what the workflow does but how it negotiates control with the person in the chair. This guide is about the three dials Fazm ships to handle exactly that negotiation, and the exact Swift files they live in.

Matthew Diakonov
12 min read
4.8 from early Mac users
Open-source Swift + TypeScript
Reads the macOS accessibility tree, not screenshots
Three user dials tune agent behavior in real time

What the top 10 SERP results quietly share, and where they stop

Line up the first ten results for automation business process and read the first three paragraphs of each. You will find the same four-beat structure: one sentence of definition, three bullet points of benefits (cost, accuracy, compliance), a section on common process examples (onboarding, invoice routing, CRM updates), and a CTA for the vendor's cloud product. None of them discuss what happens when a human is actively using the same computer the automation runs on, because none of them are shipping an agent that does. That is the SERP gap this page fills.

  • IBM: 6 benefits list
  • Wikipedia: generic def
  • Red Hat: iPaaS framing
  • UiPath: RPA vendor pitch
  • ServiceNow: workflow platform
  • Salesforce: BPA = iPaaS
  • Informatica: data pipelines
  • Atlassian: Jira workflows
  • Acronis: cost + accuracy
  • Gartner: vendor matrix

The gap

All ten top results assume the automation runs on a server, or at most inside a headless browser. Zero of them describe how the automation should behave when the human is still at the keyboard. That is a different design problem, and it has its own vocabulary: Ask vs Act, Passive vs Proactive, verbose vs terse. Those are the three dials.

Dial 1: Ask mode vs Act mode

The coarsest setting. Either the agent is allowed to mutate state, or it is not. In traditional BPA there is no such distinction because the agent never has a human next to it who might ask a read-only question; it only runs when its workflow has been scheduled. On Fazm, Ask is the default posture for reporting-style questions (what did I do this week, summarize this screen, find the email from Scott about the invoice) and Act is the posture for execution (reply to Scott, move this file to the project folder, rename these 14 PDFs). The mode is a single enum with two cases.

ChatProvider.swift + ShortcutSettings.swift

That is the whole surface area of the dial in the desktop source. The value then travels to the ACP bridge at ACPBridge.swift:1515, and from there is exported as FAZM_QUERY_MODE to the MCP subprocess that executes tools, which gates the tool whitelist at acp-bridge/src/fazm-tools-stdio.ts:14. One enum value, three processes, one behavior change.
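The gating step at the end of that chain can be sketched in a few lines of TypeScript. This is a minimal illustration, not the contents of fazm-tools-stdio.ts: `QueryMode`, `ToolSpec`, `resolveMode`, and `allowedTools` are hypothetical names, the `mutatesState` flags are assumptions, and falling back to read-only when the variable is missing is a choice made for this sketch. Only the `FAZM_QUERY_MODE` variable and the tool names come from this guide.

```typescript
// Hypothetical sketch: none of these names are copied from fazm-tools-stdio.ts.
type QueryMode = "ask" | "act";

interface ToolSpec {
  name: string;
  mutatesState: boolean; // true when the tool writes to a DB, the disk, or another app
}

// Tool names are the ones mentioned in this guide; the flags are assumptions.
const ALL_TOOLS: ToolSpec[] = [
  { name: "execute_sql", mutatesState: false }, // write statements need a separate per-call check
  { name: "capture_screenshot", mutatesState: false },
  { name: "query_browser_profile", mutatesState: false },
  { name: "complete_task", mutatesState: true },
];

// Read the mode from an env-like map. Anything other than "act" falls back to
// read-only (an assumption of this sketch, chosen so a missing value fails closed).
function resolveMode(env: Record<string, string | undefined>): QueryMode {
  return env["FAZM_QUERY_MODE"] === "act" ? "act" : "ask";
}

// Ask mode keeps only tools that never mutate state; Act mode keeps everything.
function allowedTools(mode: QueryMode, tools: ToolSpec[] = ALL_TOOLS): string[] {
  return tools.filter((tool) => mode === "act" || !tool.mutatesState).map((tool) => tool.name);
}
```

Passing the environment in as a plain map keeps the sketch testable without a running subprocess; in a real Node process the caller would hand over `process.env`.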

How 'Ask vs Act' flows from UI toggle to tool whitelist

Ask toggle → ChatProvider → ACP bridge → tool whitelist (read tools, write tools, capture tools)

Dial 2: Proactiveness

The dial that surprises people most. Proactiveness is not a heuristic inside the agent runtime; it is a literal string appended to the LLM's system prompt. When you flip the dial in Fazm's settings, no code path changes. What changes is the text the model reads before every query. This is the anchor fact for the entire page: the automation's behavior is edited by rewriting its prompt, not by writing more Swift.

.proactive

Assume the user needs things done on their computer. Proactively find programmatic ways to accomplish tasks. Install missing dependencies (brew install, pip install, npm install) without asking.

ChatProvider.swift, floatingBarSystemPromptPrefix(compactness:proactiveness:), line 314

Desktop/Sources/Providers/ChatProvider.swift

Three cases, each mapped to its own system-prompt suffix. The model reads whatever string the dial selected, and its disposition changes accordingly. Passive gets no extra instructions, so its suffix is empty. Balanced gets "take obvious actions, ask on ambiguous ones." Proactive gets the paragraph above, including the explicit license to run brew install and npm install without a confirmation dialog.
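To make the "dial is a string" idea concrete, here is a minimal TypeScript sketch of the selection step. It is not a port of `floatingBarSystemPromptPrefix`: the names `Proactiveness`, `PROACTIVENESS_SUFFIX`, and `buildSystemPrompt` are hypothetical, the Balanced text is this guide's paraphrase, and the Proactive text is abbreviated rather than the verbatim app string.

```typescript
type Proactiveness = "passive" | "balanced" | "proactive";

// Passive is empty per this guide. The other two strings are paraphrased or
// abbreviated here; they are NOT the verbatim strings shipped in the app.
const PROACTIVENESS_SUFFIX: Record<Proactiveness, string> = {
  passive: "",
  balanced: "Take obvious actions, ask on ambiguous ones.",
  proactive:
    "Assume the user needs things done on their computer. " +
    "Proactively find programmatic ways to accomplish tasks.",
};

// The dial never branches the agent's logic; it only changes the text
// concatenated onto the base prompt.
function buildSystemPrompt(basePrompt: string, level: Proactiveness): string {
  const suffix = PROACTIVENESS_SUFFIX[level];
  return suffix === "" ? basePrompt : `${basePrompt}\n\n${suffix}`;
}
```

Because the function is pure string concatenation, changing the dial between two queries changes only the prompt text the next query carries.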

Dial 3: Compactness

The third dial answers a question none of the top SERP guides ask: how much space should the automation take up on screen while the user is still trying to get their own work done? The floating bar is a 500-pixel-wide strip that hovers over a real work app. If it replies with a bulleted 200-word answer, it steals the user's attention. If it replies with one sentence, it blends back into the background. Fazm resolves this with three preset modes.

Off

No compactness enforcement. The agent answers as long as it thinks the question deserves. Good default for a full chat window.

Soft

Injects 'Be concise, prefer short answers (1-3 sentences) unless the question needs more detail.' into the prompt. For users who like context but want brevity by default.

Strict

Injects 'Respond in exactly 1 sentence. No lists. No headers. No follow-up questions.' Used when the floating bar is a coworker at your shoulder, not a chatbot.

Notice there is no max_tokens involved. Compactness does not cap output length after the fact; it changes what the model decides to generate in the first place. Setting .strict makes the model skip the outline, skip the bullet points, and write a single declarative sentence. A token cap would have cut that output off mid-bullet and looked broken.
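The difference between shaping and capping can be illustrated with a short TypeScript sketch. The instruction strings are the ones quoted in this guide; everything else (`Compactness`, `withCompactness`, `hardCap`, the sample text) is hypothetical, and `hardCap` counts characters as a crude stand-in for a real token cap.

```typescript
type Compactness = "off" | "soft" | "strict";

// Instruction strings as quoted in this guide.
const COMPACTNESS_INSTRUCTION: Record<Compactness, string> = {
  off: "",
  soft: "Be concise, prefer short answers (1-3 sentences) unless the question needs more detail.",
  strict: "Respond in exactly 1 sentence. No lists. No headers. No follow-up questions.",
};

// Shaping: the instruction is added to the prompt BEFORE the model generates anything.
function withCompactness(systemPrompt: string, level: Compactness): string {
  const instruction = COMPACTNESS_INSTRUCTION[level];
  return instruction === "" ? systemPrompt : `${systemPrompt}\n${instruction}`;
}

// Capping: cuts finished text AFTER the fact. Real caps count tokens, not
// characters; characters keep this sketch dependency-free.
function hardCap(output: string, maxChars: number): string {
  return output.slice(0, maxChars);
}

// A made-up bulleted answer, cut off partway through its second bullet.
const bulleted = "- Rename the PDFs\n- Move them to the project folder\n- Reply to Scott";
const capped = hardCap(bulleted, 30);
```

Here `capped` ends in the middle of the second bullet, which is the broken-looking result the paragraph above describes; the shaped prompt avoids it by asking for a different form up front.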

How the three dials compose into eighteen behaviors

The dials are independent, so you get a grid of 2 x 3 x 3 = eighteen states. A few of them are worth naming because they match how real BPA buyers actually want to work. Here are the six that show up in user research.

Ask · Passive · Strict

The lawyer preset

Read-only, never volunteers, one-sentence answers. Good for a legal ops lead running a BPA pilot who wants the agent to answer questions about the state of the business without ever touching a file.

Act · Balanced · Soft

The SMB owner preset

Can execute, asks before ambiguous actions, answers in 1-3 sentences. Matches an owner-operator who wants the agent to do invoice intake and month-end reconciliation but still wants confirmation on the weird cases.

Act · Proactive · Off

The engineer preset

Full autonomy, auto-installs deps, writes the full rationale so you can audit it later. Good for a developer who knows exactly what they're asking and wants the agent to keep going without interruption.

Ask · Balanced · Strict

The reporting preset

Read-only BI-style questions with one-sentence answers. "How many emails from customers this week?" "Which tasks are overdue?" Replaces a lightweight BI dashboard on a single person's desk.

Act · Passive · Off

The compliance preset

Writes are allowed, but the agent waits for explicit instructions and explains each step. For a finance team doing quarterly close who wants a paper trail on everything the automation touched.

Act · Proactive · Strict

The assistant preset

Execute fast, one-sentence status back. Feels like delegating to a coworker who already knows the context. The loudest version of the floating bar and the most likely to feel invasive if the user's default is slower-paced.
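The grid arithmetic and the six named presets above can be checked by enumerating the combinations. This is an illustrative TypeScript sketch; the type and constant names are hypothetical, and the preset keys are shortened versions of the labels used in this section.

```typescript
const MODES = ["ask", "act"] as const;
const PROACTIVENESS = ["passive", "balanced", "proactive"] as const;
const COMPACTNESS = ["off", "soft", "strict"] as const;

interface DialState {
  mode: (typeof MODES)[number];
  proactiveness: (typeof PROACTIVENESS)[number];
  compactness: (typeof COMPACTNESS)[number];
}

// Cartesian product of the three independent dials.
function allDialStates(): DialState[] {
  const states: DialState[] = [];
  for (const mode of MODES) {
    for (const proactiveness of PROACTIVENESS) {
      for (const compactness of COMPACTNESS) {
        states.push({ mode, proactiveness, compactness });
      }
    }
  }
  return states;
}

// The six named presets from this section.
const NAMED_PRESETS: Record<string, DialState> = {
  lawyer: { mode: "ask", proactiveness: "passive", compactness: "strict" },
  smbOwner: { mode: "act", proactiveness: "balanced", compactness: "soft" },
  engineer: { mode: "act", proactiveness: "proactive", compactness: "off" },
  reporting: { mode: "ask", proactiveness: "balanced", compactness: "strict" },
  compliance: { mode: "act", proactiveness: "passive", compactness: "off" },
  assistant: { mode: "act", proactiveness: "proactive", compactness: "strict" },
};

const totalStates = allDialStates().length; // 2 * 3 * 3
const unnamedStates = totalStates - Object.keys(NAMED_PRESETS).length;
```

The remaining unnamed combinations are still valid dial positions; they simply did not earn a name in this guide.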

Why traditional BPA tools could never have shipped dials like this

The dials only make sense if the automation is driven by an LLM reading a system prompt. Traditional BPA tools (UiPath Studio, Power Automate, Automation Anywhere, Blue Prism) compile their automations into a static artifact (.xaml, JSON flow, .atmx) and then execute that artifact deterministically. There is no system prompt to edit. A Proactiveness dial on UiPath would have to be rebuilt as a graph-level setting applied to every node, which, to our knowledge, no vendor has shipped, plausibly because existing graphs are too complex to retrofit.

Where the dials would live in each approach

| Feature | Traditional BPA tool | Fazm |
| --- | --- | --- |
| Ask vs Act mode | Would require a graph-wide read-only flag and per-node gating | One enum value, applies to the whole agent turn |
| Proactiveness | Would require re-authoring every conditional node to check a global flag | String concatenation into the LLM system prompt |
| Compactness | Not applicable (compiled workflows do not generate prose) | System-prompt paragraph that shapes output form |
| Where the dial is defined | Scattered across workflow designer, runtime config, and per-node properties | Two files: ChatProvider.swift:302 and ShortcutSettings.swift:222 |
| How to change the dial at runtime | Re-publish the workflow package; restart the bot runtime | `defaults write com.fazm.app shortcut_proactivenessLevel -string Proactive` |

Verifying the anchor fact yourself

Everything on this page is checkable. Here is the exact command sequence. The app must be installed from fazm.ai for the bundle identifier to resolve.

verify-dials.sh

If you want to see how the dials mutate the system prompt, run Fazm with console logging on (tail -f /tmp/fazm-dev.log for a dev build, /tmp/fazm.log for production), ask a question from the floating bar, and grep for FLOATING BAR MODE. You will see the exact system-prompt block the LLM received, with the dial-specific paragraph inlined.

Choosing the right preset for your process

Step 1, decide whether the process is read-only

If the first version of the automation only reads state (answers questions, summarizes, reports), start in Ask mode. You can always flip to Act once you trust it. Putting a read-only process in Act mode is a common source of early abandonment because the agent will try to help by writing something.

Step 2, pick the proactiveness based on your risk tolerance

If the cost of a wrong action is high (sending an email to the wrong customer, renaming a file that breaks a downstream script), choose Balanced. If the cost of asking for an unnecessary confirmation is higher (interrupting flow every 30 seconds), choose Proactive. Passive is for the rare case where the agent is a curiosity and you only want what you explicitly ask for.

Step 3, pick compactness based on where the agent lives on screen

Floating bar hovering over Xcode? Strict. Floating bar during a sales call? Strict. Full-screen chat window on a second monitor while you review monthly books? Off. The compactness dial is the one most users never touch past setup, so get the default right for the room the agent lives in.

Step 4, capture the preset

Once a preset works for a given process, write it down in the skill file for that process. A well-authored skill.md can tell the user 'this process runs best in Act + Balanced + Soft' in its description, so the next person adopting the workflow starts in the right posture.
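One way to make a recorded preset machine-checkable is to parse the 'Act + Balanced + Soft' string into a typed value. The TypeScript below is a sketch under assumptions: this guide does not specify any skill-file format beyond that free-text description, and `ProcessPreset`, `parsePreset`, and `formatPreset` are hypothetical helpers.

```typescript
const MODE_LABELS = ["Ask", "Act"] as const;
const PROACTIVENESS_LABELS = ["Passive", "Balanced", "Proactive"] as const;
const COMPACTNESS_LABELS = ["Off", "Soft", "Strict"] as const;

interface ProcessPreset {
  mode: (typeof MODE_LABELS)[number];
  proactiveness: (typeof PROACTIVENESS_LABELS)[number];
  compactness: (typeof COMPACTNESS_LABELS)[number];
}

// Membership check that narrows a string to one of the allowed labels.
function pick<T extends string>(allowed: readonly T[], value: string | undefined): T | null {
  const match = allowed.find((label) => label === value);
  return match ?? null;
}

// Parses "Act + Balanced + Soft"; returns null for anything unexpected.
function parsePreset(text: string): ProcessPreset | null {
  const parts = text.split("+").map((part) => part.trim());
  if (parts.length !== 3) return null;
  const mode = pick(MODE_LABELS, parts[0]);
  const proactiveness = pick(PROACTIVENESS_LABELS, parts[1]);
  const compactness = pick(COMPACTNESS_LABELS, parts[2]);
  if (mode === null || proactiveness === null || compactness === null) return null;
  return { mode, proactiveness, compactness };
}

// Inverse of parsePreset, for writing the recommendation back into a description.
function formatPreset(preset: ProcessPreset): string {
  return `${preset.mode} + ${preset.proactiveness} + ${preset.compactness}`;
}
```

Returning null instead of throwing lets a caller fall back to the user's current dial positions when a description is missing or misspelled.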

What this implies for the broader BPA question

The IBM / Red Hat / UiPath framing of business process automation has one implicit axis: efficiency per dollar. The pitch is that you hand a repeatable process to software and get it back cheaper. That framing is still correct for pure server-side work. What it leaves out is a second axis the next generation of automation is converging on: disposition per context. How should the automation behave when the user is watching? When the user is not watching? When the user asks a question mid-flow?

3 user-facing dials
18 distinct agent postures
2 Swift files define them all
0 graph/workflow recompiles

A dial is much cheaper to ship than a node in a workflow graph, and much faster for a non-technical buyer to understand. The shift from graphs to dials is the same shift that happened with thermostats, cruise control, and autofocus. The underlying mechanism went from a set of levers a technician tuned to a single knob the user twists. That is what happens to BPA once the runtime is an LLM.

What the three dials replace from the old BPA stack

  • A 'read only' flag on the workflow engine
  • Per-node 'require confirmation' checkboxes
  • A separate BI product for reporting-style questions
  • A verbose/terse logging toggle bolted on top of the runtime
  • A human-in-the-loop node type in the workflow designer

Curious which preset fits your process

Walk us through the workflow live and we will show you which of the 18 dial combinations it wants.

Questions buyers ask about dial-based automation

Why do the top guides for 'automation business process' all feel interchangeable?

Because they share a hidden premise: that the automation runs on a server with no human co-pilot. IBM, Red Hat, UiPath, ServiceNow, Salesforce, Informatica, Atlassian, and the Wikipedia entry all describe BPA as a batch workflow you design once and then let fire unattended. That framing is correct for iPaaS-style orchestration but wrong for the growing class of local agents that live on a consumer Mac. When the automation and the human share the same keyboard, screen, and filesystem, the design problem is no longer 'what does the workflow do' but 'how does the workflow negotiate control with the person sitting in the chair.'

What are the three Fazm dials this guide is built around?

First, Ask versus Act. `ChatMode` is an enum in `Desktop/Sources/Providers/ChatProvider.swift` at line 302 with two cases: `case ask` (read-only) and `case act` (can execute write actions). The same mode value is forwarded to the ACP bridge at line 1515 of ACPBridge.swift and consumed by the stdio MCP server at `acp-bridge/src/fazm-tools-stdio.ts` line 14 via the FAZM_QUERY_MODE environment variable. Second, Proactiveness. `ProactivenessLevel` in `FloatingControlBar/ShortcutSettings.swift` line 222 has `passive`, `balanced`, `proactive`, each of which maps to a different system-prompt suffix. Third, Compactness. `FloatingBarCompactness` in the same file at line 237 has `off`, `soft`, `strict`, which ask the model for no length constraint, 1-3 sentences, or exactly 1 sentence. Compactness is an instruction in the prompt, not a cap applied to the output.

What does the Proactive dial actually do under the hood?

It rewrites the system prompt. Open `Desktop/Sources/Providers/ChatProvider.swift` and read `floatingBarSystemPromptPrefix(compactness:proactiveness:)` starting at line 314. When proactiveness is `.proactive`, the function appends the exact line: 'Assume the user needs things done on their computer. Proactively find programmatic ways to accomplish tasks, use tools, scripts, and LLM-based approaches. Just work on the task and get it done without involving the user unless clarifications are truly needed. When starting a task, check what tools, libraries, or dependencies are needed and install them automatically (e.g. brew install, pip install, npm install), do not fail or ask the user just because something isn't installed yet.' That is not a flag the agent sees at runtime. It is literally text concatenated into the LLM prompt. The dial is the prompt.

Why does Ask mode matter for business process automation?

Because half of what people actually automate about a business process is reporting, not execution. 'How long did I spend in Slack this week? Show me the vendors I have not paid this month. Summarize today's meetings. What's on my calendar tomorrow?' Those are read-only queries over the state of a person's Mac: SQLite databases, calendar events, accessibility tree, running app list. In Ask mode, the ACP bridge sets `FAZM_QUERY_MODE=ask` for the MCP subprocess at `acp-bridge/src/fazm-tools-stdio.ts` line 14, which gates the tool whitelist to read-only tools (execute_sql with SELECT only, capture_screenshot, query_browser_profile) and removes the write tools (execute_sql with INSERT/UPDATE/DELETE, complete_task that mutates the DB). This is the BPA 'reporting layer' that IBM's write-up treats as a separate BI product, collapsed into the same agent with a one-click mode toggle.

How is this different from the 'human in the loop' checkbox in UiPath or Power Automate?

A human-in-the-loop node in UiPath pauses a pre-built workflow at a predetermined step to collect a signature or approval. The graph is frozen; the human just unblocks it. Fazm's dials do something structurally different: there is no graph to pause. The automation is whatever the LLM decides to do this turn, reading the live accessibility tree. The dials change the LLM's disposition: does it ask before acting, take obvious actions and ask on ambiguous ones, or treat every task as an order to execute end to end including auto-installing missing CLI tools. 'Human in the loop' is a pause inside a static flow. Proactiveness is a personality setting on a dynamic agent.

Is this really 'business process automation,' or is it just a desktop assistant?

Every process automated by a small business owner on their Mac counts: invoice intake and filing, month-end reconciliation in a spreadsheet, customer-support triage across Mail and a CRM, product listing updates, weekly reporting, competitor price scraping, social comment triage. Those are textbook BPA processes. The historical distinction between 'BPA' and 'personal automation' only existed because the server-side BPA tools physically could not reach the apps those processes live in. Once the automation agent runs on the same Mac as Numbers, Mail, Safari, Finder, and a specific small-business CRM, the 'BPA versus personal' divide collapses. The dials exist because the automation and the user now share a machine.

Can I see the Proactive dial in action from a terminal?

Yes. The proactiveness value is persisted in standard macOS UserDefaults under the key `shortcut_proactivenessLevel`. Run `defaults read com.fazm.app shortcut_proactivenessLevel` after installing Fazm and changing the setting in `Settings > Shortcuts > Ask Fazm`. You can set it from the command line with `defaults write com.fazm.app shortcut_proactivenessLevel -string Proactive`. Restart the app and the next query your floating bar submits will include the proactive system-prompt block. This is not an API contract; it is just where the app happens to write the setting today.

Does Ask mode disable every write tool, or only some?

It disables the side-effecting ones. `execute_sql` still runs SELECTs in Ask mode but is denied INSERT, UPDATE, DELETE, CREATE, DROP, ALTER statements; `capture_screenshot` and `query_browser_profile` still work because they do not mutate anything; `complete_task` (which writes the user's task state to the local fazm.db) is denied. You can see the mode check on every tool call path in `acp-bridge/src/fazm-tools-stdio.ts`. The effect is that the agent can still answer 'what did I do today,' 'what's in my inbox,' 'summarize this screen,' but it cannot send a message, schedule a meeting, or mark a task done.
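A statement-level gate of the kind described for `execute_sql` could look like the TypeScript sketch below. It is an assumption-laden illustration, not the check in fazm-tools-stdio.ts: `isReadOnlySql` and `canRunSql` are hypothetical names, and the keyword match is deliberately conservative, so it rejects some harmless queries (for example a `WITH` query, or a string literal containing the word "delete").

```typescript
// Statement keywords this guide lists as denied in Ask mode.
const WRITE_KEYWORDS = ["INSERT", "UPDATE", "DELETE", "CREATE", "DROP", "ALTER"];

// Conservative check: the statement must begin with SELECT and must not contain
// any denied keyword as a whole word anywhere, including after a semicolon.
function isReadOnlySql(sql: string): boolean {
  const normalized = sql.trim().toUpperCase();
  if (!/^SELECT\b/.test(normalized)) return false;
  return !WRITE_KEYWORDS.some((keyword) => new RegExp(`\\b${keyword}\\b`).test(normalized));
}

// Act mode runs anything; Ask mode runs only statements that pass the check.
function canRunSql(mode: "ask" | "act", sql: string): boolean {
  return mode === "act" || isReadOnlySql(sql);
}
```

Erring toward false rejections is the safer failure direction for a read-only mode: a refused SELECT costs the user a rephrase, while an accepted DROP costs them data. A production gate would more likely rely on a read-only database connection than on string matching.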

What does Compactness do that 'set a max_tokens' does not?

It shapes the form of the response, not the length budget. `.strict` literally injects 'Respond in exactly 1 sentence. No lists. No headers. No follow-up questions.' into the system prompt. `.soft` injects 'Be concise, prefer short answers (1-3 sentences) unless the question needs more detail. No unnecessary lists or headers.' max_tokens would cut a list in half mid-bullet; Compactness makes the model not generate the list in the first place. Fazm exposes this as a user dial because a floating bar hovering over a work app is a completely different reading experience from a full chat window. One deserves one sentence. The other deserves a thorough answer.

If I'm building internal BPA on my team's Macs, why should I care about these dials?

Because the adoption curve of a local agent lives and dies on behavioral tuning. A finance lead who gets interrupted with 'are you sure you want me to continue' before every step will abandon the product. A product manager whose screenshots get autosent to Slack without a confirmation will abandon it too. The same automation with different dial positions is the difference. For an internal rollout, the meta-question is not 'which processes can we automate' but 'which dial preset does each role want.' An engineer wants `act + proactive + off`; a legal ops lead wants `ask + passive + strict`; a CFO running their own month-end close wants `ask` during review and `act` only after they say so. The dials are the deployment artifact.