macOS desktop agent: UI vs CLI

A native UI does not change the agent loop; it changes what survives between sessions

The popular framing of UI vs CLI for a macOS desktop AI agent is editor-vs-terminal, or one vendor against another. That framing misses the actual difference for people running an agent loop like Claude Code or Codex day to day. The agent is identical: Fazm wraps the same Claude Code loop the CLI runs, through ACP. What changes is the persistence layer wrapped around it. A UI fixes three concrete failure modes of the raw CLI that hurt the moment a chat lives longer than one terminal tab.

Matthew Diakonov

Direct answer (verified 2026-05-19)

For a one-shot shell task, the raw CLI is the simpler choice: it is faster to spin up, it pipes cleanly, and it runs in any terminal. For ongoing work where a chat outlives the current terminal tab, a native UI that wraps the same agent loop wins on three specific things the CLI does not do: keep your open chats across a Mac restart, fork a thread in one click, and leave the full conversation in context without an auto-compaction pass dropping old decisions. The agent itself is the same loop in both. Source for the Fazm side is at github.com/m13v/fazm.

Same agent loop, different shell around it

The Claude Code CLI and Fazm both run the same Anthropic agent loop; Fazm reaches it through the Agent Client Protocol (ACP). The CLI hosts the loop in a terminal: stdin in, streamed tokens and tool calls out, the session backed by a JSON file on disk. Fazm hosts the same loop in a Swift UI app. Under the hood the Swift app launches a Node subprocess called acp-bridge that runs @agentclientprotocol/claude-agent-acp for Claude and codex-acp for Codex. The UI talks to acp-bridge over JSON-RPC, and the bridge talks to the agent over ACP. The model is the same model, the tools are the same tools, the MCP servers are the same MCP servers. What is genuinely different is the layer wrapped around the loop.

That layer is the persistence layer. It is responsible for what a chat is, where it lives, how you fork it, and whether the whole history stays available to the agent. The CLI's answers to those questions are minimal because a terminal is a thin shell. The UI's answers are richer because it has UserDefaults, window controllers, and a SwiftUI button you can wire to one line of code.

Failure mode 1: chats die on a Mac restart

The most common complaint about running an agent in a terminal is what happens after the Mac reboots, the laptop sleeps for too long, or the shell crashes. The session JSON survives on disk. Everything else does not: the open tabs, the live sockets, the in-memory scrollback, and any streaming response that was in flight. After login you are looking at an empty terminal, and the only way back to where you were is to remember which session ids you had open and resume each one by hand.


In Fazm the same problem has a different shape because the window registry is persisted. When the app launches, DetachedChatWindowController.restoreWindows() decodes the saved snapshots from UserDefaults under the key registryKey, and for each snapshot it loads the message history from the local DB with a ten-attempt retry loop and a 500ms backoff per attempt. The retry is there because the DB warmup can race the window restore on a cold boot; failed entries stay persisted so the next launch can try again instead of silently losing the chat. The source lives at Desktop/Sources/FloatingControlBar/DetachedChatWindow.swift around line 761 and you can read it on GitHub. Net effect: a reboot opens every chat back at its old position with its full conversation visible.
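The restore behaviour described above is a bounded retry with a fixed delay between attempts. As a minimal shell sketch of that pattern (not Fazm's Swift code): retry_with_backoff is a hypothetical helper name, and only the attempt count of ten and the 500ms delay come from the text.

```shell
# Bounded retry with a fixed delay between attempts.
# retry_with_backoff is a hypothetical helper, not a function from the Fazm repo.
# usage: retry_with_backoff <max_attempts> <delay_seconds> <command> [args...]
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0          # the command succeeded: stop retrying
    fi
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"    # wait before the next attempt
    fi
    attempt=$((attempt + 1))
  done
  return 1              # every attempt failed; the caller decides what to keep
}
```

With the numbers from the text, a call would look like `retry_with_backoff 10 0.5 load_messages "$window_id"`, where load_messages stands in for whatever command reads the history. Returning non-zero instead of deleting anything mirrors the design choice described above: a failed entry stays persisted so the next launch can try again.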

Failure mode 2: forking a chat is a multi-step session-id dance

Forking is the move where you want to keep the parent conversation intact and try a different branch from the same point. It is one of the most common things to want when an agent is doing exploratory work, because the cheapest way to test a different approach is to clone the current state and steer the copy. In raw Claude Code there is no first-class fork verb. The workflow is to find the session JSON, copy it to a new UUID, open a new terminal so the parent socket does not get torn down, and resume from the clone.

What fork looks like on each side

# raw CLI: fork the current chat into a new branch

# replace <project> with your project's directory name
SESSIONS="$HOME/.claude/projects/<project>/sessions"

# 1. find the session id of the chat you want to fork (newest first)
ls -t "$SESSIONS" | head -5
# returns: 8a3f-..., 2c1e-..., ...

# 2. pick the one you mean
SRC=8a3f-1c40-4e7d-9b22-aeebbc11f0a8

# 3. resuming $SRC directly would reopen the same session, not a fork,
#    so clone the session file under a new UUID first
NEW=$(uuidgen)
cp "$SESSIONS/$SRC.json" "$SESSIONS/$NEW.json"

# 4. resume the clone in a new tmux window so the parent chat keeps running
#    (otherwise you lose the live socket on the source session)
tmux new-window "claude --resume $NEW"
# and remember which tmux pane is which

The UI side is a single SwiftUI button. The view is ForkChatButton at line 1708 of AIResponseView.swift. The action it fires goes through ChatProvider.forkSession(fromKey:toKey:) (line 1752 of ChatProvider.swift) into ACPBridge.forkSession at line 846, which sends a JSON-RPC message of type forkSession to the Node bridge. The bridge calls session/fork on the ACP server, gets back a new session id for the branch, and emits a session_forked event carrying both ids. The UI pivots into a new detached window pre-loaded with the prior conversation. Here is what that traffic looks like.

What one tap of the Fork button actually sends

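  • Fork button → ChatProvider: user taps arrow.triangle.branch
  • ChatProvider → ACPBridge (Node): forkSession(fromKey, toKey)
  • ACPBridge (Node) → Claude ACP: session/fork
  • Claude ACP → ACPBridge (Node): new sessionId for branch
  • ACPBridge (Node) → ChatProvider: session_forked (from, to)
  • ChatProvider → UI: open new window, full history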

The original chat keeps streaming. No new terminal, no uuidgen, no cp into the sessions directory. If you decide the branch was a dead end, close the window and the parent is still running where you left it.
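To make the fork round trip concrete, here is a rough shell sketch of the two messages involved. The method name session/fork and the event name session_forked come from the description above; the JSON-RPC 2.0 envelope, the sessionId parameter name, the field names in the sample event, and both helper names are assumptions for illustration, not the bridge's actual wire format.

```shell
# Build a JSON-RPC 2.0 request line for session/fork.
# The params field name (sessionId) is an assumption for illustration.
fork_request() {
  # $1 = request id, $2 = id of the session to branch from
  printf '{"jsonrpc":"2.0","id":%s,"method":"session/fork","params":{"sessionId":"%s"}}\n' "$1" "$2"
}

# Read a session_forked event line on stdin and print the branch's session id.
# Assumes the event carries a string field named toSessionId.
forked_session_id() {
  sed -n 's/.*"toSessionId":"\([^"]*\)".*/\1/p'
}
```

Calling `fork_request 1 parent-123` prints one request line, and piping a sample event such as `{"type":"session_forked","fromSessionId":"parent-123","toSessionId":"branch-456"}` into forked_session_id prints `branch-456`. The point of the sketch is that the parent session id is only read, never rewritten, which is why the original chat can keep streaming.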

Failure mode 3: auto-compacting silently drops decisions

The third failure mode is the quietest and the most expensive when it bites. As a long conversation approaches the context window, the Claude Code agent runs a summarisation pass that compacts older turns into a condensed recap and drops the originals. The summary captures the gist of what happened. It does not capture every line of code you pasted, every tool output, every concrete decision made along the way. If you were relying on a fact from message 80 still being in the model's context at message 200, auto-compaction can quietly leave it out. The first you know is the agent doing something that contradicts what you decided two hours ago.

The ACP server reports this event. In ACPBridge.swift around line 160 there is a status event compactBoundary(trigger: String, preTokens: Int) that fires whenever a compaction pass completes, with the token count before the pass and the trigger that caused it. The CLI runs through these boundaries on whatever cadence the agent decides. Fazm's default keeps the full chat history live in context for the window's lifetime. If you want to compact, run the slash command on purpose; otherwise the conversation stays whole.
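One way to picture what that status event makes possible is a filter over a stream of events that reports each compaction pass. This is a sketch under stated assumptions: it presumes one JSON object per line with type, trigger, and preTokens fields in that order, which is an illustrative format rather than Fazm's actual log, and report_compactions is a hypothetical name.

```shell
# Read event lines on stdin and print one line per compaction boundary.
# Assumes one JSON object per line, with "trigger" appearing before "preTokens";
# that wire format is an illustration, not Fazm's actual event stream.
report_compactions() {
  sed -n '/"type":"compactBoundary"/s/.*"trigger":"\([^"]*\)".*"preTokens":\([0-9]*\).*/compacted (trigger=\1, preTokens=\2)/p'
}
```

Fed a sample line like `{"type":"compactBoundary","trigger":"auto","preTokens":180000}`, the filter prints `compacted (trigger=auto, preTokens=180000)` and ignores every other event. The trigger value and token count here are made-up sample data.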

When the CLI is still the right answer

The UI is not a strict superset. Three cases where the raw CLI is still the better tool:

  • One-shot shell tasks. A throwaway task that fits in a single short conversation, where you do not care about a chat surviving a restart because you are going to discard it anyway. The CLI is faster to spin up and gets out of the way.
  • Piping into other shell tools. If you want the agent's output to flow into jq, into a file, into a make target, the CLI's stdout is a clean stream. The UI is rendered, not piped.
  • Headless or remote environments. A CI box, a server over SSH, a tmux session on a remote Linux machine. The UI is a macOS app, so it is local to your laptop only. The CLI runs anywhere bash runs.
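The piping case above can be sketched in a few lines of shell. This assumes the claude binary is on PATH and uses its print mode (-p) to run one prompt non-interactively; agent_to_file and the AGENT_BIN override are hypothetical names introduced for illustration.

```shell
# Run the agent non-interactively and capture its reply in a file.
# AGENT_BIN defaults to the claude CLI; -p is its print (non-interactive) mode.
# agent_to_file is a hypothetical helper name.
agent_to_file() {
  # $1 = prompt, $2 = output file
  "${AGENT_BIN:-claude}" -p "$1" > "$2"
}
```

A call such as `agent_to_file "List the TODO comments in src/" todo.txt` leaves the reply in a plain file that any other shell tool can read, and the same stdout can feed a pipe directly. Nothing comparable exists for a rendered chat window, which is the whole argument of that bullet.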

For anything else, the test is simple. If a chat is going to live longer than the current terminal tab, you want the persistence layer. If it is not, you want the simpler shell. The agent loop is the same either way.

Beyond the terminal: the same agent reaches into other apps

One thing a UI app can do that a terminal cannot is hold the macOS Accessibility permission as the agent's own process. Once the user grants Fazm Accessibility permission in System Settings, the same agent loop can call the macos-use MCP server to read the AX tree of the frontmost app, walk it by named attributes, and press buttons in real apps. The probe lives at Desktop/Sources/AppState.swift around line 488 and uses AXUIElementCreateApplication(frontApp.processIdentifier) on the real frontmost PID. A CLI agent could shell out to AppleScript, but it cannot easily walk arbitrary frontmost windows because the calling process is the terminal, not the app the user thinks of as the agent. So when the question is “what reaches outside the terminal at all”, the UI form factor wins by default. Voice input is the same story: the WhisperKit transcription path needs a UI process to hold the mic hotkey, and the audio never leaves the laptop.

Where to read the source

Every claim on this page is auditable in the open source repo. The exact entry points:

  • Fork: Desktop/Sources/Chat/ACPBridge.swift around line 846 (forkSession) and Desktop/Sources/FloatingControlBar/AIResponseView.swift around line 1708 (ForkChatButton).
  • Window restore: Desktop/Sources/FloatingControlBar/DetachedChatWindow.swift around line 761 (restoreWindows, ten-retry loop).
  • Compaction visibility: Desktop/Sources/Chat/ACPBridge.swift around line 160 (compactBoundary status event with trigger and preTokens).
  • Accessibility probe: Desktop/Sources/AppState.swift around line 488.

Repo: github.com/m13v/fazm. MIT licensed. The hosted build is a convenience; building from source produces the same agent.


Frequently asked questions

Is the UI actually running the same agent as the CLI, or is it a different model wrapper?

Same agent. Fazm wraps Claude Code through @agentclientprotocol/claude-agent-acp and Codex through codex-acp. The Swift UI talks to a local Node subprocess (acp-bridge) over JSON-RPC, and the bridge speaks ACP to the same agent process the CLI would spawn. The model picker in the UI maps to the same `--model` value you would pass on the command line. So when someone says 'I prefer the CLI because the agent is better' they are usually thinking of a different product, like Cursor's editor agent. Fazm and the raw Claude Code CLI run the same agent loop.

What specifically dies when I restart my Mac while running raw Claude Code?

The session files on disk survive. What dies is everything in process memory: the open terminal tabs, the active socket to each session, the in-memory chat scrollback that has not been flushed, the streaming response if one was in flight. After login you have an empty terminal and have to manually resume each session you remember the id for. In a UI wrapper that snapshot is the persistence layer's job. Fazm writes a window registry to UserDefaults and on launch calls DetachedChatWindowController.restoreWindows() (DetachedChatWindow.swift line 761), which retries loading messages 10 times with a 500ms backoff per window so a slow DB warmup does not lose a chat. Every window that was open at quit comes back at the same position with the conversation intact.

How exactly does fork work in the CLI versus the UI?

In raw Claude Code, forking is not a first-class verb. To preserve the parent and start a branch you find the session JSON on disk, copy it to a new UUID, open a new terminal so the parent's live socket does not get torn down, and resume from the cloned file. Then you remember which terminal is which. In Fazm the ForkChatButton at AIResponseView.swift line 1708 is one SwiftUI Image button with the icon arrow.triangle.branch. Tapping it calls ChatProvider.forkSession, which calls ACPBridge.forkSession at line 846, which sends a forkSession JSON-RPC message to acp-bridge, which issues session/fork to the ACP server. The bridge replies with session_forked carrying both fromSessionId and toSessionId, and the UI opens a new detached window pre-loaded with the full prior conversation. The original chat keeps streaming.

What does auto-compacting actually do, and why is it a problem?

When a Claude Code conversation gets long enough to risk hitting the context window, the agent runs a summarisation pass that compacts older turns into a condensed recap and drops the originals. The recap captures the gist; it does not capture every decision, code reference, or tool output verbatim. If you were depending on something specific from message 80 to be in context at message 200, compacting can quietly leave it out. The CLI does this whenever it decides to. Fazm exposes the same compactBoundary status event the ACP server emits (ACPBridge.swift line 160), and the default in the UI is to keep the full chat history live for the window's lifetime so nothing summarises behind your back. If you want to compact, that is a deliberate action, not an ambient one.

When is the raw CLI still the right choice?

Three cases. One: a one-shot task that fits in a single short conversation, where you do not care if the chat survives a restart because you are going to throw it away anyway. Two: heavy scripting or piping, where you want the agent's output to flow into another shell tool. The CLI's stdout is a clean stream; the UI's chat is rendered. Three: any environment where you cannot run a GUI app, like a server, a CI box, or a remote SSH session. The CLI runs in tmux on a headless Linux box; the UI does not.

Does the UI add latency compared to the CLI?

Not meaningfully. The bottleneck for the agent loop is model inference, not local IPC. Fazm's Swift UI talks to acp-bridge over a local pipe; acp-bridge talks to the ACP agent over a local socket. The hops are local and the messages are small. Where the UI does add cost is memory: a long-lived window with full chat history loaded is a bigger resident set than a terminal scrollback. On a modern Mac with 16 GB or more of memory, that has not been a problem in practice.

Can I see the same accessibility-API and computer-use tools from a CLI agent?

Not really. The macOS Accessibility API needs to be called from a process running on the same macOS instance as the WindowServer, and it needs the host app to have been granted Accessibility permission in System Settings. A CLI agent running in a terminal can shell out to AppleScript, but it cannot easily traverse the AX tree of arbitrary frontmost apps because the calling process is your terminal, not a UI app the user thinks of as 'the agent'. Fazm's UI app holds the Accessibility permission, so when the agent calls macos-use to read the frontmost window's element tree, the call comes from a process the user has explicitly authorized.

Is this open source, or just the wrapper?

The whole thing is open source under MIT at github.com/m13v/fazm. The Swift UI, the acp-bridge Node process, the AX probe, the WhisperKit voice path, the MCP wiring, and the window restoration logic are all in the repo. You can read forkSession in ACPBridge.swift, restoreWindows in DetachedChatWindow.swift, the ForkChatButton view in AIResponseView.swift, and the compactBoundary status event handling all in the open. If you would rather build from source than install the binary, that path works and the agent loop is identical.

