From inside a shipping consumer Mac agent
April 2026 packed six frontier model launches into 30 days. One TextField decides which of them moves your Mac.
DeepSeek V4, Qwen 3.5-Omni, Gemma 4, Meta Muse Spark, GPT-6, and Claude Opus 4.7 all shipped within a single calendar month. Most published guides walk you through the benchmarks. This one walks through the four-symbol code path inside Fazm, a consumer Mac app, that lets a user redirect the agent to any local LLM bridge: one AppStorage key, one TextField placeholder, one env var, one async hook.
The short version
April 2026 was the busiest month the field has had for new frontier launches. DeepSeek announced V4 Flash and V4 Pro at roughly $5.2 million in training cost. Alibaba shipped Qwen 3.5-Omni with native ten-hour audio context. Google opened Gemma 4 under Apache 2.0. Meta announced Muse Spark with $115B-$135B in 2026 capex behind it. OpenAI shipped GPT-6. Anthropic flipped Claude Opus 4.7 to GA. Most published articles about this either rank the launches by benchmark or curate a leaderboard. Inside Fazm, a consumer Mac app, the question that mattered was different: which surface running on a user's Mac today can be pointed at any of those new models in seconds, without an App Store review and without a fork? The answer is one AppStorage key, one TextField, one env var, one async hook.
The four-symbol seam
The whole apparatus that lets a user redirect Fazm at any April 2026 LLM is four named entities in the source. One AppStorage key. One TextField placeholder. One env var. One async hook. Together they are smaller than a typical settings card.
The placeholder string https://your-proxy:8766 deliberately implies a localhost-style port number. The help text under the field, quoted verbatim from source, names three example use cases by class: a local LLM bridge, a corporate proxy, or a GitHub Copilot bridge. None of them requires a rebuild or a release.
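The shape of that card, as a minimal SwiftUI sketch. The AppStorage key, card title, SF Symbol, placeholder, and help text are the shipped strings; the view name and layout here are illustrative, not the shipped source:

```swift
import SwiftUI

// Sketch only: the real card lives in SettingsPage.swift and is
// wrapped in a toggle plus the app's own card chrome.
struct CustomEndpointCard: View {
    @AppStorage("customApiEndpoint") private var customApiEndpoint: String = ""

    var body: some View {
        VStack(alignment: .leading, spacing: 8) {
            Label("Custom API Endpoint", systemImage: "server.rack")
            TextField("https://your-proxy:8766", text: $customApiEndpoint)
                .textFieldStyle(.roundedBorder)
                // In the shipped card, onSubmit calls the bridge-restart hook.
            Text("Route API calls through a custom endpoint (e.g. local LLM bridge, corporate proxy, or GitHub Copilot bridge). Leave empty to use the default Anthropic API.")
                .font(.caption)
                .foregroundStyle(.secondary)
        }
        .padding()
    }
}
```

Because the field is backed by `@AppStorage`, every keystroke lands in UserDefaults under `customApiEndpoint`; no explicit save step exists.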
Where the TextField goes once the user presses Return
The env injection inside the bridge launcher
When the bridge subprocess is spawned, the value of customApiEndpoint is read from UserDefaults and, if non-empty, assigned to ANTHROPIC_BASE_URL on the spawned environment. The packaged Anthropic SDK (`@anthropic-ai/claude-agent-sdk` inside acp-bridge) honors that env var at SDK init.
From the agent runtime's point of view, nothing changed. From the network's point of view, every /v1/messages call is now hitting a process the user controls.
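The injection itself is quoted verbatim later in this article; in context it looks roughly like this (the function wrapper is a sketch, the conditional body is the shipped three lines):

```swift
import Foundation

// Sketch of the env handoff in the bridge launcher. The spawn
// machinery around it is simplified; the conditional is as shipped.
func bridgeEnvironment(defaults: UserDefaults = .standard) -> [String: String] {
    var env = ProcessInfo.processInfo.environment
    // Custom API endpoint (allows proxying through Copilot, corporate gateways, etc.)
    if let customEndpoint = defaults.string(forKey: "customApiEndpoint"),
       !customEndpoint.isEmpty {
        env["ANTHROPIC_BASE_URL"] = customEndpoint
    }
    return env
}

// The dictionary then becomes the spawned subprocess's environment:
//   process.environment = bridgeEnvironment()
```

An empty or absent key means the SDK falls through to its default, api.anthropic.com.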
The async hook that makes the change apply on the next message
Fazm does not eagerly restart the bridge when the endpoint changes. It stops the current bridge and flips an internal flag. The next user query spawns a fresh bridge with the new env. The hook is six lines.
The settings UI calls this hook from two places. The toggle on the settings card (line 926) calls it when the toggle is flipped off and the field is cleared. The TextField onSubmit handler (line 940) calls it when the user presses Return after editing the URL. Both paths land in the same async function.
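A sketch of the hook's shape, based on the behavior described above. `restartBridgeForEndpointChange` and `acpBridgeStarted` are the real names; the `ACPBridge` stand-in, the log wording, and the surrounding class body are simplified assumptions:

```swift
import Foundation

// Stand-in for the real bridge type; stop() terminates the subprocess.
actor ACPBridge {
    func stop() { /* terminate the acp-bridge subprocess */ }
}

final class ChatProvider {
    let acpBridge: ACPBridge
    private(set) var acpBridgeStarted = true

    init(acpBridge: ACPBridge) { self.acpBridge = acpBridge }

    // Lazy restart: stop now, respawn on the next user query so the
    // fresh bridge picks up the new ANTHROPIC_BASE_URL cleanly.
    func restartBridgeForEndpointChange() async {
        let endpoint = UserDefaults.standard.string(forKey: "customApiEndpoint") ?? ""
        print("Endpoint changed to \(endpoint.isEmpty ? "<default Anthropic API>" : endpoint)")
        await acpBridge.stop()      // tear down the in-flight bridge
        acpBridgeStarted = false    // next message spawns a fresh bridge with the new env
    }
}
```

The lazy respawn is the design choice worth noting: an eager restart would race any in-flight message, while flipping the flag makes the next spawn the single point where the env is read.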
Six steps from settings card to a non-Anthropic April 2026 LLM
User toggles Custom API Endpoint on
Settings card at SettingsPage.swift line 906. Toggle is bound to a derived state: enabled when either the toggle is on or customApiEndpoint is non-empty.
User types https://your-proxy:8766
TextField at line 936 binds to the AppStorage-backed property. SF Symbol on the card is server.rack, deliberately mechanical.
User presses Return
onSubmit handler at line 939 fires await chatProvider?.restartBridgeForEndpointChange(). The bridge is not eagerly restarted; the in-flight bridge is stopped and the flag flipped.
Next message triggers a fresh bridge spawn
ACPBridge.swift line 380 reads customApiEndpoint again. If non-empty, it sets `env["ANTHROPIC_BASE_URL"] = customEndpoint`. The env is handed to the spawned subprocess at line 396.
Anthropic SDK respects ANTHROPIC_BASE_URL
The packaged @anthropic-ai/claude-agent-sdk reads ANTHROPIC_BASE_URL from process.env at SDK init. Every /v1/messages call from the agent now goes to your bridge.
Bridge translates and serves your chosen LLM
Outside Fazm. A user-supplied process speaking Anthropic's /v1/messages format on the chosen port. The model could be DeepSeek V4, Qwen 3.5-Omni, Gemma 4, or any April 2026 release with an open-protocol shim.
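What "an open-protocol shim" means concretely: the bridge accepts Anthropic's /v1/messages body and re-emits it in whatever shape the local runtime wants. A minimal sketch of the request-side translation; field names follow the public Anthropic and OpenAI chat schemas, but real bridges must also handle content blocks, tool-use, streaming, and the response path:

```swift
import Foundation

// Minimal subset of an Anthropic /v1/messages request body.
struct AnthropicRequest: Codable {
    struct Message: Codable { let role: String; let content: String }
    let model: String
    let max_tokens: Int
    let system: String?
    let messages: [Message]
}

// Minimal subset of the OpenAI-style chat body most local runtimes accept.
struct OpenAIChatRequest: Codable {
    struct Message: Codable { let role: String; let content: String }
    let model: String
    let max_tokens: Int
    let messages: [Message]
}

func translate(_ req: AnthropicRequest, localModel: String) -> OpenAIChatRequest {
    var messages: [OpenAIChatRequest.Message] = []
    // Anthropic carries the system prompt as a top-level field;
    // OpenAI-style APIs carry it as the first message.
    if let system = req.system {
        messages.append(.init(role: "system", content: system))
    }
    messages += req.messages.map { .init(role: $0.role, content: $0.content) }
    return OpenAIChatRequest(model: localModel,
                             max_tokens: req.max_tokens,
                             messages: messages)
}
```

The `localModel` parameter is where the bridge substitutes its own model ID, which is why the model the agent asks for and the model that answers can differ.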
“Updated Custom API Endpoint help text to mention local LLM bridges as an example.”
Fazm 2.4.2 release notes, /Users/matthewdi/fazm/CHANGELOG.json, dated 2026-04-26
The byte budget that makes a local model viable here
The reason a small consumer Mac app can plausibly route Mac tool-use traffic through a local DeepSeek V4 or Qwen 3.5-Omni is that the per-step input is small. Fazm sends the model a structured accessibility tree, not a screenshot. The bundled MCP server lives at Contents/MacOS/mcp-server-macos-use inside the app bundle and harvests AX state through Apple's AXUIElementCopyAttributeValue API.
~80 KB — PNG screenshot per UI step
~2 KB — AX tree per UI step
~40x — input shrinkage per step
4 — symbols in the endpoint seam
A typical Gmail inbox is roughly two kilobytes of accessibility tree, or roughly eighty kilobytes encoded as a base64 PNG. Models bill by token, not by pixel. That is the shape of the math that makes a local Gemma 4 31B Dense or a DeepSeek V4 Flash even thinkable for chained Mac tool use, where a real workflow may run forty turns deep.
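The arithmetic, made explicit. The per-step token counts are the article's own rough figures; the multiplication across a deep workflow is the point:

```swift
// Per-step figures from the article; order-of-magnitude values.
let textTokensPerStep = 700        // ~2 KB accessibility tree
let imageTokensPerStep = 28_000    // ~80 KB base64 PNG
let turns = 40                     // a realistic chained Mac workflow

let textWorkflow = textTokensPerStep * turns     // 28,000 tokens total
let imageWorkflow = imageTokensPerStep * turns   // 1,120,000 tokens total
print("per-step shrinkage: \(imageTokensPerStep / textTokensPerStep)x")  // 40x
```

A forty-turn workflow on AX trees costs about what a single screenshot-driven turn does; that gap is what puts a local model's throughput inside the viable range.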
April 2026, mapped onto one Settings field
Six release events in one month. The Anthropic models (Sonnet 4.6 and Opus 4.7, served by api.anthropic.com) are reachable by default. The rest are reachable only via a user-supplied bridge whose URL ends up in one TextField.
DeepSeek V4 Flash + V4 Pro
Released April 24, 2026. Roughly $5.2M in training spend, Anthropic-class coding and agentic benchmarks. Open weights are runnable on capable Mac silicon, but only reach a Mac surface if a local Anthropic-compat bridge serves them.
Qwen 3.5-Omni
Native omnimodal: 10+ hours of audio context and 400+ seconds of 720P video in one shot. Open weights. Plugs into Fazm only via the Custom API Endpoint TextField backed by an Anthropic-protocol bridge.
Gemma 4 family
April 2 release under Apache 2.0. The 31B Dense beats models 20x its size on intelligence-per-parameter. Reachable from Fazm via Ollama or MLX behind a local /v1/messages bridge listening on a chosen port.
Meta Muse Spark
Alexandr Wang's first frontier model under Meta Superintelligence Labs. Proprietary, multimodal, agentic. Backed by a $115B-$135B 2026 capex announcement, signaling that the 'one shipped binary serves any model' question is becoming a multi-vendor problem.
Claude Opus 4.7 GA
Flipped to GA on April 22, 2026. Sits inside Fazm's Smart pill via the four-row substring table. Routed through api.anthropic.com by default. ANTHROPIC_BASE_URL flips that route in one TextField.
GPT-6 + GPT-5.5
Top-end OpenAI release plus an agentic preview. Reachable from Fazm only via a local OpenAI-to-Anthropic translation bridge serving /v1/messages on the configured port.
What the bridge spawn looks like with a custom endpoint set
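The original snippet is not reproduced here; the following is an illustrative reconstruction of its shape only. Every literal string below is an assumption, not the shipped log text:

```
[ChatProvider] Custom API endpoint changed; bridge will restart on next message
[ACPBridge] spawning acp-bridge subprocess
[ACPBridge] env override:
[ACPBridge]   ANTHROPIC_BASE_URL=http://localhost:8766
POST http://localhost:8766/v1/messages
```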
The first line is the actual log statement at ChatProvider.swift line 2104. The third and fourth lines are the env override that ACPBridge.swift line 381 produced. The POST URL is what the packaged @anthropic-ai/claude-agent-sdk emits when ANTHROPIC_BASE_URL is set in process.env.
The April 2026 timeline as it actually rolled into a shipping app
Six dated events. One Settings field. The endpoint TextField predates April 2026; what shifted in April was the help text that names local LLM bridges as the example, and the supply of frontier models behind that help text.
April 2, 2026 — Gemma 4 family lands
Google opens Gemma 4 under Apache 2.0. Four variants, the 31B Dense beats much larger closed models on intelligence-per-parameter. Runnable locally via MLX or Ollama on M-series silicon. To touch a Fazm session, it has to come through a local Anthropic-compat bridge.
April 16, 2026 — Qwen 3.6 / 3.5-Omni window
Alibaba ships Qwen 3.6 35B-A3B and the omnimodal Qwen 3.5-Omni in the same release window. Native ten-hours-of-audio context. Open weights. Same plumbing requirement: a local /v1/messages translator, plugged into Fazm via the Custom API Endpoint TextField.
April 20, 2026 — Fazm 2.4.0
Fazm 2.4.0 ships. Two relevant entries land in the same release: 'Available AI models now populate dynamically from the agent, so newly released Claude models appear without an app update' and 'Added custom MCP server support via ~/.fazm/mcp-servers.json with Settings UI to add, edit, and toggle servers'. Both seams are about admitting external configuration without rebuilding.
April 22, 2026 — Claude Opus 4.7 GA
Anthropic flips Opus 4.7 to GA. The default route through api.anthropic.com starts serving the new model. Users running Fazm with `customApiEndpoint` empty pick it up automatically. Users with a custom endpoint route through their bridge instead.
April 24, 2026 — DeepSeek V4 unveiled
DeepSeek announces V4 Flash and V4 Pro. Roughly $5.2M training cost. Open weights. The route into a Fazm session, on the day of the announcement, is the same as for Gemma or Qwen: stand up a local Anthropic-protocol bridge, paste its URL into the TextField.
April 26, 2026 — Fazm 2.4.2 ships explicit naming
Version 2.4.2 of Fazm ships the explicit changelog entry: 'Updated Custom API Endpoint help text to mention local LLM bridges as an example.' The same day, the help text under the TextField now reads: 'Route API calls through a custom endpoint (e.g. local LLM bridge, corporate proxy, or GitHub Copilot bridge).'
Two ways to expose an April 2026 LLM to a Mac surface
One requires a provider abstraction, a build, and a release. The other is a pasted URL.
| Feature | Native provider per LLM | Fazm (Custom API Endpoint TextField) |
|---|---|---|
| Adding a brand-new April 2026 LLM to the active surface | Add a provider abstraction in source, ship a build, push to App Store review, wait for users to update, repeat per model | User pastes a URL into the Custom API Endpoint TextField; the next message routes through their chosen LLM |
| Per-step input to the model | ~80 KB base64 PNG of the active window per step; encoder cost dominates token bill | ~2 KB structured accessibility tree harvested via AXUIElementCopyAttributeValue, served by the bundled mcp-server-macos-use |
| Switching models mid-session | Pick a different model in the picker, re-validate prompts, restart the chat | Edit the URL, press Return; restartBridgeForEndpointChange() flips the next-spawn env and the next message is on the new model |
| What the user picks visually | Drop-down listing every model ID from every lab | Three pills in the floating bar (Scary, Fast, Smart). Provider-agnostic. The custom endpoint sits behind a separate toggle in Advanced |
| Backing for non-Anthropic April 2026 releases | Wait for a native provider integration per release | Bring your own /v1/messages bridge. Fazm gives you the seam, you pick what runs behind it |
Why the consumer surface stayed three pills wide
The floating bar at Cmd+Shift+Space shows three labels: Scary, Fast, Smart. They map to Anthropic Haiku, Sonnet 4.6, and Opus 4.7 respectively, and the labels do not fork by provider. April's flood of releases did not add a fourth or fifth pill. It did not add a provider picker. It did not add a per-model preference matrix. What it did was justify, in the 2.4.2 changelog, a single line of help text under one TextField. That minimalism is the design. The seam exists for users who know what they want behind it, hidden behind a Toggle in Advanced. The default surface stayed three pills wide because the right product question for a consumer Mac user opening their menu bar at 9am on a Tuesday is not which frontier launch to use; it is which task to run.
What is not in the box
Fazm does not bundle a local LLM. It does not run DeepSeek V4 or Qwen 3.5-Omni or Gemma 4 inference on your Mac on its own. Those models are open weights from their original publishers, and getting them to speak Anthropic's /v1/messages shape is the responsibility of whichever bridge a user picks. The seam Fazm provides is small on purpose: one env var, one subprocess respawn, one help-text line citing local LLM bridges as an example. That is the entire surface this guide documents. The April 2026 news cycle of frontier launches changes which bridges are interesting; the seam itself remained the same shape across the month.
Want to see this routed at a local Gemma 4 or DeepSeek V4 bridge?
Walk through the Custom API Endpoint card, the env override, and the byte-budget math on a 20-minute call.
Frequently asked questions
Which LLM launches actually mattered in April 2026?
Inside one calendar month, DeepSeek released V4 Flash and V4 Pro at roughly $5.2 million in training spend; Alibaba pushed Qwen 3.5-Omni, an omnimodal model that ingests over ten hours of audio and 400 seconds of 720P video; Google opened Gemma 4 under Apache 2.0 with the 31B Dense beating much larger closed models on intelligence-per-parameter; Meta unveiled Muse Spark, the first proprietary frontier model out of Alexandr Wang's Superintelligence Labs; OpenAI shipped GPT-6 and previewed an agentic GPT-5.5; and Anthropic flipped Claude Opus 4.7 to GA on April 22 alongside Claude Sonnet 4.6 staying the everyday default. The interesting question for a consumer Mac app is not which one wins which benchmark. It is which of these can a user point their Mac surface at, the same hour they download the weights, without waiting for an app update.
What part of Fazm exposes any of those models to a local Mac workflow?
One AppStorage key in the settings UI. The exact line in the Swift source is `@AppStorage("customApiEndpoint") private var customApiEndpoint: String = ""` at /Users/matthewdi/fazm/Desktop/Sources/MainWindow/Pages/SettingsPage.swift line 840. The card that wraps it is at lines 906 to 952. The card title is the literal string Custom API Endpoint, the SF Symbol used is server.rack, and the placeholder is `https://your-proxy:8766`. The help text reads verbatim: 'Route API calls through a custom endpoint (e.g. local LLM bridge, corporate proxy, or GitHub Copilot bridge). Leave empty to use the default Anthropic API.' That string lives at line 943 of the same file. The 2.4.2 changelog entry on April 26 even records the moment that help text was edited to call out local LLM bridges by name: 'Updated Custom API Endpoint help text to mention local LLM bridges as an example.'
How does that one TextField actually reach the agent subprocess?
Through three lines of Swift inside the bridge launcher. /Users/matthewdi/fazm/Desktop/Sources/Chat/ACPBridge.swift lines 379 to 382 read: '// Custom API endpoint (allows proxying through Copilot, corporate gateways, etc.) / if let customEndpoint = defaults.string(forKey: "customApiEndpoint"), !customEndpoint.isEmpty { env["ANTHROPIC_BASE_URL"] = customEndpoint }'. That `env` dictionary becomes the environment of the spawned ACP subprocess at line 396. The Anthropic SDK that ships inside acp-bridge respects ANTHROPIC_BASE_URL, so every /v1/messages call from the agent runtime then targets your bridge instead of api.anthropic.com. From the agent's point of view, nothing changed. From the model's point of view, traffic is being served by whatever process is listening on `https://your-proxy:8766`.
What happens when the user changes the endpoint mid-session?
The settings UI calls a Swift async function called `restartBridgeForEndpointChange()`. It is defined at /Users/matthewdi/fazm/Desktop/Sources/Providers/ChatProvider.swift lines 2101 to 2107. The body is six lines: it reads the current `customApiEndpoint` from UserDefaults, logs the change, calls `await acpBridge.stop()`, and flips an internal `acpBridgeStarted` flag back to false. The bridge is not restarted eagerly. It is restarted lazily on the next user query, so the new ANTHROPIC_BASE_URL is picked up cleanly, with the new env, on the very next message. The on-screen toggle on the settings card and the TextField onSubmit handler both call the same hook, at SettingsPage.swift lines 926 and 940 respectively.
Could I really wire DeepSeek V4 or Qwen 3.5-Omni into Fazm with this?
Yes, with one external dependency you bring yourself: a local server that speaks the Anthropic /v1/messages protocol and translates it to whichever LLM you actually want to call. Open-source bridges that do this exist (LiteLLM's Anthropic-compatible mode, the `claude-anthropic-proxy` family, and several model-specific projects on GitHub). You run that bridge on `localhost:8766`, you put `http://localhost:8766` into Fazm's Custom API Endpoint field, and the next time you press Cmd+Shift+Space the agent goes through your local server. Caveats are real and worth flagging. The agent expects Anthropic-shaped tool-use blocks; bridges vary in how faithfully they convert to OpenAI tool-call schemas and back, and quality differs sharply between Sonnet-class and smaller open weights when chaining 40+ tool steps. Fazm itself does not certify those bridges.
Why is the per-step input small enough that a local model can keep up?
Because Fazm sends a structured accessibility tree, not a screenshot, on every UI step. The bundled MCP server lives at Contents/MacOS/mcp-server-macos-use inside the app bundle, written in Swift, and harvests AX state through Apple's AXUIElementCopyAttributeValue. A typical Gmail inbox returns roughly two kilobytes of structured text. The same inbox encoded as a base64 PNG is roughly eighty kilobytes. The model bills by token, not by pixel, so per-step input drops from roughly twenty-eight thousand image-like tokens to roughly seven hundred text tokens. That is the math that makes routing through a local DeepSeek V4 or Qwen 3.5-Omni even thinkable on consumer hardware.
Where is the setting in the actual UI?
Settings > Advanced > AI Chat. The sidebar entry that takes you there is registered at /Users/matthewdi/fazm/Desktop/Sources/MainWindow/SettingsSidebar.swift line 57 with `settingId: "aichat.endpoint"`, the icon `cpu`, and the keyword list `["endpoint", "proxy", "base url", "anthropic", "copilot", "gateway", "corporate"]`. So if you type any of those words into the settings searcher, the card jumps to the top. The card itself is collapsed behind a Toggle by default, because it is the kind of setting that should ship hidden until somebody knows they want it.
What about MCP server-style integrations? Can I plug a non-Anthropic model in that way?
Yes. Separate from the endpoint TextField, Fazm 2.4.0 also added user-defined MCP servers. The manager is at /Users/matthewdi/fazm/Desktop/Sources/MCPServerManager.swift, and the config file lives at `~/.fazm/mcp-servers.json` per the docstring on line 4. The format mirrors Claude Code's `mcpServers`, with `command`, `args`, `env`, and `enabled`. That is a different lever: it adds tool-providing servers to the agent rather than redirecting the model itself. Together, the Custom API Endpoint (model-side) and `~/.fazm/mcp-servers.json` (tool-side) are the two seams a user can open without touching app source.
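Given that the format mirrors Claude Code's `mcpServers` with `command`, `args`, `env`, and `enabled`, a plausible `~/.fazm/mcp-servers.json` looks like the sketch below. The server name and command here are examples, not shipped defaults:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/Documents"],
      "env": {},
      "enabled": true
    }
  }
}
```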
Why doesn't Fazm just ship native support for DeepSeek, Qwen, Gemma, and GPT?
Two reasons, both visible in the codebase. First, the agent runtime that drives Mac-side tool use is Anthropic's ACP, packaged in /Users/matthewdi/fazm/acp-bridge. The bridge talks to a Claude agent SDK and emits an `availableModels` payload whose IDs the floating-bar picker maps to three labels (Scary, Fast, Smart). Adding non-Anthropic native providers would mean shipping a parallel runtime per provider. Second, the surface area of a consumer app is the picker itself. Three pills is the design. The Custom API Endpoint setting is the escape hatch for users who want a non-Anthropic model under the hood without forking the picker. The 2.4.2 changelog entry on April 26 records the design's framing: the help text was specifically updated to mention local LLM bridges as an example.
Can I check the file paths and line numbers in this article?
Custom API Endpoint card, AppStorage declaration: /Users/matthewdi/fazm/Desktop/Sources/MainWindow/Pages/SettingsPage.swift line 840. The card UI: lines 906 to 952. The placeholder string `https://your-proxy:8766`: line 936. The help text mentioning local LLM bridges: line 943. The env injection: /Users/matthewdi/fazm/Desktop/Sources/Chat/ACPBridge.swift lines 379 to 382. The async restart hook: /Users/matthewdi/fazm/Desktop/Sources/Providers/ChatProvider.swift lines 2101 to 2107. The settings sidebar registration: /Users/matthewdi/fazm/Desktop/Sources/MainWindow/SettingsSidebar.swift line 57. The MCP server manager docstring naming `~/.fazm/mcp-servers.json`: /Users/matthewdi/fazm/Desktop/Sources/MCPServerManager.swift line 4. The 2.4.2 changelog entry: /Users/matthewdi/fazm/CHANGELOG.json, version 2.4.2, dated 2026-04-26, the line beginning 'Updated Custom API Endpoint help text'.
Related guides from the Fazm field notes
Latest LLM models April 2026
The four-row substring table inside Fazm that absorbs new Claude model IDs without an app release. Companion piece on the Anthropic-side seam.
Anthropic Claude latest updates April 2026
The byte-budget shift behind Sonnet 4.6 and Opus 4.7 on a Mac, with the bundled MCP server path and the per-call input math.
Anthropic API changelog April 2026
What changed at the API level inside Anthropic's protocol bumps in April, and why the Fazm bridge picks them up without UI churn.