APRIL 2026 — FIVE FAMILIES, ONE MENU

April shipped five model families. Your Mac agent cannot rebuild for every one.

Claude Opus 4 and Sonnet 4 on April 2. Gemini 2.5 Pro on April 1, Flash on April 3. GPT-5 Turbo on April 7. Llama 4 Scout and Maverick on April 5. Four Gemma 4 variants with day-one vLLM support. Every Mac roundup this month lists the models. None answer the question a desktop agent author actually has: how does a signed macOS app absorb a new model ID the same day it lands, without shipping a build? This guide reads April 2026 through that lens, with three source files and thirty lines of Swift you can verify in the Fazm tree today.

Matthew Diakonov
12 min read
Written from the April 2026 release changelogs plus the Fazm source tree (MIT, github.com/mediar-ai/fazm)
Five model families released inside four weeks
Three pre-warmed ACP sessions so model swap has zero cold-start cost
Substring-match classifier that auto-labels a fresh release on day zero
ShortcutSettings.swift lines 159-163 — the nine lines that do it
Works across Claude, GPT-5 Turbo, Gemini 2.5, and local models behind a shim

The April release calendar, in one glance

This was not a drip. Four of the five major model labs pushed their spring flagship inside the same eight calendar days. A desktop agent that hardcodes a model pin had to patch five times. A desktop agent that matches by family patched zero times.

5 frontier families shipped
13 days between first and last launch
3 pre-warmed sessions on Mac
0 Fazm app updates required to expose them

the release cluster

What landed, when, on what substrate


2026-04-01 — Gemini 2.5 Pro

1M token context window, native multimodal reasoning in a single prompt. On Mac: through the Google AI Studio API or a Vertex proxy; no local inference.


2026-04-02 — Claude Opus 4 and Sonnet 4

Opus 4 for extended coding sessions (SWE-bench verified 72.1%). Sonnet 4 as the everyday default. Both reach the Mac through the Claude Code SDK, which is what the Fazm ACP bridge consumes.


2026-04-03 — Gemini 2.5 Flash

The cost-optimized variant. Same capabilities surface, lower latency, lower price. Folds into an existing Gemini client without API shape changes.


2026-04-05 — Llama 4 Scout + Maverick

MoE architecture, 17B active parameters per token, 10M token context on Scout. On Mac: run locally via Ollama, LM Studio, or llama.cpp on Apple Silicon; a Q4 Scout still comfortably serves on 32 GB of unified memory.


2026-04-07 — GPT-5 Turbo

Native image and audio generation inside the text model, same API. On Mac: through the OpenAI API or an Anthropic-shape shim sitting between Fazm and the OpenAI endpoint.


2026-04-09 to 2026-04-14 — Gemma 4 (four variants)

E2B, E4B, 26B MoE, 31B Dense. Day-one vLLM v0.19.0 support. All Apache 2.0. On Mac: 16 GB of unified memory serves the E4B and 26B MoE comfortably at Q4.


In the month Claude 4, GPT-5 Turbo, Gemini 2.5, Llama 4, and Gemma 4 all shipped, the Fazm model menu updated in place. Zero Sparkle deltas went out for a model label change.

Fazm release log vs. model release calendar, April 2026

the anchor fact

Nine lines of Swift that absorb a new release

The classifier lives in a single file: Desktop/Sources/FloatingControlBar/ShortcutSettings.swift. The interesting region is smaller than you would expect: a static tuple that maps three substrings (“haiku”, “sonnet”, “opus”) to three display labels and a sort order. That is the whole dictionary. A fresh release with a new version string still contains one of those three substrings, so it classifies itself on arrival.
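A TypeScript rendering of that map makes the point concrete. The field names below are illustrative assumptions; the real structure is the Swift tuple in ShortcutSettings.swift.

```typescript
// Illustrative mirror of Fazm's modelFamilyMap: three substrings, three
// display labels, one sort order each. Names here are assumptions, not
// the exact Swift declaration.
const modelFamilyMap: ReadonlyArray<{
  substring: string; // matched against the raw model ID
  label: string;     // the named slot shown in the UI
  order: number;     // sort position in the pill menu
}> = [
  { substring: "haiku",  label: "Scary", order: 0 },
  { substring: "sonnet", label: "Fast",  order: 1 },
  { substring: "opus",   label: "Smart", order: 2 },
];
```

Three entries, and every Claude release since has contained one of the three substrings.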

Desktop/Sources/FloatingControlBar/ShortcutSettings.swift

the classifier in motion

updateModels: the bit that runs on every ACP push

Whenever the ACP bridge tells Swift the model list changed, this function runs. It iterates the incoming list, tries a substring match against modelFamilyMap, and if nothing matches it falls through to a bucket at order 99 instead of discarding the model. That is the graceful-degradation guarantee: a release Fazm has never heard of still appears in the menu under its API name.
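The shape of that pass, sketched in TypeScript under the same assumptions as above (the real implementation is Swift, and the field names are illustrative):

```typescript
interface ModelOption { id: string; label: string; order: number; }

const familyMap = [
  { substring: "haiku",  label: "Scary", order: 0 },
  { substring: "sonnet", label: "Fast",  order: 1 },
  { substring: "opus",   label: "Smart", order: 2 },
];

// Sketch of the updateModels classification pass: substring match first,
// then a graceful fallback to order 99 so unknown families stay selectable.
function classify(modelIds: string[]): ModelOption[] {
  return modelIds
    .map((id) => {
      const family = familyMap.find((f) => id.includes(f.substring));
      return family
        ? { id, label: family.label, order: family.order }
        : { id, label: id, order: 99 }; // never discard an unknown model
    })
    .sort((a, b) => a.order - b.order);
}
```

classify(["claude-structo-1", "claude-opus-4-7"]) sorts Smart to the top and leaves the unknown ID visible at order 99 under its API name.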

Desktop/Sources/FloatingControlBar/ShortcutSettings.swift

how a fresh id reaches Swift

The ACP bridge, deduplicated, on a single message

The bridge lives in Node and runs as a child process of the Swift app. When the Claude Code SDK reports a model list, the bridge filters out the “default” pseudo-id and dedupes against the last emit so the Swift side never re-renders on an identical payload. One message type crosses the boundary: models_available.
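A minimal sketch of that filter + dedupe step. Function and variable names below are assumptions modeled on the description of acp-bridge/src/index.ts, not the exact source:

```typescript
// Sketch of the bridge's emit path: drop the "default" pseudo-id, then
// dedupe against the last JSON payload so Swift never re-renders twice
// on an identical model list.
let lastEmittedModelsJson = "";

function emitModelsIfChanged(
  models: Array<{ modelId: string }>,
  send: (msg: { type: "models_available"; models: string[] }) => void,
): boolean {
  // 1. "default" is an alias the SDK reports, not a selectable model.
  const ids = models.map((m) => m.modelId).filter((id) => id !== "default");
  // 2. Compare against the previous emit before crossing the boundary.
  const json = JSON.stringify(ids);
  if (json === lastEmittedModelsJson) return false;
  lastEmittedModelsJson = json;
  send({ type: "models_available", models: ids });
  return true;
}
```

Calling it twice with the same SDK payload sends exactly one message; the second call is a no-op.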

acp-bridge/src/index.ts

how the signal flows

Five sources, one hub, one UI

Anthropic, OpenAI, Google, Meta, and the open-weight labs all reach the same hub in different ways. The bridge normalizes every incoming model list into the same shape, emits one message, and the Swift UI re-labels. Nothing on the right side of this diagram changes when a new family ships on the left.

Release routing, April 2026

Claude 4 (Opus, Sonnet) · GPT-5 Turbo · Gemini 2.5 (Pro, Flash) · Llama 4 / Gemma 4 → acp-bridge → Scary · Fast · Smart · Unknown family

the swap-latency trick

Three sessions, pre-warmed at launch

A label change is cosmetic. The part that actually changes how a Mac agent feels when you swap models is whether the new session is warm. Fazm pre-warms three ACP sessions at app start, each with its own system prompt: the main chat, the floating bar, and the chat-observer that watches conversations. When the user flips between Fast and Smart, the cold-start cost was already paid on boot.
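The real warmup lives in Swift (ChatProvider.swift); a hypothetical bridge-side sketch of the same pattern, with assumed keys and an assumed warmupSession signature, looks like:

```typescript
// Hypothetical sketch of the three-role warmup at launch. The session keys
// ("main", "floating", "observer") come from the article; the function
// shapes are assumptions, not the project's API.
type SessionKey = "main" | "floating" | "observer";

const warmSessions = new Map<SessionKey, { systemPrompt: string }>();

function warmupSession(key: SessionKey, systemPrompt: string): void {
  // The real call opens an ACP session here, paying the cold start at boot.
  warmSessions.set(key, { systemPrompt });
}

function warmupAllAtLaunch(): void {
  // One warm session per role, so the first turn on any of them is warm.
  warmupSession("main", "You are the main chat.");
  warmupSession("floating", "You are the floating bar.");
  warmupSession("observer", "You watch conversations for tool-use cues.");
}
```

The design choice worth copying is one session per role, not one per model: swapping the model re-points a warm session instead of opening a cold one.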

Desktop/Sources/Providers/ChatProvider.swift

verify it yourself

Probe the running app without opening the UI

If Fazm is running on your Mac, you can dump the current model state to a JSON file with a single distributed notification, then swap the model slot from the command line. This is the same getState / setModel protocol documented in the project CLAUDE.md.

Fazm control protocol — com.fazm.control

what the protocol actually looks like

The one message that crosses the bridge

At wire level, this is quieter than it sounds. The Node subprocess emits a models_available message to the Swift app, the Swift app calls updateModels, and the UI re-renders. No roundtrip is needed because the SDK already has the latest IDs; the Mac side is a pure consumer of that stream.

models_available over ACP

Claude Code SDK → acp-bridge (Node): ModelsResponse { models: [...] } → filter modelId !== "default" → dedupe vs lastEmittedModelsJson → models_available { models } → Swift ShortcutSettings: substring-match vs modelFamilyMap → set availableModels (sorted) → Floating bar menu: re-render pill menu
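The message itself can be described with a few lines of TypeScript. The interface below is an assumption reconstructed from the flow in this article, not the project's published schema:

```typescript
// Assumed shape of the one message that crosses the bridge.
interface ModelsAvailable {
  type: "models_available";
  models: string[]; // raw SDK model IDs, "default" already filtered out
}

// A narrow runtime guard the Swift-side equivalent of which is implicit
// in the decoder; useful if you rebuild this bridge yourself.
function isModelsAvailable(msg: unknown): msg is ModelsAvailable {
  const m = msg as ModelsAvailable;
  return (
    !!m &&
    m.type === "models_available" &&
    Array.isArray(m.models) &&
    m.models.every((id) => typeof id === "string")
  );
}
```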

why named slots beat hardcoded ids

Label vs version: two approaches, one winner

Two desktop agents shipped the same day. One hardcoded the model ID in its UI. The other used a substring match on the family name. On April 2, when Anthropic promoted a new Sonnet alias, one of them needed a rebuild. The other did not.

Feature | Hardcoded ID | Fazm (named slots)
April 2026 Claude Sonnet release | requires a UI rebuild | Fast absorbs the new alias
April 2026 Claude Opus release | a pin in the dropdown, now stale | Smart re-points via substring match on "opus"
Unknown family (no haiku/sonnet/opus in ID) | never reaches the menu | falls through to order 99, still selectable
User muscle memory on a model rename | users re-learn the dropdown | labels are decoupled from versions
Cold-start cost on model swap | full warmup every swap | three pre-warmed sessions on boot
Sparkle delta for each release | one per release, minimum | not needed; the SDK feeds the UI

the provider shape

April's five families, on a Mac, through one menu

Each family reaches the Mac through slightly different plumbing, but past the bridge they all look the same: a list of IDs, a substring match, a label.

Claude 4 (Opus, Sonnet)

Native path. The ACP bridge talks to the Claude Code SDK; DEFAULT_MODEL at acp-bridge/src/index.ts:1245 is currently claude-sonnet-4-6. When Opus 4 or a new Sonnet alias ships, the list updates through the same wire message and Smart/Fast re-point automatically.

GPT-5 Turbo

Via an Anthropic-shape shim (one of several OSS options). Point UserDefaults customApiEndpoint at the shim URL. The shim exposes an id like gpt-5-turbo; it falls through to order 99 in the current classifier unless you add a family mapping.

Gemini 2.5 Pro / Flash

Same shim pattern. 1M token context still works because the ACP bridge streams everything through as opaque bytes.

Llama 4 (Scout, Maverick)

Local on Apple Silicon via Ollama or llama.cpp, behind an Anthropic-shape shim. 17B active parameters MoE, ~32 GB unified memory for a comfortable Q4 Scout.

Gemma 4 (E2B, E4B, 26B MoE, 31B Dense)

Local via vLLM v0.19.0 with day-one weights. Apache 2.0. 16 GB unified memory serves E4B and 26B MoE at Q4. Same shim in front.

The future model without a haiku / sonnet / opus substring

Falls through to order 99, appears in the menu under its API name, stays selectable. One line added to modelFamilyMap promotes it to a family slot next build.
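The "Anthropic-shape shim" that most of these entries lean on is, at its core, a request translator. Here is a minimal sketch of that translation under the well-known shapes of both APIs; the type definitions are heavily simplified assumptions, and a production shim must also translate streaming events, tool use, and error bodies:

```typescript
// Simplified request shapes. Real Anthropic/OpenAI requests carry many
// more fields; these capture only the structural difference that matters:
// Anthropic keeps the system prompt out-of-band, OpenAI inlines it.
interface AnthropicRequest {
  model: string;
  system?: string;
  max_tokens: number;
  messages: Array<{ role: "user" | "assistant"; content: string }>;
}

interface OpenAIRequest {
  model: string;
  max_tokens: number;
  messages: Array<{ role: "system" | "user" | "assistant"; content: string }>;
}

function anthropicToOpenAI(req: AnthropicRequest): OpenAIRequest {
  return {
    model: req.model, // a shim would typically remap the ID here, e.g. to "gpt-5-turbo"
    max_tokens: req.max_tokens,
    messages: [
      // Prepend the system prompt as OpenAI's first message.
      ...(req.system ? [{ role: "system" as const, content: req.system }] : []),
      ...req.messages,
    ],
  };
}
```

Point customApiEndpoint at a server doing this translation (plus the response half) and Fazm's Mac-side code never learns the provider changed.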

the practical checklist

If you are building a Mac agent this quarter

Five things the Fazm tree makes cheap that most Python desktop agents on GitHub still do manually in 2026.

What to copy (or import) in your own agent

  • Decouple the model label from the model ID. Label by family, not by version string.
  • Pre-warm every session at app start, one per role. The cold-start cost is a one-time hit at boot.
  • Dedupe your model-list emits against the last JSON payload. Swift (or any UI framework) re-renders are not free.
  • Treat the "default" pseudo-id as a filter rule, not a selectable model.
  • Fall through unknown model families to a visible-but-deprioritized bucket. Never discard a model the SDK knows about.

what the user sees

From release to pill, in one beat

This is the shortest path the signal takes from an Anthropic or OpenAI or Google release announcement to a reshaped dropdown in the floating bar on your Mac.

SDK reports new ID → bridge filters "default" → dedupe vs lastEmittedModelsJson → send models_available → Swift updateModels() → substring match → Scary / Fast / Smart → pill re-renders

one stat at a time

3
family substrings recognized by the classifier
99
fallback sort order for unknown-family releases
1
protocol message needed to relabel every pill

What this gets wrong, honestly

Three weaknesses worth naming. The substring match assumes Anthropic keeps using the words haiku, sonnet, and opus in model IDs; if they collapse the naming scheme, the classifier stops producing family labels and everything falls through to order 99 until modelFamilyMap grows. The pre-warm cost is paid at every app launch even if the user never leaves the Fast slot; that is a tradeoff for instant swap. And this design assumes a single bridge, single SDK; if you plug multiple providers in simultaneously through the same channel, ID collisions (e.g. two providers both shipping a “sonnet”) become the classifier's problem instead of the user's. None of these are blockers today; they are the places this approach will break first when it breaks.

See the April 2026 release pile live in the Fazm floating bar

Fifteen minutes with the team. Bring a model you want to route through Scary, Fast, or Smart.

Book a call

Frequently asked questions

What LLMs actually released in April 2026, and which run on a Mac?

Inside four weeks: Claude Opus 4 and Sonnet 4 on 2026-04-02, Gemini 2.5 Pro and Flash on 2026-04-01 and 2026-04-03, GPT-5 Turbo on 2026-04-07, Llama 4 Scout and Maverick on 2026-04-05, and four Gemma 4 variants with day-one vLLM v0.19.0 support. On a Mac, the hosted releases (Claude, GPT-5 Turbo, Gemini) reach you through the same API surfaces they always did. The open releases (Llama 4, Gemma 4, Qwen 3.5 MoE, DeepSeek) run locally on Apple Silicon because unified memory lifts the VRAM ceiling that hobbles consumer PCs. A Q4-quantized 13B model fits comfortably in 16 GB of unified memory. The harder Mac question is not which one to run but how a desktop agent picks one up without requiring the user to wait for an app rebuild.
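The "13B at Q4 in 16 GB" claim checks out on a napkin. The sketch below assumes roughly 4.5 effective bits per weight for a Q4_K-style quantization and a rough 1.2x factor for KV cache and runtime overhead; both numbers are ballpark assumptions, not measurements:

```typescript
// Back-of-envelope memory estimate for a quantized model.
// Assumption: ~4.5 effective bits/weight for a Q4_K-style quant.
function q4WeightGiB(params: number, bitsPerWeight = 4.5): number {
  return (params * bitsPerWeight) / 8 / 1024 ** 3;
}

const weights = q4WeightGiB(13e9);   // ~6.8 GiB of weights for a 13B model
const withOverhead = weights * 1.2;  // rough allowance for KV cache + runtime
```

Roughly 8 GiB resident, which leaves about half of a 16 GB machine for the OS and apps.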

How does Fazm pick up a new Claude release on the same day it ships?

Through a substring-match classifier at Desktop/Sources/FloatingControlBar/ShortcutSettings.swift lines 159 to 163. The map has three entries: "haiku" to Scary, "sonnet" to Fast, "opus" to Smart. When the Claude agent SDK reports a new model ID like claude-opus-4-7 or claude-sonnet-5-0, ShortcutSettings.updateModels iterates the list, checks modelId.contains(substring), and assigns a label + display order. The UI menu updates in place. No code change, no app update, no Sparkle delta. The bridge emits models_available over the ACP protocol in acp-bridge/src/index.ts lines 1271 to 1281 whenever the raw list from the Claude Code SDK shifts, and the Swift side reacts to that message in FloatingControlBarWindow.swift.

What are the three pre-warmed sessions, and why do they matter for model swap?

In Desktop/Sources/Providers/ChatProvider.swift lines 1047 to 1051 the app calls acpBridge.warmupSession with three configs: key "main" for the main chat, key "floating" for the floating bar, and key "observer" for the chat-observer role that watches conversations for tool-use cues. Each session is opened against its own system prompt at app start, so the first turn on any of them is warm. When the user flips from Fast to Smart, the bridge routes to a new session under the hood but the warmup already paid the cold-start cost on the default. The user sees a label change, not a latency spike. That is the consumer-friendly part of the design: swap the model, not the app.

Does the substring matcher break when a new family name appears, like a model called claude-structo-1?

It does not break, it falls through. The matcher at lines 180 to 190 first tries modelFamilyMap; if nothing matches, it builds a ModelOption with the SDK-reported display name as both the label and the shortLabel, and assigns order 99 so the unknown family sorts below Scary, Fast, and Smart. The user sees the model appear under its API name and can still select it. When Anthropic (or anyone whose ID routes through the same SDK) introduces a new family, adding one line to modelFamilyMap is enough to relabel it on the next app build. Until then, nothing is lost. This is why the model pill in the floating bar does not require a dictionary of every model the SDK has ever shipped.

How does Fazm decide which release is "latest" for each slot?

It does not decide, the Claude agent SDK does. The ACP bridge at acp-bridge/src/index.ts line 1274 filters out the pseudo-model id "default" and forwards the remaining list verbatim. When the SDK reports an ordered list of available IDs, the first ID that contains "opus" is what Smart routes to, the first with "sonnet" is Fast, the first with "haiku" is Scary. The "latest" label in the UI ("Fast (Sonnet, latest)") is a promise about the family, not a pinned version. If Anthropic promotes claude-sonnet-4-7 to the default sonnet alias, Fast follows it with no user action. This matters on Mac because the alternative is shipping a new signed, notarized, Sparkle-delta build every time a pin moves, which nobody does fast enough to catch a Monday release cycle.
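The slot-resolution rule described above is one line of logic. A TypeScript sketch, with an illustrative function name (the real behavior lives across the bridge and ShortcutSettings):

```typescript
// Each named slot routes to the FIRST SDK-reported ID containing its
// family substring, so when the SDK promotes a new alias to the front
// of its list, the slot follows it with no user action.
function resolveSlot(sdkOrderedIds: string[], familySubstring: string): string | undefined {
  return sdkOrderedIds.find((id) => id.includes(familySubstring));
}

// Example: the SDK now lists 4-7 ahead of 4-6, so "Fast" follows it.
const ids = ["claude-sonnet-4-7", "claude-sonnet-4-6", "claude-opus-4-1"];
```

resolveSlot(ids, "sonnet") yields "claude-sonnet-4-7"; a family with no match (say "haiku" here) yields undefined and the slot stays empty rather than mis-routing.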

What does GPT-5 Turbo, Gemini 2.5, or Llama 4 look like in this system?

They do not run through the stock ACP bridge today. Fazm's default model slot runs Claude via the ACP bridge, which talks to Anthropic's Messages API (or a local shim at the URL set under UserDefaults key customApiEndpoint). To plug GPT-5 Turbo or Gemini 2.5 Pro in, you run an Anthropic-shape shim in front of them and point customApiEndpoint at it. To plug Llama 4 or Gemma 4 in locally, you run Ollama or vLLM with the same kind of shim. In both cases the classifier and session pre-warmer do not care: as long as the endpoint reports a model list over the ACP protocol, the label pills ("Smart", "Fast", "Scary") light up however the IDs match. The Mac-side code is stable across providers.

Why name the slots "Scary", "Fast", and "Smart" instead of Haiku, Sonnet, Opus?

Two reasons. First, consumer legibility. A person opening a floating bar to rename a file does not want to decide between "claude-haiku-4-5" and "claude-opus-4-7"; they want to pick how the reply will feel. Scary means fast and cheap, Fast means the everyday default, Smart means use real thinking. Second, durability. Named slots outlive model versions. Every April release shifted the best-in-class ID for at least one family; the labels did not. Users who built muscle memory on Fast in January 2026 still press Fast today and get the current Sonnet generation without re-learning a menu. The label-vs-ID split is in Desktop/Sources/FloatingControlBar/ShortcutSettings.swift at struct ModelOption on line 144.

Where do I verify these claims in the source?

Three files, all in the MIT-licensed tree at github.com/mediar-ai/fazm. Desktop/Sources/FloatingControlBar/ShortcutSettings.swift lines 144 to 215 contains the ModelOption struct, defaultModels, modelFamilyMap, and updateModels. Desktop/Sources/Providers/ChatProvider.swift lines 1047 to 1051 contains the three-session warmup call. acp-bridge/src/index.ts lines 1245 to 1281 contains DEFAULT_MODEL and emitModelsIfChanged. The SDK wire format for models_available is filtered there: the "default" pseudo-model is dropped at line 1274, and the emit is deduplicated via lastEmittedModelsJson so the Swift side only re-renders on a real change.

Is there any case where an April 2026 release fails to light up the UI on day zero?

Yes, two. First, if the Claude Code SDK bundled with the installed Fazm version does not yet know about the new ID, the SDK itself will not report it until the user installs an ACP-bridge update; the bridge is shipped as a Node subprocess inside the .app and the bridge version is visible in the bridgeMode field of the control state (see CLAUDE.md). Second, if a release carries a model family Fazm has never seen (not haiku, not sonnet, not opus), it appears under its API name with order 99. Neither case is a breakage; they are two ways the graceful-degradation path expresses itself. A full-family rename would require a one-line edit to modelFamilyMap, which is a minutes-long shipping effort and does not block the model from being selectable in the meantime.