2026 model release wave, year to date

Every major 2026 AI model update, absorbed by a 3-line substring map

The roundups all rank these by benchmark. This one is about the architectural pattern a shipping macOS app uses to keep its picker correct, its prompts stable, and its release cycle independent of which provider shipped a new alias yesterday. Year to date, that pattern has absorbed roughly a dozen frontier-class updates without a single user-facing release of its own.

Matthew Diakonov, Written with AI

Published May 26, 202610 min read

4.9from Shipping on macOS through every 2026 release

Sonnet 4.6, Opus 4.6, Opus 4.7: live the day each shipped

Picker updated via ACP, not via a Swift rebuild

Substring family map: 3 entries, 0 lines of glue per release

Same pattern absorbs Gemini and Codex models

Direct answer (verified 2026-05-26)

The 2026 major releases so far: Claude Sonnet 4.6, Claude Opus 4.6, Claude Opus 4.7 (April 16, $5 in / $25 out, updated tokenizer, high-resolution image input); Google Gemini 3.1 Pro (Vertex GA, 2M tokens, 94.3% GPQA Diamond); OpenAI GPT-5 Turbo and the GPT-5.5 step; DeepSeek R2 then V4 Preview (trillion-parameter, Huawei Ascend, sub-$0.14 / M Flash); Meta Llama 4 Maverick + Scout and Meta Muse Spark from Meta Superintelligence Labs; Mistral Large 3 with EU data residency; Microsoft's three foundational models (text, voice, image); Qwen 3 and Gemma 4. Aggregators like llm-stats.com/llm-updates track the running list; Anthropic's announcement for Opus 4.7 is the canonical source for the April 16 benchmark deltas.

The anchor fact

0 entries in a static constant, every 2026 release routed

The whole model picker in Fazm sits on top of a single Swift constant: a 3-tuple substring map at ShortcutSettings.swift:193-197. The substrings are haiku, sonnet, opus. The labels are Scary, Fast, Smart. The order is fixed. Every new Claude alias Anthropic ships in 2026 lands in the correct slot via modelId.contains(substring). That is the whole pattern.

The constant that absorbed the year

Here is the actual code, unmodified, from the shipping app. Three rows. No release knobs. No per-model code path. The order field is the display order in the picker. The family field is the human-readable persona name. The substring is the only thing that matters at runtime.

Desktop/Sources/FloatingControlBar/ShortcutSettings.swift

When the Agent Client Protocol bridge advertises a new alias like claude-opus-4-7 in April, or claude-sonnet-4-6-20260203 in February, the matching logic at line 256 picks the first family whose substring appears in the ID. That alias is now routed into the right persona slot for the rest of the app's lifetime, until the bridge advertises something different. Fazm did not ship a Swift update for any of the seven Claude alias changes year to date.

How a new alias travels from Anthropic to the picker

The pipeline has three stops. Anthropic publishes the alias and the SDK announces it. The Node.js ACP subprocess pipes the announcement through the bridge as a models_available message. The Swift side substring-matches and rewrites the picker. No build, no release, no user action.

The path of a new Claude alias into the Fazm picker

The 3-line ACP message handler

The models_available case in the bridge is literally three lines. The bridge converts the array of dicts into a typed tuple list and hands it to the Swift settings store. Everything downstream is the substring match and a sorted, atomic swap into a published property.

Desktop/Sources/Chat/ACPBridge.swift

On the Swift side, ShortcutSettings.updateModels at line 251 walks the tuples, runs modelFamilyMap.first(where:) with the contains check, surfaces any bracket annotations like sonnet[1m] so context-variant aliases are distinguishable, sorts by family order, and replaces lastClaudeModels. Then recomputeAvailableModels() merges with the Codex and Gemini halves and the picker rerenders. No additional logic is needed for the SwiftUI picker to redraw because availableModels is a @Published property on an ObservableObject.

The year, month by month

The 2026 release wave is not random. There is a rhythm. Anthropic shipped two Claude families in the first half of the year. Google held Vertex GA for Gemini 3.1 Pro through the spring. OpenAI shipped GPT-5 Turbo and then the GPT-5.5 step in close succession. DeepSeek went from R2 to V4 Preview inside a single quarter. Open weights kept landing. Below is the picture from a shipping app's vantage point, where every one of these arrived as a runtime announcement, not a build trigger.

February: Claude 4.6 wave

Anthropic moved the Claude line in two steps: Sonnet 4.6 on the everyday tier, then Opus 4.6 as the GA ceiling. Fazm's picker rerendered both labels the moment the ACP bridge advertised them. No Swift release.

Gemini 3.1 Pro on Vertex GA

2M token context, 94.3% on GPQA Diamond, 77.1% on ARC-AGI-2. Picked up via the gemini_probe_result path with the un-versioned latest aliases.

March: cadence accelerates

OpenAI's GPT-5.4 step landed alongside continued Sonnet 4.6 iteration. The release window between frontier updates kept shrinking.

April: peak month

April was the densest single month of model updates in the year so far. Opus 4.7 (April 16), GPT-5 Turbo / GPT-5.5, DeepSeek V4 Preview, Llama 4 Maverick and Scout, Mistral Large 3, Meta Muse Spark, three Microsoft foundational models. Fazm shipped zero Swift updates that month for any of them.

Tokenizer change

Opus 4.7's updated tokenizer maps the same input to roughly 1.0 to 1.35x more tokens than 4.6. Pricing held at $5 in / $25 out per million.

May: billing reshuffle

Third-party agent tools moved to a separate credit meter on May 14, 2026. Claude Code's 5-hour limit doubled on May 6 across Pro, Max, Team, and seat-based Enterprise.

Late May to June: Opus 4.8, then Fable 5

Anthropic shipped Claude Opus 4.8 on May 27, 2026 (it took the top spot on the Artificial Analysis Intelligence Index), then the Mythos-class Claude Fable 5 on June 9, 2026. OpenAI's GPT-5.5 line, including Pro and Instant variants, kept iterating. Each new alias showed up in Fazm's picker on next launch. Still no Swift release for any of them.

February: Anthropic moves first

Sonnet 4.6 landed early in the month and Opus 4.6 followed in the same window. Both were straight family rolls: same substring, new version. The picker did not need a code change. The label string is built dynamically from the family name and any bracket annotation in the ID, so claude-sonnet-4-6-20260203 rendered as Fast (Sonnet, latest) the first time a user opened the bar after the bridge picked it up.

On the Gemini side, the picker held two whitelist entries (gemini-flash-latest, gemini-pro-latest) which auto-rolled to Gemini 3.1 Pro on Vertex GA without code changes. The whitelist exists because un-versioned aliases are not in the bridge's curated probe response but resolve correctly when passed to session/set_model. Tested against gemini-cli 0.42.0 on 2026-05-21.

March: cadence accelerates

OpenAI's GPT-5.4 step landed in the first half of March, picked up via the Codex side of the bridge through codex_probe_result. The Codex name is used verbatim from the adapter response, so the picker showed exactly what the provider advertised. Anthropic continued Sonnet 4.6 iteration with point-version updates that the substring match absorbed without any user-visible churn.

By late March, the median time between frontier-class announcements across providers had compressed to under three weeks. The decision shifted from "which model do we hard-pin against" to "how do we keep the surface area of the model picker constant while the underlying list shuffles every two weeks". The substring family map is one answer; the per-provider probe path is the other half.

April: the densest month

April was the heaviest month of the year so far. On April 16, Anthropic released Claude Opus 4.7 across the API, Bedrock, Vertex, and Foundry. The benchmark deltas published in the announcement: SWE-bench Verified from 80.8% to 87.6%, Terminal-Bench 2.0 from 65.4% to 69.4%, GPQA Diamond from 91.3% to 94.2%, Finance Agent from 60.7% to 64.4%. Pricing held at $5 input and $25 output per million tokens; the tokenizer changed (1.0 to 1.35x more tokens for the same input) and high-resolution image input went from 1568px / 1.15MP up to 2576px / 3.75MP.

GPT-5 Turbo shipped with native multimodal in the same window, then the GPT-5.5 step. DeepSeek V4 Preview dropped roughly 24 hours after one of the OpenAI announcements: trillion-parameter, open weights, built on Huawei Ascend chips, Flash variant priced at $0.14 per million input tokens. Meta pushed Llama 4 Maverick and Scout into mainstream open weights, and Meta Superintelligence Labs (under Alexandr Wang) shipped Muse Spark. Mistral announced Large 3 with EU data-residency positioning. Microsoft shipped three foundational models covering text, voice, and image.

From the Fazm side, every one of these arrived through the same path. The bridge advertised the model. The substring match (for Claude) or the verbatim display name (for Codex and Gemini) ran. The picker rerendered. Zero Swift releases shipped that month for model wave reasons. The releases that did ship in April were for unrelated bug fixes and feature work.

May: the billing reshuffle

May was quieter on model releases but louder on billing. On May 6, 2026 Anthropic permanently doubled the 5-hour limits on Pro, Max, Team, and seat-based Enterprise plans. On May 14, 2026 Axios reported that Anthropic moved third-party agent tool usage to a separate monthly credit meter, so a Claude Code wrapper invoked through certain OAuth paths spends out of a different pool than direct Claude.com usage. The two changes pull in opposite directions: more headroom on the main plan, less crossover credit from third-party tooling.

Neither change required a Fazm release. The picker still asks the bridge for the current model list. The wrapper still uses the user's own Claude Pro or Max account. The new credit meter is upstream of the picker, not inside it. If a user hits the new third-party limit, the bridge surfaces a model_entitlement_missing message (parsed at ACPBridge.swift:1747) with the downgraded-to model, and the app handles it the same way it would any other entitlement issue.

0 Swift releases

“The model list is empty in the binary. The provider announces it at runtime. The picker rerenders without a deploy.”

ShortcutSettings.swift:251 — updateModels(_:) on every models_available event

Watch a fresh alias travel through the system

Imagine Anthropic ships claude-sonnet-4-7-20260901 at 9 a.m. one morning. Here is what happens inside Fazm the next time the app launches, end to end.

T+0: launch

Fazm starts. The ACP subprocess spawns. The bridge connects to the Claude Agent SDK and sends session/new.

What is transferable

The 2026 release rhythm has changed the shape of what a shipping product needs to absorb. The lesson from a half year of bridge logs:

Do not put the model list in the binary. Read it at runtime from whatever upstream actually knows.
Match on family substrings, not exact aliases. The aliases churn every two to three weeks; the families churn once a year.
Surface bracket annotations like [1m] in the display label. Context-variant aliases are real and need to be distinguishable.
Use @Published (or your framework's equivalent) so the UI updates without an event bus.
Keep the per-provider probe paths separate. Claude, Codex, and Gemini have different conventions and the merged list looks cleaner if they stay split until the last step.
When pricing or entitlement changes (May 6 and May 14), let the upstream signal it. The picker does not need to know about credit meters.

The short version of all of this

2026 has shipped roughly a dozen major AI model updates so far. Anthropic moved Sonnet to 4.6 and Opus from 4.6 to 4.7. Google rolled Gemini 3.1 Pro to Vertex GA. OpenAI shipped GPT-5 Turbo and the 5.5 step. DeepSeek went from R2 to V4 Preview at sub-fifteen-cent pricing. Meta, Mistral, Microsoft, Alibaba, and Google open weights all moved. Fazm's picker absorbed every one through a 3-tuple constant and a 3-line ACP message handler. The architecture decouples shipping cadence from release cadence, which is the only honest way to ship against a wave that does not slow down.

claude-sonnet-4-6claude-opus-4-6claude-opus-4-7gemini-pro-latestgemini-flash-latestgpt-5-turbogpt-5.5deepseek-v4-previewllama-4-maverickllama-4-scoutmistral-large-3models_available

Want to see the picker rerender against a new alias live?

Book 20 minutes and we will screenshare a Fazm session with the ACP log tailing next to it. You can watch models_available fire and the picker update with no rebuild.

Questions

What are the major AI model releases in 2026, year to date?

Through May 26, 2026 the year has shipped roughly a dozen frontier-class updates. Anthropic released Sonnet 4.6 in February and Opus 4.6 the same month, then Opus 4.7 on April 16 with high-resolution image input and an updated tokenizer at the same $5 in / $25 out per million tokens. Google shipped Gemini 3.1 Pro with a 2M token context window on Vertex GA. OpenAI shipped GPT-5 Turbo with native multimodal in April and the GPT-5.5 step shortly after. DeepSeek went from R2 (AIME 92.7%) into V4 Preview, a trillion-parameter open-source build on Huawei Ascend chips priced at roughly $0.14 per million input tokens on the Flash variant. Meta pushed Llama 4 Maverick and Scout into mainstream open weights and Meta Superintelligence Labs (under Alexandr Wang) shipped Muse Spark. Mistral announced Large 3 with EU data-residency positioning. Microsoft shipped three foundational models covering text, voice, and image. Qwen 3 and Gemma 4 landed in the same window. The story is not the list; the story is the cadence.

What was the largest single update of 2026 so far?

Measured by jump on shared coding benchmarks, Claude Opus 4.7. SWE-bench Verified moved from 80.8% on Opus 4.6 to 87.6% on Opus 4.7, Terminal-Bench 2.0 from 65.4% to 69.4%, GPQA Diamond from 91.3% to 94.2%, and Finance Agent from 60.7% to 64.4%. Anthropic kept pricing identical to 4.6 at $5 input / $25 output per million tokens, though the new tokenizer maps the same input to roughly 1.0 to 1.35x more tokens. Measured by capability-class shift, DeepSeek V4 Preview is the more notable release: trillion-parameter, open weights, Ascend-trained, and priced under fifteen cents per million tokens. The two answer different questions.

Did anything ship in 2026 that broke prior model picker assumptions?

Two things. First, the move from Opus 4.6 to Opus 4.7 changed the tokenizer in place. Apps that displayed estimated token counts before the call need to recompute, and apps that pinned a context budget at exactly 200K had to widen their safety margin. Second, third-party agent tools moved to a separate monthly credit meter on May 14, 2026 (per Axios reporting on Anthropic's policy change), which means a Claude Code wrapper now spends out of a different pool than the user's plain Claude Pro or Max usage when invoked through certain OAuth paths. The 5-hour limit was permanently doubled on May 6, 2026 on Pro, Max, Team, and seat-based Enterprise plans, which softens the impact.

How does Fazm handle a new Claude model the day it ships?

It does not need to handle it. The model picker is not hardcoded. The whole routing rule is a three-tuple constant at Desktop/Sources/FloatingControlBar/ShortcutSettings.swift lines 193 to 197: ('haiku', 'Scary', 'Haiku', 0), ('sonnet', 'Fast', 'Sonnet', 1), ('opus', 'Smart', 'Opus', 2). Each entry is a substring the app searches for inside any incoming model ID coming from the Agent Client Protocol bridge. When Anthropic publishes a new alias like claude-sonnet-4-7-20260901, the substring match at line 256 sees 'sonnet', routes the model into the Fast persona, and the picker renders it as 'Fast (Sonnet, latest)'. Zero Swift changes, zero release.

What is the models_available message and how does it work?

models_available is an inbound protocol message parsed at Desktop/Sources/Chat/ACPBridge.swift lines 1743 to 1745. The Node.js ACP subprocess (the Claude Agent SDK and the codex-acp adapter) announces the current model list as an array of dicts. The Swift bridge converts that into tuples of (modelId, name, description?) and hands them to ShortcutSettings.updateModels at line 251, which runs the substring match, sorts by the family order, and atomically replaces availableModels. The published property triggers a SwiftUI rerender of the picker. The user sees the new label the next time they open the bar. The whole flow is data, not code.

Does the same trick work for non-Claude models?

Yes, separately. Codex models arrive through a parallel codex_probe_result path that calls updateCodexModels at line 306; the adapter's display name is used verbatim because Codex names already include the variant in parens (for example 'GPT-5.5 (high)'). Gemini models arrive via gemini_probe_result and call updateGeminiModels at line 283; that path uses a tighter whitelist (gemini-flash-latest and gemini-pro-latest) because the un-versioned aliases roll forward automatically when Google ships a new Pro or Flash generation. The merged availableModels list (Claude + Codex + Gemini) is recomputed on every update so the picker always reflects the freshest set across all three providers.

Why does any of this matter for a 2026 release roundup?

Because the 2026 release rhythm has moved past 'one model every few months' into 'a frontier-class update every two to three weeks across at least four providers'. Any product that hardcodes the picker, the pricing copy, or the prompt template against a specific model alias becomes a maintenance project. The pattern Fazm uses is the opposite of that: the model list is empty in the binary, the provider announces it at runtime, the labels are derived from substring matching the IDs against a tiny family map, and the picker UI rerenders without a deploy. The architecture decouples shipping cadence from release cadence.

What is the safest read on which model to default to today?

Default to a small flagship that is generally available across at least two clouds. For Anthropic that is Sonnet 4.6 (Bedrock, Vertex, Foundry, direct API). For Google, Gemini 3.1 Pro on Vertex GA. For OpenAI, GPT-5 Turbo. For open weights, Llama 4 Scout when latency matters and DeepSeek V4 Flash when cost matters. Save the heavy ceilings (Opus 4.7, Gemini 3.1 Pro at full context, GPT-5.5) for the turns that actually need them. Inside Fazm the default is Sonnet 4.6 on the Fast slot; users flip into Smart (Opus 4.7) per turn for the long-horizon work where the price difference earns its keep.

Where do I track the rest of 2026?

Anthropic, Google DeepMind, OpenAI, Meta, and DeepSeek each publish their own changelogs; aggregators like llm-stats.com/llm-updates and aiflashreport.com/model-releases collapse those into a single feed. Inside Fazm, every new model that the ACP bridge advertises automatically appears in the picker on next launch, so you do not have to track the list separately to use it. If you want to be paged the moment a new Anthropic alias appears in the bridge, watch the models_available log lines in /tmp/fazm-dev.log (dev) or /tmp/fazm.log (production).

0 entries in a static constant, every 2026 release routed

The constant that absorbed the year

How a new alias travels from Anthropic to the picker

The path of a new Claude alias into the Fazm picker

The 3-line ACP message handler

The year, month by month

February: Claude 4.6 wave

Gemini 3.1 Pro on Vertex GA

March: cadence accelerates

April: peak month

Tokenizer change

May: billing reshuffle

Late May to June: Opus 4.8, then Fable 5

February: Anthropic moves first

March: cadence accelerates

April: the densest month

May: the billing reshuffle

Watch a fresh alias travel through the system

T+0: launch

What is transferable

The short version of all of this

Want to see the picker rerender against a new alias live?

Questions

Comments (••)

Comments ()