Forward look, verified May 26, 2026

Upcoming LLM releases for the rest of 2026: what is real, what is rumor

No frontier lab publishes a binding roadmap. Most posts that promise an upcoming-releases list quietly mix shipped models, vendor hints, and guesses under one heading. This page sorts every credible item into three tiers by source quality, names the signal where one exists, and ends with the structural reason a release calendar is the wrong abstraction for a working agent setup.

M
Matthew Diakonov
9 min read
Direct answer (verified May 26, 2026)

Four upcoming releases have any credible source. Grok 5 (xAI, late Q2 or Q3 2026; prediction markets give ~33 percent probability by June 30, 2026). Gemini 4 (Google, likely Q4 2026 or Q1 2027 based on yearly cadence plus Hassabis's January 2026 statement). Llama 4 Behemoth remains in training, no public date, and Meta's April 8, 2026 pivot to closed-weight Muse Spark makes a ship increasingly unlikely. The next Claude Opus revision (no pre-announcement, but 4.5, 4.6, 4.7 in five months implies another minor in the next two quarters). Everything else (named GPT-6 dates, Claude Mythos availability, Llama 5) is speculation.

Primary sources: Polymarket on Gemini 4, llm-stats.com live tracker, Meta's 2025 Llama 4 launch post.

Tier 1 / Confirmed by the vendor or a binding market

The vendor has acknowledged the release on the record, or a liquid prediction market is pricing the date with non-trivial volume. Dates are still soft, but the program itself is real.

  • Grok 5

    Late Q2 or Q3 2026
    xAI

    Musk publicly missed the original Q1 2026 target. xAI's official X account now points to Q2 2026. Polymarket pegs the probability of ship by June 30, 2026 at roughly 33 percent. Architecturally announced as a 6 trillion parameter Mixture-of-Experts build.

    Signal: Polymarket: Grok 5 by June 30

  • Gemini 4 (next major generation)

    Q4 2026 or Q1 2027
    Google DeepMind

    Demis Hassabis stated in January 2026 that the team is focused on Gemini 4 this year. No model named Gemini 4 has an API release as of May 19, 2026. The yearly cadence (Gemini 1.0 late 2023, 2.0 late 2024, 3.0 late 2025) puts the next major generation in the same window. Recent intermediate releases (Gemini 3.1 Pro, 3.5 Flash, Omni series) keep the cadence fed without naming the next number.

    Signal: Polymarket: Gemini 4.0 by June 30, 2026

Tier 2 / Rumored with a credibility signal

The vendor has not committed to a date. The release is plausible based on past cadence, a code-leak, an executive aside, or a public delay history. Treat the window as a hint, not a promise.

  • Claude Opus 4.8 (or next Opus tier)

    Mid to late 2026
    Anthropic

    Opus 4.7 shipped April 16, 2026, the third Opus point release of the year after 4.5 and 4.6. The cadence implies another minor revision before any major-number jump. Anthropic does not pre-announce. Treat dates with skepticism until the model id appears in the API.

  • Llama 4 Behemoth

    Indefinitely delayed
    Meta

    Previewed April 2025 as still in training. Multiple internal delays through 2025. On April 8, 2026 Meta launched the closed-weight Muse Spark under Meta Superintelligence Labs, a strategic pivot away from open weights. Behemoth as a shipped public model is now unlikely.

  • GPT-5.6 or GPT-6 line

    Second half of 2026
    OpenAI

    GPT-5.5 shipped April 23, 2026 with three variants. GPT-5.5 Instant followed May 5, 2026. The 5.x cadence suggests at least one more numbered minor before a GPT-6 jump. No OpenAI roadmap statement names a date or the next number.

  • DeepSeek V4 (general availability)

    Q3 2026
    DeepSeek

    V4 Preview shipped during the April wave as a trillion-parameter open-weights build trained on Huawei Ascend, priced near 14 cents per million input tokens on Flash. The general-availability variant has not been dated. Preview to GA typically runs one to two quarters.

Tier 3 / Pure speculation

Recurring in tech press without a binding source. Any specific date given here is invented. Included because they are the names people search for, not because there is any signal.

  • Claude Mythos

    Unknown
    Anthropic

    Reported as gated to a Project Glasswing cohort with no public availability timeline. Any specific 2026 launch date is invented.

  • GPT-6 with breakthrough reasoning

    Unknown
    OpenAI

    Recurring topic in tech press. No OpenAI statement names the version, the capability set, or the date. Treat as marketing chatter.

  • Llama 5

    Unknown
    Meta

    Meta has not announced a Llama 5 program. After the Behemoth delay and the Muse Spark pivot, the open-weight roadmap itself is unclear. Anyone giving a date for Llama 5 is guessing.

Why a release calendar is the wrong abstraction

The calendar approach assumes you do something different on the day a new model ships. In practice, the day-of action for almost every user is the same regardless of the model: point your agent at the new id and run your real task on both the new model and your current one. The forecast did not change the action.

The forecast does change two things. It tells you whether to budget for a tokenizer change (Opus 4.7 silently shifted token counts even at the same context window) and it tells you whether your tool is going to need a manual update before you can route the new id. The first is a one-off audit. The second is an architecture question that has nothing to do with the calendar at all.

When the model picker is built on runtime discovery, the release date becomes infrastructure trivia. The day claude-opus-4-8 lands in the Agent Client Protocol bridge response, the picker has it. There is nothing to forecast.

The anchor: three lines that absorb the next year of releases

The whole routing rule for Claude families in Fazm is a three-tuple constant. When Anthropic publishes any new model id that contains the substring "haiku", "sonnet", or "opus", the substring match at line 256 routes it into the correct persona slot and the picker re-renders on next launch with no rebuild. The file is Desktop/Sources/FloatingControlBar/ShortcutSettings.swift.

ShortcutSettings.swift

The OpenAI side runs in parallel. Codex models arrive through a codex_probe_result path that calls updateCodexModels (around line 306 of the same file) and uses the adapter's display name verbatim, so "GPT-5.5 (high)" ships through without rewriting. Gemini arrives via gemini_probe_result and uses the unversioned gemini-flash-latest and gemini-pro-latest aliases, which auto-roll forward when Google ships a new Pro or Flash generation. The merged available list across Claude, Codex, and Gemini is recomputed on every probe.

3 lines

Every Claude release for the rest of 2026, including ones the vendor has not announced, slots into the model picker through one substring match. No App Store update, no rebuild, no calendar.

Desktop/Sources/FloatingControlBar/ShortcutSettings.swift, lines 193 to 197 and 256

Quarter-by-quarter outlook

Treat the windows below as risk estimates, not promises. Every historical release has missed at least one announced target by a quarter. The preparation that matters is the architecture, not the calendar entry.

  1. 1

    Q2 2026 (now)

    Grok 5 is the only release with a credible window inside the next 30 days. Wire a runtime-discovery picker so the day xAI publishes the model id, your agent picks it up without a deploy.

  2. 2

    Q3 2026

    Expected GA window for DeepSeek V4. Possible next Opus point release. Possible Grok 5 if Q2 slips. The right preparation is a backend swap path (Anthropic-compatible gateway, Codex adapter, runtime model list), not a forecast.

  3. 3

    Q4 2026

    The likely window for Gemini 4 based on Google's yearly cadence and Hassabis's January 2026 statement. Possible GPT-6 if OpenAI's 5.x line lands a major number. Plan for new tokenizer assumptions (Opus 4.7 changed its tokenizer in place) and recompute any context budgets pinned to exact token counts.

  4. 4

    Q1 2027 carry-over

    Several Q4 windows realistically slip into Q1 2027. Gemini 4, Behemoth (if it ever ships), and any closed-beta releases that get teased at NeurIPS 2026 are likely to land here. Build for the slip, not the announcement.

What to actually do this week

  1. Stop saving release-roundup links. They go stale in days. Save the vendor changelog pages instead and put them in a feed reader.
  2. Audit anything in your stack that pins a model id by exact string. The Opus 4.7 tokenizer change broke estimated-token counters that assumed the prior count. The next minor will do the same to something else.
  3. Check whether your coding tool discovers models at runtime or hardcodes them. If hardcoded, every release means waiting on a wrapper update before you can compare on your real task.
  4. Pre-decide your evaluation task. The action on release day is running your real workload on both the new and the current model; you should already have the task ready to fork onto.

Wire your stack so release day stops mattering

A 15-minute walk through swapping backends per chat, runtime model discovery, and the fork-on-release workflow.

Frequently asked questions

What LLM models are actually coming out for the rest of 2026?

As of May 26, 2026, four upcoming releases have a credible source. Grok 5 from xAI is in active training with a Q2 or Q3 2026 window; prediction markets give it roughly a 33 percent probability of shipping by June 30, 2026. The next Gemini major generation (Gemini 4) is publicly acknowledged by Demis Hassabis as the team's 2026 focus, with a likely Q4 2026 or Q1 2027 window based on Google's yearly cadence. Llama 4 Behemoth remains in training with no public release date, and Meta has pivoted to the closed-weight Muse Spark line, which makes a Behemoth ship date increasingly unlikely. Anthropic does not pre-announce, but the Opus point-release cadence (4.5, 4.6, 4.7 within roughly five months) implies at least one more Opus revision before any major-number jump. Everything else (specific GPT-6 dates, Claude Mythos availability, Llama 5) is speculation.

Is there a public roadmap for any of the frontier labs?

No. Anthropic, OpenAI, Google DeepMind, xAI, and Meta have all declined to publish binding roadmaps for the rest of 2026. The closest things to public signal are: a Hassabis interview in January 2026 naming Gemini 4 as the year's focus, xAI's own posts on X acknowledging Grok 5 is in training, and OpenAI's blog and DevDay statements about ongoing 5.x releases. None of those name a specific calendar date that the vendor has committed to.

Why do release roundups disagree with each other?

Three reasons. First, several lists conflate already-shipped releases with upcoming releases by reusing the phrase 'new' loosely. Second, prediction markets and X leaks update faster than the articles that cite them, so a roundup dated last week may already be wrong. Third, vendors regularly miss their own targets (Grok 5 missed Q1 2026, Llama 4 Behemoth has missed every announced window since summer 2025), which means anything written before a target date is partly fiction by the day after.

How does Fazm handle a new Claude or Codex model the day it ships?

It does not need to handle it. The model picker is data, not code. Desktop/Sources/FloatingControlBar/ShortcutSettings.swift defines a three-tuple at lines 193 to 197 mapping the substrings 'haiku', 'sonnet', and 'opus' to the Scary, Fast, and Smart persona labels. When the Agent Client Protocol bridge announces an available model named claude-opus-4-8-20260901 (or claude-sonnet-5-0-anything), the substring match at line 256 routes it into the right slot and the picker re-renders 'Smart (Opus, latest)' the next time the user opens the floating bar. No App Store update. No rebuild. The OpenAI side uses a parallel codex_probe_result path with the same property: when GPT-5.6 ships, the Codex adapter exposes it and the merged available list picks it up.

Why does that matter if I already know a model is coming?

Because the gap between 'a model is coming' and 'a model is in your hands and you have run your real task on it' is where most of the workflow cost lives. If your tool hardcodes the model list, every release means waiting on a wrapper update before you can compare. If your tool discovers models at runtime, the release date becomes irrelevant infrastructure detail. You point the picker at the new model and run your task. The forecasting question collapses into a routing question.

Where should I track upcoming releases between roundups?

The vendor changelogs and the vendor blogs are the only sources that bind the vendor to anything. For Anthropic that is the model release notes page; for OpenAI the platform changelog; for Google DeepMind the Vertex AI and Google AI blog; for xAI the company's posts on X; for Meta the AI blog. Third-party trackers (llm-stats.com, llmgateway.io's timeline, llm-arena leaderboards) update fastest. Prediction markets (Polymarket, Manifold) are the cleanest aggregator of crowd belief on whether a rumored date will hold. Treat everything else as commentary.

Is there an open-weights frontier release expected in 2026?

Two credible candidates. DeepSeek V4 Preview is open weights and trillion-parameter; a general-availability variant is expected in Q3 2026 based on the typical preview-to-GA cadence. Qwen 3 and Gemma 4 already shipped within the April wave; their next minor revisions are likely before year end based on Alibaba and Google's release tempo. Llama 4 Behemoth was the headline open-weight candidate, but the Meta pivot to closed-weight Muse Spark on April 8, 2026 has made its arrival increasingly unlikely.

How did this page land for you?

React to reveal totals

Comments ()

Leave a comment to see what others are saying.

Public and anonymous. No signup.