AI model releases · updated 2026-05-30

The latest AI model releases, and how to run a new one the day it ships

Every tracker will tell you what dropped. None of them tell you the part that actually changes your afternoon: a frontier model just shipped, so how fast can you run it inside the agent you already use? This page does both. A dated snapshot and the live feeds up top, then the exact runtime path on a Mac.

Matthew Diakonov, Written with AI

Published May 30, 20267 min read

Direct answer · verified 2026-05-30

No static page can be a live 24-hour ticker. The set of releases changes faster than an article can be rewritten and re-indexed, so the only sources that stay correct are the live ones: llm-stats.com/llm-updates, a launch tracker like Digital Applied's release tracker, OpenAI's own model release notes, and the Google AI blog. As of 2026-05-30, the most recent confirmed frontier drops were Gemini 3.5 Flash (announced May 19 at Google I/O 2026), GPT-5.5 Instant (May 5, the new ChatGPT default), and Grok 4.3 (May 6).

Snapshot of the most recent frontier drops

A photograph, not a clock. Treat the dates as the floor: by the time you read this there may be newer checkpoints on the trackers above.

Model	Lab	Announced	Note
Gemini 3.5 Flash	Google	May 19, 2026	Announced at Google I/O 2026; API priced at $1.50 / $9 per 1M tokens, ~4x faster output than the prior Flash.
GPT-5.5 Instant	OpenAI	May 5, 2026	Set as the new default in ChatGPT.
Grok 4.3	xAI	May 6, 2026	Incremental frontier update from xAI.

The question the trackers skip

Reading that Gemini 3.5 Flash is 4x faster, or that GPT-5.5 Instant is the new default, is interesting for about thirty seconds. The thing that decides whether the release matters to you is mechanical: when a new model id exists, how many steps stand between you and running it inside the tool you already work in? For most coding agents the honest answer is "wait for the next app release," because the model names are baked into a dropdown.

I build fazm, a native macOS app that wraps Claude Code and Codex through the Agent Client Protocol, so I had to make a decision about this on the runtime side. The decision was to never hardcode the model list. Below is exactly how that plays out when a model lands.

Three levers that decide your day-one latency

The picker is rebuilt, not hardcoded

ShortcutSettings.updateModels() takes the model list the agent reports at session start and rebuilds the menu around it. The only hardcoded entries are three fallbacks (haiku, sonnet, claude-opus-4-8) used when the agent hasn't answered yet.

One key for the endpoint

customApiEndpoint becomes the ANTHROPIC_BASE_URL env var passed to the bridge. Point it at any Anthropic-compatible gateway and the day-one model is reachable.

Three backends, one menu

Claude Code, Codex (GPT), and Gemini are each probed and merged. A new checkpoint from any of the three shows up where you already pick models.

The path from lab announcement to running it in your window

A model lands at a lab

Gemini 3.5 Flash, GPT-5.5 Instant, a new Claude checkpoint. The lab updates its API; the live trackers update within hours.

The agent backend learns the name

The underlying CLI (Claude Code, Codex, or Gemini) recognises the new model id. In fazm that backend is a pinned acp-bridge dependency, so updating the bridge is what surfaces a new id, not a full app re-release.

fazm rebuilds the picker on next session

When you open a window, ShortcutSettings.updateModels() reads onModelsAvailable from the ACP SDK and rebuilds the menu. If your previous selection disappeared, it migrates you to the closest available model instead of erroring.

Or you force it via the endpoint

If the model is only live behind a gateway, set customApiEndpoint in Settings > Advanced > AI Chat. fazm validates the URL, sets ANTHROPIC_BASE_URL, and routes the session there. Bad URLs silently revert to the default.

You A/B it on a real task, same window

Because the session and its full history survive, you can run the new model against the same prompt the old one just handled and compare, without losing the thread.

The part you can verify in the source

This is the uncopyable bit, because it is a property of the code rather than a marketing line. fazm is open source, so you can open the files and check.

The model menu is assembled by ShortcutSettings.updateModels() in Desktop/Sources/FloatingControlBar/ShortcutSettings.swift. It reads the list the ACP SDK reports via onModelsAvailable after a new session is created and rebuilds the picker. The only hardcoded entries are three fallbacks used before the agent answers:

// ShortcutSettings.swift
static let defaultModels: [ModelOption] = [
  ModelOption(id: "haiku",           label: "Scary (Haiku, latest)"),
  ModelOption(id: "sonnet",          label: "Fast (Sonnet, latest)"),
  ModelOption(id: "claude-opus-4-8", label: "Smart (Opus, latest)"),
]

The endpoint override is a single setting. In SettingsPage.swift it is an @AppStorage("customApiEndpoint") string. When you set it, ACPBridge.swift validates the URL and exports it to the agent process as the standard ANTHROPIC_BASE_URL environment variable:

// ACPBridge.swift
if let raw = defaults.string(forKey: "customApiEndpoint")?
     .trimmingCharacters(in: .whitespacesAndNewlines), !raw.isEmpty,
   let endpoint = Self.validCustomAPIEndpoint(raw) {
  env["ANTHROPIC_BASE_URL"] = endpoint
}

And the three backends are pinned dependencies in acp-bridge/package.json: @agentclientprotocol/claude-agent-acp@0.29.2, @zed-industries/codex-acp@0.12.0, and @google/gemini-cli@^0.42.0. Bumping one of those is what surfaces a new model id from that lab, no full app re-release required.

Backends merged into one model picker

Claude Code via ACPCodex / GPT (codex-acp)Gemini CLICustom ANTHROPIC_BASE_URL gatewayBring your own Claude Pro or Max

When the new model is not the upgrade you wanted

A fair counterpoint: most days the latest release will not move your work much. For everyday coding and desktop automation, the loop around the model tends to matter more than the checkpoint. Whether your session survives a Mac restart, whether the context gets silently compacted, and whether you can fork a thread cleanly all change your throughput more than swapping one frontier model for another. That is the actual reason to care about runtime mechanics: not so you can chase every release, but so that when a release genuinely helps, there is zero friction to adopting it, and when it does not, you have lost nothing.

Want the day-one model path on your own Mac?

Walk through how fazm resolves models, swaps backends, and routes custom endpoints, on a real release.

Questions people actually ask

What AI models were released in the last 24 hours?

The set is different by the time you read this, so the only honest answer is a pointer to a live source. The trackers that stay correct are llm-stats.com/llm-updates, OpenAI's model release notes, and the Google AI blog. As of May 30, 2026, the most recent confirmed frontier drops were Gemini 3.5 Flash (announced May 19 at Google I/O 2026), GPT-5.5 Instant (May 5, the new ChatGPT default), and Grok 4.3 (May 6). Anything hand-typed past that window is a guess.

Why can't a regular article tell me what dropped in the last 24 hours?

Because frontier labs and open-weight families push checkpoints faster than a static page can be rewritten and re-indexed. By the time an article is written and crawled, several more releases have landed. A page can give you a dated snapshot and the live feeds to watch, but the snapshot is a photograph, not a clock.

A new model just shipped. How fast can I actually use it in a coding agent?

That depends entirely on how your tool resolves the model list. If the model names are hardcoded in a dropdown, you wait for an app update. In fazm the picker is rebuilt from whatever the underlying agent reports at session start (ShortcutSettings.updateModels), so a model the Claude Code or Codex backend already knows about shows up without a new release of the app.

Can I point fazm at a brand-new model behind a custom gateway or proxy?

Yes. The setting is a single UserDefaults key, customApiEndpoint, exposed in Settings > Advanced > AI Chat. When set, fazm passes it to the agent bridge as the ANTHROPIC_BASE_URL environment variable (ACPBridge.swift), so any Anthropic-compatible endpoint, corporate proxy, or gateway is routed through transparently. Invalid URLs fall back to the default endpoint instead of breaking the session.

Does fazm only run Claude, or can I swap to a model from another lab when it lands?

It runs three backends. The acp-bridge pins @agentclientprotocol/claude-agent-acp (Claude Code), @zed-industries/codex-acp (Codex / GPT models), and @google/gemini-cli (Gemini). Each is probed independently and its models are merged into one picker, so when a new GPT or Gemini checkpoint ships you swap the backend on that window rather than switching apps.

Is a brand-new frontier model going to make my work meaningfully faster?

Usually less than the launch post implies. For everyday coding and desktop automation, the loop around the model (whether your session survives a restart, whether context is silently compacted, whether you can fork a thread) moves throughput more than swapping one frontier checkpoint for another. Treat a release as a candidate to A/B test on a real task, not a guaranteed upgrade.

The runtime concerns that outlast any single model release

Keep reading

Sessions

Keeping a session alive across long runs

Why a session that survives a restart matters more than the model on launch day.

Read

Context

Claude Code auto-compacting and token waste

What silent context compaction costs you, and how fazm keeps the full history live.

Read

Endpoints

Pointing the agent at a custom ANTHROPIC_BASE_URL

How the base-url override works when you route through a gateway or mock endpoint.

Read

Snapshot of the most recent frontier drops

The question the trackers skip

Three levers that decide your day-one latency

The picker is rebuilt, not hardcoded

One key for the endpoint

Three backends, one menu

The path from lab announcement to running it in your window

A model lands at a lab

The agent backend learns the name

fazm rebuilds the picker on next session

Or you force it via the endpoint

You A/B it on a real task, same window

The part you can verify in the source

When the new model is not the upgrade you wanted

Want the day-one model path on your own Mac?

Questions people actually ask

Keep reading

Keeping a session alive across long runs

Claude Code auto-compacting and token waste

Pointing the agent at a custom ANTHROPIC_BASE_URL

Comments (••)

Comments ()