AI model releases and LLM launches, June 2026

June 2026 shipped a new model about every two days. Most pages about this month are a list of weights and benchmark lines that is wrong by the time you read it. This is the running list, plus the part that does not go stale: on a Mac, the agent harness you run a model inside is the thing you actually keep, and it decides whether any of these launches change your day.

Matthew Diakonov, Written with AI

Published June 16, 20267 min read

Direct answer · Verified June 16, 2026

The named June 2026 launches I can source cleanly: on June 2 Microsoft announced MAI-Thinking-1 and MAI-Code-1-Flash (both closed). On June 3 Google DeepMind released the open-weight Gemma 4 12B under Apache 2.0. On June 9 Anthropic released Claude Fable 5, its first publicly available Mythos-class model. The continuous open-weight and preprint stream kept moving on top of those, so the exact set on any given day is the live feed, not an article: Hugging Face new models and the llm-stats updates log.

The running list (the part that goes stale)

Four named launches in the first nine days, split down the open/closed line. The dates are each vendor's own announcement date. I am deliberately not padding this with every entry on the trackers, because the value of a hand-kept list is that each row is checked, and the value of a tracker is that it is complete. Use both for what each is good at.

Date	Model	What it is	Weights
June 2	MAI-Thinking-1 Microsoft	Reasoning (sparse MoE), private preview on Foundry	Closed
June 2	MAI-Code-1-Flash Microsoft	Coding model, rolling out inside GitHub Copilot	Closed
June 3	Gemma 4 12B Google DeepMind	Encoder-free multimodal, Apache 2.0, runs on a 16 GB laptop	Open
June 9	Claude Fable 5 Anthropic	Mythos-class frontier model, public access (briefly suspended after launch)	Closed

One thing this list already taught me: GPT-5.5 feels like a June model, but it shipped April 23, 2026. The version number is the only reliable timestamp. A roundup that gets the month wrong is worse than no roundup, so every row above is pinned to a vendor or first-hand source, and anything I could not pin sits in the live feeds, not here.

The model is the part that changes. The harness is the part you keep.

If a model launches every two days, then the model is not the durable thing in your setup. The durable thing is the loop you run it inside: what tools it can call, whether it reaches past the terminal, whether it keeps your context across a long session, and how fast you can point it at whatever shipped this morning. That loop is the harness, and it is the thing worth choosing carefully, because you will keep it while the model under it turns over a dozen times.

fazm is a harness, not a model. It wraps Claude Code and Codex through the Agent Client Protocol and runs that agent as a local subprocess. The model is an input to that subprocess, not something baked into the app. So when a new frontier model goes live on your provider, you do not wait for a fazm release to use it. The next agent process just carries the new id.

A launch flows into the loop, not into a new app build

How a June launch becomes reachable: no model id is compiled in

Here is the mechanism, in the actual source. When fazm spawns the agent subprocess, it builds the environment that process inherits. Two of those variables are the entire story of why a same-week model works without an update. The selected model is read from a setting and passed as FAZM_SELECTED_MODEL. And if you set a custom endpoint, it becomes ANTHROPIC_BASE_URL on that subprocess, after being validated as a real http(s) URL with a host so a typo cannot silently brick chat.

Desktop/Sources/Chat/ACPBridge.swift

Read what that buys you. The model is a string in a setting, not a constant in the binary, so the set of models you can run is whatever your endpoint currently serves, not whatever fazm shipped with. The custom endpoint path means an Anthropic-compatible gateway, a corporate proxy, GitHub Copilot, or a local server in front of an open weight like Gemma 4 12B is reachable the same way the default endpoint is. And there is a deliberate safety detail in that snippet: when you route through your own endpoint, fazm swaps its bundled key for a placeholder (sk-fazm-custom-endpoint) so it never hands its key to your proxy. The endpoint validator and this env build both live in ACPBridge.swift; the Codex backend is separate and carries its own per-chat model picker (GPT-5.5 variants at the current frontier) in CodexBackendManager.swift.

Pointing fazm at a model that launched this week

The flow is short because the model is just a setting. For a hosted launch on your provider, it is two steps. For a self-hosted open weight, you add a local gateway in front of it and point the custom endpoint there.

From launch to running it on your real work

1
Pick the backend
Claude Code by default, or flip on the Codex backend for a per-chat GPT model picker.
2
Select the model or set the endpoint
Choose the new model in the picker, or for a self-hosted open weight point Custom API Endpoint at your local Anthropic-compatible gateway.
3
Open a chat and drive real work
The next subprocess inherits the new model id and base URL. The agent reaches your browser and native apps, not just a terminal.
4
Judge it over days, not one prompt
Sessions survive a restart and nothing auto-compacts, so a multi-day trial does not drift because the harness dropped earlier turns.

Papers and open source projects move faster than any list

The model launches above are the named, dated events, but most of June 2026 lands as preprints and open source repos that never get a press release. A weight like Gemma 4 12B ships under Apache 2.0 with its model card and a report on the same day, and a dozen smaller open checkpoints land beside it. For that layer there is no honest static list, only feeds you read on the day. The three I actually keep open: the open-weight stream on Hugging Face sorted by creation date, the daily research picks on Hugging Face Papers (and the raw arXiv cs.AI recent listing behind it), and new tooling on GitHub Trending.

The same harness point applies to all three. A new open weight is reachable the moment you put a local Anthropic-compatible gateway in front of it and point the custom endpoint there. A paper that proposes a new prompting or tool pattern is something you test inside the agent loop on your real work, not in a notebook. And a trending repo is just another MCP server or CLI the agent can call. The releases turn over weekly; the loop you read these feeds into is what stays.

Why a playground tells you almost nothing

A model that looks sharp on a clean prompt in a vendor playground can fall apart the moment it carries your real files, your constraints, and the dead ends you already ruled out. The only honest test of a June release is to put it behind the same agent loop, with the same tools and the same running context, on the surface you actually work on. Two fazm properties make that test hold over the days it takes to form a real opinion: the agent loop reaches past the terminal through macOS accessibility APIs, so you evaluate the model on your full surface rather than a code-only sandbox, and nothing auto-compacts, so the comparison does not quietly drift because earlier turns got dropped. It runs locally and is open source, so what you feed a local model stays on your machine.

That is the case for treating the harness as the durable decision. June 2026 will be a footnote by August, and so will most of the models in the list above. The loop you run them inside is the thing that compounds.

Want to run a June launch through your own Mac workflow?

Book a short call and I will walk through selecting a fresh model, or pointing a custom endpoint at a self-hosted open weight, inside an agent loop that reaches your browser and native apps.

Questions about June 2026 AI model launches

What AI models launched in June 2026?

The named launches I can source cleanly: on June 2 Microsoft announced MAI-Thinking-1 (its first in-house reasoning model, a sparse mixture-of-experts in private preview on Foundry) and MAI-Code-1-Flash (a coding model rolling out inside GitHub Copilot), both closed weights. On June 3 Google DeepMind released the open-weight Gemma 4 12B under Apache 2.0. On June 9 Anthropic released Claude Fable 5, its first Mythos-class model for public use, which it then briefly suspended after a U.S. export directive. Open weights and preprints kept landing on top of those, at roughly a model every two days, so the only fully current view is the live feed, not a static list.

Did GPT-5.5 launch in June 2026?

No. GPT-5.5 launched on April 23, 2026, and the GPT-5.5 Instant default landed in early May. The OpenAI movement in June was the GPT-5.6 line, continuing the roughly six-week cadence. This is exactly the trap a dated roundup falls into: a model that feels like 'this month' is often last month, and the version number is the only thing that tells you. Pin the date to the vendor's own release note before you rely on it.

Is Gemma 4 12B open source, and Claude Fable 5?

Gemma 4 12B is open weights, publicly downloadable under Apache 2.0. Claude Fable 5 is closed: you reach it through Anthropic's API and apps, not a weights download. Microsoft's two June 2 MAI models are also closed (one in Foundry preview, one inside Copilot). So of the four named June launches here, one is something you can self-host and three are hosted services.

Why does fazm not need an update when a new model ships?

Because fazm does not compile a model id into the app. It wraps Claude Code and Codex through the Agent Client Protocol and launches that agent as a subprocess, then injects which model to run as an environment variable (FAZM_SELECTED_MODEL) at launch time. The model lives in a setting, not in the binary. A new frontier model from your provider is reachable the moment it is live on the endpoint, because the next subprocess just carries the new id.

Can I point fazm at a model that is not the default Anthropic endpoint?

Yes. There is a Custom API Endpoint setting (Settings, Advanced, AI Chat). Whatever Anthropic-compatible URL you put there is validated to be an http(s) URL with a real host, then set as ANTHROPIC_BASE_URL on the agent subprocess, and the bundled key is swapped for a placeholder so fazm never sends its key to your proxy. That covers routing through a corporate gateway, GitHub Copilot, or any Anthropic-compatible gateway. The Codex backend is separate and exposes its own per-chat model picker.

How do I actually try a June release inside a real Mac workflow?

Set it as the selected model (or, for a self-hosted open weight like Gemma 4 12B, point the custom endpoint at your local Anthropic-compatible gateway), open a chat, and drive your real work. In fazm the agent loop reaches past the terminal through macOS accessibility APIs, so you evaluate the model on the surface you actually use (browser, native Mac apps, Google Workspace), not a code-only sandbox. Sessions survive a restart and nothing auto-compacts, so a multi-day trial of a same-week release does not quietly drift because the harness dropped earlier turns.

Where is the always-current list of June 2026 releases?

A live tracker, not an article. The continuous open-weight stream lands on Hugging Face sorted by creation date, and the dated frontier launches are logged on the llm-stats updates feed. Any roundup, including this one, is a photograph of a moving thing. Use the feeds for the exact set on a given day and use a page like this for the part that does not change week to week.