Open-weight release ledger, verified May 26, 2026

The open-source AI model release for May 2026 that everyone is searching for did not happen

The month's headline model is closed. The open weights you can actually download shipped in April. This is the verified ledger of what is real, plus the single setting that points a real Claude Code agent at any of them.

M
Matthew Diakonov
7 min read

Direct answer · verified May 26, 2026

As of May 26, 2026, no major new open-weight frontier model shipped during May. May's headline release, Alibaba's Qwen3.7-Max (announced May 20 at the Apsara Summit), is closed-weight and API-only. The downloadable open-weight flagships all landed in April 2026: DeepSeek V4 (Apr 24, MIT), Kimi K2.6 (Apr 20, Modified MIT), Mistral Medium 3.5 (Apr 29), GLM-5.1 (Apr 7, MIT), plus Qwen 3.6 and Gemma 4.

Source checked: llm-stats.com release tracker. Open-weight availability confirmed against the official model pages for Kimi K2.6 and GLM-5.1.

The ledger: open vs closed, May and April 2026

One column does all the work here, the "open weights" one. A model with downloadable weights is one you can run yourself; a closed model is rentable but not runnable. Sorted newest first.

ModelMakerReleasedOpen weights?License
Qwen3.7-MaxAlibabaMay 20, 2026No ×Proprietary, API-only
Gemini 3.5 FlashGoogleMay 2026No ×Proprietary
Grok 4.3xAIMay 2026No ×Proprietary
GPT-5.5 InstantOpenAIMay 2026No ×Proprietary
Mistral Medium 3.5MistralApr 29, 2026Yes ✓Modified MIT
DeepSeek V4 (Pro / Flash)DeepSeekApr 24, 2026Yes ✓MIT
Kimi K2.6Moonshot AIApr 20, 2026Yes ✓Modified MIT
GLM-5.1Z.aiApr 7, 2026Yes ✓MIT

Teal rows are downloadable open weights. Dates and open-weight status cross-checked against the llm-stats tracker and each model's official page on May 26, 2026.

Why the "May" search keeps returning April models

The pattern is consistent across the popular roundups: they pin a fresh month to the heading and then list the same April releases underneath. DeepSeek V4, Kimi K2.6, GLM-5.1, Qwen 3.6, and Gemma 4 are all real and all current, but their weights dropped in April. May's genuinely new headline, Qwen3.7-Max, is the one most lists quietly drop from the download column, because there is nothing to download. It reaches a 1M-token context and ships through DashScope and Model Studio, but no public weights existed as of May 25, 2026.

That is the honest shape of the month: the frontier moved, but it moved behind an API. If your goal is to run a model on your own hardware, your shortlist is the April set, not the May headline.

0

token context, DeepSeek V4 (MIT, Apr 24)

0

token context, Kimi K2.6 (1T MoE, ~32B active)

0B

parameters, GLM-5.1 MoE (~40B active, MIT)

From release note to working agent: one setting

Picking a model is the easy part. The harder question, the one the release roundups skip, is how to drive an open weight as an actual coding agent without losing the harness you already like. Fazm is a native macOS app that wraps Claude Code over the Agent Client Protocol. Its Custom API Endpoint field is the seam where an open weight plugs in. Here is the exact path.

Routing an open weight through the Claude Code agent loop

1

Serve the weights behind an Anthropic-format endpoint

Load a downloadable model (GLM-5.1, a DeepSeek V4 build, Kimi K2.6) in a local server such as LM Studio or Ollama, and front it with a shim that speaks the Anthropic messages API. The wire format is the requirement, not the vendor.

2

Open Settings, Advanced, AI Chat

In Fazm, the Custom API Endpoint card lives under Settings then Advanced then AI Chat. The toggle reveals a single text field whose placeholder shows the shape of an endpoint URL (host plus port, like your-proxy:8766).

3

Paste your server URL

Whatever you type is stored under the customApiEndpoint key and injected as ANTHROPIC_BASE_URL when Fazm launches the agent bridge. Submitting the field restarts the bridge so the change takes effect immediately.

4

Run the same agent, now on your weights

Persistent sessions, one-click forking, no auto-compacting, and the Mac-wide reach into the browser and native apps all stay. Only the model behind the loop changed. Clear the field to drop back to the default Anthropic API.

point-fazm-at-an-open-weight.txt

The one honest constraint

Fazm's own help text says it directly: the endpoint must speak the Anthropic API format, and a raw Gemini or OpenAI key will not work there. So an open weight needs to be served through an Anthropic-compatible messages endpoint, not just any inference server. If the server has no model loaded, the bridge detects the upstream error (for example LM Studio's "no models loaded" message) and rewrites it into an instruction to load one, rather than failing silently or blaming the app.

Why the harness outlasts the model of the month

There will be another headline model next month, and the month after. Qwen 3.7 open weights may arrive in summer; a new DeepSeek point release is always plausible. If your agent is welded to one provider, every one of those releases is a migration. If your agent loop is decoupled from the model, each release is a single line in a settings field.

That decoupling is the whole reason the Custom API Endpoint exists. The part that should stay stable, your sessions surviving a restart, clean forking, full context with no compacting, the agent reaching past the terminal into your browser and Mac apps, lives in the app. The part that changes weekly, which model is best this week, lives behind a URL you control. Fazm is fully open source, so you can read exactly how that handoff works.

Want to run an open weight as your daily agent?

Walk through pointing Fazm at a local or proxied open-weight model, and what the Anthropic-format requirement means for your setup.

Frequently asked questions

Which open-source AI model was released in May 2026?

As of May 26, 2026, no major new open-weight frontier model shipped during May. The headline May release, Alibaba's Qwen3.7-Max (announced May 20 at the Apsara Summit), is closed-weight: it is API-only through Alibaba Cloud's DashScope and Model Studio, with no Hugging Face weights as of May 25. The other notable May releases, Google's Gemini 3.5 Flash, xAI's Grok 4.3, and OpenAI's GPT-5.5 Instant, are all proprietary. The open-weight flagships you can actually download right now shipped in April 2026.

So what are the latest downloadable open-weight models?

The April 2026 window is where the open weights are. DeepSeek V4 (Pro and Flash) landed on April 24 under an MIT license with a 1M-token context. Moonshot AI's Kimi K2.6 shipped April 20, a 1-trillion-parameter Mixture-of-Experts model (about 32B active per token) with a 262,144-token context under a Modified MIT license. Mistral Medium 3.5 arrived April 29, a 128B dense multimodal model with a 256K context under a modified-MIT open-weights license. Z.ai's GLM-5.1 (April 7) is a 744B MoE (~40B active, 200K context) under a standard MIT license. Qwen 3.6 and Gemma 4 also fall in the same window.

Why do so many guides say there were big open-source releases in May 2026?

Most 'best open-source LLM, May 2026' roundups are relisting April releases under a fresh month heading. The models they rank (DeepSeek V4, Kimi K2.6, GLM-5.1, Qwen 3.6, Gemma 4) are real and current, but their weights dropped in April. The genuinely new May headline, Qwen3.7-Max, is the one most lists quietly leave out of the 'download and run' column because it has no public weights.

Can I run a May/April open-weight model as my coding agent on a Mac?

Yes, if the model is served behind an Anthropic-API-compatible endpoint. Fazm wraps Claude Code via the Agent Client Protocol, and it exposes a Custom API Endpoint setting (Settings, then Advanced, then AI Chat). Whatever URL you put there is injected as the ANTHROPIC_BASE_URL environment variable for the agent bridge, so the same agent loop talks to your local server (for example LM Studio or Ollama with an Anthropic-compatible shim) instead of Anthropic's cloud. The model serving the open weights does the thinking; Fazm keeps the persistent sessions, forking, and Mac-wide reach.

Will any open-weight model work, or are there limits?

The hard requirement is the wire format. Fazm's own help text says it plainly: the endpoint must speak the Anthropic API format, and a raw Gemini or OpenAI key will not work there. So you need a server that exposes the open weights through an Anthropic-compatible API (the messages endpoint), not just any inference server. If your local server has no model loaded, Fazm detects the upstream error and tells you to load one rather than blaming itself.

Is Qwen3.7-Max going to get open weights later?

Alibaba has not committed to it. As of late May 2026 Qwen3.7-Max and the smaller Qwen3.7-Plus are closed and reachable only through DashScope, Model Studio, and third-party aggregators. Smaller open-weight Qwen 3.7 variants are speculated for the June to July window based on the Qwen 3.6 release cadence, but that is not confirmed by Alibaba. If you want open weights from the Qwen family today, Qwen 3.6 is the current downloadable flagship.

Do the licenses let me use these models commercially?

It varies by model, so read the card. GLM-5.1 and DeepSeek V4 ship under MIT with no commercial-use restriction. Kimi K2.6 uses a Modified MIT license that behaves like standard MIT below large usage thresholds (roughly 100M monthly active users or $20M monthly revenue), above which you must display attribution. Mistral Medium 3.5's modified-MIT terms carry tighter commercial limits than plain MIT. Always confirm against the official model card before shipping.

How did this page land for you?

React to reveal totals

Comments ()

Leave a comment to see what others are saying.

Public and anonymous. No signup.