Two-day roundup
Latest AI model releases, papers, and open source projects: June 16-17, 2026
The interesting thing about these two days is that the top release was not a frontier lab. It was open weights. A Chinese lab put an MIT-licensed model at the top of the open-weights leaderboard, hosted endpoints appeared within a day, and the price gap against the closed models stayed wide. Below is the verified list. After it, the part no other recap covers: the one client-side step that lets you actually drive any of these models in a persistent agent loop instead of a throwaway terminal.
Direct answer - verified June 19, 2026
On June 16, 2026, Z.ai released GLM-5.2 open weights under an MIT license. It became the leading open-weights model on the Artificial Analysis Intelligence Index v4.1 (score 51), with 753B parameters (40B active, Mixture of Experts) and a 1M-token context window. Documented by Simon Willison on June 17, 2026.
Authoritative source: simonwillison.net/2026/Jun/17/glm-52
GLM-5.2 by the numbers
The figures that made GLM-5.2 the story of the window. All four come from the same June 17 analysis.
What actually shipped, dated honestly
One genuine June 16-17 headline, one same-window infrastructure move, and one piece of context that recaps routinely mis-date. Every date is tied to a source so you can check it.
GLM-5.2 (open weights)
Z.aiOpen weights published under an MIT license. 753B parameters with 40B active (Mixture of Experts), a 1M-token context window (up from GLM-5.1's 200K), and the leading score on the Artificial Analysis Intelligence Index v4.1 (51). Ranks 2nd on the Code Arena WebDev leaderboard, behind only Claude Fable 5.
simonwillison.net, Jun 17GLM-5.2 hosted endpoints
OpenRouter providersMultiple providers stood up hosted GLM-5.2 within a day. Most charged about $1.40 per million input tokens and $4.40 per million output, versus GPT-5.5 at $5/$30 and Claude Opus 4.5-4.8 at $5/$25. The model uses ~43k output tokens per Intelligence Index task (up from 26k for GLM-5.1), so it is cheaper per token but more verbose per task.
simonwillison.net, Jun 17MiniMax M3 (context for the open-weights run)
MiniMaxEarlier in June, not June 16-17. Included because it set the bar GLM-5.2 was measured against: the first open-weight model to combine frontier coding, a 1M-token window, and native multimodality, topping open-weight SWE-Bench Pro at 59.0%. If you read a June recap that lumps M3 into the 16-17 window, it has the date wrong.
datanorth.aiTwo corrections worth making, because they were common in the recaps that week: MiniMax M3 shipped June 1, not June 16-17, and the Safetensors move into the PyTorch Foundation was announced April 8, 2026, not June. Neither belongs in a strict two-day window.
Open weights are only as good as the loop you run them in
GLM-5.2 is the third different model to hold the top open-weights spot in roughly six weeks. That cadence is the actual story. If every leaderboard shuffle means re-learning a CLI, losing your sessions on restart, or watching context auto-compact mid-task, the benchmark win disappears into friction before you feel it.
The part that survives a model swap is the harness: the agent loop, the session state, the ability to fork a conversation and keep the full history live. That is the lens I build Fazm through, so the honest question for a roundup like this is not which weights topped the chart on June 16. It is what you change when they do.
For an Anthropic-compatible model, the answer is one value. A hosted GLM-5.2 endpoint that speaks the Anthropic API (several OpenRouter providers and local gateways do) becomes a drop-in by setting the base URL the Claude Code agent talks to. Here is exactly how Fazm wires that, from the file that builds the bridge process environment.
Three details there are the uncopyable part of this page, because they come from the product and not from a press release. First, validCustomAPIEndpoint() rejects a scheme-less value like localhost:8766, so a malformed entry cannot silently brick built-in chat by landing in ANTHROPIC_BASE_URL. Second, Fazm replaces the bundled key with a placeholder, sk-fazm-custom-endpoint, so your subscription key never reaches the proxy serving the open model. Third, the swap touches only the environment of the Claude Code subprocess; the persistent session, the one-click fork, and the no-auto-compact behaviour are unchanged.
What happens when you point Fazm at an open-weights endpoint
You paste a base URL
Settings > Advanced > Custom API Endpoint. Any Anthropic-compatible host serving GLM-5.2 or another open model.
Fazm validates and writes env
validCustomAPIEndpoint() checks for an absolute http(s) URL with a host, then sets ANTHROPIC_BASE_URL and a placeholder key.
The Claude Code loop launches against it
Same agent loop, same tools, same MCP servers. Only the model endpoint changed.
Sessions persist and fork as before
Restart the Mac, the window restores with full history. Fork in one click. No auto-compacting.
Want to drive this week's open-weights model in a real agent loop?
Walk through pointing a persistent Claude Code session at a GLM-5.2 endpoint, sessions and forking included.
Questions people searched alongside this
Frequently asked questions
What was the biggest AI release on June 16-17, 2026?
GLM-5.2 from the Chinese lab Z.ai. On June 16, 2026 it published open weights under an MIT license and became the leading open-weights model on the Artificial Analysis Intelligence Index v4.1 with a score of 51. It is a 753B-parameter Mixture of Experts model with 40B active parameters and a 1M-token context window. Simon Willison documented the details on June 17, 2026.
Is GLM-5.2 really MIT licensed?
Yes. Per Simon Willison's June 17, 2026 write-up, Z.ai released the GLM-5.2 weights under an MIT license, which is unusually permissive for a model at the top of the open-weights leaderboard. MIT means you can run, fine-tune, and redistribute the weights with almost no restrictions, including commercially.
How much does GLM-5.2 cost to run versus the closed frontier models?
Via OpenRouter, most providers listed GLM-5.2 around $1.40 per million input tokens and $4.40 per million output. For comparison the same listing showed GPT-5.5 at $5/$30 and Claude Opus 4.5-4.8 at $5/$25 per million. One caveat: GLM-5.2 spends roughly 43k output tokens per Intelligence Index task versus 26k for GLM-5.1, so the per-task cost gap is smaller than the per-token gap.
Did MiniMax M3 release on June 16-17?
No. MiniMax M3 launched on June 1, 2026, earlier in the same month. It matters as context because it set the open-weight bar (1M context, native multimodality, 59.0% on open-weight SWE-Bench Pro) that GLM-5.2 was measured against, but it is not a June 16-17 release. Several roundups blur this; the dates are distinct.
How do I actually run an open-weights model like GLM-5.2 inside a real agent loop?
Point an Anthropic-compatible gateway at the model and give that gateway's URL to your agent client. In Fazm, Settings > Advanced > Custom API Endpoint writes the URL into the ANTHROPIC_BASE_URL environment variable for the Claude Code subprocess. Fazm validates the URL (it must be an absolute http or https URL with a host, so values like 'localhost:8766' are rejected) and swaps in a placeholder key, sk-fazm-custom-endpoint, so its bundled Anthropic key never reaches your proxy. The agent loop, session persistence, and forking stay exactly the same; only the model endpoint changes.
Why does a model-release roundup keep talking about the harness?
Because the model is the part that churns and the harness is the part you keep. GLM-5.2 is the third leading open-weights model in roughly six weeks. If switching to it means re-learning a CLI, losing your sessions on restart, or watching context auto-compact mid-task, the benchmark win evaporates in friction. The durable question is not which weights topped the chart this week, it is whether your agent loop survives the swap.
Where do I find AI releases like this as they happen?
Primary and first-tier sources beat aggregated recaps for accuracy: Simon Willison's blog for narrative analysis of new models, the Artificial Analysis Intelligence Index for leaderboard movement, Hugging Face's trending papers for research, and each lab's own release notes. Cross-check the date on any 'past 24 hours' style page, since they routinely fold older releases into a fresher-looking window.
Keep reading
AI model releases in 2026: the verified list so far
The first-half-of-2026 frontier timeline, and why the harness outlives every launch on it.
AI releases from the past 24 hours: where to look
How to track new models and papers without relying on mis-dated recaps.
Control Claude Code context compaction
The harness behaviour that survives a model swap: full chat history that does not auto-compact mid-task.
Comments (••)
Leave a comment to see what others are saying.Public and anonymous. No signup.