Latest open source LLM releases, 2026
A running list of the open-weight models that have shipped this year, in release order, with the facts that age well: ship date, parameter count, context window, and license. Then the part every other list leaves out: the one Settings field that lets a native Mac agent point its engine at any of these weights and drive your real computer with it.
Direct answer, verified June 20, 2026
The most recent open-weight releases of 2026 are MiniMax M3 (June 1, frontier coding with a 1M-token context and native multimodality), followed by a mid-June pair, Kimi K2.7 Code from Moonshot AI and GLM-5.2 from Z.ai (week of June 12 to 13). Behind those sit DeepSeek V4, Kimi K2.6, Mistral Medium 3.5, Google Gemma 4, Qwen 3.5, and Llama 4 Scout. The full calendar, with licenses, is the table below. Sourced against vendor model cards on Hugging Face.
The 2026 open-weight calendar, newest first
Every roundup reorders these by benchmark, which goes stale in a week. This table is ordered by ship date and carries the two facts that do not move: parameter shape and license. Benchmark figures cited in the notes are vendor-reported unless stated otherwise.
| Model | Lab | Shipped | Parameters | Context | License |
|---|---|---|---|---|---|
| MiniMax M3First open-weight model pairing frontier coding, 1M context, and native multimodality. Vendor-reported 59.0% SWE-Bench Pro. Weights staged to Hugging Face after launch. | MiniMax | Jun 1, 2026 | Undisclosed (MoE) | 1M | Conditional (review terms) |
| GLM-5.2Successor to GLM-5.1 (vendor-reported 58.4% SWE-Bench Pro). Positioned for long-horizon agentic engineering. | Z.ai (Zhipu AI) | Jun 12-13, 2026 | MoE | Long-horizon agentic | Open weight |
| Kimi K2.7 CodeCoding-focused refresh of the K2 line shipped the same week as GLM-5.2. | Moonshot AI | Jun 12-13, 2026 | ~1T total (MoE) | Agentic coding | Modified MIT |
| DeepSeek V4Two MoE sizes, native 1M context, trained on 32T+ tokens. The clean-license frontier MoE of the spring. | DeepSeek | Apr 24, 2026 (preview) | V4-Pro 1.6T/49B active; V4-Flash 284B/13B active | 1M | MIT |
| Kimi K2.6Native multimodal agentic model. MIT-equivalent below 100M MAU / $20M monthly revenue, attribution clause above. | Moonshot AI | Apr 20, 2026 | 1T total / 32B active (MoE) | Agentic | Modified MIT |
| Mistral Medium 3.5Dense coder. Permissive for individuals and most companies; read the revenue clause before large-scale commercial deployment. | Mistral | Apr 29, 2026 | 128B dense | 256K | Modified MIT |
| Google Gemma 431B dense runs on a single H100; 26B MoE gives near-4B serving cost. Native function calling, 100+ languages. | Apr 2026 | 31B dense; 26B A4B MoE | 256K | Apache 2.0 | |
| Qwen 3.5Hybrid Gated Delta + sparse MoE with native vision-language. Qwen3-Coder is SOTA among open models on agentic coding. | Alibaba | Feb 2026 | Family incl. Coder-480B-A35B | Long | Apache 2.0 |
| Llama 4 ScoutIndustry-leading context window. The license is not OSI-approved and carries a monthly-active-user ceiling. | Meta | 2026 | 109B MoE / 17B active (16 experts) | 10M | Meta custom (700M MAU clause) |
Apache 2.0 and MIT are unconditional. Modified MIT licenses (Kimi K2 line, Mistral Medium 3.5) add a clause above a revenue or user threshold. Llama 4 Scout uses Meta's custom license, which is not OSI-approved.
The part the other lists skip: running one on your actual Mac
A new open-weight model only matters to a Mac user if something on the desktop can point its inference at the model and then act on the screen. That is a different problem from downloading weights. A terminal chat with the model is easy; getting an agent to read your accessibility tree, click a real button, and fill a real form using that model is the gap.
Fazm closes it with one field. In Settings there is a custom endpoint box (the placeholder reads your-proxy-host : 8766). Whatever you type there is handled by a single block in the chat bridge before the agent subprocess launches. Here is the actual code, not a paraphrase:
Three things in that block are worth slowing down on. First, the value is validated: validCustomAPIEndpoint (around lines 301 to 312 of the same file) requires a parseable URL with an http or https scheme and a non-empty host, so a typo like localhost:8766 is ignored rather than forwarded into a dead chat. Second, the valid value is assigned to env["ANTHROPIC_BASE_URL"] on the spawned process, which is the single switch that redirects the whole agent loop. Third, and this is the detail no model roundup mentions, Fazm replaces its bundled Anthropic key with the placeholder sk-fazm-custom-endpoint so your proxy never receives the real key. The moment you route your own model, Fazm stops trusting your endpoint with its credentials.
How an open-weight model becomes a Mac agent
The proxy in the middle is what turns any of the models above into something the bridge can talk to. It re-emits the Anthropic API surface and forwards to the actual inference server.
Inference path when a custom endpoint is set
What it looks like end to end
Stand up a gateway in front of your chosen weights, paste its URL into the Settings field, and the agent runs against your real desktop with that model behind it.
Nothing in that path is model-specific. Swap the LiteLLM target for Qwen 3.5, Gemma 4, or a Modified MIT model you have cleared for your use, and the Fazm side does not change. The bridge never special-cases a vendor, which is why the calendar at the top of this page stays usable from the desktop without waiting for an app update.
Two honest caveats before you pick a model
Launch benchmarks are not settled facts. MiniMax M3's day-one numbers were run on the vendor's own infrastructure with agent scaffolding, and at launch the weights were not yet on Hugging Face and the parameter count was undisclosed. Treat those, and the other mid-June figures from the Chinese frontier labs, as vendor-reported until independent reruns appear. The ship dates and licenses in the table are the durable part.
Where the weights run is a trust decision, not just a quality one. A model served from a third-party API in a jurisdiction with broad government-access laws is a different posture than the same open weights running on hardware you control. Routing Fazm through your own proxy keeps your prompts, screen context, and accessibility-tree reads inside infrastructure you own, which is the practical reason the custom endpoint exists.
Want a Mac agent that runs the open weights you pick
Walk through pointing Fazm at your own proxy and driving real macOS apps with an open-weight model, on a short call.
Frequently asked questions
What are the latest open source LLM releases in 2026?
As of June 20, 2026 the most recent open-weight drops are MiniMax M3 (June 1, the first open-weight model to combine frontier coding, a 1M-token context, and native multimodality, with a vendor-reported 59.0% on SWE-Bench Pro), and then a mid-June pair: Kimi K2.7 Code from Moonshot AI and GLM-5.2 from Z.ai, both shipped the week of June 12 to 13. Behind those, the spring of 2026 produced DeepSeek V4 (previewed April 24, MoE in 1.6T and 284B sizes, MIT), Kimi K2.6 (April 20, 1T total / 32B active, Modified MIT), Mistral Medium 3.5 (April 29, 128B dense, Modified MIT), Google Gemma 4 (April, Apache 2.0), Qwen 3.5 (February, Apache 2.0), and Llama 4 Scout (Meta custom license). Apache 2.0 and MIT dominate the permissive end of the list; the Chinese frontier labs lean on Modified MIT with a revenue or attribution clause. This list is verified against vendor model cards and updated as new weights land.
Which 2026 open-weight model has the cleanest license for commercial use?
DeepSeek V4 (MIT), Qwen 3.5 (Apache 2.0), and Google Gemma 4 (Apache 2.0) are the unconditional picks. MIT and Apache 2.0 carry no field-of-use, revenue, or monthly-active-user clauses, so you can run them inside a paid product without clearing terms. The Modified MIT licenses on the Kimi K2 line and Mistral Medium 3.5 are permissive for individuals and most companies but add a clause above a revenue or user threshold (Kimi requires a 'Kimi K2' UI attribution above 100M MAU or $20M monthly revenue). Llama 4 Scout uses Meta's custom license with a 700M MAU ceiling and is not OSI-approved. MiniMax M3's license has commercial conditions you should read before shipping. If you need a clean weights file for a small business workflow, start with the Apache 2.0 and MIT entries.
How do I actually run one of these open-weight models against my real Mac apps?
The weights only get you halfway. You also need a desktop agent that lets you redirect its inference target from the default Anthropic API to a proxy that re-emits requests as your chosen model. In Fazm that is a single Settings field, customApiEndpoint, backed by a block in Desktop/Sources/Chat/ACPBridge.swift (around lines 2417 to 2439). The block trims the value, validates that it is an absolute http or https URL with a host, sets env["ANTHROPIC_BASE_URL"] on the spawned agent subprocess, and swaps Fazm's bundled Anthropic key for a placeholder so your proxy never receives the real key. Point that field at a LiteLLM or vLLM gateway re-emitting DeepSeek V4, Qwen 3.5, Gemma 4, or any other model on this page, and the agent runs end to end against your real macOS accessibility tree. The code is visible in the open repo at github.com/m13v/fazm.
Why does Fazm validate the endpoint instead of just forwarding it?
Because a malformed value silently bricks chat. If you paste 'localhost:8766' (no scheme) or stray text into the field and the app forwards it raw, the Anthropic SDK throws 'Invalid URL' on every query, and the retry-with-resume path can swallow that into an empty turn so you see no error at all. The validCustomAPIEndpoint helper (ACPBridge.swift, around lines 301 to 312) guards against this: it requires a non-empty trimmed string, a parseable URL, an http or https scheme, and a non-empty host. Anything else is ignored and the app falls back to the default Anthropic endpoint with a log line, so a typo can never leave you with a dead chat window.
Are the headline June 2026 benchmark numbers independently verified?
Not all of them. MiniMax M3's launch figures (59.0% SWE-Bench Pro, 66.0% Terminal-Bench 2.1) were run on MiniMax's own infrastructure with agent scaffolding, and at launch the weights were not yet on Hugging Face and the parameter count was undisclosed, so treat those as vendor-reported until third-party reruns land. The same caution applies to GLM-5.1's 58.4% SWE-Bench Pro and other day-one numbers from the Chinese frontier labs. The ship dates and licenses on this page are the durable facts; the benchmark deltas move as independent evaluations come in.
Do I need to pick a different model per task, or set one and forget it?
Either works, because the override lives at the chat-engine layer, not in your model files. The customApiEndpoint field points the agent at whatever your proxy exposes, so the practical pattern is to run a gateway (LiteLLM is the most common) that routes by model name to several backends, then keep the Fazm endpoint pointed at the gateway. You change which weights answer by changing the model the gateway serves, not by reconfiguring the Mac app. For a small business workflow that mostly needs reliable tool calls over the accessibility tree, a clean Apache 2.0 model like Gemma 4 or Qwen 3.5 is a sensible default; reach for a 1M-context model like DeepSeek V4 when a single run has to hold a large document or a long browser session.
Is there a privacy difference when I route Fazm through my own proxy?
Yes, and it is the main reason to do it. When the custom endpoint is set, Fazm disables its bundled Anthropic key and substitutes a harmless placeholder (sk-fazm-custom-endpoint), so the real key never reaches your proxy and the request goes to the address you control rather than to Anthropic. If that proxy fronts a model you are hosting locally or in your own cloud, your prompts, your screen context, and your accessibility-tree reads stay inside infrastructure you own. That matters more for some 2026 releases than others: a model served from a third-party API in a jurisdiction with broad government-access laws is a different trust posture than the same open weights running on your own box.
Does this work with proprietary models too, or only open weights?
Any Anthropic-compatible endpoint works, which includes proxies in front of open weights, corporate gateways, and GitHub Copilot-style routes. The field is named for custom endpoints precisely because it does not care what is behind the URL, only that the URL speaks the Anthropic API surface. That is what makes the open-weight calendar on this page usable from a Mac: the bridge code never special-cases a vendor, so the day a new permissive model ships and someone stands up an Anthropic shim for it, you can point Fazm at it without an app update.
Where can I see the exact code that does the redirect?
It is open source. The endpoint forwarding lives in Desktop/Sources/Chat/ACPBridge.swift (the env["ANTHROPIC_BASE_URL"] assignment around lines 2417 to 2439), the validation helper is validCustomAPIEndpoint in the same file (around lines 301 to 312), and the Settings text field that writes the value is in Desktop/Sources/MainWindow/Pages/SettingsPage.swift with a placeholder showing the host and port shape of a local proxy (for example a host on port 8766). The whole repo is at github.com/m13v/fazm, so you can read the path a request actually takes before you trust it with your screen and keyboard.
Keep reading
Open source LLM releases in May 2026: the dated calendar
The May snapshot of the same calendar, with the late-April cluster (MiMo, Nemotron, Granite, Mistral) in full.
Upcoming LLM releases 2026: what is previewed but not shipped
The other side of this list: models labs have signaled but not yet published weights for.
AI model releases 2026: the broader proprietary and open picture
Open weights plus the proprietary frontier drops, in one running timeline.
Comments (••)
Leave a comment to see what others are saying.Public and anonymous. No signup.