Open Source AI Projects Releases: What Shipped in April 2026

Matthew Diakonov · 8 min read


April 2026 has been one of the densest months for open source AI releases in recent memory. Multiple frontier-class models dropped within days of each other, several under permissive licenses that let anyone fine-tune, deploy, and build commercial products without restrictions. If you stepped away for a week, you missed a lot.

Here is a breakdown of every major open source AI release this month, what each model is good at, and how they compare.

The Major Releases at a Glance

| Model | Organization | Parameters | License | Context Window | Standout Feature |
|---|---|---|---|---|---|
| Gemma 4 31B Dense | Google | 31B | Apache 2.0 | 128K | Fits on one H100, matches 600B+ models |
| Gemma 4 26B MoE | Google | 26B (MoE) | Apache 2.0 | 128K | Efficient mixture-of-experts architecture |
| GLM-5.1 | Zhipu AI | 744B (40B active) | MIT | 200K | Beat Claude Opus 4.6 on SWE-Bench Pro |
| Qwen3.6-Plus | Alibaba | Undisclosed | Apache 2.0 | 1M | Million-token context, strong agentic coding |
| DeepSeek-V3.2 | DeepSeek | Undisclosed | Open | 128K | Frontier reasoning with tool-use focus |
| MiniMax M2.7 | MiniMax | Undisclosed | Open | 128K | Self-evolving training, 3x faster inference |
| Kimi K2.5 | Moonshot AI | MoE | Open | 256K | Native multimodal and tool-use support |
| OLMo 2 32B | Allen AI | 32B | Apache 2.0 | 8K | Fully open: weights, data, training code, logs |

How These Models Stack Up

The competitive landscape shifted fast. Google's Gemma 4 family runs on a single 80GB H100 GPU while delivering benchmark scores that rival models 20 times larger. Zhipu AI's GLM-5.1, released under the MIT license, reportedly outperformed both Claude Opus 4.6 and GPT-5.4 on SWE-Bench Pro, a coding benchmark that tests real-world software engineering tasks. Alibaba's Qwen3.6-Plus pushed the context window to one million tokens while maintaining competitive agentic coding performance.

*Figure: Open source AI release timeline for April 2026, plotting each model's release date (Apr 1 to Apr 9) against total parameter count (0 to 800B). Releases shown: OLMo 2 (32B), Gemma 4 (31B Dense), Kimi K2.5 (MoE), GLM-5.1 (744B total, 40B active), Qwen3.6-Plus (1M context), and DeepSeek-V3.2.*

The chart above plots each release by date and total parameter count. GLM-5.1 stands out at 744 billion parameters, though it only activates 40 billion per forward pass thanks to its mixture-of-experts architecture. Most other releases cluster in the 26B to 32B range, reflecting a clear trend: smaller, more efficient models that punch well above their weight class.
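The cost advantage of mixture-of-experts is easy to put in numbers. The sketch below uses the GLM-5.1 figures from the table (744B total, 40B active) and the standard rule of thumb that a forward pass costs roughly 2 FLOPs per active parameter per token; the exact constant does not matter for the ratio.

```python
# Rough per-token compute for an MoE model vs. a dense model of the same
# total size. Parameter counts are from the table above; the
# 2-FLOPs-per-parameter figure is a standard approximation.

def forward_flops(active_params: float) -> float:
    """Approximate FLOPs per generated token: ~2 x active parameters."""
    return 2 * active_params

TOTAL_PARAMS = 744e9   # parameters stored in memory (all experts)
ACTIVE_PARAMS = 40e9   # parameters actually used per forward pass

ratio = forward_flops(TOTAL_PARAMS) / forward_flops(ACTIVE_PARAMS)
print(f"Per-token compute is {ratio:.1f}x lower than an equally "
      f"sized dense model")  # 18.6x
```

You still pay the full 744B in memory and bandwidth to hold the experts, which is why MoE models trade cheap compute for a large serving footprint.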

Gemma 4: Google's Apache 2.0 Bet

Google shipped Gemma 4 in four sizes: E2B, E4B, 26B MoE, and 31B Dense. The 31B Dense variant is the one getting the most attention. It scored 89.2% on AIME 2026 and 80.0% on LiveCodeBench v6, numbers that put it in the same tier as models with hundreds of billions of parameters.

The Apache 2.0 license means no usage restrictions. You can fine-tune it, deploy it commercially, modify it, and redistribute it without asking Google for permission. Since the original Gemma launch, developers have downloaded Gemma models over 400 million times and created more than 100,000 community variants on Hugging Face.

The 31B model fits on a single 80GB NVIDIA H100 GPU, which makes it practical for small teams and individual developers who do not have access to multi-node clusters.
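A quick back-of-the-envelope check shows why the single-GPU claim is plausible. Bytes-per-parameter for each precision are standard; treating the 80GB as available purely for weights (ignoring activations and KV cache) is a simplifying assumption.

```python
# Does a 31B-parameter model fit in one 80GB H100? Weights-only estimate;
# real deployments also need headroom for activations and the KV cache.

PARAMS = 31e9
H100_VRAM_GB = 80

for precision, bytes_per_param in [("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    weights_gb = PARAMS * bytes_per_param / 1e9
    verdict = "fits" if weights_gb < H100_VRAM_GB else "does not fit"
    print(f"{precision}: {weights_gb:.1f} GB of weights -> {verdict}")
# fp16/bf16: 62.0 GB -> fits (with limited headroom)
# int8:      31.0 GB -> fits
# int4:      15.5 GB -> fits
```

At bf16 the fit is tight, which is why most single-GPU deployments of models this size quantize to int8 or int4.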

GLM-5.1: MIT License, Frontier Performance

Zhipu AI released GLM-5.1 under the MIT license, the most permissive license of any frontier-scale model this month. At 744 billion total parameters with 40 billion active per forward pass and a 200K context window, it is one of the largest open source models available.

The headline claim: GLM-5.1 beat both Claude Opus 4.6 and GPT-5.4 on SWE-Bench Pro, a benchmark that tests whether a model can resolve real GitHub issues by generating correct patches. If that result holds up under independent evaluation, it marks the first time a fully open source model has topped the proprietary leaders on a production-grade coding benchmark.

Qwen3.6-Plus: The Million-Token Context Window

Alibaba's Qwen3.6-Plus pushes the context window to one million tokens under an Apache 2.0 license. A million tokens is roughly 750,000 words, enough to fit an entire codebase or book into a single prompt.
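The words estimate comes from the common heuristic of roughly 0.75 English words per token, which is where the 750,000 figure above comes from:

```python
# Token-to-word conversion using the ~0.75 words-per-token heuristic
# for English text (actual ratios vary by tokenizer and content).

TOKENS = 1_000_000
WORDS_PER_TOKEN = 0.75

words = int(TOKENS * WORDS_PER_TOKEN)
print(f"{TOKENS:,} tokens ~= {words:,} words")  # 1,000,000 tokens ~= 750,000 words
```

Code tokenizes less efficiently than prose, so a real codebase fills the window faster than the word count suggests.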

The model's agentic coding performance is competitive with Claude 4.5 Opus, and it includes enhanced multimodal capabilities for processing images alongside text. For teams building AI agents that need to reason over large codebases or lengthy documents, the context window alone makes it worth evaluating.

DeepSeek-V3.2 and MiniMax M2.7

DeepSeek-V3.2 focuses on combining frontier reasoning quality with improved efficiency for long-context and tool-use scenarios. It has become one of the best open source options for agentic workloads where the model needs to call tools, interpret results, and chain multiple actions together.

MiniMax M2.7 introduced something genuinely new: a "self-evolving" training approach where the model continuously improves through interaction rather than static training. It scored 56.22% on SWE-Bench Pro and matched or exceeded Claude and GPT-5 on coding and agent benchmarks while running 3x faster. The speed advantage matters for latency-sensitive applications like code completion and interactive agents.

What This Means for Developers

Three trends stand out from this month's releases:

Efficiency over scale. The models getting the most developer attention are not the largest. Gemma 4 31B fits on one GPU and competes with 600B+ parameter models. The era of "bigger is always better" is ending.

Permissive licensing wins. Apache 2.0 and MIT licenses dominate the most impactful releases. Google, Alibaba, Zhipu AI, and Allen AI all chose licenses that impose zero restrictions on commercial use. This makes it practical to build products on top of these models without legal risk.

Agentic capabilities are table stakes. Every major release this month emphasized tool-use, multi-step reasoning, and agentic workflows. Models that can only generate text are no longer competitive. The baseline expectation is that a model can plan, execute, verify, and iterate.

For teams evaluating which model to adopt, the choice depends on your constraints. If you need the most permissive license with strong coding performance, GLM-5.1 under MIT is hard to beat. If you need a model that runs on a single GPU, Gemma 4 31B is the clear pick. If you need maximum context length for processing large documents or codebases, Qwen3.6-Plus with its million-token window is the only option at this scale.

Running These Models Locally

One of the advantages of open source models is the ability to run them on your own hardware, keeping data private and eliminating API costs. Tools like Ollama, llama.cpp, and vLLM make it straightforward to serve these models locally.
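Both Ollama and vLLM can serve models behind an OpenAI-compatible `/v1/chat/completions` endpoint, so one small client works against either. A minimal sketch using only the standard library; the base URL (Ollama's default port 11434), and the model tag are assumptions to adjust for your setup:

```python
# Minimal client for a locally served model via an OpenAI-compatible API.
# Base URL and model tag below are assumptions: Ollama defaults to port
# 11434, vLLM's server to port 8000; "gemma-4-31b" is a hypothetical tag.
import json
import urllib.request

def build_payload(prompt: str, model: str = "gemma-4-31b") -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat(prompt: str, base_url: str = "http://localhost:11434/v1") -> str:
    """Send one chat turn to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the request shape is shared, swapping between local servers (or a hosted API) is a one-line change to `base_url` and the model tag.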

For macOS users, Fazm can automate the process of downloading, configuring, and running open source models on your machine. As an open source AI agent that runs locally, Fazm handles the setup so you can focus on using the model rather than debugging CUDA drivers and quantization configs.

Fazm is an open source macOS AI agent, available on GitHub.
