APRIL 2026 / OPEN WEIGHTS + REAL DESKTOP AGENT

Open source LLM news in 2026, and the one-line switch that makes any of it useful on your Mac

Every roundup lists the same names. GLM-5.1, DeepSeek-R1, Qwen3-235B, Kimi-Dev-72B, Llama 4, Arcee. What none of them tell you is how to actually run these models inside a desktop agent that drives your real apps. This page covers the releases, then walks through the exact lines of code in an MIT-licensed Mac agent that let a user swap the engine with one toggle.

Matthew Diakonov
10 min read
Written from the Fazm source tree
GLM-5.1, DeepSeek-R1, Qwen3, Kimi, Llama 4, Arcee
MIT-licensed Mac agent
ANTHROPIC_BASE_URL swap, one toggle
Accessibility tree, not screenshots
Cited file paths and line numbers

THE SHORTLIST

The six names everyone cites

One line each, drawn from the April 2026 coverage. No rank order. All open weights. All usable today through at least one hosted inference provider.

1

GLM-5.1 (Zhipu AI)

The 2026 flagship for agentic coding. Positioned by Zhipu for long-horizon software engineering; leads open-weight models on SWE-bench Pro and Terminal Bench per the 2026 comparative writeups. If your use case is 'write and edit code across many files over many turns,' GLM-5.1 is the open model to try first.

2

DeepSeek-R1 (DeepSeek)

Still the reference open reasoning model. Most 2026 guides put it at or near the top of their general-purpose list. Strong on math, complex reasoning chains, and structured outputs. Practical constraint: the very top checkpoints need serious GPUs, but hosted inference is cheap and widely available.

3

Qwen3-235B-A22B (Alibaba)

A 22B active / 235B total MoE. Best fit when you need reasoning plus efficient multilingual dialogue. Widely cited in 2026 as the balanced 'default' open model for apps that span English + other languages.

4

Kimi-Dev-72B (Moonshot)

Coding-specific. If GLM-5.1 is the generalist agentic coder, Kimi-Dev-72B is the size-optimized one, often the pick for teams that want to self-host and cannot justify a 200B+ deployment.

5

Llama 4 Maverick (Meta)

400B parameters, open. Meta is also reportedly preparing two proprietary frontier models codenamed Avocado (LLM) and Mango (multimedia). Relevant because it keeps pressure on the open curve without any one vendor owning the ceiling.

6

Arcee (Arcee AI)

A 26-person U.S. startup that shipped a 400B-parameter open model on roughly a $20M budget per the April 2026 TechCrunch piece. Their newer reasoning-focused release is the one to watch. The business-model story is as interesting as the model.

THE GAP IN THE COVERAGE

Every roundup stops at the model. This one goes one step further.

The question nobody answers in any of the April 2026 open-source LLM lists is simple. If I install one of these models, how do I get it to actually click, type, and operate a real Mac app?

The answer has nothing to do with the model and everything to do with the substrate between the model and your operating system. That substrate needs to read app state as structured data, expose a tool-calling surface, and be swappable at the model layer. Almost nothing in the consumer space does all three.

Fazm is the exception I know about. It is open source under MIT (github.com/mediar-ai/fazm), reads the macOS accessibility tree directly, and its model endpoint is a single field in Settings that maps cleanly onto ANTHROPIC_BASE_URL. The rest of this page is that second half of the story.

1 env var

The question is not which open model is best in 2026. It is: which one can your agent actually run without a fork.

Fazm acp-bridge environment setup

THE ANCHOR

The exact three lines in the Fazm source tree that make this work

No marketing, no recompile. The entire model swap lives in ACPBridge.swift at lines 379 to 382, which read a customApiEndpoint UserDefault and set it as ANTHROPIC_BASE_URL on the Node bridge process. That is it.

Desktop/Sources/Chat/ACPBridge.swift

The default the bridge falls back to when nothing is configured is a single constant. You can read it directly:

acp-bridge/src/index.ts
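Only the snippet's caption survives in this text, so here is a hedged sketch of the fallback behavior it describes. DEFAULT_MODEL is the constant cited elsewhere on this page; the helper names and the Anthropic default URL are illustrative assumptions, not Fazm's actual code.

```typescript
// Sketch of the bridge's fallback behavior. DEFAULT_MODEL matches the
// constant cited from acp-bridge/src/index.ts; resolveBaseUrl,
// resolveModel, and DEFAULT_BASE_URL are illustrative, not Fazm's names.
const DEFAULT_MODEL = "claude-sonnet-4-6";
const DEFAULT_BASE_URL = "https://api.anthropic.com"; // assumed default

// When the Swift app has a non-empty customApiEndpoint, it exports
// ANTHROPIC_BASE_URL into the Node process's environment before spawn.
export function resolveBaseUrl(env: Record<string, string | undefined>): string {
  const custom = env.ANTHROPIC_BASE_URL?.trim();
  return custom && custom.length > 0 ? custom : DEFAULT_BASE_URL;
}

// With no model requested, the bridge warms up its single default.
export function resolveModel(requested?: string): string {
  return requested ?? DEFAULT_MODEL;
}
```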

And this is the Settings UI the user actually touches, with the placeholder string preserved from the Swift source.

Desktop/Sources/MainWindow/Pages/SettingsPage.swift

How the hop actually lands

From your keystroke in Settings to DeepSeek or GLM or Qwen answering, the path is short.

Fazm with a user-chosen open model behind a proxy

DeepSeek-R1
GLM-5.1
Qwen3-235B
Kimi-Dev-72B
Llama 4 Maverick
Anthropic-compat proxy
Fazm ACP bridge
Swift Mac agent
Your Mac apps

The hub in the middle is the piece you bring. The two halves on either side already ship in Fazm. The open source LLM news in 2026 is really a story about how many good options now exist on the left.

What it looks like end to end

A walkthrough of setting a custom endpoint, restarting the bridge, and confirming the model picker stays intact. This is what the app actually does, not a mockup.

fazm + custom endpoint setup

Why the model swap on its own is not the hard problem

The hard problem is giving whatever model you pick something useful to work with. A bigger context window does not fix a screenshot-only input. A better reasoning model does not fix a browser-only surface. The 2026 open source LLM conversation is downstream of substrate choices.

Reads structured state, not pixels

Fazm creates an AX element for the frontmost app with AXUIElementCreateApplication, then reads kAXFocusedWindowAttribute. Any model you pick gets a typed tree, not OCR of a screenshot. See AppState.swift lines 439 to 441.
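To make "typed tree, not pixels" concrete, here is a hedged sketch of the kind of structure an accessibility read produces once serialized for a model. The field names and the sample window are illustrative assumptions, not Fazm's actual schema.

```typescript
// Illustrative shape of a serialized accessibility node. Field names
// are assumptions for this sketch, not Fazm's wire format.
interface AXNode {
  role: string;     // e.g. "AXButton", "AXTextField"
  title?: string;   // human-readable label
  value?: string;   // current contents, e.g. a text field's text
  children: AXNode[];
}

// A hypothetical focused compose window might serialize roughly so:
const focusedWindow: AXNode = {
  role: "AXWindow",
  title: "New Message",
  children: [
    { role: "AXTextField", title: "To:", value: "", children: [] },
    { role: "AXButton", title: "Send", children: [] },
  ],
};

// The model can address elements by role and title instead of pixels.
export function findByTitle(node: AXNode, title: string): AXNode | undefined {
  if (node.title === title) return node;
  for (const child of node.children) {
    const hit = findByTitle(child, title);
    if (hit) return hit;
  }
  return undefined;
}
```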

Works with any Mac app, not just the browser

Native MCP tools: macos-use for AX-compliant apps (Finder, Mail, Settings), whatsapp-mcp for Catalyst WhatsApp, Playwright only for the browser. Your open model is not limited to the DOM.

Consumer app, not a developer framework

One toggle in Settings. No .env editing, no recompile, no fork. That is the part everyone skips over when they compare open source LLMs.

MIT, fully inspectable

github.com/mediar-ai/fazm. Every file path on this page points at real source. Clone, read, modify. The proxy on the other side is the only thing you add.

Open-weight pathway vs the default Fazm setup

Feature | Default Fazm | Open-weight proxy
Model family | Claude (Sonnet 4.6 by default, Opus selectable) | DeepSeek-R1, GLM-5.1, Qwen3, Kimi, Llama 4, etc.
Configured via | Model picker in the floating bar | Settings > Advanced > AI Chat > Custom API Endpoint
Underlying env var | None required | ANTHROPIC_BASE_URL (set by Fazm from customApiEndpoint)
Proxy you supply | None | Any Anthropic-compatible shim (LiteLLM, claude-code-router, OpenRouter)
Tool-calling fidelity | First-class (native ACP + Anthropic) | Depends on the proxy's Anthropic -> OpenAI mapping
Who it fits | Consumers, default UX | Power users, regulated orgs, self-hosters
Recompile required | No | No

Quality of the open-weight path depends almost entirely on your proxy and the model you route it to, not on Fazm itself.

THE SPECIFIC FILES

If you want to verify this page from the source

Every claim about Fazm on this page maps onto a specific file in the MIT-licensed Fazm repository. You can clone the repo and grep for these exact strings.

  • Desktop/Sources/Chat/ACPBridge.swift:379: The three lines that read customApiEndpoint from UserDefaults and export ANTHROPIC_BASE_URL to the Node subprocess environment.
  • Desktop/Sources/MainWindow/Pages/SettingsPage.swift:906: The Settings card titled 'Custom API Endpoint', with the 'https://your-proxy:8766' placeholder and a toggle that clears the value on disable.
  • acp-bridge/src/index.ts:1245: DEFAULT_MODEL = "claude-sonnet-4-6". The one constant that defines what runs when no endpoint is set.
  • Desktop/Sources/AppState.swift:439: AXUIElementCreateApplication + kAXFocusedWindowAttribute. This is how your chosen model sees your Mac: as a typed accessibility tree, not a screenshot.

Why 2026 is the year this gets real

Four numbers that explain the shift from open-source LLM news as pure release theater to open-source LLM news as consumer substrate.

400B Llama 4 Maverick params (open)
72B Kimi-Dev-72B (coding, open)
6 named open-weight releases everyone tracks
1 Fazm env var to swap them all in

Counting down from six open-weight headliners to one env var is the whole compression. Last year the gap between model news and agent reality was enormous. This year it is a toggle.

The community proxies you can actually plug in

Fazm does not ship with any of these. They are third-party projects. Each one presents an Anthropic-shaped HTTP endpoint and internally translates to whatever backend you point it at. Any of them is a legitimate target for your Custom API Endpoint field.

LiteLLM
claude-code-router
OpenRouter (Anthropic endpoint)
Fireworks (Anthropic shim)
Together AI (Anthropic shim)
Self-hosted Ollama + proxy
vLLM + Anthropic adapter
Corporate GitHub Copilot bridge

The quality of your open-weight Fazm experience is the quality of whichever of these you pick, multiplied by the quality of the model behind it. Test with your real tasks, not a benchmark.
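To see what these shims do under the hood, the core move is translating an Anthropic-shaped request into an OpenAI-style chat completion. A minimal sketch of that request mapping; the field shapes follow the two public API formats, but the function itself is illustrative, not any shim's actual code.

```typescript
// Minimal request translation an Anthropic-compatible shim performs.
// Covers system prompt + text messages only; real shims (LiteLLM,
// claude-code-router) also map tools, streaming, and images.
interface AnthropicRequest {
  model: string;
  system?: string;
  max_tokens: number;
  messages: { role: "user" | "assistant"; content: string }[];
}

interface OpenAIRequest {
  model: string;
  max_tokens: number;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

export function anthropicToOpenAI(
  req: AnthropicRequest,
  backendModel: string // e.g. whatever your host names its DeepSeek deploy
): OpenAIRequest {
  const messages: OpenAIRequest["messages"] = [];
  // Anthropic carries the system prompt as a top-level field;
  // OpenAI-style backends expect it as the first message.
  if (req.system) messages.push({ role: "system", content: req.system });
  messages.push(...req.messages);
  return { model: backendModel, max_tokens: req.max_tokens, messages };
}
```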

The four-step recipe, in order

From a fresh Fazm install to running on an open model

1

Install Fazm

Download from fazm.ai or build from source at github.com/mediar-ai/fazm. Grant macOS Accessibility permission when prompted; Fazm needs it to read AX trees.

2

Stand up an Anthropic-compatible proxy

Pick one: LiteLLM's Anthropic server mode, claude-code-router, OpenRouter's Anthropic endpoint, or your own shim. Point it at the open model you want (DeepSeek-R1, GLM-5.1, Qwen3, Kimi, Llama 4). Note the URL.

3

Flip the Fazm Custom API Endpoint toggle

Open Fazm and go to Settings > Advanced > AI Chat. Toggle 'Custom API Endpoint' on. Paste your proxy URL. Hit return. Fazm calls restartBridgeForEndpointChange() and re-spawns the Node bridge with ANTHROPIC_BASE_URL set.

4

Drive your Mac

Open the floating bar, ask Fazm to do something in a real app. Your open model is now the brain behind an agent that reads the accessibility tree of any AX-compliant app on your Mac. If something feels off, diagnose the proxy, not Fazm.
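When diagnosing the proxy rather than Fazm, a quick check is to send the proxy an Anthropic-shaped request yourself and inspect the reply shape. A hedged sketch of such a smoke test; the URL and helper names are yours to supply, and none of this is Fazm code.

```typescript
// Build an Anthropic-shaped POST to point at your proxy. The model
// string is a placeholder the proxy is expected to remap; helper
// names here are illustrative.
export function buildSmokeRequest(baseUrl: string): {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
} {
  return {
    url: new URL("/v1/messages", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({
        model: "claude-sonnet-4-6", // proxy maps this to your open model
        max_tokens: 32,
        messages: [{ role: "user", content: "Reply with the word ok." }],
      }),
    },
  };
}

// An Anthropic-shaped reply has type "message" and a content array.
export function looksLikeAnthropicReply(body: unknown): boolean {
  const b = body as { type?: string; content?: unknown };
  return b?.type === "message" && Array.isArray(b?.content);
}

// Usage against a proxy on localhost:8766:
//   const { url, init } = buildSmokeRequest("http://localhost:8766");
//   const res = await fetch(url, init);
//   console.log(looksLikeAnthropicReply(await res.json()));
```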

Want to run an open model inside a real Mac agent?

Talk to us about Custom API Endpoint setups, corporate proxies, and which open-weight path actually works for your workflow.

Book a call

Frequently asked questions

What are the open source LLM releases that defined 2026 so far?

Six names show up across every April 2026 roundup. GLM-5.1 from Zhipu AI, positioned for long-horizon agentic coding and leading open weights on SWE-bench Pro and Terminal Bench. DeepSeek-R1, still the reference open reasoning model and top of many 2026 shortlists. Qwen3-235B-A22B from Alibaba, strong on multilingual and dialogue. Kimi-Dev-72B from Moonshot, the coding-specific pick. Llama 4 Maverick from Meta, 400B parameters, plus reporting that Meta is preparing more frontier models (codenamed Avocado and Mango). And Arcee, the small U.S. team that shipped a 400B reasoning model on roughly $20M.

What is the actual bottleneck between these new open source LLMs and a desktop agent on my Mac?

The model is not the hard part anymore. What is hard is the substrate: an app that can take text from an LLM and turn it into real clicks, keystrokes, and structured state inside other Mac apps, reliably. Most open source AI news covers the model release and stops. The missing piece is a consumer Mac agent that reads the macOS accessibility tree (so the model sees structured state instead of a pixel screenshot), talks to any LLM provider, and does not require the user to be a developer. Fazm is one of the few apps shipping that layer; it is MIT-licensed, and its model pipe is one environment variable.

Can I actually run Fazm on top of DeepSeek-R1, GLM-5.1, or Qwen3 today?

The switch is real, the effort is on the proxy. Fazm's engine speaks the Agent Client Protocol and expects Anthropic-shaped API calls. Its custom endpoint setting sets the ANTHROPIC_BASE_URL environment variable for the bundled Node bridge. If you put an Anthropic-compatible shim in front of DeepSeek, GLM, or Qwen (LiteLLM, claude-code-router, OpenRouter's Anthropic endpoint, and similar community projects all do this), Fazm will call it as if it were Anthropic. You trade some tool-calling fidelity for model sovereignty. It is a real path, not a theoretical one.

Where exactly is this switch in the Fazm source tree?

Three places. The Swift UI that exposes it is at Desktop/Sources/MainWindow/Pages/SettingsPage.swift lines 906 through 952, inside Settings > Advanced > AI Chat, titled 'Custom API Endpoint', with placeholder 'https://your-proxy:8766'. The user's value is stored in a UserDefault called customApiEndpoint. The value is read and injected as an environment variable at Desktop/Sources/Chat/ACPBridge.swift lines 379 through 382: if a non-empty string is present, env["ANTHROPIC_BASE_URL"] is set to that string before the acp-bridge Node process is spawned. The default model that the bridge warms up is set on a single line at acp-bridge/src/index.ts line 1245, DEFAULT_MODEL = "claude-sonnet-4-6".

Is Fazm itself open source?

Yes. The repository is github.com/mediar-ai/fazm, MIT licensed, README says 'Fully open source. Fully local.' The Swift desktop app lives under Desktop/, the Node ACP bridge under acp-bridge/, the DMG installer assets under dmg-assets/. You can clone it, build it with run.sh, and inspect every file the page above references yourself.

Why is the accessibility tree a better substrate than screenshots for these new open models?

Any sufficiently capable LLM can OCR a screenshot, but that work is wasted if structured state already exists. Fazm's AppState.swift calls AXUIElementCreateApplication(frontApp.processIdentifier) and AXUIElementCopyAttributeValue(appElement, kAXFocusedWindowAttribute as CFString, ...) to get the focused window as a typed tree of roles, titles, values, and positions. That is the input an LLM like DeepSeek-R1 is best at: discrete structured data, not a lossy raster. The smaller the gap between model and tool, the more of these open-weight models stay competitive with Claude for desktop work.

What are the tradeoffs of routing an open model through an Anthropic-compatible shim?

Three real ones. First, tool calling: Anthropic's tool_use blocks and the community shims that translate to OpenAI function calling are mostly correct, but edge cases exist with streaming, parallel tool calls, and large tool results. Second, instruction following on long agent runs is still where closed models (Opus 4.6, 4.7) have a measurable edge. Third, latency and context window vary wildly: some open hosts cap context at 128k, others ship 1M. Before committing, run a short battery of your real tasks, not benchmarks, through the shim and compare.
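The first tradeoff is concrete: Anthropic returns tool calls as tool_use content blocks, while OpenAI-style backends return a tool_calls array, and a shim must translate between the two. A hedged sketch of one direction of that mapping; the shapes follow the two public API formats, and the function is illustrative rather than any shim's actual code.

```typescript
// One direction of the tool-calling translation a shim performs: an
// OpenAI-style tool_calls entry becomes an Anthropic tool_use content
// block. Real shims also handle streaming deltas and parallel calls,
// which is where the edge cases live.
interface OpenAIToolCall {
  id: string;
  function: { name: string; arguments: string }; // JSON-encoded string
}

interface AnthropicToolUse {
  type: "tool_use";
  id: string;
  name: string;
  input: unknown; // parsed JSON object
}

export function toAnthropicToolUse(call: OpenAIToolCall): AnthropicToolUse {
  return {
    type: "tool_use",
    id: call.id,
    name: call.function.name,
    // OpenAI ships arguments as a JSON string; Anthropic wants an
    // object. Malformed JSON here is a classic shim failure mode.
    input: JSON.parse(call.function.arguments),
  };
}
```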

If I am a non-developer, should I bother with open source LLMs in 2026?

Probably not yet, and that is fine. The state of the art for consumer desktop agents is still closed frontier models behind a good substrate. But if you value local and open on principle, or you are in a regulated org, the door is open in a way it was not last year. Fazm's Custom API Endpoint setting is a consumer-grade affordance for that future: one toggle, one URL. The rest is the quality of whatever proxy you point it at.

What should I actually read this week if I am tracking the 2026 open source LLM space?

GLM-5.1 release notes from Zhipu for the agentic/coding benchmarks. DeepSeek's R1 card for reasoning. Moonshot's Kimi-Dev-72B for code. Meta's Llama 4 Maverick documentation for long-context behavior. The SiliconANGLE report on Meta's Avocado and Mango (the next frontier models, rumored to include open weights at some tier). And the TechCrunch piece on Arcee, which is the most interesting business-model story of the year: a 26-person team shipping a 400B model on a $20M budget.