Open-source AI project releases on May 25, 2026: what actually shipped

The frontier was quiet that Monday. No major US lab dropped a new open-weights model. Google's Gemini 3.5 Flash had landed six days earlier on May 19 at I/O, NVIDIA's Nemotron 3 family and Ising on May 18, and Anthropic's news that week was a Q2 revenue projection, not a model. The one platform-level event stamped exactly 2026-05-25 was the shutdown of gemini-3.1-flash-lite-preview, a preview deprecation announced two weeks earlier. That is a retirement, not a release.

The single verifiable, dated, open-source AI release stamped May 25, 2026 is at the agent layer: the macOS Claude Code / Codex / Gemini wrapper Fazm shipped v2.9.37 with nine changes. Every one of them is a string in CHANGELOG.json at the root of github.com/mediar-ai/fazm. This page reads the release line by line and explains why a changelog stamped to a single date is a better answer to the calendar question than another aggregator can offer.

M
Matthew Diakonov
8 min read

Direct answer (verified 2026-05-27)

On 2026-05-25, no major US lab released a new frontier or open-weights model. The one verifiable open-source AI release stamped that date is Fazm v2.9.37, with nine changes covering a destructive-shell guardrail, consistent password handling, a three-way browser-mode setting, the Assrt MCP reframed as a general-purpose visible Chrome agent, Koah live sponsored content for the free tier (paid stays at zero ads, context redacted on-device), Shift+Enter newline behavior in the founder-chat composer, and a 40-message post-interrupt context window. Primary source: CHANGELOG.json on the public repo.

The v2.9.37 release, change by change

Nine items in the changes array. They are not grouped thematically in the file; this page groups them so the shape of the release is legible. Two stand out as load-bearing behavioral changes (the destructive-shell guardrail and the expanded post-interrupt context window). Four are Assrt-MCP related, completing the reframe of Assrt from a QA-only tool into a general-purpose browser agent. One is the Koah sponsored-content rollout. Two are smaller UX fixes. The cards below paraphrase the change-line wording lightly for readability; everything is grep-checkable against the source file.

Destructive-shell guardrail

Agent warns before running shell commands that change system state (disabling Wi-Fi, killing system UI, rebooting) and prefers reversible options, so a system-cleanup task no longer knocks you offline as a side effect.

Consistent password handling

Agent no longer refuses then caves under pushback when you provide a password, and never echoes, logs, or stores it.

Post-interrupt context to 40

Stopping mid-stream now carries the streamed-so-far partial, a short list of tools that just ran, and a 40-message history window (up from 20) into the next reply.

Assrt 401 fix for Gemini

Assrt QA tests no longer fail with 401 when Gemini is the active chat model; Assrt now uses Gemini, Claude OAuth, or Anthropic API key based on what is selected.

Assrt default to Gemini Flash

Assrt's Gemini default switched from Pro to Flash, matching the cost-and-latency profile of the Anthropic Haiku default. Picks the model the user already selected in Fazm.

Third browser mode: Assrt only

Settings now has a 'No browser MCP (Assrt only)' option that disables both the Playwright MCP and the bundled browser-harness, leaving only the Assrt browser available.

Assrt reframed as general browser agent

Defaults to a visible Chrome window, uses a persistent profile so logins and cookies survive across sessions, and the agent leaves the window open between turns instead of closing it.

Koah sponsored content (free tier)

Live contextual sponsored content next to assistant responses for free-tier users. Paid subscribers still see zero ads. Conversation context is redacted on-device (emails, phone numbers, tokens, UUIDs) before any data leaves the Mac.

Shift+Enter in Chat-with-Founder

Shift+Enter in the founder-chat composer now inserts a newline instead of sending the message. Multi-line messages no longer arrive in fragments.

Why the destructive-shell guardrail is the most consequential line

Of the nine, the first one carries the most weight. The pattern it fixes shows up across every desktop AI agent built on top of Claude Code or Codex: the user asks for a routine cleanup, the agent decides the cleanest way to free a resource is to disable a system daemon or networking interface, and the user is offline before they realize what happened. The fix is two-part. First, the agent identifies a command as system-state-changing (a small allowlist of patterns: turning off Wi-Fi, killing WindowServer, fseventsd, launchd, shutdown / reboot, an unconditional rm -rf /). Second, when one of those patterns matches and a reversible alternative exists, the agent prefers the reversible alternative (kill a single Node process instead of the whole window server, empty a cache directory instead of formatting a volume).

That behavior change does not require a new model. It requires the harness around the model to know which side of the boundary a shell command is on. Vendors do not solve that for you because the set of destructive shell commands on macOS is different from the set on Linux is different from the set on Windows; the harness owns the policy. v2.9.37 is the version where Fazm's harness owns it.

And the 40-message post-interrupt context expansion

The second load-bearing change is quieter. When you press the stop button mid-stream, the next message used to start the agent from a thin context: 20 previous messages, no record of the partial output that had streamed before you interrupted, no record of which tools the agent had just called. v2.9.37 carries three things forward into the next turn: the streamed-so-far partial output, a short list of tools that just ran, and a 40-message history window (twice the prior cap).

Practically: interrupting an agent mid-task and saying "no, do this instead" no longer wipes its memory of where it was. The tools-just-ran list is what stops it from re-running an expensive subagent or re-reading the same files. The streamed partial is what stops it from re-deriving an answer it was already half done with. This is the kind of fix that shows up nowhere in a model release note and matters more day-to-day than most.

Verifying it yourself

Nothing on this page asks you to take the changelog on faith. The version object is in a single JSON file on a public GitHub repo. Four steps, and you have the same primary source this page is grounded against.

  1. 1

    Clone the repo

    git clone github.com/mediar-ai/fazm

  2. 2

    Open CHANGELOG.json

    at the repo root

  3. 3

    Search for 2026-05-25

    one match

  4. 4

    Read the v2.9.37 object

    nine changes, listed

verifying v2.9.37 against the public CHANGELOG.json

Why "open source on May 25, 2026" is a harder question than it sounds

The reason a dated open-source-AI question is hard to answer is that the two platforms hosting most of the activity, Hugging Face and GitHub, do not publish a dated index of releases for any given day. Hugging Face has a trending models view and a trending papers view, both ranked by a popularity score over a rolling window. GitHub has its own trending feed on the same principle. Popularity-by-window is useful for discovery; it is the wrong tool for calendar lookup. A model uploaded a week ago that still has momentum can sit on May 25's trending page; a model uploaded on May 25 with no early traction will not.

Vendor blogs are the closest thing to a dated record, but they publish only the lab's own releases, so they are necessarily partial. And vendor blogs cover the model layer, not the agent layer; the work that goes into making a frontier model usable inside a desktop app rarely shows up there.

What is left is the dated changelog of an actively maintained open-source project. That is what this page uses. The trade-off is honest: a single project's changelog is a narrow slice of ecosystem activity, but every entry is a real artifact that actually shipped on the day claimed, with primary-source provenance.

The week around May 25, 2026, in context

Zooming out one click: the meaningful lab-level activity for the seven days bracketing May 25 sat on the Monday and Tuesday before (the I/O week). On May 18, NVIDIA debuted Nemotron 3 (Nano, Super, Ultra) on a hybrid latent mixture-of-experts targeted at multi-agent systems, and the Ising open AI model family for quantum error correction and calibration. On May 19, Google opened I/O with Gemini 3.5 Flash, an agent-first frontier model scoring 76.2% on Terminal-Bench 2.1 and 83.6% on MCP Atlas per the DeepMind model card.

By Monday May 25, the wave had passed. The narrative had moved to the I/O follow-on: which agent harnesses had picked up Gemini 3.5 Flash as a swappable backend, how the new MCP Atlas score translated into agent behavior, whether the cheaper / faster frontier was actually cheaper / faster in production. Fazm's v2.9.37 is one data point in that follow-on. The harness fixes land on a different clock than the model launches; the model-launch clock had finished its tick three days earlier.

The Koah sponsored-content line: what it does and does not do

One change in v2.9.37 has more nuance than the others and is worth spelling out. Free-tier Fazm users now see live contextual sponsored content from Koah next to assistant responses. Three guardrails ride with it. First, paid subscribers see zero ads, unconditionally. Second, conversation context is redacted on the Mac itself before any data leaves the device: emails, phone numbers, tokens, and UUIDs are stripped from whatever signal is sent to Koah for matching. Third, the rollout is feature-flagged (the v2.9.45 release two days later, dated 2026-05-27, gates ads behind a remote flag so the launch can be staged), so the line in v2.9.37 is the on-by-default enablement, not the irreversible flip.

The reason to surface this on a roundup page rather than tuck it into a release post: it is a real fork in how an open-source AI agent monetizes the free tier without putting paying users into the same data path. The pattern, paid-no-ads / free-with-redacted-context-ads, is the kind of thing other open-source agent maintainers are weighing right now. The dated line in CHANGELOG.json on May 25, 2026 is where Fazm picked one.

Building or running an open-source AI agent on macOS?

If you are working on the agent layer (harness fixes, post-interrupt context, destructive-shell policy, ACP backend swaps), talk it through with Matt.

Frequently asked

What open-source AI projects actually shipped a release on May 25, 2026?

The honest, dated answer for that specific Monday: the open-source macOS agent Fazm shipped v2.9.37 with nine changes. That is the single open-source AI release we can point at in a primary-source dated changelog for 2026-05-25. The major US labs were quiet that day. Google's Gemini 3.5 Flash had landed six days earlier on May 19. NVIDIA's Nemotron 3 family and Ising had landed on May 18. Meta's anticipated Avocado open-weights model had gone silent. Anthropic was deep in funding-round news rather than a model drop. If you want a verifiable dated record of what one actively maintained open-source AI agent shipped on May 25, 2026, the only honest place to look is github.com/mediar-ai/fazm/blob/main/CHANGELOG.json.

Where can I verify Fazm v2.9.37 was actually dated May 25, 2026?

Two places. The on-disk source of truth is /Users/[username]/fazm/CHANGELOG.json in any local clone of the public repo. The public mirror is https://github.com/mediar-ai/fazm/blob/main/CHANGELOG.json, which renders the same JSON content. The v2.9.37 release object has a 'date' field of '2026-05-25' and a 'changes' array of nine strings. The version immediately before (v2.9.36) is dated 2026-05-22 (a three-day gap). The version immediately after (v2.9.41) is dated 2026-05-27 (a two-day gap). Versions 2.9.38, 2.9.39, and 2.9.40 do not exist as separate dated entries; the work landed batched as v2.9.41 two days later.

What did v2.9.37 actually change?

Nine things, all in the changes array of the v2.9.37 object. (1) A safety guardrail before destructive shell commands: the agent now warns before running things that change system state (disabling Wi-Fi, killing system UI processes, rebooting) and prefers reversible options. (2) Consistent password handling: the agent no longer refuses then caves under pushback, and never echoes, logs, or stores user-provided passwords. (3) Assrt QA tests no longer fail with 401 when Gemini is the selected model. (4) Assrt's default Gemini model is now Flash, not Pro, to match the Anthropic Haiku cost / latency profile. (5) A third browser mode in Settings, 'No browser MCP (Assrt only)', which disables Playwright and the bundled browser-harness. (6) Assrt MCP reframed as a general-purpose browser agent: defaults to a visible Chrome window with persistent profile so logins survive sessions, and the agent leaves the window open between turns. (7) Live contextual sponsored content from Koah next to assistant responses for free-tier users; paid stays at zero ads, and conversation context is redacted (emails, phone numbers, tokens, UUIDs) before any data leaves the Mac. (8) Shift+Enter in the Chat-with-Founder composer now inserts a newline instead of sending the message. (9) Post-interrupt context expanded: when you stop an answer mid-stream, the next reply sees the streamed-so-far partial, a short list of tools that just ran, and a 40-message history window (up from 20).

Did any major AI lab release a frontier model on May 25, 2026?

No US lab dropped a new frontier or open-weights model that day. The lab-level highlights of that week sat on adjacent dates: Google opened I/O 2026 with Gemini 3.5 Flash on May 19, NVIDIA debuted Nemotron 3 (Nano, Super, Ultra) and the Ising quantum-error-correction open model family on May 18, and Anthropic's news that week was a $10.9B projected Q2 revenue and first-ever quarterly operating profit, not a model. The single platform-level deprecation stamped exactly 2026-05-25 was the shutdown of gemini-3.1-flash-lite-preview, a preview model whose retirement had been announced on May 11. That is a deprecation, not a release. The deliberate framing of this page is that the dated open-source release worth pointing at on May 25, 2026 sits at the agent layer, not at the model layer.

Why use a single open-source agent's changelog as the lens for an industry-wide question?

Because every other source for 'what shipped on May 25, 2026' is either a date-shifted lab announcement from earlier in the week or a trending-by-popularity feed that does not carry a date for the item itself. Hugging Face does not publish a 'released on May 25' index for either models or papers; github.com/trending is a rolling popularity score, not a calendar. A maintained open-source agent's CHANGELOG.json is one of the few places where a real software release is stamped with the exact date the binary went out. Fazm v2.9.37 is that on May 25, 2026.

Is Fazm itself genuinely open source, or just source-available?

Fully open source on GitHub at github.com/mediar-ai/fazm. The repo carries the full macOS Swift / SwiftUI app source, the ACP bridge that wraps Claude Code (@agentclientprotocol/claude-agent-acp), Codex (codex-acp), and the feature-flagged Gemini provider, the bundled MCP servers, and the CHANGELOG.json this page is grounded against. Build instructions are in README.md. The pre-notarized DMG at fazm.ai/download is the same code you can build yourself from the repo.

Why does v2.9.37 ship a sponsored-content feature in the same release as a destructive-shell guardrail?

Because both are independent items that happened to be ready at the same cutoff. The CHANGELOG.json structure treats each version as a flat array of change strings, not a thematic grouping. The sponsored-content change is feature-flagged: paid subscribers see zero ads, and conversation context is redacted on-device (emails, phone numbers, tokens, UUIDs) before any data goes to Koah. The destructive-shell guardrail is unrelated to ad delivery; it is a behavioral patch on the agent loop itself, so the agent prefers reversible operations when a request could knock you offline as a side effect.

What is the 40-message post-interrupt context window in v2.9.37 actually doing?

When you press the stop button mid-stream, the next message used to start the agent from a thin context: 20 previous messages, no record of the partial output that had streamed, no record of which tools had just been called. v2.9.37 expanded that to a 40-message history window, the streamed-so-far partial, and a short list of tools that just ran. The practical effect is that interrupting and redirecting an agent mid-task no longer wipes its memory of where it was; you can say 'no, do this instead' and it picks up the thread.

Does Fazm's May 25 release have anything to do with Claude Code, Codex, or Gemini specifically?

Yes. Fazm wraps Claude Code (@agentclientprotocol/claude-agent-acp), Codex (codex-acp), and Gemini CLI (--experimental-acp, gated by FAZM_GEMINI_ENABLED) as swappable per-chat backends through the ACP bridge. v2.9.37 itself does not touch backend wiring; the relevant backend work landed a few days earlier (v2.9.36 added Google Gemini as a free option when built-in AI credits run out, v2.9.35 added Gemini Flash and Pro as selectable models in the AI picker, both dated 2026-05-22). v2.9.37 is the agent-layer release sitting on top of that backend layer.

What is the right way to track open-source AI releases day-by-day if Hugging Face and GitHub do not publish a dated index?

Pick one or two actively maintained open-source agents, IDE plug-ins, or inference engines and read their dated changelogs on a rolling window. For Fazm specifically, CHANGELOG.json on the public repo is the canonical record and updates within minutes of each release going out. For the frontier model layer, the four lab blogs (openai.com, anthropic.com, blog.google, mistral.ai) cover anything billed as a real release within hours. For the broader trending layer, huggingface.co/papers/trending, huggingface.co/models?sort=trending, and github.com/trending are rolling popularity feeds; useful for discovery, not for date queries.

How did this page land for you?

React to reveal totals

Comments ()

Leave a comment to see what others are saying.

Public and anonymous. No signup.