Latest AI model releases, papers, and open-source projects (May 22 to 23, 2026)

The honest record of a dated 48-hour AI window has two layers. May 22 had real shipped work you can find in any roundup. May 23 had none, and the activity worth reading was in unshipped commit histories. Here is what both layers actually contained, with primary sources.

Matthew Diakonov
Direct answer (verified 2026-05-25)

May 22, 2026: Anthropic launched a Claude Compliance API and rolled Claude Code improvements to background sessions and the renamed /code-review flow. Project Glasswing published its first quantified Claude Mythos Preview results (10,000+ high or critical vulnerabilities found by roughly 50 partner organizations). Hugging Face's top-trending papers led with DelTA, TransitLM, and NVIDIA's Gated DeltaNet-2. Fazm shipped v2.9.35 and v2.9.36, wiring Google Gemini Flash and Gemini Pro as ACP backends and adding Gemini as a free fallback.

May 23, 2026: no major lab announcements. The visible activity was in unshipped commit logs: Fazm landed 23 commits including 8 for a new Composio MCP integration (Gmail toolkit), 5 for session-interrupt recovery (prior-context limit raised to 40 messages), per-tool Swift timeouts, and a RunningToolLabel for active-tool visibility.

Primary sources: anthropic.com/news, huggingface.co/papers/date/2026-05-22, github.com/mediar-ai/fazm/commits/main.

The thesis: commit logs are the leading indicator

Every dated 24-hour or 48-hour AI release roundup published online stops at the same line: what was tagged, what was announced, what was uploaded. That line is useful, but it always lags. Tagged releases are downstream of work that landed in commits one to three weeks earlier. Announcements are downstream of months of internal builds. If you want a 48-hour window that actually previews where the ecosystem is moving, you have to read at least one maintained project's commit history alongside the headline feeds.

May 22 to 23, 2026 is the cleanest example of why. May 22 was a heavy announcement day: Anthropic, the Claude Mythos disclosure, the trending Hugging Face papers, and an open-source agent shipping two tagged releases. May 23 had none of that. Every aggregator I checked covered May 22 in some form and skipped May 23 entirely, or padded it with stale carry-over content. But May 23 was not a quiet day in the repo. Twenty-three commits landed on a single agent project, including the start of a new MCP integration class that will reach end users in the next release.

Read both layers and the 48 hours have a shape. Read only the announcement layer and you miss half the signal.

May 22 layer: what was tagged or announced

Four things on May 22 are verifiable against primary sources. None of them are speculation.

  1. Anthropic Claude Compliance API. Anthropic added integrations between Claude and security and compliance tools so IT and security teams can govern Claude across the platform and product suite. The same day, Claude Code shipped smarter pinned sessions, a renamed /code-review flow, and bug fixes across terminal, Windows, PowerShell, plugins, and slash commands. The news post and release notes are at anthropic.com/news.
  2. Project Glasswing's first quantified Mythos findings. Anthropic's Project Glasswing published initial results from the Claude Mythos preview: more than 10,000 high or critical severity vulnerabilities, accumulated by roughly 50 partner organizations deploying Mythos. This is the first time a security-focused agent program at one of the major labs published a quantified bug-yield number for an in-progress release.
  3. Hugging Face trending papers. The dated index at huggingface.co/papers/date/2026-05-22 led with DelTA (token-level credit assignment for RLVR, 198 upvotes), TransitLM (map-free transit route generation, 169 upvotes), Perception or Prejudice (multimodal LLM personality assessment, 163 upvotes), and NVIDIA's Gated DeltaNet-2 (decoupled erase and write in linear attention, 152 upvotes). The top entries are method papers with linked code, not announcement posts.
  4. Fazm v2.9.35 and v2.9.36. The open-source macOS agent shipped two tagged releases on May 22, both recorded with ISO dates in CHANGELOG.json. v2.9.35 added Google Gemini Flash and Gemini Pro as selectable models in the AI picker alongside Claude and ChatGPT, and fixed a bug that prevented Gemini from calling MCP tools at all (capture_screenshot, browser, WhatsApp, accessibility-driven Mac control). v2.9.36 wired Gemini up as a free option when built-in credits run out. The commits from the same day show that work continued past the v2.9.36 tag: refactoring the Assrt MCP into a general-purpose browser agent, adding a No-browser-MCP setting, and integrating Koah sponsored ads into AI responses with PII redaction.

The Fazm-side May 22 git log shows what shipped and what kept moving:

fazm/ git log 2026-05-22 (head, abridged)

May 23 layer: zero releases, twenty-three commits

The most useful 48-hour window almost always includes one day that is quieter than the other. On May 23, none of the major labs made a frontier-model announcement, Hugging Face's dated-papers list did not produce a comparable cluster of high-upvote entries, and no aggregated newsletter shipped a meaningful update. If you stopped at the announcement layer, May 23 reads as a non-event.

The commit-log layer reads differently. On a single maintained agent project the day looked like this:

fazm/ git log 2026-05-23 (commits, abridged)

Three themes are visible in those commits. First, a new MCP integration class lands as eight commits: Composio configuration fields, environment variables, API routes, a composio module wiring, Composio MCP server support, ACP bridge integration, an auth-token injection fix, and a composio-connect skill scoped to Gmail. The net effect once shipped is that an agent running on Claude Code or Codex via ACP can call into Gmail (and any other Composio-enabled toolkit) without the user having to wire up their own MCP server.

Second, five commits push session-interrupt recovery from "resume on tool failure" to "recover from arbitrary stream interruption". Prior-context recovery after a session interruption lands, the prior-context limit is raised to 40 messages, tool-history-and-partial-responses recovery lands, implicit interrupt recovery silently folds back into context without surfacing a session-expired notification, and the preamble text becomes customizable. This is the category of failure any persistent-session agent eventually has to solve, and it ships in commits before it ships in a release.

Third, the tool-execution layer gets two commits, one setting a ceiling and one adding visibility: per-tool timeouts for Swift tool calls (so a hung native tool no longer keeps the chat spinning forever), and a RunningToolLabel that surfaces the active tool name and elapsed time on screen during a turn. These are not headline features. They are the kind of bug-class fixes that an aggregator never reports on but that a user notices immediately.
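To make the per-tool ceiling idea concrete, here is a minimal Python sketch of the general pattern. It is not Fazm's implementation (the article describes Swift tool calls); the timeout values, the run_tool helper, and the result shape are all hypothetical. Only the tool name capture_screenshot comes from the release notes quoted earlier.

```python
import asyncio

# Hypothetical per-tool ceilings in seconds; the real values are not in the article.
TOOL_TIMEOUTS = {"capture_screenshot": 10.0}
DEFAULT_TIMEOUT = 30.0


async def run_tool(name, tool_factory, timeouts=TOOL_TIMEOUTS, default=DEFAULT_TIMEOUT):
    """Await one tool call and give up once that tool's ceiling is reached."""
    limit = timeouts.get(name, default)
    try:
        # wait_for cancels the awaited call when the limit expires.
        result = await asyncio.wait_for(tool_factory(), timeout=limit)
        return {"tool": name, "ok": True, "result": result}
    except asyncio.TimeoutError:
        # Return a structured failure so the chat turn can end instead of hanging.
        return {"tool": name, "ok": False, "error": f"timed out after {limit}s"}
```

The design point is that the ceiling is looked up per tool rather than set once for the whole turn, so a slow but legitimate tool can be given more room than a quick one.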

Source: github.com/mediar-ai/fazm/commits/main, accessed 2026-05-25. Verifiable on a local clone with git log --since='2026-05-23 00:00' --until='2026-05-23 23:59:59'.
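For readers who want to script that check, a small Python wrapper around the same git log range might look like the sketch below. The function names are mine, and the '%h %s' format is one reasonable choice for getting a short hash plus subject per line. Git interprets the times in the machine's local timezone, so counts near midnight can differ between machines.

```python
import subprocess
from datetime import date


def git_log_command(day: date) -> list[str]:
    """Build a git log command covering one calendar day."""
    return [
        "git", "log",
        f"--since={day.isoformat()} 00:00",
        f"--until={day.isoformat()} 23:59:59",
        "--pretty=format:%h %s",
    ]


def parse_log(output: str) -> list[tuple[str, str]]:
    """Split '%h %s' lines into (short_hash, subject) pairs."""
    commits = []
    for line in output.splitlines():
        if not line.strip():
            continue
        short_hash, _, subject = line.partition(" ")
        commits.append((short_hash, subject))
    return commits


def commits_on(day: date, repo_path: str = ".") -> list[tuple[str, str]]:
    """Run git log inside a local clone and return that day's commits."""
    result = subprocess.run(
        git_log_command(day), cwd=repo_path,
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)
```

Calling len(commits_on(date(2026, 5, 23), "path/to/clone")) in a clone of the repo gives the commit count for the day, which is the number to compare against the figure quoted here.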

Twenty-three commits landed on a single agent project on May 23, 2026. Zero release tags. That gap is the leading-indicator surface every release roundup misses.

Fazm git log, 2026-05-23

Method versus anti-method, applied to May 22 to 23

The methods that actually catch the May 22 to 23 signal are not exotic. They are layered, and they include one source that does not exist in any aggregator.

Feature | Method that misses it | Method that catches May 22 to 23, 2026 signal
Frontier-model announcements (May 22 Compliance API, Mythos findings) | Wait for a Sunday roundup newsletter | Read lab blog + weekly skim of news pages
Research with linked code (DelTA, Gated DeltaNet-2) | Calendar-keyed paper aggregator (mostly stale) | huggingface.co/papers/date/2026-05-22 + open the linked repo
Tagged open-source releases (Fazm v2.9.35, v2.9.36) | Single aggregated 24h GitHub trending feed | Project's own CHANGELOG.json with ISO dates
May 23 unshipped agent-layer work (Composio, session recovery) | No aggregator at all (commits are not indexed by date) | git log on a maintained agent project, ranged by date
Where the next 2 weeks of releases are visible | Tagged releases only (always lagging) | Commits-not-yet-tagged in 1-3 projects you actually run

The counterargument and where it lands

The honest objection to reading commit logs is that they are noisy and most commits do not matter. That is true. A typical day on a busy agent project includes refactors, lint fixes, and dependency bumps that mean nothing to anyone outside the team. If you read every commit on every active AI project, you would drown.

The refinement is to pick one or two projects whose stack you actually run, then read the commit log filtered by date when an aggregator surfaces a quiet day. For May 22 to 23, the Fazm log was useful specifically because May 23 was quiet at the announcement layer. On a louder day (a frontier-model drop, a major lab benchmark release), the announcement layer carries the signal and the commit log adds detail. The two layers are complementary, not redundant.

The other refinement is that not every project keeps a readable commit history. Squash-merge teams produce one commit per PR, which is closer to a release-note granularity than a true commit log. The Fazm log is unusually readable because the team commits in small atomic units. Pick your one or two projects accordingly.

What the May 23 commits imply about the next two weeks

Three predictions you can make from the May 23 commit cluster, all falsifiable against future release notes.

  • A Composio MCP integration for at least Gmail will land as a tagged Fazm release, probably as the lead feature of a minor version bump. The shape of the eight commits (configuration, server, ACP bridge, auth-token injection, plus a user-facing connect skill) is feature-complete enough to ship.
  • Session-interrupt recovery will be released as a generic resumption capability, not a one-off bug fix. The five commits add a 40-message prior-context window and a customizable preamble, which only makes sense if interrupt recovery becomes a documented user-facing behavior rather than a silent fix.
  • The RunningToolLabel plus per-tool Swift timeouts together imply that the team is treating tool runtime as a first-class user-visible state, not just a backend ceiling. Expect that surface to extend (per-tool budget displays, per-session tool-time accounting, cancel-from-UI) in the next release or two.

If those predictions land in tagged releases over the next two weeks, the method works. If they do not, the method needs refinement. Either way, you would not have been able to make any of them from the announcement-layer record alone.

The 48-hour routine, restated for this window

  • Skim the major lab news pages for the date range. For May 22, this surfaces the Anthropic Compliance API and Mythos findings in under a minute.
  • Open the dated paper index at huggingface.co/papers/date/2026-05-22. Scan the top five entries. If any link to a repo whose stack you run, open the linked repo and check the recent commits.
  • Open the CHANGELOG of one or two agent projects you actually use. For May 22, Fazm's tagged v2.9.35 and v2.9.36 cover the Gemini-backend story end to end.
  • For days that look quiet at the announcement layer (May 23 in this window), run a git log over the same project, ranged by date. Read the commit subjects. Group them into themes. The themes are usually the next release's feature plan.
  • Write down one ecosystem-level pattern you noticed. For May 22 to 23, the pattern is: the agent layer is normalizing on third-party MCP integration brokers (Composio in Fazm's case) as a faster path to per-app reach than asking every user to wire MCPs by hand.
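The "group them into themes" step in the routine can be roughed out with a keyword bucket. The sketch below is one simple way to do it, not a method the Fazm team uses. The THEMES map is a hypothetical starting point built from the three themes described in this article and should be tuned to whatever project you read.

```python
from collections import defaultdict

# Hypothetical keyword map; adjust it to the project you are reading.
THEMES = {
    "composio": ["composio"],
    "session recovery": ["interrupt", "recovery", "prior-context", "preamble"],
    "tool runtime": ["timeout", "runningtoollabel"],
}


def group_by_theme(subjects, themes=THEMES):
    """Bucket commit subjects under the first theme whose keyword matches."""
    groups = defaultdict(list)
    for subject in subjects:
        lowered = subject.lower()
        for theme, keywords in themes.items():
            if any(keyword in lowered for keyword in keywords):
                groups[theme].append(subject)
                break
        else:
            # No keyword matched: refactors, lint fixes, dependency bumps.
            groups["other"].append(subject)
    return dict(groups)
```

Keyword matching is crude, and the "other" bucket is where most of the noise ends up. A theme that collects five or more subjects in one day is usually worth reading in full.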

Want a walkthrough of the commit-log method on your own stack?

If you are running or evaluating an agent setup and want a 15-minute pass at reading commit history alongside the release roundups, book a call. We will open Fazm together and show the path from this week's git log to next week's feature ship.

Frequently asked questions

What AI model releases, papers, and open-source projects shipped specifically on May 22 to 23, 2026?

On May 22, four things are verifiable. Anthropic launched a Claude Compliance API and rolled out Claude Code improvements to background sessions, the renamed /code-review flow, and auto-update reliability. Project Glasswing published its first quantified results from the Claude Mythos preview, naming more than 10,000 high- or critical-severity vulnerabilities found across widely deployed software by roughly 50 partner organizations. Hugging Face's trending papers for May 22 led with DelTA (token-level credit assignment for reinforcement learning from verifiable rewards, 198 upvotes), TransitLM (transit route generation without map dependencies, 169), and NVIDIA's Gated DeltaNet-2 (decoupled erase and write in linear attention, 152). Fazm shipped v2.9.35 and v2.9.36, adding Google Gemini Flash and Gemini Pro as selectable ACP backends and wiring Gemini up as a free option when built-in credits run out. On May 23, none of the major labs made a frontier-model announcement. The activity worth recording was at the agent and tooling layer, visible only in commit histories: Fazm alone landed 23 commits including a new Composio MCP integration for Gmail, session-interrupt recovery, per-tool Swift timeouts, and a RunningToolLabel showing active-tool elapsed time. None of those changes are in any release feed yet.

Why is the May 23 commit-log layer worth reading separately from May 22 announcements?

Because tagged releases and lab announcements are downstream of the work that lands in commits one to three weeks earlier. The 23 Fazm commits on May 23 are a concrete example. The Composio MCP integration alone is eight commits: Composio configuration fields, environment variables, API routes, a composio module, the ACPBridge integration, Composio MCP server support, an auth-token injection fix, and a composio-connect skill for Gmail. None of that has a version tag yet. Reading it gives a one to two week lead on what the next minor or major releases of agent tooling will actually contain. Aggregators do not surface commits-in-progress because they index releases, not branches.

Where do I check primary sources for the May 22 to 23, 2026 events?

Anthropic's Claude Compliance API and Claude Code release notes are at anthropic.com/news with May 22 dates. The Claude Mythos preview findings are documented in Project Glasswing's first quantified results, also dated May 22. Hugging Face's dated paper index for May 22 is at huggingface.co/papers/date/2026-05-22. Fazm's tagged releases for May 22 are in CHANGELOG.json at the root of github.com/mediar-ai/fazm, with ISO date fields on each release object. Fazm's May 23 unshipped commits are at github.com/mediar-ai/fazm/commits/main, filtered to the date range. The on-disk verification is a single git command: git log --since='2026-05-23 00:00' --until='2026-05-23 23:59:59' --pretty=format:'%h %s' in a clone of the repo.
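The changelog side of that check can also be scripted. The sketch below filters release objects by their ISO date. The source only says that each release object carries an ISO date field, so the key names "releases", "date", and "version" are assumptions; open the actual file and pass the real keys if they differ.

```python
import json
from datetime import date


def releases_on(changelog_text: str, day: date,
                list_key: str = "releases", date_key: str = "date",
                version_key: str = "version") -> list[str]:
    """Return the versions whose ISO date falls on the given day.

    The default key names are assumptions, not the confirmed schema.
    """
    data = json.loads(changelog_text)
    # Accept either a bare list of releases or an object wrapping one.
    releases = data if isinstance(data, list) else data.get(list_key, [])
    matches = []
    for release in releases:
        raw = str(release.get(date_key, ""))
        # Compare only the YYYY-MM-DD prefix so full timestamps also match.
        if raw[:10] == day.isoformat():
            matches.append(str(release.get(version_key, "")))
    return matches
```

Reading the file with pathlib.Path("CHANGELOG.json").read_text() and passing the text to releases_on keeps the function easy to test without a clone on disk.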

What does the Composio MCP integration that landed on May 23 actually do?

It adds a new class of bring-your-own-account MCP servers to the agent loop. The implementation is split across eight commits. Three add the wiring: Composio configuration fields, environment variables in .env.example, and API routes. Two add the server integration: a composio module added to routes, and Composio MCP server support inside the ACP bridge. Two add the auth and ergonomics: an auth-token injection fix that runs regardless of which Composio toolkits are enabled, and integration routes for toolkit connection. One adds the user-facing path: a composio-connect skill scoped to Gmail integration. The net effect, once shipped, is that an agent running on Claude Code or Codex via ACP can call into Gmail (and any other Composio-enabled toolkit) through MCP without the user having to wire up their own MCP server first.

What does the May 23 session-interrupt recovery work add?

Five commits raise the conversation-resumption surface from 'resume on tool failure' to 'recover from arbitrary stream interruption'. The visible changes are: prior-context recovery after a session interruption, a prior-context limit raised to 40 messages, a tool-history-and-partial-responses recovery path, implicit interrupt recovery that does not surface a session-expired notification to the user, and customizable preamble text. The shape is: when an agent stream is interrupted (network blip, app restart, tool deadlock), the next message can fold the prior tool calls and partial responses back into context instead of starting from a fresh prompt. This is a category of failure that any persistent-session agent eventually has to solve, and it landed in commits without a release tag yet.
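As an illustration of what "folding prior context back in" can mean, here is a minimal Python sketch. It is not Fazm's code. The Message type, the role names, the "[partial]" marker, and the default preamble string are all hypothetical; only the 40-message cap and the idea of a customizable preamble come from the commits described above.

```python
from dataclasses import dataclass

PRIOR_CONTEXT_LIMIT = 40  # message cap named in the article


@dataclass
class Message:
    role: str               # e.g. "user", "assistant", "tool" (hypothetical roles)
    content: str
    complete: bool = True   # False for a response cut off mid-stream


def build_recovery_context(history, preamble="Resuming an interrupted session.",
                           limit=PRIOR_CONTEXT_LIMIT):
    """Fold the most recent messages, including partial ones, into one context string."""
    recent = history[-limit:] if limit > 0 else []
    lines = [preamble]
    for message in recent:
        # Flag truncated responses so the model knows they are incomplete.
        marker = "" if message.complete else " [partial]"
        lines.append(f"{message.role}{marker}: {message.content}")
    return "\n".join(lines)
```

The cap bounds how much of the old session is replayed, and keeping partial responses in the replay is what separates this from simply restarting with a fresh prompt.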

Did any of the May 22 papers actually ship code on those dates?

DelTA, TransitLM, and Gated DeltaNet-2 link to code as Hugging Face papers typically do, but a paper trending on May 22 was not necessarily uploaded that day. The Hugging Face dated-papers feed surfaces papers by recent attention. To check whether the code repository tied to a specific paper was actually first pushed on May 22 or May 23, the working method is to open the linked GitHub repo and read its initial commit date. For NVIDIA's Gated DeltaNet-2, the linear-attention work is a follow-up to DeltaNet and ships as an open-source repository alongside the paper. The Ising open-quantum models from NVIDIA also surfaced in late-May coverage but were announced earlier in April, not May 22 or 23. Trending feeds compress 'attention this week' into a daily view, and the date stamp is not the upload date.
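The initial-commit check can be done from a local clone. The sketch below lists author dates oldest first and reads the first one; the helper names are mine. The author date of the earliest commit is only a proxy for when the code became public, since history can be rewritten or pushed long after it was written, so treat the result as a hint rather than proof.

```python
import subprocess
from datetime import date, datetime


def oldest_first_dates(repo_path: str = ".") -> str:
    """Return author dates (strict ISO 8601), oldest commit first."""
    result = subprocess.run(
        ["git", "log", "--reverse", "--format=%aI"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return result.stdout


def first_commit_date(log_output: str) -> date:
    """Parse the first line of oldest-first '%aI' output into a date."""
    first_line = log_output.strip().splitlines()[0]
    # Normalize a trailing 'Z' so older Python versions can parse it.
    if first_line.endswith("Z"):
        first_line = first_line[:-1] + "+00:00"
    return datetime.fromisoformat(first_line).date()


def in_window(day: date, start: date, end: date) -> bool:
    """True when day falls inside the inclusive window."""
    return start <= day <= end
```

A call such as in_window(first_commit_date(oldest_first_dates("path/to/clone")), date(2026, 5, 22), date(2026, 5, 23)) answers whether the repository's earliest commit falls inside this article's window.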

Is there a single recommended cadence to track AI releases that holds up in May 2026?

Three layers, three different cadences. For frontier-model drops (OpenAI, Anthropic, Google DeepMind, Meta, the major Chinese labs), weekly skimming of lab blogs is plenty; major announcements are loud. For research papers with linked code, daily skimming of huggingface.co/papers/trending plus a quick read of one or two interesting linked repos is the right rhythm. For the agent and tooling layer, a rolling 7-day skim of a small number of project changelogs is more useful than any feed, and one or two of those projects should be ones whose commit history (not just release tags) you actually read. The May 23, 2026 window is a worked example of why: zero major announcements, zero new release tags in the Fazm repo, but 23 commits that together preview the next significant feature ships.

Does Fazm itself help me check 'what shipped in the past 48 hours' from inside the app?

Yes, through the deep-research skill that auto-installs to ~/.claude/skills/deep-research/SKILL.md on first launch. The skill runs an 8-phase pipeline (Scope, Plan, Retrieve, Triangulate, Synthesize, Critique, Refine, Package), launches parallel web searches plus parallel research subagents, verifies citations where possible, and writes a markdown plus HTML plus PDF report into a dated folder under ~/Documents. Because it runs as the local Claude Code, Codex, or (after v2.9.36 on May 22) Gemini agent on your machine, the answer is grounded against fresh searches from your IP, not a cached newsletter summary. For the May 22 to 23 window, asking the skill for a dated rundown gives a multi-source synthesis that you can cross-check against the primary sources above.
