New AI model releases in the last 24 hours
Nobody can hand you a static list of what dropped today, because the list is stale by the time you read it. So this page does two things instead: points you at the feeds that actually update hourly, and then covers the part every other guide skips, which is how to run a model that shipped this morning inside the agent loop you already work in.
Direct answer (verified 2026-06-20)
There is no single canonical list of every AI model released in the last 24 hours, and there never has been; releases land at all hours across dozens of labs and registries. The closest thing to a live answer is to watch three feeds and the vendor newsrooms:
- Hugging Face's newest-models view for open-weight models, which usually appear there the moment the weights are uploaded.
- OpenRouter models for hosted models that have reached a routing layer.
- Artificial Analysis for independent benchmarks once a model has been measured.
- Vendor newsrooms for frontier hosted models, which land in the vendor docs before anywhere else.
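Once you have pulled a listing from one of these feeds, the "last 24 hours" cut is a timestamp filter. The sketch below assumes each record is a dict with an `id` and an ISO 8601 `createdAt` field; those field names and the sample records are assumptions for illustration, so check them against whatever the registry you query actually returns.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 string such as '2026-06-20T08:15:00.000Z'."""
    # Older Python versions reject a trailing 'Z', so normalise it to an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def released_within(models, hours=24, now=None):
    """Return the ids of records created in the last `hours`, newest first."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=hours)
    recent = [m for m in models if cutoff <= parse_timestamp(m["createdAt"]) <= now]
    recent.sort(key=lambda m: parse_timestamp(m["createdAt"]), reverse=True)
    return [m["id"] for m in recent]


# Hypothetical sample records, not real releases.
sample = [
    {"id": "example-lab/model-a", "createdAt": "2026-06-20T08:15:00.000Z"},
    {"id": "example-lab/model-b", "createdAt": "2026-06-18T23:00:00.000Z"},
    {"id": "example-lab/model-c", "createdAt": "2026-06-19T21:30:00.000Z"},
]
now = datetime(2026, 6, 20, 12, 0, tzinfo=timezone.utc)
print(released_within(sample, hours=24, now=now))
```

Passing `now` explicitly keeps the filter reproducible; leave it out to measure against the current time.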
Reading those takes a minute. The slow part is using whatever you find. The rest of this page is about closing that gap on the same day.
The list is the easy half
Every article about this topic stops at the list. A model shipped, here is its name, here are its benchmark numbers. That is genuinely useful for about ten minutes. Then you hit the real question: can you put it to work in the tool you spend your day in, or do you have to wait for that tool to add support?
For most agent tools the answer is wait. They hard-code one provider, so a new model is unreachable until the maintainers ship an update and you upgrade. The news cycle moves in hours; that loop moves in weeks.
What launch day looks like
You read that a model shipped this morning. Your agent tool only speaks to one backend, so you cannot point it anywhere new. You file an issue, wait for a release, then upgrade. By the time you can try the model, it is old news.
- One backend, hard-coded
- New model gated behind a tool update
- Days to weeks of lag
How Fazm routes a model it has never seen before
Fazm is a native macOS app that wraps Claude Code and Codex through ACP, the agent client protocol. Internally it does not assume a single provider. Every model id is classified into one of three lanes before a turn is sent, and the classification is pure string matching, so a model that did not exist yesterday still routes correctly today.
One agent loop, three routing lanes
The classification lives in ChatProvider.swift. A model is a Codex model if its id starts with gpt-, codex-, or an o-series prefix; a Gemini model if it starts with gemini-; and a Claude model by exclusion, meaning anything that is neither. That last rule is the important one: a Claude-compatible model with a name nobody has ever seen still falls into the Claude lane and works.
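The rule described above can be sketched in a few lines. This is a Python illustration of the logic, not the Swift source in ChatProvider.swift. The function name `classify` and the lane labels are hypothetical, and reading "an o-series prefix" as the letter `o` followed by a digit is an assumption.

```python
import re

# Prefixes taken from the description above.
CODEX_PREFIXES = ("gpt-", "codex-")
# Assumption: "o-series" means ids such as "o3" or "o4-mini" ("o" plus a digit).
O_SERIES = re.compile(r"^o\d")


def classify(model_id: str) -> str:
    """Return the routing lane for a model id using prefix matching only."""
    if model_id.startswith(CODEX_PREFIXES) or O_SERIES.match(model_id):
        return "codex"
    if model_id.startswith("gemini-"):
        return "gemini"
    # Claude by exclusion: anything unrecognised lands here.
    return "claude"


for model_id in ("gpt-5", "o3-mini", "gemini-2.5-pro", "some-model-from-this-morning"):
    print(model_id, "->", classify(model_id))
```

Because the final branch is a catch-all, an id that nobody has seen before needs no code change to route; it simply falls through to the Claude lane.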
The anchor: what the Custom API Endpoint setting actually changes
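When you set a Custom API Endpoint, Fazm does exactly two things to the agent bridge's environment, and you can read both in ACPBridge.swift in the open-source repo: it writes your URL into ANTHROPIC_BASE_URL, and it replaces the bundled API key with the placeholder sk-fazm-custom-endpoint.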
Two consequences follow from this being a base-URL override and nothing more. First, it routes only Claude-family traffic; Codex and Gemini models bypass it entirely, because they never read ANTHROPIC_BASE_URL. Second, Fazm's bundled credentials never reach your proxy, because the bundled key is swapped for the harmless placeholder sk-fazm-custom-endpoint. That placeholder exists so Anthropic-compatible local gateways that accept any key stay on the API-key path instead of triggering an OAuth login. This is the detail that matters: the setting is not a generic provider dropdown but a deliberately narrow override that keeps built-in chat safe while letting you point at anything that speaks the Anthropic shape.
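As a rough sketch of that override, here is the same idea in Python rather than the Swift in ACPBridge.swift. `ANTHROPIC_BASE_URL` and the placeholder value come from the text above; the function name `bridge_environment`, the key variable name `ANTHROPIC_API_KEY`, and the example URL are assumptions for illustration.

```python
PLACEHOLDER_KEY = "sk-fazm-custom-endpoint"


def bridge_environment(base_env, custom_endpoint=None):
    """Return a copy of the environment with the endpoint override applied."""
    env = dict(base_env)  # copy so the caller's environment is left untouched
    if custom_endpoint:
        # 1. Redirect Claude-family traffic to the custom gateway.
        env["ANTHROPIC_BASE_URL"] = custom_endpoint
        # 2. Swap the bundled key for the placeholder.
        #    Assumption: the key travels in ANTHROPIC_API_KEY.
        env["ANTHROPIC_API_KEY"] = PLACEHOLDER_KEY
    return env


bundled = {"ANTHROPIC_API_KEY": "bundled-key-redacted"}
overridden = bridge_environment(bundled, "http://localhost:1234")
print(overridden["ANTHROPIC_BASE_URL"], overridden["ANTHROPIC_API_KEY"])
```

Nothing else in the environment changes, which is why lanes that never read `ANTHROPIC_BASE_URL` are unaffected.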
Running a model that shipped today, in three steps
This is the path for an open-weight model that dropped on Hugging Face this morning. The same setting works for a corporate proxy or a hosted Anthropic-compatible gateway; only step one changes.
Launch-day workflow
- 1. Serve it. Load the new weights in LM Studio or Ollama and expose the local server. Or copy the base URL of any Anthropic-compatible gateway you already have.
- 2. Point Fazm at it. Settings > Advanced > AI Chat > Custom API Endpoint. Paste the http(s) URL. Fazm validates it is an absolute URL with a host before forwarding it.
- 3. Send a turn. Fazm restarts the bridge so the new endpoint takes effect, then sends Claude-routed turns to your model. Your chat history stays intact across the switch.
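The check in step two (an absolute http(s) URL with a host) can be approximated like this. It is a Python sketch of the rule as described, not Fazm's own validation code, and `is_valid_endpoint` is a hypothetical name.

```python
from urllib.parse import urlparse


def is_valid_endpoint(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.hostname)


for candidate in ("http://localhost:1234", "https://gateway.example.com/v1",
                  "localhost:1234", "/v1/messages", "http://"):
    print(candidate, "->", is_valid_endpoint(candidate))
```

The common mistake this catches is pasting `localhost:1234` without a scheme, which parses without a usable http(s) scheme and is rejected.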
If the local server has no model loaded, you do not get a silent empty turn. Fazm returns a specific message telling you to load a model or turn the endpoint off, which is the right behavior when you are stress-testing weights that are a few hours old and half your problems are configuration, not the model.
Where this does not help
The endpoint override is narrow on purpose, so it is worth being honest about its edges. A new OpenAI or Gemini model has to arrive through its own backend; the Anthropic base-URL override will not reach it. A model being downloadable says nothing about whether it is good at agentic tool use, and plenty of impressive 24-hour-old models still fumble multi-step tool calls. And a local model is only as fast as your machine, so a 70B-parameter model that benchmarks well may be too slow to sit inside a tight agent loop on a laptop. Launch day is when your own testing starts, not when it ends. The value here is that the testing can start the same day, inside the conversations and the Mac-wide reach you already use, instead of weeks later behind a tool update.
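To make the speed point concrete, a back-of-the-envelope estimate helps. The sketch below assumes each agent step re-reads the prompt at a prefill rate and generates output at a decode rate, and it ignores prompt caching and tool execution time. The function names and every number in the example are illustrative assumptions, not measurements of any model or machine.

```python
def turn_seconds(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Estimate one model turn: time to read the prompt plus time to generate."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


def loop_seconds(steps, prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Estimate a multi-step agent task as `steps` sequential turns."""
    return steps * turn_seconds(prompt_tokens, output_tokens, prefill_tps, decode_tps)


# Illustrative only: 10 tool-calling steps, 4000 prompt tokens and 300 output
# tokens per step, 200 tokens/s prefill and 5 tokens/s decode.
print(loop_seconds(10, 4000, 300, 200, 5))
```

With these made-up rates a ten-step task takes over thirteen minutes, which is the kind of number worth estimating before committing an afternoon to a large local model.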
Questions people actually ask
Is there a single page that lists every AI model released in the last 24 hours?
No. Nothing publishes one canonical, always-current list, because releases land at all hours across dozens of labs and registries. The closest thing to a real-time view is a registry sorted by creation date (Hugging Face's newest-models view) plus a routing aggregator (OpenRouter) and an independent benchmark tracker (Artificial Analysis). You cross-reference those against the vendor newsrooms for the frontier labs. Treat any blog post claiming to be a 'last 24 hours' list as a snapshot from whenever it was written, not a live feed.
Where do new models actually show up first?
Open-weight models usually appear on Hugging Face the moment the weights are uploaded, often before any blog post. Hosted frontier models (Claude, GPT, Gemini families) appear in the vendor's own docs and newsroom first, then propagate to routing layers like OpenRouter within hours. So the fastest signal is: Hugging Face newest for open weights, the vendor docs/newsroom for hosted models.
Once a model drops, how fast can I use it in a real agent workflow?
That is the part the news lists skip. Reading that a model exists takes seconds; wiring it into the agent loop you actually work in is the slow part. If your tool hard-codes one provider, you wait for that tool to ship an update. Fazm decouples the two: it routes Claude-family, Codex (gpt-/codex-/o-series), and Gemini models through one ACP loop, and a Custom API Endpoint setting redirects Claude-family traffic to any Anthropic-compatible gateway. So a model reachable through such a gateway is reachable from your existing chats the same day.
What exactly does Fazm's Custom API Endpoint setting do?
It writes your URL into the ANTHROPIC_BASE_URL environment variable for the agent bridge and substitutes a placeholder key (sk-fazm-custom-endpoint) so Fazm never forwards its bundled credentials to your proxy. Because it overrides only the Anthropic base URL, it routes only Claude-family traffic. Codex and Gemini models bypass it. This is in ACPBridge.swift in the open-source repo, not a marketing claim. It is the same mechanism people use to point Fazm at LM Studio, Ollama, a corporate proxy, or a GitHub Copilot gateway.
Can I run a brand-new open-weight model locally through Fazm on launch day?
Yes, if you serve it behind an Anthropic-compatible endpoint. Load the new weights in LM Studio or Ollama, expose the local server, and paste that URL into Settings > Advanced > AI Chat > Custom API Endpoint. Fazm then sends Claude-routed turns to your local model. If the local server has no model loaded, Fazm surfaces a specific error telling you to load one rather than silently failing, which matters when you are testing weights that dropped an hour ago.
Do I lose my conversation when I switch which model I am pointing at?
No. Switching the endpoint stops and restarts the bridge so the new setting takes effect on your next query, but the chat itself persists. Fazm keeps full chat history live for the lifetime of the window, does not auto-compact, and restores every window after a Mac restart. So you can test a model that shipped this morning inside a thread you started last week, with the prior context intact.
What is the catch with routing a freshly released model through a custom endpoint?
The endpoint only affects Claude-family traffic, so a new OpenAI or Gemini model still has to come in through its own backend, not the Anthropic base-URL override. Local servers must actually have the model loaded or Fazm tells you so. And a model being downloadable does not mean it is good at agentic tool use; a 24-hour-old model can be impressive on benchmarks and still mishandle multi-step tool calls. Treat launch day as the start of your own testing, not the end.