JUNE 2026 / FIELD NOTE

Hugging Face new models, June 2026. The list is the easy half. Then your bridge logs zero requests.

Every page answering this reprints a set of model names and stops. Here is the dated, checkable list of what gained support in June. Then the part nobody writes down: when you try to drive one of these from a Claude-Code-style agent on your Mac, the local server you stood up can silently receive no traffic at all. I traced exactly why through the source of the app I work on, and the fix is one setting.

Matthew Diakonov, Written with AI

Published June 16, 20268 min read

Direct answer, verified 2026-06-16

Hugging Face does not publish a single dated announcements page. The authoritative record of what gained support in June 2026 is the Transformers release notes. v5.11.0 added DiffusionGemma and DeepSeek-V3.2-Exp. v5.12.0 (June 12) added MiniMax-M3-VL, PP-OCRv6, and Parakeet-RNNT. v5.10.3 and v5.12.1 (June 15) were patches with no new models.

That is the lookup. The rest of this page is the part you hit right after: which of these you can actually point at an agent, and the one routing detail that decides whether your request ever arrives.

Specifically June 20-21, 2026, verified 2026-06-23

No new model classes shipped on June 20 or June 21, 2026. The last tagged Transformers release before those dates was v5.10.4 on June 19, a fix-only patch to keep vLLM in sync (no new models). The most recent new models in the run-up were the June 12 additions in v5.12.0: MiniMax-M3-VL, PP-OCRv6, and Parakeet-RNNT. Checked against the Transformers releases and the Releasebot changelog. If you came here looking for a June 20-21 drop, the honest answer is there was not one; the June 12 wave is the batch to evaluate.

The full June list, with dates and what each one is

Every row below comes from a tagged Transformers release. The date is the tag date, the model is the class added, and the description is condensed from the release notes and model cards. The patch row is included on purpose: two fix-only releases landing the same day (June 15) tells you the June 12 additions needed real shoring up, which is useful context if you are deciding whether to depend on one yet.

Release	Date	Model	What it is
v5.11.0	June 2026	DiffusionGemma	Encoder-decoder text model tuned for inference speed. Uses multi-canvas sampling: instead of generating left to right, it iteratively denoises a full block of tokens with a diffusion sampler.
v5.11.0	June 2026	DeepSeek-V3.2-Exp	Experimental DeepSeek release introducing DeepSeek Sparse Attention (DSA), a trainable fine-grained sparse attention mechanism aimed at long-context training and inference efficiency.
v5.12.0	June 12, 2026	MiniMax-M3-VL	Vision-language model pairing a CLIP-style vision tower with the MiniMax-M3 text backbone. Mixed dense and sparse Mixture-of-Experts with a lightning indexer for block-sparse attention.
v5.12.0	June 12, 2026	PP-OCRv6	Lightweight OCR system built from MetaFormer-style blocks, shipped in three tiers (medium, small, tiny) for document text recognition.
v5.12.0	June 12, 2026	Parakeet-RNNT	Speech-to-text model: a Fast Conformer encoder paired with an RNN-T decoder using LSTM prediction networks for language context.
v5.10.3 / v5.12.1	June 15, 2026	(patches, no new models)	Two same-day patch releases. Fixes only: vLLM compatibility, processor token IDs, InternVL handling, the PEFT lower bound, and a mistral tokenizer backend issue.
v5.10.4	June 19, 2026	(patch, no new models)	The last tagged release before June 20-21. Fix-only: several changes to keep vLLM in sync with transformers. No new model classes landed on June 19, 20, 21, or 22.

The live trending board at huggingface.co/models showed a heavy June wave of open-weight frontier releases on top of these framework additions. That board is popularity-ranked, not a dated changelog, so treat it as a separate signal: it tells you what people are downloading this week, not what gained first-class support and when.

Checked again 2026-06-21 (covers June 16-20)

If you searched for a specific late-June date, here is the honest state of things. Between June 16 and June 20, 2026 the Transformers release notes added no new model classes. The most recent tag is still v5.12.1 from June 15, which was patch-only, so the dated list above (DiffusionGemma, DeepSeek-V3.2-Exp, MiniMax-M3-VL, PP-OCRv6, Parakeet-RNNT) is the complete record of new framework support for the month so far.

What did move in that window was the live trending board: open-weight names like DeepSeek V4.1, Qwen 3.7, GLM-6, Llama 4.5, and Gemma 4 were dominating downloads. Those are weights and popularity, not a dated changelog entry, so if your real question is “what can I point a Mac agent at right now,” the routing section below matters more than the exact day a name trended.

The part the roundups skip: the request that never arrives

Say you picked DeepSeek-V3.2-Exp off the June list, served it locally, and put a small Anthropic-format shim in front so an agent client can talk to it. You open Fazm, the native macOS app I work on that wraps Claude Code and Codex, paste your bridge URL into Custom API Endpoint, and send a message. The chat answers. But your bridge log is empty. No request hit it. Nothing crashed, no error, the agent just quietly talked to something else.

This is not a misconfiguration on your end. It is a routing detail that only shows up once you know how the model selector and the endpoint interact. Here is the sequence, the way it actually plays out the first time.

What happens the first time you wire in a June model

01 / 04

You set the endpoint

Paste your local bridge URL into Custom API Endpoint. Fazm validates it is an absolute http(s) URL and stores it. So far, correct.

Why it routes that way, from the source

The Custom API Endpoint setting does exactly one thing: it sets the ANTHROPIC_BASE_URL environment variable for the agent bridge. That variable is only read on the Claude path. Whether a given chat uses the Claude path is decided by one function, defined by exclusion: anything that is not a Gemini model and not a Codex model is treated as Claude-routed. Gemini and Codex selections go to their own backends and never see your endpoint.

routing.swift

Notice the placeholder key. When you route through your own endpoint, Fazm injects sk-fazm-custom-endpoint instead of its bundled Anthropic key, so it never leaks a real key to your proxy and never bills the usage against built-in credits. The placeholder exists because many Anthropic-compatible local gateways accept any key and just need the API-key path to stay active rather than triggering an OAuth prompt.

The two paths, side by side

The same prompt takes two completely different routes depending on one dropdown. The top path is the default-install failure; the bottom path is what you want.

Default Gemini route vs Claude route to your bridge

The dated fix, and a second trap worth knowing

This footgun has a clear origin and a clear resolution, both in the same month. Version 2.9.60, shipped June 3, 2026, changed new chats to start on Gemini Flash by default because it is fast and free. That is a good default for most people and the exact reason a custom-endpoint user could be silently bypassed. Two weeks later, version 2.9.68 on June 16, 2026 closed it. From the changelog: “Custom API Endpoint setting now warns when your selected model is not a Claude model (the endpoint only routes Claude traffic), with a one-tap switch to a Claude model.”

The second trap is the endpoint value itself. It must be an absolute http(s) URL with a host. A bare localhost:8766 with no scheme is rejected on purpose, because a malformed value would otherwise be written into ANTHROPIC_BASE_URL and make every query fail with an invalid-URL error that the retry-with-resume path can swallow into an empty turn, so you see no error at all. Write the full http://localhost:8766 and that whole class of silent failure goes away.

Which of June's models are worth wiring in

Connecting one is not the same as it being good at agent work. An agent loop is a punishing workload: long chains of tool calls, arguments the model has to format exactly, and a context window that fills with tool output. Most of June's additions were never built for that, which is fine, because that was not their job.

PP-OCRv6 reads text out of documents. Parakeet-RNNT is speech-to-text. MiniMax-M3-VL is a vision-language model for images and documents. DiffusionGemma is a speed-optimized text generator using block denoising. None of those is an agent driver, and pushing a multi-step desktop task through one is the wrong test for it.

The plausible general-purpose candidate from the June list is DeepSeek-V3.2-Exp, and even there the deciding factor is not a benchmark number. It is tool-calling reliability and usable context length under real load. The only honest way to find out is to wire it in through a Custom API Endpoint, set the chat to a Claude-family model so the request actually reaches it, and watch it attempt a genuine task before you trust it with anything that matters.

Wiring a new June model into your Mac?

Book a call and I will walk you through serving a model behind an Anthropic-compatible endpoint, the model-selector gotcha, and where the rough edges actually are.

Questions about Hugging Face's June 2026 models

Frequently asked questions

What new models did Hugging Face add in June 2026?

Hugging Face has no single dated announcements page, so the authoritative record of new model support is the Transformers release notes. In June 2026, v5.11.0 added first-class support for DiffusionGemma and DeepSeek-V3.2-Exp, and v5.12.0 (June 12) added MiniMax-M3-VL, PP-OCRv6, and Parakeet-RNNT. v5.10.3 and v5.12.1 (both June 15) were patch-only with no new model classes. Separately, the live trending feed at huggingface.co/models surfaced a wave of open-weight frontier releases that month, but those are popularity-ranked, not a dated changelog.

Were there new Hugging Face models on June 20-21, 2026 specifically?

No. No new model classes were added to Transformers on June 20 or June 21, 2026. The last tagged release before those dates was v5.10.4 on June 19, a fix-only patch that kept vLLM in sync with no new models. The most recent new models in that window were the June 12 additions in v5.12.0: MiniMax-M3-VL, PP-OCRv6, and Parakeet-RNNT. Verified against the Transformers releases page and the Releasebot changelog on 2026-06-23. If you are tracking a specific June 20-21 drop, it did not happen in the framework changelog; the June 12 batch is the one to evaluate.

Why would a brand-new June model send zero requests to my local bridge?

Because of how the agent client routes traffic. In Fazm, the Custom API Endpoint setting overrides exactly one thing: the ANTHROPIC_BASE_URL environment variable. That variable only affects requests on the Claude path. Gemini and Codex model selections bypass it entirely. As of version 2.9.60 (June 3, 2026), new chats start on Gemini Flash by default. So if you install fresh, paste your bridge URL into Custom API Endpoint, and send a message without changing the model, the request goes out on the Gemini path and your bridge never receives it. The bridge is fine; the model selector is on the wrong family.

How do I actually point Fazm at one of June's new models?

Three things have to line up. First, serve the model behind an Anthropic-API-compatible HTTP endpoint (most local runtimes speak OpenAI format, so you put a small translation shim in front). Second, paste that endpoint as an absolute http(s) URL with a host into the Custom API Endpoint field; a bare 'localhost:8766' with no scheme is rejected on purpose because a malformed value would otherwise land in ANTHROPIC_BASE_URL and break chat. Third, and this is the step people miss, set the chat's model to a Claude-family model so the request actually routes through ANTHROPIC_BASE_URL to your bridge.

Did Fazm change anything about this in June 2026?

Yes, two dated changes that bracket the whole problem. Version 2.9.60 (June 3, 2026) made new chats default to Gemini Flash because it is fast and free, which is also what created the silent-bypass footgun for custom endpoints. Version 2.9.68 (June 16, 2026) added the fix: the Custom API Endpoint setting now warns when your selected model is not a Claude model and offers a one-tap switch to a Claude model so your traffic reaches the endpoint.

If I route through my own endpoint, does Fazm bill me for Claude usage?

No. When a valid custom endpoint is set, Fazm injects a placeholder key (sk-fazm-custom-endpoint) instead of its bundled Anthropic key and does not count the usage against built-in credits. The requests hit your endpoint and your billing. The placeholder exists so Anthropic-compatible local gateways that accept any API key stay on the API-key path rather than triggering a Claude OAuth prompt.

Which of June's new models are realistic agent drivers?

Most of them are not, and that is by design. PP-OCRv6 is an OCR system. Parakeet-RNNT is speech-to-text. MiniMax-M3-VL is a vision-language model for documents and images. DiffusionGemma is a speed-optimized text generator. None of those is built for long tool-calling chains. The plausible general-purpose candidate from the June list is DeepSeek-V3.2-Exp, and even there the deciding factor for agent work is tool-calling reliability and usable context length under load, not a benchmark score. The honest test is to wire it in and watch it attempt a real multi-step task.

Is there an official Hugging Face page that lists new models by month?

No. There is no calendar-indexed announcements feed. What people treat as announcements are live, popularity-ranked feeds (huggingface.co/models sorted by trending, the papers feed, and the size leaderboards) plus per-lab blog posts. The closest thing to a dated, authoritative list of what gained framework support is the Transformers Releases page on GitHub, because each tag names the exact model classes added and is tied to a real timestamp.