JUNE 2026 / FIELD NOTE

Hugging Face new models, June 2026. The list is the easy half. Then your bridge logs zero requests.

Every page answering this reprints a set of model names and stops. Here is the dated, checkable list of what gained support in June. Then the part nobody writes down: when you try to drive one of these from a Claude-Code-style agent on your Mac, the local server you stood up can silently receive no traffic at all. I traced exactly why through the source of the app I work on, and the fix is one setting.

M
Matthew Diakonov
8 min read

Direct answer, verified 2026-06-16

Hugging Face does not publish a single dated announcements page. The authoritative record of what gained support in June 2026 is the Transformers release notes. v5.11.0 added DiffusionGemma and DeepSeek-V3.2-Exp. v5.12.0 (June 12) added MiniMax-M3-VL, PP-OCRv6, and Parakeet-RNNT. v5.10.3 and v5.12.1 (June 15) were patches with no new models.

That is the lookup. The rest of this page is the part you hit right after: which of these you can actually point at an agent, and the one routing detail that decides whether your request ever arrives.

The full June list, with dates and what each one is

Every row below comes from a tagged Transformers release. The date is the tag date, the model is the class added, and the description is condensed from the release notes and model cards. The patch row is included on purpose: two fix-only releases landing the same day (June 15) tells you the June 12 additions needed real shoring up, which is useful context if you are deciding whether to depend on one yet.

ReleaseDateModelWhat it is
v5.11.0June 2026DiffusionGemmaEncoder-decoder text model tuned for inference speed. Uses multi-canvas sampling: instead of generating left to right, it iteratively denoises a full block of tokens with a diffusion sampler.
v5.11.0June 2026DeepSeek-V3.2-ExpExperimental DeepSeek release introducing DeepSeek Sparse Attention (DSA), a trainable fine-grained sparse attention mechanism aimed at long-context training and inference efficiency.
v5.12.0June 12, 2026MiniMax-M3-VLVision-language model pairing a CLIP-style vision tower with the MiniMax-M3 text backbone. Mixed dense and sparse Mixture-of-Experts with a lightning indexer for block-sparse attention.
v5.12.0June 12, 2026PP-OCRv6Lightweight OCR system built from MetaFormer-style blocks, shipped in three tiers (medium, small, tiny) for document text recognition.
v5.12.0June 12, 2026Parakeet-RNNTSpeech-to-text model: a Fast Conformer encoder paired with an RNN-T decoder using LSTM prediction networks for language context.
v5.10.3 / v5.12.1June 15, 2026(patches, no new models)Two same-day patch releases. Fixes only: vLLM compatibility, processor token IDs, InternVL handling, the PEFT lower bound, and a mistral tokenizer backend issue.

The live trending board at huggingface.co/models showed a heavy June wave of open-weight frontier releases on top of these framework additions. That board is popularity-ranked, not a dated changelog, so treat it as a separate signal: it tells you what people are downloading this week, not what gained first-class support and when.

The part the roundups skip: the request that never arrives

Say you picked DeepSeek-V3.2-Exp off the June list, served it locally, and put a small Anthropic-format shim in front so an agent client can talk to it. You open Fazm, the native macOS app I work on that wraps Claude Code and Codex, paste your bridge URL into Custom API Endpoint, and send a message. The chat answers. But your bridge log is empty. No request hit it. Nothing crashed, no error, the agent just quietly talked to something else.

This is not a misconfiguration on your end. It is a routing detail that only shows up once you know how the model selector and the endpoint interact. Here is the sequence, the way it actually plays out the first time.

What happens the first time you wire in a June model

01 / 04

You set the endpoint

Paste your local bridge URL into Custom API Endpoint. Fazm validates it is an absolute http(s) URL and stores it. So far, correct.

Why it routes that way, from the source

The Custom API Endpoint setting does exactly one thing: it sets the ANTHROPIC_BASE_URL environment variable for the agent bridge. That variable is only read on the Claude path. Whether a given chat uses the Claude path is decided by one function, defined by exclusion: anything that is not a Gemini model and not a Codex model is treated as Claude-routed. Gemini and Codex selections go to their own backends and never see your endpoint.

routing.swift

Notice the placeholder key. When you route through your own endpoint, Fazm injects sk-fazm-custom-endpoint instead of its bundled Anthropic key, so it never leaks a real key to your proxy and never bills the usage against built-in credits. The placeholder exists because many Anthropic-compatible local gateways accept any key and just need the API-key path to stay active rather than triggering an OAuth prompt.

The two paths, side by side

The same prompt takes two completely different routes depending on one dropdown. The top path is the default-install failure; the bottom path is what you want.

Default Gemini route vs Claude route to your bridge

You (chat)Fazm routerLocal bridge :8766June modelprompt on default Gemini Flashanswers via bundled Gemini key, endpoint bypassedswitch to a Claude model, resendANTHROPIC_BASE_URL points hereAnthropic-format calltokensresponseanswer from your model

The dated fix, and a second trap worth knowing

This footgun has a clear origin and a clear resolution, both in the same month. Version 2.9.60, shipped June 3, 2026, changed new chats to start on Gemini Flash by default because it is fast and free. That is a good default for most people and the exact reason a custom-endpoint user could be silently bypassed. Two weeks later, version 2.9.68 on June 16, 2026 closed it. From the changelog: “Custom API Endpoint setting now warns when your selected model is not a Claude model (the endpoint only routes Claude traffic), with a one-tap switch to a Claude model.”

The second trap is the endpoint value itself. It must be an absolute http(s) URL with a host. A bare localhost:8766 with no scheme is rejected on purpose, because a malformed value would otherwise be written into ANTHROPIC_BASE_URL and make every query fail with an invalid-URL error that the retry-with-resume path can swallow into an empty turn, so you see no error at all. Write the full http://localhost:8766 and that whole class of silent failure goes away.

Which of June's models are worth wiring in

Connecting one is not the same as it being good at agent work. An agent loop is a punishing workload: long chains of tool calls, arguments the model has to format exactly, and a context window that fills with tool output. Most of June's additions were never built for that, which is fine, because that was not their job.

PP-OCRv6 reads text out of documents. Parakeet-RNNT is speech-to-text. MiniMax-M3-VL is a vision-language model for images and documents. DiffusionGemma is a speed-optimized text generator using block denoising. None of those is an agent driver, and pushing a multi-step desktop task through one is the wrong test for it.

The plausible general-purpose candidate from the June list is DeepSeek-V3.2-Exp, and even there the deciding factor is not a benchmark number. It is tool-calling reliability and usable context length under real load. The only honest way to find out is to wire it in through a Custom API Endpoint, set the chat to a Claude-family model so the request actually reaches it, and watch it attempt a genuine task before you trust it with anything that matters.

Wiring a new June model into your Mac?

Book a call and I will walk you through serving a model behind an Anthropic-compatible endpoint, the model-selector gotcha, and where the rough edges actually are.

Questions about Hugging Face's June 2026 models

Frequently asked questions

What new models did Hugging Face add in June 2026?

Hugging Face has no single dated announcements page, so the authoritative record of new model support is the Transformers release notes. In June 2026, v5.11.0 added first-class support for DiffusionGemma and DeepSeek-V3.2-Exp, and v5.12.0 (June 12) added MiniMax-M3-VL, PP-OCRv6, and Parakeet-RNNT. v5.10.3 and v5.12.1 (both June 15) were patch-only with no new model classes. Separately, the live trending feed at huggingface.co/models surfaced a wave of open-weight frontier releases that month, but those are popularity-ranked, not a dated changelog.

Why would a brand-new June model send zero requests to my local bridge?

Because of how the agent client routes traffic. In Fazm, the Custom API Endpoint setting overrides exactly one thing: the ANTHROPIC_BASE_URL environment variable. That variable only affects requests on the Claude path. Gemini and Codex model selections bypass it entirely. As of version 2.9.60 (June 3, 2026), new chats start on Gemini Flash by default. So if you install fresh, paste your bridge URL into Custom API Endpoint, and send a message without changing the model, the request goes out on the Gemini path and your bridge never receives it. The bridge is fine; the model selector is on the wrong family.

How do I actually point Fazm at one of June's new models?

Three things have to line up. First, serve the model behind an Anthropic-API-compatible HTTP endpoint (most local runtimes speak OpenAI format, so you put a small translation shim in front). Second, paste that endpoint as an absolute http(s) URL with a host into the Custom API Endpoint field; a bare 'localhost:8766' with no scheme is rejected on purpose because a malformed value would otherwise land in ANTHROPIC_BASE_URL and break chat. Third, and this is the step people miss, set the chat's model to a Claude-family model so the request actually routes through ANTHROPIC_BASE_URL to your bridge.

Did Fazm change anything about this in June 2026?

Yes, two dated changes that bracket the whole problem. Version 2.9.60 (June 3, 2026) made new chats default to Gemini Flash because it is fast and free, which is also what created the silent-bypass footgun for custom endpoints. Version 2.9.68 (June 16, 2026) added the fix: the Custom API Endpoint setting now warns when your selected model is not a Claude model and offers a one-tap switch to a Claude model so your traffic reaches the endpoint.

If I route through my own endpoint, does Fazm bill me for Claude usage?

No. When a valid custom endpoint is set, Fazm injects a placeholder key (sk-fazm-custom-endpoint) instead of its bundled Anthropic key and does not count the usage against built-in credits. The requests hit your endpoint and your billing. The placeholder exists so Anthropic-compatible local gateways that accept any API key stay on the API-key path rather than triggering a Claude OAuth prompt.

Which of June's new models are realistic agent drivers?

Most of them are not, and that is by design. PP-OCRv6 is an OCR system. Parakeet-RNNT is speech-to-text. MiniMax-M3-VL is a vision-language model for documents and images. DiffusionGemma is a speed-optimized text generator. None of those is built for long tool-calling chains. The plausible general-purpose candidate from the June list is DeepSeek-V3.2-Exp, and even there the deciding factor for agent work is tool-calling reliability and usable context length under load, not a benchmark score. The honest test is to wire it in and watch it attempt a real multi-step task.

Is there an official Hugging Face page that lists new models by month?

No. There is no calendar-indexed announcements feed. What people treat as announcements are live, popularity-ranked feeds (huggingface.co/models sorted by trending, the papers feed, and the size leaderboards) plus per-lab blog posts. The closest thing to a dated, authoritative list of what gained framework support is the Transformers Releases page on GitHub, because each tag names the exact model classes added and is tied to a real timestamp.

How did this page land for you?

React to reveal totals

Comments ()

Leave a comment to see what others are saying.

Public and anonymous. No signup.