Latest Anthropic API release notes for June 2026
The official changelog tells you what shipped. It does not tell you which entries actually change your day if you run Claude Code through a wrapper. This page does both: a dated, checked list of the June 2026 Claude API changes, then the four that matter for a long-running, bring-your-own-account agent loop.
In June 2026 Anthropic retired Claude Sonnet 4 and Opus 4 (June 15), launched Claude Fable 5 and Mythos 5 (June 9), stopped billing refusals that produce no output (June 2), and disclosed a 90-second per-cell code-execution limit plus a new response_inclusion web-tool parameter (June 11). These build on late-May's Claude Opus 4.8 (1M token default context) and the launch of Claude Platform on AWS.
Authoritative source: docs.claude.com/en/release-notes/api. The dates and details below were re-checked against that page and the releasebot Claude Developer Platform feed on 2026-06-20.
The June 2026 changes, in order
Newest first. Each row is a public Claude Developer Platform release note.
| Date | Change |
|---|---|
| Jun 15, 2026 | Claude Sonnet 4 and Opus 4 retired. Both models stop serving requests. Callers are pointed to Sonnet 4.6 and Opus 4.8. Pinned model ids start failing after this date. |
| Jun 11, 2026 | Code execution per-cell limit disclosed. The code_execution_20260521 tool version documents a 90-second per-cell execution limit for long-running cells. |
| Jun 11, 2026 | web_search / web_fetch gain response_inclusion. web_search_20260318 and web_fetch_20260318 add a response_inclusion parameter to drop consumed result blocks from the response. |
| Jun 10, 2026 | AWS sandbox work endpoint. GET /v1/environments/{id}/work is available on Claude Platform on AWS for listing pending work. |
| Jun 9, 2026 | Claude Fable 5 and Mythos 5 launch. Both ship with a 1M token context, 128k max output, and always-on adaptive thinking. A reasoning_extraction category is added to stop_details.category for blocked requests. |
| Jun 5, 2026 | Claude Opus 4.1 deprecated. Marked deprecated with retirement scheduled for August 5, 2026. Still serves until then. |
| Jun 2, 2026 | Refusals with no output are not billed. A request that returns stop_reason: "refusal" without Claude generating any output is no longer billed. The advisor tool also gains a max_tokens cap per call. |
Late-May groundwork the June notes assume
Several June entries only make sense if you read the three changes that landed days earlier.
Managed Agents on AWS: webhooks, multi-agent orchestration, and self-hosted sandboxes land on the AWS-billed platform.
Claude Opus 4.8: 1M token default context, 128k max output, broader platform support. Mid-conversation system messages are now allowed after user turns while preserving prompt cache hits, and stop_details gains a category and human-readable explanation.
Claude Platform on AWS: native AWS endpoints with AWS billing and IAM auth for the Messages API, Files API, Message Batches API, and Managed Agents.
The four entries that change your day, if you wrap the agent loop
Most of the list is housekeeping. These four touch how a long-running Claude Code session behaves, what it costs, and where it can run.
Model retirement is a hard cutoff
June 15 retired Sonnet 4 and Opus 4. A wrapper that pins a model id breaks; one that lets you swap the backend per chat just changes a dropdown. This is the difference between a deprecation warning and a failed request.
1M default context vs summarizing it away
Opus 4.8 makes 1M tokens the default, not a beta header. Keeping a full session live in context, instead of auto-summarizing it, stops being reckless and starts being normal.
Refusals stop costing you
From June 2, a refusal with no output is not billed. If your agent occasionally trips a refusal mid-run, that line item disappears on its own.
Custom endpoints now reach AWS
Claude Platform on AWS gives the same Messages API a native AWS-billed endpoint. Anything that can set ANTHROPIC_BASE_URL can point the agent loop there, or at a corporate gateway.
How a wrapper actually routes to a new endpoint
The June notes keep adding places Claude can run: AWS endpoints, the existing Bedrock and Vertex paths, corporate proxies. The whole mechanism on the client side is one environment variable. The Anthropic SDK reads ANTHROPIC_BASE_URL, and whatever spawns the agent loop decides what that points at.
Here is the exact behavior in Fazm, which wraps the real Claude Code agent loop in a Mac app. In Desktop/Sources/Chat/ACPBridge.swift (around line 2425), a user-set custom endpoint is validated as an absolute http(s) URL, written into the child process environment, and the bundled key is replaced with a placeholder so Fazm's own Anthropic key is never sent to that endpoint:
```swift
if let customEndpoint = Self.validCustomAPIEndpoint(rawCustomEndpoint) {
    env["ANTHROPIC_BASE_URL"] = customEndpoint
    env["FAZM_CUSTOM_API_ENDPOINT"] = "true"
    // A custom endpoint means the user is routing their own model/provider.
    // Never send Fazm's bundled Anthropic key to that proxy.
    env["ANTHROPIC_API_KEY"] = "sk-fazm-custom-endpoint"
}
```

A malformed value (a missing scheme, a bare localhost:8766, stray text) is rejected and the app falls back to the default Anthropic endpoint, because an invalid base URL otherwise makes the SDK throw on every turn and silently bricks chat. That guard is the difference between "point it at AWS in one field" and "spend an afternoon debugging an empty response."
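To make the validation rule concrete, here is a minimal sketch of a check that accepts only absolute http(s) URLs with a host. It is not Fazm's actual `validCustomAPIEndpoint`; the function name `validatedEndpoint` and the exact rules are assumptions chosen to match the behavior described above.

```swift
import Foundation

/// Hypothetical stand-in for a custom-endpoint validator (not Fazm's real code).
/// Returns the trimmed string when it is an absolute http(s) URL with a host,
/// or nil so the caller can fall back to the default Anthropic endpoint.
func validatedEndpoint(_ raw: String?) -> String? {
    guard let raw = raw else { return nil }
    let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !trimmed.isEmpty,
          let components = URLComponents(string: trimmed),
          let scheme = components.scheme?.lowercased(),
          scheme == "http" || scheme == "https",   // rejects a missing or non-http scheme
          let host = components.host, !host.isEmpty // rejects values with no host part
    else { return nil }
    return trimmed
}
```

Under these rules `http://localhost:8766` passes, while a bare `localhost:8766` is rejected because it carries no http(s) scheme. Returning nil rather than throwing keeps the fallback to the default endpoint a one-line decision at the call site.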
One environment variable, many destinations
Why June 15 hits harder than a normal deprecation
Deprecation and retirement are not the same event. Opus 4.1 was deprecated on June 5 with retirement set for August 5, so it keeps serving for now and you get runway. Sonnet 4 and Opus 4 were retired on June 15: after that date a request pinned to those ids does not warn, it fails. Anthropic directs you to Sonnet 4.6 and Opus 4.8 as the replacements.
If your setup hard-codes a model id anywhere (a shell alias, a CI job, a config file, a wrapper's default), that is the line you change. The advantage of a tool that lets you pick the backend model per chat is that a retirement is a one-click swap, not an incident. The same applies to the new Fable 5 and Mythos 5 models from June 9: if you want them, you select them, and the rest of your session is untouched.
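The deprecated-versus-retired distinction can be sketched as a small lookup that a wrapper might run before sending a request. The dates and replacement names come from the notes above. The id strings (in particular `claude-opus-4-1`), the type names, and the choice to treat the retirement date itself as already retired are illustrative assumptions, so check the official model list for the exact ids.

```swift
/// Lifecycle data taken from the June 2026 notes; id strings are placeholders.
struct ModelLifecycle {
    let retiresOn: String      // ISO date, yyyy-MM-dd
    let replacement: String?   // nil when the notes name no replacement
}

enum ModelStatus: Equatable {
    case active                              // not listed as deprecated or retired here
    case deprecated(retiresOn: String)       // still serves requests until this date
    case retired(replacement: String?)       // requests fail; switch models
}

let lifecycles: [String: ModelLifecycle] = [
    "claude-sonnet-4": ModelLifecycle(retiresOn: "2026-06-15", replacement: "Sonnet 4.6"),
    "claude-opus-4": ModelLifecycle(retiresOn: "2026-06-15", replacement: "Opus 4.8"),
    "claude-opus-4-1": ModelLifecycle(retiresOn: "2026-08-05", replacement: nil),
]

/// Classifies a pinned model id on a given ISO date.
func status(of modelID: String, on isoDate: String) -> ModelStatus {
    guard let entry = lifecycles[modelID] else { return .active }
    // ISO yyyy-MM-dd strings sort chronologically, so string comparison is enough.
    // Assumption: the retirement date itself already counts as retired.
    if isoDate >= entry.retiresOn {
        return .retired(replacement: entry.replacement)
    }
    return .deprecated(retiresOn: entry.retiresOn)
}
```

On 2026-06-20 this reports `claude-sonnet-4` as retired with Sonnet 4.6 as the replacement, and `claude-opus-4-1` as deprecated with runway until 2026-08-05. A check like this in a CI job or wrapper startup turns a surprise failed request into a warning you see in advance.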
1M default context is what makes "don't compact" reasonable
There is a long-running argument about auto-compacting. Summarizing the early part of a session frees up room, but it silently drops decisions you made an hour ago. The counter-move is to keep the full history live for the lifetime of the window and never summarize. The obvious objection used to be cost and ceiling: a full history burns through context fast.
The late-May Opus 4.8 change moves the ceiling. A 1M token default context (also shared by Fable 5 and Mythos 5) means a long session can carry far more of its own history before anything has to give. That is the technical precondition that turns "never auto-compact" from a stubborn preference into a defensible default. Pair that with mid-conversation system messages that preserve prompt cache hits, also from the May 28 notes, and you can steer a long session without throwing away the cached prefix you already paid for.
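The headroom argument is simple arithmetic, sketched here with the 1M context and 128k max output figures quoted above. It assumes the reply's output budget counts against the same window and that you already have a token count for the history (for example from the usage numbers the API returns). The function names are illustrative.

```swift
/// Limits quoted in the notes for Opus 4.8, Fable 5, and Mythos 5.
let contextWindowTokens = 1_000_000
let maxOutputTokens = 128_000

/// Tokens of history that still fit if the next reply may use the reserved output budget.
/// Assumption: output tokens count against the same context window.
func remainingHistoryBudget(historyTokens: Int, reservedOutput: Int = maxOutputTokens) -> Int {
    return contextWindowTokens - reservedOutput - historyTokens
}

/// True only when the full history no longer fits and something has to give.
func mustTrimHistory(historyTokens: Int) -> Bool {
    return remainingHistoryBudget(historyTokens: historyTokens) < 0
}
```

With 200,000 tokens of history there are still 672,000 tokens of room, so nothing needs compacting. Only at around 872,000 tokens of history does the ceiling start to bind.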
If you want the deeper version of this argument, see the companion piece on controlling Claude Code context compaction.
Two billing notes, one quiet win and one that did not happen
The quiet win is June 2: a request returning stop_reason: "refusal" with no generated output is no longer billed. If your agent loop occasionally trips a hard refusal, you were previously paying for input tokens and getting nothing back. That stops automatically, no flag required.
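If you want to count how often this happens in your own logs, a client-side check along these lines would do it. This is a sketch that decodes only the `stop_reason` and `content` fields of a Messages API response. Treating "no output" as "no content blocks, or only empty text blocks" is an assumption, since the notes do not spell out how Anthropic decides it for billing.

```swift
import Foundation

/// Minimal view of a Messages API response: only the fields this check needs.
struct ContentBlock: Decodable {
    let type: String
    let text: String?
}

struct MessageResponse: Decodable {
    let stopReason: String?
    let content: [ContentBlock]

    enum CodingKeys: String, CodingKey {
        case stopReason = "stop_reason"
        case content
    }
}

/// True when the response is a refusal that carries no generated output.
/// Assumption: any non-text block, or any non-empty text block, counts as output.
func isEmptyRefusal(_ data: Data) throws -> Bool {
    let response = try JSONDecoder().decode(MessageResponse.self, from: data)
    let hasOutput = response.content.contains { block in
        block.type != "text" || !(block.text ?? "").isEmpty
    }
    return response.stopReason == "refusal" && !hasOutput
}
```

This only classifies responses for your own metrics. The billing change itself is applied on Anthropic's side and needs nothing from the client.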
The one that did not happen is worth flagging because several guides state it as fact. Anthropic announced a change that would have moved Agent SDK, claude -p, and third-party app usage onto a separate monthly credit pool starting June 15, 2026, then paused it on the day it was due to take effect. So if you bring your own Claude Pro or Max account to a wrapper, that flat-rate path is, as of this writing, unchanged. If a page tells you the credit-pool split is live, check the official notes before you plan around it.
Questions people actually ask about these notes
What is the single most disruptive Anthropic API change in June 2026?
The June 15, 2026 retirement of Claude Sonnet 4 and Claude Opus 4. Retirement is harder than deprecation: a request pinned to a retired model id fails outright instead of warning. Anything that still hard-codes claude-sonnet-4 or claude-opus-4 (a script, a CI job, a wrapper config) now fails and needs to move to Sonnet 4.6 or Opus 4.8.
Does the refusal billing change apply automatically?
Yes. As of June 2, 2026 a request that returns stop_reason: "refusal" with no generated output is simply not billed. You do not opt in. If you were previously paying for the input tokens of requests Claude refused outright, those stop showing up on the bill.
What does the Opus 4.8 1M default context window change in practice?
Before, large context was a beta header you opted into. Opus 4.8 ships 1M tokens as the default. For a long agent session that means you can carry far more history before you hit the ceiling, which is what makes keeping a full chat live (instead of summarizing it away) practical rather than reckless.
Was there a June 15 billing change for the Agent SDK?
Anthropic announced a change that would have moved Agent SDK, claude -p, and third-party app usage onto a separate monthly credit pool starting June 15, 2026, then paused it on the day it was due to take effect. As of this writing that flat-rate subscription usage path is unchanged. Treat any guide that states the credit-pool change as live with caution and check the official notes.
How do I point Claude Code at Claude Platform on AWS or a corporate gateway?
The Anthropic SDK reads ANTHROPIC_BASE_URL. Anything that spawns the Claude Code agent loop can set that variable to redirect traffic to an Anthropic-compatible endpoint. In Fazm this is a Settings field: the app validates the URL, writes it into the child process environment, and disables the bundled key so it never reaches your proxy. See the section above for the exact behavior.
Where is the authoritative source for these notes?
Anthropic publishes them at docs.claude.com/en/release-notes/api. This page mirrors and dates those entries as of June 20, 2026 and adds the reading that matters if you run the agent loop through a wrapper. When in doubt, the official page wins.
Do I need to change anything in Fazm because of these releases?
If you were pinned to a retired model id, switch the backend model in the picker. Otherwise no: the agent loop is the real Claude Code, so it inherits new models, the refusal billing change, and the 1M default context automatically. The custom endpoint field is there if you want to route through AWS, a Copilot gateway, or your own proxy.
Related reading
Control Claude Code context compaction
Why auto-compacting drops decisions, and how to keep a full session live.
The May 2026 Claude Code 529 outage
What a 529 overloaded error means and how a wrapper survives it.
Buying extra Claude usage
How Claude usage and credits work when you bring your own account.