Anthropic API, 2026

Anthropic API release notes, 2026: the changes that actually reach a Claude Code wrapper

Anthropic ships API changes most weeks. For a developer who runs Claude Code, only one question matters when reading the full list: does this reach me, or is it a server-side feature my client has to opt into? This page sorts the 2026 changelog by that question.

Matthew Diakonov
9 min read

Direct answer, verified June 15, 2026

Anthropic's official, continuously updated API release notes are at platform.claude.com/docs/en/release-notes/overview. The headline 2026 changes: Claude Opus 4.7 landed April 16, then Claude Opus 4.8 on May 28 (now the API default at $5 / $25 per million tokens, with the effort parameter defaulting to high); Claude Fable 5 arrived June 9; Managed Agents and the ant CLI shipped April 8; Claude Platform on AWS launched May 11; and a run of models retired, including Haiku 3 on April 20 and both Sonnet 4 and Opus 4 on June 15.

Two kinds of release note

Every roundup of Anthropic's 2026 API changes lists the same items in date order. That is useful, but it flattens a distinction that decides whether a given line affects you at all. There are model-level changes and there are platform-level changes, and they reach a Claude Code wrapper through completely different paths.

A model-level change (a new model ID, a new tokenizer, the effort default, the jump to a 1M context window) flows through to anything that calls the Messages API. When you run the real Claude Code agent loop on your own Claude Pro or Max account, you inherit it with no integration work. When Anthropic flipped the default to Opus 4.8 on May 28, a Claude Code wrapper picked it up the same moment the bare CLI did.

A platform-level change (Managed Agents, Claude Platform on AWS, server-side compaction, the advisor tool) is a feature a client chooses to build on. A native desktop chat app does not have to adopt any of them, and sometimes the right product decision is to skip one on purpose. The clearest example is the change this whole page is built around.

The one we skip on purpose

Server-side compaction shipped in January 2026. Fazm runs none of it.

On January 12, 2026 Anthropic shipped server-side context compaction (beta header compact-2026-01-12): as a long session nears the context limit, the API summarizes earlier history into a compaction block you pass back each turn. It is a real fix for the stateless-API problem, and it is a beta a client opts into.

Fazm opts out. The app keeps the full chat history live in context for the lifetime of the window, and persists every session across a Mac restart. The bridge that carries this is concrete, not marketing: acp-bridge/package.json pins @agentclientprotocol/claude-agent-acp@0.29.2 and @zed-industries/codex-acp@0.12.0, so you are talking to the real Claude Code (or Codex) agent loop, wrapped in a native UI that owns persistence itself rather than handing it back to a server summarizer.

The practical difference: three hours into a session, you can scroll back to the exact decision the agent made, verbatim, instead of to a summary the model wrote about that decision.
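The two strategies can be contrasted with a toy model. This is a minimal sketch, not Fazm's code and not Anthropic's compaction implementation: the `Turn` type, both function names, and the `toy_summarizer` are hypothetical, and a real summarizer would be a model call rather than a string template.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Turn:
    role: str  # e.g. "user", "assistant", or "summary" (hypothetical labels)
    text: str


def keep_full_history(history: List[Turn]) -> List[Turn]:
    """Preserving strategy: every original turn stays, verbatim."""
    return list(history)


def compact_history(
    history: List[Turn],
    keep_recent: int,
    summarize: Callable[[List[Turn]], str],
) -> List[Turn]:
    """Condensing strategy: older turns are replaced by one summary turn."""
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")
    if len(history) <= keep_recent:
        return list(history)
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    # The original text of `older` is gone after this point; only the
    # summary string survives.
    return [Turn("summary", summarize(older))] + recent


def toy_summarizer(turns: List[Turn]) -> str:
    """Stand-in for a model-written summary (assumption for illustration)."""
    return f"[{len(turns)} earlier turns summarized]"
```

With four turns and `keep_recent=2`, `compact_history` returns three turns: one summary plus the last two originals. `keep_full_history` returns all four unchanged, which is what makes scrolling back to the exact wording possible.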

Server-side compaction vs. no compaction

Same problem, opposite answer. Anthropic's platform feature condenses; the wrapper preserves. Neither is forced by the API.

| Feature | Server-side compaction (compact-2026-01-12) | Fazm (no compaction) |
| --- | --- | --- |
| Long session strategy | Earlier history is condensed into a compaction block once it nears the context limit | Full chat history stays live for the window's lifetime; nothing is summarized away |
| Where it runs | Server-side on the Anthropic platform (beta header compact-2026-01-12) | App-level decision in the ACP bridge on your Mac |
| What you can re-read later | A summary the model wrote; original turns are replaced | Every original message, verbatim, after a restart |
| Survives a Mac restart | Out of scope; the API is stateless and the wrapper owns persistence | Yes, every window auto-restored with history intact |

The 2026 changelog, sorted by whether it reaches you

Newest first. Each entry is tagged with how it reaches someone running Claude Code through a desktop wrapper. Dates and details are taken from Anthropic's official release notes, read on June 15, 2026.

Jun 15, 2026 | Reaches you automatically

Claude Sonnet 4 and Claude Opus 4 retired

claude-sonnet-4-20250514 and claude-opus-4-20250514 now return an error on every request. The recommended targets are Sonnet 4.6 and Opus 4.8. If any tool you run still pins one of these dated IDs, it breaks today, not gradually.

Jun 9, 2026 | Only if the client adopts it

Claude Fable 5 and Mythos 5 launch

Fable 5 (claude-fable-5) is the most capable widely released model: 1M context by default, 128k max output, always-on adaptive thinking, and no support for thinking: disabled, manual thinking budgets, or assistant prefill (all return 400). Safety classifiers can return stop_reason: refusal; an opt-in fallbacks parameter re-runs a refused request on another model. Requires 30-day data retention.

Jun 5, 2026 | Reaches you automatically

Opus 4.1 deprecation announced

claude-opus-4-1-20250805 is scheduled to retire on the Claude API on August 5, 2026, with Opus 4.8 as the migration target.

Jun 2, 2026 | Only if the client adopts it

Refused-with-no-output requests are no longer billed

When a request returns stop_reason: refusal and the model generated no output, you are no longer charged for it. The advisor tool also gained a per-call max_tokens cap.

May 28, 2026 | Reaches you automatically

Claude Opus 4.8 becomes the default model

Opus 4.8 (claude-opus-4-8) launched as the most capable generally available model at the same $5 / $25 per MTok as 4.7, with 1M context and 128k output. The effort parameter now defaults to high across all surfaces, including Claude Code and the Messages API. Mid-conversation system messages also shipped, and the stop_details refusal field became publicly documented.
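To make the pricing concrete, here is a small cost estimate using the $5 / $25 per MTok figures above. It is a sketch under stated assumptions: the function name is hypothetical, and the optional `token_inflation` factor is an illustration for comparing against token counts measured on an older tokenizer (the Opus 4.7 entry on this page cites roughly 30% more tokens for the same text), applied equally to input and output as a simplification.

```python
# Prices quoted on this page for Opus 4.8 (and 4.7), in USD per million tokens.
INPUT_USD_PER_MTOK = 5.0
OUTPUT_USD_PER_MTOK = 25.0


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    token_inflation: float = 1.0,
) -> float:
    """Estimate request cost in USD.

    token_inflation scales both counts; 1.0 means the counts were already
    measured with the current tokenizer (assumption for illustration).
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if token_inflation <= 0:
        raise ValueError("token_inflation must be positive")
    scaled_in = input_tokens * token_inflation
    scaled_out = output_tokens * token_inflation
    total = scaled_in * INPUT_USD_PER_MTOK + scaled_out * OUTPUT_USD_PER_MTOK
    return total / 1_000_000
```

For example, 200,000 input tokens and 10,000 output tokens come to $1.00 + $0.25 = $1.25. If those counts came from a pre-4.7 tokenizer, passing `token_inflation=1.3` gives a rough upper estimate of about $1.63.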

May 27, 2026 | Only if the client adopts it

Thinking-token usage is now reported separately

The Messages API response includes usage.output_tokens_details.thinking_tokens, so you can see how many billed output tokens went to extended thinking. When streaming, the breakdown lands on the final message_delta event.
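A client that wants to surface this could compute the share of output spent on thinking. The sketch below works on a plain dictionary shaped like the `usage` object described above; the function name is hypothetical, and treating `thinking_tokens` as a subset of `output_tokens` is an assumption based on the description of it as part of billed output.

```python
from typing import Any, Mapping, Optional


def thinking_share(usage: Mapping[str, Any]) -> Optional[float]:
    """Return thinking tokens as a fraction of output tokens.

    Returns None when the breakdown is absent (for example, a response
    from before the field shipped) or when there was no output at all.
    """
    output_tokens = usage.get("output_tokens")
    details = usage.get("output_tokens_details") or {}
    thinking_tokens = details.get("thinking_tokens")
    if not output_tokens or thinking_tokens is None:
        return None
    return thinking_tokens / output_tokens
```

When streaming, the same function would be applied to the usage carried by the final message_delta event, since that is where the breakdown lands.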

May 13, 2026 | Only if the client adopts it

Cache diagnostics in public beta

Pass diagnostics.previous_message_id and the API returns a cache_miss_reason explaining where your prompt-cache prefix diverged from the previous turn. Requires the cache-diagnosis-2026-04-07 beta header.

May 11, 2026 | Does not apply to a desktop chat client

Claude Platform on AWS launches

The full Claude API (Messages, Files, Batches, Managed Agents, Skills, code execution, tool use) became available through AWS-native endpoints with AWS billing and IAM authentication. Distinct from partner-operated Amazon Bedrock; same-day API parity except self-hosted sandboxes.

Apr 16, 2026 | Reaches you automatically

Claude Opus 4.7 launches

Opus 4.7 introduced the xhigh effort level, task budgets, a new tokenizer (roughly 30% more tokens than pre-4.7 models for the same text), high-resolution vision, and breaking changes versus 4.6: budget_tokens and the sampling parameters temperature / top_p / top_k all return 400.
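Since these parameters fail with a 400 rather than being ignored, it can help to check a request body before sending it. The following is a minimal sketch, assuming the request is a plain dictionary and that `budget_tokens` sits inside a `thinking` object; the helper name is hypothetical, and the check only reports offending keys rather than guessing at a valid replacement.

```python
from typing import Any, List, Mapping

# Sampling parameters this page lists as rejected from Opus 4.7 onward.
REJECTED_TOP_LEVEL = ("temperature", "top_p", "top_k")


def find_rejected_params(request: Mapping[str, Any]) -> List[str]:
    """List the keys in a request body that the page says now return 400."""
    found = [key for key in REJECTED_TOP_LEVEL if key in request]
    thinking = request.get("thinking")
    # Assumption: a manual thinking budget is expressed as thinking.budget_tokens.
    if isinstance(thinking, Mapping) and "budget_tokens" in thinking:
        found.append("thinking.budget_tokens")
    return found
```

A wrapper could run this over any stored request templates and log the result, which turns a runtime 400 into a startup warning.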

Apr 8, 2026 | Does not apply to a desktop chat client

Claude Managed Agents and the ant CLI

Managed Agents shipped in public beta under the managed-agents-2026-04-01 header: a server-managed agent harness where Anthropic runs the loop and hosts the tool sandbox. The ant CLI also launched for driving the API and versioning agent configs in YAML.

Mar 13, 2026 | Only if the client adopts it

Task Budgets beta

task-budgets-2026-03-13 lets you tell the model how many tokens it has for a full agentic loop. The model sees a running countdown and self-moderates, distinct from max_tokens, which is an enforced ceiling the model never sees. Minimum 20,000 tokens.

Jan 12, 2026 | Only if the client adopts it

Server-side context compaction beta

compact-2026-01-12 is Anthropic's platform answer to long sessions: when context approaches the limit, the API summarizes earlier history server-side into a compaction block you pass back each turn. It is a beta a client opts into, not a default, and it is the one change this page is built around.

This is a selected list focused on changes a developer running Claude Code would feel. Anthropic publishes far more (Managed Agents memory, webhooks, multi-agent sessions, the Rate Limits API, MCP tunnels, and more); the full notes are the complete record.

Model retirements to check today

A retired model ID is a hard error, not a soft warning. If anything in your stack pins one of these dated IDs, it stops working the day it retires. This is the one part of the changelog worth grepping your configs for.

| Model | Retired / scheduled | Migrate to |
| --- | --- | --- |
| Claude Sonnet 4 / Opus 4 | Jun 15, 2026 | Sonnet 4.6 / Opus 4.8 |
| Claude Opus 4.1 | Aug 5, 2026 (scheduled) | Opus 4.8 |
| Claude Haiku 3 | Apr 20, 2026 | Haiku 4.5 |
| Claude Sonnet 3.7 / Haiku 3.5 | Feb 19, 2026 | Sonnet 4.6 / Haiku 4.5 |
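One way to do that grep is a short script. This is a sketch, not a complete audit: it covers only the dated IDs spelled out on this page, the function names are hypothetical, and the IDs for the other retired models would need to be added from Anthropic's deprecation notes.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

# Dated model IDs named on this page, with the status given in the table.
RETIRED_IDS: Dict[str, str] = {
    "claude-sonnet-4-20250514": "retired Jun 15, 2026; migrate to Sonnet 4.6",
    "claude-opus-4-20250514": "retired Jun 15, 2026; migrate to Opus 4.8",
    "claude-opus-4-1-20250805": "scheduled to retire Aug 5, 2026; migrate to Opus 4.8",
}


def find_retired_ids(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, model_id) for each listed ID found in text."""
    hits: List[Tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for model_id in RETIRED_IDS:
            if model_id in line:
                hits.append((lineno, model_id))
    return hits


def scan_files(paths: Iterable[Path]) -> None:
    """Print one line per hit across the given config files."""
    for path in paths:
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, model_id in find_retired_ids(text):
            print(f"{path}:{lineno}: {model_id} ({RETIRED_IDS[model_id]})")
```

Calling `scan_files(Path(".").rglob("*.json"))` from a project root would check every JSON config beneath it; the glob pattern is only an example and should match wherever model IDs are actually stored.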

Running Claude Code on a bring-your-own Claude account sidesteps most of this: you call whatever your plan currently serves, so a retired dated ID is not something you maintain by hand.


Questions developers ask about the 2026 notes

Where are Anthropic's official API release notes for 2026?

They live at platform.claude.com/docs/en/release-notes/overview, under the heading 'Claude Platform', and they are updated continuously, usually several times a week. That page is the primary source for everything on this page; it covers the Claude API, the client SDKs, and the Claude Console. Claude app updates (claude.ai, Cowork) are in a separate Help Center article, and Claude Code's own changes are in the CHANGELOG.md in the claude-code repository.

What actually changed in the Anthropic API in 2026?

The biggest items, in date order: Managed Agents and the ant CLI shipped April 8, Claude Opus 4.7 launched April 16 with breaking changes versus 4.6 (sampling parameters and manual thinking budgets now return 400), Claude Platform on AWS launched May 11, Claude Opus 4.8 launched May 28 and became the default model with effort defaulting to high, and Claude Fable 5 launched June 9. Alongside the models, Anthropic shipped a steady stream of betas: server-side compaction (compact-2026-01-12), Task Budgets (task-budgets-2026-03-13), mid-conversation system messages, and cache diagnostics. Several models were also retired: Haiku 3 on April 20, and Sonnet 4 and Opus 4 on June 15.

Is Claude Opus 4.8 the default model on the API now?

Yes. Opus 4.8 (claude-opus-4-8) launched May 28, 2026 as the most capable generally available model at $5 / $25 per million tokens, with a 1M-token context window by default and 128k max output. On launch it became the default across surfaces, and the effort parameter now defaults to high on the Messages API and in Claude Code. Because Fazm runs the real Claude Code agent loop on your own Claude Pro or Max account through ACP, that default reached Fazm users without any Fazm update.

Which 2026 model retirements broke existing code?

Four waves, in date order. Opus 3 retired January 5. Sonnet 3.7 and Haiku 3.5 retired February 19. Haiku 3 retired April 20. Sonnet 4 (claude-sonnet-4-20250514) and Opus 4 (claude-opus-4-20250514) retired June 15, and now return an error on every request. A fifth is pending: Opus 4.1 is scheduled to retire August 5, 2026. A retired model is a hard 404, not a soft warning, so anything pinning a dated ID for one of these breaks the day it retires.

What is server-side compaction, and does Fazm use it?

Server-side compaction (beta header compact-2026-01-12) is Anthropic's platform feature for long conversations: as you approach the context window, the API summarizes earlier history into a compaction block, and you pass that block back on each turn. Fazm deliberately does not use it. The app keeps the full chat history live in context for the lifetime of the window, and persists every session across a Mac restart, so you can scroll back to the exact decision the agent made three hours ago rather than to a summary of it. That is a wrapper-level choice, made in the ACP bridge, not something the API forces either way.

Does Fazm need an update every time Anthropic ships a new model?

No. Fazm wraps Claude Code via the Agent Client Protocol and you bring your own Claude Pro or Max account, so the model your Claude plan serves is the model the agent runs. When Anthropic flipped the default to Opus 4.8 on May 28, Fazm sessions picked it up the same way the bare Claude Code CLI did. Server-side platform features like Managed Agents or compaction are different: those are choices a client makes, and a native desktop chat app does not have to adopt them.

Which 2026 API features are platform-level rather than model-level?

Managed Agents (managed-agents-2026-04-01), Claude Platform on AWS, scheduled deployments, vault environment-variable credentials, the advisor tool (advisor-tool-2026-03-01), cache diagnostics, and server-side compaction are all platform or SDK features you opt into deliberately. Model-level changes (a new model ID, a new tokenizer, the effort default, the 1M context window) flow through to anything that calls the Messages API, including a Claude Code wrapper, without per-feature integration work.
