Anthropic returned 529 overloaded. Claude Code told users their org was out of credit. The bridge layer is where the real fix lives.
On the weekend of May 15 to 17, 2026, Anthropic absorbed three capacity events on the Claude API surface and responded the way the docs say to respond: HTTP 529 overloaded. Most Claude Code clients did not branch on the error code at the bridge, so the user-visible surface was a wallet prompt, a fallback model switch, or a silent session loss. The right place to fix this is one TypeScript file, not a banner.
What changed, what stayed the same
- Read the error code at the bridge, not the message at the UI: a 529 is not a 402
- Treat 529 as transient. Retry with backoff on the same provider, not a fallback model
- Keep one session per window in a bridge map so siblings do not share a fate
- Queue the user's in-flight prompt and replay it on retry; do not discard it
- Surface a provider-named overload message, not a wallet prompt, unless the wallet is actually empty
- Persist the conversation to disk so the chat survives a restart
The wedge in one sentence
A vendor 529 is a transient overload, and any client that surfaces it to the user as a billing event is misreading its own error code.
Expanded: 529 is documented behavior. The agent layer's job is to catch it at the boundary closest to the network, classify it correctly, retry on the same provider, and keep the streaming socket to the UI open through the retry. Every other piece of behavior (no fallback switch, no sibling abort, no wallet prompt, no lost transcript) falls out for free once you do that one thing. The reason most clients did not do it on May 15 to 17 is that they built the wallet branch first, back when 529s were rare, and they never tightened the heuristic. Fazm's acp-bridge layer is the place where that decision lives in this product, and v2.9.18 added the branch.
How most clients handle a 529 vs. how a bridge layer should
529 treated as terminal. Active chat aborted or flipped to a smaller fallback model. Sibling chats in the same wrapper aborted as collateral damage. User sees 'your org is out of extra usage, we let your admin know' or 'usage limit reached'. Quoted as a screenshot. Goes viral.
- 529 treated as terminal
- Fallback model switch
- Sibling chats aborted
- Wallet prompt surfaces
- In-flight prompt lost
The patch, in TypeScript
The branch that absorbs the 529 lives in the bridge file in the public Fazm repo. Paraphrased lightly for readability; the variable names and the structure match the actual code on disk:
The function is ten lines that matter. Catch the error. Read status or error.code. If it is 529, treat it as a transient overload, emit a structured overload_retry event to Swift so the UI can print an honest line, sleep with exponential backoff, then loop. If the retry budget is exhausted the error bubbles up to the UI with the same honest framing, not a wallet prompt.
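For readers without the repo open, the shape described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in acp-bridge/src/index.ts: `withOverloadRetry`, `RetryOptions`, the `status` field on the thrown error, and the delay values are all hypothetical. Only the `overload_retry` event name comes from the description.

```typescript
// Hypothetical sketch of a 529 retry loop; not the code in acp-bridge/src/index.ts.
type BridgeEvent = { type: "overload_retry"; attempt: number; delayMs: number };

interface RetryOptions {
  maxAttempts: number;                   // retry budget, counting the first try
  baseDelayMs: number;                   // first backoff delay
  emit: (event: BridgeEvent) => void;    // stand-in for the event channel to the UI
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Assumption: the thrown error exposes the HTTP status as `status`.
function isOverloaded(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: unknown }).status === 529
  );
}

async function withOverloadRetry<T>(
  send: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  for (let attempt = 1; ; attempt++) {
    try {
      return await send();
    } catch (err) {
      // Anything that is not a 529, or a 529 past the budget, bubbles up unchanged.
      if (!isOverloaded(err) || attempt >= opts.maxAttempts) throw err;
      const delayMs = opts.baseDelayMs * 2 ** (attempt - 1); // 1x, 2x, 4x, ...
      opts.emit({ type: "overload_retry", attempt, delayMs });
      await sleep(delayMs);
    }
  }
}
```

Because `send` is a closure supplied by the caller, every retry goes to the same provider with the same session. Nothing in the loop knows a fallback model exists, which is the point.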
Timeline of the weekend
Reconstructed from status.anthropic.com incident updates, the public Fazm CHANGELOG.json, and the timestamps on the X posts that piled on the cluster:
Fri May 15, 2026, evening PT
First wave of Claude API 529 'overloaded' responses observed on status.anthropic.com. Claude Code clients without 529 handling begin surfacing 'out of credit' and 'usage limit reached' messages instead of an outage banner.
Fazm acp-bridge ships v2.9.18 the same evening. The 529 branch is added to the streaming retry loop in acp-bridge/src/index.ts. CHANGELOG.json gets a dated entry.
Sat May 16, 2026, mid-day PT
Second wave of 529s. Reports on X and Reddit spike. Several screenshots show Claude Code surfacing 'your org is out of extra usage, we let your admin know' for what is in fact a capacity overload at Anthropic's end.
Fazm ships v2.9.19 with pre-warmed pop-out sessions and surfaces real Codex failure reasons (e.g. ChatGPT usage limit reached) instead of a generic error. Workspaces using Codex via codex-acp can switch a chat to OpenAI mid-session.
Sun May 17, 2026, morning PT
Third wave. Viral X threads land: 22,346 views on the eastweb3eth quote-tweet pointing at fazm.ai, 26,330 views on the svpino thread on the same topic. Both link to the homepage, not to any deep page.
The fact that the viral links pointed at the bare homepage is what motivated this dedicated /t/ page and the NewsStrip pointing here from the top of fazm.ai.
Sun May 17, 2026, afternoon PT
Anthropic status page returns to all green. The misleading 'out of credit' banners that surfaced for Team and Enterprise users during the outage windows resolve as their clients see successful responses again. No retroactive correction on the wallet UI.
Fazm v2.9.22 ships overnight with workspace-switch transcript replay, unrelated to the outage but the next thing on the queue. The bridge 529 branch from v2.9.18 keeps doing its job.
The mechanism, as a sequence
The shape of a 529 retry across the three layers (Fazm Swift UI, acp-bridge Node process, Anthropic API):
A 529 retry, end to end
Three layers, one branch
What it looks like in the log
The bridge writes structured log lines to /tmp/fazm-dev.log on dev builds and /tmp/fazm.log on production builds. A real 529 retry during the May 17 morning window looked roughly like this:
The X thread, in full
The reason this page exists is that two threads on X reached a combined 48,676 views (roughly 48,700) on May 17 and both linked to the bare homepage. The threads named the symptom (Claude Code surfacing wallet errors during an outage) but the deep link they offered was just fazm.ai. Below are the source posts, quoted with their verbatim engagement metrics for the record:
The viral quote-tweet of the outage cluster. The link in the post points to fazm.ai (the homepage), not to any deep page on the story, which is what motivated this dedicated /t/ article and the NewsStrip on the homepage that routes the click through to here.
22,346 views, 175 likes, 72 replies
https://x.com/eastweb3eth/status/2055482080134586787
Santiago Valdarrama's thread on the same outage and the way Claude Code clients mislabeled it. Reaches the ML educator audience and adds the 'this is a client problem, not an Anthropic billing problem' frame.
26,330 views, 115 likes, 26 replies
https://x.com/svpino/status/2055759187825893461
The 529 'overloaded' code is documented on docs.anthropic.com/api/errors and surfaced on status.anthropic.com as elevated-error events. The viral X cluster is the community-side primary source because there was no separate Anthropic blog post for the weekend, only the live status timeline.
Multiple incident windows on the public status page
https://status.anthropic.com/
What to do this week if you ship a Claude Code client
Read the error code at the bridge
Branch on the HTTP status and the body error.code before deciding which UI surface to show. A 529 must not fall through into the wallet branch.
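A minimal sketch of that branch, assuming the bridge already has the HTTP status and some trustworthy signal for whether the wallet is really empty. `classifyStatus`, the `Surface` names, and the `walletEmpty` flag are hypothetical and stand in for whatever your client uses:

```typescript
// Hypothetical classifier: maps an error status to the UI surface the bridge should pick.
type Surface = "overload_retry" | "rate_limited" | "wallet_prompt" | "generic_error";

function classifyStatus(status: number, walletEmpty: boolean): Surface {
  // Vendor overload: transient, and never a billing event, whatever the wallet says.
  if (status === 529) return "overload_retry";
  // Rate limit: also retryable, but it deserves its own message.
  if (status === 429) return "rate_limited";
  // The wallet prompt fires only when billing state confirms it.
  if (walletEmpty) return "wallet_prompt";
  return "generic_error";
}
```

The ordering is the design choice: the 529 check comes first, so no later heuristic can reinterpret an overload as a billing state.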
Add an explicit 529 retry
Exponential backoff. Same provider. Same session ID. Cap the retry budget at three to five attempts so a real long outage still surfaces an error eventually.
Keep the streaming socket open
Do not close the chat's stream during the retry. The Swift (or web) UI should see an overload_retry event and keep its conversation buffer intact.
Isolate sessions
One session per chat window in a bridge map. A 529 on one chat must not abort any other chat in the workspace.
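One way to picture the isolation, as a sketch rather than a description of any particular bridge: each window gets its own entry and its own abort controller, so cancelling one request cannot reach a sibling. `SessionMap`, `ChatSession`, and the window IDs are hypothetical names.

```typescript
// Hypothetical bridge-side session map: one entry per chat window.
interface ChatSession {
  sessionId: string;
  controller: AbortController; // cancels only this window's in-flight request
}

class SessionMap {
  private sessions = new Map<string, ChatSession>();

  open(windowId: string, sessionId: string): ChatSession {
    const session: ChatSession = { sessionId, controller: new AbortController() };
    this.sessions.set(windowId, session);
    return session;
  }

  get(windowId: string): ChatSession | undefined {
    return this.sessions.get(windowId);
  }

  // Abort one window; siblings keep their own controllers and are untouched.
  abort(windowId: string): void {
    this.sessions.get(windowId)?.controller.abort();
  }
}
```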
Replay the in-flight prompt
Voice-transcribed or typed, whatever the user sent at t=0 has to ride the retry. Discarding it is the worst failure mode because the user does not know it happened.
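The replay rule is small enough to sketch. The idea is to hold the prompt until the provider has accepted the turn, and to hand back the same text on every retry. `InFlightPrompt` and its method names are hypothetical:

```typescript
// Hypothetical holder for the prompt that was in flight when the overload hit.
class InFlightPrompt {
  private pending: string | null = null;

  // Record the prompt before the request goes out.
  stage(prompt: string): void {
    this.pending = prompt;
  }

  // Called on each retry: returns the same text the user sent at t=0.
  replay(): string {
    if (this.pending === null) throw new Error("no prompt staged");
    return this.pending;
  }

  // Clear only after the provider has accepted the turn.
  complete(): void {
    this.pending = null;
  }
}
```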
Persist sessions to disk
JSONL on the filesystem under ~/.claude/projects (or your equivalent). If the outage exceeds your retry budget the chat can still resume after Anthropic recovers.
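A minimal sketch of the JSONL pattern using Node's built-in `node:fs` module. The `TranscriptEntry` shape and both function names are assumptions made for illustration; the real on-disk schema is not shown in this article.

```typescript
import { appendFileSync, existsSync, readFileSync } from "node:fs";

// Hypothetical transcript entry; the real schema may differ.
interface TranscriptEntry {
  role: "user" | "assistant";
  text: string;
}

// Append one entry as a single JSON line.
function appendEntry(path: string, entry: TranscriptEntry): void {
  appendFileSync(path, JSON.stringify(entry) + "\n", "utf8");
}

// Rebuild the conversation after a restart by parsing every non-empty line.
function loadTranscript(path: string): TranscriptEntry[] {
  if (!existsSync(path)) return [];
  return readFileSync(path, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TranscriptEntry);
}
```

Appending one line per turn means the file is always a valid prefix of the conversation, so a chat that outlives the retry budget can be reloaded once the provider recovers.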
Before the news vs. after the news
| Scenario | What 529 meant in the average client (before) | What 529 means now (after May 15 to 17) |
|---|---|---|
| Anthropic returns HTTP 529 overloaded mid-stream | Bubbles to the UI as a generic error, often mapped to a billing or 'usage limit' surface | Caught at bridge, classified as transient, retried on the same provider with backoff |
| Active chat window during an overload retry | Window flips to a fallback model or ends the turn. Context may reset. | Streaming socket stays open. UI prints 'overload, retrying'. Conversation context preserved. |
| Sibling chat windows when one chat hits 529 | Aborted as collateral damage in many wrappers that share one transport. | Untouched. Each window holds its own session in the bridge map. |
| User-visible message during a real outage | 'Your org is out of extra usage' / 'usage limit reached' / 'switched to a smaller model'. | Names the provider and the cause. No wallet prompt unless wallet is actually empty. |
| Voice input dictated during an overload | Often discards the prompt or returns a generic 'send failed' that requires manual retry. | Transcribed locally with WhisperKit. Prompt queued and replayed on retry. No redictation needed. |
| Session persistence across the outage window | Many wrappers keep state only in memory. Closing the window or restarting loses the chat. | JSONL transcript on disk under ~/.claude/projects. Resumes after Anthropic recovers, no copy-paste. |
The numbers from the weekend
Exact values from the public sources. The viral X thread counts come from the post analytics on May 17 evening PT; the rest is from Anthropic's public surfaces and the Fazm repo.
Myths the weekend corrected
The synthesized takeaway
An AI agent that survives a vendor outage is not a different category of agent. It is the same agent with one more branch in the bridge: catch 529, classify as transient, retry. Everything else that made the weekend feel chaotic (the wallet prompt, the fallback model switch, the silent session loss) was a second-order consequence of failing to add that one branch. The fix is not big. The shape of the fix is what matters: it lives in the layer closest to the network and farthest from the UI, where error codes still mean what the documentation says they mean.
Plan-by-plan: who saw what during the weekend
The 529 itself is plan-agnostic; it fires on the API surface, not on a per-plan basis. The user-visible surface, on the other hand, was shaped by which billing path the client's "something went wrong" fallback was wired to. Here is what that looked like across the common configurations:
| Plan | What was reported during the weekend | The right behavior |
|---|---|---|
| Claude Pro and Max (personal) | Still affected if the client maps any non-200 to a wallet prompt. | Most resilient to wallet-shaped error mapping (no org pool to misread). 529s still possible. |
| Claude Team | Where the misleading 'we let your admin know' surface was reported during the outage windows. | Bridge layer should branch on the error code before deciding to show the 'out of extra usage' banner. |
| Claude Enterprise | Admins received misleading 'out of credit' alerts on what was a vendor capacity event. | Same as Team for the wallet UX, plus admin-side notifications that can fire on overload events if the client does not branch. |
| Anthropic API direct | A naive script that crashes on the first non-200 will lose work. | 529 is documented behavior. Retry with backoff. Same fix as a wrapper. |
Honest caveats
- A retry budget is not infinite. If Anthropic stays overloaded for tens of minutes the bridge will eventually bubble an error to the UI. The fix moves the failure from "silently mislabeled" to "clearly named", not to "never happens".
- Switching providers per chat is a real option. The same wrapper that absorbs 529s can also let a user flip an individual chat to Codex via codex-acp for the duration of the outage. That is not a substitute for the retry branch; it is what you do if the outage is long enough that retry budgets exhaust.
- Not all 529s are equal. A 529 with a long Retry-After header is a signal to back off harder, not retry faster. Read the header if present and respect it.
- The wallet branch still has to exist. Out-of-credit and out-of-extra-usage are real states that can fire when no outage is happening. The point of the fix is to prevent that branch from firing when the error is a 529; it is not to remove the branch.
- This is not a model-quality story. Nothing about the May weekend is evidence about Claude 4.X being better or worse at a task. Capacity events are operational, not capability events.
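The Retry-After caveat above can be folded into the backoff calculation. The sketch below assumes the bridge has the raw header value as a string. `nextDelayMs` is a hypothetical helper, and it handles only the delta-seconds form of the header; an HTTP-date value falls back to plain exponential backoff.

```typescript
// Hypothetical helper: choose the wait before the next retry.
// `retryAfter` is the raw Retry-After header value, if the response carried one.
function nextDelayMs(
  attempt: number,
  baseDelayMs: number,
  retryAfter?: string | null,
): number {
  const backoff = baseDelayMs * 2 ** (attempt - 1);
  if (!retryAfter) return backoff;
  const seconds = Number(retryAfter);
  // Only the delta-seconds form is parsed here; anything else uses the backoff value.
  if (!Number.isFinite(seconds) || seconds < 0) return backoff;
  // Never retry sooner than the server asked for.
  return Math.max(backoff, seconds * 1000);
}
```

Taking the larger of the two values means the header can only lengthen the wait, never shorten it.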
Frequently asked, weekend edition
What was the Anthropic outage on the weekend of May 15 to 17, 2026?
A cluster of overload incidents on the Anthropic API surface that ran on a load-shedding signal: HTTP 529 'overloaded'. The status code is documented behavior for Claude when traffic exceeds the available capacity for a tier. Independent users reported the symptom hitting hardest in three windows: late Friday May 15 PT, mid-day Saturday May 16 PT, and Sunday morning May 17 PT, with the longest visible tail on Sunday. The pattern across the three windows is the same: the request reaches the API, the response comes back with HTTP status 529, the body is JSON with a top-level type of 'error', and the nested error object carries the type 'overloaded_error' and the message 'Overloaded'. It is not a billing event and it is not a per-user rate limit; it is the platform telling the client to back off because shared capacity is saturated.
Why did Claude Code show 'out of credit' or 'out of extra usage' during the outage?
Because most Claude Code clients in May 2026 treat any non-200 from the Anthropic API as terminal for the turn and then map it to whichever recovery surface they happen to have. Out-of-credit messaging is a Team-and-Enterprise affordance bolted to the wallet layer that is supposed to fire only when the org's extra usage pool is empty. A naive client that does not branch on the error code at the bridge layer surfaces a wallet message when the actual error is 529. That is what users saw: an outage-shaped error decorated as a billing prompt, with a 'let your admin know' phrase attached that has nothing to do with what failed.
Did Anthropic publish an official postmortem for the May 15 to 17 incidents?
Anthropic does maintain status.anthropic.com and publishes incident reports for API events; the public record of the weekend is several elevated-error and degraded-performance incidents on the Claude API and on the Claude.ai surface, all tied to the same overload signal. There was no separate blog-post-style postmortem at the time of writing this page; the canonical primary source is the incident timeline on the status page itself plus the response code semantics documented at docs.anthropic.com on the errors page. The fact that there is no glossy post is the right read: 529 is documented expected behavior for capacity events, not a regression.
Why did this become a viral story on X if it was just a normal capacity event?
Two reasons. First, Claude Code is now the primary developer surface for a meaningful slice of the engineering audience; an outage that disrupts a few hours of agent work for thousands of people is more visible than an equivalent dip on a chat product. Second, the failure mode in most clients was misleading. Users were not told 'Anthropic is overloaded, retry in a minute'; they were told 'your org is out of extra usage' or 'your model has been switched' or 'session ended'. Misleading errors generate quoted screenshots. Quoted screenshots are how things go viral. The cluster of posts on X on May 17 was the same handful of screenshots and the same shape of complaint repeated.
What changed in Fazm v2.9.18 that absorbs this kind of 529?
The bridge layer (acp-bridge in the Fazm repo) added explicit 529 handling on the streaming request to Anthropic. The raw 529 is caught at the bridge, classified as a transient overload, and an exponential backoff retry is fired on the same provider and the same session ID instead of bubbling the error to Swift. The streaming socket to the Mac app stays open through the retry, so the active chat window does not flip to a fallback model and does not abort. Sibling chat windows in the same Fazm process are completely untouched because each window holds its own session in the bridge map. The user-visible effect on May 15 to 17 was that Fazm windows printed a small 'overload, retrying' line and then continued; clients without the same branch printed 'out of credit' and stopped.
Where can I find the exact lines of code that added this behavior?
In the public mediar-ai/fazm repo on GitHub. The bridge file is acp-bridge/src/index.ts; the 529 branch sits near the streaming retry loop alongside the existing 429 and 5xx handling. The dated release line is in CHANGELOG.json under version 2.9.18 with date 2026-05-15. The CHANGELOG.json entry reads, paraphrased, that 529 overloads from Anthropic no longer trigger a fallback mode switch and no longer abort other open chat windows. The Swift side (the macOS app) was unchanged for v2.9.18; the fix is fully bridge-layer, which is part of why it shipped the same day as the first overload window without needing a notarized binary release.
Is this a Claude Code problem or an Anthropic problem?
Both, but in different ways. The 529 itself is correct behavior on Anthropic's side; capacity events have to surface as an error rather than a silent timeout, and 529 is a documented signal. The way most Claude Code clients translate that signal into a user-visible message is a client problem. There is no infrastructure reason a 529 has to read as a billing prompt; the only reason it does is that the recovery surface for capacity events on Team and Enterprise plans was built before agent loops were common, and the heuristic for 'show the out-of-credit page' did not get tightened when 529 became more frequent. The fix lives in clients.
Does Fazm just hide the outage, or does it surface it honestly?
It surfaces it honestly. The retry banner says 'overload, retrying' with the provider name and the retry count, and the active chat does not pretend it succeeded. If the overload persists past the configured retry budget the chat shows a clear 'Anthropic is overloaded, try again in a minute' message instead of a wallet prompt. The session is preserved on disk either way, so if you close the window you can reopen the conversation later. The point is not to mask the outage; the point is to name it correctly and keep the rest of the workspace usable while it lasts.
What should I actually do during an Anthropic overload window in 2026?
Three moves, plus a longer-term fourth. First, check status.anthropic.com to confirm the cause; if there is an open incident the wallet-shaped error you are seeing is wrong. Second, if your client supports it, switch the affected chat to a different provider for the moment (a wrapper that supports Codex via codex-acp lets you do that per chat without changing IDEs). Third, if your workflow is session-shaped, prefer a client that keeps the chat on disk so the conversation survives the outage; resuming a chat after Anthropic recovers is preferable to losing the thread. The fourth, longer-term move is to bring your own Anthropic account (Pro or Max) into the wrapper so the wallet branch never fires for you in the first place.
Is voice input affected when Anthropic returns 529?
Voice transcription on Fazm runs on-device via WhisperKit. The transcript becomes text the moment you release the hotkey; the 529 only enters the picture when the agent step that follows sends the prompt to Anthropic. So during an outage your voice still types into the chat field locally, but the agent's reply is what stalls or retries. If you are dictating long instructions, the prompt is captured and queued, and the bridge replays it on retry once Anthropic recovers; you do not have to redictate. That is also the bridge-layer fix, not a UI patch.
The wallet-side, the changelog-side, and the compaction-side of the same story
Adjacent reads
Your org is out of extra usage, we let your admin know: what that message actually means
The other side of the same coin. When the wallet message is correct, it means the org pool ran out. When it fires during a 529 outage, the client misread the error code.
AI model releases, new papers, open-source projects, past 24 hours (May 2026): the better question to ask
Where Fazm v2.9.18 sits in the broader May 2026 release cycle, with the full v2.9.18 through v2.9.22 changelog window.
Why Claude Code compaction drops your decisions
The other class of silent failure in long sessions. A compaction does not feel like an outage but it removes work in much the same way: the model can no longer see what you decided.
Building an agent that has to survive a vendor outage
If you are shipping a Claude Code client (or any agent on a frontier model API) and want the 529 branch, the session-isolation map, and the JSONL persistence pattern walked through line by line, this is what the call is for.