Claude Pro / Max overflow billing, explained from the SDK

What is Claude extra usage, seen from the rate_limit_event stream

Extra usage is the pay-as-you-go overflow Anthropic offers once you burn through your Pro, Max 5x, or Max 20x session and weekly limits. Most search results explain this in prose; almost none show the actual SDK event shape. This guide walks through the five rate_limit_type values Anthropic emits, the overageStatus and overageDisabledReason fields, and how a real desktop client intercepts them so you see the state in real time, before the Anthropic billing dashboard refreshes.

Matthew Diakonov
11 min read
Backed by the Mediar team
Forwards Anthropic's rate_limit_event live
Maps five rate_limit_type values to human labels
Surfaces overageStatus and overageDisabledReason
Works with your own Claude Pro or Max seat

The plain-English version

Anthropic sells Claude on three consumer subscription tiers: Pro, Max 5x, and Max 20x. Each tier gives you a rolling 5-hour session budget and a 7-day weekly budget, denominated in tokens under the hood but presented as message counts in the UI. When you exhaust either budget, the product has two choices: wall you off until the window resets, or let you keep going at API rates. The second path is what Anthropic calls extra usage.

Extra usage is off by default. You turn it on at settings.claude.com, set a monthly spending cap, and agree that Claude will bill any overflow at the same per-token prices as the Anthropic API. If a card is on file and the monthly cap is not yet reached, Anthropic keeps serving requests past your subscription cap and invoices the difference. If there is no card, or you hit your cap, the API returns a rate-limit rejection with a reason string and the chat stops.

Most guides stop there. The interesting part is underneath. Every time the Claude SDK or the claude-code agent makes a request, the response stream carries a structured event called rate_limit_event. That event is how your client knows where you are in the window, expressed as a utilization fraction between 0 and 1. The default ACP agent drops it. A well-written desktop client forwards it.

5 types

Extra usage is what Anthropic calls the overflow at standard API rates. On the SDK it is a rateLimitType value of 'overage' plus two status fields that say whether it is allowed and, if not, why.

Fazm acp-bridge/src/protocol.ts, RateLimitMessage interface

The five rate_limit_type values Anthropic actually emits

If you read the help center, you get generic language about "limits". If you tap the SDK event stream, you get five enumerated values. Fazm's ChatProvider.swift at line 537 maps each one to the human label that renders in the Mac UI.

five_hour

Your rolling 5-hour session budget. The one Pro users hit first, usually after a long Claude Code session or a chat with a lot of tool use.

seven_day

The 7-day weekly cap, applied across all models. Relevant on every paid tier. Pro users rarely see this; Max 20x users see it when they run agents continuously.

seven_day_opus

A separate weekly cap for Opus, enforced on Max plans that meter Opus differently from Sonnet. Lets Anthropic throttle the expensive model without blocking the cheap one.

seven_day_sonnet

The sibling weekly cap for Sonnet. Not everyone will ever see this emitted; it fires when a plan has differentiated per-family budgets.

overage

Extra usage. Your included budget is gone; you are now billing at API rates up to your monthly cap. This is the value that maps to 'extra usage limit' in Fazm.

The anchor fact: what the event actually looks like

Fazm's Node bridge declares a TypeScript interface for the forwarded event at acp-bridge/src/protocol.ts, lines 244 to 253. Every field corresponds to a key on the rate_limit_event the Anthropic SDK emits. The two overage-specific fields (overageStatus and overageDisabledReason) are what make extra usage programmatically visible.

acp-bridge/src/protocol.ts

The key is isUsingOverage. That is the single boolean that tells you whether the request you are about to make is going to bill against your Pro seat or against your extra usage invoice. It flips to true the moment your included budget is gone and Anthropic decides to keep serving at API rates. The client does not have to guess; the event says so.
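As a rough illustration, the shape described above could be sketched as follows. This is reconstructed only from the field names this article lists; the optionality of each field, the exact types, and the billingSource helper are assumptions, not a copy of what protocol.ts contains.

```typescript
// Sketch only: field names come from the article; optionality and exact
// types are assumptions, not the contents of acp-bridge/src/protocol.ts.
type RateLimitStatus = "allowed" | "allowed_warning" | "rejected";

type RateLimitType =
  | "five_hour"
  | "seven_day"
  | "seven_day_opus"
  | "seven_day_sonnet"
  | "overage";

interface RateLimitMessage {
  status: RateLimitStatus;
  rateLimitType?: RateLimitType;
  utilization?: number; // fraction of the window used, 0 to 1
  resetsAt?: number; // unix seconds
  overageStatus?: "allowed" | "rejected";
  overageDisabledReason?: string;
  isUsingOverage?: boolean;
}

// Hypothetical helper: which bucket the current traffic bills against.
// A missing flag is treated the same as false.
function billingSource(msg: RateLimitMessage): "subscription" | "extra_usage" {
  return msg.isUsingOverage === true ? "extra_usage" : "subscription";
}
```

Treating an absent isUsingOverage as "subscription" is a conservative choice for display purposes; a client that wants to be strict could surface "unknown" instead.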

Excerpt, ChatProvider.swift lines 537 to 546

static func rateLimitTypeLabel(_ type: String?) -> String {
    switch type {
    case "five_hour": return "session limit"
    case "seven_day": return "weekly limit"
    case "seven_day_opus": return "Opus weekly limit"
    case "seven_day_sonnet": return "Sonnet weekly limit"
    case "overage": return "extra usage limit"
    default: return "usage limit"
    }
}

Why your Claude Code terminal does not show this, and how Fazm does

The stock Agent Client Protocol agent that ships with claude-code eats rate_limit_event internally. It does not relay the event to whatever editor or desktop client is attached. That means if you are running claude in a terminal, you often find out you crossed into extra usage only when billing emails you later. Fazm patches the agent entrypoint to forward the event explicitly. The code lives at acp-bridge/src/patched-acp-entry.mjs lines 131 to 150.

acp-bridge/src/patched-acp-entry.mjs

The patch is 20 lines, including error handling. Without it, your desktop client has no idea the user is one prompt away from an extra usage charge, or that their extra usage was disabled because their card expired, or that they crossed into the overage branch at all. With the patch, every one of those states becomes a first-class event on the stream.
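The core of such a patch can be sketched in a few lines. This is not Fazm's code: the article only states that forwarded items have value.type equal to 'rate_limit_event' and carry an inner rate_limit_info block, so the snake_case keys inside rate_limit_info, the SessionItem shape, and the function names below are all assumptions made for illustration.

```typescript
// Assumed inner block; the key names are guesses, not the SDK's documented shape.
interface RateLimitInfo {
  status: string;
  rate_limit_type?: string;
  utilization?: number;
  resets_at?: number;
  overage_status?: string;
  overage_disabled_reason?: string;
  is_using_overage?: boolean;
}

// Assumed stream item: only value.type and value.rate_limit_info matter here.
interface SessionItem {
  value: { type: string; rate_limit_info?: RateLimitInfo };
}

interface ForwardedRateLimit {
  status: string;
  rateLimitType?: string;
  utilization?: number;
  resetsAt?: number;
  overageStatus?: string;
  overageDisabledReason?: string;
  isUsingOverage?: boolean;
}

// Map the inner block to the camelCase message the desktop side consumes.
function toForwarded(info: RateLimitInfo): ForwardedRateLimit {
  return {
    status: info.status,
    rateLimitType: info.rate_limit_type,
    utilization: info.utilization,
    resetsAt: info.resets_at,
    overageStatus: info.overage_status,
    overageDisabledReason: info.overage_disabled_reason,
    isUsingOverage: info.is_using_overage,
  };
}

// Call once per stream item. Returns true when an event was forwarded.
// Errors in the forwarder are swallowed so they cannot break the session.
function forwardIfRateLimit(
  item: SessionItem,
  forward: (msg: ForwardedRateLimit) => void,
): boolean {
  const info = item.value.rate_limit_info;
  if (item.value.type !== "rate_limit_event" || info === undefined) return false;
  try {
    forward(toForwarded(info));
    return true;
  } catch {
    return false;
  }
}
```

The handler is written per item rather than as a stream wrapper so it can be dropped into whatever loop already iterates the session updates.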

How the event gets from Anthropic to your floating control bar

Three hops, no dashboard poll. The SDK emits rate_limit_event inline on the response stream. The patched ACP entrypoint forwards it to the Node bridge. The bridge encodes it as a protocol message over stdout. Swift decodes it in ACPBridge.swift line 1108 and fires a StatusEvent the ChatProvider consumes.

The rate_limit_event pipeline

Claude API -> Anthropic SDK -> ACP agent -> patched-acp-entry.mjs -> protocol.ts -> ACPBridge.swift -> ChatProvider -> Floating bar
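The bridge-to-Swift hop can be pictured as one serialized message per line on stdout. The article does not specify the wire format, so the newline-delimited JSON framing, the "rate_limit" discriminator, and both function names below are assumptions; in Fazm the decoding side is Swift, and the TypeScript decoder here exists only to show the round trip.

```typescript
// Hypothetical wire message: the "type" discriminator is an assumption.
interface RateLimitWireMessage {
  type: "rate_limit";
  status: string;
  rateLimitType?: string;
  utilization?: number;
  resetsAt?: number;
}

// One JSON object per line, so the reader can split on "\n".
function encodeLine(msg: RateLimitWireMessage): string {
  return JSON.stringify(msg) + "\n";
}

// Returns null for blank lines, malformed JSON, or other message types.
function decodeLine(line: string): RateLimitWireMessage | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const obj = parsed as { type?: unknown; status?: unknown };
  if (obj.type !== "rate_limit" || typeof obj.status !== "string") return null;
  return parsed as RateLimitWireMessage;
}
```

Returning null instead of throwing keeps a single malformed line from taking down the reader loop.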

What actually happens the moment you cross into extra usage

1. You send a prompt inside your Pro or Max window

Every response stream carries a rate_limit_event with status = allowed and utilization rising toward 1.0. You do not see it because the ACP agent does not forward it.

2. Utilization crosses the warning threshold

Anthropic flips status to allowed_warning on the event. Fazm logs 'Rate limit warning — 85% of session limit used' and fires a PostHog rate_limit_event analytic so we can see the distribution. The user gets a soft nudge in the UI.

3. You hit the limit

status flips to rejected for the active rateLimitType (usually five_hour for Pro users). Anthropic checks whether extra usage is allowed on your account. If it is, the next event carries overageStatus = allowed and isUsingOverage = true.

4. Your request starts billing at API rates

You continue typing. The agent keeps serving. Every subsequent rate_limit_event carries isUsingOverage = true. You are accumulating metered spend against your monthly extra usage cap.

5. Extra usage gets disabled

Either you hit your monthly cap, or your card gets declined, or a Team admin flips it off. overageStatus flips to rejected and overageDisabledReason becomes a short machine-readable string ('monthly_cap_reached', 'no_payment_method', 'admin_disabled').

6. Fazm renders the state, not a refresh delay

ChatProvider.handleRateLimitEvent updates rateLimitStatus, rateLimitResetsAt, rateLimitType, rateLimitUtilization on @Published properties. SwiftUI re-renders the floating control bar with the correct message in the same frame as the event arrives.
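The progression in the steps above can be collapsed into a single classification per event. The following is a minimal sketch, not Fazm's logic: the Phase names and the phaseOf function are hypothetical, and the ordering of the checks is one reasonable reading of the steps.

```typescript
type Phase = "within_plan" | "warning" | "extra_usage" | "blocked";

// Only the fields needed to classify an event; names follow the article.
interface RateLimitEvent {
  status: "allowed" | "allowed_warning" | "rejected";
  rateLimitType?: string;
  overageStatus?: "allowed" | "rejected";
  isUsingOverage?: boolean;
}

function phaseOf(e: RateLimitEvent): Phase {
  // Step 4: overflow is active, so requests are being metered at API rates.
  if (e.isUsingOverage === true && e.overageStatus !== "rejected") {
    return "extra_usage";
  }
  // Steps 3 and 5: a rejection without usable overflow means the user is blocked.
  if (e.status === "rejected") return "blocked";
  // Step 2: still served, but close to the cap.
  if (e.status === "allowed_warning") return "warning";
  // Step 1: ordinary traffic inside the included budget.
  return "within_plan";
}
```

One consequence of this ordering: a rejected five_hour event that already carries isUsingOverage = true is reported as "extra_usage" rather than "blocked", which matches the step where the agent keeps serving.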

What the log looks like when a user crosses into extra usage

A condensed transcript from /tmp/fazm.log of a Pro user approaching, crossing, and exhausting their extra usage cap during one long session. The actual log line format matches ChatProvider.handleRateLimitEvent.

fazm.log

What the event stream carries vs. what a default client exposes

Feature | Default ACP / Claude Code CLI | Fazm (patched bridge)
rate_limit_event forwarded | No, dropped internally | Yes, forwarded as protocol RateLimitMessage
rateLimitType exposed | Not exposed | five_hour, seven_day, seven_day_opus, seven_day_sonnet, overage
overageStatus surfaced | Not surfaced | allowed or rejected, per event
overageDisabledReason surfaced | Not surfaced | Verbatim string: no_payment_method, monthly_cap_reached, etc.
isUsingOverage boolean | Not surfaced | Live flag on every event while in overage
Warning before rejection | No allowed_warning surfaced | Fires at utilization threshold, logs and renders
resetsAt timestamp | Not surfaced | Unix seconds, rendered as 'resets at 14:32' in UI

Request-level view of a Pro user entering, then exhausting, extra usage

Session, overage, cap reached

Participants: Mac UI, ACP bridge, Anthropic API

Messages, in order:

  1. prompt (in session window)
  2. messages.create
  3. rate_limit_event: five_hour, 68%
  4. text delta stream
  5. another prompt (near limit)
  6. rate_limit_event: allowed_warning 97%
  7. one more prompt (over)
  8. rate_limit_event: rejected five_hour
  9. isUsingOverage=true, overageStatus=allowed
  10. response continues, metered
  11. rate_limit_event: rejected overage
  12. overageDisabledReason=monthly_cap_reached

What a client can do once it actually has this signal

A dashboard pull tells you where you were ten minutes ago. The SDK event tells you where you are right now. Once you have the live signal, the product surface expands in obvious ways.

With live rate_limit_event

  • Warn the user at 85% utilization, before rejection
  • Show the exact reset time, not 'try again later'
  • Differentiate 'your Pro cap' from 'your extra usage cap'
  • Render 'add a card' if overageDisabledReason is no_payment_method
  • Offer an upgrade prompt if seven_day_opus is the blocker
  • Log utilization curves per session for later review

Without the event (default CLI behavior)

  • Silent failure when the window exhausts mid-tool-use
  • Generic 'rate limited, try again later' message
  • No way to tell session cap apart from weekly cap
  • No way to tell Pro cap apart from extra usage cap
  • User finds out about the overage via billing email
  • No programmatic hook to auto-pause the agent
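Two of the items above, showing the exact reset time and auto-pausing the agent, are small enough to sketch. Both helpers are hypothetical: the 0.95 pause threshold is an arbitrary example value, and the time is formatted in UTC here to keep the output deterministic, whereas a real UI would most likely render local time.

```typescript
// Only the fields these helpers need; names follow the article.
interface LimitSnapshot {
  status: string;
  utilization?: number; // 0 to 1
  resetsAt?: number; // unix seconds
}

// Turn unix seconds into a short "resets at HH:MM UTC" label.
function formatResetsAtUtc(resetsAt: number): string {
  const d = new Date(resetsAt * 1000);
  const hh = ("0" + d.getUTCHours()).slice(-2);
  const mm = ("0" + d.getUTCMinutes()).slice(-2);
  return "resets at " + hh + ":" + mm + " UTC";
}

// Pause on any rejection, or once utilization reaches the chosen threshold.
// The default threshold is an illustrative value, not one Fazm documents.
function shouldPauseAgent(s: LimitSnapshot, pauseAt: number = 0.95): boolean {
  if (s.status === "rejected") return true;
  return s.utilization !== undefined && s.utilization >= pauseAt;
}
```

Pausing slightly before the cap, rather than at rejection, gives a long tool-use loop a chance to stop at a clean boundary instead of failing mid-call.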

Edge cases the event lets you handle

Team seat with admin-disabled overage

Team admins can turn extra usage off for specific seats. The event surfaces overageDisabledReason = admin_disabled; the right UI here is a 'ask your admin' message, not 'add a card'.

Card on file but expired

Anthropic sets overageDisabledReason to no_payment_method even when a card exists but is unchargeable. The fix is a billing portal deep link, not a retry.

Monthly cap reached

overageDisabledReason = monthly_cap_reached. The remedy is raising the cap in settings, not waiting.

Pro plan, no overage enabled

First rejection on five_hour arrives with overageStatus = rejected and a reason like billing_paused. This is the enable-overage upsell moment.

Opus-only weekly hit

rateLimitType = seven_day_opus with seven_day still well under 1.0. The fix is 'switch to Sonnet for the rest of the week', not 'wait until tomorrow'.
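The edge cases above amount to a lookup from event fields to the right call to action. A minimal sketch, assuming the reason strings quoted in this article; the Remedy names and the remedyFor function are hypothetical, and the decision to check the Opus cap first is a design choice rather than documented behavior.

```typescript
type Remedy =
  | "ask_admin"
  | "open_billing_portal"
  | "raise_monthly_cap"
  | "offer_enable_extra_usage"
  | "switch_to_sonnet"
  | "wait_for_reset";

// Only the fields needed to pick a remedy; names follow the article.
interface BlockedEvent {
  rateLimitType?: string;
  overageDisabledReason?: string;
}

function remedyFor(e: BlockedEvent): Remedy {
  // An Opus-only weekly hit has a free fix, so it is checked first.
  if (e.rateLimitType === "seven_day_opus") return "switch_to_sonnet";
  switch (e.overageDisabledReason) {
    case "admin_disabled":
      return "ask_admin";
    case "no_payment_method":
      return "open_billing_portal";
    case "monthly_cap_reached":
      return "raise_monthly_cap";
    case "billing_paused":
      return "offer_enable_extra_usage";
    default:
      // Unknown or absent reason: the only safe advice is to wait.
      return "wait_for_reset";
  }
}
```

Unrecognized reason strings fall through to "wait_for_reset" so that a new value from Anthropic degrades to the generic message instead of a wrong prompt.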

Fields on the rate_limit_event Fazm forwards verbatim

Fields: status, rateLimitType, utilization, resetsAt, overageStatus, overageDisabledReason, isUsingOverage

rateLimitType values: five_hour, seven_day, seven_day_opus, seven_day_sonnet, overage

status values: allowed, allowed_warning, rejected

overageDisabledReason values: no_payment_method, monthly_cap_reached, admin_disabled, billing_paused

When you actually need to care about extra usage

If you chat with Claude on claude.ai for a few messages a day, you will never hit a limit, and extra usage is a line item you can ignore. The people who care are the ones running agents: Claude Code users doing long refactors, Cursor users with heavy tool-use loops, desktop clients like Fazm that leave the agent running in the background. For those workloads, a single long session can consume a full 5-hour budget in 30 minutes.

That is why extra usage exists. It is a release valve that keeps the work flowing instead of hard-stopping at the included cap. And that is also why the SDK-level event shape matters: the difference between a smooth overflow and a silent failure is whether your client parses rate_limit_event or not.

3 status values: allowed, allowed_warning, rejected
5 rate limit types Anthropic emits
2 overage-specific fields: overageStatus, overageDisabledReason
1 boolean that tells you if you are billing extra (isUsingOverage)

Want to see this live on your Mac?

Book a 15-minute walkthrough. We will run Fazm against your own Claude seat and watch the rate_limit_event pipeline surface in real time.

Frequently asked questions

What is Claude extra usage, in one sentence?

Extra usage is Anthropic's pay-as-you-go overflow for paid Claude plans. Once you hit your Pro, Max 5x, or Max 20x limit, instead of being blocked you keep going at standard API rates, up to a monthly cap you set. Anthropic's own help center calls it 'extra usage' in the UI, and at the SDK level the same state is surfaced on the rate_limit_event as rateLimitType: "overage".

What does the underlying rate_limit_event payload actually contain?

Six fields matter. Four are general: status (allowed, allowed_warning, rejected), rateLimitType (five_hour, seven_day, seven_day_opus, seven_day_sonnet, overage), utilization (0 to 1), and resetsAt (unix seconds). Two are overage-specific: overageStatus (allowed or rejected) and overageDisabledReason (a string explaining why extra usage was turned off, e.g. no card on file, hit the monthly cap, admin-disabled for a Team seat). Fazm's acp-bridge/src/protocol.ts declares this shape in the RateLimitMessage interface at lines 244 to 253.

Why does the default ACP agent drop rate_limit_event messages?

The stock Agent Client Protocol agent that ships with claude-code does not forward rate_limit_event to downstream consumers — it eats them internally. Fazm had to patch the entrypoint. The file patched-acp-entry.mjs, lines 131 to 150, wraps the session update stream and explicitly forwards any item whose value.type equals 'rate_limit_event', mapping the inner rate_limit_info block into the protocol's RateLimitMessage so Swift on the desktop side can render a warning or upgrade prompt.

What are the five rate_limit_type values in practice?

five_hour is your rolling 5-hour session window (this is what most Pro users hit first). seven_day is the weekly cap. seven_day_opus and seven_day_sonnet are separate weekly caps per model family, relevant on Max plans that have mixed model access. overage means your plan's included usage is gone and you are now billing against extra usage. Fazm's ChatProvider.swift at line 537 maps these to human labels: 'session limit', 'weekly limit', 'Opus weekly limit', 'Sonnet weekly limit', 'extra usage limit'.

What does overageDisabledReason look like when extra usage is off?

Anthropic emits a short string explaining why extra usage is not available. Typical values are 'no_payment_method', 'monthly_cap_reached', 'admin_disabled', 'billing_paused', and a generic fallback. Fazm stores this verbatim on the ChatProvider state so the UI can render actionable copy like 'Your extra usage cap of $X was reached; raise it in Anthropic billing' rather than the generic 'rate limited, try again later' fallback.

How is extra usage priced?

Anthropic bills extra usage at standard API rates for whichever model served the request. Sonnet and Opus have different per-token prices; the extra usage invoice itemizes by model. You set a monthly cap in settings.claude.com; hitting that cap flips overageStatus to rejected with overageDisabledReason set to monthly_cap_reached until the next billing cycle. Usage bundles (topped-up credits) are debited first when present, before metered extra usage kicks in.

When does extra usage actually trigger?

Only after your included plan usage is exhausted for the current window. While you are inside your Pro or Max session window, isUsingOverage on the event payload is false. The moment Anthropic flips you to the pay-as-you-go path, isUsingOverage becomes true on every subsequent rate_limit_event until the window resets. That is the cleanest programmatic signal: a boolean on the SDK stream, not a dashboard refresh.
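Detecting that moment is a comparison between consecutive events. A minimal sketch with hypothetical names; it assumes the client keeps the previous event around and treats a missing flag as false.

```typescript
// Only the flag matters for this check; the name follows the article.
interface OverageFlag {
  isUsingOverage?: boolean;
}

// True only on the transition into metered billing, not on every metered event.
function enteredExtraUsage(prev: OverageFlag | null, next: OverageFlag): boolean {
  const wasMetered = prev !== null && prev.isUsingOverage === true;
  return !wasMetered && next.isUsingOverage === true;
}
```

Firing on the edge rather than the level means a "you are now billing at API rates" notice appears once per crossing instead of on every response.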

Do I need extra usage if I use the Claude API directly?

No. Extra usage is a subscription concept. If you authenticate with an ANTHROPIC_API_KEY you are already billing at API rates for every call; there is no included quota to exhaust. Fazm's bridge explicitly supports both paths: personalOAuth mode (your Claude Pro or Max seat, with extra usage semantics) and bundledKey mode (raw API key, no overage concept). The mode is selected in ACPBridge.swift line 193 onward.

Can I watch my extra usage state in real time on my Mac?

Yes, if your client parses rate_limit_event. Fazm does: every event updates four @Published fields on ChatProvider — rateLimitStatus, rateLimitResetsAt, rateLimitType, and rateLimitUtilization — which SwiftUI reactively binds to. The floating control bar lights up with a warning when utilization crosses the warning threshold and switches to a reject state with resets-at time the moment Anthropic flips status to rejected. No billing page refresh needed.

What does 'allowed_warning' status mean?

It means you are inside the window but approaching the cap. Anthropic emits status = allowed_warning with a utilization value (e.g. 0.85 for 85% used) so clients can show a heads-up before users slam into a rejection mid-task. Fazm logs this as 'Rate limit warning — 85% of session limit used' and fires a PostHog rate_limit_event analytic so we can see the aggregate distribution.
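A sketch of how that log line could be assembled from the event, written in TypeScript for consistency with the bridge. The label mapping mirrors the Swift rateLimitTypeLabel function quoted earlier in this article; warningLine is a hypothetical helper, and rounding the percentage to a whole number is an assumption.

```typescript
// TypeScript mirror of the Swift label mapping shown in the article.
function rateLimitTypeLabel(type?: string): string {
  switch (type) {
    case "five_hour":
      return "session limit";
    case "seven_day":
      return "weekly limit";
    case "seven_day_opus":
      return "Opus weekly limit";
    case "seven_day_sonnet":
      return "Sonnet weekly limit";
    case "overage":
      return "extra usage limit";
    default:
      return "usage limit";
  }
}

// Builds the warning text in the format the article quotes from Fazm's log.
// utilization is the 0 to 1 fraction carried on the event.
function warningLine(utilization: number, type?: string): string {
  const pct = Math.round(utilization * 100);
  return "Rate limit warning — " + pct + "% of " + rateLimitTypeLabel(type) + " used";
}
```

Calling warningLine(0.85, "five_hour") yields the exact line quoted above; an unknown or missing type falls back to the generic "usage limit" label.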