Claude extra usage cost: the real rates, and the one desktop-agent path that skips them
Extra usage on paid Claude plans is billed at the standard API rate card: $0.80 to $75 per million tokens depending on model. Third-party tools like Cursor, Windsurf, and Claude Code draw exclusively from this pool, which is why a $50 day on Claude Code is common even on a Pro plan. The SERP explains the pricing. Nobody explains the alternative. If a Mac desktop agent talks to Claude through your OAuth session instead of an API key, every request counts against your subscription allowance, same as chatting on claude.ai. Fazm's ACP bridge is wired exactly this way. The Swift line that makes it happen is pinned below.
“When Fazm's ACP bridge is in personal-OAuth mode, it literally calls `env.removeValue(forKey: "ANTHROPIC_API_KEY")` before spawning the Node subprocess. With no API key in scope, Claude calls fall back to the OAuth session, which is billed against your Pro or Max allowance, not the per-token rate card.”
ACPBridge.swift, line 350
The anchor fact: how Fazm avoids Claude extra-usage billing in code
A desktop agent that calls Claude either authenticates with an API key (billed per token, same rate card as extra usage) or with an OAuth session (billed against your subscription allowance). The difference between those two paths is a single environment variable. Here is the exact line of Swift that controls which one Fazm uses.
Two cases. One billed through the API (extra usage), one billed through OAuth (subscription). Now the code that actually swaps which one is used when the bridge starts its Node subprocess.
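A minimal sketch of that switch, reconstructed from the identifiers quoted in this article: the two `BridgeMode` cases, the `ANTHROPIC_API_KEY` variable, and the two statements cited at lines 350 and 352. The function name `bridgeEnvironment(for:base:)` and the surrounding structure are assumptions for illustration, not Fazm's actual source.

```swift
import Foundation

// Case names are quoted from the article; everything else is illustrative.
enum BridgeMode {
    case bundledKey(apiKey: String) // billed per token through the API
    case personalOAuth              // billed against the subscription allowance
}

/// Builds the environment handed to the Node child process.
/// `bridgeEnvironment` is a hypothetical name, not a Fazm function.
func bridgeEnvironment(
    for mode: BridgeMode,
    base: [String: String] = ProcessInfo.processInfo.environment
) -> [String: String] {
    var env = base
    switch mode {
    case .personalOAuth:
        // No key in scope, so the client has to fall back to the OAuth session.
        env.removeValue(forKey: "ANTHROPIC_API_KEY")
    case .bundledKey(let apiKey):
        // Trial mode: every request is authenticated with the bundled key.
        env["ANTHROPIC_API_KEY"] = apiKey
    }
    return env
}
```

In a real launcher the returned dictionary would be assigned to `Process.environment` before calling `run()`, so the child never inherits a key the parent happened to have set.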
That is the whole trick. In `personalOAuth` mode the environment variable is stripped before the Node child process is spawned, so the Claude client library inside the bridge cannot find an API key and falls back to the OAuth token cache left behind by the user's earlier browser sign-in. Every request is then attributed to the user's Anthropic account and deducted from the monthly Pro or Max allowance. No per-token extra-usage charge accrues.
And the reason any of this matters: Fazm does give new users a $10 bundled-API-key trial, tracked against `builtinCostCapUsd`. The moment that cap is hit, `switchBridgeMode(to: "personal")` fires and the OAuth path takes over. Your cost stops being per-token and starts being zero-marginal against a plan you were already paying for.
The two billing paths, side by side
Same Claude model. Same prompts. Same output quality. Completely different bill at the end of the month. The difference is how the request authenticates.
API key (extra usage) vs OAuth (subscription)
Any tool that calls Claude with an Anthropic API key is billed per token at the standard rate card. This includes Cursor, Windsurf, Claude Code, most self-built agents, and Fazm's bundled-key trial mode. Tokens map to dollars. $10 gets you roughly 33 to 200 Opus messages, 200 to 1,000 Sonnet messages, or 1,000 to 5,000 Haiku messages. The daily redemption cap is $2,000.
- Billed at $0.80 to $75 per million tokens
- Does not touch your Pro or Max allowance
- Same rate card as raw API usage
- What Cursor and Claude Code always use
How Fazm's ACP bridge picks the billing path
`ChatProvider.createBridge()` inspects the stored `bridgeMode`, the presence of a bundled key, and the cumulative cost to decide which Claude authentication path the Node subprocess gets. Four inputs, one decision, two possible outputs. Only one of them shows up on your extra-usage invoice.
Bridge mode selection (ChatProvider.swift lines 491-506)
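The decision described above might look roughly like the following. This is a sketch under stated assumptions: the four inputs are taken to be the stored mode string, the optional bundled key, the cumulative cost, and the cost cap, and `selectBridgeMode` is a hypothetical name. The real `ChatProvider.createBridge()` may weigh these differently.

```swift
enum BridgeMode: Equatable {
    case bundledKey(apiKey: String)
    case personalOAuth
}

/// Hypothetical helper: picks the authentication path for the Node subprocess.
func selectBridgeMode(
    storedMode: String,
    bundledKey: String?,
    cumulativeCostUsd: Double,
    costCapUsd: Double = 10.0
) -> BridgeMode {
    // Stay on the trial key only while the user is still in "builtin" mode,
    // a non-empty key is bundled, and spend is below the cap.
    if storedMode == "builtin",
       let key = bundledKey, !key.isEmpty,
       cumulativeCostUsd < costCapUsd {
        return .bundledKey(apiKey: key)
    }
    // Every other combination lands on the user's own OAuth session.
    return .personalOAuth
}
```

Defaulting to `.personalOAuth` means a missing key or an unexpected stored value can never produce per-token billing by accident.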
The six ways Claude extra usage actually activates
A short, factual recap of how extra-usage billing works on paid Claude plans in April 2026. This is the stuff the SERP covers well, included here so the later sections make sense.
Monthly allowance ran out
Pro, Max 5x, and Max 20x plans include a usage allowance that refreshes monthly. When you go past it on Claude.ai directly, you can choose to keep working by opting into extra usage. That usage is billed at standard API rates, and the credit balance does not reset monthly.
Third-party Claude integrations
Cursor, Windsurf, Claude Code, and most API-key-based tools draw exclusively from extra usage credits, even if you never touched your Pro allowance that month.
Daily redemption cap
There is a $2,000 daily ceiling on how much extra usage can be redeemed from a single account. Past that, requests get rejected until the next day.
Credits do not expire
Once loaded, extra-usage credits stay in your account until spent. No monthly reset, no rollover penalty. They just sit there.
Standard API rate card applies
Extra usage is priced at the same per-million-token rates as raw API access. Haiku 4.5 is cheap. Opus 4.6 is expensive. The rate card is the rate card.
Separate from your plan price
Extra usage is a dollar balance that sits on top of Pro, Max 5x, or Max 20x. Running a large extra-usage bill does not change your subscription tier.
Same workload, two billing paths, real numbers
Take a full workday of Sonnet 4.6 usage: roughly 2 million input tokens and 400k output tokens across tool-heavy agent sessions. Here is what that looks like depending on whether the tool authenticates with an API key (extra usage) or an OAuth session (subscription).
| Feature | API-key path (Cursor, Claude Code, bundled-key trials) | OAuth path (Fazm personalOAuth, claude.ai) |
|---|---|---|
| Monthly billing model | Per-token rate card on top of plan | Flat plan price (Pro $20 / Max 5x $100 / Max 20x $200) |
| Cost of 2M in + 400k out on Sonnet 4.6 | 2M * $3 + 0.4M * $15 = $12.00 in extra usage | $0 marginal, counted in subscription allowance |
| Cost of 500k in + 100k out on Opus 4.6 | 0.5M * $15 + 0.1M * $75 = $15.00 in extra usage | $0 marginal, counted in subscription allowance |
| Behavior when allowance exhausted | Keeps billing per token up to $2,000 / day | Hard stop, rate-limit message, resets on schedule |
| Typical heavy-user month on Max 20x | $200 plan + $100 to $500 in extra usage credits | $200 flat, no surprises |
| What Fazm uses after the $10 trial | Not applicable (trial ends at the $10 cap) | personalOAuth, every request |
| What Cursor / Windsurf / Claude Code use | API key, always, draws from extra usage | Not supported |
| Verifiable in Fazm source at | ACPBridge.swift line 352 (env["ANTHROPIC_API_KEY"] = apiKey) | ACPBridge.swift line 350 (env.removeValue) |
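The two per-token figures in the table follow directly from the rate card quoted in this article. A small sketch of that arithmetic, where `ModelRate` and `apiCostUsd` are illustrative names and the rates are the article's own numbers:

```swift
/// Price per million tokens, in US dollars.
struct ModelRate {
    let inputPerMTok: Double
    let outputPerMTok: Double
}

// Rates as quoted in this article.
let sonnet = ModelRate(inputPerMTok: 3.0, outputPerMTok: 15.0)
let opus = ModelRate(inputPerMTok: 15.0, outputPerMTok: 75.0)

/// Dollar cost of one workload on the API-key path.
func apiCostUsd(inputTokens: Int, outputTokens: Int, rate: ModelRate) -> Double {
    (Double(inputTokens) * rate.inputPerMTok
        + Double(outputTokens) * rate.outputPerMTok) / 1_000_000
}

print(apiCostUsd(inputTokens: 2_000_000, outputTokens: 400_000, rate: sonnet)) // 12.0
print(apiCostUsd(inputTokens: 500_000, outputTokens: 100_000, rate: opus))     // 15.0
```

On the OAuth path the same function is never consulted: the tokens are counted against the allowance rather than priced.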
The six-step flow from $10 trial to subscription-allowance billing
What actually happens inside Fazm when a new user exhausts the built-in trial. Every step routes through a specific Swift property or function, pinned to a file and line number.
1. New user opens Fazm the first time
The `bridgeMode` AppStorage value defaults to `"builtin"` (ChatProvider.swift line 439). Fazm spawns the Node ACP bridge with `ANTHROPIC_API_KEY` set from the bundled key (ACPBridge.swift line 352). Every request hits Anthropic's API. Cost accumulates locally in `builtinCumulativeCostUsd`.
2. The $10 cap is checked after every response
ChatProvider.swift lines 2812-2819 add `queryResult.costUsd` to the running total and compare it to `builtinCostCapUsd` (which is exactly `10.0`). Anything up to the cap is on the house.
3. Cap is reached, mode flips to personal OAuth
Line 2818 calls `await switchBridgeMode(to: "personal")`. The Node subprocess is terminated and a new one is launched in `BridgeMode.personalOAuth` (ACPBridge.swift line 195). The fresh process has no `ANTHROPIC_API_KEY` in its environment.
4. User connects their Claude account via OAuth
`ClaudeAuthSheet` (`Desktop/Sources/Chat/ClaudeAuthSheet.swift`) opens. The user signs in at claude.ai in their browser. Tokens land in the ACP bridge's cache. `isClaudeConnected` flips to true.
5. Subsequent requests go through the subscription allowance
From this point on, every Claude request is authenticated by the user's OAuth tokens, not an API key. Anthropic attributes the tokens to the user's Pro or Max plan and decrements the included allowance. No per-token extra-usage charges accrue.
6. If the subscription allowance is exhausted, Fazm reports it
The ACP bridge emits a `rateLimit` event (line 183) with `resetsAt` and `rateLimitType`. ChatProvider surfaces a user-facing message (ACPBridge.swift line 1603: `You've hit Claude's usage limit (\(resets)). Upgrade to Claude Pro at claude.ai for higher limits.`). The app does not silently start racking up extra-usage charges.
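Steps 1 to 3 above can be sketched as a small ledger that accumulates cost and flips the mode once the cap is reached. The property names `builtinCostCapUsd` and `builtinCumulativeCostUsd` and the mode strings come from this article; the `TrialLedger` type and its `record(costUsd:)` method are assumptions for illustration and stand in for the real `ChatProvider` logic, which also restarts the subprocess.

```swift
/// Hypothetical stand-in for the trial accounting described in steps 1 to 3.
struct TrialLedger {
    static let builtinCostCapUsd: Double = 10.0

    private(set) var builtinCumulativeCostUsd: Double = 0
    private(set) var bridgeMode = "builtin"

    /// Records the cost of one response.
    /// Returns true only on the call that flips the mode to "personal".
    @discardableResult
    mutating func record(costUsd: Double) -> Bool {
        // Once on the OAuth path, per-token cost is no longer tracked.
        guard bridgeMode == "builtin" else { return false }
        builtinCumulativeCostUsd += costUsd
        if builtinCumulativeCostUsd >= Self.builtinCostCapUsd {
            bridgeMode = "personal"
            return true
        }
        return false
    }
}

var ledger = TrialLedger()
ledger.record(costUsd: 4.0)                  // still "builtin"
let flipped = ledger.record(costUsd: 6.5)    // crosses the $10 cap
print(flipped, ledger.bridgeMode)            // true personal
```

The check runs after the cost is added, so the response that crosses the cap is still served on the bundled key and the switch applies to the next request.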
The runtime wiring, by the numbers
Four line numbers, one constant, two bridge modes. That is the complete wiring that moves a user off extra-usage billing and onto their own Claude subscription.
Verify the wiring with three greps
If you distrust any of the line-number claims above, run these against a local Fazm desktop source tree. Each line of output matches a line in the source.
Three consequences of the OAuth path for a heavy Claude user
Not abstract. These are the three things that change for a user who moves from API-key-based Claude usage to OAuth-based Claude usage for their Mac-side automation work.
Bill becomes predictable
Max 20x is $200 flat per month, no matter how many agent sessions you run. With API keys the same workload can swing from $50 to $500 a month depending on how tool-heavy the sessions get.
Rate limits replace runaway billing
On OAuth, hitting your allowance fails fast with a clear reset timestamp. On API keys, the same usage silently keeps billing up to the $2,000 daily cap, often past the point a user would have stopped.
One account, one allowance, many apps
Your Claude.ai chat, Fazm's desktop automation, and any future first-party Anthropic surface all share the same subscription allowance. Extra usage is confined to the specific integrations that cannot OAuth.
Claude extra usage, as the rate card shows it
Public Anthropic pricing for April 2026. The OAuth path avoids every dollar amount on the left side of this strip.
Run your Mac agent on your existing Claude plan
Fazm ships with a $10 bundled-API-key trial. The moment it ends, the ACP bridge auto-switches to your Claude Pro or Max OAuth session. Every Mac-automation request then counts against the allowance you already pay for, the same as chatting on claude.ai. No per-token extra-usage billing, no surprise monthly invoices, no separate credit balance to top up.
Download Fazm →
Frequently asked questions
What is Claude extra usage and when does it cost extra?
Extra usage is Anthropic's name for pay-as-you-go token billing on paid Claude plans. It kicks in in either of two situations. One, you burn through the included monthly allowance on your Pro, Max 5x, or Max 20x plan. Two, you use a third-party integration such as Cursor, Windsurf, or Claude Code through an API key, which bypasses the allowance entirely and draws only from extra usage credits. In both cases the tokens are charged at standard API rates, not a flat subscription rate.
What are the exact per-token rates for Claude extra usage in April 2026?
The public rate card is: Haiku 4.5 at $0.80 per million input tokens and $4.00 per million output tokens. Sonnet 4.6 at $3.00 per million input tokens and $15.00 per million output tokens. Opus 4.6 at $15.00 per million input tokens and $75.00 per million output tokens. A $10 credit lasts roughly 1,000 to 5,000 Haiku messages, 200 to 1,000 Sonnet messages, or 33 to 200 Opus messages, depending on context size. Extra usage credits do not expire once loaded. There is a daily redemption cap of $2,000.
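The "messages per $10" ranges depend entirely on how many tokens a message carries. A rough sketch of that calculation, using the rates quoted above; the per-message token counts (15,000 in, 1,000 out) are an assumption chosen for illustration, and `messagesPerBudget` is a hypothetical helper:

```swift
/// Price per million tokens, in US dollars.
struct ModelRate {
    let inputPerMTok: Double
    let outputPerMTok: Double
}

// Rates as quoted in this article.
let rateCard: [String: ModelRate] = [
    "Haiku 4.5": ModelRate(inputPerMTok: 0.80, outputPerMTok: 4.00),
    "Sonnet 4.6": ModelRate(inputPerMTok: 3.00, outputPerMTok: 15.00),
    "Opus 4.6": ModelRate(inputPerMTok: 15.00, outputPerMTok: 75.00),
]

/// Whole messages a budget covers, given an assumed token size per message.
func messagesPerBudget(
    budgetUsd: Double, inputTokens: Int, outputTokens: Int, rate: ModelRate
) -> Int {
    let costPerMessage = (Double(inputTokens) * rate.inputPerMTok
        + Double(outputTokens) * rate.outputPerMTok) / 1_000_000
    guard costPerMessage > 0 else { return 0 }
    return Int(budgetUsd / costPerMessage) // truncates to whole messages
}

if let opus = rateCard["Opus 4.6"] {
    // A context-heavy message: about $0.30 each on Opus.
    print(messagesPerBudget(budgetUsd: 10, inputTokens: 15_000, outputTokens: 1_000, rate: opus)) // 33
}
```

Smaller contexts push the count toward the upper end of each range, which is why the figures are quoted as spans.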
Why do Cursor, Windsurf, and Claude Code drain extra usage credits but the Claude.ai chat app does not?
The Claude.ai web app, the Claude iOS app, and the Claude macOS app are all first-party Anthropic surfaces authenticated by your OAuth login. They consume your subscription's usage allowance. Third-party tools that call the Anthropic API with an API key, or that negotiate access through a partner integration, do not touch the subscription allowance. They bill against extra usage credits instead. This is why a day of heavy Claude Code can cost $50 in credits even if you never touch your Pro plan's chat allowance.
How does Fazm use Claude without triggering extra usage billing?
Fazm ships two bridge modes: a bundled-API-key mode used for new users during a free $10 trial, and a personal-OAuth mode used for everyone else. In personal-OAuth mode the Swift code at `Desktop/Sources/Chat/ACPBridge.swift` line 350 runs `env.removeValue(forKey: "ANTHROPIC_API_KEY")` before spawning the Node subprocess, which forces the bridge to talk to Claude through your own OAuth session instead of the API. Every request then counts against your Pro or Max subscription allowance the same way as chatting with Claude.ai directly.
Where is the $10 free trial cap set in the Fazm source?
At `Desktop/Sources/Providers/ChatProvider.swift` line 476: `static let builtinCostCapUsd: Double = 10.0`. The cumulative cost is tracked in `builtinCumulativeCostUsd` (line 479). After every Claude response, line 2813 accumulates the cost and lines 2814-2818 auto-switch the bridge to `personal` OAuth mode the moment the cap is hit. The switch is a real subprocess restart with `ANTHROPIC_API_KEY` stripped out.
Does connecting my Claude account to Fazm count against my extra usage?
No. Personal-OAuth mode is the same authentication path the Claude.ai web app uses. When you sign in through the Fazm ACP bridge, Anthropic routes the requests through your subscription allowance. Extra usage only activates if that allowance runs out, same as on Claude.ai. If you hit the allowance, Fazm reports the rate limit in the ACP bridge and shows `You've hit Claude's usage limit` (ACPBridge.swift line 1603). You do not silently accrue extra-usage charges.
What is the difference between API-key billing and OAuth billing on Claude?
API-key billing is pure pay-as-you-go at the standard rate card. Every input and output token maps to a dollar amount that you pay on top of any subscription. OAuth billing is the subscription-allowance path: the request is attributed to your user account and decremented from the Pro or Max monthly allowance, under the same usage limits that apply on Claude.ai. Fazm's BridgeMode enum (ACPBridge.swift lines 194-204) makes this distinction explicit at the code level. `bundledKey(apiKey:)` goes to the API, `personalOAuth` goes to the subscription.
If Fazm strips the API key, how does Claude know who I am?
Through the same OAuth flow Claude.ai uses. When you connect, the Fazm `ClaudeAuthSheet` (`Desktop/Sources/Chat/ClaudeAuthSheet.swift`) opens your browser, signs you in at claude.ai, and the Node ACP bridge caches the resulting tokens. Every subsequent Claude request from Fazm is sent with your OAuth credentials, not an API key. That is what makes the usage count against your subscription allowance and not the extra-usage pool.
What happens if I am already paying for Claude Code extra usage credits?
Those credits sit in your Anthropic account until they are spent. They do not transfer to anything. If you move your Mac desktop automation workload from Claude Code to Fazm's personal-OAuth mode, your Claude Code usage stops drawing them down. If you keep using both, Claude Code will keep drawing from extra usage credits because it is a separate integration with its own billing path. Fazm usage on your OAuth session is billed through the subscription. Two separate buckets.
Can Fazm track how much extra usage or subscription allowance a session used?
Yes. Every Claude response returns a `costUsd` value and input and output token counts, captured in the ACP bridge's `QueryResult` struct (ACPBridge.swift line 100). `ChatProvider` accumulates these per session (line 2809: `sessionTokensUsed += queryResult.inputTokens + queryResult.outputTokens`). In bundled-key mode the running total is compared to the $10 cap. In personal-OAuth mode Claude itself governs the allowance and Fazm surfaces rate-limit messages when it is hit.
Are there rate limits on the OAuth path even if I am not paying extra usage?
Yes, the same rate limits that apply to the Claude.ai web app apply to OAuth requests from third-party clients like Fazm. The ACP bridge processes a `rateLimit` status event (ACPBridge.swift line 183) with a `resetsAt` timestamp, a `rateLimitType`, and an `overageStatus`. Line 871 forwards it to the UI. If the model returns a rejected rate limit, Fazm logs it and lets the session continue so in-flight tool calls are not lost, then surfaces a clear message about when the limit resets.
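How such an event could turn into the user-facing message is sketched below. Only the three field names and the message template are taken from this article; the `RateLimitEvent` struct, the field types, the ISO 8601 formatting of the reset time, and the sample `rateLimitType` value are assumptions, not the bridge's actual wire format.

```swift
import Foundation

/// Hypothetical shape of the `rateLimit` event; types are assumed.
struct RateLimitEvent {
    let resetsAt: Date?
    let rateLimitType: String
    let overageStatus: String?
}

/// Fills the message template quoted in this article.
func usageLimitMessage(for event: RateLimitEvent) -> String {
    let resets: String
    if let date = event.resetsAt {
        // ISO 8601 in UTC keeps the sketch independent of the local time zone.
        resets = "resets " + ISO8601DateFormatter().string(from: date)
    } else {
        resets = "reset time unknown"
    }
    return "You've hit Claude's usage limit (\(resets)). Upgrade to Claude Pro at claude.ai for higher limits."
}

let event = RateLimitEvent(
    resetsAt: Date(timeIntervalSince1970: 0),
    rateLimitType: "example", // placeholder, not a documented value
    overageStatus: nil
)
print(usageLimitMessage(for: event))
```

A production UI would more likely format the reset time in the user's locale; the fixed format here is only to keep the example deterministic.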
What is the cheapest real-world setup if I use Claude every day across chat, coding, and Mac automation?
A single Claude Max 20x subscription, used through first-party Anthropic surfaces for chat and through Fazm's personal-OAuth mode for Mac-side automation. Add a small extra-usage balance only if you also rely on Cursor or Claude Code, which are locked to the extra-usage path regardless of subscription. This keeps the bulk of your usage inside the flat monthly rate and confines pay-per-token billing to the specific integrations that cannot avoid it.
What every Claude extra-usage article stops short of saying
Extra usage is not a bug. It is the pricing model for API access, and it is exactly what a tool like Cursor or Claude Code requires to talk to Claude at all. What the SERP misses is that the same model (Sonnet 4.6, Opus 4.6, Haiku 4.5) can be reached through a different authentication path, one that is already included in your subscription, if the tool on your desktop is built to use it.
Fazm is that tool. `ACPBridge.swift` line 350 is one line of Swift that, by removing a single environment variable before spawning a subprocess, reroutes every Claude call from the API-key rate card onto the user's Pro or Max allowance. `ChatProvider.swift` line 476 defines the $10 cap where this rerouting fires automatically. Both are grep-able in the Fazm desktop source, and the result is visible in the behavior of the app you install.
If you care about the monthly bill, the architecture matters more than the rate card. The rate card tells you what extra usage costs. The architecture tells you whether you ever have to pay it.