Billing & Tokens

Claude Code Extra Usage, Explained by a Mac App That Rides the Same Token

Extra usage is what Claude charges you after your plan limits are hit. It is shared across Claude.ai, Claude Code, and any third-party app that logs in with your Claude account. Fazm logs in with the exact same OAuth client ID as Claude Code, so every query it runs bills to the same balance. This page explains how that works, what drains the pool fastest, and the exact file paths you can check to verify it.

Fazm

Published April 16, 2026 · 9 min read



Pro, Max 5x, Max 20x all have extra usage

Shared balance across Claude.ai + Claude Code

Third-party apps can ride the same token

One Balance. Three Consumers.

How Claude Code extra usage really flows

Anthropic bills Pro, Max 5x, Max 20x at API rates once limits are hit.

Claude.ai conversations and Claude Code terminal share the same pool.

Third-party apps authenticate via OAuth, scope user:inference.

Fazm reuses the Claude Code client ID and Keychain record.

Every Fazm query draws from your extra usage balance.


What Extra Usage Actually Is

Extra usage is Anthropic's term for the pay-as-you-go budget that kicks in after a paid subscriber hits their included session limits. It is available to Claude Pro, Claude Max 5x, and Claude Max 20x subscribers. Instead of being rate-limited and blocked for the rest of the session window, you keep working, and the overflow is charged at standard API rates.

The important thing most guides miss: extra usage is a single balance, not three separate ones. Usage from Claude.ai, Claude Code, Claude Desktop, Claude Cowork, and any third-party app that authenticates with your Claude account all pulls from the same pool. Pre-purchased bundles (Anthropic sells discounted bundles up to 30% below standard rates) roll into the same balance.

Since early 2026 Anthropic also routes third-party app billing straight to extra usage. So any time you connect a non-Anthropic tool to your Claude account, it bills there first, not against your plan's included allowance.

One extra usage balance, many consumers

The Anchor Fact: Fazm Logs In as claude-code

When Fazm asks you to "Login with Claude," it is not running a generic OAuth flow. It runs the exact flow that the Claude Code CLI runs. Same OAuth client ID, same authorize URL, same scope, same Keychain service name. After you approve, the access token ends up in the same place the CLI would store it.

You can read the constants yourself in the open source repo at github.com/mediar-ai/fazm, file acp-bridge/src/oauth-flow.ts, lines 25 through 31.

acp-bridge/src/oauth-flow.ts

Three details matter here. The CLIENT_ID is Anthropic's registered Claude Code client, not a Fazm app ID. The SUCCESS_URL ends with ?app=claude-code, which is the marker Anthropic's OAuth consent screen uses to name the integration. And the KEYCHAIN_SERVICE is literally the string Claude Code-credentials, which is where the claude CLI also stores its tokens on macOS.
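To make those details concrete, here is a minimal sketch that gathers the values quoted on this page into one object. It is not a copy of oauth-flow.ts: the interface name, the field names, the `https://` prefix on the authorize URL, and the `isClaudeCodeSuccessUrl` helper are illustrative assumptions, and only the suffix of the success URL is modeled because the full URL is not quoted here.

```typescript
// Sketch only: the values are the ones quoted in this article,
// while the shape and names are hypothetical.
interface ClaudeOAuthConfig {
  clientId: string;
  authorizeUrl: string;
  scope: string;
  successUrlSuffix: string;
  keychainService: string;
}

const CLAUDE_CODE_OAUTH: ClaudeOAuthConfig = {
  clientId: "9d1c250a-e61b-44d9-88ed-5944d1962f5e", // Anthropic's Claude Code client
  authorizeUrl: "https://claude.ai/oauth/authorize", // scheme is assumed
  scope: "user:inference",
  successUrlSuffix: "?app=claude-code",
  keychainService: "Claude Code-credentials",
};

// Hypothetical helper: true when a URL ends with the claude-code marker.
function isClaudeCodeSuccessUrl(url: string): boolean {
  const suffix = CLAUDE_CODE_OAUTH.successUrlSuffix;
  return url.slice(-suffix.length) === suffix;
}
```

Any app whose constants match these values is, from the account's point of view, indistinguishable from the CLI at login time, which is the whole point of the section above.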

Verify It Yourself

Open Terminal and run `security find-generic-password -s "Claude Code-credentials"`. If you have logged into either Claude Code or Fazm, you will see a single Keychain record under that service name. Logging out of Fazm deletes that same record (source: Desktop/Sources/AppState.swift:797).

Verifying the shared Keychain record

What Happens When a Fazm Query Hits Your Extra Usage

This is the flow of tokens and cost data for a single Fazm query when you are signed in with your Claude account. The same sequence (minus the accessibility tree step) describes any Claude Code CLI command that hits extra usage.

Fazm query to Anthropic, billed to extra usage

The total_cost_usd field in the response is Anthropic's own accounting of what the turn cost against your plan and extra usage balance. Fazm subtracts the previous session cost to compute the delta for just this turn. That makes it possible to see per-step spend in real time.

acp-bridge/src/patched-acp-entry.mjs
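A minimal sketch of the delta computation described above, not the file's actual contents. It assumes total_cost_usd is cumulative for the session; the `SessionResult` and `TurnCostTracker` names are hypothetical.

```typescript
// Assumption: each result carries the cumulative session cost so far.
interface SessionResult {
  total_cost_usd: number;
}

class TurnCostTracker {
  private sessionCostUsd = 0;

  // Returns the cost of this turn only and stores the new cumulative total.
  record(result: SessionResult): number {
    const prevSessionCost = this.sessionCostUsd;
    this.sessionCostUsd = result.total_cost_usd;
    return result.total_cost_usd - prevSessionCost;
  }
}

const tracker = new TurnCostTracker();
const firstTurn = tracker.record({ total_cost_usd: 0.25 });  // 0.25
const secondTurn = tracker.record({ total_cost_usd: 0.75 }); // 0.5
```

Keeping only the previous cumulative value is enough: the per-turn figure falls out of one subtraction, so no per-message token accounting is needed on the client.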

What Actually Drains Your Extra Usage

Extra usage spend is asymmetric. A plain chat turn costs a few cents. A vision-heavy automation turn can cost 20 to 50 times more for the same amount of useful output. Here is the ranking, with the drivers that matter most first.

Screenshot-heavy automation agents

Full-resolution screen captures land as image tokens. A single 1440x900 frame runs 1,700 to 3,000 tokens before any text arrives. At 20 turns that is 34,000 to 60,000 image tokens on visual signal alone. Multiply by sessions and it shows up in extra usage fast.

Long-running sessions without resets

Each new turn replays the full session history. Letting a session grow past 30 to 50 turns means every subsequent message pays for all prior context. Cached reads are billed at a fraction of the normal input rate and cover most of that replay, but every cache miss bills the full history at the standard rate.

Large file reads into context

A 10,000 line source file is 60,000 to 80,000 tokens. If the agent reads several of them per turn, input token counts balloon and every downstream turn pays the same bill.

Opus 4.x on exploratory tasks

Opus 4 and 4.1 cost roughly 5x Sonnet per token; Opus 4.5 narrowed that gap to under 2x but is still the more expensive tier. Opus is the right choice for hard reasoning. It is the wrong choice for repetitive mechanical actions, which is most of desktop automation.

Accessibility-tree automation (Fazm)

A structured tree of every visible UI element runs under 200 tokens per action for most apps. Combined with a 20-turn image cap (MAX_IMAGE_TURNS = 20), this keeps per-task spend roughly 10x lower than screenshot agents on the same work.
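The session-length driver in the ranking above is easy to underestimate, so here is a rough sketch of the arithmetic. It assumes every turn adds the same number of tokens and ignores prompt caching entirely, so it is an upper bound rather than a bill; `replayedInputTokens` and the 2,000-token turn size are illustrative choices.

```typescript
// Assumption: every turn adds tokensPerTurn tokens and nothing is cached.
function replayedInputTokens(turns: number, tokensPerTurn: number): number {
  // Turn k resends k turns of context, so the total is t * (1 + 2 + ... + n).
  return (tokensPerTurn * turns * (turns + 1)) / 2;
}

// One 50-turn session versus the same 50 turns split into five fresh sessions.
const oneLongSession = replayedInputTokens(50, 2000);        // 2,550,000
const fiveShortSessions = 5 * replayedInputTokens(10, 2000); // 550,000
```

Under these assumptions the single long session sends more than four times the input tokens for the same number of turns, which is why resetting between unrelated tasks matters.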

The Numbers That Matter

These are concrete limits and constants from Fazm's source code, plus Anthropic's published pricing. Each one translates directly into how fast your extra usage balance moves.

20 MAX_IMAGE_TURNS per session in Fazm

1 year OAuth token lifetime (31,536,000s)

Under 200 tokens for a typical accessibility tree turn

Up to 30% discount on prepaid usage bundles

Why the Input Medium Decides the Bill

Two desktop automation agents can perform the same 20-step workflow and produce the same final result. One spends 10 times more extra usage than the other, because of how it sends the UI state to Claude.

Screenshot agent vs accessibility-tree agent

Sends a full image of the current screen on every turn. Claude processes it as image tokens. The model has to infer element roles and coordinates from pixels. Costs scale with screen resolution and session length.

1,700 to 3,000 image tokens per turn
OCR cost baked in every message
Pixel hallucinations require retries
10x to 30x total cost on long workflows
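The gap can be approximated with a few lines of arithmetic. This sketch uses the estimate from Anthropic's vision documentation (image tokens ≈ width × height / 750) and this page's 200-token upper bound for an accessibility tree turn. It ignores image downscaling, retries, and replayed history, so treat the result as a floor on the difference; the function and constant names are hypothetical.

```typescript
// Approximation from Anthropic's vision docs: tokens ≈ width * height / 750.
function estimateImageTokens(widthPx: number, heightPx: number): number {
  return Math.ceil((widthPx * heightPx) / 750);
}

const TURNS = 20;
const TREE_TOKENS_PER_TURN = 200; // upper bound quoted in this article

const screenshotTokens = TURNS * estimateImageTokens(1440, 900); // 20 * 1,728 = 34,560
const treeTokens = TURNS * TREE_TOKENS_PER_TURN;                 // 4,000
const ratio = screenshotTokens / treeTokens;                     // about 8.6
```

Even with the tree priced at its upper bound and the screenshot at its lower bound, the image path costs roughly nine times more input per workflow before retries are counted.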

How to Point Your Claude Code Extra Usage at Fazm

Four steps. After the last one, every Fazm automation run bills to the same balance as your claude CLI sessions. You do not need an Anthropic API key.

Install Fazm and launch it

Download the macOS app from fazm.ai. First run triggers onboarding. Skip the built-in key option if you want to use your own Claude subscription.

Choose Login with Claude

In Settings, pick the personal Claude account option. Fazm opens your browser at claude.ai/oauth/authorize with Anthropic's Claude Code client ID. You approve access to user:inference scope exactly as if you were running `claude /login` in a terminal.

Token lands in Keychain service Claude Code-credentials

The callback exchanges your code for an access token and refresh token. Fazm runs `security add-generic-password -U -s "Claude Code-credentials"` to upsert the record, using the same service name the Claude Code CLI uses.

Ask Fazm to do anything on your Mac

From that point on, every Fazm query bills to your Claude account and draws on the same extra usage balance as your claude CLI sessions. Per-turn costs are tracked via total_cost_usd and shown in-app.

What Bills to Extra Usage (and What Does Not)

Anthropic's rules here are not symmetric. Claude.ai, Claude Desktop, and the mobile app draw from your plan allowance first and fall back to extra usage only when that runs out. Third-party apps always draw from extra usage directly, bypassing the plan allowance.

Bills to Extra Usage Directly

Third-party apps connecting via OAuth (including Fazm)
Claude Code CLI usage beyond your session limit
Claude.ai conversations after plan allowance is exhausted
Claude Cowork agent work after plan allowance
Prepaid usage bundle balance (pulled first when available)

Auto-reload is the switch most people forget. It lives in your Claude account settings under Billing. Without it, third-party apps hard-fail when your extra usage hits zero. Claude.ai and Claude Desktop keep working within your plan allowance, but any non-Anthropic app refuses inference.
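The routing rules in this section can be summarized as a small decision function. This is a sketch of the behavior as this page describes it, not Anthropic's implementation; the type names, the `AccountState` fields, and the treatment of auto-reload as an unconditional top-up are simplifying assumptions.

```typescript
type Source = "first-party" | "third-party";
type Outcome = "plan-allowance" | "extra-usage" | "blocked";

// Hypothetical snapshot of an account at request time.
interface AccountState {
  planAllowanceLeft: boolean;
  extraUsageBalanceUsd: number;
  autoReload: boolean;
}

function routeRequest(source: Source, account: AccountState): Outcome {
  // First-party apps spend the plan allowance before touching extra usage.
  if (source === "first-party" && account.planAllowanceLeft) {
    return "plan-allowance";
  }
  // Everything else needs extra usage: a positive balance, or auto-reload to refill it.
  if (account.extraUsageBalanceUsd > 0 || account.autoReload) {
    return "extra-usage";
  }
  return "blocked";
}

const drained: AccountState = { planAllowanceLeft: true, extraUsageBalanceUsd: 0, autoReload: false };
const thirdPartyOutcome = routeRequest("third-party", drained); // "blocked"
const firstPartyOutcome = routeRequest("first-party", drained); // "plan-allowance"
```

The same account state produces two different outcomes depending on who is asking, which is the asymmetry the paragraph above warns about.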

Fazm keeps your extra usage going further

Accessibility APIs instead of screenshots. A 20-turn image cap. Per-query cost tracking. Free to start, open source.


Frequently asked questions

What is Claude Code extra usage?

Extra usage is the consumption-based budget Anthropic charges Pro, Max 5x, and Max 20x subscribers once they exhaust their plan's included session limits. Usage keeps working at standard API pricing instead of being blocked. The balance is shared across Claude.ai, Claude Code, Claude Desktop, and any third-party app that authenticates with the same Claude account.

Does Claude Code count against the same extra usage pool as Claude.ai?

Yes. Your combined usage across Claude.ai conversations and Claude Code terminal sessions counts toward the same plan limits, and once you exceed them, both interfaces draw from the same extra usage balance. Usage bundles bought in advance also work across Claude.ai, Claude Code, Claude Cowork, and third-party products.

How can a macOS app use my Claude Code extra usage balance without an API key?

Any app that authenticates with Anthropic's OAuth client ID for Claude Code (9d1c250a-e61b-44d9-88ed-5944d1962f5e) and requests the user:inference scope can obtain tokens that route inference through the same subscription billing path. Fazm, for example, runs the exact same OAuth flow as the Claude Code CLI: same client ID, same scope, same Keychain service name (Claude Code-credentials). Once you log in, Fazm's queries draw on the same extra usage balance that a terminal session would.

Why do screenshot-based AI agents burn through extra usage faster than accessibility-based ones?

Screenshot agents send full images to the model on every action. Anthropic's vision documentation estimates image cost at roughly (width × height) / 750 tokens, which puts a single 1440x900 screen capture at about 1,700 tokens; this page budgets 1,700 to 3,000 tokens per turn before the model even sees text. Accessibility-based agents read a structured text representation of the UI (every button, label, and text field as text with coordinates), which typically runs under 200 tokens for the same screen. Over a 20-step workflow the difference compounds into 10x to 30x cost deltas.

Where can I verify that Fazm and Claude Code share credentials?

Run `security find-generic-password -s "Claude Code-credentials"` in Terminal. A single record exists whether you logged in through the claude CLI's /login command or through Fazm's Login with Claude button. The record stores the OAuth access token, refresh token, and expiration under a single JSON blob keyed `claudeAiOauth`.
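If you want to inspect that blob programmatically, a sketch like the following would do it. Only the `claudeAiOauth` key comes from this page; the inner field names (`accessToken`, `refreshToken`, `expiresAt`) and the millisecond timestamp are assumptions for illustration, and the sample values are placeholders rather than real tokens.

```typescript
// Assumed shape of the record; only the claudeAiOauth key is confirmed by the text above.
interface ClaudeAiOauth {
  accessToken: string;
  refreshToken: string;
  expiresAt: number; // assumed: epoch milliseconds
}

function parseCredentialBlob(json: string): ClaudeAiOauth {
  const parsed = JSON.parse(json) as { claudeAiOauth?: ClaudeAiOauth };
  if (!parsed.claudeAiOauth) {
    throw new Error("missing claudeAiOauth key");
  }
  return parsed.claudeAiOauth;
}

// True once the stored expiration has passed.
function isExpired(creds: ClaudeAiOauth, nowMs: number): boolean {
  return nowMs >= creds.expiresAt;
}

// Placeholder blob standing in for the Keychain record's contents.
const sample = JSON.stringify({
  claudeAiOauth: { accessToken: "placeholder", refreshToken: "placeholder", expiresAt: 2000 },
});
const creds = parseCredentialBlob(sample);
const expiredNow = isExpired(creds, 1000); // false: 1000 is before 2000
```

Because both apps read and write the same record, whichever one refreshes the token last determines what the other finds on its next launch.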

What cap does Fazm impose on image-heavy turns?

Fazm's ACP bridge sets MAX_IMAGE_TURNS = 20 per session (acp-bridge/src/index.ts:676). After 20 turns that include screenshots, the bridge stops sending images on subsequent turns to avoid tripping Claude's per-message image limit and to keep the token bill bounded. Accessibility-tree context is still sent, so the agent continues to work from structured text.
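The capping behavior can be sketched in a few lines. This is an illustration of the rule as described, not the code at acp-bridge/src/index.ts; the `TurnPayload` shape and the `ImageTurnLimiter` class are hypothetical, and only the constant's name and value come from this page.

```typescript
const MAX_IMAGE_TURNS = 20; // name and value quoted in this article

// Hypothetical payload: the tree is always present, the screenshot is optional.
interface TurnPayload {
  accessibilityTree: string;
  screenshotBase64?: string;
}

class ImageTurnLimiter {
  private imageTurnsSent = 0;

  // Drops the screenshot once the cap is reached; the tree text always passes through.
  apply(payload: TurnPayload): TurnPayload {
    if (payload.screenshotBase64 === undefined) {
      return payload;
    }
    if (this.imageTurnsSent >= MAX_IMAGE_TURNS) {
      return { accessibilityTree: payload.accessibilityTree };
    }
    this.imageTurnsSent += 1;
    return payload;
  }
}
```

Counting only turns that actually carry an image means text-only turns never use up the budget, so the cap bounds image spend without shortening the session.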

What happens when my Claude Code extra usage runs out while Fazm is mid-task?

The inference request fails with a 403 or a rate limit error, and Fazm surfaces the error back to the chat. You can switch Fazm to its bundled built-in key (Fazm's own credits) or enable auto-reload on extra usage in your Claude account settings. The two modes are selectable in Settings and do not share billing pools, so you can fall back to Fazm's built-in key without touching your plan balance.

Does Fazm track how much each query costs against my extra usage?

Yes. Fazm reads total_cost_usd from every Claude session result and computes a per-turn delta (patched-acp-entry.mjs:45-47: _sessionCostUsd = item.value.total_cost_usd; _lastCostUsd = item.value.total_cost_usd - prevSessionCost). That value is logged per query and rolled up into session cost analytics, so you can see exactly which automation step was expensive.

Your Claude Code balance can automate your whole Mac.

Fazm logs in with the same OAuth identity as Claude Code, reads apps through real accessibility APIs instead of screenshots, and shows per-turn cost deltas in the app. No API key, no separate bill.