Opus Token Burn Rate - Watching It Write, Delete, and Rewrite 200-Line Functions

Fazm Team · 3 min read

Opus does not burn tokens. It vaporizes them. If you have watched Opus work in Claude Code, you have seen this pattern: it writes a 200-line function, pauses, decides it does not like the approach, deletes the entire thing, and rewrites it from scratch. Sometimes it does this twice in a row.

The Rewrite Cycle

Here is what a typical Opus session looks like from a token perspective:

  1. Opus reads the file and understands the context (input tokens)
  2. Opus writes 200 lines of implementation (output tokens)
  3. Opus re-reads its own output and identifies a problem (more input tokens)
  4. Opus deletes the 200 lines (output tokens for the edit)
  5. Opus writes a new 200-line implementation (more output tokens)
  6. Opus is satisfied and moves on

Steps 2 through 4 consume tokens but leave nothing behind in the final code. The end result is the same as if Opus had written the correct version the first time, yet the token bill covers the entire journey, false starts included.
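To make that accounting concrete, here is a minimal sketch of the cycle as a token ledger. Every number in it is an assumption chosen for illustration (12 tokens per line of code, 3,000 tokens of file context, 50 tokens for the delete edit), and `Step`, `rewrite_cycle`, and `summarize` are hypothetical names, not part of Claude Code. It also ignores the fact that a real agent session re-sends context on each turn, so treat it as a lower bound on the shape of the problem rather than a measurement.

```python
from dataclasses import dataclass

# Assumed average token count for one line of code (illustrative only).
TOKENS_PER_LINE = 12


@dataclass
class Step:
    label: str
    input_tokens: int
    output_tokens: int
    kept: bool  # True if this step's result survives into the final code


def rewrite_cycle(lines: int = 200, context_tokens: int = 3000) -> list[Step]:
    """Model steps 1-5 of the cycle with hypothetical token counts."""
    body = lines * TOKENS_PER_LINE
    return [
        Step("read file", context_tokens, 0, kept=True),
        Step("write first draft", 0, body, kept=False),
        Step("re-read own draft", body, 0, kept=False),
        Step("delete draft", 0, 50, kept=False),  # assumed cost of the edit
        Step("write final version", 0, body, kept=True),
    ]


def summarize(steps: list[Step]) -> tuple[int, int]:
    """Return (total tokens, tokens spent on steps that were thrown away)."""
    total = sum(s.input_tokens + s.output_tokens for s in steps)
    wasted = sum(s.input_tokens + s.output_tokens for s in steps if not s.kept)
    return total, wasted


if __name__ == "__main__":
    total, wasted = summarize(rewrite_cycle())
    print(f"total={total} wasted={wasted} share={wasted / total:.0%}")
```

Under these assumptions, a single rewrite accounts for a little under half of the tokens in the cycle. A second rewrite in a row would push that share higher.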

Why Opus Does This

Opus is the most capable Claude model, and its "thinking" style is genuinely different from Sonnet's. Opus explores the solution space more broadly. It considers multiple approaches, sometimes by actually implementing them before deciding which one is better.

This is not a bug - it is how Opus achieves higher quality output. The rewrite often produces meaningfully better code than the first attempt. But you pay for the exploration.

The Cost Implications

On a busy day with five parallel agents all running Opus, the token burn adds up fast. A single agent might use $20-40 in tokens for a complex refactoring task, so one round of five such tasks comes to $100-200. Repeat that across a full workday and a month of workdays, and the total becomes a significant line item.
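A back-of-the-envelope projection helps here. The sketch below takes the $20-40 per task figure and the five agents from above; the tasks per agent per day and the workdays per month are assumptions you should replace with your own numbers, and `monthly_cost_range` is a hypothetical helper, not something from a billing API.

```python
def monthly_cost_range(
    cost_per_task: tuple[float, float] = (20.0, 40.0),  # USD, from the estimate above
    agents: int = 5,
    tasks_per_agent_per_day: int = 1,  # assumption: one complex task per agent per day
    workdays_per_month: int = 20,      # assumption
) -> tuple[float, float]:
    """Return the (low, high) monthly spend in USD for the given workload."""
    low, high = cost_per_task
    task_count = agents * tasks_per_agent_per_day * workdays_per_month
    return low * task_count, high * task_count


if __name__ == "__main__":
    low, high = monthly_cost_range()
    print(f"${low:,.0f} to ${high:,.0f} per month")
```

With the default assumptions this works out to $2,000 to $4,000 per month, and it scales linearly: three complex tasks per agent per day triples the range.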

Strategies to Manage It

  • Use Sonnet for straightforward tasks - not every task needs Opus-level exploration
  • Scope tasks tightly - smaller, well-defined tasks reduce the exploration surface
  • Provide more context upfront - the more Opus knows about your preferences, the less it explores wrong approaches
  • Set constraints in CLAUDE.md - explicit constraints reduce the solution space Opus needs to explore
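The first two strategies can be written down as a rough routing rule. This is only a sketch of one possible heuristic: the `Task` fields, the three-file threshold, and `pick_model` are all hypothetical, and the returned strings are plain labels, not API model identifiers.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    files_touched: int       # rough measure of how tightly the task is scoped
    approach_is_known: bool  # True when the design is already decided


def pick_model(task: Task, max_simple_files: int = 3) -> str:
    """Return the label 'sonnet' or 'opus' for a task (illustrative heuristic)."""
    # Straightforward, tightly scoped work does not need broad exploration.
    if task.approach_is_known and task.files_touched <= max_simple_files:
        return "sonnet"
    # Open-ended or wide-reaching work is where exploration tends to pay off.
    return "opus"


if __name__ == "__main__":
    print(pick_model(Task("rename a helper", 1, True)))
    print(pick_model(Task("redesign the sync layer", 12, False)))
```

The point of the rule is less the exact threshold than the habit: decide before the session starts whether the task has an open design question worth paying to explore.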

The irony is that Opus's token-heavy approach often produces the best code. The rewrite cycle is expensive, but the final output is usually worth it. The skill is knowing when to deploy Opus versus Sonnet - save the expensive exploration for problems that actually benefit from it.

Fazm is an open source macOS AI agent, available on GitHub.
