Where Engineering Time Actually Goes in Production Agents

Matthew Diakonov

Updated March 19, 2026

production ai-agents engineering edge-cases reliability

"The token keeps him honest." That phrase stuck with me because it captures the reality of production agent work: the unglamorous plumbing that consumes roughly 80% of engineering time.

The Core Logic Is Easy

Building an agent that does the thing - sends the email, files the ticket, updates the spreadsheet - takes a day. Maybe two. The LLM call, the tool integration, the happy path. Done.

Then you deploy it and discover that the happy path is maybe 60% of actual usage.

Where the Time Goes

Token management alone is a rabbit hole. You need to track usage per request, enforce budgets, handle context window limits gracefully, and decide what to truncate when the context gets too long. Each of these sounds simple. Each has edge cases that take days to handle properly.
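To make the budget and truncation problems concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the 4-characters-per-token estimate is a crude stand-in for a real tokenizer, and `TokenBudget` and `truncate_history` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass


def estimate_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token. Swap in a real tokenizer.
    return max(1, len(text) // 4)


@dataclass
class TokenBudget:
    """Tracks token spend for one request and refuses to exceed the limit."""
    limit: int
    used: int = 0

    def charge(self, tokens: int) -> None:
        if self.used + tokens > self.limit:
            raise RuntimeError(
                f"token budget exceeded: {self.used + tokens} > {self.limit}"
            )
        self.used += tokens


def truncate_history(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep the first (system) message plus as many of the newest messages as fit."""
    if not messages:
        return []
    system, rest = messages[0], messages[1:]
    remaining = max_tokens - estimate_tokens(system["content"])
    kept = []
    # Walk backwards from the newest message and stop at the first that does not fit.
    for msg in reversed(rest):
        cost = estimate_tokens(msg["content"])
        if cost > remaining:
            break
        kept.append(msg)
        remaining -= cost
    return [system] + kept[::-1]
```

Dropping the oldest messages first is only one policy. Summarizing old turns or pinning tool results are alternatives, and each one brings its own edge cases.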

Rate limits from API providers hit at unpredictable times. You need exponential backoff, queue management, and fallback strategies. When Claude is rate-limited, do you wait or switch to a different model? What if the fallback model does not support the same tools?
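One way to answer the wait-or-switch question is to do both: back off a few times, then move on to the next provider that can actually handle the request. The sketch below assumes a hypothetical `RateLimited` exception raised by your own provider wrappers, and a provider list of `(name, call, supports_tools)` tuples. None of this is a real SDK API.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error a provider wrapper raises on a rate-limit response."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter, capped at `cap` seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_fallback(providers, request, retries=3, sleep=time.sleep):
    """Try each provider in order, retrying rate-limited calls before moving on."""
    for name, call, supports_tools in providers:
        if request.get("tools") and not supports_tools:
            continue  # skip fallbacks that cannot run this request's tools
        for attempt in range(retries + 1):
            try:
                return name, call(request)
            except RateLimited:
                # Sleep only if this provider will be tried again.
                if attempt < retries:
                    sleep(backoff_delay(attempt))
    raise RuntimeError("all providers rate-limited or incompatible")
```

Passing `sleep` as a parameter keeps the function testable without real delays. The tool-support check is deliberately blunt: a production version would probably compare the specific tools a request needs against what each fallback model supports.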

Retry logic seems straightforward until you realize that retrying a partially completed action can cause duplicate side effects. Sending an email twice. Creating two tickets. Transferring money twice. Now you need idempotency tokens, state checkpoints, and rollback mechanisms.
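The idempotency part can be sketched in a few lines. This is an illustration under assumptions, not a full solution: `IdempotentExecutor` is a hypothetical name, and the in-memory dict stands in for a durable store that survives restarts.

```python
import hashlib
import json


class IdempotentExecutor:
    """Runs each side-effecting action at most once per idempotency key."""

    def __init__(self):
        # In-memory for the sketch; a real agent needs a durable store here.
        self._results = {}

    @staticmethod
    def key_for(task_id: str, action: str, payload: dict) -> str:
        """Derive a stable key from the task, the action, and its arguments."""
        body = json.dumps(
            {"task": task_id, "action": action, "payload": payload},
            sort_keys=True,
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()

    def run(self, key: str, effect):
        """Call `effect` once per key; a retry gets the recorded result back."""
        if key in self._results:
            return self._results[key]
        result = effect()
        self._results[key] = result
        return result
```

This still leaves a gap: if the process dies after `effect()` succeeds but before the result is recorded, a retry repeats the action. Closing that gap usually means also passing the key to the downstream API, when it accepts one, so the deduplication happens on the receiving side as well.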

The Honest Breakdown

For a production agent running real tasks:

20% of time goes to the core agent logic
30% goes to error handling and recovery
25% goes to monitoring and observability
25% goes to edge cases you discover after deployment

The last category never ends. Every week brings a new edge case you did not anticipate. A user with Unicode characters in their name. An API that returns HTML instead of JSON on Tuesdays. A file that is technically valid but has a null byte in the middle.
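Most of these end up as small defensive checks at the boundaries. A minimal sketch of two of them follows; `parse_json_response` and `strip_nul` are hypothetical helper names, and the checks are illustrative rather than exhaustive.

```python
import json


def parse_json_response(content_type: str, body: bytes) -> dict:
    """Parse an API response, failing loudly when it is not the JSON we expect."""
    text = body.decode("utf-8", errors="replace")
    # An error page served in place of JSON usually starts with "<".
    if "json" not in content_type.lower() or text.lstrip().startswith("<"):
        raise ValueError(f"expected JSON, got {content_type!r}: {text[:80]!r}")
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


def strip_nul(text: str) -> str:
    """Drop NUL characters that some downstream tools choke on."""
    return text.replace("\x00", "")
```

Failing with a clear error and a snippet of the unexpected body is the point here: it turns a confusing downstream crash into a log line you can act on.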

Fazm is an open source macOS AI agent, available on GitHub.
