Tips for Secondary Models - When to Use Haiku vs Opus in AI Agents
Not every AI agent task needs the most powerful model. Running Opus for a simple button click confirmation is like hiring a surgeon to put on a bandage. It works, but you are burning money for no reason.
The key insight is that model selection should be task-driven, not workflow-driven. Within a single agent workflow, different steps have wildly different complexity requirements.
When Haiku Is Enough
Haiku handles the majority of agent micro-tasks perfectly well:
- Screen element classification - "Is this a button or a text field?" Haiku gets this right 99% of the time.
- Simple extraction - Pulling a name, date, or number from structured text.
- Yes/no verification - "Does this page contain a success message?"
- Status checks - Reading error codes, confirming page loads, checking element visibility.
- Template filling - Inserting known values into predictable form fields.
These tasks are high-frequency and low-complexity. Routing them to Haiku can cut your per-workflow cost by 60% or more.
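The task-driven routing above can be sketched as a simple lookup. This is an illustrative sketch, not Fazm's actual implementation: the task-type names, the `HAIKU_TASKS` set, and the model labels are all assumptions for the example.

```python
# Hypothetical task-type names mirroring the micro-tasks listed above.
HAIKU_TASKS = {
    "classify_element",   # "Is this a button or a text field?"
    "extract_field",      # pull a name, date, or number from structured text
    "verify_yes_no",      # "Does this page contain a success message?"
    "status_check",       # error codes, page loads, element visibility
    "fill_template",      # insert known values into predictable form fields
}

def pick_model(task_type: str) -> str:
    """Route high-frequency, low-complexity micro-tasks to Haiku;
    anything not on the cheap list defaults to Opus."""
    return "haiku" if task_type in HAIKU_TASKS else "opus"
```

The point of keeping the cheap list explicit is that it fails safe: an unrecognized task type falls through to the stronger model rather than getting a cut-rate answer.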
When You Need Opus
Opus earns its cost on tasks that require genuine reasoning:
- Multi-step planning - Deciding the sequence of actions across multiple applications.
- Error recovery - When something unexpected happens and the agent needs to figure out a new path.
- Ambiguous UI interpretation - Complex layouts where the right action is not obvious from the accessibility tree alone.
- Context-heavy decisions - Tasks that require remembering and synthesizing information from earlier steps.
The Handoff Problem
The tricky part is the handoff between tiers. If Haiku misclassifies a complex task, it might take a wrong action that Opus then has to recover from - costing more than just using Opus from the start.
A practical approach: start with Haiku, and if it returns low-confidence results or the task requires more than two reasoning steps, escalate to Opus automatically. This catches the 10-15% of tasks that genuinely need more power while keeping the other 85-90% cheap.
The best model is not always the biggest one. It is the one that matches the task.
Fazm is an open source macOS AI agent, available on GitHub.