The 3-Tool-Call Problem and Why It Matters
One tool call is usually fine. The agent reads a file, gets the content, moves on. Two tool calls introduce some risk - the second call depends on correctly interpreting the first result. Three tool calls is where things break.
The Compounding Error
Each tool call has a small chance of the agent misinterpreting the result. Say each call has a 95% chance of correct interpretation. That sounds reliable. But three calls in sequence:
- One call: 95% success
- Two calls: 90% success
- Three calls: 86% success
At five calls, you are below 80%. At ten calls, you are at 60%. The reliability degrades faster than intuition suggests.
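The compounding is just repeated multiplication, and it is easy to check directly. A minimal sketch in plain Python (the 95% per-call figure is the illustrative assumption from above, not a measured number):

```python
# Probability that every call in a chain of n sequential tool calls
# is interpreted correctly, assuming each call independently
# succeeds with probability p.
def chain_reliability(p: float, n: int) -> float:
    return p ** n

for n in (1, 2, 3, 5, 10):
    print(f"{n} calls: {chain_reliability(0.95, n):.0%}")
```

This prints 95%, 90%, 86%, 77%, and 60% for chains of 1, 2, 3, 5, and 10 calls, matching the figures above.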
Three Chances to Hallucinate
Each tool call result gets folded into the context window. The agent reads the result, reasons about it, and decides what to do next. Each reasoning step is a chance to hallucinate - to invent details that were not in the result, to misremember what a previous call returned, or to confuse outputs from different calls.
Three tool calls means three reasoning steps between actions. Three opportunities for the agent to drift from reality.
What To Do About It
- Batch operations - combine multiple reads into one call when possible
- Verify intermediate results - have the agent confirm its interpretation before proceeding
- Reduce chain depth - redesign workflows to minimize sequential dependencies
- Add checkpoints - let the agent save state and validate before continuing
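The first point is the cheapest win: collapsing several reads into one call replaces a chain of sequential interpretations with a single result to reason over. A hedged sketch of the idea, assuming a hypothetical `read_files` batched tool (the tool name and interface are illustrative, not from any specific agent framework):

```python
from pathlib import Path
import tempfile

# Hypothetical batched read tool: one call returns every file's
# contents, so the agent folds a single result into context
# instead of three results from three sequential calls.
def read_files(paths: list[str]) -> dict[str, str]:
    return {p: Path(p).read_text() for p in paths}

# Demo with temporary files standing in for real project files.
with tempfile.TemporaryDirectory() as d:
    for name in ("a.txt", "b.txt", "c.txt"):
        Path(d, name).write_text(f"contents of {name}")
    paths = [str(Path(d, n)) for n in ("a.txt", "b.txt", "c.txt")]
    # One tool call instead of three: chain depth drops from 3 to 1.
    results = read_files(paths)
    assert len(results) == 3
```

The point is not the implementation but the shape of the interface: one call, one result, one interpretation step.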
The goal is not zero tool calls. It is fewer sequential dependencies. Parallel tool calls that do not depend on each other are fine. It is the chains that kill reliability.
Fazm is an open source macOS AI agent, available on GitHub.