Getting AI Models to Follow Instructions - Atomic Task Decomposition
Why AI Models Ignore Your Instructions
You write a clear prompt. The model does something different. You add more detail. It still does something different. You add examples. It follows the examples but ignores the rest.
This is not a prompt engineering problem. It is a task structure problem.
The Atomic Step Pattern
Instead of giving the model a complex instruction like "refactor this module to use dependency injection," break it into steps that are individually verifiable:
- List all direct dependencies in the constructor
- Create an interface for each dependency
- Replace each concrete reference with the interface
- Add constructor parameters for each interface
- Update all call sites
Each step has a clear success criterion. Each step can be verified before moving to the next. The model cannot drift because each step constrains the next one.
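The step list above can be sketched as data: each step pairs an instruction with a success check, and execution halts at the first check that fails. This is a minimal illustration, not an agent framework; the `execute` callback and the placeholder checks are assumptions for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Step:
    """One atomic step: an instruction plus a check verifying its output."""
    instruction: str
    check: Callable[[str], bool]  # True if the output satisfies the step

# Hypothetical decomposition of "refactor to dependency injection".
# The check lambdas are illustrative placeholders, not real verifiers.
steps = [
    Step("List all direct dependencies in the constructor",
         lambda out: out.strip() != ""),
    Step("Create an interface for each dependency",
         lambda out: "interface" in out),
    Step("Replace each concrete reference with the interface",
         lambda out: "interface" in out),
]

def run(steps: list[Step], execute: Callable[[str], str]) -> Optional[int]:
    """Run steps in order; return the index of the first failing step,
    or None if every step's output passed its check."""
    for i, step in enumerate(steps):
        output = execute(step.instruction)
        if not step.check(output):
            return i  # the caller can retry just this step
    return None
```

Because each step carries its own success criterion, a failure points at one specific step rather than at the whole task.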
Why This Works
Large language models are not instruction followers - they are next-token predictors. When you give a complex instruction, the model predicts what a response to that instruction would look like. If the instruction is ambiguous, the prediction drifts toward the most common pattern in training data, not your specific intent.
Atomic steps reduce ambiguity to near zero. "List all direct dependencies in the constructor" has exactly one correct answer for any given code file. There is nothing to interpret.
Verification Closes the Loop
The second half of the pattern is verification. After each step, check the output before proceeding. This catches drift early.
For AI agents, this means:
- Run the code after each change. Does it still compile?
- Diff the output. Did the change match what you expected?
- Assert on specifics. Not "did it work" but "does file X now contain interface Y?"
Practical Application
When an AI agent runs a multi-step workflow, each step should produce a verifiable artifact. If step 3 produces unexpected output, you retry step 3 - not the entire workflow. This is cheaper, faster, and more reliable than running complex prompts and hoping for the best.
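A step-level retry loop might look like the sketch below. The `execute` and `verify` hooks are assumed stand-ins for your agent's step runner and artifact checker; the key property is that a failed check re-runs one step, never the whole workflow.

```python
def run_workflow(steps, execute, verify, max_retries=2):
    """Run a multi-step workflow, retrying only the step that failed.

    execute(step) produces an artifact; verify(step, artifact) checks it.
    Both are hypothetical hooks into your agent, shown for illustration.
    """
    artifacts = []
    for step in steps:
        for _attempt in range(max_retries + 1):
            artifact = execute(step)
            if verify(step, artifact):
                artifacts.append(artifact)
                break  # step verified, move to the next one
        else:
            # Retries exhausted: surface the failing step, not the whole run
            raise RuntimeError(f"step failed after retries: {step!r}")
    return artifacts
```

A flaky step only costs its own retries; verified steps are never re-executed.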
The models are capable. The bottleneck is how we structure what we ask them to do.
Fazm is an open-source macOS AI agent, available on GitHub.