Why 200K Context Models Outperform 1M When You Aggressively Clear Context
The biggest quality jump in AI agent workflows was not upgrading models. It was being more aggressive about clearing context. Treating each agent like a fresh hire changed everything.
The Counterintuitive Truth
It seems obvious that a 1 million token context window would be better than a 200K one. More context means more information, right? In practice, the opposite is true for agentic workflows.
When an agent accumulates a long context - previous attempts, failed approaches, intermediate outputs, old file contents - the signal-to-noise ratio drops. The model starts referencing stale information. It anchors to approaches it already tried. It gets confused about which version of a file is current.
A 200K model with a clean context window consistently outperforms a 1M model drowning in accumulated noise.
The Fresh Hire Mental Model
Think of each agent invocation as hiring someone new for a specific task. You would not hand a new hire every email thread from the last six months. You would give them the specific brief, the relevant files, and clear instructions.
That is exactly how you should treat your agents:
- Clear context between tasks: Do not let a refactoring session's context bleed into a bug-fixing session
- Curate inputs: Only include files that are directly relevant to the current task
- Summarize, do not accumulate: If you need information from a previous session, write a brief summary rather than carrying forward the full conversation
- Separate concerns: Use different agent sessions for different types of work
Practical Context Hygiene
The workflow that produces the best results is surprisingly simple. Start a fresh session. Give the agent exactly the files it needs. Let it complete the task. End the session. Repeat.
This feels wasteful - you are "throwing away" context that the agent already built up. But that context was likely degrading output quality, not improving it.
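The fresh-session loop can be sketched as follows. `run_agent` is a hypothetical stand-in for whatever agent API you use; the point is that each iteration builds its context from scratch and nothing survives into the next one.

```python
def run_tasks(tasks, run_agent):
    """Run each task in an isolated session: fresh context in, result out."""
    results = []
    for task in tasks:
        # Start fresh: the only context is this task's brief and files.
        context = {"instructions": task["instructions"],
                   "files": task["files"]}
        results.append(run_agent(context))
        # The session ends here; no transcript is carried into the next loop.
    return results
```

Because `context` is rebuilt inside the loop body, a refactoring task cannot bleed into a bug-fixing task even by accident.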
The Real Bottleneck
Context window size is a red herring for most workflows. The real bottleneck is context quality. A small, focused, high-signal context window beats a massive one full of irrelevant history every time.
- Stop Fighting Context Limits - Scope Each Agent
- Embeddings vs Tokens in Agent Memory
- Agent Session State Management
Fazm is an open-source macOS AI agent, available on GitHub.