Context Management Is 90% of the Skill in AI-Assisted Coding
Most people think the skill in AI-assisted coding is writing good prompts. It is not. The real skill is context management - making sure the AI agent has the right information at the right time, both within a session and across sessions.
The prompt is the last 10%. The context is the first 90%.
Why Agents Without Context Fail
An AI agent with no context is like a brilliant contractor who shows up to your job site with no blueprints, no knowledge of what was built yesterday, and no understanding of the building codes. They can do excellent work in theory, but in practice they waste most of their time rediscovering things you already know.
This is what happens when you start every AI coding session fresh. The agent does not know your architecture decisions. It does not know which approaches you already tried and rejected. It does not know your coding conventions, your deployment pipeline, or your test strategy. Every session starts from scratch.
As of late 2025, roughly 85% of developers regularly use AI tools for coding - yet most report inconsistent results. The variance is rarely about the model. It is about the context the model receives.
The Three-Layer Context Stack
A practical context architecture is layered. Each layer serves a different purpose:
Layer 1: Project-level context (CLAUDE.md)
The CLAUDE.md file at your repo root is loaded into every session automatically. Keep it under 200 lines - longer files consume more of the context window, and Claude's adherence to them decreases. Use it for genuinely permanent information:
```
# Architecture
- SwiftUI MVVM with Combine
- Do not use deprecated ScreenRecorder API (replaced in macOS 15.1)
- All async work goes through the TaskQueue singleton

# Test setup
- Run `scripts/setup-test-env.sh` before the test suite
- Integration tests require a running local Postgres on port 5433

# Rejected approaches
- URLSession streaming: tried 2025-09, caused memory pressure under load
- Background NSTask: blocked by Gatekeeper in sandboxed builds
```
Layer 2: Directory-level rules
For larger projects, the .claude/rules/ directory accepts scoped rules that only apply to specific paths. Backend rules do not pollute frontend sessions. This keeps each agent's context window focused on what is relevant.
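For illustration, a scoped rule file could look something like the sketch below. The frontmatter key (`paths`) and the rule contents are assumptions for this example, not a documented syntax - check your tooling's documentation for what it actually supports:

```
---
paths:
  - "backend/**"
---
# Backend rules
- All endpoints return errors in the standard JSON error envelope
- Database access goes through the repository layer; no raw SQL in handlers
```

A frontend session never loads this file, so its context budget stays spent on frontend concerns.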
Layer 3: Session and historical context
This is where most developers stop investing - and where the biggest gains are. A memory layer that stores decisions, failed attempts, and session summaries reduces the time agents spend rediscovering known information. Tools like the MCP memory server and claude-mem plugin can capture session transcripts, compress them, and inject relevant history into future sessions.
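Whatever tool captures them, the summaries themselves can stay plain markdown. A hypothetical compressed session summary (file names and figures invented for illustration) might look like:

```
## Session 2026-01-12: streaming backpressure fix
- Touched: StreamBuffer.swift, TaskQueue.swift
- Outcome: bounded the output buffer at 64 chunks; all tests green
- Dead end: custom AsyncThrowingStream buffering - reverted after an hour
- Follow-up: profile memory under sustained high-throughput output
```

The format matters less than the habit: the next session can read the dead end and skip it.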
A 2025 analysis found that context editing - selectively pruning what goes into the context window - reduced token consumption by 84% while allowing agents to complete workflows that would otherwise fail due to context exhaustion.
CLAUDE.md Alone Is Not Enough
CLAUDE.md is well understood. The historical context layer is not. Here is a concrete pattern for maintaining it:
Keep a decisions.md file in your repo (not CLAUDE.md, but referenced from it). Add to it after significant architectural choices:
```
## 2025-11-14: Switched from WebSockets to SSE for agent streaming

**Decision**: Use Server-Sent Events instead of WebSockets for streaming tool output.
**Reason**: WebSockets require custom reconnect logic; SSE is handled natively by browsers
and has simpler proxy behavior behind Nginx. Tested both under load - SSE was more reliable.
**Status**: Shipped, working well.

## 2026-01-08: Removed Redis session cache

**Decision**: Remove Redis and store session state in Postgres with a short TTL index.
**Reason**: Redis added operational complexity for no measurable latency benefit at our
current scale (< 500 concurrent sessions). Revisit if we exceed 5k concurrent.
**Status**: Shipped. Watch Postgres query times if load increases.
```
When an agent starts a new session and needs to make decisions that touch on these areas, the history is there. The agent does not start from first principles.
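The reference from CLAUDE.md can be a single instruction - one way to phrase it:

```
# Decision history
- Before proposing architectural changes, read decisions.md for prior
  decisions and rejected approaches
```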
The 90% Rule in Practice
If you track where time actually goes during an AI coding session, the pattern is consistent. Agents that lack context spend a large portion of each session establishing what they should already know - asking for file contents, inferring conventions from examples, reproducing reasoning from scratch.
An agent with well-maintained context spends that time writing and verifying code. The ratio of useful work to setup work is what distinguishes a highly productive AI-assisted workflow from a frustrating one.
A few things that push the ratio in the right direction:
- CLAUDE.md under 200 lines, updated whenever architectural decisions change
- A decisions log for significant choices and rejected approaches
- Session summaries stored somewhere the next session can reference
- Directory-scoped rules so agents only see context relevant to their current task
Context Is Infrastructure
Treating context as infrastructure - something worth designing, maintaining, and investing in - is the mindset shift that separates developers who get substantial leverage from AI coding tools from those who get inconsistent results.
Prompting well matters. But if you spend an hour on a prompt and thirty seconds on context, you have inverted the priority.
Fazm is an open source macOS AI agent, available on GitHub.