Specialist AI Agents with Scoped Context: Why Fighting Your AI Tools Means Missing Context
Most developers who say "AI coding tools do not work for my codebase" are experiencing a context problem, not a model capability problem. The same model that writes perfect code for a greenfield project struggles with an existing codebase because it does not know your conventions, your architecture decisions, or why things are structured the way they are. The fix is not waiting for a smarter model. It is giving each agent exactly the context it needs, nothing more and nothing less. This guide covers Claude Code's project memory system, how to structure specs and skills for specialist agents, and the practical workflow for running scoped agents that consistently produce useful output.
1. The Context Problem: Why AI Agents Produce Bad Output
When an AI coding agent produces code that does not fit your codebase, the instinct is to blame the model: it is not smart enough, it does not understand my framework, it keeps using patterns I do not want. But in almost every case, the agent is doing exactly what you would expect given the information it has access to.
Consider a React codebase that uses a custom hook pattern for data fetching instead of the standard useEffect approach. If the agent does not know about this custom hook, it will write useEffect-based data fetching code every time. It is not wrong in general. It is wrong in the context of your codebase. The model's training data contains thousands of examples of useEffect-based fetching, and without explicit instructions to do otherwise, it will follow the most common pattern.
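One explicit line in a context file closes this gap. A minimal sketch of such a rule, assuming a project-specific `useFetch` hook (the file path and return shape are illustrative):

```markdown
## Data fetching
- Never fetch data with raw `useEffect`. Use the `useFetch` hook
  from `src/hooks/useFetch.ts` for all data fetching.
- `useFetch` returns `{ data, error, isLoading }`; handle all three states.
```

With this in scope, the agent's most-common-pattern default is overridden before it writes a single line.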
The same dynamic applies to every aspect of a codebase. Naming conventions, error handling patterns, state management approaches, testing strategies, file organization, import ordering, even comment style. Every codebase has hundreds of implicit rules that experienced developers learn over months. AI agents start from zero every session unless you explicitly provide this context.
2. Claude Code's Project Memory System
Claude Code uses a layered memory system to provide context to agents. At the top is the CLAUDE.md file at your project root, which is loaded into every agent session automatically. Below that are directory-level CLAUDE.md files that provide context specific to parts of your codebase. Below that are skill files in the .claude/ directory that provide task-specific instructions.
The project-level CLAUDE.md is where your universal rules live: coding standards, architecture conventions, forbidden patterns, and operational procedures. It should be concise and specific. "Use TypeScript strict mode" is good. "Write good code" is useless. Every rule in CLAUDE.md should be something that, if violated, would fail a code review.
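As an illustration, a project-level CLAUDE.md at this level of specificity might look like the following (the individual rules and commands are examples, not prescriptions):

```markdown
# Project conventions

- TypeScript strict mode is mandatory; `any` fails code review.
- All new components are functional components with hooks.
- API responses use the envelope format: `{ data, error, meta }`.
- Never commit directly to `main`; all changes go through a PR.
- Run the test suite and linter before creating a PR.
```

Every line here is checkable: a reviewer could point at a diff and say which rule it violates.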
Directory-level CLAUDE.md files let you scope context to specific parts of the codebase. A CLAUDE.md in src/database/ might describe the ORM conventions, migration workflow, and connection pooling strategy. A CLAUDE.md in src/api/ might describe the endpoint naming scheme, error response format, and authentication middleware usage. These files are only loaded when the agent reads or writes files in that directory, keeping the context window focused.
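A directory-level file can afford to be more concrete. A sketch for `src/database/`, with hypothetical paths:

```markdown
# src/database/ conventions

- All queries go through the repository layer in `repositories/`;
  route handlers never touch the ORM directly.
- Schema changes require a migration file; never edit tables in place.
- New queries against large tables must be checked against existing
  indexes before merging.
```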
3. Specialist Agents vs. General-Purpose Agents
A general-purpose agent tries to do everything: write frontend code, configure the database, update CI/CD pipelines, and write documentation. It needs context about the entire system, which either overflows the context window or forces you to provide shallow summaries of everything instead of deep context about anything.
A specialist agent has a narrow scope and deep context for that scope. A "frontend feature agent" knows your component library, design tokens, routing conventions, and state management patterns. It does not know or care about database migrations. A "database agent" knows your schema, indexing strategy, migration workflow, and query optimization patterns. It does not know about your frontend.
| Aspect | General-Purpose Agent | Specialist Agent |
|---|---|---|
| Context depth | Shallow across everything | Deep in one domain |
| Output quality | Inconsistent | Consistently good within scope |
| Context window usage | Bloated with irrelevant info | Focused and efficient |
| Error rate | Higher, misapplies patterns | Lower, follows domain rules |
| Setup effort | Low | Medium (one-time investment) |
The trade-off is setup time. Creating specialist context files takes effort upfront. But the investment pays off on the second task, when the agent produces code that matches your codebase conventions without any manual corrections.
4. Structuring Scoped Context Files
A good scoped context file answers three questions: what are the rules in this part of the codebase, what patterns should the agent follow, and what mistakes should it avoid? Structure your context files around these three categories.
Rules are non-negotiable constraints. "All API responses use the envelope format: { data, error, meta }." "Database queries go through the repository layer, never directly in route handlers." "Error messages are user-facing and must not expose internal details." Rules prevent the agent from introducing inconsistencies that would fail code review.
Patterns are preferred approaches. "Use the createSlice pattern for new Redux state." "Prefer composition over inheritance for component reuse." "Use the Result type for operations that can fail instead of throwing exceptions." Patterns guide the agent toward idiomatic code for your project without being absolute requirements.
Anti-patterns are specific mistakes to avoid. "Do not use the legacy fetchData helper, it has a memory leak. Use the new useFetch hook instead." "Do not add new dependencies without checking the bundle size impact." These are especially valuable because they catch errors that the model would make based on its training data but that are wrong in your specific context.
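Put together, a scoped context file organized around the three categories might look like this for a frontend directory (the specific rules are illustrative):

```markdown
# src/components/ context

## Rules
- All components are functional components written in TypeScript.
- Error messages shown to users must not expose internal details.

## Patterns
- Prefer composition over inheritance for component reuse.
- Use design tokens from `src/theme/` instead of hard-coded values.

## Anti-patterns
- Do not use the legacy `fetchData` helper (memory leak); use `useFetch`.
- Do not add new dependencies without checking bundle size impact.
```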
5. Skills and Specs: The Two Pillars of Agent Context
Skills define how an agent works. They contain workflow instructions, tool configurations, and procedural knowledge. A "ship feature" skill might instruct the agent to read the spec file, create a feature branch, implement the changes, run tests, fix any failures, and create a pull request. Skills are reusable across projects and tasks.
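The "ship feature" skill described above could be encoded as an ordered checklist; the file name and branch convention here are illustrative:

```markdown
# Skill: ship-feature

1. Read the spec file referenced in the task.
2. Create a feature branch named `feature/<short-description>`.
3. Implement the changes, following all CLAUDE.md rules in scope.
4. Run the test suite; fix any failures before proceeding.
5. Create a pull request that links back to the spec.
```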
Specs define what an agent builds. They contain requirements, acceptance criteria, technical constraints, and design decisions for a specific piece of work. A spec for a new API endpoint might include the request and response schemas, authentication requirements, rate limiting rules, and edge cases to handle. Specs are task-specific and disposable.
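A spec for the API endpoint example might look like the following. The endpoint, limits, and envelope format are illustrative:

```markdown
# Spec: rate-limited search endpoint

## Requirements
- `GET /api/v1/search?q=<query>` returns matching items.
- Response uses the standard envelope: `{ data, error, meta }`.

## Constraints
- Requires an authenticated session; anonymous requests get 401.
- Rate limit: 30 requests per minute per user; excess gets 429.

## Edge cases
- An empty query returns an empty `data` array, not an error.
- Queries over 256 characters return 400 with a user-facing message.
```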
The combination of skills and specs gives you repeatable, high-quality agent output. The skill ensures the agent follows your development workflow. The spec ensures it builds the right thing. When agent output is wrong, you can diagnose whether the problem is in the skill (wrong process) or the spec (wrong requirements) and fix the source rather than manually correcting the output.
6. Debugging Bad Agent Output: A Context-First Approach
When an AI agent produces output that does not match your expectations, apply this diagnostic framework before blaming the model. First, check whether the context files include the specific rule or pattern the agent violated. If you expected the agent to use your custom error handler but never mentioned it in any context file, the agent had no way to know.
Second, check whether the context is ambiguous or contradictory. If your CLAUDE.md says "use functional components" but your codebase has class components everywhere, the agent gets mixed signals. The fix is to update the context to acknowledge the current state and describe the target state: "Existing class components should not be refactored unless explicitly requested. All new components must be functional components."
Third, check whether the context window is overloaded. If you are loading 50,000 tokens of context before the agent even starts working, the model may deprioritize or miss specific rules. This is where specialist agents with scoped context win: fewer tokens of context mean each instruction gets more attention from the model.
Tools like Fazm, a voice-first, open-source AI computer agent for macOS built on accessibility APIs, take a similar approach to context scoping for desktop automation. Each task gets targeted context about the specific application and workflow, rather than trying to understand every application on the system simultaneously.
7. The Complete Workflow in Practice
Here is the workflow distilled into concrete steps. Start by writing a project-level CLAUDE.md with your universal coding standards and architecture rules. Keep it under 2,000 tokens. Everything in this file applies to every agent session, so only include rules that are truly universal.
Next, add directory-level CLAUDE.md files for each major area of your codebase. These contain domain-specific context: the database layer gets migration conventions, the API layer gets endpoint patterns, the frontend gets component guidelines. These load only when relevant, so they can be more detailed without bloating every session.
Create skill files for your common workflows: ship a feature, fix a bug, write tests, perform a refactor. Each skill defines the step-by-step process the agent should follow. Skills are your development playbook encoded in a format that agents can execute.
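After these steps, a repository might end up with a layout like this. Exact skill file locations vary by Claude Code version, so treat the paths as illustrative:

```
.
├── CLAUDE.md                  # universal rules, loaded every session
├── .claude/
│   └── skills/
│       ├── ship-feature.md
│       └── fix-bug.md
├── specs/
│   └── search-endpoint.md     # task-specific, disposable
└── src/
    ├── api/CLAUDE.md          # endpoint patterns
    ├── database/CLAUDE.md     # migration conventions
    └── components/CLAUDE.md   # component guidelines
```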
When you have a task, write a spec file that describes exactly what needs to be built. Reference relevant context files in the spec. Then launch a specialist agent with the appropriate skill and spec. The agent has deep context for its domain, clear instructions for the workflow, and specific requirements for the task.
After the agent completes, review the output. If something is wrong, update the relevant context file, not the output. Every correction should improve future agent sessions, not just fix the current one. Over time, your context files accumulate the institutional knowledge of your codebase, and agent output quality improves continuously.
Scoped Context for Desktop Tasks
Fazm brings the same scoped-context approach to macOS desktop automation, giving each task targeted knowledge about the apps and workflows it needs to control.
Try Fazm Free