Agent Architecture

MCP Tools vs. Custom Skills vs. Subagent Orchestration: A Practical Decision Guide

You are running five parallel Claude agents on a codebase. One handles frontend, one handles backend, one writes tests, one manages infrastructure, and one coordinates the others. Each agent needs tools, but should those be MCP servers, custom skills, or subagent delegations, or should you let the agent figure it out? The answer depends on the task, the context budget, and how much control you need. This guide breaks down the trade-offs so you can make informed decisions about your agent architecture.

1. Three Abstraction Layers

AI agents interact with the world through three increasingly abstract layers. Understanding these layers is the foundation for making good architectural decisions.

MCP tools are the lowest layer. They are individual functions exposed through the Model Context Protocol - read a file, execute a query, send a message, click a button. Each tool does one thing. The agent decides when and how to combine them. MCP tools are to agents what system calls are to applications: primitive operations that compose into higher-level behavior.

Custom skills are the middle layer. A skill is a packaged workflow that combines multiple tool calls with prompting logic, error handling, and domain knowledge. A "deploy to production" skill might run tests, build the project, push to a registry, update infrastructure, and verify the deployment - all through a single invocation. Skills encapsulate expertise that would be expensive to re-derive from first principles every time.

Subagent orchestration is the highest layer. Instead of calling a tool or invoking a skill, you spawn a separate agent instance with its own context window, its own tools, and a specific mandate. The orchestrating agent delegates a chunk of work to the subagent and gets back a result. Subagents are to skills what microservices are to functions: independently executing units with their own state and resources.

2. MCP Tools - The Standardized Interface

The Model Context Protocol provides a standardized way to expose tools to AI agents. An MCP server declares its capabilities - a list of tools with typed parameters and descriptions - and any MCP-compatible agent can discover and use them. This standardization is MCP's primary value proposition.

MCP tools work best for operations that are atomic, stateless, and have clear input/output contracts. Reading a file, querying a database, sending an API request, manipulating a UI element - these are natural MCP tools. Each call is self-contained, and the agent maintains the state and sequencing logic.

The strength of MCP tools is composability. Because each tool is small and focused, the agent can combine them in novel ways that the tool author did not anticipate. A file system MCP server was not designed for code refactoring, but an agent can use its read, write, and search tools to perform refactoring by composing them appropriately.
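
A minimal sketch of that composition, assuming a hypothetical in-memory file store in place of a real file system MCP server. The `Tool` wrapper, the tool names, and `rename_symbol` are all illustrative and not part of any MCP SDK; the point is only that three small tools compose into a refactoring that none of them was written for.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for a file system MCP server.
# A real server would expose these operations over the protocol.
FILES: Dict[str, str] = {
    "app.py": "def fetch_usr():\n    return fetch_usr_impl()\n",
    "util.py": "def helper():\n    return 1\n",
}

@dataclass
class Tool:
    name: str
    description: str
    run: Callable[..., object]

def read_file(path: str) -> str:
    return FILES[path]

def write_file(path: str, content: str) -> None:
    FILES[path] = content

def search(text: str) -> List[str]:
    return sorted(p for p, body in FILES.items() if text in body)

TOOLS = {t.name: t for t in [
    Tool("read_file", "Return the contents of one file", read_file),
    Tool("write_file", "Replace the contents of one file", write_file),
    Tool("search", "List files containing a string", search),
]}

def rename_symbol(old: str, new: str) -> List[str]:
    """One sequence an agent might choose: search, then read and write each hit."""
    changed = []
    for path in TOOLS["search"].run(old):
        body = TOOLS["read_file"].run(path)
        TOOLS["write_file"].run(path, body.replace(old, new))
        changed.append(path)
    return changed

changed_files = rename_symbol("fetch_usr", "fetch_user")
```

In a real agent the sequencing inside `rename_symbol` lives in the model's reasoning rather than in code, which is exactly why each step shows up in the context window.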

The weakness is that every tool call consumes context. A 10-step workflow using raw MCP tools generates 10 tool calls, 10 results, and the agent must reason about the sequence and handle errors at every step. For frequently repeated workflows, this context cost is wasteful - you are paying tokens for the agent to re-derive the same workflow logic every time.

Tools like Fazm use MCP as the extensibility layer for macOS automation. The core agent provides MCP tools for accessibility tree traversal, element interaction, and application control. Third-party MCP servers can extend Fazm with database access, API integrations, or any other capability - all through the standard protocol.

3. Custom Skills - The Reusable Workflow

A custom skill packages a multi-step workflow into a single invocation. When the agent calls a skill, it gets the result of the entire workflow rather than needing to orchestrate each step. This saves context and can improve reliability, because the skill's internal logic is written and tested once rather than re-derived on every run.

Skills are the right abstraction when you have a workflow that meets three criteria: it is used frequently (at least weekly), it has a well-defined sequence of steps, and the steps rarely need modification. Deployment workflows, database migration sequences, code formatting pipelines, and report generation are all natural candidates for skills.

The key advantage of skills over raw MCP tools is context efficiency. A deployment skill that internally makes 15 API calls presents as a single tool call to the agent. The agent's context window contains one invocation and one result instead of 15 of each. For agents with limited context budgets - or agents running in parallel where token costs multiply - this compression is significant.
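
The arithmetic can be made concrete with a rough sketch. The figure of 200 tokens per context entry is an invented placeholder, not a measurement, and the function names are hypothetical; only the ratio between the two cases matters.

```python
def context_entries(internal_steps: int, packaged_as_skill: bool) -> int:
    """Count the invocation and result entries that land in the agent's context."""
    visible_calls = 1 if packaged_as_skill else internal_steps
    return visible_calls * 2  # one invocation plus one result per visible call

def fleet_tokens(entries: int, tokens_per_entry: int, agents: int) -> int:
    """Total tokens if every agent in the fleet runs the same workflow once."""
    return entries * tokens_per_entry * agents

TOKENS_PER_ENTRY = 200  # assumed average, for illustration only

raw_entries = context_entries(15, packaged_as_skill=False)
skill_entries = context_entries(15, packaged_as_skill=True)

raw_cost = fleet_tokens(raw_entries, TOKENS_PER_ENTRY, agents=5)
skill_cost = fleet_tokens(skill_entries, TOKENS_PER_ENTRY, agents=5)
```

Under these assumptions the raw-tool version puts 30 entries into each agent's context and the skill version puts 2, a 15x difference that carries through unchanged when multiplied across five agents.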

The disadvantage is rigidity. A skill encodes a specific workflow. If your deployment process changes, you need to update the skill. If you need a variation (deploy to staging vs. production), you need either a parameterized skill or two separate skills. Skills trade flexibility for efficiency - the right trade-off when the workflow is stable, the wrong one when it is still evolving.
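
One way to handle the staging vs. production variation is a single parameterized skill. The sketch below assumes each step is a callable that returns True on success; the step names follow the deployment example from earlier in this guide, and the lambdas are placeholders for real tool calls.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[str], bool]]

def make_deploy_skill(steps: List[Step]) -> Callable[[str], dict]:
    """Package a fixed step sequence behind one call that takes an environment."""
    def deploy(environment: str) -> dict:
        if environment not in ("staging", "production"):
            raise ValueError(f"unknown environment: {environment}")
        completed = []
        for name, action in steps:
            if not action(environment):
                # Stop at the first failure and report where it happened.
                return {"ok": False, "failed_step": name, "completed": completed}
            completed.append(name)
        return {"ok": True, "failed_step": None, "completed": completed}
    return deploy

# Placeholder steps; a real skill would call build and infrastructure tools here.
steps = [
    ("run_tests", lambda env: True),
    ("build", lambda env: True),
    ("push_to_registry", lambda env: True),
    ("update_infrastructure", lambda env: True),
    ("verify_deployment", lambda env: env == "staging"),  # simulate a production failure
]

deploy = make_deploy_skill(steps)
staging_result = deploy("staging")
production_result = deploy("production")
```

The agent sees one call and one compact result per deployment. The rigidity is visible too: adding a step or reordering the sequence means editing `steps`, not re-prompting the agent.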

A practical heuristic: if you find yourself giving the agent the same multi-step instructions more than three times, extract those instructions into a skill. The repetition is a signal that the workflow has stabilized enough to benefit from encapsulation.

4. Subagent Orchestration - The Delegation Pattern

Subagent orchestration is the most powerful and most expensive abstraction. Instead of calling a tool or skill, the orchestrating agent spawns a new agent instance, gives it a mandate, and waits for the result. The subagent has its own context window, its own tool access, and makes its own decisions about how to accomplish the mandate.

The primary use case for subagents is work that requires independent reasoning. If the task needs the agent to read code, form a mental model, make judgment calls, and iterate on an approach, it benefits from a dedicated context window. A subagent tasked with "review this PR for security issues" needs to load the relevant code, reason about attack vectors, and compose findings - work that is better done in a fresh context than squeezed into an already-busy parent context.

Subagents are also the right choice for parallel work. If you need five independent tasks done simultaneously - frontend changes, backend changes, test writing, documentation, and deployment configuration - five parallel subagents will complete the work faster than a single agent doing them sequentially. At best, the wall-clock speedup approaches the number of truly independent tasks; in practice it is bounded by the slowest task plus the time spent on coordination.
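
The effect on wall-clock time can be simulated with Python's `asyncio`. In this sketch `run_subagent` is a hypothetical placeholder that sleeps instead of calling a model, and the durations are invented; it shows only that concurrent elapsed time tracks the slowest task rather than the sum of all tasks.

```python
import asyncio
import time

async def run_subagent(mandate: str, seconds: float) -> str:
    # Placeholder for a real subagent; the sleep simulates independent work.
    await asyncio.sleep(seconds)
    return f"{mandate}: done"

# Invented durations in seconds, one per independent task.
TASKS = {
    "frontend": 0.05,
    "backend": 0.10,
    "tests": 0.08,
    "docs": 0.03,
    "deploy-config": 0.06,
}

async def run_parallel():
    start = time.perf_counter()
    # gather runs all subagents concurrently and returns results in input order.
    results = await asyncio.gather(*(run_subagent(m, s) for m, s in TASKS.items()))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(run_parallel())
sequential_estimate = sum(TASKS.values())
```

Here the elapsed time lands near the 0.10 second backend task, while running the same work sequentially would take about 0.32 seconds. Tasks that depend on each other would not parallelize this cleanly.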

The costs of subagent orchestration are significant. Each subagent fills its own context window, so token spend grows with every subagent you spawn. Coordination between subagents requires explicit communication mechanisms (shared files, status updates, dependency ordering). And debugging multi-agent failures is harder than debugging single-agent workflows because the state is distributed.

The practical rule: use subagents when the task requires judgment, not just execution. If the task can be described as a deterministic sequence of steps, a skill is more efficient. If it requires reading, thinking, and adapting, a subagent is worth the token cost.

5. Decision Matrix

Use this matrix to decide which abstraction layer to use for a given task:

Factor	Use MCP Tools	Use Custom Skills	Use Subagents
Task complexity	1-3 steps	4-15 steps, deterministic	Open-ended, requires judgment
Frequency	Any	Repeated regularly	As needed
Context cost	Low per call, but adds up across steps	Low (compressed)	High (full context window)
Flexibility	Maximum	Parameterized	Maximum
Error handling	Agent handles	Skill handles internally	Subagent handles
Parallelism	Sequential (same context)	Sequential (same context)	Parallel (separate contexts)
Example	Read a file, click a button	Deploy, run test suite	PR review, feature implementation

A common mistake is defaulting to subagents for everything. Subagents are powerful but expensive. If a task can be accomplished with a skill, the skill is more efficient. If it can be accomplished with a few MCP tool calls, that is even better. Reserve subagents for work that genuinely requires independent reasoning.
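
The matrix can be read as a small decision function. This is a sketch of one possible encoding, not a definitive rule: the `Task` fields and the threshold of four steps are taken from the table above, and real tasks will often need more nuance than four attributes capture.

```python
from dataclasses import dataclass

@dataclass
class Task:
    steps: int            # rough number of steps involved
    deterministic: bool   # same sequence every time
    repeated: bool        # run regularly rather than once
    needs_judgment: bool  # requires reading, thinking, and adapting

def choose_layer(task: Task) -> str:
    """Pick the cheapest layer that fits, checking subagents last in cost order."""
    if task.needs_judgment:
        return "subagent"
    if task.steps >= 4 and task.deterministic and task.repeated:
        return "skill"
    # Short or one-off deterministic work: let the agent compose raw tools.
    return "mcp_tools"

read_a_file = choose_layer(Task(steps=1, deterministic=True, repeated=False, needs_judgment=False))
deploy = choose_layer(Task(steps=8, deterministic=True, repeated=True, needs_judgment=False))
pr_review = choose_layer(Task(steps=20, deterministic=False, repeated=True, needs_judgment=True))
one_off_migration = choose_layer(Task(steps=6, deterministic=True, repeated=False, needs_judgment=False))
```

The one-off migration falls through to raw tools because it is not repeated; under this encoding a workflow earns a skill only once it recurs.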

6. Context Management for Parallel Agents

When running multiple agents in parallel, context management becomes the primary architectural challenge. Each agent has its own context window, which means each agent has its own partial view of the project state. Without explicit coordination, parallel agents will make conflicting decisions.

The most effective coordination mechanisms are file-based. A shared spec file (like CLAUDE.md) ensures all agents follow the same conventions. A shared task list prevents two agents from working on the same thing. File-level locking or ownership conventions prevent two agents from editing the same file simultaneously.

Context budgeting is another critical practice. If you are running five parallel agents, your total token spend is roughly five times what a single agent would use, and it gets expensive quickly. The way to control costs is to give each agent only the tools and context it needs. The frontend agent does not need database tools. The test agent does not need deployment tools. Scoping each agent's toolkit reduces both token usage and the probability of the agent taking unintended actions.
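
Tool scoping can be as simple as an allowlist per agent. The agent names and tool names below are hypothetical; in a real setup the scope would decide which MCP servers or tools each agent is configured with.

```python
from typing import Dict, Set

# Hypothetical tool catalog and per-agent allowlists.
ALL_TOOLS: Set[str] = {
    "read_file", "write_file", "run_query", "run_tests", "deploy", "browser_click",
}

AGENT_SCOPES: Dict[str, Set[str]] = {
    "frontend": {"read_file", "write_file", "browser_click"},
    "backend": {"read_file", "write_file", "run_query"},
    "tests": {"read_file", "write_file", "run_tests"},
    "infra": {"read_file", "deploy"},
}

def tools_for(agent: str) -> Set[str]:
    """Return the agent's allowlist, rejecting scopes that name unknown tools."""
    scope = AGENT_SCOPES[agent]
    unknown = scope - ALL_TOOLS
    if unknown:
        raise ValueError(f"{agent} scope references unknown tools: {sorted(unknown)}")
    return scope

def authorize(agent: str, tool: str) -> bool:
    """True only if the tool is inside the agent's scope."""
    return tool in tools_for(agent)

frontend_can_query = authorize("frontend", "run_query")
backend_can_query = authorize("backend", "run_query")
tests_can_deploy = authorize("tests", "deploy")
```

Only the tools in an agent's scope need to be described in its context, and a tool that was never exposed cannot be called by mistake.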

A practical pattern for parallel agent coordination:

Shared spec file - All agents read the same CLAUDE.md for conventions and constraints.
File ownership - Each agent "owns" specific directories or files. If an agent needs to modify a file owned by another agent, it writes to a staging area and the owner merges.
Status updates - Each agent writes its status (in progress, blocked, complete) to a shared status file. The orchestrator reads this file to coordinate dependencies.
Conflict resolution - When two agents modify related files, a reconciliation step (either automated or via the orchestrator) resolves conflicts before merging.
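
The ownership and status conventions above can be sketched with plain files. The directory prefixes, the `staging/` layout, and the JSON status format are assumptions made for illustration; the status update is a simple read-modify-write, so a real multi-process setup would also need file locking.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Optional

# Hypothetical ownership map: each agent owns one directory prefix.
OWNERS = {"frontend": "web/", "backend": "api/", "tests": "tests/"}
VALID_STATES = {"in progress", "blocked", "complete"}

def owner_of(path: str) -> Optional[str]:
    for agent, prefix in OWNERS.items():
        if path.startswith(prefix):
            return agent
    return None

def target_path(agent: str, path: str) -> str:
    """Owners edit in place; everyone else writes to a staging area for the owner to merge."""
    if owner_of(path) == agent:
        return path
    return f"staging/{agent}/{path}"

def write_status(status_file: Path, agent: str, state: str) -> None:
    if state not in VALID_STATES:
        raise ValueError(f"invalid state: {state}")
    # Not atomic: concurrent writers would need a lock around this block.
    statuses = json.loads(status_file.read_text()) if status_file.exists() else {}
    statuses[agent] = state
    status_file.write_text(json.dumps(statuses, indent=2))

def blocked_agents(status_file: Path) -> List[str]:
    """What the orchestrator reads to find agents waiting on a dependency."""
    statuses = json.loads(status_file.read_text())
    return sorted(a for a, s in statuses.items() if s == "blocked")

with tempfile.TemporaryDirectory() as tmp:
    status_file = Path(tmp) / "status.json"
    write_status(status_file, "frontend", "in progress")
    write_status(status_file, "backend", "blocked")
    blocked = blocked_agents(status_file)

own_edit = target_path("frontend", "web/app.tsx")
foreign_edit = target_path("tests", "api/server.py")
```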

7. Putting It Together - A Practical Architecture

A well-designed agent system uses all three abstraction layers. Here is what a production architecture typically looks like:

Base layer: MCP tools. Every external interaction goes through an MCP server. File operations, database access, browser automation, API calls - all exposed as standardized tools. This creates a consistent, auditable interface between the agent and the outside world.

Middle layer: skills. Frequently repeated workflows are packaged as skills. Deployment, testing, code formatting, PR creation, environment setup - each is a skill that internally uses MCP tools but presents as a single high-level operation to the agent.

Top layer: subagents. Large, judgment-heavy tasks are delegated to subagents. Feature implementation, code review, bug investigation, architecture design - each gets a dedicated agent with its own context and tool access.

Orchestration: coordinator agent. A single coordinator agent manages the overall workflow. It reads the task list, spawns subagents for large tasks, invokes skills for routine operations, and uses MCP tools for simple queries. The coordinator spends most of its context budget on coordination logic rather than execution.
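
A rough sketch of the coordinator's dispatch loop, assuming each layer is represented by a hypothetical handler that returns both its full output and a short summary. The handlers are stubs; the point is that the coordinator retains only the summaries, which is how it keeps its own context for coordination.

```python
from typing import Callable, Dict, List, Tuple

# Each handler returns (full_output, short_summary). All three are stubs.
Handler = Callable[[str], Tuple[str, str]]

def via_tool(task: str) -> Tuple[str, str]:
    return (f"raw result of {task}", f"tool ok: {task}")

def via_skill(task: str) -> Tuple[str, str]:
    log = "\n".join(f"step {i} log for {task}" for i in range(1, 16))
    return (log, f"skill ok: {task}")

def via_subagent(task: str) -> Tuple[str, str]:
    transcript = f"long transcript for {task}\n" * 50
    return (transcript, f"subagent ok: {task}")

HANDLERS: Dict[str, Handler] = {
    "tool": via_tool,
    "skill": via_skill,
    "subagent": via_subagent,
}

def coordinate(task_list: List[Tuple[str, str]]) -> List[str]:
    """Dispatch each task to its layer and keep only the summary."""
    coordinator_context = []
    for kind, task in task_list:
        _full_output, summary = HANDLERS[kind](task)
        coordinator_context.append(summary)  # the full output is deliberately dropped
    return coordinator_context

coordinator_log = coordinate([
    ("tool", "read config"),
    ("skill", "deploy"),
    ("subagent", "review PR"),
])
```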

This layered architecture mirrors how effective human teams work. The coordinator is the project manager. Subagents are the senior engineers who work independently on complex tasks. Skills are the standard operating procedures that anyone can follow. MCP tools are the shared infrastructure that everyone uses. Each layer has a clear purpose, and the boundaries between them are well-defined.

The agent ecosystem is evolving rapidly, and the boundaries between these layers will continue to shift. But the underlying principle is stable: match the abstraction to the task. Simple operations get simple tools. Routine workflows get packaged skills. Complex work gets dedicated agents. Getting this mapping right is the difference between an agent system that works and one that burns through tokens without delivering results.
