Tokens Used Loading MCP Tools - Measuring and Reducing the Overhead
Every time your AI coding tool starts a conversation, it loads tool schemas into the context window. With MCP servers, those schemas add up fast. 31 tools can eat 3,000 to 5,000 tokens before a single user message is processed.
That is real money and real context window space consumed by tool definitions your agent might never use in that conversation.
Measuring the Actual Cost
You can measure tool token overhead by comparing the token count of an empty conversation with and without MCP servers enabled. The difference is your tool loading cost. In Cursor with three MCP servers providing 31 tools, the overhead was consistently 3,200 to 4,800 tokens per conversation.
For a developer running 50 conversations a day, that is 160,000 to 240,000 tokens per day spent on tool schemas alone. At current API pricing, that adds up.
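The measurement itself is simple arithmetic once you have the two token counts. A minimal sketch, assuming you can read prompt token counts from your tool's API usage stats (all numbers below are illustrative placeholders, not real measurements):

```python
# Token counts read from an otherwise-empty conversation's usage stats.
# These values are illustrative placeholders, not real measurements.
baseline_tokens = 1_100       # system prompt only, MCP servers disabled
with_mcp_tokens = 5_300       # same setup with 31 MCP tools registered
conversations_per_day = 50

tool_overhead = with_mcp_tokens - baseline_tokens   # cost of tool schemas
daily_overhead = tool_overhead * conversations_per_day

print(f"{tool_overhead} tokens per conversation, "
      f"{daily_overhead:,} tokens per day on schemas alone")
```

Run the comparison a few times: tool schemas are injected deterministically, so the overhead should be stable across conversations.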
Where the Tokens Go
Each tool schema includes the function name, description, parameter definitions, and type annotations. Verbose descriptions are the biggest offender. A single tool with a detailed description and five parameters can use 200 or more tokens.
The worst cases are tools with nested object parameters, long enums, or example values baked into the schema. These are helpful for the model but expensive in tokens.
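To see how much the verbosity costs, compare a verbose schema against a trimmed one. The schemas below are hypothetical, and the token count uses the common rough heuristic of about four characters per token rather than a real tokenizer:

```python
import json

def estimate_tokens(obj) -> int:
    # Rough heuristic: ~4 characters per token for JSON/English text.
    # For exact counts, use your model provider's tokenizer instead.
    return len(json.dumps(obj)) // 4

# Hypothetical verbose schema: long description, example values, and
# per-parameter explanations baked in.
verbose = {
    "name": "create_ticket",
    "description": (
        "Creates a new ticket in the issue tracker. Use this tool whenever "
        "the user asks to file, open, or report an issue. The ticket is "
        "assigned to the default queue unless an assignee is given. "
        "Example: create_ticket(title='Login fails', priority='high')."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "title": {
                "type": "string",
                "description": "Short summary, e.g. 'Login fails on Safari'",
            },
            "priority": {
                "type": "string",
                "enum": ["low", "medium", "high", "urgent"],
                "description": "Urgency level; defaults to 'medium'",
            },
        },
        "required": ["title"],
    },
}

# Same tool trimmed: terse description, self-explanatory parameter names.
trimmed = {
    "name": "create_ticket",
    "description": "Create a ticket in the issue tracker.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "priority": {"type": "string",
                         "enum": ["low", "medium", "high", "urgent"]},
        },
        "required": ["title"],
    },
}

print(estimate_tokens(verbose), "vs", estimate_tokens(trimmed))
```

Multiply a saving like that across 31 tools and the trimmed set frees a meaningful slice of the context window.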
Reducing the Overhead
Several strategies work:
- Trim descriptions - Keep tool descriptions under 50 words. The model usually gets it with less.
- Lazy loading - Only register tools when the conversation topic suggests they will be needed.
- Schema compression - Remove optional parameter descriptions and rely on parameter names being self-explanatory.
- Server consolidation - Merge related MCP servers to reduce duplicate type definitions.
- Dynamic tool sets - Use an orchestrator that selects relevant tools based on the user's first message.
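The last strategy can be sketched as a simple keyword matcher over the user's first message. Tool names and keyword sets below are hypothetical; a production orchestrator might use embeddings or a cheap classifier instead:

```python
# Sketch of dynamic tool selection: register only tools whose keywords
# match the user's first message. Tool names and keywords are hypothetical.
TOOL_KEYWORDS = {
    "search_files": {"find", "search", "file", "glob"},
    "run_tests": {"test", "pytest", "failing"},
    "query_database": {"sql", "database", "query", "table"},
}

def select_tools(first_message: str,
                 always_include=("search_files",)) -> list[str]:
    """Return the tool names worth registering for this conversation."""
    words = set(first_message.lower().split())
    selected = set(always_include)
    for tool, keywords in TOOL_KEYWORDS.items():
        if words & keywords:
            selected.add(tool)
    return sorted(selected)

print(select_tools("why is this test failing"))
```

The trade-off is a possible miss when the conversation drifts to an unregistered tool, so most setups keep a small always-on core and lazily add the rest.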
The Native Advantage
Desktop agents that use native APIs like the macOS Accessibility API do not pay this token tax. The agent accesses UI elements directly through system calls rather than loading tool schemas into the context window. Fewer tools, lower overhead, more room for actual work.
Fazm is an open source macOS AI agent, available on GitHub.