Jun 8, 2026 · 1 min read · Applications & Use Cases

Token overhead in MCP tool calls traced across 400 calls

u/LorenzoNardi logged token usage across 400 MCP tool calls on a document-processing server and found that tool definition schemas, retry sequences, and verbose outputs are the dominant cost drivers.

r/mcp·u/LorenzoNardi

Composite: 5.7 out of 10

Novelty · 25%
Impact · 43%
Credibility · 12%
Depth · 20%

Weights applied.

Why it matters

The analysis surfaces retry sequences and tool-definition schema bloat as significant but non-obvious token cost drivers in MCP deployments, with concrete measurements showing retries cost 2.8x a clean call and schema overhead can reach ~10k tokens before any real work begins.

Summary: our read of the original

u/LorenzoNardi began logging token counts per MCP call after costs on a document-processing server climbed faster than expected. Examining 400 calls across three tool types, the analysis broke overhead into four categories: tool definition schemas, input payload size, output verbosity, and error-handling retry paths.

The most surprising finding was tool definition overhead. A single well-documented tool with full parameter descriptions consumes ~800–1,200 tokens per call in system context before any user data is sent. With 10 tools loaded, that baseline reaches ~10k tokens before the first real action. Retry sequences were the second major surprise: ~18% of calls involved a retry, and those sequences cost 2.8x a clean first-attempt call on average because the agent pays for the original call, the error message, the reformulated call, and the retry itself.
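The ~10k figure follows from the per-tool range if schema cost is assumed to scale linearly with the number of loaded tools. The sketch below is our own arithmetic, not code from the original post; the function name `schema_baseline` is hypothetical and only the 800–1,200 range comes from the source.

```python
# Per-tool schema cost reported in the source (tokens in system context).
TOKENS_PER_TOOL_LOW = 800
TOKENS_PER_TOOL_HIGH = 1_200


def schema_baseline(num_tools: int) -> tuple[int, int]:
    """Return the (low, high) token range spent on tool definitions alone.

    Assumption: every loaded tool costs roughly the same, so the total
    scales linearly with the number of tools.
    """
    return num_tools * TOKENS_PER_TOOL_LOW, num_tools * TOKENS_PER_TOOL_HIGH


low, high = schema_baseline(10)
print(f"10 tools: {low:,}-{high:,} tokens before the first real action")
```

With 10 tools the range is 8,000 to 12,000 tokens, which brackets the ~10k baseline quoted above.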

Output verbosity was identified as the easiest lever to pull: returning only the requested fields instead of raw JSON dumps reduced output tokens by ~60% on average, with no degradation in agent performance. Recommended mitigations include trimming tool descriptions to the minimum needed for correct tool selection, adding a pre-return validation layer to eliminate malformed outputs (and thus most retries), and grouping read and write tools into separate servers to keep the selection context smaller per task type.
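As a rough illustration of the field-trimming idea, the sketch below keeps only the fields a caller asked for and checks that they exist before anything is returned. It is a minimal stand-in, not the author's implementation: `project_fields`, the record layout, and the field names are all hypothetical, and character counts are used only as a crude proxy for tokens.

```python
import json
from typing import Any


def project_fields(record: dict[str, Any], requested: list[str]) -> dict[str, Any]:
    """Keep only the requested top-level fields.

    The missing-field check runs before returning, as a very small
    example of validating output before the agent ever sees it.
    """
    missing = [name for name in requested if name not in record]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    return {name: record[name] for name in requested}


# Hypothetical raw result from a document-processing tool.
raw = {
    "doc_id": "doc-123",
    "title": "Quarterly report",
    "page_count": 42,
    "ocr_text": "lorem ipsum " * 200,
    "layout_blocks": [{"x": 0, "y": 0, "w": 100, "h": 20}] * 50,
}

slim = project_fields(raw, ["doc_id", "title", "page_count"])

# Character counts only approximate token counts, but the gap is visible.
print(len(json.dumps(raw)), "->", len(json.dumps(slim)), "characters")
```

How much a real server saves depends on how bulky its raw payloads are; the ~60% figure above is the source's measurement, not something this toy example reproduces.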

Key facts

01 Analyzed 400 MCP tool calls across 3 tool types on a document-processing server
02 Tool definition schemas cost ~800–1,200 tokens per call in system context before any data is passed
03 10 tools loaded simultaneously can consume ~10k tokens before the first real action
04 ~18% of calls were retry sequences, costing 2.8x a clean first-attempt call on average
05 Returning only requested fields vs. full JSON dumps cut output tokens by ~60% on average
06 Retry overhead is easy to miss when only looking at average cost per call
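To see why retries hide inside an average, the two reported figures can be combined into a blended cost per call. This is our own back-of-the-envelope sketch, assuming every clean call costs one unit and every retry sequence costs exactly the reported 2.8x; the function names are hypothetical.

```python
RETRY_RATE = 0.18        # ~18% of calls involved a retry (reported)
RETRY_MULTIPLIER = 2.8   # a retry sequence costs 2.8x a clean call (reported)


def blended_cost(retry_rate: float, retry_multiplier: float) -> float:
    """Average cost per call, in units of one clean first-attempt call."""
    return (1 - retry_rate) * 1.0 + retry_rate * retry_multiplier


def retry_share(retry_rate: float, retry_multiplier: float) -> float:
    """Fraction of total spend that goes to retry sequences."""
    retry_spend = retry_rate * retry_multiplier
    return retry_spend / blended_cost(retry_rate, retry_multiplier)


avg = blended_cost(RETRY_RATE, RETRY_MULTIPLIER)
share = retry_share(RETRY_RATE, RETRY_MULTIPLIER)
print(f"average cost per call: {avg:.2f}x a clean call")
print(f"share of spend from retries: {share:.0%}")
```

Under these assumptions the average call looks only about 1.32x as expensive as a clean one, yet roughly 38% of total spend comes from the 18% of calls that retried.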

Topics

#mcp #tool-use #cost-optimization #agent-framework #developer-tools

Methodology

Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 9, 2026 · 17:05 UTC.
