MCP schema design cuts token cost from ~1,840 to ~480 per call
u/LorenzoNardi tested five MCP tool schema designs across 50 calls each and found token usage ranged from ~1,840 tokens/call (verbose) down to ~480 tokens/call (minimal + output pruning), with accuracy holding steady until the field-names-only version.
The experiment provides concrete token-count measurements suggesting that schema design and output pruning, rather than model choice, are the dominant levers for reducing MCP call costs. Unpruned tool output alone accounted for 35–40% of total token cost.
u/LorenzoNardi followed up a prior experiment, which tracked token overhead across 400 MCP calls, by isolating schema design as a variable. Using the same tool, the same model, and 50 calls per variant, the post tested five schema versions:
- Version 1 (verbose: full descriptions, examples, nested objects): ~1,840 tokens/call
- Version 2 (descriptions only, no examples): ~1,210 tokens/call
- Version 3 (minimal descriptions, flat structure): ~890 tokens/call
- Version 4 (field names only, no descriptions): ~620 tokens/call
- Version 5 (minimal descriptions plus output pruning to only the needed fields): ~480 tokens/call
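To make the difference between the verbose and minimal variants concrete, here is a rough sketch of what a Version 1 style schema and a Version 3 style schema might look like for the same tool. The original post does not publish its schemas, so the tool (an order search), the field names, and the descriptions are all hypothetical. Serialized character count is used only as a crude stand-in for token count; real numbers depend on the tokenizer and on how the client injects the schema into the prompt.

```python
import json

# Hypothetical "verbose" schema: long descriptions, examples, nested object.
VERBOSE_SCHEMA = {
    "type": "object",
    "properties": {
        "filter": {
            "type": "object",
            "description": "Criteria used to narrow down which orders are returned by the search.",
            "properties": {
                "customer_id": {
                    "type": "string",
                    "description": "The unique identifier of the customer whose orders should be returned.",
                    "examples": ["cust_12345"],
                },
                "status": {
                    "type": "string",
                    "enum": ["open", "shipped", "cancelled"],
                    "description": "The fulfilment status that returned orders must currently have.",
                    "examples": ["open"],
                },
            },
            "required": ["customer_id"],
        },
    },
    "required": ["filter"],
}

# Hypothetical "minimal" schema: same fields, flat structure, short descriptions.
MINIMAL_SCHEMA = {
    "type": "object",
    "properties": {
        "customer_id": {"type": "string", "description": "Customer ID"},
        "status": {
            "type": "string",
            "enum": ["open", "shipped", "cancelled"],
            "description": "Order status",
        },
    },
    "required": ["customer_id"],
}


def schema_chars(schema: dict) -> int:
    """Compact serialized size in characters (a rough proxy, not a token count)."""
    return len(json.dumps(schema, separators=(",", ":")))


if __name__ == "__main__":
    print("verbose:", schema_chars(VERBOSE_SCHEMA), "chars")
    print("minimal:", schema_chars(MINIMAL_SCHEMA), "chars")
```

Both schemas accept the same information; the minimal one relies on self-explanatory field names to carry most of the meaning.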
A key finding was that task accuracy held steady through Version 3 and only degraded at Version 4, suggesting that minimal descriptions are sufficient provided field names are self-explanatory. The most impactful optimization, however, was output pruning: tools were returning full objects when agents only needed 3–4 fields, and that excess output alone accounted for 35–40% of total token cost. The post concludes that schema optimization — not model optimization — is the highest-leverage starting point for reducing MCP token overhead.
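Output pruning as described here can be sketched as a small filter applied to a tool's result before it is returned to the agent. This is an illustration under assumptions, not the author's implementation: the record, its field names, and the `prune_output` helper are hypothetical, and the sketch only handles top-level fields.

```python
from typing import Any, Dict, Iterable

# Hypothetical full object a tool might return.
FULL_ORDER: Dict[str, Any] = {
    "id": "ord_1",
    "status": "shipped",
    "total": 42.5,
    "currency": "EUR",
    "created_at": "2026-01-01T00:00:00Z",
    "updated_at": "2026-01-02T00:00:00Z",
    "billing_address": {"city": "Rome", "zip": "00100"},
    "shipping_address": {"city": "Rome", "zip": "00100"},
    "internal_notes": "n/a",
}

# Hypothetical set of fields the agent actually needs.
NEEDED_FIELDS = ("id", "status", "total")


def prune_output(record: Dict[str, Any], keep: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of record containing only the top-level fields in keep.

    Fields listed in keep but absent from record are silently skipped.
    """
    wanted = set(keep)
    return {key: value for key, value in record.items() if key in wanted}


if __name__ == "__main__":
    print(prune_output(FULL_ORDER, NEEDED_FIELDS))
```

Keeping the allow-list next to the tool definition makes it explicit which fields the agent depends on, so adding a field to the underlying object does not quietly increase the cost of every call.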
Key facts
- Five schema versions of the same MCP tool were tested over 50 calls each using the same model.
- Token counts ranged from ~1,840/call (verbose with examples and nested objects) to ~480/call (minimal + output pruning).
- Task accuracy did not degrade until Version 4 (field names only, no descriptions).
- Minimal descriptions are sufficient as long as field names are self-explanatory.
- Unpruned output (tools returning full objects instead of only the needed fields) accounted for 35–40% of total token cost.
- The post recommends starting optimization with schema design rather than model selection.
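For a sense of scale, the relative savings implied by the reported per-call figures can be worked out directly. The token counts below are the approximate values quoted in the post; the version labels and the helper function are only illustrative.

```python
# Approximate tokens per call as reported in the post.
REPORTED_TOKENS_PER_CALL = {
    "v1_verbose": 1840,
    "v2_descriptions_only": 1210,
    "v3_minimal_flat": 890,
    "v4_field_names_only": 620,
    "v5_minimal_plus_pruning": 480,
}


def savings_vs_baseline(tokens: dict, baseline: str) -> dict:
    """Percentage reduction of each variant relative to the baseline variant."""
    base = tokens[baseline]
    return {name: round(100 * (1 - count / base), 1) for name, count in tokens.items()}


if __name__ == "__main__":
    for name, pct in savings_vs_baseline(REPORTED_TOKENS_PER_CALL, "v1_verbose").items():
        print(f"{name}: {pct}% fewer tokens than v1")
```

On these figures, Version 3 (the last variant before accuracy degraded) uses roughly 52% fewer tokens than Version 1, and Version 5 roughly 74% fewer.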
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 9, 2026 · 17:05 UTC.