GCF proxy cuts MCP tool response tokens by 79% with session dedup
`gcf-proxy` is a drop-in MCP proxy that encodes JSON tool responses into GCF, a compact wire format, reducing input tokens by 79% and improving LLM comprehension accuracy from 53.6% to 90.7% across 10 models.
Score breakdown
The proxy cuts token cost and raises accuracy over plain JSON at the same time, with no changes to existing MCP servers. It replaces a format that causes LLM comprehension failures at scale (JSON, 53.6% accuracy) with one that scores 90.7% on the same data.
u/blackwell-systems posted about `gcf-proxy`, a drop-in proxy for the Model Context Protocol that encodes JSON tool responses into GCF (Generic Compact Format), a text-based wire format designed around how LLMs actually process structured data. Rather than repeating field names on every record — a pattern that overwhelms attention in large payloads — GCF declares field names once in a header and uses positional values for rows. The proxy requires zero code changes to existing MCP servers: servers continue outputting JSON, the proxy encodes it to GCF on the way to the LLM, and decodes GCF tool call arguments back to JSON on the way out. The format is lossless, with `decode(encode(value)) == value` verified across 200M+ round-trips.
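The header-plus-positional-rows idea can be sketched in a few lines of Python. This is an illustration only: the post does not show GCF's actual syntax, so the line layout below (a JSON array of field names followed by one JSON array per row) and the function names `encode_table` and `decode_table` are assumptions made for this example, not the `gcf-proxy` API. It shows field names being declared once and the round-trip property `decode(encode(value)) == value` holding for a small input.

```python
import json


def encode_table(records):
    """Encode flat dicts that share the same keys as one header line plus value-only rows.

    Hypothetical layout for illustration; this is not the real GCF wire format.
    """
    if not records:
        return ""
    fields = list(records[0].keys())
    for rec in records:
        if list(rec.keys()) != fields:
            raise ValueError("all records must share the same fields in the same order")
    # The header declares the field names once.
    lines = [json.dumps(fields)]
    # Each row carries values only, matched to the header by position.
    for rec in records:
        lines.append(json.dumps([rec[f] for f in fields]))
    return "\n".join(lines)


def decode_table(text):
    """Rebuild the original list of dicts from the header and positional rows."""
    if not text:
        return []
    header, *rows = text.split("\n")
    fields = json.loads(header)
    return [dict(zip(fields, json.loads(row))) for row in rows]


records = [
    {"name": "parseConfig", "kind": "function", "line": 12},
    {"name": "Config", "kind": "class", "line": 40},
]
encoded = encode_table(records)
print(encoded)
print(decode_table(encoded) == records)
```

In plain JSON every record repeats `"name"`, `"kind"`, and `"line"`; in the sketch each name appears once regardless of how many rows follow, which is where the savings on large payloads would come from.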
Benchmarks across 1,300+ evaluations on 10 models from Anthropic, OpenAI, and Google show GCF outperforming both JSON and TOON on every model tested. At 500 symbols, JSON scored 53.6% comprehension accuracy at 53,341 tokens; GCF scored 90.7% at 11,090 tokens. Four models hit 100% accuracy with GCF. When errors do occur, GCF's median error magnitude is 4 (off by 1–2 on precision), versus 53 for TOON and 56 for JSON. A real `agent-lsp` session exploring a TypeScript codebase across six tool calls reduced total bytes from 280,191 (JSON) to 24,199 (GCF + session dedup) — a 91% reduction — driven by natural symbol overlap as the agent re-examined related files.
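The two headline percentages follow directly from the figures quoted above, and a short check makes the arithmetic explicit. The helper name `percent_reduction` is introduced here for illustration. Note that the first figure is measured in tokens and the second in bytes, so the two are not directly comparable.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


# 500-symbol benchmark: JSON tokens vs. GCF tokens.
token_saving = percent_reduction(53_341, 11_090)
# agent-lsp session: JSON bytes vs. GCF + session dedup bytes.
byte_saving = percent_reduction(280_191, 24_199)

print(f"{token_saving:.1f}% fewer tokens")  # 79.2% fewer tokens
print(f"{byte_saving:.1f}% fewer bytes")    # 91.4% fewer bytes
```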
`v0.10.0` ships four opt-in features: `--session` for session deduplication (previously transmitted symbols become bare references), `--delta` for delta encoding (68% savings on 20-symbol payloads with 10% change), `--cache` for response caching of identical calls, and HTTP support via `--upstream` and `--http` flags for connecting to or deploying as a remote MCP service. Six language implementations (Go, TypeScript, Python, Rust, Swift, Kotlin) are available at v1.0.0+, and a whitepaper has been published with DOI `10.5281/zenodo.20579817`.
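A rough sketch of what session deduplication and delta encoding could look like conceptually is given below. None of this is taken from `gcf-proxy`: the class names `SessionDedup` and `SessionResolver`, the functions `delta` and `apply_delta`, and the `{"ref": ...}` / `{"id": ..., "value": ...}` shapes are all hypothetical, chosen only to show the two ideas. The first replaces a symbol already sent in the session with a bare reference; the second sends only what changed since the previous response.

```python
import json


class SessionDedup:
    """Sender side: symbols already sent in this session become bare references."""

    def __init__(self):
        self._ids = {}  # canonical JSON of a symbol -> reference id

    def encode(self, symbols):
        out = []
        for sym in symbols:
            key = json.dumps(sym, sort_keys=True)
            if key in self._ids:
                out.append({"ref": self._ids[key]})  # seen before: reference only
            else:
                self._ids[key] = len(self._ids)
                out.append({"id": self._ids[key], "value": sym})
        return out


class SessionResolver:
    """Receiver side: expands references using values seen earlier in the session."""

    def __init__(self):
        self._table = {}

    def decode(self, items):
        out = []
        for item in items:
            if "ref" in item:
                out.append(self._table[item["ref"]])
            else:
                self._table[item["id"]] = item["value"]
                out.append(item["value"])
        return out


def delta(previous, current):
    """Describe `current` (name -> record) as changes relative to `previous`."""
    changed = {k: v for k, v in current.items()
               if k not in previous or previous[k] != v}
    removed = [k for k in previous if k not in current]
    return {"changed": changed, "removed": removed}


def apply_delta(previous, d):
    """Rebuild the current mapping from the previous one plus a delta."""
    result = {k: v for k, v in previous.items() if k not in d["removed"]}
    result.update(d["changed"])
    return result


sender, receiver = SessionDedup(), SessionResolver()
call_1 = [{"name": "Config", "kind": "class"}, {"name": "load", "kind": "function"}]
call_2 = [{"name": "Config", "kind": "class"}, {"name": "save", "kind": "function"}]
wire_1 = sender.encode(call_1)
wire_2 = sender.encode(call_2)  # "Config" is now sent as a reference

previous = {f"sym{i}": {"line": i} for i in range(20)}
current = dict(previous)
current["sym3"] = {"line": 99}
current["sym7"] = {"line": 100}
d = delta(previous, current)  # 2 of 20 symbols changed, so 2 entries are sent
```

The sketch also suggests why the `agent-lsp` session benefits so much from dedup: when an agent revisits related files, most symbols in a later response have already been transmitted, so only references need to be sent.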
Key facts
- 01 GCF encodes JSON with field names declared once in a header and positional row values, eliminating repeated field names that overwhelm LLM attention.
- 02 At 500 symbols: JSON = 53,341 tokens at 53.6% accuracy; GCF = 11,090 tokens at 90.7% accuracy (79% fewer tokens).
- 03 Benchmarked across 1,300+ evaluations on 10 models from Anthropic, OpenAI, and Google; GCF outperforms JSON and TOON on every model.
- 04 GCF median error magnitude is 4; TOON is 53; JSON is 56. GCF fails on precision (off by 1–2), not comprehension.
- 05 A real `agent-lsp` session on a TypeScript codebase reduced 280,191 JSON bytes to 24,199 GCF+dedup bytes, a 91% reduction across 6 tool calls.
- 06 `v0.10.0` adds session dedup (`--session`), delta encoding (`--delta`, 68% savings on 20-symbol payloads with 10% change), response caching (`--cache`), and HTTP frontend/backend support.
- 07 Six language implementations (Go, TypeScript, Python, Rust, Swift, Kotlin) are available at v1.0.0+; a whitepaper is published at DOI `10.5281/zenodo.20579817`.
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 12, 2026 · 10:05 UTC.