Opus 4.7's new tokenizer prompts a token waste audit
Pawel Jozefiak shares how Claude Opus 4.7's new tokenizer — which can use up to 35% more tokens on identical text — pushed him to rigorously audit token waste across 133,087 agent turns.
Score breakdown
Teams running Claude agents at scale should audit token usage now — Opus 4.7's new tokenizer can silently inflate costs by up to 35% on unchanged prompts, and infrastructure failures (not model reasoning errors) may be the largest source of waste.
- 01Claude Opus 4.7 was shipped by Anthropic on April 16, 2026, at the same per-token price as 4.6.
- 02Opus 4.7 introduces a new tokenizer that official docs say may use up to 35% more tokens for the same fixed text.
- 03Pawel Jozefiak analyzed 133,087 agent turns across 9,667 sessions to measure token waste.
Pawel Jozefiak writes that Anthropic's Claude Opus 4.7, released April 16, 2026, carries the same per-token price as its predecessor but ships with a new tokenizer. The official documentation notes that this tokenizer "may use up to 35% more tokens for the same fixed text" — a quiet change that translates to a significant effective cost increase for anyone running Claude agents at scale. Jozefiak, who runs Claude agents continuously across his stack, treated this as the trigger to stop estimating and start measuring.
His methodology involved building a token waste sorter over 9,667 sessions and 133,087 turns, using model-vs-model comparisons for judgment calls.
His methodology involved building a token waste sorter over 9,667 sessions and 133,087 turns, using model-vs-model comparisons for judgment calls. He draws a deliberate distinction between two categories most guides conflate: waste (turns that produced no useful output) and inefficient usage (turns that worked but consumed far more tokens than necessary). The dominant waste cluster turned out to be Browser/Playwright infrastructure failures — not hallucinations, bad reasoning, or runaway agent loops — spread thinly across hundreds of low-cost sessions but outweighing all other categories in aggregate.
Three fixes required no additional spend: shrinking the `CLAUDE.md` context file, enforcing tight `max_tokens` limits, and auditing WebFetch failures. Jozefiak shares before-and-after results for each, noting which intervention moved the needle most. The broader takeaway he emphasizes is that budget assumptions made without measurement are likely to be wrong about where the real losses occur.
Key facts
- 01Claude Opus 4.7 was shipped by Anthropic on April 16, 2026, at the same per-token price as 4.6.
- 02Opus 4.7 introduces a new tokenizer that official docs say may use up to 35% more tokens for the same fixed text.
- 03Pawel Jozefiak analyzed 133,087 agent turns across 9,667 sessions to measure token waste.
- 04He separates 'waste' (turns producing nothing useful) from 'inefficient usage' (turns that worked but cost more than necessary).
- 05The top waste cluster was Browser/Playwright infrastructure failures, not hallucinations or runaway loops.
- 06Three zero-cost fixes identified: shrinking CLAUDE.md, setting tight max_tokens, and auditing WebFetch failures.