Claude Code and Codex write local token logs you can read today
Claude Code and Codex both write detailed per-session token logs to local disk — including cache hit/miss breakdowns — and a few lines of Node.js are enough to aggregate them into actionable usage metrics.
Both Claude Code and Codex have been writing granular token and cache-usage data to local disk all along, meaning developers can diagnose and fix prompt cache inefficiencies — the primary driver of hitting subscription limits — without any API call or provider dashboard.
Rob's post on Dev.to explains that both Claude Code and Codex silently write detailed token usage logs to local disk after every session, and that most developers have never looked at them. Claude Code stores a JSONL transcript per session under `~/.claude/projects/`, where each assistant message includes a `usage` block with four fields: `input_tokens` (uncached input), `cache_read_input_tokens` (context served from the prompt cache), `cache_creation_input_tokens` (context written to cache), and `output_tokens`. Codex takes a different approach, writing `token_count` events with a cumulative running total under `~/.codex/sessions/` — meaning you must take the delta between events rather than summing them directly.
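The two log shapes described above can be sketched in a few small functions. This is a minimal illustration, not the post's own snippet: it assumes the `usage` block lives at `message.usage` on entries whose `type` is `"assistant"`, and that the Codex running totals have already been extracted into a plain array of numbers. The function names (`parseJsonl`, `claudeUsage`, `cumulativeToDeltas`) are hypothetical.

```javascript
// Sketch only. Assumed layout (not confirmed by the post): each JSONL line is
// an object, and assistant entries carry their token counts at message.usage.

// Parse JSONL text into objects, skipping blank or malformed lines.
function parseJsonl(text) {
  const rows = [];
  for (const line of text.split('\n')) {
    if (!line.trim()) continue;
    try {
      rows.push(JSON.parse(line));
    } catch {
      // A file that is still being written can end in a partial line; skip it.
    }
  }
  return rows;
}

// The four usage fields named in the post.
const USAGE_FIELDS = [
  'input_tokens',
  'cache_read_input_tokens',
  'cache_creation_input_tokens',
  'output_tokens',
];

// Return one usage record per assistant message; missing fields count as 0.
function claudeUsage(rows) {
  return rows
    .filter((r) => r.type === 'assistant' && r.message && r.message.usage)
    .map((r) => {
      const record = { uuid: r.uuid };
      for (const field of USAGE_FIELDS) {
        record[field] = Number(r.message.usage[field]) || 0;
      }
      return record;
    });
}

// Codex-style totals are cumulative, so per-event usage is the difference
// from the previous total. Summing the raw totals would overcount.
function cumulativeToDeltas(totals) {
  let previous = 0;
  return totals.map((total) => {
    // Assumption: a total lower than the previous one means the counter was
    // reset, so the new total is treated as a fresh delta.
    const delta = total >= previous ? total - previous : total;
    previous = total;
    return delta;
  });
}
```

To try it on a real transcript, pass the file contents (for example from `fs.readFileSync(path, 'utf8')`) to `parseJsonl` and the result to `claudeUsage`. Only the numeric fields are copied out, so prompt and response text are never retained.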
The post walks through a short Node.js snippet that reads these JSONL files, deduplicates Claude messages by `uuid`, converts Codex's cumulative totals into per-event deltas within each session, and aggregates token counts per model per day, all without reading prompt or response text. The central metric the post emphasizes is the prompt cache hit rate: `cache_read / (cache_read + cache_creation + uncached_input)`. A low hit rate indicates the agent is re-sending the same context on every turn rather than reusing cached prefixes, and the fix is structural: stabilize the front of the prompt, keep tool definitions lean, and avoid reshuffling system context between turns.
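The aggregation and the hit-rate formula can be sketched as follows. This is an illustrative version under stated assumptions, not the post's code: the record shape (`uuid`, `model`, an ISO 8601 `timestamp`, plus the four usage fields) and the names `aggregateByModelAndDay` and `cacheHitRate` are hypothetical, and days are bucketed in UTC.

```javascript
// Sketch only. Assumed record shape (hypothetical):
// { uuid, model, timestamp, input_tokens, cache_read_input_tokens,
//   cache_creation_input_tokens, output_tokens }

// Deduplicate by uuid, then total token counts per (UTC day, model).
function aggregateByModelAndDay(records) {
  const seen = new Set();
  const totals = new Map(); // key: "YYYY-MM-DD|model"
  for (const r of records) {
    // Count each uuid once; records without a uuid are always counted.
    if (r.uuid && seen.has(r.uuid)) continue;
    if (r.uuid) seen.add(r.uuid);

    // Assumption: timestamp is an ISO 8601 string that Date can parse.
    const day = new Date(r.timestamp).toISOString().slice(0, 10);
    const key = `${day}|${r.model}`;
    const t = totals.get(key) || {
      day,
      model: r.model,
      input: 0,
      cacheRead: 0,
      cacheCreation: 0,
      output: 0,
    };
    t.input += r.input_tokens || 0;
    t.cacheRead += r.cache_read_input_tokens || 0;
    t.cacheCreation += r.cache_creation_input_tokens || 0;
    t.output += r.output_tokens || 0;
    totals.set(key, t);
  }
  return [...totals.values()];
}

// cache_read / (cache_read + cache_creation + uncached_input)
function cacheHitRate(t) {
  const denominator = t.cacheRead + t.cacheCreation + t.input;
  return denominator === 0 ? 0 : t.cacheRead / denominator;
}
```

Output tokens are totaled but left out of the hit rate, matching the formula above, which only considers input-side tokens. Returning 0 for an empty bucket is a choice made here to avoid dividing by zero.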
Rob notes an important caveat for flat-plan subscribers: any dollar figure derived from these logs is an API list-price equivalent, not an actual cost, making token volume and cache hit rate the genuinely meaningful signals. He also introduces ModelMeter (`modelmeter.dev`), a tool installable via `npx modelmeter-collect` that reads the local logs, transmits only token counts to a dashboard, and labels every figure by how it was derived. It supports Claude Code, Codex, metered API keys, and CSV exports, and can be kept live via a Claude Code Stop hook or a cron job.
Key facts
- 01 Claude Code writes per-session JSONL transcripts to `~/.claude/projects/` with a `usage` block on every assistant message.
- 02 The `usage` block contains four fields: `input_tokens`, `cache_read_input_tokens`, `cache_creation_input_tokens`, and `output_tokens`.
- 03 Codex writes cumulative `token_count` events to `~/.codex/sessions/`; usage must be computed as deltas between events, not summed directly.
- 04 The key efficiency metric is prompt cache hit rate: `cache_read / (cache_read + cache_creation + uncached_input)`.
- 05 A low cache hit rate means the agent is re-sending the same context on every turn, burning through usage limits.
- 06 For flat-plan subscribers, any dollar figure from these logs is an API list-price equivalent, not actual cost.
- 07 Rob introduces ModelMeter (`npx modelmeter-collect`), which reads local logs and sends only token counts to a dashboard at modelmeter.dev.
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 18, 2026 · 10:40 UTC.