Fable 5's $50/M output pricing forces cost-aware routing into agent architecture
A Reddit post by u/StudentSweet3601 argues that Anthropic's new Fable 5 model — priced at $10/M input and $50/M output, double Opus 4.8 — makes cost-aware model routing a mandatory architectural concern for agent builders due to token fan-out in multi-step agentic workflows.
Fable 5's combination of frontier pricing and agentic fan-out means per-step model routing, token budgets, and cost-per-task observability shift from optional optimizations to required components of any production agent orchestration layer.
u/StudentSweet3601 on r/AI_Agents contends that Anthropic's Fable 5, a new Mythos-class model positioned above Opus and priced at $10/M input and $50/M output, makes the "default to the best model" approach untenable for agent builders. The critical issue is not the headline price multiple over Opus 4.8 but token fan-out: Anthropic is explicitly marketing Fable 5 for multi-day autonomous sessions with sub-agent delegation, and a single complex request can expand into tens of millions of tokens across planning passes, sub-agent spawns, tool call loops, retries, and self-verification steps. At $50/M output, that fan-out can produce a four-figure bill for what the end user experiences as a single question. The post also notes that Fable 5 thinks longer and writes more per turn than Opus 4.8, so the effective cost per task is well above the 2x the price sheet implies. The author reports burning roughly 2% of their Max 20x usage window per minute during heavy sessions, a rate Opus 4.8 never approached on the same workloads.

The post prescribes a tiered routing architecture in response: a cheap model (Haiku/Sonnet class) for classification, extraction, and glue work; a mid-tier model for standard reasoning; and Fable 5 reserved only for steps that genuinely require frontier capability. Alongside routing, the author highlights prompt caching (citing a 90% input discount), per-task token budgets and cost ceilings as first-class orchestration concerns, and cost-per-task observability rather than cost-per-call, because fan-out is where budgets collapse. As a cautionary data point, the post cites a report that Uber burned through their annual AI budget in four months, before this pricing tier even existed. The post closes by asking how the community is handling routing today: static rules, an LLM judge, or absorbing the cost.
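The tiered routing, per-task cost ceiling, and cost-per-task tracking described in the post could be sketched roughly as follows. This is a minimal illustration using static rules, not the author's implementation: the tier names, the step kinds, the cheap and mid prices, and every identifier (`PRICES`, `ROUTES`, `TaskLedger`, `route`) are hypothetical. Only the $10/M and $50/M frontier prices come from the post.

```python
from dataclasses import dataclass, field

# Hypothetical tiers. Only the "frontier" prices come from the post; the
# cheap and mid prices are placeholders chosen for illustration.
PRICES = {  # tier -> (USD per M input tokens, USD per M output tokens)
    "cheap": (1.0, 5.0),
    "mid": (5.0, 25.0),
    "frontier": (10.0, 50.0),
}

# Static routing rules: step kind -> tier. Unknown kinds fall back to "mid".
ROUTES = {
    "classify": "cheap",
    "extract": "cheap",
    "glue": "cheap",
    "reason": "mid",
    "plan": "frontier",
}
DOWNGRADE = {"frontier": "mid", "mid": "cheap"}


class BudgetExceeded(RuntimeError):
    """Raised when even the cheapest tier would break the task's ceiling."""


def estimate(tier, input_tokens, output_tokens):
    """USD cost of one call on the given tier."""
    in_price, out_price = PRICES[tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


@dataclass
class TaskLedger:
    """Accumulates spend per task (not per call), across every step."""
    task_id: str
    ceiling_usd: float
    spent_usd: float = 0.0
    calls: list = field(default_factory=list)

    def record(self, step, tier, input_tokens, output_tokens):
        cost = estimate(tier, input_tokens, output_tokens)
        self.spent_usd += cost
        self.calls.append((step, tier, cost))
        return cost


def route(ledger, step_kind, input_tokens, max_output_tokens):
    """Pick a tier for one step, downgrading until the worst case fits."""
    tier = ROUTES.get(step_kind, "mid")
    while True:
        worst_case = estimate(tier, input_tokens, max_output_tokens)
        if ledger.spent_usd + worst_case <= ledger.ceiling_usd:
            return tier
        if tier not in DOWNGRADE:
            raise BudgetExceeded(
                f"{ledger.task_id}: ceiling ${ledger.ceiling_usd:.2f} reached"
            )
        tier = DOWNGRADE[tier]


ledger = TaskLedger("task-1", ceiling_usd=2.00)
first = route(ledger, "plan", 50_000, 20_000)
ledger.record("plan", first, 50_000, 18_000)
second = route(ledger, "plan", 50_000, 20_000)
print(first, second, f"${ledger.spent_usd:.2f}")
```

Whether an over-budget step should be downgraded, as here, or halted for human review is a policy choice; the sketch downgrades only to keep the example short. An LLM judge, one of the alternatives the post asks about, would replace the `ROUTES` lookup while leaving the ledger unchanged.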
Key facts
- 01 Fable 5 is described as a new Mythos-class model from Anthropic, positioned above Opus
- 02 Pricing is $10/M input and $50/M output, exactly double Opus 4.8
- 03 Anthropic is explicitly marketing Fable 5 for multi-day autonomous sessions with sub-agent delegation
- 04 A single complex agentic request can fan out into tens of millions of tokens, potentially producing a four-figure bill
- 05 The author reports burning roughly 2% of their Max 20x usage window per minute during heavy Fable 5 sessions
- 06 Prompt caching is cited as offering a 90% input discount
- 07 Uber reportedly burned through their annual AI budget in four months, before this pricing tier existed
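The pricing, fan-out, and caching figures above can be combined into a rough per-task estimate. The sketch below is illustrative only: the token counts, the 70% cached share, and the function name `task_cost` are assumptions, and any cache-write surcharge is ignored. Only the $10/M and $50/M prices and the 90% cached-input discount come from the post.

```python
INPUT_PRICE_PER_M = 10.0    # USD per million input tokens (from the post)
OUTPUT_PRICE_PER_M = 50.0   # USD per million output tokens (from the post)
CACHE_DISCOUNT = 0.90       # cached-input discount cited in the post


def task_cost(input_tokens, output_tokens, cached_fraction=0.0):
    """Estimate the USD cost of one task, summed over all of its model calls.

    cached_fraction is the share of input tokens billed at the discounted
    cached rate. It is an assumed parameter, not a figure from the post.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    billable_input = uncached + cached * (1.0 - CACHE_DISCOUNT)
    input_cost = billable_input * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


# Hypothetical fan-out: 40M input and 15M output tokens across all steps.
no_cache = task_cost(40_000_000, 15_000_000)
with_cache = task_cost(40_000_000, 15_000_000, cached_fraction=0.7)
print(f"no caching:   ${no_cache:,.2f}")
print(f"with caching: ${with_cache:,.2f}")
```

Under these assumed numbers the task costs about $1,150 without caching and about $898 with it. Caching only reduces the input side, so the $750 of output spend is untouched, which is consistent with the post's emphasis on routing and budgets rather than caching alone.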
Summary and scoring are generated automatically from the original article. Last processed Jun 11, 2026 · 08:34 UTC.