Mental framework for reliable agentic workflows with Claude
Basti Ortiz introduces the "Principle of Least Context" — treating the main Claude session as a pure orchestrator and delegating all heavy work to isolated sub-agents — to prevent context rot and enable reliable long-running agentic workflows.
Apply the Principle of Least Context now — by routing all tool calls and file reads through isolated sub-agents and keeping the main orchestrator lean — to prevent context rot from silently degrading Claude Code's output quality on long-running tasks.
Originally delivered as a talk at the inaugural Claude Code Manila meetup on March 5, 2026, Basti Ortiz's article argues that most developers misuse agentic AI tools by dumping everything into a single long-running conversation. The core problem is context rot: anecdotally, LLMs like Claude experience significant degradation in reasoning and recall beyond 40% of their context window. Claude Code addresses this by auto-compacting conversations at 85% capacity, but compacted summaries are lossy — and if workflows aren't carefully managed, those summaries accumulate until the preamble itself exceeds the 40% threshold, triggering rot all over again.
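The two thresholds above can be read as three zones of context usage. The following is a minimal sketch, not anything Claude Code exposes: the function name `context_zone`, the zone labels, and the 200k-token window used in the example calls are assumptions made for illustration, and the 40% figure is the anecdotal one from the talk.

```python
ROT_THRESHOLD_PCT = 40      # anecdotal "dumb zone" boundary from the talk
COMPACT_THRESHOLD_PCT = 85  # point at which Claude Code is said to auto-compact


def context_zone(used_tokens: int, window_tokens: int) -> str:
    """Classify context usage into a zone (hypothetical helper, integer math only)."""
    if used_tokens * 100 >= COMPACT_THRESHOLD_PCT * window_tokens:
        return "auto-compact"
    if used_tokens * 100 > ROT_THRESHOLD_PCT * window_tokens:
        return "rot-risk"
    return "ok"


WINDOW = 200_000  # assumed window size for the example calls below

for used in (70_000, 100_000, 170_000):
    print(used, context_zone(used, WINDOW))
```

The point of the sketch is that a compacted summary still counts as used tokens: if the carried-over preamble alone lands in the "rot-risk" zone, compaction has not bought any headroom.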
To illustrate the scale of the problem, Ortiz notes that a single parallelized Explore sub-agent can consume ~60k tokens. A medium-sized feature in a large codebase typically spawns 3 parallel Explore sub-agents, each finishing at ~60k tokens. Doing that same exploration naively, one step after another in a single session, would burn ~180k tokens (3 × 60k) of a 200k budget before any actual planning or implementation work begins. Claude's own trained "context awareness" compounds this by making the model progressively lazier as its window fills.
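The arithmetic behind these figures is easy to check. The sketch below compares exploring inline in the main session against delegating and receiving only a summary back. The 60k, 3, and 200k figures come from the text; `SUMMARY_COST` is a purely hypothetical value chosen for illustration, as are the function and variable names.

```python
CONTEXT_WINDOW = 200_000  # token budget cited in the text
EXPLORE_COST = 60_000     # approximate tokens consumed by one Explore sub-agent
NUM_EXPLORERS = 3         # typical count for a medium-sized feature
SUMMARY_COST = 2_000      # ASSUMPTION: hypothetical size of one returned summary


def main_context_usage(per_agent_cost: int, agents: int = NUM_EXPLORERS) -> tuple[int, float]:
    """Return (tokens landing in the main session, share of the window used)."""
    used = per_agent_cost * agents
    return used, used / CONTEXT_WINDOW


inline_tokens, inline_share = main_context_usage(EXPLORE_COST)
delegated_tokens, delegated_share = main_context_usage(SUMMARY_COST)

print(f"inline:    {inline_tokens} tokens ({inline_share:.0%} of window)")
print(f"delegated: {delegated_tokens} tokens ({delegated_share:.0%} of window)")
```

Under these assumptions, inline exploration leaves only 20k tokens for planning and implementation, while delegation leaves the main session almost untouched.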
Ortiz's prescription is the **Principle of Least Context**: the main session should function purely as an orchestrator with zero data leakage from sub-agent internals. Sub-agents perform all discovery, file reads, and tool calls in their own isolated, throwaway context windows, then return only their final output to the orchestrator. Running these sub-agents in parallel transforms Claude into a system capable of processing practically infinite context — documents, emails, meeting transcripts, procurement requests, and more — without compounding intelligence degradation.
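The shape of this pattern can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not Claude Code's sub-agent mechanism: `run_subagent`, `Orchestrator`, and the sample jobs are all hypothetical names, and the "heavy work" is just string handling standing in for tool calls and file reads.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SubAgentResult:
    task: str
    summary: str  # the only thing the orchestrator ever sees


async def run_subagent(task: str, documents: list[str]) -> SubAgentResult:
    """Hypothetical sub-agent: reads everything in a private, throwaway context."""
    private_context: list[str] = []  # discarded when this function returns
    for doc in documents:
        await asyncio.sleep(0)  # stand-in for a tool call or file read
        private_context.append(doc)
    words = sum(len(doc.split()) for doc in private_context)
    return SubAgentResult(task, f"{task}: read {len(private_context)} docs, {words} words")


@dataclass
class Orchestrator:
    context: list[str] = field(default_factory=list)

    async def delegate(self, jobs: dict[str, list[str]]) -> list[SubAgentResult]:
        # Run all sub-agents concurrently; gather preserves the input order.
        results = await asyncio.gather(
            *(run_subagent(task, docs) for task, docs in jobs.items())
        )
        # Only final outputs enter the orchestrator's context.
        self.context.extend(r.summary for r in results)
        return list(results)


jobs = {
    "auth": ["login handler source " * 50, "session store source " * 50],
    "billing": ["invoice model source " * 50],
}
orchestrator = Orchestrator()
asyncio.run(orchestrator.delegate(jobs))
print(orchestrator.context)
```

However large the documents handed to each sub-agent, the orchestrator's context grows only by one short summary per task, which is the property the principle is after.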
Key facts
- 01 Article originated as a talk at the inaugural Claude Code Manila meetup on March 5, 2026, by Basti Ortiz.
- 02 Anecdotally, LLMs like Claude enter a 'dumb zone' beyond 40% of their context window, where 'context rot' degrades reasoning and recall.
- 03 Claude Code automatically compacts conversations at 85% context capacity, producing lossy summaries.
- 04 A single parallelized Explore sub-agent can consume approximately 60k tokens.
- 05 A medium-sized feature in a large codebase typically spawns 3 parallel Explore sub-agents (~60k tokens each), consuming ~180k of a 200k token budget before planning even begins.