Context Management

Last verified: 2026-04-17 · next review in 118 days

Every agent has a context window: the total text it can "see" in one turn. Models in 2026 commonly ship with windows of 200k to 1M tokens, but how much of that you use matters more than the ceiling, because LLM performance degrades as the context fills up. Managing context is the single highest-leverage skill in agentic coding.

Rule of thumb

Small tasks → narrow context. The less you load, the sharper the model.
Long investigations → sub-agents. Delegate exploration so the main session stays clean.
Unrelated tasks → new session. /clear resets the window.

What fills the window

In a typical Claude Code session, the context absorbs:

System prompt (fixed, typically a few thousand tokens)
CLAUDE.md files (global + project + subdirectory): global and project files load at session start, subdirectory files when the agent reads files in that directory
Tool schemas for enabled MCP servers — each server's tools add tokens
Your messages + the agent's replies
Every file the agent reads — a 500-line TypeScript file is ~4k tokens
Every command output — pnpm test --verbose can be thousands of tokens
Every web page the agent fetches

A 30-minute debugging session on a moderate codebase can consume 100k+ tokens. Treat ~50% of the window as a working threshold: past it, quality tends to degrade noticeably.
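To make the arithmetic concrete, here is a rough budget estimate in Python. It is a sketch under stated assumptions, not a real tokenizer: the 8-tokens-per-line rate is derived from the "500-line file is ~4k tokens" figure above, the 200k window is the low end of the range mentioned earlier, and the function names are hypothetical.

```python
# Rough context-budget estimate. All constants are illustrative assumptions.
TOKENS_PER_LINE = 8        # from "a 500-line TypeScript file is ~4k tokens"
WINDOW_TOKENS = 200_000    # low end of the 200k to 1M range
SOFT_LIMIT = 0.5           # the ~50% threshold from the text


def estimate_file_tokens(line_count: int) -> int:
    """Estimate the tokens a source file costs from its line count."""
    return line_count * TOKENS_PER_LINE


def budget_report(fixed_tokens: int, file_line_counts: list[int]) -> dict:
    """Add fixed overhead to the estimated file reads and compare to the soft limit."""
    used = fixed_tokens + sum(estimate_file_tokens(n) for n in file_line_counts)
    fraction = used / WINDOW_TOKENS
    return {
        "used": used,
        "fraction": fraction,
        "over_soft_limit": fraction > SOFT_LIMIT,
    }
```

Under these assumptions, `budget_report(20_000, [500] * 20)` reports 100,000 tokens used: twenty full-file reads on top of 20k of overhead already put a 200k window at the 50% mark.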

Five techniques

1. `@`-imports in CLAUDE.md

Don't inline large docs in CLAUDE.md. Reference them:

# My Project

See @README.md for project overview.
See @docs/architecture.md for system design.

## Code style

- TypeScript strict, no `any`
- …

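This keeps CLAUDE.md itself short. An imported file still costs its full size in tokens once it is loaded, and some Claude Code versions expand imports at session start rather than on demand, so check the behavior of your version and keep the imported documents lean too.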
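To audit what a CLAUDE.md pulls in, you can list its `@` references with a few lines of Python. This is a minimal sketch: it assumes an import is an `@` at the start of a line or after whitespace, followed by a path, which matches the sample above but may not cover every form the tool accepts. `find_imports` is a hypothetical helper, not part of Claude Code.

```python
import re

# Assumption: "@" preceded by line start or whitespace, followed by a path.
IMPORT_PATTERN = re.compile(r"(?:^|\s)@([\w./-]+)")


def find_imports(claude_md_text: str) -> list[str]:
    """Return the paths referenced with @ in a CLAUDE.md body, in order."""
    paths = []
    for line in claude_md_text.splitlines():
        for match in IMPORT_PATTERN.finditer(line):
            # Drop a sentence-ending period that the pattern may have captured.
            paths.append(match.group(1).rstrip("."))
    return paths


sample = """# My Project

See @README.md for project overview.
See @docs/architecture.md for system design.
"""
imports = find_imports(sample)
```

From there, checking the line count of each listed file shows which imports are worth trimming.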

2. Sub-agents for investigation

Sub-agents run in separate context windows and report summaries back:

You (main session): "Use a sub-agent to investigate how we handle
                    auth across the codebase. Return a summary of
                    the files touched and the session flow."

Sub-agent: [reads 40 files in its own context]
           ↓
           [returns a 500-token summary]

You (main session): still clean.

This is the single biggest win for long sessions. Claude Code, Claude Agent SDK, and OpenAI Agents SDK all support sub-agents; LangGraph models them as sub-graphs.
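The isolation is the point, and a toy simulation in Python shows it. Nothing here calls a real agent API: `run_subagent`, `first_lines_summary`, and the word-count `count_tokens` are hypothetical stand-ins, and in practice a model would write the summary.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def run_subagent(task: str, documents: dict[str, str], summarize) -> str:
    """Read every document into a private context and return only a summary."""
    private_context = [task]
    for path, body in documents.items():
        private_context.append(f"{path}\n{body}")  # never leaves this function
    return summarize(private_context)


def first_lines_summary(context: list[str]) -> str:
    """Hypothetical summarizer: keep only the path line of each document."""
    return "; ".join(item.splitlines()[0] for item in context[1:])


main_context = ["Investigate how auth is handled."]
docs = {f"src/auth/file{i}.ts": "export const x = 1;\n" * 50 for i in range(40)}

summary = run_subagent(main_context[0], docs, first_lines_summary)
main_context.append(summary)  # only the summary enters the main session
```

The sub-agent's private context holds all 40 files, while the main session grows by a single short message.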

3. `/clear` and `/compact`

/clear — wipes the conversation. Start fresh when switching tasks.
/compact — summarizes the session so far into a shorter form and continues. Use when you want to keep going on the same task but context is getting heavy.
/compact <instructions> — direct the summarization: "focus on the API changes" keeps what matters, drops what doesn't.

Run /clear between unrelated tasks. Don't wait for auto-compact.
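The difference between the two commands can be sketched on a plain list of messages. This is an analogy in Python, not how Claude Code implements either command: the `keep_recent` parameter and the keyword-based `focused_summarizer` are assumptions made for illustration, with the keyword standing in for the instructions passed to /compact.

```python
from typing import Callable

Summarizer = Callable[[list[str]], str]


def clear(history: list[str]) -> list[str]:
    """/clear analogue: drop the whole conversation."""
    return []


def compact(history: list[str], summarize: Summarizer, keep_recent: int = 2) -> list[str]:
    """/compact analogue: fold older messages into one summary, keep the latest."""
    split = max(len(history) - keep_recent, 0)
    older, recent = history[:split], history[split:]
    if not older:
        return list(history)  # nothing old enough to summarize
    return [summarize(older)] + recent


def focused_summarizer(focus: str) -> Summarizer:
    """Hypothetical summarizer that keeps only messages mentioning `focus`."""
    def summarize(messages: list[str]) -> str:
        kept = [m for m in messages if focus.lower() in m.lower()]
        return f"Summary (focus: {focus}): " + " | ".join(kept)
    return summarize
```

Compacting with `focused_summarizer("api")` keeps the API-related history and drops the rest, which is why a specific instruction beats a bare /compact.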

4. Read ranges, not whole files

When asking the agent to inspect a specific function:

Bad:  "Read src/services/pipeline/index.ts and tell me…"
      → loads the whole file
Good: "Read src/services/pipeline/index.ts lines 120-180 and tell me…"
      → loads ~60 lines, roughly 500 tokens

Most agents' file-reading tools (for example, Claude Code's Read) accept an offset and a limit. Use them.
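If you are building your own tooling, an offset-and-limit read is short to write. The Python sketch below is an assumption-laden illustration, not Claude Code's Read implementation: `read_range` is a hypothetical name, and the 1-based offset is this sketch's own convention.

```python
from itertools import islice


def read_range(path: str, offset: int, limit: int) -> list[str]:
    """Return up to `limit` lines starting at 1-based line `offset`."""
    if offset < 1 or limit < 0:
        raise ValueError("offset must be >= 1 and limit must be >= 0")
    start = offset - 1
    with open(path, encoding="utf-8") as handle:
        # islice stops reading once the requested window has been consumed.
        return [line.rstrip("\n") for line in islice(handle, start, start + limit)]
```

Reading lines 120 to 180 is then `read_range(path, 120, 61)`, and the rest of the file never enters the context.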

When context is full

As the window fills up, the agent starts to:

Forget earlier instructions
Repeat itself
Fail tasks it handled cleanly 10 minutes ago
Produce longer, less focused replies

Don't push through. Run /compact with specific instructions, or /clear and start a fresh session with a tighter prompt that captures what you learned.

Anti-patterns

The kitchen-sink session. You started debugging auth, then asked about unrelated API work, then came back to auth. Context is 50k tokens of noise. Fix: /clear between unrelated tasks.
Reading the whole repo "to understand". 200 files loaded, 90% of it irrelevant, context saturated. Fix: ask the agent to list files first, then read only the relevant ones; or use a sub-agent.
Compacting every 5 minutes. Each compaction loses nuance. Compact infrequently and with clear instructions; use /clear more often.
Huge CLAUDE.md. A 500-line CLAUDE.md rides along in every session. Prune aggressively — Anthropic's own guidance is "ruthlessly delete lines that don't change Claude's behavior."

