Open source Python library targets context bloat in long coding agent sessions
u/Natural_Patience_228 is building an open source Python library that gives developers explicit, precise control over what enters an LLM's context window during long coding sessions with tools like Cursor, Aider, or Claude Code.
Score breakdown
The library addresses silent context truncation and token bloat, two failure modes the post identifies as causing hallucinations and wasted tokens in long coding agent sessions, by giving developers explicit, budget-controlled management of what enters the context window.
u/Natural_Patience_228 is soliciting feedback on an open source Python library aimed at the context management problem that arises in long coding agent sessions with tools like Cursor, Aider, and Claude Code, where context either bloats with irrelevant history or gets silently truncated at critical moments.
The library uses a two-layer architecture. The first layer is a summary agent that automatically maintains a compressed state of the session, which the post describes as always accurate, within a configurable token budget. The second layer gives the user explicit control over what is injected into the context window. It supports whole-file or subfile chunking (down to individual functions or classes), automatic dependency fetching when a referenced chunk is missing, and relationship tracking between chunks to prevent orphaned context. User-configurable token limits can be set separately for the summary and context layers, and the library works across different models and context window sizes.
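To make the second layer more concrete, here is a minimal sketch of subfile chunking with dependency auto-fetch under a token budget. It is not the library's actual API, which the post does not show: the names `Chunk`, `chunk_source`, `estimate_tokens`, and `build_context` are hypothetical, chunking is done with the standard-library `ast` module, and the token count is a rough four-characters-per-token guess rather than a real tokenizer.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical chunk record: one top-level function or class."""
    name: str
    source: str
    deps: set[str] = field(default_factory=set)  # names of sibling chunks it references


def chunk_source(source: str) -> dict[str, Chunk]:
    """Split a module into one chunk per top-level function or class."""
    tree = ast.parse(source)
    top_level = [
        node for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    known = {node.name for node in top_level}
    chunks: dict[str, Chunk] = {}
    for node in top_level:
        # Every bare name used inside the definition is a candidate reference.
        used = {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}
        chunks[node.name] = Chunk(
            name=node.name,
            source=ast.get_source_segment(source, node) or "",
            deps=(used & known) - {node.name},
        )
    return chunks


def estimate_tokens(text: str) -> int:
    """Assumption: about 4 characters per token. Real tokenizers differ."""
    return max(1, len(text) // 4)


def build_context(chunks: dict[str, Chunk], requested: list[str], budget: int) -> list[str]:
    """Return requested chunks plus the chunks they reference, stopping at the budget."""
    selected: list[str] = []
    spent = 0
    seen: set[str] = set()
    stack = list(reversed(requested))
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        cost = estimate_tokens(chunks[name].source)
        if spent + cost > budget:
            break  # budget exhausted: stop rather than silently truncate a chunk
        selected.append(name)
        spent += cost
        # Auto-fetch: queue any referenced chunk that is not yet in context.
        stack.extend(sorted(chunks[name].deps, reverse=True))
    return selected
```

In this sketch, asking for a single function also pulls in the helpers it calls, and the `deps` sets double as the relationship graph between chunks. Stopping at the budget instead of cutting a chunk in half is one possible design choice; a real implementation might instead report which chunks were left out.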
Beyond the active session, the library also supports a cross-session context library where chunks from past sessions are stored and made searchable, surfacing relevant context automatically in new sessions. Context snapshots allow saving and restoring exact context states, enabling branching from a known-good point before attempting risky changes. An intent-based suggestion feature lets users type a title for their next prompt and receive relevant chunk suggestions from both the current session and the stored library. The post frames the library as provider-agnostic, supporting OpenAI, Anthropic, and Ollama backends.
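The snapshot and suggestion features can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ContextStore`, `snapshot`, `restore`, and `suggest` are hypothetical names, and suggestions are ranked by plain keyword overlap with the prompt title, which may be much simpler than whatever search the library actually uses.

```python
import re
from dataclasses import dataclass, field


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric words; underscores and punctuation act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class ContextStore:
    """Hypothetical store holding the active context, a cross-session library, and snapshots."""
    active: dict[str, str] = field(default_factory=dict)    # chunk name -> text in context now
    library: dict[str, str] = field(default_factory=dict)   # chunks kept from past sessions
    snapshots: dict[str, dict[str, str]] = field(default_factory=dict)

    def snapshot(self, label: str) -> None:
        """Save the exact active context under a label."""
        self.snapshots[label] = dict(self.active)

    def restore(self, label: str) -> None:
        """Replace the active context with a saved state; the snapshot itself is kept."""
        self.active = dict(self.snapshots[label])

    def suggest(self, title: str, limit: int = 3) -> list[str]:
        """Rank chunks from the session and the library by word overlap with the title."""
        wanted = tokenize(title)
        candidates = {**self.library, **self.active}  # active chunks win on name clashes
        scored = []
        for name, text in candidates.items():
            score = len(wanted & tokenize(name + " " + text))
            if score:
                scored.append((score, name))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [name for _, name in scored[:limit]]


store = ContextStore()
store.library["retry_policy"] = "def retry_policy(attempts): ..."
store.active["parse_config"] = "def parse_config(path): return load(path)"

store.snapshot("before-refactor")          # known-good point
store.active["parse_config"] = "broken"    # risky change goes wrong
store.restore("before-refactor")           # branch back to the saved state
```

Because `restore` copies the saved state rather than handing out the stored dictionary, the same snapshot can be restored repeatedly, which is what makes branching from a known-good point possible in this sketch.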
Key facts
- 01 Built as a Python library for explicit, precise control over LLM context windows in coding agent sessions
- 02 Two-layer architecture: a summary agent for compressed session state, plus user-controlled context injection
- 03 File and subfile chunking lets users inject whole files or specific functions/classes
- 04 Dependency auto-fetch pulls in missing referenced chunks automatically
- 05 Cross-session context library stores and indexes chunks from past sessions for reuse
- 06 Context snapshots allow saving, restoring, and branching context state
- 07 Provider-agnostic: supports OpenAI, Anthropic, and Ollama
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 15, 2026 · 11:57 UTC.