TACO framework compresses terminal agent context to cut token costs
TACO is a plug-and-play, self-evolving compression framework that automatically learns rules to reduce redundant terminal feedback in agentic interaction histories, cutting token overhead while improving benchmark performance.
Developers building long-horizon coding agents can drop TACO into existing terminal agent frameworks to cut token costs and improve accuracy without redesigning their pipelines.
The paper introduces TACO (Terminal Agent Compression), a plug-and-play framework targeting a fundamental scalability problem in long-horizon, multi-turn, terminal-centric agentic tasks. When agents keep raw environment feedback in their interaction history to support future decisions, every new step re-reads everything that came before it, so the cumulative token cost grows quadratically with the number of steps. This is a significant bottleneck for long-horizon reasoning. Existing mitigation strategies based on heuristics or fixed prompts struggle to generalize across heterogeneous terminal environments, which motivates a more adaptive approach.
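The quadratic growth can be checked with a little arithmetic. The sketch below is not from the paper: it assumes a simplified cost model in which every turn adds a fixed number of tokens (action plus terminal feedback) to the history and the whole history is resent as input on the next turn. The function names and the 500-tokens-per-step figure are illustrative assumptions.

```python
def cumulative_input_tokens(steps: int, tokens_per_step: int, system_tokens: int = 0) -> int:
    """Total input tokens over an episode when the full history is resent each turn.

    Assumed model (not from the paper): each turn appends `tokens_per_step`
    tokens to the history, and the next turn reads the entire history again.
    """
    total = 0
    history = system_tokens
    for _ in range(steps):
        total += history             # the model re-reads everything accumulated so far
        history += tokens_per_step   # this turn's action and feedback join the history
    return total


def closed_form(steps: int, tokens_per_step: int, system_tokens: int = 0) -> int:
    """Same quantity in closed form: T * s + k * T * (T - 1) / 2."""
    return steps * system_tokens + tokens_per_step * steps * (steps - 1) // 2


if __name__ == "__main__":
    for steps in (10, 20, 40):
        print(steps, cumulative_input_tokens(steps, tokens_per_step=500))
```

Under this model, doubling the number of steps roughly quadruples the total input tokens, which is why trimming redundant feedback from each step pays off more as episodes get longer.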
TACO addresses this by automatically discovering and refining compression rules directly from interaction trajectories, allowing it to evolve task-aware compression strategies without manual intervention. The framework is designed to be compatible with existing terminal agents as a drop-in enhancement.
Experiments span six benchmarks: TerminalBench (TB 1.0 and TB 2.0), SWE-Bench Lite, CompileBench, DevEval, and CRUST-Bench. With MiniMax-2.5 as the backbone, TACO reduces token overhead by around 10% while maintaining or improving performance on most benchmarks. On TerminalBench specifically, it yields consistent accuracy gains of 1%–4% across strong agentic models, and it improves accuracy by around 2%–3% when operating under the same token budget. Together, these results suggest that self-evolving, task-aware compression can improve both efficiency and capability in terminal agent settings.
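To make the idea of a compression rule concrete, here is a minimal hand-written sketch. It is an illustration only: this summary does not describe TACO's rule format or how rules are learned, so `CompressionRule`, `compress`, and the example patterns are hypothetical. The sketch collapses runs of consecutive terminal lines that match a rule into a one-line placeholder and leaves everything else, including error messages, untouched.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionRule:
    """Hypothetical rule: lines matching `pattern` are treated as redundant."""
    name: str
    pattern: str  # regular expression searched in each output line


def compress(output: str, rules: list[CompressionRule]) -> str:
    """Collapse consecutive lines matched by the same rule into one placeholder."""
    compiled = [(rule.name, re.compile(rule.pattern)) for rule in rules]
    result: list[str] = []
    run_name = None  # rule name of the run currently being collapsed
    run_len = 0

    def flush() -> None:
        nonlocal run_name, run_len
        if run_name is not None:
            result.append(f"[{run_len} lines omitted: {run_name}]")
        run_name, run_len = None, 0

    for line in output.splitlines():
        matched = next((name for name, rx in compiled if rx.search(line)), None)
        if matched is None:
            flush()
            result.append(line)       # unmatched lines are kept verbatim
        elif matched == run_name:
            run_len += 1              # extend the current run
        else:
            flush()
            run_name, run_len = matched, 1
    flush()
    return "\n".join(result)


# Hand-written example rules; a learned rule set would replace these.
RULES = [
    CompressionRule("pip already-satisfied notices", r"^Requirement already satisfied"),
    CompressionRule("download progress", r"^\s*Downloading "),
]

SAMPLE = "\n".join([
    "$ pip install -r requirements.txt",
    "Requirement already satisfied: numpy in ./venv/lib",
    "Requirement already satisfied: pandas in ./venv/lib",
    "Requirement already satisfied: six in ./venv/lib",
    "ERROR: No matching distribution found for foo==9.9",
])

if __name__ == "__main__":
    print(compress(SAMPLE, RULES))
```

In this sample the three "already satisfied" notices shrink to a single placeholder while the command and the error line survive. A self-evolving system would presumably propose and revise such rules from observed trajectories rather than rely on a fixed, hand-written list like `RULES`.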
Key facts
- TACO is a plug-and-play, self-evolving Terminal Agent Compression framework that learns compression rules from interaction trajectories.
- The core problem it addresses: retaining raw terminal feedback across multi-turn interactions causes token costs to grow quadratically with the number of steps.
- TACO was evaluated on six benchmarks: TerminalBench (TB 1.0 and TB 2.0), SWE-Bench Lite, CompileBench, DevEval, and CRUST-Bench.
- With MiniMax-2.5 as the backbone model, TACO reduces token overhead by around 10%.
- On TerminalBench, TACO delivers consistent accuracy gains of 1%–4% across strong agentic models.
- Under the same token budget, TACO further improves accuracy by around 2%–3% on TerminalBench.