MemToolAgent boosts tool-use accuracy via structured memory retrieval
MemToolAgent is a new framework that improves LLM agents' tool-using capabilities by storing and retrieving structured memories from past user-agent interactions, achieving relative improvements of 29%, 80%, and 17% on the WorkBench, NESTFUL, and PEToolBench benchmarks without requiring LLM fine-tuning.
Score breakdown
MemToolAgent demonstrates that structured memory management — without any LLM fine-tuning — can substantially improve tool-use accuracy, with an 80% relative gain on NESTFUL showing the approach's potential to close the gap between static LLM agents and agents that learn from experience.
Suleyman Armagan Er, Danilo Ribeiro, and Yogesh Virkar introduce MemToolAgent, a framework designed to address a gap in current LLM agent research: while memory systems exist for dialogue agents, few studies have examined how memory of past user-agent conversations can specifically improve tool-using capabilities. MemToolAgent tackles this by combining two modules. A memory extraction module processes past experiences into structured memory entries using environment and user feedback, and a retrieval module dynamically selects a subset of stored entries based on the distribution of memory similarity scores.
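To make the extraction step more concrete, here is a minimal Python sketch of what a structured memory entry and a feedback-driven extraction function might look like. The summary does not describe the actual entry format or prompts, so the `MemoryEntry` fields, the `extract_memory` helper, and the keyword-based failure check (a stand-in for an LLM reflection step) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryEntry:
    """Hypothetical structured memory entry; field names are illustrative."""
    user_request: str
    tool_call: str
    succeeded: bool
    critique: Optional[str] = None  # only set when the tool call failed


def extract_memory(user_request: str, tool_call: str,
                   env_feedback: str, user_feedback: str) -> MemoryEntry:
    """Turn one past interaction into a memory entry.

    Assumption: a real system would ask an LLM to reflect on the feedback.
    Here a simple keyword check stands in for that reflection step.
    """
    failed = ("error" in env_feedback.lower()
              or "wrong" in user_feedback.lower())
    critique = None
    if failed:
        # Store the feedback as a critique that can be surfaced later.
        critique = (f"Call {tool_call} failed. "
                    f"Environment said: {env_feedback!r}. "
                    f"User said: {user_feedback!r}.")
    return MemoryEntry(user_request, tool_call, not failed, critique)


# Made-up interaction: the tool ran without error, but the user objected.
entry = extract_memory(
    "Book a meeting with Sam tomorrow",
    "calendar.create_event(date='today')",
    "ok",
    "Wrong day, I asked for tomorrow",
)
```

In this sketch the entry is marked as failed because of the user feedback alone, which mirrors the idea that both environment and user signals can feed the memory.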
The framework makes three core contributions. First, it defines a unified memory entry format that supports both general-purpose and personalized tool use without requiring LLM fine-tuning. Second, it employs reflection-based memory extraction that distills incorrect tool executions into critiques stored for future reference. Third, its retrieval module adapts how many past experiences to surface based on the similarity distribution of stored memories. Evaluated against strong baselines, MemToolAgent achieves relative improvements of 29%, 80%, and 17% on the WorkBench, NESTFUL, and PEToolBench benchmarks, respectively, demonstrating that structured memory management can meaningfully improve agent tool use across diverse evaluation settings.
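The adaptive retrieval idea can be illustrated with a small Python sketch. The summary does not state the actual selection rule, so the one used here is an assumption: rank stored memories by cosine similarity to the query and cut the list at the steepest drop in similarity. The `cosine` and `retrieve` functions, the `max_k` parameter, and the toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_vec, memories, max_k=5):
    """Return a variable number of memory texts, most similar first.

    Assumed rule (not from the paper): keep the top-ranked entries up to
    the largest gap between consecutive similarity scores, at most max_k.
    `memories` is a list of (text, embedding) pairs.
    """
    if not memories or max_k < 1:
        return []
    scored = sorted(((cosine(query_vec, vec), text) for text, vec in memories),
                    reverse=True)
    sims = [s for s, _ in scored]
    if len(sims) == 1:
        return [scored[0][1]]
    limit = min(max_k, len(sims) - 1)
    # cut = number of entries kept, placed where the similarity drop is steepest
    cut = max(range(1, limit + 1), key=lambda i: sims[i - 1] - sims[i])
    return [text for _, text in scored[:cut]]


# Toy 2-D embeddings: two memories close to the query, two far from it.
memories = [
    ("critique: use tomorrow's date", [1.0, 0.0]),
    ("success: created calendar event", [0.9, 0.1]),
    ("success: sent email", [0.0, 1.0]),
    ("critique: wrong recipient", [0.1, 1.0]),
]
selected = retrieve([1.0, 0.0], memories)
```

With these toy vectors the two calendar-related memories are returned and the two email-related ones are dropped, so the number of retrieved entries follows the shape of the similarity scores rather than a fixed k.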
Key facts
- MemToolAgent is proposed by Suleyman Armagan Er, Danilo Ribeiro, and Yogesh Virkar.
- The framework improves LLM agents' tool-using capabilities using memory from past user-agent conversations.
- A memory extraction module converts past experiences into structured memory entries using environment and user feedback.
- A reflection-based extraction step distills incorrect tool executions into critiques stored for future use.
- A retrieval module dynamically selects how many stored memories to use based on the memory similarity distribution.
- MemToolAgent achieves relative improvements of 29%, 80%, and 17% on the WorkBench, NESTFUL, and PEToolBench benchmarks, respectively.
- The approach requires no LLM fine-tuning.
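Because the reported gains are relative rather than absolute, a short Python sketch of the usual calculation may help when reading them. The `relative_improvement` function and the scores below are hypothetical and are not taken from the paper; the summary reports only the resulting percentages.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Percentage gain of `new` over `baseline`, relative to the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline * 100.0


# Hypothetical scores, chosen only to illustrate the arithmetic.
gain = relative_improvement(baseline=0.25, new=0.45)
```

Under these made-up numbers, a 20-point absolute increase corresponds to a relative gain of about 80%, which shows why a large relative figure does not by itself indicate the absolute score.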
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 11, 2026 · 08:34 UTC.