★ Rank 24 today·Jun 11, 2026·1 min readResearch Papers

HyperTool doubles agent accuracy by batching tool calls into single code blocks

HyperTool is a unified MCP-style tool interface that lets LLM agents invoke multiple tools inside a single code block rather than one step at a time, more than doubling accuracy on the MCP-Universe benchmark for Qwen3-32B and Qwen3-8B models.

ArXiv·Yaxin Du, Yifan Zhou, Yujie Ge

Read at source

Composite · rank 24

6.3

out of 10

Novelty · 25%

Novelty

Impact · 43%

Impact

Credibility · 12%

Credibility

Depth · 20%

Depth

Weights applied. How scores work ↗

Why it matters

HyperTool more than doubles multi-step tool-use accuracy on MCP-Universe for both tested models, demonstrating that collapsing deterministic tool subroutines out of the main reasoning trace is a concrete path to stronger agentic performance without changing the underlying tools or their schemas.

01HyperTool is a unified executable MCP-style tool interface that replaces step-wise atomic tool calls with a single code-block invocation.
02The paper identifies an "execution-granularity mismatch" where deterministic tool workflows are unfolded into repeated model-visible decisions, consuming context.
03A HyperTool code block can call existing tools via their original schemas, manipulate returned values, and pass intermediate results locally.

Summary— our read of the original

Yaxin Du, Yifan Zhou, and Yujie Ge identify a core inefficiency in how tool-augmented LLM agents currently operate: every tool call, its observation, and any value transfer is surfaced in the model's main reasoning trace. They term this an "execution-granularity mismatch," arguing that locally deterministic tool workflows are unnecessarily unfolded into repeated model-visible decisions that consume context and force the model to handle low-level dataflow explicitly.

To address this, the authors introduce HyperTool, a unified executable MCP-style tool interface that changes the model-visible unit of tool execution.

To address this, the authors introduce HyperTool, a unified executable MCP-style tool interface that changes the model-visible unit of tool execution. Instead of issuing atomic step-wise calls, a model invokes HyperTool with a code block that can call existing tools through their original schemas, manipulate returned values, and pass intermediate results locally — folding what would otherwise be a multi-step subroutine into a single outer call. To train models on this interface, the team synthesizes HyperTool-format trajectories from cross-tool compositional tasks and verifies them in real MCP environments.

Evaluated on MCP-Universe, HyperTool raises average accuracy from 15.69% to 35.29% on Qwen3-32B and from 9.93% to 33.33% on Qwen3-8B. The approach also surpasses GPT-OSS and Kimi-k2.5 on average accuracy, demonstrating that collapsing deterministic tool subroutines into a single interface call substantially improves multi-step tool use.

Key facts

01HyperTool is a unified executable MCP-style tool interface that replaces step-wise atomic tool calls with a single code-block invocation.
02The paper identifies an "execution-granularity mismatch" where deterministic tool workflows are unfolded into repeated model-visible decisions, consuming context.
03A HyperTool code block can call existing tools via their original schemas, manipulate returned values, and pass intermediate results locally.
04Training data is synthesized from cross-tool compositional tasks and verified in real MCP environments.
05On MCP-Universe, Qwen3-32B accuracy improved from 15.69% to 35.29% with HyperTool.
06On MCP-Universe, Qwen3-8B accuracy improved from 9.93% to 33.33% with HyperTool.
07HyperTool surpasses GPT-OSS and Kimi-k2.5 on average accuracy on MCP-Universe.

Topics

#mcp #tool-use #agent-framework #multi-agent #benchmarks

Methodology

Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 12, 2026 · 10:05 UTC. How this works →

HyperTool doubles agent accuracy by batching tool calls into single code blocks

Score breakdown

Key facts

Topics

More in Research Papers.

Score breakdown

Key facts

Topics

More in Research Papers.