SearchSwarm teaches LLMs to delegate research subtasks to subagents
Researchers introduce SearchSwarm, a framework that trains LLMs to decompose long-horizon research tasks and delegate the pieces to subagents. Their `SearchSwarm-30B-A3B` model scores 68.1 on BrowseComp and 73.3 on BrowseComp-ZH, which the authors report as the best results among models of comparable scale.
Watch for the open-source release of SearchSwarm's harness, model weights, and training data, which could provide a practical foundation for building multi-agent deep research systems that scale beyond single-context-window limits.
Pu Ning, Quan Chen, and Kun Tao present SearchSwarm, a preliminary exploration into what they call "delegation intelligence" for agentic LLMs tackling long-horizon deep research tasks. The core problem they address is that real-world complex tasks can demand context that grows without bound, while model context windows remain inherently finite. A promising paradigm has a main agent decompose tasks and dispatch subtasks to subagents, which execute and return only summarized results. Doing this well requires the model to know when and what to delegate, and how to integrate returned results into its ongoing workflow.
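The main-agent/subagent pattern described above can be illustrated with a minimal Python sketch. It is not SearchSwarm's implementation: `MainAgent`, `SubagentResult`, and the three toy callables are hypothetical stand-ins for model calls, used only to show that the subagent's full transcript stays out of the main agent's context while a short summary comes back.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SubagentResult:
    subtask: str
    summary: str  # only this short summary returns to the main agent


@dataclass
class MainAgent:
    # Hypothetical callables standing in for model calls (assumptions, not a real API).
    decompose: Callable[[str], List[str]]
    run_subagent: Callable[[str], str]
    summarize: Callable[[str], str]
    context: List[str] = field(default_factory=list)

    def delegate(self, subtask: str) -> SubagentResult:
        # The subagent works in its own context; its full transcript is discarded here.
        transcript = self.run_subagent(subtask)
        return SubagentResult(subtask, self.summarize(transcript))

    def solve(self, task: str) -> List[str]:
        self.context.append(f"TASK: {task}")
        for subtask in self.decompose(task):
            result = self.delegate(subtask)
            # Integrate only the summary, so the main context grows slowly.
            self.context.append(f"{result.subtask} -> {result.summary}")
        return self.context


def toy_decompose(task: str) -> List[str]:
    return [f"{task} / part {i}" for i in (1, 2)]


def toy_subagent(subtask: str) -> str:
    # Simulates a long browsing transcript.
    return f"finding for {subtask}. " + "raw page text " * 200


def toy_summarize(transcript: str) -> str:
    # Keeps only the first sentence of the transcript.
    return transcript.split(".")[0]


agent = MainAgent(toy_decompose, toy_subagent, toy_summarize)
main_context = agent.solve("who founded X")
```

In this toy run each subagent transcript is several thousand characters long, yet each entry added to `main_context` is a single short line. A trained model would additionally have to decide whether a given step is worth delegating at all, which this fixed loop does not attempt.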
The authors identify a critical data scarcity problem: training signal for delegation intelligence rarely appears in naturally occurring text, and synthesizing such data has been largely unexplored in the open-source community. To address this, they design a harness that steers the model toward high-quality task decomposition and delegation decisions while constraining subagents to return results in a format that properly supports the main agent. The harness-guided trajectories encode correct delegation decisions, which are then used as supervised fine-tuning data to internalize the capability into model weights.
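As a rough illustration of the pipeline in the preceding paragraph, the sketch below checks subagent reports against a fixed schema and turns main-agent steps into prompt/completion pairs for supervised fine-tuning. Everything specific here is an assumption: the article does not describe the actual return format, so `REQUIRED_FIELDS`, the JSON encoding, the step layout, and the choice to skip steps with malformed reports are all hypothetical.

```python
import json
from typing import Dict, List, Optional

# Assumed return schema for illustration; the real format is not given in the article.
REQUIRED_FIELDS = ("answer", "evidence", "confidence")


def validate_subagent_report(raw: str) -> Optional[Dict]:
    """Accept a subagent report only if it is a JSON object with the required fields."""
    try:
        report = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(report, dict):
        return None
    if any(key not in report for key in REQUIRED_FIELDS):
        return None
    return report


def trajectory_to_sft(trajectory: List[Dict]) -> List[Dict]:
    """Turn main-agent steps into prompt/completion pairs.

    Each step holds the context the main agent saw and the action it took.
    Delegation steps whose subagent report fails validation are skipped
    (a filtering rule assumed for this sketch).
    """
    examples = []
    for step in trajectory:
        if step["action"]["type"] == "delegate":
            if validate_subagent_report(step["subagent_report"]) is None:
                continue
        examples.append({
            "prompt": step["context"],
            "completion": json.dumps(step["action"], sort_keys=True),
        })
    return examples


# Hypothetical three-step trajectory: one valid delegation, one malformed, one final answer.
trajectory = [
    {"context": "TASK: find the author",
     "action": {"type": "delegate", "subtask": "search for the book"},
     "subagent_report": json.dumps(
         {"answer": "A. Writer", "evidence": ["page 1"], "confidence": 0.9})},
    {"context": "TASK: find the author",
     "action": {"type": "delegate", "subtask": "search archives"},
     "subagent_report": "free-form text, not JSON"},
    {"context": "report: A. Writer",
     "action": {"type": "answer", "text": "A. Writer"}},
]
sft_examples = trajectory_to_sft(trajectory)
```

Here the second step is dropped because its report is not valid JSON, leaving two training examples. The point of the sketch is only that a format constraint on subagent output gives a mechanical way to decide which trajectories are usable as training data.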
The resulting model, `SearchSwarm-30B-A3B`, achieves 68.1 on BrowseComp and 73.3 on BrowseComp-ZH, which the authors report as the best results among all models of comparable scale. The team plans to release the harness, model weights, and training data to support future research in this area.
Key facts
- The paper introduces the concept of "delegation intelligence": decomposing tasks, deciding when and what to delegate, and integrating subagent results.
- LLM context windows are finite, but long-horizon task context can grow without bound, which motivates a main-agent/subagent architecture.
- Training data for delegation intelligence is scarce in naturally occurring text and largely unexplored in the open-source community.
- The authors design a harness to guide models toward high-quality task decomposition and constrain subagents to return properly formatted results.
- Harness-guided trajectories are used as supervised fine-tuning data to internalize delegation intelligence into model weights.
- Their model, `SearchSwarm-30B-A3B`, scores 68.1 on BrowseComp and 73.3 on BrowseComp-ZH, reported as the best among comparable-scale models.
- The team plans to release the harness, model weights, and training data publicly.
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 9, 2026 · 09:19 UTC.