ComPASS framework brings tool-augmented social support to AI agents
Researchers introduce ComPASS, a framework that equips LLM-based agents with external tools to deliver personalized, multimedia-rich social support beyond traditional empathetic dialogue.
Practitioners building AI companion or mental-health support agents can use ComPASS-Bench as a benchmark and the tool-augmentation paradigm as a blueprint for moving beyond text-only empathy toward richer, action-oriented social support.
Zhaopei Huang, Yanfeng Jia, and Jiayi Zhao present ComPASS, a research framework aimed at building more substantive AI companionship systems. Rather than relying solely on empathetic dialogue generation — which the authors argue is limited in response form and content — ComPASS equips agents with external tools that can execute diverse actions. The design is grounded in the psychological concept of "social support," and the toolkit includes a dozen user-centric tools simulating various multimedia applications intended to cover different types of social support behaviors across human-agent interaction scenarios.
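To make the tool-augmentation idea concrete, the following is a minimal sketch of how an agent runtime might route a model-produced tool call to a handler. It is an illustration only: the tool names (`play_music`, `send_card`), their parameters, the JSON call shape, and the `dispatch` helper are all hypothetical, since this summary does not list the actual ComPASS tools or their interfaces.

```python
import json
from typing import Callable, Dict

# Hypothetical tools for illustration; not the actual ComPASS toolkit.
def play_music(mood: str) -> str:
    return f"Playing a {mood} playlist"

def send_card(recipient: str, message: str) -> str:
    return f"Card for {recipient}: {message}"

# Registry mapping tool names to their handlers.
TOOLS: Dict[str, Callable[..., str]] = {
    "play_music": play_music,
    "send_card": send_card,
}

def dispatch(tool_call_json: str) -> str:
    """Parse a call shaped like {"name": ..., "arguments": {...}} and run it.

    The call shape is an assumed convention, not one confirmed by the source.
    """
    call = json.loads(tool_call_json)
    handler = TOOLS.get(call["name"])
    if handler is None:
        raise KeyError(f"unknown tool: {call['name']}")
    return handler(**call["arguments"])

result = dispatch('{"name": "play_music", "arguments": {"mood": "calming"}}')
print(result)
```

In a design like this, the tool output would be passed back to the model so that the final reply can combine the action taken with a supportive message, rather than offering text-only empathy.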
To support evaluation, the authors construct ComPASS-Bench, which they describe as the first personalized social support benchmark for LLM-based agents. It is built via multi-step automated synthesis followed by manual refinement. Using ComPASS-Bench, the team synthesizes tool-use records to fine-tune `Qwen3-8B`, producing a task-specific model called ComPASS-Qwen. Comprehensive evaluations across two settings show that while tested LLMs can generate valid tool-calling requests with high success rates, significant gaps remain in final response quality. Notably, tool-augmented responses outperform direct conversational empathy overall, and ComPASS-Qwen demonstrates substantial improvements over its base model, reaching performance comparable to several larger-scale models. Code and data are publicly available on GitHub.
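The distinction between a valid tool-calling request and a good final response can be illustrated with a small sketch. The code below checks only whether a call is well formed against a schema and computes a success rate over a batch of calls. The schemas, tool names, and call format are assumptions made for this example; the benchmark's actual validity criteria and metrics are not described in this summary.

```python
import json
from typing import Dict, List

# Hypothetical schemas: tool name -> required argument names and types.
SCHEMAS: Dict[str, Dict[str, type]] = {
    "play_music": {"mood": str},
    "set_reminder": {"text": str, "minutes": int},
}

def is_valid_call(raw: str) -> bool:
    """Return True if raw is JSON naming a known tool with exactly the expected arguments."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(call, dict):
        return False
    name = call.get("name")
    args = call.get("arguments")
    if not isinstance(name, str) or name not in SCHEMAS or not isinstance(args, dict):
        return False
    schema = SCHEMAS[name]
    if set(args) != set(schema):
        return False
    return all(isinstance(args[key], expected) for key, expected in schema.items())

def success_rate(calls: List[str]) -> float:
    """Fraction of calls that pass the format check (0.0 for an empty batch)."""
    if not calls:
        return 0.0
    return sum(is_valid_call(c) for c in calls) / len(calls)

calls = [
    '{"name": "play_music", "arguments": {"mood": "calming"}}',
    '{"name": "set_reminder", "arguments": {"text": "call mom", "minutes": 30}}',
    'not json at all',
    '{"name": "set_reminder", "arguments": {"text": "missing minutes"}}',
]
print(success_rate(calls))
```

A check of this kind says nothing about whether the eventual reply is helpful or personalized, which is consistent with the reported finding that high tool-call validity can coexist with gaps in final response quality.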
Key facts
- 01 ComPASS equips LLM-based agents with external tools grounded in the psychological concept of 'social support' to deliver human-like companionship.
- 02 The framework includes a dozen user-centric tools simulating various multimedia applications covering different social support behaviors.
- 03 ComPASS-Bench is introduced as the first personalized social support benchmark for LLM-based agents, built via automated synthesis and manual refinement.
- 04 ComPASS-Qwen is produced by fine-tuning `Qwen3-8B` on synthesized tool-use records derived from ComPASS-Bench.
- 05 Evaluations show tested LLMs can generate valid tool-calling requests with high success rates, but significant gaps remain in final response quality.
- 06 Tool-augmented responses achieve better overall performance than directly producing conversational empathy.
- 07 ComPASS-Qwen achieves performance comparable to several large-scale models despite being fine-tuned from a smaller base model.