Apr 24, 2026 · 1 min read · Applications & Use Cases

Multi-LLM debate engine uses a validator agent to fact-check in real time

Developer Suat built a multi-agent debate system where a dedicated Validator agent fact-checks every concrete claim mid-debate — before other LLMs can agree with a hallucination — using structured `[OK]`, `[WARN]`, and `[FAIL]` markers that are injected into subsequent rounds.

Dev.to #llm · Suat

Composite score: 5.8 out of 10. Weights applied: Novelty 25%, Impact 43%, Credibility 12%, Depth 20%.

Why it matters

Developers building multi-agent pipelines can adopt this Validator-as-shared-expert pattern to structurally suppress hallucination propagation across agent rounds without any fine-tuning.

Summary: our read of the original

Suat's post describes a multi-agent debate system designed to combat a specific failure mode: when multiple LLMs are asked the same question, fabricated citations carry the same rhetorical weight as real ones, and a downstream summarizer will typically treat all sources as equivalent rather than surface the hallucinations. The naive approaches — voting, arguing, or summarizing — make things worse because LLMs are prone to sycophancy, conceding to confident but wrong claims rather than pushing back.

The architecture draws on the shared expert pattern from Mixture-of-Experts language models, where one expert processes every token regardless of routing. In this system, that role is filled by a Validator agent that does not debate, take sides, or argue — it only verifies. Four debating personas (Analyst, Strategist, Devil's Advocate, Researcher) each produce opinions in Round 1. The Validator then reads all four outputs, runs `web_search` on every concrete claim — numbers, dates, company names, report citations, URLs — and emits structured `[OK]`, `[WARN]`, or `[FAIL]` markers with corrections and source URLs. Critically, the Validator's full reasoning is not shown to the other agents; only the structured markers are injected, preventing agents from quoting the Validator as a peer. Future-dated source claims are automatically marked `[FAIL]`.
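As a rough illustration of the marker format described above, here is a minimal Python sketch. The `Finding` type, its field names, and the `|`-separated rendering are assumptions made for this example, not the structure used in `capitansuat/swarm-debate`. The post places the future-dated rule in the Validator's system prompt; the sketch enforces it in code only to make the rule concrete.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """One verified claim (hypothetical structure, not taken from the repo)."""
    claim: str
    status: str                         # "OK", "WARN", or "FAIL"
    correction: Optional[str] = None
    source_url: Optional[str] = None
    source_date: Optional[date] = None  # publication date of the cited source


def apply_future_date_rule(finding: Finding, today: date) -> Finding:
    """Downgrade a finding to FAIL when its cited source is dated after today."""
    if finding.source_date is not None and finding.source_date > today:
        return replace(
            finding,
            status="FAIL",
            correction="cited source is dated in the future",
        )
    return finding


def render_marker(finding: Finding) -> str:
    """Render the structured line that would be injected into the next round."""
    line = f"[{finding.status}] {finding.claim}"
    if finding.correction:
        line += f" | correction: {finding.correction}"
    if finding.source_url:
        line += f" | source: {finding.source_url}"
    return line
```

Only the rendered line would be passed on to the debating personas, which matches the post's point that the Validator's full reasoning stays hidden.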

In Round 2 and Round 3, each persona sees the previous round's outputs plus the Validator's findings, with an explicit instruction not to reuse `[FAIL]`-marked claims. Suat reports that in test runs, the same model that fabricated a citation in Round 1 correctly dropped it in Round 2 and reframed its argument around real data — with no fine-tuning or retraining, purely through structured in-context feedback. The post also describes a comparison between a run with a functioning Validator and one where the Validator timed out, though the full results of that comparison are not included in the provided excerpt.
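The in-context feedback step could look something like the following Python sketch. The function name `build_round_prompt`, the prompt layout, and the exact wording of the instruction are illustrative assumptions; the post only states that personas see the previous round's outputs, the Validator's findings, and an explicit instruction not to reuse `[FAIL]`-marked claims.

```python
from typing import Dict, List


def build_round_prompt(
    persona: str,
    question: str,
    previous_outputs: Dict[str, str],
    validator_markers: List[str],
) -> str:
    """Assemble the prompt one persona sees in Round 2 or 3 (hypothetical layout)."""
    opinions = "\n".join(f"{name}: {text}" for name, text in previous_outputs.items())
    # Only the structured marker lines are included, never the Validator's reasoning.
    findings = "\n".join(validator_markers) or "(no findings)"
    return (
        f"You are the {persona}.\n"
        f"Question: {question}\n\n"
        f"Previous round:\n{opinions}\n\n"
        f"Validator findings:\n{findings}\n\n"
        "Do not reuse any claim marked [FAIL]."
    )
```

Because the feedback travels entirely through the prompt, nothing about the underlying model changes between rounds, which is consistent with the post's "no fine-tuning" claim.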

Key facts

01 The system is published on GitHub as `capitansuat/swarm-debate` under the MIT license.
02 Four debating personas (Analyst, Strategist, Devil's Advocate, and Researcher) each generate opinions per round.
03 A fifth Validator agent runs on every round and uses `web_search` to verify every concrete claim (numbers, dates, company names, URLs, events).
04 Validator output uses structured markers: `[OK]` (verified), `[WARN]` (suspect), and `[FAIL]` (fabricated or wrong, with correction and source URL).
05 Future-dated source citations are automatically marked `[FAIL]` by the Validator's system prompt.
06 Only the structured markers, not the Validator's full reasoning, are injected into subsequent rounds, to prevent agents from treating the Validator as a debate peer.
07 The architecture is inspired by the shared expert pattern in Mixture-of-Experts language models.
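Taken together, the facts above suggest a round loop roughly like this Python sketch. The callables `ask_persona` and `validate` are hypothetical stand-ins for the LLM and `web_search` calls, and the three-round default follows the summary's mention of Rounds 1 to 3; none of these names come from the repository.

```python
from typing import Callable, Dict, List

PERSONAS = ["Analyst", "Strategist", "Devil's Advocate", "Researcher"]

# Hypothetical signatures: (persona, question, previous outputs, markers) -> opinion
AskPersona = Callable[[str, str, Dict[str, str], List[str]], str]
# Hypothetical signature: all persona outputs for a round -> marker lines
Validate = Callable[[Dict[str, str]], List[str]]


def run_debate(
    question: str,
    ask_persona: AskPersona,
    validate: Validate,
    rounds: int = 3,
) -> List[dict]:
    """Run the debate; the validator sees every round, like a shared expert."""
    history: List[dict] = []
    previous: Dict[str, str] = {}
    markers: List[str] = []
    for round_no in range(1, rounds + 1):
        # Each persona sees the previous round's outputs plus the latest markers.
        current = {p: ask_persona(p, question, previous, markers) for p in PERSONAS}
        # The validator never debates; it only returns structured markers.
        markers = validate(current)
        history.append({"round": round_no, "outputs": current, "markers": markers})
        previous = current
    return history
```

Passing the model and search calls in as functions keeps the loop testable with stubs and makes it easy to see that the Validator sits outside the debate rather than acting as a fifth voice.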

Topics

#multi-agent #fact-checking #agent-framework #hallucination-mitigation #code-generation

Methodology

Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Apr 24, 2026 · 17:11 UTC.
