Happo MCP server lets AI agents review visual-regression diffs
Happo has launched an MCP server that enables AI agents like Claude to structurally review visual-regression and accessibility diffs, including writing rationale and catching flake — demonstrated by Claude reviewing 896 diffs in about a minute.
Score breakdown
The MCP server replaces manual spot-checking of large visual-regression diff sets with structured agent analysis that produces an auditable rationale and catches flake — a task the article describes as practically impossible for humans at hundreds of diffs.
Happo founder and CEO Henric Persson introduced the company's new MCP server, which connects AI agents such as Claude and Copilot to Happo's visual-regression and accessibility testing pipeline. The core insight is that agents already have the code change in context, so when diffs arrive they can reason about what UI changes to expect and cross-reference them against the actual diff metadata and images.
The flagship demonstration involved a one-line icon swap, replacing the Account sidebar item's icon, that triggered 896 diffs. Rather than spot-checking, Persson gave Claude a single instruction: review the diffs and approve or reject them. Claude flagged the high diff count as suspicious, then queried the comparison structurally. It broke the results down by component and browser and determined that the blast radius was roughly 100 page variants across 5 browsers at multiple viewport sizes. It then pulled before/after images for the icon gallery and for a page in context, confirmed that only `AccountIcon` changed, verified that `axeViolationsDelta` was 0, and approved both the Storybook and Cypress comparisons with a written rationale, all in about a minute.
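The structural breakdown described above can be illustrated with a minimal Python sketch. It is not Happo's API or the MCP server's actual tool interface: the `Diff` record, its field names, and the `looks_uniform` heuristic are all hypothetical, chosen only to show how grouping diffs by component and browser exposes the shape of a large diff set.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Diff:
    # Hypothetical record shape; these field names are assumptions,
    # not Happo's real diff metadata schema.
    component: str
    variant: str
    browser: str


def summarize(diffs):
    """Count diffs per component and per browser."""
    by_component = Counter(d.component for d in diffs)
    by_browser = Counter(d.browser for d in diffs)
    return by_component, by_browser


def looks_uniform(by_browser, tolerance=0.1):
    """Return True when every browser has about the same number of diffs.

    An even spread across browsers is consistent with one shared element
    (such as an icon) changing everywhere; a skewed spread is a hint that
    something browser-specific is going on. The tolerance is an assumption.
    """
    counts = list(by_browser.values())
    if not counts:
        return True
    mean = sum(counts) / len(counts)
    return all(abs(c - mean) <= tolerance * mean for c in counts)
```

A reviewer, human or agent, could run `summarize` first and then sample before/after images only from each distinct group, which is the kind of sampling-based review the article describes.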
A second example illustrated flake detection. On a six-diff PR touching the dashboard sidebar, Claude encountered one unexpected Storybook diff on `EditProjectPage`, a page the PR never touched. Claude cross-referenced the changed files, identified the diff as a focus-state flake caused by non-deterministic autofocus timing (a pink focus ring that appeared in the baseline but not in the after snapshot), and flagged it accordingly, completing the full session in about 20 seconds. Persson acknowledges that the MCP server is early-stage: for large diff sets, Claude approves based on structural analysis and image sampling rather than exhaustive review, and he cautions that sampling has inherent limits.
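The cross-referencing step in the flake example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Claude or Happo implements it: the `component_sources` mapping from component name to source files is hypothetical, as are the file paths in the usage example, and a real project would also need to follow transitive imports.

```python
def unexpected_diffs(diff_components, changed_files, component_sources):
    """Return components that have diffs but no changed source file.

    diff_components: names of components that produced a visual diff.
    changed_files: file paths modified by the PR.
    component_sources: hypothetical mapping of component name to the
        source files it is built from. Components missing from the
        mapping are treated as unexpected, since nothing links them
        to the change.
    """
    changed = set(changed_files)
    return sorted(
        name
        for name in set(diff_components)
        if not changed & set(component_sources.get(name, ()))
    )


# Usage with made-up paths: only the sidebar file changed in the PR.
sources = {
    "Sidebar": ["src/Sidebar.tsx"],
    "EditProjectPage": ["src/EditProjectPage.tsx"],
}
suspects = unexpected_diffs(
    ["Sidebar", "EditProjectPage"], ["src/Sidebar.tsx"], sources
)
```

A diff that lands in `suspects` is only a candidate for flake. Confirming it still requires looking at the images, as Claude did with the focus ring.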
Key facts
- 01 Happo launched a new MCP server enabling AI agents to review visual-regression and accessibility diffs.
- 02 In a demo, Claude reviewed 896 diffs from a one-line icon swap in about a minute, with a documented rationale.
- 03 Claude broke the 896 diffs down by component and browser, identifying ~100 page variants × 5 browsers at multiple viewport sizes as the blast radius.
- 04 Claude confirmed `axeViolationsDelta` was 0 and approved both Storybook and Cypress comparisons.
- 05 In a separate example, Claude detected a focus-state flake on `EditProjectPage`, a page the PR never touched, in about 20 seconds.
- 06 The author notes that large diff reviews rely on sampling and structural analysis, not exhaustive image review.
- 07 The MCP server is described as early-stage with a small sample size of real-world usage so far.
Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Jun 13, 2026 · 08:58 UTC.