Apr 23, 2026 · 2 min read · Applications & Use Cases

MCP server doubles as a QA engineer for production data

Eugen's team built 118 MCP tools for their SaaS, ran 722 manual acceptance test scenarios through Claude, and discovered that an AI with read access across domains surfaces data inconsistencies, cross-entity bugs, and security vulnerabilities that unit tests never catch.

Dev.to #mcp · Eugen

Composite score: 5.4 out of 10

Score weights: Novelty 25% · Impact 43% · Credibility 12% · Depth 20%

Why it matters

Treat your MCP tools as raw public API endpoints: audit them with cross-domain queries and explicit ownership checks, because the implicit security of a web UI and test suites that pass native types will not catch the transport-layer bugs or IDOR vulnerabilities that surface when Claude calls the tools in production.

Summary: our read of the original

Eugen's team built 118 MCP tools across 19 files for their SaaS application — primarily for data-entry operations like recording expenses and creating invoices. After running 722 manual acceptance test scenarios through Claude, they concluded that an AI with read access to multiple domains functions as a production data auditor, not just a code executor. Cross-domain queries like "which invoices have been in Sent status for more than 30 days?" or "compare total invoiced revenue with total recorded income in accounting" exposed inconsistencies that live between systems, where no single unit test has jurisdiction. Both systems could pass their own tests while a silent sync failure or missed transaction created a real discrepancy.
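The two example questions above can be pictured as plain checks over data from both domains. The following is a minimal sketch, not the team's implementation: the `Invoice` and `IncomeEntry` shapes, the field names, and the sample rows are all hypothetical, since the article does not show the real data model.

```typescript
// Hypothetical record shapes; the article does not show the real data model.
interface Invoice {
  id: string;
  status: "Draft" | "Sent" | "Paid";
  sentAt: Date;
  totalCents: number; // integer cents, to avoid floating-point drift
}

interface IncomeEntry {
  invoiceId: string;
  amountCents: number;
}

interface Mismatch {
  invoiceId: string;
  invoicedCents: number;
  recordedCents: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Invoices that have sat in "Sent" status for more than `maxDays`.
function staleSentInvoices(invoices: Invoice[], now: Date, maxDays = 30): Invoice[] {
  return invoices.filter(
    (inv) => inv.status === "Sent" && now.getTime() - inv.sentAt.getTime() > maxDays * DAY_MS,
  );
}

// Paid invoices whose total differs from the income recorded against them.
function reconcilePaidInvoices(invoices: Invoice[], income: IncomeEntry[]): Mismatch[] {
  const recorded = new Map<string, number>();
  for (const entry of income) {
    recorded.set(entry.invoiceId, (recorded.get(entry.invoiceId) ?? 0) + entry.amountCents);
  }
  return invoices
    .filter((inv) => inv.status === "Paid")
    .map((inv) => ({
      invoiceId: inv.id,
      invoicedCents: inv.totalCents,
      recordedCents: recorded.get(inv.id) ?? 0,
    }))
    .filter((m) => m.invoicedCents !== m.recordedCents);
}

// Made-up sample rows: one stale "Sent" invoice and one under-recorded payment.
const now = new Date("2026-04-23T00:00:00Z");
const invoices: Invoice[] = [
  { id: "inv_a", status: "Paid", sentAt: new Date("2026-02-01T00:00:00Z"), totalCents: 10000 },
  { id: "inv_b", status: "Paid", sentAt: new Date("2026-02-15T00:00:00Z"), totalCents: 25000 },
  { id: "inv_c", status: "Sent", sentAt: new Date("2026-03-01T00:00:00Z"), totalCents: 4000 },
  { id: "inv_d", status: "Sent", sentAt: new Date("2026-04-10T00:00:00Z"), totalCents: 7000 },
];
const income: IncomeEntry[] = [
  { invoiceId: "inv_a", amountCents: 10000 },
  { invoiceId: "inv_b", amountCents: 20000 },
];

const stale = staleSentInvoices(invoices, now);
const mismatches = reconcilePaidInvoices(invoices, income);
```

Each function on its own is trivial to unit test and would pass. The discrepancy only shows up when invoicing data and accounting data are read side by side, which is the position an assistant with read access to both domains is in.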

The structured test run — categorizing results as PASS, SKIP, BUG-FIXED, or KNOWN-LIMITATION — surfaced several concrete bugs. Six list tools used `z.number()` for pagination parameters, but Claude sends numbers as strings over JSON-RPC, causing Zod to reject them; the fix was switching to `z.coerce.number()`. Three update tools (`update-company`, `update-client`, `update-product`) had `name: z.string()` as a required field, breaking partial updates where only a phone number needed changing; the fix was making every non-mandatory field `.optional()`. Two IDOR vulnerabilities were found in `update-document-status` and `delete-document-status`, which lacked `teamId` checks in their repository queries — the web UI never exposed this gap because it only showed the user's own data, but the raw MCP API had no such guard. Runtime crashes were also caught: `update-invoice` called `service.update()` when the actual method was `service.updateDocument()`, a mismatch TypeScript missed due to `as never` bypassing a complex generic, and `convert-estimate-to-invoice` used `require()` in an ESM context, hiding a type mismatch until runtime.
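The two Zod fixes described above might look roughly like the sketch below. It assumes the `zod` package is installed; the schema names, field names, and sample payloads are hypothetical and are not taken from the team's code.

```typescript
import { z } from "zod";

// Hypothetical pagination schema for a list tool, before the fix:
// rejects "2" because it is a string, not a number.
const strictPagination = z.object({
  page: z.number().int().positive(),
  pageSize: z.number().int().positive(),
});

// After the fix: z.coerce.number() runs Number(input) before validating.
const coercedPagination = z.object({
  page: z.coerce.number().int().positive(),
  pageSize: z.coerce.number().int().positive(),
});

// Arguments as the article says they arrived: numbers encoded as strings.
const fromClient = { page: "2", pageSize: "25" };

const strictResult = strictPagination.safeParse(fromClient);
const coercedResult = coercedPagination.safeParse(fromClient);

// Hypothetical update schema: only the identifier is required, so a caller
// can change the phone number without resending the name.
const updateClientInput = z.object({
  id: z.string(),
  name: z.string().optional(),
  phone: z.string().optional(),
});

const phoneOnly = updateClientInput.safeParse({ id: "client_1", phone: "+1 555 0100" });
```

One caveat with coercion: `Number("")` is `0`, so an empty string would pass a bare `z.coerce.number()`. Chaining a constraint such as `.positive()`, as in this sketch, is one way to keep that case from slipping through.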

The post's central argument is that MCP tools are public API endpoints, and any implicit security baked into a web UI must be made explicit in every MCP tool query. The acceptance test process treated Claude as both a QA auditor for production data and a real-world transport-layer client, catching bugs that only appear when code runs through the actual JSON-RPC path rather than a native test suite.
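Making the ownership check explicit could look like the following in-memory sketch. The `DocumentStatusRepository` class, its methods, and the row shape are hypothetical stand-ins; the article only says that the real repository queries lacked a `teamId` condition. In a SQL-backed repository the equivalent would be adding the team column to the `WHERE` clause alongside the id.

```typescript
// Hypothetical row shape; the real schema is not shown in the article.
interface DocumentStatus {
  id: string;
  teamId: string;
  label: string;
}

class DocumentStatusRepository {
  private rows = new Map<string, DocumentStatus>();

  constructor(seed: DocumentStatus[]) {
    for (const row of seed) this.rows.set(row.id, { ...row });
  }

  // Vulnerable pattern: any caller who knows or guesses an id can update it.
  updateLabelUnscoped(id: string, label: string): DocumentStatus | undefined {
    const row = this.rows.get(id);
    if (!row) return undefined;
    row.label = label;
    return row;
  }

  // Scoped pattern: the row must belong to the caller's team.
  // Missing rows and foreign rows return the same result, so the
  // response does not reveal whether another team's id exists.
  updateLabel(id: string, teamId: string, label: string): DocumentStatus | undefined {
    const row = this.rows.get(id);
    if (!row || row.teamId !== teamId) return undefined;
    row.label = label;
    return row;
  }

  // Scoped delete: returns true only when a row owned by the team was removed.
  delete(id: string, teamId: string): boolean {
    const row = this.rows.get(id);
    if (!row || row.teamId !== teamId) return false;
    return this.rows.delete(id);
  }
}

// Team B attempts to modify and delete a status owned by team A.
const repo = new DocumentStatusRepository([{ id: "status_1", teamId: "team_a", label: "Sent" }]);

const foreignUpdate = repo.updateLabel("status_1", "team_b", "Hacked");
const foreignDelete = repo.delete("status_1", "team_b");
const ownerUpdate = repo.updateLabel("status_1", "team_a", "Paid");
const ownerDelete = repo.delete("status_1", "team_a");
```

The design point is that the tool handler passes the authenticated caller's `teamId` into every query rather than relying on the UI to only ever present ids the user owns.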

Key facts

01. The team built 118 MCP tools organized into 19 files for their SaaS product.
02. 722 acceptance test scenarios were run through Claude, with results categorized as PASS, SKIP, BUG-FIXED, or KNOWN-LIMITATION.
03. Six list tools used `z.number()` for pagination params; Claude sends these as strings over JSON-RPC, causing Zod rejections. The fix was `z.coerce.number()`.
04. Three update tools had `name` as a required Zod field, breaking partial updates; the fix was making non-mandatory fields `.optional()`.
05. Two IDOR vulnerabilities were found in `update-document-status` and `delete-document-status`; both lacked `teamId` ownership checks that the web UI enforced implicitly.
06. `update-invoice` called `service.update()` instead of `service.updateDocument()`, a runtime crash TypeScript missed due to `as never` bypassing a complex generic.
07. Cross-domain queries (e.g., matching bank balances to transaction sums) surface data inconsistencies that pass all unit tests because the inconsistency lives between systems.

Topics

#mcp #qa-automation #data-validation #production-testing #saas

Methodology

Summary and scoring are generated automatically from the original article. We always link back to the publisher and never republish images or paywalled content. Last processed Apr 23, 2026 · 11:04 UTC.
