AI agents generate code fast, but that code needs the same (or more) testing rigor as human-written code. Here's how.
| Trust Level | Code type | Verification |
|---|---|---|
| High trust | Boilerplate, standard patterns | Quick scan |
| Medium trust | Business logic, data transformations | Unit tests |
| Low trust | Auth, payments, security, novel algorithms | Manual review + tests |
| Zero trust | Crypto, input validation, SQL | Always hand-review |
Write tests before asking the agent to implement:
"Write a function that scores articles by keyword matching.
Here are the tests it should pass: [paste test file]"

The agent implements to satisfy the tests. You verify the tests are correct, not the implementation.
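As a rough illustration of what such a pasted test file could pin down, here is a minimal sketch with no test framework. The `Article` shape, the `Scorer` signature, and the scoring rules (case-insensitive, each keyword counted once) are assumptions made up for this example, and `scoreArticle` stands in for whatever the agent returns.

```typescript
// Hypothetical types: the article does not define the real ones.
interface Article {
  title: string;
  body: string;
}
type Scorer = (article: Article, keywords: string[]) => number;

// The spec, written BEFORE any implementation exists.
const cases: { name: string; article: Article; keywords: string[]; expected: number }[] = [
  { name: "no keywords scores zero", article: { title: "Rust 2.0", body: "Release notes" }, keywords: [], expected: 0 },
  { name: "matching is case-insensitive", article: { title: "Rust 2.0", body: "Release notes" }, keywords: ["rust"], expected: 1 },
  { name: "each keyword counts at most once", article: { title: "Rust Rust", body: "rust" }, keywords: ["rust"], expected: 1 },
  { name: "empty article scores zero", article: { title: "", body: "" }, keywords: ["rust"], expected: 0 },
];

// Runs any candidate implementation against the spec and returns the names of failing cases.
function runSpec(impl: Scorer): string[] {
  return cases
    .filter((c) => impl(c.article, c.keywords) !== c.expected)
    .map((c) => c.name);
}

// Stand-in for the agent's implementation.
const scoreArticle: Scorer = (article, keywords) => {
  const text = `${article.title} ${article.body}`.toLowerCase();
  return keywords.filter((k) => text.includes(k.toLowerCase())).length;
};
```

Calling `runSpec(scoreArticle)` returns an empty list when every case passes. The review effort then goes into the `cases` table: if a rule is missing there, nothing forces the agent to implement it.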
Define invariants the code must satisfy:
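One way to check invariants is to throw many generated inputs at the function and assert properties that must always hold. The sketch below is hand-rolled so that it runs without dependencies; a property-testing library such as fast-check would normally handle input generation and shrinking. The function `scoreText`, the word list, and the three invariants are assumptions chosen for illustration.

```typescript
// Hypothetical function under test.
function scoreText(text: string, keywords: string[]): number {
  const haystack = text.toLowerCase();
  return keywords.filter((k) => haystack.includes(k.toLowerCase())).length;
}

// Small linear congruential generator so that any failure is reproducible from its seed.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296; // value in [0, 1)
  };
}

const WORDS = ["rust", "go", "ai", "cloud", "sql", "llm"];

// Picks between 0 and maxLen words at random.
function randomWords(rng: () => number, maxLen: number): string[] {
  const len = Math.floor(rng() * (maxLen + 1));
  return Array.from({ length: len }, () => WORDS[Math.floor(rng() * WORDS.length)] ?? "");
}

// Returns a description of every invariant violation found in `runs` random trials.
function checkInvariants(runs: number, seed: number): string[] {
  const rng = makeRng(seed);
  const violations: string[] = [];
  for (let i = 0; i < runs; i++) {
    const text = randomWords(rng, 8).join(" ");
    const keywords = randomWords(rng, 5);
    const score = scoreText(text, keywords);
    const label = `"${text}" / [${keywords.join(",")}]`;

    // Invariant 1: the score is bounded by the number of keywords.
    if (score < 0 || score > keywords.length) violations.push(`bounds: ${label}`);
    // Invariant 2: no keywords means no score.
    if (scoreText(text, []) !== 0) violations.push(`empty keywords: ${label}`);
    // Invariant 3: adding a keyword never lowers the score.
    if (scoreText(text, [...keywords, "rust"]) < score) violations.push(`monotonic: ${label}`);
  }
  return violations;
}
```

A call such as `checkInvariants(500, 42)` should return an empty array. Invariants like these hold for any correct implementation, so they stay valid even when the agent rewrites the function body.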
For formatters and renderers, snapshot tests catch unexpected changes:

    expect(formatDigestMessage(articles, '2026-01-01', 'https://test.com')).toMatchSnapshot();

Test the boundaries between components:
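A boundary test feeds the real output of one component into the next instead of testing each against hand-built fixtures. The sketch below assumes two hypothetical components, `parseFeed` and `formatDigest`; neither name nor signature comes from the original project.

```typescript
// Hypothetical shared type at the boundary.
interface Article {
  title: string;
  url: string;
  summary?: string;
}

// Component A: turns a raw JSON payload into articles, dropping malformed entries.
function parseFeed(raw: string): Article[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) return [];
  const articles: Article[] = [];
  for (const item of data) {
    if (typeof item?.title === "string" && typeof item?.url === "string") {
      const article: Article = { title: item.title, url: item.url };
      // Only set the optional field when it is present and well-typed.
      if (typeof item.summary === "string") article.summary = item.summary;
      articles.push(article);
    }
  }
  return articles;
}

// Component B: renders articles as a plain-text digest.
function formatDigest(articles: Article[], date: string): string {
  const lines = articles.map(
    (a, i) => `${i + 1}. ${a.title} (${a.url})${a.summary ? ` - ${a.summary}` : ""}`
  );
  return [`Digest for ${date}`, ...lines].join("\n");
}

// Boundary check: a payload with a missing summary, a malformed entry, and a null.
const raw = JSON.stringify([
  { title: "A", url: "https://test.com/a", summary: "first" },
  { title: "B", url: "https://test.com/b" },
  { title: 42 },
  null,
]);
const digest = formatDigest(parseFeed(raw), "2026-01-01");
```

The interesting assertions are on `digest`: it should list exactly two articles and contain no stray `undefined`, which is the kind of defect that slips through when each component is only tested in isolation.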
The agent may use functions that don't exist or have wrong signatures. Always check imports and types.
Pagination, array slicing, and date calculations are classic places for AI mistakes. Test boundary conditions explicitly.
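For pagination, the boundaries worth pinning down are the first page, the exact end of the list, one page past the end, and a partial last page. The following is a minimal sketch; `paginate` and its 1-based page numbering are assumptions for the example.

```typescript
// Hypothetical helper: 1-based page numbers, returns the items on that page.
function paginate<T>(items: readonly T[], page: number, pageSize: number): T[] {
  if (!Number.isInteger(page) || page < 1) throw new RangeError("page must be an integer >= 1");
  if (!Number.isInteger(pageSize) || pageSize < 1) throw new RangeError("pageSize must be an integer >= 1");
  const start = (page - 1) * pageSize;
  return items.slice(start, start + pageSize);
}

const items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

// Each case targets one boundary where an off-by-one would show up.
const boundaryCases: { name: string; page: number; pageSize: number; expected: number[] }[] = [
  { name: "first page", page: 1, pageSize: 5, expected: [1, 2, 3, 4, 5] },
  { name: "last full page ends exactly at the final item", page: 2, pageSize: 5, expected: [6, 7, 8, 9, 10] },
  { name: "page past the end is empty", page: 3, pageSize: 5, expected: [] },
  { name: "partial last page", page: 4, pageSize: 3, expected: [10] },
  { name: "page size larger than the list", page: 1, pageSize: 50, expected: items },
];

// Returns the names of the boundary cases that fail.
function failedBoundaryCases(): string[] {
  return boundaryCases
    .filter((c) => JSON.stringify(paginate(items, c.page, c.pageSize)) !== JSON.stringify(c.expected))
    .map((c) => c.name);
}
```

An implementation that computes `start` as `page * pageSize`, or slices to `start + pageSize - 1`, fails several of these cases at once, which makes the mistake easy to spot.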
Agents often generate only the happy path. Explicitly ask: "What happens if the API returns 500? What if the input is empty?"
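One way to make those questions testable is to keep response handling in a pure function, separate from the network call, so failure cases need no mocking. This is a sketch under assumptions: `interpretResponse`, the `FeedResult` type, and the expected payload (a JSON array of title strings) are invented for the example.

```typescript
// Hypothetical result type: every outcome is explicit, including the unhappy ones.
type FeedResult =
  | { kind: "ok"; titles: string[] }
  | { kind: "empty" }
  | { kind: "error"; reason: string };

// Pure function: takes what the HTTP layer returned and decides what it means.
function interpretResponse(status: number, body: string): FeedResult {
  // Non-2xx responses (including 500) are errors, whatever the body says.
  if (status < 200 || status >= 300) return { kind: "error", reason: `HTTP ${status}` };
  // An empty body is a valid "nothing to report", not a crash.
  if (body.trim() === "") return { kind: "empty" };

  let data: unknown;
  try {
    data = JSON.parse(body);
  } catch {
    return { kind: "error", reason: "invalid JSON" };
  }
  if (!Array.isArray(data)) return { kind: "error", reason: "expected an array" };

  const titles = data.filter((t): t is string => typeof t === "string");
  return titles.length === 0 ? { kind: "empty" } : { kind: "ok", titles };
}
```

With this shape, `interpretResponse(500, "")` yields an error result and `interpretResponse(200, "[]")` yields an empty one, and each unhappy path gets a one-line test.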
The code looks right but has inverted conditions, wrong comparison operators, or missing null checks. Strict TypeScript options (noUncheckedIndexedAccess, exactOptionalPropertyTypes) catch many of the missing-check cases at compile time; inverted conditions and wrong operators still need tests.
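To make the compile-time part concrete, here is a small sketch of what `noUncheckedIndexedAccess` changes. The function `firstTitle` is hypothetical. The version shown compiles with or without the flag; the comments describe what the flag would reject.

```typescript
// Without the flag, `titles[0]` is typed `string`, so `titles[0].toUpperCase()`
// compiles and then throws at runtime on an empty array.
// With "noUncheckedIndexedAccess": true, `titles[0]` is `string | undefined`,
// and the compiler rejects the call until the undefined case is handled.
function firstTitle(titles: string[]): string {
  const first = titles[0];
  return first === undefined ? "(no articles)" : first.toUpperCase();
}
```

The flag only forces the check to exist. Whether the fallback value is the right one is still a question for a test, for example one asserting that `firstTitle([])` returns the placeholder.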