For engineers who don't trust their test suite: catch real bugs and kill the flake.
Reach for this when your suite is green but you don't believe it — high coverage that misses the bug, tests that flake in CI, mocks that pass while production breaks. It takes a suite from "runs" to "trustworthy": prove which gaps actually matter, root-cause flakiness instead of retrying around it, and use mutation testing to expose assertions that verify nothing. Pull it in before a release cut or when a passing build still ships regressions.
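To make "assertions that verify nothing" concrete, here is a minimal hand-rolled sketch of the idea behind mutation testing. The function `apply_discount`, its mutant, and both tests are hypothetical names invented for illustration. A real mutation tool would generate the mutants automatically rather than have you write them by hand.

```python
def apply_discount(price, percent):
    """Hypothetical code under test."""
    return price - price * percent / 100


def apply_discount_mutant(price, percent):
    """Hand-written mutant: the '-' operator is flipped to '+'."""
    return price + price * percent / 100


def weak_test(fn):
    # Executes every line (so coverage looks perfect) but only checks the type.
    result = fn(200, 10)
    return isinstance(result, float)


def strong_test(fn):
    # Pins the actual value, so a changed operator is detected.
    return fn(200, 10) == 180


def survives(test, mutant):
    """A mutant 'survives' when the test still passes against it."""
    return test(mutant)


if __name__ == "__main__":
    print("weak test lets mutant survive:", survives(weak_test, apply_discount_mutant))
    print("strong test lets mutant survive:", survives(strong_test, apply_discount_mutant))
```

Both tests pass against the original code and both give full line coverage. Only the strong test fails against the mutant, which is the signal that the weak test's assertion verifies nothing about the discount logic.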
The skills below are arranged in the author's recommended order. Walk through them in sequence, or open any one on its own.
Root-causes intermittently failing tests and eliminates the hidden dependency at its source instead of retrying around it. Use when a test passes locally but fails in CI, goes green on a CI re-run, fails roughly one run in ten, or is already tagged "flaky." Do NOT use when the task is to design the fake or stub that replaces a real dependency — use mock-stub-designer instead.
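As an illustration of removing a hidden dependency at its source, the sketch below assumes one common cause of flakiness, the wall clock. The function `is_business_hours` and both checks are hypothetical. Other causes, such as shared state, test ordering, or the network, need their own fixes.

```python
from datetime import datetime, time


def is_business_hours(now=None):
    """Hypothetical function; `now` is injectable so tests can control the clock."""
    current = now if now is not None else datetime.now()
    return time(9, 0) <= current.time() < time(17, 0)


def flaky_check():
    # Hidden dependency: the result depends on when the CI job happens to run.
    return is_business_hours()


def deterministic_check():
    # The clock is passed in explicitly, so the outcome never varies between runs.
    inside = is_business_hours(datetime(2024, 1, 15, 10, 30))
    outside = is_business_hours(datetime(2024, 1, 15, 22, 0))
    return inside and not outside


if __name__ == "__main__":
    print("deterministic check passes:", deterministic_check())
```

Retrying `flaky_check` would only hide the problem until the next off-hours run. Injecting the clock removes the dependency, so the test has one outcome no matter when it executes.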
Produces a risk-ranked list of untested critical paths and branches, with the specific missing cases and the smallest test that buys the most safety. Use when a coverage report shows high line coverage you do not trust, when deciding what to test next, or when auditing a suite before a release cut: any time the question is "which gap actually matters," not "what is the percentage."