The Positive Test That's Probably Wrong
A test is 99% accurate. It just told you that you have a disease that 1 in 100 people carry. How worried should you be? The honest answer is a coin flip — and most people, including most doctors, are off by a factor of ten. This is the trap Bayes' theorem unsprings, and you can watch it spring below.
If your test comes back positive, the chance you are actually sick is 50%.
Out of everyone who tests positive, half are perfectly healthy people the test flagged by mistake.
Turn the three dials
Prevalence (P): how many people in the population actually have it, before any test.
Sensitivity (Se): of people who are sick, the share the test correctly flags positive.
Specificity (Sp): of people who are healthy, the share the test correctly clears. The rest are false alarms.
Picture 10,000 people. 100 are sick; the test catches 99 of them. The other 9,900 are healthy, but the test still flags 99 of them by mistake. So 198 people test positive — and only 99 are truly sick. 99 in 198, a coin flip.
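The head count above can be reproduced in a few lines. This is a minimal sketch rather than the page's own code: the function name `countPositives` and its parameter names are made up for illustration, and each group is rounded to whole people.

```javascript
// Illustrative sketch: count a finite population the way the paragraph does.
// countPositives and its argument names are hypothetical, not taken from the page's source.
function countPositives(population, prevalence, sensitivity, specificity) {
  // Split the population into sick and healthy, in whole people.
  const sick = Math.round(population * prevalence);
  const healthy = population - sick;

  // Sick people the test catches, and healthy people it flags by mistake.
  const truePositives = Math.round(sick * sensitivity);
  const falsePositives = Math.round(healthy * (1 - specificity));

  const allPositives = truePositives + falsePositives;
  return {
    sick,
    healthy,
    truePositives,
    falsePositives,
    allPositives,
    // Fraction of the positive pile that is really sick (NaN if nobody tests positive).
    shareTrulySick: allPositives === 0 ? NaN : truePositives / allPositives,
  };
}

// The default scenario: 10,000 people, 1% prevalence, 99% sensitivity and specificity.
const result = countPositives(10000, 0.01, 0.99, 0.99);
console.log(result);
```

Counting whole people instead of multiplying probabilities keeps the two piles visible: with these inputs the true-positive and false-positive counts come out the same size, which is why the share is one half.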
The same thing, as Bayes wrote it
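Written with the symbols used on this page, where P is the prevalence, Se the sensitivity, and Sp the specificity, the probability of being sick given a positive test is:

```latex
\[
\Pr(\text{sick} \mid \text{positive})
  = \frac{P \cdot Se}{P \cdot Se + (1 - P)\,(1 - Sp)}
\]
```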
That is all Bayes' theorem is: the positive results split into the truly sick (P · Se) and the false alarms ((1−P) · (1−Sp)), and you ask what fraction of the pile is real. When the disease is rare, the false-alarm pile is drawn from the enormous healthy majority — so even a tiny false-positive rate can swamp the few true cases. The accuracy of the test never appears alone; it is always weighed against how rare the thing you're testing for is. Ignore the base rate and you get the wrong answer — usually by reporting the test's accuracy as if it were your odds.
This isn't a hypothetical failure. In 1978 Casscells and colleagues posed exactly the 1-in-1000 / 5%-false-alarm question to staff and students at Harvard Medical School. The most common answer was 95%. The correct answer is under 2%, and only about 18% of them got it right.[1] Set the dials to the Casscells preset above and watch why.
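The arithmetic behind "under 2%" is short enough to write out. The numbers quoted here give a prevalence of 1 in 1000 and a false-alarm rate of 5% but no sensitivity, so this sketch assumes a perfect one (Se = 1). That assumption is ours, not something stated above.

```latex
\[
\Pr(\text{sick} \mid \text{positive})
  = \frac{P \cdot Se}{P \cdot Se + (1 - P)\,(1 - Sp)}
  = \frac{0.001 \times 1}{0.001 \times 1 + 0.999 \times 0.05}
  = \frac{0.001}{0.05095}
  \approx 0.0196
\]
```

Even with a test that never misses a sick person, the 0.04995 contributed by false alarms is about fifty times the 0.001 contributed by true cases.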
Show the check
Every number on this page is recomputed offline and cross-checked two independent ways in research/bayes-theorem/verify.mjs (no dependencies; run it from a fresh checkout). It proves:
- The closed-form posterior P·Se / (P·Se + (1−P)·(1−Sp)) equals a brute-force count over a finite population — the very grid above — to within one person's worth of rounding, across a sweep of cases.
- The probability form agrees with the odds form (prior odds × likelihood ratio Se/(1−Sp)) to 1 part in 10¹⁵, over 200,000 random inputs.
- The headline default — a 99%-accurate test (Se = Sp = 99%) for a 1%-prevalent disease — gives a posterior of exactly ½. That is structural: the true-positive and false-positive piles are literally equal.
- Three textbook results reproduce to their published values: Casscells 1978 (≈ 2%), Eddy 1982 breast cancer (≈ 7.8%), and Gigerenzer & Hoffrage 1995 mammography (≈ 9.2%).
- The posterior is strictly increasing in prevalence, → 0 as the disease vanishes and → 1 as it becomes universal; a useless test (Se = 1−Sp) never moves the prior at all.
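For a feel of what checks like these look like in practice, here is a small sketch. It is not the contents of verify.mjs: every function and variable name is hypothetical, the sample size is far smaller, and the tolerances are deliberately loose.

```javascript
// Illustrative sketch of the kinds of checks listed above.
// This is NOT verify.mjs; all names and tolerances here are assumptions.

// Probability form: P·Se / (P·Se + (1−P)·(1−Sp))
function posterior(p, se, sp) {
  const truePos = p * se;
  const falsePos = (1 - p) * (1 - sp);
  return truePos / (truePos + falsePos);
}

// Odds form: posterior odds = prior odds × likelihood ratio Se/(1−Sp)
function posteriorViaOdds(p, se, sp) {
  const priorOdds = p / (1 - p);
  const likelihoodRatio = se / (1 - sp);
  const postOdds = priorOdds * likelihoodRatio;
  return postOdds / (1 + postOdds);
}

// Headline default: mathematically 1/2, though floating point may leave it a hair off.
const headline = posterior(0.01, 0.99, 0.99);

// The two forms should agree up to rounding on random inputs.
// Inputs stay away from 0 and 1 so neither form divides by zero.
let worstGap = 0;
for (let i = 0; i < 1000; i++) {
  const p = 0.001 + 0.998 * Math.random();
  const se = 0.01 + 0.98 * Math.random();
  const sp = 0.01 + 0.98 * Math.random();
  const gap = Math.abs(posterior(p, se, sp) - posteriorViaOdds(p, se, sp));
  worstGap = Math.max(worstGap, gap);
}

// A useless test (Se = 1 − Sp) should hand back the prior unchanged.
const uselessShift = Math.abs(posterior(0.2, 0.3, 0.7) - 0.2);

// The posterior should rise with prevalence for a fixed, informative test.
let increasing = true;
for (let i = 2; i <= 99; i++) {
  if (posterior(i / 100, 0.9, 0.9) <= posterior((i - 1) / 100, 0.9, 0.9)) {
    increasing = false;
  }
}

console.log({ headline, worstGap, uselessShift, increasing });
```

Comparing two algebraically equivalent forms on random inputs is a cheap way to catch a transcription slip in either one, since a mistake would have to appear identically in both to go unnoticed.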