A true thing you can operate

The Dunning–Kruger Effect, drawn from random numbers

The most famous chart in pop psychology — the clueless wildly overconfident, the experts modestly unsure — appears even when confidence has nothing whatsoever to do with skill. Here is the machine that draws it from pure noise. Turn the dials yourself.

You know the story. Incompetent people are too incompetent to know they're incompetent, so they rate themselves sky-high; the truly skilled are racked with doubt. It comes with a chart shaped like an opening pair of scissors, and it is everywhere — management decks, op-eds, Twitter arguments.

The original 1999 study is real, and people do misjudge themselves. But the dramatic scissors shape — the thing the chart is famous for — is largely a statistical artefact. It falls out of how the chart is built, and it appears in data where self-assessment is, by construction, completely unrelated to actual skill. Don't take that on faith. Below is a population of synthetic test-takers; you control how honestly they know themselves. Drag that knob to zero — sever every link between confidence and competence — and watch the famous chart refuse to go away.

Companion film (2:30): the argument told as motion. Four thousand synthetic test-takers, drawn from the very same seeded Monte Carlo the verifier runs, scatter into a structureless cloud once self-insight is dialled to ρ = 0 (live r = 0.04, no relationship at all). Sorted into four skill quartiles, the textbook scissors reassembles itself from that noise: the bottom quartile rates itself 64.8 while it actually scores 12.5, overestimating by +52.5 points, almost exactly the published 62nd / 12th (Kruger & Dunning, 1999). Restore self-insight and the scissors closes on its own (+52.5 → +19.3); the overestimation-vs-skill slope is −1 by construction. The film does not claim the effect is zero; that stays unsettled. Every figure on screen is read live from the model at seed 1234, matching the verifier's 200-seed means to a decimal; the music is a minor-pentatonic bed whose percussion is the population's own overestimation pattern, read in skill order. Companion film for a Wasteland layer.

The Dunning–Kruger machine

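[Interactive figure. Left panel: perceived vs actual, by skill quartile. Right panel: every person, the raw cloud. Readouts at the default settings: bottom quartile overestimates by +52; top quartile underestimates by −22; real skill↔confidence r = 0.00.]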

The scissors is wide open — and the true correlation between skill and confidence is zero.

Self-insight (ρ). How well each person actually knows their own ability. 0 = they guess regardless of skill; 1 = they report their true percentile. This is the only dial that encodes a real psychological effect.

Bias. Most people, on most tasks, rate themselves a bit above average. This lifts the whole self-rating line and, as you'll see, it is what makes the scissors asymmetric (low performers look "more wrong" than high ones).

Test noise. No test measures ability perfectly. This noise is what drives regression to the mean: the extreme quartiles are partly luck, and luck doesn't repeat.
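If you would rather poke at the dials in code, here is a minimal sketch. It is not the page's model: the blend `rho * actual + (1 - rho) * guess + bias` is an assumption made for illustration, ratings are not clipped to 0–100, and test noise is left out entirely. The name `bottom_and_top_gap` and its parameters are hypothetical.

```python
import random
from statistics import mean

def bottom_and_top_gap(rho, bias=15.0, n=4000, seed=0):
    """Mean (self-rating - actual percentile) in the bottom and top skill quartiles.

    Hypothetical dial model, not the page's verified one:
    self-rating = rho * actual + (1 - rho) * random guess + bias.
    """
    rng = random.Random(seed)
    # Actual percentile ranks, already in ascending skill order.
    actual = [100 * (i + 0.5) / n for i in range(n)]
    gaps = []
    for a in actual:
        guess = rng.uniform(0, 100)          # a rating unrelated to skill
        rating = rho * a + (1 - rho) * guess + bias
        gaps.append(rating - a)              # positive = overestimates
    q = n // 4
    return mean(gaps[:q]), mean(gaps[-q:])

if __name__ == "__main__":
    for rho in (0.0, 0.5, 1.0):
        bottom, top = bottom_and_top_gap(rho)
        print(f"rho={rho:.1f}  bottom Q gap={bottom:+.1f}  top Q gap={top:+.1f}")
```

Under these assumptions the expected bottom-quartile gap is (1 − ρ)(50 − 12.5) + bias, so raising `rho` closes the scissors, and a positive `bias` pushes both quartiles upward, which is what makes the picture lopsided. Setting `bias=0.0` should leave the two gaps roughly equal and opposite.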

Read the right-hand cloud while you drag. At ρ = 0 it is a structureless fog — the best-fit line is flat, the correlation sits at zero. There is no relationship between what these people can do and how good they think they are. Yet the left-hand chart still shows the textbook scissors: the bottom quartile "overconfident" by fifty percentile points, the top quartile "humble." How?

Three pieces of pure statistics

The scissors is assembled from three ordinary effects — no psychology of incompetence required.

1 · The flat line and the staircase

If confidence is unrelated to skill, then average self-rating is the same in every quartile — a flat line near the population mean. But average actual score, by definition of "quartile," climbs a staircase from ~12th to ~88th percentile. A flat line laid across a rising staircase must cross: below the crossing everyone looks overconfident, above it everyone looks humble. That is the scissors, and it is pure geometry. Set ρ = 0 and watch the perceived line go dead flat.
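The flat line and the staircase can be reproduced in a few lines. This is a sketch under stated assumptions, not the verifier's code: actual skill is taken to be an exact percentile rank, self-ratings are drawn uniformly from 0 to 100 with no reference to skill (ρ = 0, no bias), and `quartile_means` is a hypothetical helper name.

```python
import random
from statistics import mean

def quartile_means(n=4000, seed=0):
    """Return [(mean actual, mean perceived)] for the four skill quartiles."""
    rng = random.Random(seed)
    # Actual skill: each person's percentile rank, in ascending order.
    actual = [100 * (i + 0.5) / n for i in range(n)]
    # Self-rating at rho = 0: drawn with no reference to skill at all.
    perceived = [rng.uniform(0, 100) for _ in range(n)]
    q = n // 4
    rows = []
    for k in range(4):
        block = slice(k * q, (k + 1) * q)
        rows.append((mean(actual[block]), mean(perceived[block])))
    return rows

if __name__ == "__main__":
    for k, (act, per) in enumerate(quartile_means(), start=1):
        print(f"Q{k}: actual={act:5.1f}  perceived={per:5.1f}  gap={per - act:+6.1f}")
```

The actual column climbs the staircase (12.5, 37.5, 62.5, 87.5 by construction) while the perceived column hovers near 50 in every quartile, so the gap is large and positive at the bottom and large and negative at the top.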

2 · Regression to the mean

Tests are noisy. The people who scored in the bottom quartile include plenty who were merely unlucky; their true ability is higher than their score, so any honest self-read lands above their measured rank and registers as instant "overconfidence." The top quartile is the mirror image. Set Self-insight to a middling value and turn Test noise up: the chart bows open even though nobody is biased and nobody is clueless. (Set bias to 0 to see it symmetric.)
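Here is a minimal sketch of that regression effect on its own. The assumptions are mine, not the page's: true ability is standard normal, the test adds Gaussian noise of standard deviation `noise_sd` (a hypothetical parameter, not the page's noise dial), and everyone reports the percentile of their true ability with no bias at all.

```python
import random
from statistics import mean

def percentile_ranks(values):
    """Map each value to its percentile rank (0-100) within the list."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for position, i in enumerate(order):
        ranks[i] = 100 * (position + 0.5) / n
    return ranks

def regression_gaps(n=4000, noise_sd=1.0, seed=0):
    """Mean (perceived - actual) per score quartile when self-reads are honest."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    score = [a + rng.gauss(0, noise_sd) for a in ability]
    perceived = percentile_ranks(ability)    # honest, unbiased self-read
    actual = percentile_ranks(score)         # measured rank on the noisy test
    people = sorted(zip(actual, perceived))  # order by measured rank
    q = n // 4
    return [mean(p - a for a, p in people[k * q:(k + 1) * q]) for k in range(4)]

if __name__ == "__main__":
    for k, gap in enumerate(regression_gaps(), start=1):
        print(f"Q{k}: perceived - actual = {gap:+.1f}")
```

With noise present, the bottom quartile comes out "overconfident" and the top quartile "humble", even though every self-rating here is exactly right about true ability. With `noise_sd=0.0` the gaps vanish.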

3 · Autocorrelation — the trick hidden in the axes

The classic way to quantify the effect is to plot overconfidence (self-rating minus score) against score. But score now sits on both axes, with opposite signs. Even with self-ratings that are literally random, that regression has a slope of almost exactly −1: built in, not discovered. The "skilled underestimate themselves; unskilled overestimate themselves" correlation is partly an accounting identity.³
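The built-in slope is easy to check. For any data, the least-squares slope of (rating − score) on score equals the slope of rating on score minus 1, because cov(rating − score, score) = cov(rating, score) − var(score). The sketch below assumes ratings and scores are independent uniform draws; `slope` and `overconfidence_slope` are hypothetical helper names.

```python
import random
from statistics import mean

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

def overconfidence_slope(n=4000, seed=0):
    """Return (slope of overconfidence on score, slope of rating on score)."""
    rng = random.Random(seed)
    score = [rng.uniform(0, 100) for _ in range(n)]
    rating = [rng.uniform(0, 100) for _ in range(n)]   # independent of score
    overconfidence = [r - s for r, s in zip(rating, score)]
    return slope(score, overconfidence), slope(score, rating)

if __name__ == "__main__":
    over, raw = overconfidence_slope()
    print(f"rating vs score slope:         {raw:+.3f}")
    print(f"overconfidence vs score slope: {over:+.3f}")
```

The raw slope should sit near zero, since the ratings are noise, while the overconfidence slope sits near −1. The second number is the first minus one, whatever the data.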

The check — recomputed in front of you

Everything above is run live in your browser by the same seeded model as the verifier in research/dunning-kruger/. With the dials at their default (ρ = 0, bias = +15, noise = 0.50), over a fresh population of 4,000 people:

[Live readout, computed in the browser.]

The quartile chart at zero correlation, beside the figure published by Kruger & Dunning in 1999:

quartile | perceived (sim) | actual (sim) | K&D 1999
[Table rows are filled in live by the simulation.]

The simulated bottom quartile lands at roughly the 65th percentile while actually sitting near the 12th, almost exactly the famous "62nd vs 12th" of the original paper.¹ Produced from no relationship at all. Full Monte-Carlo proof (200 seeds, five separate claims): artefact.json · verify.mjs.

So is the effect fake?

No — and this is the part the debunkers sometimes overshoot. People are miscalibrated about themselves; the original measurements are real. What the artefact dissolves is the strong, asymmetric, deficit-of-the-incompetent story — the claim that the unskilled are uniquely, specially blind. That dramatic shape, this page shows, you get from random numbers.

Whether any genuine metacognitive signal survives once the statistics are removed is honestly unsettled. Nuhfer and colleagues, using measures designed to dodge the artefact, find the strong tendency largely vanishes.² Gignac and Zajenkowski call it "(mostly)" an artefact, leaving room for a small real component.³ Dunning himself maintains the self-misjudgements are real whatever their cause. This page does not settle that fight. It settles a narrower, sturdier thing: the chart is not the evidence it is usually taken to be. When someone shows you the scissors, they have shown you regression to the mean wearing a lab coat.