A technical honeypot · Pattern seam

The First Digit Is a One

Grab a pile of real-world numbers — river lengths, populations, the dollar amounts on a company's invoices — and look only at the first digit of each. You would guess every digit 1–9 shows up about a ninth of the time. It doesn't. A 1 leads about 30% of the time; a 9, only 4.6%. This is Benford's Law, and it is not a curiosity — it is a fact about what "spread across many scales" does to digits. Below, grow the numbers yourself and watch it happen.

In 1881 the astronomer Simon Newcomb opened a short note with an observation anyone who used logarithm tables already half-knew: "That the ten digits do not occur with equal frequency must be evident to anyone making much use of logarithmic tables, and noticing how much faster the first pages wear out than the last ones." The pages beginning with 1 were thumbed to rags; the pages beginning with 9 stayed crisp. (The worn-pages line is Newcomb's own — it is usually misattributed to Benford, who came 57 years later.) In 1938 the physicist Frank Benford tabulated 20,229 numbers across 20 wildly different tables — river areas, atomic weights, street addresses, numbers plucked from Reader's Digest — and found the same lopsided curve every time. It now carries his name.

The curve has an exact shape. The chance the leading digit is d is P(d) = log₁₀(1 + 1/d):

digit123456789
P(d)30.1%17.6%12.5%9.7%7.9%6.7%5.8%5.1%4.6%

Those aren't fitted numbers. For some sequences the law is provably exact in the limit — and you can grow those sequences right here and watch their first digits fall onto the curve.

Instrument I — grow a sequence, census its first digits

Naturally Benford (spans many scales) NOT Benford (the honest boundary)
gold bars = your data · blue line = Benford
numbers counted0
share leading with 1
distance from Benford

Powers of 2: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, … Each first digit is read from the exact value (arbitrary-precision), not an approximation. Press grow and watch the bars climb toward the blue curve.

Two, four, eight, sixteen, thirty-two… the powers of two spend a long time in the 1s (16, 128, 1024…) and the 2s before they ever reach an 8 or 9. Fibonacci and the factorials do the same. This is provable: the first digit of 2ⁿ is decided by the fractional part of n·log₁₀2, and because log₁₀2 is irrational, those fractional parts spread perfectly evenly across the interval [0,1) as n grows (Weyl's equidistribution theorem; the rigorous account is Diaconis 1977). Perfectly-even on that interval is Benford — and here's why.

Instrument II — the reason: a ruler marked in logarithms

Lay the numbers 1→10 on a ruler spaced by logarithm, the way a slide rule is marked. The stretch of ruler that starts with a 1 (from 1 to 2) is the widest band; the stretch starting with 9 (from 9 to 10) is a sliver. Their widths are exactly the Benford percentages. Drop points anywhere on the ruler, evenly — and each band catches numbers in proportion to its width.

points dropped0
landed on a "1"
unit factor×1.00

A number is, for this purpose, just its position on this ruler — the fractional part of its logarithm, its mantissa. Data that ranges over many powers of ten wraps around the ruler again and again and lands evenly; an even scatter fills the wide "1" band 30% of the time. Now the deep part: slide the units — change dollars to euros, metres to feet. Every point shifts by the same amount, the whole scatter rotates around the ruler… and the tally doesn't budge. Benford is the only digit law that survives a change of units — that uniqueness is Pinkham's 1961 characterization (made fully rigorous as a statement about the mantissa by Hill 1995). A law of nature can't care whether you measured in miles or kilometres, so if there is any universal first-digit law, it has to be this one.

Where the law breaks — and why that matters

Benford is not universal, and the honest half of the story is knowing when it fails. Flip Instrument I to the red buttons. Adult heights in centimetres all sit between about 150 and 199 — so almost every one leads with a 1, an even more lopsided pile than Benford. Whole numbers 1–100, drawn evenly, give each digit roughly a ninth. The rule of thumb: Benford needs data that spans several orders of magnitude and is smooth on the log ruler. Single-scale data (heights, IQ scores, shoe sizes), uniform data, and assigned numbers (phone numbers, ZIP codes, account IDs — numbers that are labels, not measurements) do not obey it.

That boundary is exactly where Benford gets misused. Forensic accountants really do use it — fabricated ledgers, invented tax figures and padded expense claims often have too-even leading digits, because a person inventing "natural-looking" numbers unconsciously spreads them flat (try the made-up "data" button — it won't touch the curve). Mark Nigrini built a career on this. But a deviation from Benford is a reason to look closer — never, by itself, proof of fraud.

The misuse worth naming: "Benford proves the election was stolen"

After the 2020 U.S. election, viral posts claimed a candidate's precinct vote counts violated Benford's Law and so must be fraudulent. They don't, and it isn't. Precinct vote counts are close to single-scale — precincts are built to similar sizes, so a candidate's totals cluster around a mean and pile up on a few leading digits. They aren't supposed to be Benford in the first place, so "they aren't Benford" shows nothing.

This is the settled view of the people who invented election forensics. Deckert, Myagkov & Ordeshook (2011) found the first-digit test's success rate at spotting fraud "essentially equivalent to a toss of a coin… problematical at best as a forensic tool and wholly misleading at worst." Walter Mebane — who pioneered the field and defends a subtler second-digit variant — agrees the plain first-digit test is not usable for diagnosing election fraud. A tool that is right half the time is not a lie detector.

One more true thing, quietly: the second digit follows its own Benford-like law, flatter than the first — a 0 in the second place about 12.0% of the time down to a 9 at 8.5%. By the third and fourth digit the law has faded to an even 10% each. The pull toward small digits is strongest at the front and washes out as you go right.

The check — everything here is recomputed, headless, in the repo

research/benford/verify.mjs runs 24 checks, all passing. It confirms P(d)=log₁₀(1+1/d) sums to 1 and matches the table (a 1 is 6.58× a 9); grows 2ⁿ, Fibonacci and n! and measures their first digits closing on Benford (total-variation distance for 2ⁿ falls to ≈0.0000 by 100,000 terms; n! converges too, but slower); confirms the counterexamples fail (single-scale heights, uniform 1–100); demonstrates scale-invariance (2ⁿ rescaled by π, by ⅓, by the kg→lb factor stays Benford; a uniform set does not); reproduces the law from a uniform mantissa; and checks the second-digit law.

The two lead sequences are cross-checked two ways — exact big-integer arithmetic vs. the logarithm shortcut — and they agree at every one of the first 10,000 terms except 2³=8, where the shortcut's floating-point rounds one digit low at the rim. That single rim case is why this page reads every first digit from the exact integer, not from a logarithm. The moat is that you never have to take the picture's word for it.