Fraudsters hate this one weird trick!
The mathematical law that can show when numbers aren't random.
Paying subscribers got this in January. Sick of waiting? Want to support the sort of groundbreaking nerdery those square in the mainstream media just won’t pay me for?
In 1881, an astronomer named Simon Newcomb, who was working for the US Navy, was using a book of logarithm tables when he noticed something weird. Before the invention of computers, those doing complicated bits of maths relied on such books: they were essentially a list of calculations that had already been done, and whose answers you could just plug into your work. What Newcomb noticed was that the early pages of the book – the ones on which the numbers started with the digit one or two – looked worn and grubby compared to the later pages. That suggested heavier use.
Newcomb came up with a theory which seemed, after some time trawling the data, to stand up: that in a naturally occurring set of random numbers, the odds are that the first digit of any given number will be small. After extensive research, he even put percentages on it:
Leading digit 1: 30.1%; 2: 17.6%; 3: 12.5%; 4: 9.7%; 5: 7.9%; 6: 6.7%; 7: 5.8%; 8: 5.1%; 9: 4.6%.
Half a century later, a physicist named Frank Benford noticed the same thing and, because the world is unfair, got his name attached to it. He did do the legwork of gathering 20,000 bits of data to prove it, but nonetheless.

Benford’s Law – he more modestly called it the law of anomalous numbers, but it’s also known as the first digit law, or the first or leading digit phenomenon – is not just a curious oddity, but something that’s been successfully used in court. It’s a bit of a thing to get your head around, so I’ll say it again.
Take a naturally occurring dataset containing a sufficiently large range of values – populations of countries or settlements, stock prices, river lengths, mountain heights – and there will be significantly more numbers in it beginning with lower digits than with higher ones. In fact, nearly a third of the entries (30.1%) will begin with a one, and nearly half (47.7%) with one or two. Less than a tenth (9.7%) will begin with eight or nine.
Here’s a formula for working out the percentage chance (P) that the leading digit will be any particular number (n). Don’t worry, I don’t understand it either:
P(n) = log10(1 + 1/n)
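For anyone who would rather let a computer do the logarithms, here is a minimal sketch of that formula in Python. The function and variable names are my own inventions for illustration. Note that the formula gives a fraction between 0 and 1, so the code multiplies by 100 to get a percentage.

```python
import math

def benford_probability(n: int) -> float:
    """Chance that the leading digit is n (1 to 9), as a fraction between 0 and 1."""
    if not 1 <= n <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / n)

# Percentage for each possible leading digit, rounded to one decimal place.
benford_percentages = {n: round(100 * benford_probability(n), 1) for n in range(1, 10)}

for digit, pct in benford_percentages.items():
    print(f"{digit}: {pct}%")
```

The nine probabilities add up to exactly one, because the logarithms telescope: adding them is the same as taking log10 of (2/1 × 3/2 × ... × 10/9), which is log10 of 10.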
The technical explanations for why this should be are quite frankly beyond me: there’s a lot of talk about log scales, multiplicative fluctuations, the Krieger generator theorem and the Kafri ball-and-box model theorem. It’s a long time since I did A-level maths, but I’m pretty sure none of those things came up.[1]
But it seems to be something to do with positional notation – that is, the way we write numbers. To put it simply, you have to go through all the numbers beginning with one before you get to the ones beginning with two, and through all the numbers beginning with one to eight before you get to those beginning with nine. Getting that far just doesn’t happen very often.[2]
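One way to watch this happen is to take something that keeps doubling and look at the first digit at each step. This is only a sketch of one common illustration, not the full technical explanation: the doubling sequence is a made-up example, and the names are mine.

```python
from collections import Counter

def leading_digit(value: int) -> int:
    """First digit of a positive whole number, as written in base 10."""
    return int(str(value)[0])

# Hypothetical example: a quantity that doubles 1,000 times over.
doublings = [2 ** k for k in range(1, 1001)]
digit_counts = Counter(leading_digit(v) for v in doublings)

for digit in range(1, 10):
    share = 100 * digit_counts[digit] / len(doublings)
    print(f"{digit}: {digit_counts[digit]} times ({share:.1f}%)")
```

A doubling number lands on a leading one every time it gains a digit, and only then, which is why ones come up so often; getting from five-something to ten-something takes a single doubling, so the high digits are passed through quickly.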
Benford’s Law – I feel for Newcomb every time – is not universal. It’s what is known as a phenomenological law, something which describes the world, rather than explains it, so it does not apply to every dataset. Consider the range of human heights. Measured in feet, the numbers will almost all begin with five and six; in inches, it’ll be five, six or seven; in metric, they’ll be overwhelmingly ones; and so on. In the same way, datasets containing times of the day, or the results of a roulette wheel, will be biased towards ones and twos – but not in the proportions predicted.
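The roulette example is easy to check by counting. A minimal sketch, assuming a European wheel with pockets numbered 0 to 36, and skipping zero because it has no leading digit to speak of:

```python
from collections import Counter

# Assumption: a European wheel, pockets 0 to 36. Zero is left out because
# it has no non-zero leading digit.
pockets = range(1, 37)
roulette_counts = Counter(int(str(p)[0]) for p in pockets)

for digit in range(1, 10):
    share = 100 * roulette_counts[digit] / len(pockets)
    print(f"{digit}: {roulette_counts[digit]} pockets ({share:.1f}%)")
```

Ones and twos each account for 11 of the 36 pockets, or about 30.6%. That looks fine for ones, but the law predicts only 17.6% for twos, and the digits from four to nine get a single pocket each.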
To actually fit the rule, a dataset needs to meet certain conditions. It needs randomness – numbers which are naturally occurring; measured, not chosen. It can’t follow a normal distribution (which means clustering) or be artificially bounded (the problem with the exceptions listed above). It also fits better with larger datasets – especially those which cover several orders of magnitude (more sets of ones to get through before you hit the higher numbers). Even when a dataset meets all these conditions, its leading digits will only tend towards the percentages listed above. The bigger the dataset, the closer it will get.[3]
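Here is a rough sketch of those conditions at work, using two entirely made-up datasets: one spread evenly (on a log scale) across six orders of magnitude, and one clustered like adult heights in centimetres. The figures, the seed and the helper name are all assumptions for the sake of illustration.

```python
import random
from collections import Counter

def leading_digit_shares(values):
    """Percentage of positive values starting with each digit from 1 to 9."""
    # Scientific notation ("1.234560e+03") puts the leading digit first.
    counts = Counter(int(f"{v:e}"[0]) for v in values if v > 0)
    total = sum(counts.values())
    return {d: 100 * counts[d] / total for d in range(1, 10)}

rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up data covering six orders of magnitude, from 1 to 1,000,000.
wide = [10 ** rng.uniform(0, 6) for _ in range(100_000)]
# Made-up heights in centimetres: normally distributed around 170.
heights = [rng.gauss(170, 10) for _ in range(100_000)]

wide_shares = leading_digit_shares(wide)
height_shares = leading_digit_shares(heights)

for d in range(1, 10):
    print(f"{d}: wide {wide_shares[d]:.1f}%, heights {height_shares[d]:.1f}%")
```

The wide-ranging set should land close to the predicted percentages without matching them exactly, while nearly every height begins with a one.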
Here comes the clever part. In the sorts of data that do follow Benford’s Law, it can be used to spot when numbers did not occur naturally at all. It’s been used to convict fraudsters, by showing that the numbers were very likely to be man-made. It’s been used to identify Russian bot networks (because it also applies to follower or like counts), to show that elections were rigged (in Iran) or uncover manipulated economic data (when Greece was trying to join the euro). In none of these cases is it proof – but it is a red flag. Numbers don’t lie, even when people do.
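To give a flavour of how such a red flag might be raised, here is a sketch using a chi-squared score, which is one common way of measuring how far a set of counts strays from what was expected. It is not the method used in any of the cases above, as far as I know. The two datasets are invented, and 15.507 is the standard 5% cut-off for a chi-squared test with eight degrees of freedom (nine digits, minus one).

```python
import math
import random
from collections import Counter

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
CRITICAL_5_PERCENT = 15.507  # chi-squared cut-off, 8 degrees of freedom

def benford_chi_squared(values):
    """Chi-squared distance between observed leading digits and Benford's Law."""
    counts = Counter(int(f"{v:e}"[0]) for v in values if v > 0)
    total = sum(counts.values())
    return sum(
        (counts[d] - total * p) ** 2 / (total * p)
        for d, p in BENFORD.items()
    )

rng = random.Random(1)  # fixed seed so the run is repeatable

# Made-up "natural" amounts spread across four orders of magnitude.
natural = [10 ** rng.uniform(1, 5) for _ in range(2_000)]
# Made-up "invented" amounts: figures picked evenly between 100 and 999.
invented = [rng.randint(100, 999) for _ in range(2_000)]

natural_score = benford_chi_squared(natural)
invented_score = benford_chi_squared(invented)

print(f"natural:  {natural_score:.1f}")
print(f"invented: {invented_score:.1f} (red flag above {CRITICAL_5_PERCENT})")
```

A score above the cut-off only says the digits look unlike Benford’s Law. Honest data will cross it about one time in twenty by chance, which is why this is a prompt to look closer rather than a verdict.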
Simon Newcomb, incidentally, may not have got his name on the law he discovered (even if it is, on rare occasions, described as the Newcomb-Benford Law), but he did get his name on something else. Between 1943 and 1950, to commemorate his stint as director of the US Navy’s Nautical Almanac Office, the USS Simon Newcomb toured the Pacific looking for mines. So who’s to say who the real winner is here?
Also, why not order 31 Inventions That Built Our Word, the book I’ve been working so hard to finish? It’s a history of all the things that gave us urban life (streetcars, sewers, skyscrapers et al.). Simon Newcomb comes up in that, too (that chapter’s about air conditioning). It’s out in August, and pre-orders really help in terms of visibility and later sales.
[1] It’s at least possible it’s because I did mechanics rather than statistics, mind.
[2] The specific probabilities are also a result of the fact we count in base 10. A 2016 paper noted that, if we used binary numbers, the probability of the first digit being 1 would be... 100%.
[3] Two other things I’m sticking in the footnote so as not to overload everyone with maths on which I have a fairly loose grasp. One is that, if a dataset does meet these conditions, it will fit even if you switch the units of measurement. You can measure mountain heights in metres or feet or number of apples, and it will still apply, because you’re effectively just multiplying your data by a particular number. It’s what is known as “scale invariant”.
The other is something interesting from this piece in Scientific American. “If a natural phenomenon arises from the product of several independent sources, then only one of those sources must accord with Benford’s law for the overall result to. Benford’s law is cannibalistic, much in the same way that a single zero in a bunch of numbers being multiplied together makes the result zero.” I don’t think “cannibalistic” here is a technical term – please let me know if I’m wrong. “Contagion” feels like a better metaphor.
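On the first of those two points, scale invariance can be sketched by taking some made-up measurements in metres, converting them to feet, and comparing the leading digits before and after. The data, the seed and the helper name are assumptions of mine; only the conversion factor is real.

```python
import random
from collections import Counter

METRES_TO_FEET = 3.28084

def leading_digit_shares(values):
    """Percentage of positive values starting with each digit from 1 to 9."""
    # Scientific notation ("1.234560e+03") puts the leading digit first.
    counts = Counter(int(f"{v:e}"[0]) for v in values if v > 0)
    total = sum(counts.values())
    return {d: 100 * counts[d] / total for d in range(1, 10)}

rng = random.Random(2)  # fixed seed so the run is repeatable

# Made-up measurements in metres, covering six orders of magnitude.
in_metres = [10 ** rng.uniform(0, 6) for _ in range(100_000)]
in_feet = [m * METRES_TO_FEET for m in in_metres]

shares_metres = leading_digit_shares(in_metres)
shares_feet = leading_digit_shares(in_feet)

for d in range(1, 10):
    print(f"{d}: {shares_metres[d]:.1f}% in metres, {shares_feet[d]:.1f}% in feet")
```

Individual numbers change their first digit when converted, but the overall proportions should stay within a fraction of a percentage point of each other.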
