The special trick that helps identify dodgy stats
Why I love stats:
Government figures are subjected to various audits already, of course, but alongside checking that things marry up with one another, forensic statisticians also have ways of spotting suspicious patterns in the raw numbers, and thus estimating the chances that figures from a set of accounts have been tampered with. One of the cleverest tools is something called Benford’s law.
Imagine you have data on, say, the population of every world nation. Now, take only the “leading digit” from each number: the first digit of the number, if you like. For the UK population, which was 61,838,154 in 2009, that leading digit would be “six”. Andorra’s was 85,168, so that’s “eight”. And so on.
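If you wanted to do this step on a computer, a minimal sketch might look like the following. The function name `leading_digit` is made up for illustration, and it assumes the figures are whole numbers.

```python
def leading_digit(n: int) -> int:
    """Return the first (most significant) digit of a non-zero whole number."""
    n = abs(n)
    if n == 0:
        raise ValueError("zero has no leading digit")
    # Strip digits off the right-hand end until only one is left.
    while n >= 10:
        n //= 10
    return n


print(leading_digit(61_838_154))  # UK population, 2009 -> 6
print(leading_digit(85_168))      # Andorra -> 8
```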
If you take all those leading digits, from all the countries, then overall, you might naively expect to see the same number of ones, fours, nines, and so on. But in fact, for naturally occurring data, you get more ones than twos, more twos than threes, and so on, all the way down to nine. This is Benford’s law: the distribution of leading digits follows a logarithmic distribution, so you get a “one” most commonly, appearing as first digit around 30% of the time, and a “nine” as first digit a little under 5% of the time.
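The “logarithmic distribution” here is usually written as P(d) = log10(1 + 1/d), where d is the leading digit from one to nine. As a rough sketch (the function name `benford_probability` is just an illustrative choice), you can print the expected share for each digit like this:

```python
import math


def benford_probability(d: int) -> float:
    """Expected share of numbers whose leading digit is d, under Benford's law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Print the expected percentage for each possible leading digit.
for d in range(1, 10):
    print(d, f"{benford_probability(d):.1%}")
```

Running it gives roughly 30.1% for a leading one and 4.6% for a leading nine, with the nine shares adding up to 100%.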
Why I don’t always love stats:
It doesn’t work everywhere: the law only holds when you’re examining groups of numbers that span several orders of magnitude, for example. So, for age, in years, of the graduate working population, which goes from around 20 to 70, it wouldn’t be much good, but for personal savings, from nothing to millions, it should be fine. And of course, Benford’s law works in other counting systems, so if three-fingered sloths ever develop numeracy, and count in base-6, or maybe base-12, the law would still hold.
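For the sloths, the usual generalisation swaps the base-10 logarithm for a logarithm in whatever base you count in: P(d) = log_b(1 + 1/d), for leading digits d from 1 up to b − 1. A hedged sketch, with the illustrative name `benford_probability_in_base`:

```python
import math


def benford_probability_in_base(d: int, base: int) -> float:
    """Expected share of leading digit d when numbers are written in the given base."""
    if base < 2:
        raise ValueError("base must be at least 2")
    if not 1 <= d <= base - 1:
        raise ValueError("leading digit must be between 1 and base - 1")
    return math.log(1 + 1 / d, base)


# Expected leading-digit shares for a base-6 counting system.
for d in range(1, 6):
    print(d, f"{benford_probability_in_base(d, 6):.1%}")
```

The shape is the same as in base 10: the smallest digit is the most common, and the shares fall away from there.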