Testing Benford's Law


Imagine a large dataset, such as a list of every country and its population.

Country       Population
Afghanistan   29,117,000
Albania       3,195,000
Algeria       35,423,000
Andorra       84,082
Angola        18,993,000
              ↑ Leading digit

Chances are, the leading digit will be a 1 more often than a 2. And 2s would probably occur more often than 3s, and so on.

This odd phenomenon is Benford's Law. If every leading digit from 1 to 9 were equally likely, each would appear about 11% of the time (1 in 9). Benford's Law instead predicts a logarithmic distribution, in which the leading digit is 1 about 30% of the time and 9 less than 5% of the time. The pattern occurs so regularly that it is even used to detect accounting fraud.
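To make the logarithmic distribution concrete, here is a minimal Python sketch of the standard Benford formula, P(d) = log10(1 + 1/d) for a leading digit d from 1 to 9. The function name `benford_expected` is chosen for illustration and is not taken from this project's code.

```python
import math


def benford_expected(digit: int) -> float:
    """Probability that the leading digit equals `digit` (1-9) under Benford's Law."""
    return math.log10(1 + 1 / digit)


# Print the expected share of each leading digit as a percentage.
for d in range(1, 10):
    print(d, f"{benford_expected(d):.1%}")
```

The nine probabilities sum to 1, and each digit is less likely than the one before it.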

See the Wikipedia article for a more thorough discussion.

This is a simple experiment to see how many large, publicly accessible datasets satisfy Benford’s Law.
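A check of this kind could look roughly like the Python sketch below. It tallies leading digits and prints them next to the Benford prediction. The helper names (`leading_digit`, `leading_digit_frequencies`) are hypothetical, not this site's actual implementation, and the sample reuses the five populations from the table above only to show the mechanics. Five values are far too few to say anything about whether a dataset fits.

```python
import math
from collections import Counter


def leading_digit(value: int) -> int:
    """Return the first digit of a non-zero integer."""
    return int(str(abs(value))[0])


def leading_digit_frequencies(values):
    """Map each digit 1-9 to the share of values that start with it."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Illustrative sample: the five populations from the table above.
populations = [29_117_000, 3_195_000, 35_423_000, 84_082, 18_993_000]
observed = leading_digit_frequencies(populations)

for d in range(1, 10):
    expected = math.log10(1 + 1 / d)  # Benford's predicted share for digit d
    print(f"{d}: observed {observed[d]:.1%}, Benford {expected:.1%}")
```

With a full dataset, the observed column should track the Benford column closely if the law holds. A goodness-of-fit statistic such as chi-squared would be one way to quantify the gap.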

This site is on GitHub. Please help out by forking the project and adding more datasets.