Benford's law
Given a finite set of numerical data, it would be expected that the digits 1 through 9 would occur as first digits with roughly equal probability. However, in many cases, the digit 1 occurs as the first digit most often, with roughly 30% probability, and the other digits occur with decreasing frequency.
This is called Benford's law, after Frank Benford, who noticed a curious pattern of usage in logarithm tables. Benford went on to also notice this unexpected disparity in many different sets of data, including atomic weights, baseball statistics and street addresses.
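The text above gives only the rough 30% figure for the digit 1. The usual statement of Benford's law, which is not spelled out in this article, is that a leading digit d occurs with probability log10(1 + 1/d). A minimal sketch of that formula follows; the function name `benford_probability` is chosen here purely for illustration.

```python
import math


def benford_probability(d: int) -> float:
    """Probability of leading digit d under the usual statement of Benford's law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


if __name__ == "__main__":
    # Print the expected share of each leading digit as a percentage.
    for d in range(1, 10):
        print(d, f"{100 * benford_probability(d):.1f}%")
```

Under this formula the digit 1 gets about 30.1% and the digit 9 about 4.6%, and the nine probabilities sum to 1.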
The effects of Benford's law can be observed even in those cases that might not necessarily reflect a difference in usage between first digits and other digits of a number. Presumably, building contractors know about Benford's law and know whether it applies to the addresses of the properties they're building. Hardware stores, on the other hand, might not know for what addresses replacement house numerals are being bought, and a given numeral can just as easily be used for the last digit of a house address as it can be for the first. It is still nevertheless a good idea to stock more 1s and 2s than other numerals.
Prime numbers and Benford's law
Note: The prime numbers do not satisfy Benford's law (Daniel I. A. Cohen and Talbot M. Katz, "Prime numbers and the first digit phenomenon," J. Number Theory 18 (1984), 261-268; A. Berger and T. P. Hill, What is Benford's Law?, Notices, Amer. Math. Soc., 64:2 (2017), 132-134.)
Street addresses are different from prime numbers in that, even if we include all the world's addresses, we are still dealing with a finite set, while there are infinitely many primes. So, to simplify things for ourselves, in this article, we will only consider primes up to some threshold. And, so as to give the digits 2 to 9 a fair shake, we will set that threshold at a power of 10.
We see that going only up to 10, 1 as a first digit gets off to a lousy start, since we don't consider 1 prime anymore (for reasons that are beyond the scope of this article). One might be tempted to temporarily regard 1 as prime for the sake of a handicap, but just going up to 100 will show that to be unnecessary.
Range | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
Up to 10 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 |
Up to 100 | 4 | 3 | 3 | 3 | 3 | 2 | 4 | 2 | 1 |
Up to 1000 | 25 | 19 | 19 | 20 | 17 | 18 | 18 | 17 | 15 |
Up to 10000 | 160 | 146 | 139 | 139 | 131 | 135 | 125 | 127 | 127 |
Up to 10^5 | 1193 | 1129 | 1097 | 1069 | 1055 | 1013 | 1027 | 1003 | 1006 |
Up to 10^6 | 9585 | 9142 | 8960 | 8747 | 8615 | 8458 | 8435 | 8326 | 8230 |
Up to 10^7 | 80020 | 77025 | 75290 | 74114 | 72951 | 72257 | 71564 | 71038 | 70320 |
Up to 10^8 | 686048 | 664277 | 651085 | 641594 | 633932 | 628206 | 622882 | 618610 | 614821 |
Up to 10^9 | 6003530 | 5837665 | 5735086 | 5661135 | 5602768 | 5556434 | 5516130 | 5481646 | 5453140 |
Cf.: | A073517 | A073516 | A073515 | A073514 | A073513 | A073512 | A073511 | A073510 | A073509 |
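However, as we go further up, the advantage of 1 as a first digit gradually erodes: its share of the primes falls from 25 out of 168 (about 14.9%) below 1000 to about 11.8% below 10^9, not far above the 11.1% that a uniform distribution would give.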
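The smaller rows of the table can be reproduced with a short script. The following is a minimal sketch, assuming a plain sieve of Eratosthenes is fast enough for the thresholds of interest; the function names `primes_up_to` and `first_digit_counts` are illustrative and do not come from the OEIS entries.

```python
import math


def primes_up_to(limit: int) -> list[int]:
    """Return all primes p <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            # Cross out every multiple of n starting at n*n.
            multiples = range(n * n, limit + 1, n)
            is_prime[n * n :: n] = bytearray(len(multiples))
    return [n for n in range(2, limit + 1) if is_prime[n]]


def first_digit_counts(limit: int) -> list[int]:
    """Count primes <= limit by leading digit; index 0 holds the count for digit 1."""
    counts = [0] * 9
    for p in primes_up_to(limit):
        counts[int(str(p)[0]) - 1] += 1
    return counts


if __name__ == "__main__":
    for k in range(1, 6):
        print(f"Up to 10^{k}:", first_digit_counts(10 ** k))
```

A sieve like this stores one byte per integer, so the rows for 10^8 and 10^9 would call for a segmented sieve or a dedicated prime-counting library rather than this sketch.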
Another way to look at this is to find the 50th prime number that starts with a given digit. The choice of 50 is somewhat arbitrary: it was arrived at by a rough estimate of term visibility for the relevant OEIS sequence entries.
First digit | 50th prime | OEIS sequence |
1 | 1171 | A045707 |
2 | 2243 | A045708 |
3 | 3259 | A045709 |
4 | 4231 | A045710 |
5 | 5297 | A045711 |
6 | 6269 | A045712 |
7 | 7309 | A045713 |
8 | 8291 | A045714 |
9 | 9311 | A045715 |
If for these 50th primes we chop off the leading digit, we see that the resulting number for 9 is significantly higher than the one for 1.
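The lookup and the digit-chopping described above can be sketched as follows. This is an illustrative sketch using simple trial division, which is adequate for numbers of this size; the helper names `nth_prime_with_first_digit` and `drop_leading_digit` are hypothetical and not taken from the OEIS entries.

```python
from itertools import count, islice


def is_prime(n: int) -> bool:
    """Trial-division primality test, fine for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True


def primes_with_first_digit(d: int):
    """Yield, in increasing order, the primes whose decimal expansion starts with d."""
    digit = str(d)
    for n in count(2):
        if str(n)[0] == digit and is_prime(n):
            yield n


def nth_prime_with_first_digit(d: int, n: int) -> int:
    """Return the n-th prime (1-indexed) whose leading digit is d."""
    return next(islice(primes_with_first_digit(d), n - 1, None))


def drop_leading_digit(p: int) -> int:
    """Chop off the leading digit of p; a single-digit input gives 0."""
    rest = str(p)[1:]
    return int(rest) if rest else 0


if __name__ == "__main__":
    for d in range(1, 10):
        p = nth_prime_with_first_digit(d, 50)
        print(d, p, drop_leading_digit(p))
```

For the digit 1 this gives 1171 and a remainder of 171, while the table entry 9311 for the digit 9 leaves a remainder of 311.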
But some reflection upon the prime number theorem suggests that as we look at higher and higher powers of 10, the distribution of first digits will become more or less uniform, leading to the conclusion that Benford's law does not actually apply to prime numbers.
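The appeal to the prime number theorem can be made slightly more concrete with a rough heuristic. In the sketch below, \pi(x) is the prime-counting function and N_d(k) is notation introduced here (not in the cited papers) for the number of k-digit primes with leading digit d. The only input is the crude approximation that the density of primes near x is about 1/\ln x, so this is an informal estimate rather than a proof.

```latex
N_d(k) = \pi\bigl((d+1)\cdot 10^{k-1}\bigr) - \pi\bigl(d\cdot 10^{k-1}\bigr)
       \approx \frac{10^{k-1}}{\ln\bigl(d\cdot 10^{k-1}\bigr)}
       = \frac{10^{k-1}}{(k-1)\ln 10 + \ln d},
\qquad
\frac{N_1(k)}{N_9(k)} \approx 1 + \frac{\ln 9}{(k-1)\ln 10}
       \longrightarrow 1 \quad (k \to \infty).
```

Since the term \ln d is bounded while (k-1)\ln 10 grows without limit, the advantage of the digit 1 over the digit 9 among k-digit primes shrinks roughly like 1/k. This is consistent with the slow erosion visible in the first table.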
References
- Daniel I. A. Cohen and Talbot M. Katz, "Prime numbers and the first digit phenomenon," J. Number Theory 18 (1984), 261-268.