I tried this out with the files on my hard disk, and it really is true. The figure shows how often each digit occurs in the sizes of all files (measured in bytes). The points show the first and second digits of the file sizes. The line shows the theory, Benford's Law. While the second digit is (almost) uniformly distributed, the first digit follows Benford's distribution quite closely. Application: checking the plausibility of reported figures (a tax return, for example). If the numbers are made up, they usually do not follow Benford's Law.
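The file-size experiment is easy to repeat. The following is only a minimal sketch of how it might be done, not the script behind the figure: the function names and the default start directory (the current directory, unless a path is given on the command line) are my own assumptions, and empty files are skipped because a size of 0 has no leading digit.

```python
import math
import os
import sys
from collections import Counter


def benford(d):
    """Benford probability that the leading digit equals d (1..9)."""
    return math.log10(1 + 1 / d)


def leading_digit(n):
    """First decimal digit of a positive integer."""
    return int(str(n)[0])


def file_size_digits(root):
    """Count leading digits of the sizes (in bytes) of all files under root."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            if size > 0:  # empty files have no leading digit
                counts[leading_digit(size)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical default: scan the current directory unless a path is given.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    counts = file_size_digits(root)
    total = sum(counts.values())
    for d in range(1, 10):
        observed = counts[d] / total if total else 0.0
        print(f"{d}: observed {observed:.3f}   Benford {benford(d):.3f}")
```

With only a handful of files the observed frequencies will be noisy; the agreement only becomes visible over a few thousand files or more.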
log_10( 1+ (1/d) )
This implies that a number in a table of physical constants is more
likely to begin with a smaller digit than a larger digit. It was
published by Newcomb in a paper entitled "Note on the Frequency of Use
of the Different Digits in Natural Numbers", which appeared in The
American Journal of Mathematics (1881) 4, 39-40. It was re-discovered
by Benford in 1938, and he published an article called "The Law of
Anomalous Numbers" in Proc. Amer. Phil. Soc 78, pp 551-72. Several
other references can be found in Hill's article.
Just for fun, I tabulated the first-digits of the physical constants
listed in Table 2.3 of Abramowitz and Stegun's "Handbook of
Mathematical Functions":
1
1
1*
1
1
1  2
1  2
1  2*
1  2
1  2   *
1  2     4  5
1  2     4* 5
1  2     4  5* 6        9
1  2     4  5  6   *    9*
1  2  3  4  5  6  7  8  9
The "*" symbols indicate the approximate predicted number of
occurrences according to Benford's Law. Aside from the conspicuous
deficiency of 3's, that's not a bad match for just 44 data points.
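For anyone who wants the predicted counts as numbers rather than stars, here is a small sketch. The observed counts are simply read off the histogram (they add up to the 44 data points mentioned), and the variable names are mine; the expected count for a leading digit d is 44 * log10(1 + 1/d).

```python
import math

# Observed first-digit counts read off the histogram (44 constants in total).
observed = {1: 15, 2: 10, 3: 1, 4: 5, 5: 5, 6: 3, 7: 1, 8: 1, 9: 3}
n = sum(observed.values())

# Expected count for digit d under Benford's Law: n * log10(1 + 1/d).
expected = {d: n * math.log10(1 + 1 / d) for d in observed}

for d in sorted(observed):
    print(f"{d}: observed {observed[d]:2d}   expected {expected[d]:5.2f}")
```

The expected counts sum to 44 because the nine Benford probabilities sum to log10(10) = 1.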
Although there have been many lengthy "explanations" for Benford's
Law, it seems to me this is a good candidate for a "Proof Without
Words":
1---------------2---------3-------4-----5----6---7--8--9
[Okay, not entirely without words:] The underlying premise is simply that physical constants, expressed in base 10 and in more or less arbitrary units, will be somewhat evenly distributed on a logarithmic scale. This is confirmed by the fact that the exponents of these constants are fairly uniformly distributed, at least over several "decades". As a result, the probability of the leading digit being "d" clearly approaches
log(d+1) - log(d)
----------------- = log(1+(1/d))
log(10) - log(1)
where "log" signifies the common logarithm (base 10). Of course,
we COULD have chosen units for our physical constants such that the
leading digits were all 9's (for example), but evidently we have a
natural tendency to choose units so that our numbers are evenly
distributed by order of magnitude, rather than absolute value. This
may be related to our basic impressions of hearing and sight (not to
mention earthquakes), where our intuitive senses of loudness and
brightness are logarithmic.
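The premise can also be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it draws numbers whose base-10 logarithm is uniform over a whole number of decades (six by default, an arbitrary choice) and tallies the leading digits. The function name, sample size and seed are hypothetical.

```python
import math
import random
from collections import Counter


def simulate(samples=100_000, decades=6, seed=0):
    """Draw values uniformly on a log10 scale and tally their leading digits."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(samples):
        exponent = rng.uniform(0, decades)
        # The fractional part of the exponent fixes the position within a
        # decade, so 10**fraction lies in [1, 10) and its integer part is the
        # leading digit. min() guards against rounding up to exactly 10.0.
        digit = min(9, int(10 ** (exponent % 1.0)))
        counts[digit] += 1
    return {d: counts[d] / samples for d in range(1, 10)}


if __name__ == "__main__":
    freqs = simulate()
    for d in range(1, 10):
        print(f"{d}: simulated {freqs[d]:.4f}   "
              f"log10(1+1/d) = {math.log10(1 + 1 / d):.4f}")
```

Replacing the log-uniform draw with a plain uniform draw over one decade, such as rng.uniform(1, 10), should give roughly equal frequencies for all nine digits, which is the contrast the argument rests on.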
(see also "Der Spiegel", 47/1998, 16 November 1998)