Saturday, March 16, 2013

Benford's Law

Take any random quantity: maybe the population of a city, the mass of a planet, the distance to a star, or a number of Twitter followers. Now, look at the first digit of this number. What are the odds that it is a 1, or a 2, and so on?

One would think that each digit has a one in nine chance. If you are selecting the number at random, the first digit should just be a random pick between one and nine.

However, this reasoning does not work. Try finding these quantities yourself, or just go to http://testingbenfordslaw.com. Whichever of the available categories you check, you will see that the smaller the digit, the more often it appears.

Here are the approximate odds for each digit:

1: 30.1%
2: 17.6%
3: 12.5%
4: 9.7%
5: 7.9%
6: 6.7%
7: 5.8%
8: 5.1%
9: 4.6%
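If you would rather check this without looking anything up, here is a minimal sketch that counts first digits in a list of numbers. It assumes the first 1,000 powers of 2 as a stand-in data set (chosen here purely for illustration, not one of the quantities mentioned above), and the helper name leading_digit_shares is hypothetical.

```python
from collections import Counter


def leading_digit_shares(values):
    """Return {digit: percentage} for the first digit of each positive integer."""
    counts = Counter(int(str(v)[0]) for v in values)
    total = sum(counts.values())
    return {d: 100 * counts[d] / total for d in range(1, 10)}


# Assumption: powers of 2 stand in for "random quantities" here,
# because they spread across many orders of magnitude.
powers_of_two = [2 ** n for n in range(1000)]
shares = leading_digit_shares(powers_of_two)

for digit, share in shares.items():
    print(f"{digit}: {share:.1f}%")
```

The printed shares can be compared with the list above. Any other list of positive whole numbers can be passed to the same function.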

Since this is a math blog, the natural thing to do is look for a pattern in these numbers. This could probably be done on a graphing calculator, using techniques similar to those in the post on data analysis.

It is clear that a line of best fit would not be the solution. If we connected the points with a curved graph, it would look like this:


You can see that this graph gets really close to the y-axis, but it does not seem to touch it. Similarly, it gets really close to the x-axis, but doesn't touch it. Thus, the x- and y-axes would be called asymptotes of this graph.

But asymptotes can be found in rational functions, radical functions, exponential functions, logarithmic functions, hyperbolic functions, trigonometric functions... So it is hard to identify this graph based solely on the presence of asymptotes.

The function that does work with this graph is a logarithmic one. A logarithm is basically the opposite of an exponent: it tells you what power a base must be raised to in order to get a given number. For instance:

5^2 = 25
log_5(25) = 2
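The same round trip can be tried in a few lines of Python. This is only a sketch of the two lines above, using the standard math.log function with an explicit base; the variable names are just for illustration.

```python
import math

base, exponent = 5, 2

# Exponentiation: raise the base to the exponent.
power = base ** exponent

# Logarithm: ask which exponent turns the base into that result.
recovered = math.log(power, base)

print(power)
print(round(recovered, 10))
```

The result is rounded before printing because logarithms are computed in floating point.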

Just knowing that the equation is logarithmic doesn't seem to narrow it down a lot, because there are so many different types of logarithms (one for every possible base). However, there are three types which are seen the most frequently. In fact, most scientific calculators contain just these three types.

| Logarithm Type | Base | Simple Notation | Standardized Notation | Applications |
|---|---|---|---|---|
| Natural Logarithm | e (the irrational number ≈ 2.71828) | log_e(x) | ln(x) | Calculus, Statistics, Physics, Chemistry, Economics |
| Common Logarithm | 10 | log_10(x) | lg(x) | Algebra, Engineering, Geology, Spectroscopy, Music |
| Binary Logarithm | 2 | log_2(x) | lb(x) | Discrete Mathematics, Computer Science, Information Theory |
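For anyone following along on a computer instead of a calculator, the three logarithm types listed above are all available in Python's standard math module. The sketch below evaluates each one for x = 8, a value picked only for illustration.

```python
import math

x = 8.0  # illustrative input, not from the post

natural = math.log(x)    # ln(x), base e
common = math.log10(x)   # lg(x), base 10
binary = math.log2(x)    # lb(x), base 2

print(f"ln({x}) = {natural:.4f}")
print(f"lg({x}) = {common:.4f}")
print(f"lb({x}) = {binary:.4f}")

# Any base can be reached from another by dividing two logarithms.
common_from_natural = math.log(x) / math.log(10)
print(f"ln({x}) / ln(10) = {common_from_natural:.4f}")
```

The last line shows why three types are enough in practice: a logarithm in one base converts to any other base with a single division.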

If we confine the possible equations to these three types, that narrows things down a lot, and the assumption turns out to be correct. The equation that the graph fits is:

y = lg(1 + 1/x)

(Remember that x is the starting digit and y is the chance for that digit, written as a decimal. For example, x = 1 gives y = lg(2) ≈ 0.301, which is 30.1%.)

I was pretty surprised by this equation, but all of the percentages listed above come out of it.
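To check that claim, here is a minimal sketch that plugs each digit into y = lg(1 + 1/x) and compares the result with the listed percentages. The function name benford_percentage is hypothetical, and the dictionary simply copies the values from the list above.

```python
import math


def benford_percentage(digit):
    """Percentage chance that a number starts with `digit` (1 through 9)."""
    return 100 * math.log10(1 + 1 / digit)


# Percentages copied from the list earlier in the post.
listed = {1: 30.1, 2: 17.6, 3: 12.5, 4: 9.7, 5: 7.9,
          6: 6.7, 7: 5.8, 8: 5.1, 9: 4.6}

for digit, expected in listed.items():
    computed = benford_percentage(digit)
    print(f"{digit}: computed {computed:.1f}%, listed {expected}%")

# The nine chances should add up to 100%.
total = sum(benford_percentage(d) for d in range(1, 10))
print(f"total: {total:.1f}%")
```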

Benford's Law can be used in biology, accounting, law, economics, etc. However, a more fun way to use it is to turn it into a game.

Tell someone that you will get the digits 1, 2, and 3 and they will get the digits 5, 6, 7, 8, and 9 (and nobody gets 4). Then have them come up with random quantities that they wouldn't know, and look up each number (you can use Google or WolframAlpha for this). Every time the first digit is a 1, 2, or 3, you win a point, and every time it is a 5, 6, 7, 8, or 9, they get a point. With five digits to your three, they think they have close to a 2:1 advantage over you, but you are really the one with the 2:1 advantage: your digits come up about 60.2% of the time and theirs only about 30.1%.
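The odds in this game can be worked out directly from the equation y = lg(1 + 1/x). The sketch below adds up the chances for each player's digits. The function and variable names are hypothetical, picked for this example.

```python
import math


def first_digit_probability(digit):
    """Chance (as a decimal) that a number starts with `digit`."""
    return math.log10(1 + 1 / digit)


your_digits = (1, 2, 3)
their_digits = (5, 6, 7, 8, 9)

your_chance = sum(first_digit_probability(d) for d in your_digits)
their_chance = sum(first_digit_probability(d) for d in their_digits)

print(f"You score on {100 * your_chance:.1f}% of numbers")
print(f"They score on {100 * their_chance:.1f}% of numbers")
print(f"Your advantage: {your_chance / their_chance:.2f} to 1")
```

The remaining numbers start with 4, so nobody scores on them.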

This game has also been played on the show Scam School.
