<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Mec2009summer&#039;s Blog</title>
	<atom:link href="http://mec2009summer.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://mec2009summer.wordpress.com</link>
	<description>Just another WordPress.com weblog</description>
	<lastBuildDate>Mon, 10 Aug 2009 16:11:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='mec2009summer.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Mec2009summer&#039;s Blog</title>
		<link>http://mec2009summer.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://mec2009summer.wordpress.com/osd.xml" title="Mec2009summer&#039;s Blog" />
	<atom:link rel='hub' href='http://mec2009summer.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Benford&#8217;s Law</title>
		<link>http://mec2009summer.wordpress.com/2009/08/06/benfords-law-2/</link>
		<comments>http://mec2009summer.wordpress.com/2009/08/06/benfords-law-2/#comments</comments>
		<pubDate>Thu, 06 Aug 2009 14:00:42 +0000</pubDate>
		<dc:creator>mec2009summer</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://mec2009summer.wordpress.com/?p=112</guid>
		<description><![CDATA[This article explores a pattern that occurs in a wide variety of data sets, such as the sizes of counties and populations, physical constants like densities and molecular masses, stock prices, and even the numbers that appear in a newspaper. We want to study the distribution of the left-most nonzero digit (ranging from 1 to [...]]]></description>
			<content:encoded><![CDATA[<p>This article explores a pattern that occurs in a wide variety of data sets, such as the sizes of counties and populations, physical constants like densities and molecular masses, stock prices, and even the numbers that appear in a newspaper.  We want to study the distribution of the left-most nonzero digit (ranging from 1 to 9), also called the leading digit.  At first thought, each value seems equally likely (probability 1/9).  But it turns out that 1 is much more likely to be the leading digit than 9.  The exact probabilities are shown in the chart below.</p>
<div class="wp-caption alignnone" style="width: 468px"><a href="http://plus.maths.org.uk/issue9/features/benford/index-gifd.html"><img title="Benfords law" src="http://plus.maths.org.uk/issue9/features/benford/compPlot1a.gif" alt="Probability of having leading digit 1 to 9 assuming exponential growth" width="458" height="329" /></a><p class="wp-caption-text">Probability of having leading digit 1 to 9 assuming exponential growth</p></div>
<p>Several possible explanations include:</p>
<ol>
<li><strong>Upper bound</strong>: How would you obtain data with leading digits evenly distributed among 1 to 9?  One way is to use a random number generator from 1 to 99.  This requires us to specify an upper bound, but naturally occurring data does not have such a bound (the lower bound can always be taken to be 0, since we are concerned with positive numbers only).  Moreover, the upper bound has to be 9 or 99 or 999 &#8230; for the digits 1 to 9 to have the same probability; if the bounds are 1 and 19, then the leading digit is 1 more often than not (11 of the 19 numbers).<br />
With this explanation, it makes more sense that certain examples, such as numbers in the newspaper, follow this distribution.  To understand how to get the exact probabilities, try the following <a href="http://mec2009summer.wordpress.com/2009/08/06/upper-bound/">Exercise</a>.</li>
<li><strong>Exponential</strong>: Population over time is roughly exponential.  Suppose it starts at 1 and doubles every year (<img src='http://s0.wp.com/latex.php?latex=P%28t%29%3D+2%5Et&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(t)= 2^t' title='P(t)= 2^t' class='latex' />); then it takes 1 year to go from 1 to 2.  To reach 3, solve <img src='http://s0.wp.com/latex.php?latex=2%5Et%3D3&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^t=3' title='2^t=3' class='latex' /> with a calculator to get <img src='http://s0.wp.com/latex.php?latex=t%3D%5Cfrac%7B%5Clog+3%7D%7B%5Clog+2%7D%5Capprox+1.585&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t=&#92;frac{&#92;log 3}{&#92;log 2}&#92;approx 1.585' title='t=&#92;frac{&#92;log 3}{&#92;log 2}&#92;approx 1.585' class='latex' />, so it takes only 0.585 years to go from 2 to 3.  The time spent on each successive leading digit keeps decreasing because of exponential growth.  But it jumps back up for the time spent between 10 and 20, because the increment grows tenfold when we pass from the units digit to the tens digit.<span style="text-decoration:underline;"><br />
Exercise 2</span>: Calculate the time spent going from 10 to 20.  Using properties of logarithms, what can you conclude?  From this you can see that the fraction of time spent in the ranges 1 to 2, 10 to 20, 100 to 200, &#8230; out of the total time is just the fraction of the time from 1 to 10 that is spent between 1 and 2.  Using properties of logarithms (base 10 throughout), the answer is <img src='http://s0.wp.com/latex.php?latex=%5Clog+2+%5Capprox+.301&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log 2 &#92;approx .301' title='&#92;log 2 &#92;approx .301' class='latex' />.  This is exactly the value for digit 1 in the graph above.  With this calculation we obtain the <em>log distribution</em>: the probability that the leading digit is d <img src='http://s0.wp.com/latex.php?latex=%3D%5Clog%28d%2B1%29-%5Clog%28d%29%3D%5Clog+%281%2B%5Cfrac%7B1%7D%7Bd%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='=&#92;log(d+1)-&#92;log(d)=&#92;log (1+&#92;frac{1}{d})' title='=&#92;log(d+1)-&#92;log(d)=&#92;log (1+&#92;frac{1}{d})' class='latex' />.<br />
<strong>Definition</strong>: A sequence of numbers <img src='http://s0.wp.com/latex.php?latex=%5C%7B+a_n%5C%7D+&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{ a_n&#92;} ' title='&#92;{ a_n&#92;} ' class='latex' /> is <em>Benford</em> if the distribution of the leading digits of its first n terms approaches the log distribution as n approaches infinity.<br />
It is sometimes simpler to work with the probability that the leading digit is d or less, because it is simply <img src='http://s0.wp.com/latex.php?latex=%5Clog%28d%2B1%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log(d+1)' title='&#92;log(d+1)' class='latex' />.<br />
To understand why many phenomena satisfy Benford&#8217;s law without being exponential, we need to explore Exercise 2 more rigorously.</li>
<li><strong>Multiplicative (Geometric)</strong>:  It turns out that many phenomena that are multiplicative in nature can be shown to satisfy Benford&#8217;s law.  The proof is similar to the one for exponential growth.  Using <img src='http://s0.wp.com/latex.php?latex=%5Clog+ab+%3D+%5Clog+a+%2B%5Clog+b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log ab = &#92;log a +&#92;log b' title='&#92;log ab = &#92;log a +&#92;log b' class='latex' />, we can reduce multiplicative processes to additive ones, which are usually easier to handle.  For example, the stock market can be modeled by multiplying its value by 2 or by 1/2, each with probability .5, every year (the values given are arbitrary and certainly inaccurate).  If we take logs, this becomes adding log 2 or &#8722;log 2, each with probability .5, which is like flipping a coin and keeping track of the number of heads minus the number of tails after some time.  The additive process is called a random walk and can be modeled by a bell-shaped curve by the <a href="http://en.wikipedia.org/wiki/Central_limit_theorem">Central Limit Theorem</a>; the multiplicative process is a discrete version of geometric Brownian motion and satisfies Benford&#8217;s law.
<p><div class="wp-caption alignnone" style="width: 468px"><img title="Benfords law" src="http://tbn1.google.com/images?q=tbn:mHaJFjeMfgpZ5M:http://www.math.vt.edu/people/day/class_home/5725/ew.GIF" alt="Geometric Brownian Simulation" width="400" height="200" /><p class="wp-caption-text">Geometric Brownian Motion Simulation</p></div><br />
<div class="wp-caption alignnone" style="width: 468px"><img title="Benfords law" src="http://tbn1.google.com/images?q=tbn:OjDNwVwbvXoXkM:http://web2.uwindsor.ca/courses/physics/high_schools/2005/Brownian_motion/1dbm.jpg" alt="Random Walk Simulation" width="400" height="200" /><p class="wp-caption-text">Random Walk Simulation</p></div></p>
<li><strong>Universal</strong>: Forget about the previous explanations.  If there is a law about the leading digits of physical data, then it shouldn&#8217;t depend on the choice of unit; if it works for stock prices in US dollars, it should also work for prices in euros or British pounds.  This is called <em>scale-invariance</em>.  With this idea in mind, we will see that the log distribution satisfies this property.  More importantly, it is the <em>only</em> distribution that satisfies this property.<br />
To define scale-invariance, start with a sequence <img src='http://s0.wp.com/latex.php?latex=%5C%7Ba_n%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{a_n&#92;}' title='&#92;{a_n&#92;}' class='latex' /> and let D(d) denote the probability that a number from the sequence has leading digit less than d.  The sequence is scale-invariant if D(d) is the same for the sequence <img src='http://s0.wp.com/latex.php?latex=%5C%7Bc+a_n%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{c a_n&#92;}' title='&#92;{c a_n&#92;}' class='latex' />, for any c&gt;0.<br />
Similarly, we can change the number system: instead of base 10, use base 8.  We might hope that this again singles out the log distribution as the only one that works.  It does not quite, because the sequence {1, 1, 1, &#8230;} looks the same no matter what base we are in, so the distribution that puts probability 1 on the digit 1 and probability 0 on everything else also survives a change of base.  But that&#8217;s the only problem.</li>
</ol>
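<p>The exponential and universal explanations can be checked numerically.  The following is only a rough Python sketch: it tabulates the leading digits of the powers of 2 and of the same numbers multiplied by 3 (a change of "unit"), next to the log distribution.  The number of terms (2000) and the factor 3 are arbitrary choices made for this illustration.</p>

```python
from collections import Counter
from math import log10

def leading_digit(k):
    """Left-most digit of a positive integer."""
    return int(str(k)[0])

def digit_frequencies(values):
    """Map each leading digit 1..9 to its relative frequency in values."""
    values = list(values)
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

N = 2000                                    # number of terms (arbitrary choice)
powers = [2 ** n for n in range(1, N + 1)]  # exponential growth, doubling each step
rescaled = [3 * p for p in powers]          # the same data in a different "unit"

freq = digit_frequencies(powers)
freq_rescaled = digit_frequencies(rescaled)

# columns: digit, log distribution, powers of 2, rescaled powers of 2
for d in range(1, 10):
    print(d, round(log10(1 + 1 / d), 3), round(freq[d], 3), round(freq_rescaled[d], 3))
```

<p>If scale-invariance holds, the last two columns should both stay close to the second one.</p>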
]]></content:encoded>
			<wfw:commentRss>http://mec2009summer.wordpress.com/2009/08/06/benfords-law-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f3865dec625003838a3280c1b4808664?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mec2009summer</media:title>
		</media:content>

		<media:content url="http://plus.maths.org.uk/issue9/features/benford/compPlot1a.gif" medium="image">
			<media:title type="html">Benfords law</media:title>
		</media:content>

		<media:content url="http://tbn1.google.com/images?q=tbn:mHaJFjeMfgpZ5M:http://www.math.vt.edu/people/day/class_home/5725/ew.GIF" medium="image">
			<media:title type="html">Benfords law</media:title>
		</media:content>

		<media:content url="http://tbn1.google.com/images?q=tbn:OjDNwVwbvXoXkM:http://web2.uwindsor.ca/courses/physics/high_schools/2005/Brownian_motion/1dbm.jpg" medium="image">
			<media:title type="html">Benfords law</media:title>
		</media:content>
	</item>
		<item>
		<title>1. Upper Bound</title>
		<link>http://mec2009summer.wordpress.com/2009/08/06/1-upper-bound/</link>
		<comments>http://mec2009summer.wordpress.com/2009/08/06/1-upper-bound/#comments</comments>
		<pubDate>Thu, 06 Aug 2009 13:58:20 +0000</pubDate>
		<dc:creator>mec2009summer</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://mec2009summer.wordpress.com/?p=109</guid>
		<description><![CDATA[Exercise 1: Calculate the probability that the leading digit is 1 for a number chosen at random from 1 to 19. How about 1 to 29, 1 to 39, etc.? So how should we solve the problem of having to specify a bound? You may want to try other ranges, such as 1 to 199. Then you [...]]]></description>
			<content:encoded><![CDATA[<p><span style="text-decoration:underline;">Exercise 1</span>: Calculate the probability that the leading digit is 1 for a number chosen at random from 1 to 19. How about 1 to 29, 1 to 39, etc.? So how should we solve the problem of having to specify a bound?</p>
<p>You may want to try other ranges, such as 1 to 199.  You will see that the probability keeps oscillating (between what?) and never approaches a single number, so some averaging process is needed to get a single number.  The key is to make the answer as independent of the bound as possible.</p>
<p>Answer: To get rid of having to specify a bound, we can look at what happens as the bound gets bigger. For the bounds 19, 29 and 39 in the exercise, the probabilities are 11/19, 11/29 and 11/39, and as the bound grows the probability oscillates between 1/9 and 5/9 without ever approaching a single number. The next attempt is to average the probability over the upper bounds 1 to 99: (1/1+1/2+&#8230;+1/9+2/10+3/11+&#8230;+10/18+11/19+11/20+&#8230;+11/99)/99=0.253. (This is tedious to compute on a calculator, so as a first try I computed (11/19+11/29+&#8230;+11/99)/9, which is about 0.243.) If we do this with larger bounds of the same kind (999, 9999, &#8230;), the average approaches a single number around 24.1%. This is not the desired answer (30.1%), but we are not far away. As explained in the literature, averaging once is not enough to get rid of the effect of having to specify the upper bound. If we take the values from the first average and average them again, the result gets closer. To eliminate the upper-bound effect completely, we have to average the averages repeatedly, and this yields the answer, although we will not actually compute this limit here.  For those interested, the result is due to Flehinger, and one method of computing it is presented <a href="http://www.google.com/search?q=%22From+uniform+distribution+to+Benford%27s+law%22&amp;ie=utf-8&amp;oe=utf-8&amp;aq=t&amp;rls=org.mozilla:en-US:official&amp;client=firefox-a">here</a> in one of the .ps files.</p>
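<p>The averages in the answer are tedious by hand but easy to check by machine.  Here is a minimal Python sketch, using exact fractions; the helper names <code>prob_leading_one</code> and <code>averaged_prob</code> are made up for this illustration.</p>

```python
from fractions import Fraction

def prob_leading_one(bound):
    """Exact probability that a uniform integer in 1..bound starts with 1."""
    hits = sum(1 for k in range(1, bound + 1) if str(k)[0] == "1")
    return Fraction(hits, bound)

def averaged_prob(max_bound):
    """Average of prob_leading_one over all upper bounds 1..max_bound."""
    total = sum(prob_leading_one(b) for b in range(1, max_bound + 1))
    return total / max_bound

# probabilities for the bounds 19, 29, 39 from the exercise
print(prob_leading_one(19), prob_leading_one(29), prob_leading_one(39))

# first average over every upper bound from 1 to 99
print(float(averaged_prob(99)))

# the cruder first try: only the bounds 19, 29, ..., 99
coarse = sum(prob_leading_one(b) for b in range(19, 100, 10)) / 9
print(float(coarse))
```

<p>Calling <code>averaged_prob</code> with 999 or 9999 shows how the first average behaves for larger bounds of the same kind.</p>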
<p>The iterated averaging scheme is only one of many possible methods; what matters is how well a method gets rid of the effect of choosing an upper bound. For example, our first idea was to use a random number generator. Instead of fixing its upper bound, we can use a second random number generator, say with upper bound 900,000, to choose the upper bound for the first generator. Or, instead of specifying the upper bound of the second generator, we can use a third generator to choose it. Of course we eventually need to fix some upper bound, but this method works very well after chaining a few generators. In fact we can use other distributions (a random number generator is a uniform distribution), such as the normal distribution (which has parameters to be specified as well). This explains why Benford&#8217;s Law seems to work especially well if you combine data from different sources (such as numbers in the newspaper).</p>
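<p>The chained generators are easy to simulate.  The sketch below is one possible reading of the idea, not a definitive implementation: each draw becomes the upper bound of the next draw, starting from 900,000.  The chain length (4), the sample size and the seed are arbitrary assumptions.</p>

```python
import random
from math import log10

def chained_sample(rng, top_bound=900_000, depth=4):
    """Draw a number whose upper bound was itself drawn at random, depth times."""
    bound = top_bound
    for _ in range(depth):
        bound = rng.randint(1, bound)   # each draw becomes the next upper bound
    return bound

rng = random.Random(2009)               # fixed seed so the run is repeatable
samples = [chained_sample(rng) for _ in range(20_000)]
share_of_ones = sum(1 for s in samples if str(s)[0] == "1") / len(samples)

# observed share of leading digit 1 next to log10(2)
print(round(share_of_ones, 3), "vs", round(log10(2), 3))
```

<p>Setting <code>depth=1</code> gives a single generator with a fixed bound, which is a useful comparison.</p>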
<p>For a very detailed exposition of everything, <a href="http://arxiv.org/ftp/math/papers/0612/0612627.pdf">click here</a></p>
]]></content:encoded>
			<wfw:commentRss>http://mec2009summer.wordpress.com/2009/08/06/1-upper-bound/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f3865dec625003838a3280c1b4808664?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mec2009summer</media:title>
		</media:content>
	</item>
		<item>
		<title>2,3 Equidistribution</title>
		<link>http://mec2009summer.wordpress.com/2009/08/06/23-equidistribution/</link>
		<comments>http://mec2009summer.wordpress.com/2009/08/06/23-equidistribution/#comments</comments>
		<pubDate>Thu, 06 Aug 2009 13:57:04 +0000</pubDate>
		<dc:creator>mec2009summer</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://mec2009summer.wordpress.com/?p=107</guid>
		<description><![CDATA[This post goes into the details of several points: exponential growth (Exercise 2) and the contrast between multiplicative (Benford) and linear (equidistribution) behavior, leading to a way to prove that a sequence is Benford. Recall that we want to find what proportion of a sequence {a_n} satisfies d 10^m &#8804; a_n &lt; (d+1) 10^m for some integer m, where d is the leading digit we choose. Let&#8217;s go over [...]]]></description>
			<content:encoded><![CDATA[<p>This post goes into the details of several points: exponential growth (Exercise 2) and the contrast between multiplicative (Benford) and linear (equidistribution) behavior, leading to a way to prove that a sequence is Benford.</p>
<p>Recall that we want to find what proportion of a sequence <img src='http://s0.wp.com/latex.php?latex=%5C%7Ba_n%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{a_n&#92;}' title='&#92;{a_n&#92;}' class='latex' /> satisfies <img src='http://s0.wp.com/latex.php?latex=d+10%5Em+%5Cleq+a_n+%3C+%28d%2B1%29+10%5Em&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d 10^m &#92;leq a_n &lt; (d+1) 10^m' title='d 10^m &#92;leq a_n &lt; (d+1) 10^m' class='latex' /> for some integer m, where d is the leading digit we choose.</p>
<p>Let&#8217;s go over our example of population growth and modify things to give a new interpretation: the probability becomes <img src='http://s0.wp.com/latex.php?latex=P%28d+10%5Em+%5Cleq+2%5En+%3C+%28d%2B1%29+10%5Em%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P(d 10^m &#92;leq 2^n &lt; (d+1) 10^m)' title='P(d 10^m &#92;leq 2^n &lt; (d+1) 10^m)' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=%3DP%28%5Clog+%5Bd+10%5Em%5D+%5Cleq+%5Clog+2%5En+%3C+%5Clog%5B%28d%2B1%29+10%5Em%5D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='=P(&#92;log [d 10^m] &#92;leq &#92;log 2^n &lt; &#92;log[(d+1) 10^m])' title='=P(&#92;log [d 10^m] &#92;leq &#92;log 2^n &lt; &#92;log[(d+1) 10^m])' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=%3DP%28%5Clog+d+%5Cleq+n%5Clog+2-m+%3C+%5Clog%28d%2B1%29+%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='=P(&#92;log d &#92;leq n&#92;log 2-m &lt; &#92;log(d+1) )' title='=P(&#92;log d &#92;leq n&#92;log 2-m &lt; &#92;log(d+1) )' class='latex' />                   (1)</p>
<p><strong>Interpretation</strong>: Note that <img src='http://s0.wp.com/latex.php?latex=%5Clog+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log d' title='&#92;log d' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=%5Clog+%28d%2B1%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log (d+1)' title='&#92;log (d+1)' class='latex' /> lie between 0 and 1, so even though m is allowed to be any integer, it is forced to be the greatest integer not exceeding <img src='http://s0.wp.com/latex.php?latex=n%5Clog+2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;log 2' title='n&#92;log 2' class='latex' /> (for example, if <img src='http://s0.wp.com/latex.php?latex=n%5Clog+2%3D23.134...&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;log 2=23.134...' title='n&#92;log 2=23.134...' class='latex' />, then m has to be 23 and <img src='http://s0.wp.com/latex.php?latex=n%5Clog+2-m%3D0.134...&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;log 2-m=0.134...' title='n&#92;log 2-m=0.134...' class='latex' />).  Notation: <img src='http://s0.wp.com/latex.php?latex=n%5Clog+2-m&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;log 2-m' title='n&#92;log 2-m' class='latex' /> will be denoted <img src='http://s0.wp.com/latex.php?latex=n%5Clog+2+%5Cmod+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;log 2 &#92;mod 1' title='n&#92;log 2 &#92;mod 1' class='latex' />.  Geometrically, the interval from 0 to 1 is glued into a circle, so that 0.3 is the same as 1.3, 2.3, &#8230;, and as n increases by 1, <img src='http://s0.wp.com/latex.php?latex=n%5Clog+2+%5Cmod+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n&#92;log 2 &#92;mod 1' title='n&#92;log 2 &#92;mod 1' class='latex' /> moves by a rotation around the circle (through an angle of 108.37&#8230; degrees, an irrational fraction of a full turn).  
Hence the name <em>irrational rotation</em>; it is irrational because <img src='http://s0.wp.com/latex.php?latex=%5Clog+2+&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log 2 ' title='&#92;log 2 ' class='latex' /> is irrational (Exercise: why is log 2 irrational? <a href="http://en.wikipedia.org/wiki/Irrational_number#Square_roots">hint</a>).</p>
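<p>To make the interpretation concrete, here is a small Python sketch that reads the leading digit of a power of 2 off the circle, using only n log 2 mod 1.  It works in floating point, so it is an illustration under that assumption rather than an exact method; the function name is made up for this example.</p>

```python
from math import floor, log10

def leading_digit_by_rotation(n):
    """Leading digit of 2**n read off from n*log10(2) mod 1."""
    angle = n * log10(2)
    m = floor(angle)          # the integer m is forced, as explained above
    frac = angle - m          # n log 2 mod 1, a point on the circle
    # the digit d is the largest one with log10(d) not exceeding frac
    return max(d for d in range(1, 10) if frac >= log10(d))

# columns: n, digit from the rotation, digit read directly from 2**n
for n in (10, 15, 23):
    print(n, leading_digit_by_rotation(n), str(2 ** n)[0])

# Caveat: for n = 1, 2, 3 the point sits exactly on a boundary log10(d),
# so floating-point rounding could tip the answer either way.
```

<p>For n=10, for instance, n log 2 = 3.0103&#8230;, so m=3 and the point on the circle is 0.0103&#8230;, which lies between log 1 = 0 and log 2.</p>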
<p>Notice the answer we seek (<img src='http://s0.wp.com/latex.php?latex=%5Clog%28d%2B1%29-%5Clog+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log(d+1)-&#92;log d' title='&#92;log(d+1)-&#92;log d' class='latex' />) is precisely the difference between the left and right bound in <img src='http://s0.wp.com/latex.php?latex=%5Clog+d+%5Cleq+n%5Clog+2+%5Cmod+1+%3C+%5Clog%28d%2B1%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log d &#92;leq n&#92;log 2 &#92;mod 1 &lt; &#92;log(d+1)' title='&#92;log d &#92;leq n&#92;log 2 &#92;mod 1 &lt; &#92;log(d+1)' class='latex' />.  So what we need to see is that the sequence of points generated by irrational rotation is evenly distributed around the circle:</p>
<p><strong>Definition</strong>: <img src='http://s0.wp.com/latex.php?latex=%5C%7Bc_n%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{c_n&#92;}' title='&#92;{c_n&#92;}' class='latex' /> is equidistributed in [0,1] if for any interval (a, b) in [0,1], the proportion of the first N terms of <img src='http://s0.wp.com/latex.php?latex=%5C%7Bc_n%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{c_n&#92;}' title='&#92;{c_n&#92;}' class='latex' /> that lie in (a, b) approaches b-a as N approaches infinity.</p>
<p>So we are done if we show <img src='http://s0.wp.com/latex.php?latex=%5C%7Bn%5Clog+2+%5Cmod+1%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{n&#92;log 2 &#92;mod 1&#92;}' title='&#92;{n&#92;log 2 &#92;mod 1&#92;}' class='latex' /> is equidistributed.</p>
<p>By the way, if we repeat everything above replacing <img src='http://s0.wp.com/latex.php?latex=%5C%7B2%5En%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{2^n&#92;}' title='&#92;{2^n&#92;}' class='latex' /> by <img src='http://s0.wp.com/latex.php?latex=%5C%7Ba_n%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{a_n&#92;}' title='&#92;{a_n&#92;}' class='latex' />, we get the following:</p>
<p><strong>Sufficient condition for Benford:</strong> if <img src='http://s0.wp.com/latex.php?latex=%5C%7B+%5Clog+a_n%5Cmod+1%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{ &#92;log a_n&#92;mod 1&#92;}' title='&#92;{ &#92;log a_n&#92;mod 1&#92;}' class='latex' /> is equidistributed in [0,1], then <img src='http://s0.wp.com/latex.php?latex=%5C%7Ba_n%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{a_n&#92;}' title='&#92;{a_n&#92;}' class='latex' /> is Benford.</p>
<p>In fact this is how many sequences are shown to be Benford.<br />
<strong>Example 1</strong>: Showing that geometric Brownian motion is Benford reduces to showing that a bell-shaped probability distribution, taken mod 1, becomes uniform on [0,1] as it spreads out.  This is more tractable but still messy, so we won&#8217;t do it here (it amounts to showing <img src='http://s0.wp.com/latex.php?latex=%5Csum_%7Bn%3D-%5Cinfty%7D%5E%7B%5Cinfty%7D%5Cint_a%5Ebe%5E%7B-%5Cpi+%28x%2Bn%29%5E2%2FN%7D%2F%5Csqrt%7BN%7D+dx+%5Crightarrow+b-a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sum_{n=-&#92;infty}^{&#92;infty}&#92;int_a^be^{-&#92;pi (x+n)^2/N}/&#92;sqrt{N} dx &#92;rightarrow b-a' title='&#92;sum_{n=-&#92;infty}^{&#92;infty}&#92;int_a^be^{-&#92;pi (x+n)^2/N}/&#92;sqrt{N} dx &#92;rightarrow b-a' class='latex' /> as N approaches infinity).<br />
<strong>Example 2</strong>: The Fibonacci sequence is Benford.  It is defined by the recurrence relation <img src='http://s0.wp.com/latex.php?latex=a_n%3Da_%7Bn-1%7D%2Ba_%7Bn-2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_n=a_{n-1}+a_{n-2}' title='a_n=a_{n-1}+a_{n-2}' class='latex' /> with first two values 0 and 1.  In fact, most sequences defined by a recurrence relation are Benford.  The reason is that they are roughly exponential (for Fibonacci, <img src='http://s0.wp.com/latex.php?latex=a_n%3D%5Cfrac%7Bb%5En-%28-b%29%5E%7B-n%7D%7D%7B%5Csqrt%7B5%7D%7D%5Capprox+%5Cfrac%7Bb%5En%7D%7B%5Csqrt%7B5%7D%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='a_n=&#92;frac{b^n-(-b)^{-n}}{&#92;sqrt{5}}&#92;approx &#92;frac{b^n}{&#92;sqrt{5}}' title='a_n=&#92;frac{b^n-(-b)^{-n}}{&#92;sqrt{5}}&#92;approx &#92;frac{b^n}{&#92;sqrt{5}}' class='latex' /> where b=(1+&#8730;5)/2 is the golden ratio), so that taking logs gives something that looks like an irrational rotation for large n and can be shown to be equidistributed.<br />
(Details for these examples can be found <a href="http://www.google.com/url?sa=t&amp;source=web&amp;ct=res&amp;cd=7&amp;url=http%3A%2F%2Fwww.williams.edu%2Fgo%2Fmath%2Fsjmiller%2Fpublic_html%2FBrownClasses%2F1%2FBenfordTreatise.pdf&amp;ei=FeotSvmgCI7GM8nVsIcK&amp;usg=AFQjCNFtNso_dl54pnBB3wE5PV-TYV9Auw">here</a>)</p>
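<p>Example 2 is easy to test empirically.  The following Python sketch counts leading digits among the first 2000 Fibonacci numbers (the number of terms is an arbitrary choice) and prints them next to the log distribution.  This is a numerical check, not a proof.</p>

```python
from math import log10

def fibonacci(count):
    """The first `count` Fibonacci numbers after the initial 0: 1, 1, 2, 3, 5, ..."""
    a, b = 0, 1
    out = []
    for _ in range(count):
        a, b = b, a + b
        out.append(a)
    return out

terms = fibonacci(2000)
share = {d: sum(1 for t in terms if str(t)[0] == str(d)) / len(terms)
         for d in range(1, 10)}

# columns: digit, observed share, log distribution
for d in range(1, 10):
    print(d, round(share[d], 3), round(log10(1 + 1 / d), 3))
```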
<p>Let&#8217;s get back to showing<br />
<strong>Theorem</strong>: Irrational rotation, <img src='http://s0.wp.com/latex.php?latex=%5C%7B+n+b%5Cmod+1%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{ n b&#92;mod 1&#92;}' title='&#92;{ n b&#92;mod 1&#92;}' class='latex' />, is equidistributed in [0,1], where n=1, 2, &#8230; and b is irrational (Exercise: what happens if b is rational, say 3/4?)</p>
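<p>Before looking at why the theorem holds, it can help to see it numerically, together with the exercise about rational b.  The sketch below is only an experiment: the floating-point value of log 2 stands in for an irrational number, and the intervals and the number of points are arbitrary choices.</p>

```python
from math import log10

def proportion_in_interval(b, lo, hi, n_points):
    """Fraction of the points m*b mod 1, m = 1..n_points, that land in [lo, hi)."""
    hits = sum(1 for m in range(1, n_points + 1) if hi > (m * b) % 1.0 >= lo)
    return hits / n_points

irrational = log10(2)   # floating-point stand-in for an irrational b

print(proportion_in_interval(irrational, 0.0, 0.5, 5000))   # should be close to 0.5
print(proportion_in_interval(irrational, 0.2, 0.3, 5000))   # should be close to 0.1
print(proportion_in_interval(0.75, 0.1, 0.2, 5000))         # rational b: stays at 0
```

<p>With b=3/4 the points only ever visit 0.75, 0.5, 0.25 and 0, so any interval that misses those four values gets proportion 0 no matter how long we run.</p>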
<p>One special property of irrational rotations is the following:</p>
<p><strong>Property</strong>: For any x in [0,1], there exists some number in <img src='http://s0.wp.com/latex.php?latex=%5C%7B+n+b%5Cmod+1%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{ n b&#92;mod 1&#92;}' title='&#92;{ n b&#92;mod 1&#92;}' class='latex' /> as close to x as we want (but not equal to x).<br />
Proof by example: say x=0 and b=0.31&#8230;, and we want to find n so that <img src='http://s0.wp.com/latex.php?latex=n+b%5Cmod+1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='n b&#92;mod 1' title='n b&#92;mod 1' class='latex' /> is within b/2=0.155&#8230; of x.  From the sequence {0.31&#8230;, 0.62&#8230;, 0.93&#8230;, &#8230;}, this is not hard: the third number is about 0.07 away from 0 on the circle, which is good enough.  We can then repeat, this time moving in steps of about 0.07 (three rotations at a time): after 14 such steps we land about 0.02 from 0, which is closer than 0.07/2.  So why does it work?  As the points step past x, the two consecutive points that sandwich x are a distance b apart, so one of them is at most b/2 away from x.  Taking x to be 0, this gives a way to move around the circle in steps smaller than b/2, and so we can continue with the smaller step, finding two points of the sequence less than b/2 apart that sandwich x.  (Why doesn&#8217;t this work for a rational number, say b=1/4?  It fails at the second step with x=0, because some points land exactly on 0 and so we cannot cut the distance from x in half.)</p>
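<p>The property can be explored with a brute-force search.  The Python sketch below looks for the first n whose point lands within a chosen distance of x; the function names, the tolerances and the search limit are assumptions made for this illustration, and floating point again stands in for an irrational b.</p>

```python
from math import log10

def circle_distance(u, v):
    """Distance between two points of [0, 1) when the ends are glued together."""
    gap = abs(u - v) % 1.0
    return min(gap, 1.0 - gap)

def first_visit(b, x, eps, max_n=10**6):
    """Smallest n with n*b mod 1 within eps of x, or None if max_n is reached."""
    for n in range(1, max_n + 1):
        if eps > circle_distance((n * b) % 1.0, x):
            return n
    return None

b = log10(2)
# columns: tolerance, first n near x = 0, first n near x = 0.5
for eps in (0.1, 0.01, 0.001):
    print(eps, first_visit(b, 0.0, eps), first_visit(b, 0.5, eps))

# with the rational b = 1/4 the search for a point near 0.1 never succeeds
print(first_visit(0.25, 0.1, 0.01, max_n=1000))
```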
<p>Going back to the theorem, we want to see why it works for the case [0, 0.5].  We need to show the proportion*</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cfrac%7B%5Ctextrm%7Bnumber+of+points+%7D+m+b%5Cmod+1%5Ctextrm%7B+is+in+%5B0%2C+0.5%5D+for+m%3D1%2C+...%2C+n%7D%7D%7Bn%7D+%5Crightarrow+0.5&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;frac{&#92;textrm{number of points } m b&#92;mod 1&#92;textrm{ is in [0, 0.5] for m=1, ..., n}}{n} &#92;rightarrow 0.5' title='&#92;frac{&#92;textrm{number of points } m b&#92;mod 1&#92;textrm{ is in [0, 0.5] for m=1, ..., n}}{n} &#92;rightarrow 0.5' class='latex' /><br />
as n approaches infinity.</p>
<p>Proof (Sketchy): It looks nasty to compute so let&#8217;s be crafty.  We can ask about the proportion for [0.5, 1] instead.  The answer should be the same intuitively and the above property confirms our intuition: by skipping an initial segment of <img src='http://s0.wp.com/latex.php?latex=%5C%7B+n+b%5Cmod+1%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{ n b&#92;mod 1&#92;}' title='&#92;{ n b&#92;mod 1&#92;}' class='latex' />, it will be <img src='http://s0.wp.com/latex.php?latex=%5C%7B+x%2Bn+b%5Cmod+1%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{ x+n b&#92;mod 1&#92;}' title='&#92;{ x+n b&#92;mod 1&#92;}' class='latex' /> for any x arbitrarily close to what we want, which is 0.5.  Although skipping an initial segment will affect the proportion, but letting n go to infinity eliminates the effect.<br />
Finally, since the proportion in [0, 0.5] and [0.5, 1] must add up to 1, each must be 0.5 and we are done.  This argument can be repeated for other intervals and we are finally done.</p>
<p>The above concept is called translation invariance.  It characterizes the uniform probability uniquely and is thus useful in proofs.  We will see it again later, because it is scale invariance of the log distribution in disguise.</p>
<p>Exercise a (generalized Benford&#8217;s law): Suppose that instead of the leading digit we specify the first two digits.  For instance, what is the proportion of <img src='http://s0.wp.com/latex.php?latex=2%5En&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^n' title='2^n' class='latex' /> starting with 14, and how would you compute it?  Can you come up with a general formula?  What if we use base 7 instead of base 10?</p>
<p>Answer: Repeat the procedure used to get (1).  The first one is log 15 &#8211; log 14.  If you change the base, repeat the procedure again and you will see that the formula only changes from log base 10 to log base 7.</p>
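<p>The base 10 answer can be checked empirically.  The sketch below is illustrative only (the function names and the cutoff of 5000 powers are assumptions): it counts how many of the powers of 2 start with the digits 14 and compares the observed proportion with log 15 &#8211; log 14.</p>

```python
import math

def leading_digits(value, count):
    """First `count` decimal digits of a positive integer."""
    return int(str(value)[:count])

def proportion_starting_with(prefix, n_max):
    """Fraction of 2^n, n = 1..n_max, whose decimal expansion starts with `prefix`."""
    width = len(str(prefix))
    hits = 0
    for n in range(1, n_max + 1):
        power = 2 ** n
        # Powers with fewer digits than the prefix cannot match.
        if len(str(power)) >= width and leading_digits(power, width) == prefix:
            hits += 1
    return hits / n_max

if __name__ == "__main__":
    observed = proportion_starting_with(14, 5000)
    predicted = math.log10(15) - math.log10(14)
    print(observed, predicted)
```

<p>The two printed numbers should be close but not identical, since the proportion only converges as the number of powers grows.</p>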
<p><strong>Different View: Dynamical System</strong></p>
<p>The proportion* above is called the <em>time average</em>, since you can think of the irrational rotation as happening once every second.  There is also a space average, which is just the length of the interval under consideration (above it was [0, 0.5]).  Equidistribution can be obtained without the sketchy argument above: it is a consequence of the fact that the time average equals the space average:</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5Cfrac%7B1%7D%7Bn%7D%5Csum_k+f%28T%5Ek+x%29%3D%5Cint_0%5E1+f%28x%29+dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;frac{1}{n}&#92;sum_k f(T^k x)=&#92;int_0^1 f(x) dx' title='&#92;frac{1}{n}&#92;sum_k f(T^k x)=&#92;int_0^1 f(x) dx' class='latex' /></p>
<p>where the left-hand side is understood in the limit as n approaches infinity, Tx=x+b, and f(x)=1 if x mod 1 is in the interval ([0, 0.5] for above) and f(x)=0 otherwise.</p>
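<p>To see the two averages side by side, here is a small Python sketch.  It is an illustration under assumptions of my own choosing (the function name, the rotation b equal to the square root of 2, and the starting point 0), not a proof: it computes the time average of f along the orbit and prints it next to the space average 0.5.</p>

```python
import math

def time_average(b, x0, n, lo=0.0, hi=0.5):
    """Average of f(T^k x0) for k = 0..n-1, where T x = x + b mod 1
    and f is 1 on the interval [lo, hi] and 0 elsewhere."""
    hits = 0
    for k in range(n):
        point = (x0 + k * b) % 1.0
        if hi >= point >= lo:
            hits += 1
    return hits / n

if __name__ == "__main__":
    b = math.sqrt(2)  # an irrational rotation (illustrative choice)
    for n in (10, 100, 1000, 10000):
        print(n, time_average(b, 0.0, n), "space average:", 0.5)
```

<p>For a rational rotation the two averages need not agree: with b=1/4 and starting point 0, the orbit only visits 0, 0.25, 0.5 and 0.75, and the function returns 0.75 for n=4 instead of 0.5.</p>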
<p>Exercise: how does this give equidistribution?</p>
<p>For those who know Fourier series, the proof of the formula with our specific map T involves showing the formula is true for <img src='http://s0.wp.com/latex.php?latex=f%28x%29%3D+%5Ccos%282+%5Cpi+nx%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='f(x)= &#92;cos(2 &#92;pi nx)' title='f(x)= &#92;cos(2 &#92;pi nx)' class='latex' /> (or sine) and then claiming that we are done because f is periodic and so can be approximated by combinations of such functions.</p>
<p>For more detail, <a href="http://en.wikipedia.org/wiki/Ergodic_theory#Ergodic_theorems">click here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://mec2009summer.wordpress.com/2009/08/06/23-equidistribution/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f3865dec625003838a3280c1b4808664?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mec2009summer</media:title>
		</media:content>
	</item>
		<item>
		<title>4.1 Scale-Invariance</title>
		<link>http://mec2009summer.wordpress.com/2009/08/06/4-1-scale-invariance/</link>
		<comments>http://mec2009summer.wordpress.com/2009/08/06/4-1-scale-invariance/#comments</comments>
		<pubDate>Thu, 06 Aug 2009 13:56:04 +0000</pubDate>
		<dc:creator>mec2009summer</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://mec2009summer.wordpress.com/?p=105</guid>
		<description><![CDATA[To give a definition of scale-invariance, we start with a sequence . To simplify notations, let D(d+1) denote the probability that a number from the sequence has leading digit at most d, and F(x) denotes the probability that a number from the sequence is less than x (aka cumulative distribution function). So (eq 1) After [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mec2009summer.wordpress.com&amp;blog=7988549&amp;post=105&amp;subd=mec2009summer&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>To give a definition of scale-invariance, we start with a sequence <img src='http://s0.wp.com/latex.php?latex=%5C%7Ba_n%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{a_n&#92;}' title='&#92;{a_n&#92;}' class='latex' />.  To simplify notation, let D(d+1) denote the probability that a number from the sequence has leading digit at most d, and let F(x) denote the probability that a number from the sequence is less than x (aka the cumulative distribution function).  So</p>
<p align="center"><img src='http://s0.wp.com/latex.php?latex=D%28d%29%3D%5Csum_%7Bn%3D-%5Cinfty%7D%5E%7B%5Cinfty%7DF%28d+10%5En%29-F%2810%5En%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D(d)=&#92;sum_{n=-&#92;infty}^{&#92;infty}F(d 10^n)-F(10^n)' title='D(d)=&#92;sum_{n=-&#92;infty}^{&#92;infty}F(d 10^n)-F(10^n)' class='latex' />                  (eq 1)</p>
<p style="text-align:left;">After we change the unit, <img src='http://s0.wp.com/latex.php?latex=%5C%7Ba_n%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{a_n&#92;}' title='&#92;{a_n&#92;}' class='latex' /> becomes <img src='http://s0.wp.com/latex.php?latex=%5C%7B%5Cfrac%7Ba_n%7D%7Bc%7D%5C%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;{&#92;frac{a_n}{c}&#92;}' title='&#92;{&#92;frac{a_n}{c}&#92;}' class='latex' />.  We denote the new D and F with a bar on top, so <img src='http://s0.wp.com/latex.php?latex=%5Cbar%7BF%7D%28x%29%3DF%28cx%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;bar{F}(x)=F(cx)' title='&#92;bar{F}(x)=F(cx)' class='latex' />, and scale invariance means <img src='http://s0.wp.com/latex.php?latex=%5Cbar%7BD%7D%28d%29%3D+D%28d%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;bar{D}(d)= D(d)' title='&#92;bar{D}(d)= D(d)' class='latex' />.  Using eq 1 repeatedly,</p>
<p style="text-align:left;"><img src='http://s0.wp.com/latex.php?latex=%5Cbar%7BD%7D%28d%29%3D%5Csum_%7Bn%3D-%5Cinfty%7D%5E%7B%5Cinfty%7D%5Cbar%7BF%7D%28d+10%5En%29-%5Cbar%7BF%7D%2810%5En%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;bar{D}(d)=&#92;sum_{n=-&#92;infty}^{&#92;infty}&#92;bar{F}(d 10^n)-&#92;bar{F}(10^n)' title='&#92;bar{D}(d)=&#92;sum_{n=-&#92;infty}^{&#92;infty}&#92;bar{F}(d 10^n)-&#92;bar{F}(10^n)' class='latex' /></p>
<p style="text-align:left;"><img src='http://s0.wp.com/latex.php?latex=%3D%5Csum_%7Bn%3D-%5Cinfty%7D%5E%7B%5Cinfty%7DF%28c+d+10%5En%29-F%28c+10%5En%29%3DD%28c+d%29-D%28c%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='=&#92;sum_{n=-&#92;infty}^{&#92;infty}F(c d 10^n)-F(c 10^n)=D(c d)-D(c)' title='=&#92;sum_{n=-&#92;infty}^{&#92;infty}F(c d 10^n)-F(c 10^n)=D(c d)-D(c)' class='latex' /></p>
<p style="text-align:left;">Summarizing,</p>
<p style="text-align:center;"><img src='http://s0.wp.com/latex.php?latex=%5Cdisplaystyle%7BD%28c+%5Ccdot+d%29%3DD%28c%29%2BD%28d%29%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;displaystyle{D(c &#92;cdot d)=D(c)+D(d)}' title='&#92;displaystyle{D(c &#92;cdot d)=D(c)+D(d)}' class='latex' />, for <img src='http://s0.wp.com/latex.php?latex=d%3D2%2C+...%2C+10&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d=2, ..., 10' title='d=2, ..., 10' class='latex' />, and <img src='http://s0.wp.com/latex.php?latex=c%3E0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='c&gt;0' title='c&gt;0' class='latex' />   (eq. 2)</p>
<p><img src='http://s0.wp.com/latex.php?latex=D%28d%29%3D%5Clog+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D(d)=&#92;log d' title='D(d)=&#92;log d' class='latex' /> satisfies eq. 2, but does eq. 2 imply <img src='http://s0.wp.com/latex.php?latex=D%28d%29%3D%5Clog+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D(d)=&#92;log d' title='D(d)=&#92;log d' class='latex' /> (for <img src='http://s0.wp.com/latex.php?latex=d%3D2%2C+...%2C+10&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d=2, ..., 10' title='d=2, ..., 10' class='latex' />)?</p>
<p>First, we need to assume D(10)=1, which is reasonable.  Second, we extend D to accept positive real numbers as input, not just positive integers, so that we can talk about D being continuous, which we will also assume (This is still reasonable if we take eq. 1 as the definition of D.).  Then the answer is yes.</p>
<p>Exercise: D(1)=?</p>
<p>Answer:  Let c=1; then eq. 2 reads D(d)=D(1)+D(d), so D(1)=0.</p>
<p>Exercise: Using D(1) and D(10), can you figure out other values of D using eq. 2?</p>
<p>For example, D(1/10)=-D(10)=-1 or more generally <img src='http://s0.wp.com/latex.php?latex=D%2810%5En%29%3Dn&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D(10^n)=n' title='D(10^n)=n' class='latex' />.  That&#8217;s about it without being clever.</p>
<p><strong>Trick for D(2)</strong>: we only need to figure out <img src='http://s0.wp.com/latex.php?latex=D%282%5Eq%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D(2^q)' title='D(2^q)' class='latex' /> for some integer q, because <img src='http://s0.wp.com/latex.php?latex=D%282%5Eq%29%3Dq+D%282%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D(2^q)=q D(2)' title='D(2^q)=q D(2)' class='latex' />.  The only thing we know is <img src='http://s0.wp.com/latex.php?latex=D%2810%5Ep%29%3Dp&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D(10^p)=p' title='D(10^p)=p' class='latex' />, so try to find integers q and p so that</p>
<p style="text-align:center;"><img src='http://s0.wp.com/latex.php?latex=2%5Eq%3D10%5Ep+%5Ccdot+a&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='2^q=10^p &#92;cdot a' title='2^q=10^p &#92;cdot a' class='latex' />     (eq. 3)</p>
<p>where a is close to 1 (impossible for a to be 1, why?). After taking log, this means we want to approximate <img src='http://s0.wp.com/latex.php?latex=%5Clog+2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;log 2' title='&#92;log 2' class='latex' /> by p/q such that <img src='http://s0.wp.com/latex.php?latex=q%5Clog+2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q&#92;log 2' title='q&#92;log 2' class='latex' /> is as close to an integer as we want, which is possible by Lemma 1 as log 2 is irrational.  Finally applying D to eq. 3 (and simplifying using eq. 2 many times) gives</p>
<p style="text-align:center;"><img src='http://s0.wp.com/latex.php?latex=q+D%282%29%3Dp%2BD%28a%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='q D(2)=p+D(a)' title='q D(2)=p+D(a)' class='latex' /></p>
<p>Dividing by q and letting a approach 1, p/q approaches log 2 and D(a) approaches 0 by continuity of D, so <img src='http://s0.wp.com/latex.php?latex=D%282%29%3D%5Clog+2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D(2)=&#92;log 2' title='D(2)=&#92;log 2' class='latex' />.  Similar arguments give</p>
<p style="text-align:center;"><img src='http://s0.wp.com/latex.php?latex=D%28d%29%3D%5Clog+d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='D(d)=&#92;log d' title='D(d)=&#92;log d' class='latex' /></p>
<p style="text-align:left;">where d&gt;0.</p>
]]></content:encoded>
			<wfw:commentRss>http://mec2009summer.wordpress.com/2009/08/06/4-1-scale-invariance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f3865dec625003838a3280c1b4808664?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mec2009summer</media:title>
		</media:content>
	</item>
		<item>
		<title>4.2 Base-Invariance</title>
		<link>http://mec2009summer.wordpress.com/2009/08/06/4-2-base-invariance/</link>
		<comments>http://mec2009summer.wordpress.com/2009/08/06/4-2-base-invariance/#comments</comments>
		<pubDate>Thu, 06 Aug 2009 13:54:46 +0000</pubDate>
		<dc:creator>mec2009summer</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://mec2009summer.wordpress.com/?p=103</guid>
		<description><![CDATA[Working on exercise a, you saw that changing the base only changes the formula from log base 10 to log base whatever you are working with. But it is more subtle, i.e. the procedures in scale-invariance cannot be repeated, because the set of numbers with leading digit 1 (to 1.99&#8230;) in base 10 cannot be [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mec2009summer.wordpress.com&amp;blog=7988549&amp;post=103&amp;subd=mec2009summer&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Working on exercise a, you saw that changing the base only changes the formula from log base 10 to log of whatever base you are working with.  But base-invariance is more subtle: the procedure used for scale-invariance cannot be repeated, because the set of numbers with leading digit 1 (to 1.99&#8230;) in base 10 cannot be translated to base 7 as numbers with some other leading digit (whereas scaling by, say, 4.5 means looking at numbers with leading digit(s) 45 to 90).  In the language of probability, not every set can be assigned a probability, and the ones that can are called measurable.</p>
<p>What we can compare is &#8230;, base 0.1, base 10, base 100, base 1000, &#8230; (or base b^n, where b&gt;1), because leading digit 1 to 2 in base 10 is leading digit 1 to 2 or 10 to 20 in base 100.  Rewriting these as 10^0 to 10^(log 2) in base 10, and as 100^0 to 100^(log 2 /2) or 100^(1/2) to 100^[(1+log 2)/2] in base 100, we see the right definition of a base-invariant probability would be something like</p>
<p><img src='http://s0.wp.com/latex.php?latex=P%5B1%2C+b%5Ea%29%3DP%5B1%2C+b%5E%7Ba%2F2%7D%29%2BP%5Bb%5E%7B1%2F2%7D%2C+b%5E%7B%281%2Ba%29%2F2%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P[1, b^a)=P[1, b^{a/2})+P[b^{1/2}, b^{(1+a)/2})' title='P[1, b^a)=P[1, b^{a/2})+P[b^{1/2}, b^{(1+a)/2})' class='latex' /></p>
<p>where b is the base.</p>
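<p>The comparison between base 10 and base 100 can be checked on the powers of 2.  The sketch below is illustrative only (the helper names are assumptions): using exact integer arithmetic, it counts how often 2^n has leading digit 1 in base 10, and how often it has leading digit(s) 1 to 2 or 10 to 20 in base 100.</p>

```python
import math

def base_power_below(value, base):
    """Largest power base**k (k >= 0) that does not exceed the positive integer value."""
    power = 1
    while value >= power * base:
        power *= base
    return power

def fraction_in(values, base, intervals):
    """Fraction of values v = r * base**k whose r lies in one of the
    half-open intervals [lo, hi), checked with exact integer arithmetic."""
    hits = 0
    for v in values:
        scale = base_power_below(v, base)
        if any(v >= lo * scale and hi * scale > v for lo, hi in intervals):
            hits += 1
    return hits / len(values)

if __name__ == "__main__":
    powers = [2 ** n for n in range(1, 1001)]
    in_base_10 = fraction_in(powers, 10, [(1, 2)])
    in_base_100 = fraction_in(powers, 100, [(1, 2), (10, 20)])
    print(in_base_10, in_base_100, math.log10(2))
```

<p>The first two numbers agree exactly because they describe the same set of integers, and both are close to log 2, which is a/2 + a/2 with a = log 2 in the formula above.</p>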
<p>Base-invariance alone will not give Benford&#8217;s law, because the number 1 is special: changing the base leaves 1 unchanged.  Removing 1, we do get Benford&#8217;s law (after some analysis which we won&#8217;t do; for the interested reader, see &#8220;<a href="http://www.google.com/url?sa=t&amp;source=web&amp;ct=res&amp;cd=1&amp;url=http%3A%2F%2Flinks.jstor.org%2Fsici%3Fsici%3D0002-9939%28199503%29123%253A3%253C887%253ABIBL%253E2.0.CO%253B2-L&amp;ei=jKN5SsymHI6-MIvCpKMO&amp;usg=AFQjCNHKG7147i0padzKzd0lcLSrNH6llA">Base-Invariance Implies Benford&#8217;s Law</a>&#8221;).</p>
]]></content:encoded>
			<wfw:commentRss>http://mec2009summer.wordpress.com/2009/08/06/4-2-base-invariance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f3865dec625003838a3280c1b4808664?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mec2009summer</media:title>
		</media:content>
	</item>
		<item>
		<title>4.3 Mantissa</title>
		<link>http://mec2009summer.wordpress.com/2009/08/06/4-3-mantissa/</link>
		<comments>http://mec2009summer.wordpress.com/2009/08/06/4-3-mantissa/#comments</comments>
		<pubDate>Thu, 06 Aug 2009 13:53:41 +0000</pubDate>
		<dc:creator>mec2009summer</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://mec2009summer.wordpress.com/?p=101</guid>
		<description><![CDATA[Notation: instead of writing out the probability that the leading digit is d (or less), we will write (and ),. Reason: 10 stands for the base, and things with leading digit d can be written as where and n is some integer. With this the formula becomes quite easy to remember: . Also, log can [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=mec2009summer.wordpress.com&amp;blog=7988549&amp;post=101&amp;subd=mec2009summer&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><strong>Notation: </strong>instead of writing out &#8220;the probability that the leading digit is d&#8221; (or &#8220;at most d&#8221;), we will write <img src='http://s0.wp.com/latex.php?latex=P%28%5Bd%2C+d%2B1%29_%7B10%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P([d, d+1)_{10})' title='P([d, d+1)_{10})' class='latex' /> (and <img src='http://s0.wp.com/latex.php?latex=P%28%5B1%2C+d%2B1%29_%7B10%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P([1, d+1)_{10})' title='P([1, d+1)_{10})' class='latex' />, respectively). Reason: 10 stands for the base, and numbers with leading digit d can be written as <img src='http://s0.wp.com/latex.php?latex=r+10%5En&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='r 10^n' title='r 10^n' class='latex' /> where <img src='http://s0.wp.com/latex.php?latex=d+%5Cleq+r+%3C+d%2B1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d &#92;leq r &lt; d+1' title='d &#92;leq r &lt; d+1' class='latex' /> and n is some integer. With this notation the formula becomes quite easy to remember: <img src='http://s0.wp.com/latex.php?latex=P%28%5B1%2C+d%2B1%29_%7B10%7D%29%3D+%5Clog_%7B10%7D%28d%2B1%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P([1, d+1)_{10})= &#92;log_{10}(d+1)' title='P([1, d+1)_{10})= &#92;log_{10}(d+1)' class='latex' />.</p>
<p>Also, log can take in non-integer (positive) values as well, so it works as long as d&gt;1.  It helps to define the base b Mantissa function of a positive number p, as <img src='http://s0.wp.com/latex.php?latex=M_b%28p%29%3Dr&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M_b(p)=r' title='M_b(p)=r' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=p%3Dr+b%5Ek&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p=r b^k' title='p=r b^k' class='latex' /> where k is an integer such that <img src='http://s0.wp.com/latex.php?latex=1+%5Cleq+r+%3C+b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='1 &#92;leq r &lt; b' title='1 &#92;leq r &lt; b' class='latex' />.  For example <img src='http://s0.wp.com/latex.php?latex=M_8%284%29%3D4&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M_8(4)=4' title='M_8(4)=4' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=M_8%2816%29%3D16%2F8&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M_8(16)=16/8' title='M_8(16)=16/8' class='latex' />, <img src='http://s0.wp.com/latex.php?latex=M_8%2870%29%3D70%2F64&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='M_8(70)=70/64' title='M_8(70)=70/64' class='latex' />.</p>
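<p>The definition of the Mantissa function translates directly into code.  This is a minimal Python sketch (the function name and the error handling are my own choices, not from any library): it repeatedly divides or multiplies by the base until the result lands in the range from 1 up to, but not including, b.</p>

```python
def mantissa(p, b):
    """Base-b Mantissa: the r with 1 <= r and b > r such that p = r * b**k
    for some integer k.  Requires p > 0 and b > 1."""
    if not (p > 0 and b > 1):
        raise ValueError("need p > 0 and b > 1")
    r = float(p)
    while r >= b:   # too big: shift down by one power of the base
        r /= b
    while 1 > r:    # too small: shift up by one power of the base
        r *= b
    return r

if __name__ == "__main__":
    # The three examples from the text, in base 8.
    print(mantissa(4, 8), mantissa(16, 8), mantissa(70, 8))
```

<p>The three printed values correspond to 4, 16/8 and 70/64 from the examples above.</p>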
<p>With Mantissa, the notation above is actually <img src='http://s0.wp.com/latex.php?latex=P%28%5Bd%2C+d%2B1%29_%7B10%7D%29%3DP%28M_%7B10%7D%5E%7B-1%7D%28%5Bd%2Cd%2B1%29%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P([d, d+1)_{10})=P(M_{10}^{-1}([d,d+1)))' title='P([d, d+1)_{10})=P(M_{10}^{-1}([d,d+1)))' class='latex' />.</p>
<p>Using the Mantissa, we can define an invariant-sum sequence, which gives another characterization of a Benford sequence, just like scale- and base-invariance.  In base 10, it means that if we take the numbers from the sequence whose leading digit is d and sum up their Mantissas, the result will on average be the same for d=1, 2, &#8230;, 9.  For details, <a href="http://www.google.com/url?sa=t&amp;source=web&amp;ct=res&amp;cd=1&amp;url=http%3A%2F%2Fwww.math.unt.edu%2F%7Eallaart%2Fpapers%2Finvar.pdf&amp;ei=QdJ6SqfeA6WUtgeJ96zsAQ&amp;usg=AFQjCNE6NLtlrjkZbspBHs3NgfpddYe6Gg">see this.</a></p>
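<p>As a rough numerical illustration of the invariant-sum property, here is a Python sketch using the powers of 2 as the sequence.  The function name, the choice of sequence and the cutoff are assumptions; the Mantissa of 2^n is computed from the fractional part of n log 2 rather than from 2^n itself.  If the Mantissas follow the log distribution, each of the nine averages should approach 1/ln 10, which is about 0.434.</p>

```python
import math

def sum_invariance_averages(n_max):
    """For 2^n, n = 1..n_max, group the base-10 Mantissas by leading digit
    and return each group's sum divided by n_max."""
    totals = {d: 0.0 for d in range(1, 10)}
    log2 = math.log10(2)
    for n in range(1, n_max + 1):
        # Base-10 Mantissa of 2^n: ten to the fractional part of n * log10(2).
        m = 10 ** ((n * log2) % 1.0)
        totals[int(m)] += m
    return {d: total / n_max for d, total in totals.items()}

if __name__ == "__main__":
    target = 1 / math.log(10)
    for d, avg in sum_invariance_averages(20000).items():
        print(d, round(avg, 4), "target:", round(target, 4))
```

<p>Digit 1 occurs far more often than digit 9, but its Mantissas are smaller, and the two effects balance out in the sums.</p>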
]]></content:encoded>
			<wfw:commentRss>http://mec2009summer.wordpress.com/2009/08/06/4-3-mantissa/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/f3865dec625003838a3280c1b4808664?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mec2009summer</media:title>
		</media:content>
	</item>
	</channel>
</rss>
