(originally posted 13-May-2008)
We’ll start off with an equation.
(collective shudder from the class)
No! No! No! Class! There is nothing to fear from equations. Equations are our friends. OK, I lied. Equations aren’t our friends. They actually have no mind or personality whatsoever (like some politicians and others who shall remain nameless for legal reasons). Equations are merely mathematical representations of the real world and are used every day by almost everyone.
Um, sir, the equation…
Oh yes. Pardon my rambling: I(x) = log2[1/Pr(x)] (for units in bits)
Expressionless stares from the class.
Translation: The amount of information in event x goes up as the probability of event x goes down (strictly speaking, it is the base-2 logarithm of the inverse of that probability).
More silent stares…
Translation of translation: the more unlikely event x is, the more information it contains.
“Ah”
Before we look at some examples, let’s look at logarithms, first. I believe the class should know exponents by now, such as 10^2 = 100 and 2^4 = 16, right?
Silent bobble heads bobbing…
Well, logarithms are similar to exponents, but the reverse. At a more general level, take the expression A^B = C: B copies of A are multiplied together to get C. With me so far?
More silent bobbing…
Good. Now for logarithms, the expression goes logA(C) = B. In other words, we are trying to calculate B, the number of copies of A we must multiply together to get C.
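For anyone in the class who would rather check this on a computer than on paper, here is a minimal Python sketch. The helper name log_base is made up for this illustration; it simply wraps the standard library's math.log and runs the two exponent examples in reverse.

```python
import math


def log_base(value, base):
    """Return B such that base^B = value (hypothetical helper for this lesson)."""
    return math.log(value, base)


# (A, B) pairs taken from the exponent examples: 10^2 = 100 and 2^4 = 16
for base, exponent in [(10, 2), (2, 4)]:
    value = base ** exponent            # C = A^B
    recovered = log_base(value, base)   # B = logA(C)
    print(f"{base}^{exponent} = {value}, so log{base}({value}) = {recovered:g}")
```

The exponent goes in one side and comes back out the other, which is all "the reverse" means here.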
“Ooooooh! Aaahhhh!”
Excellent! Now on to the examples from the article. For tossing a fair coin, the probability of heads/tails is 50% or 0.5 or 1/2 [Pr(heads) = 0.5] and 1/0.5 = 2. Therefore, when tossing a fair coin, the event of it landing on heads contains I(heads) = log2(2) = 1 bit of information.
For tossing a single six-sided die, the probability of getting a specific number (say six) is Pr(six) = 1/6. Therefore, the amount of information in tossing a six from a single six-sided die is I(six) = log2(6) = 2.585 bits.
For tossing two six-sided dice independently, the probability of a specific two-number combination (say snake-eyes or two ones) is Pr(snake eyes) = 1/6 x 1/6 = 1/36. Therefore, the amount of information in rolling snake-eyes from independently tossing two six-sided dice is I(snake eyes) = log2(36) = 5.170 bits.
Thus, as the probabilities get smaller (1/2, 1/6, 1/36), the amount of information gets larger (1 bit, 2.585 bits, 5.170 bits, respectively).
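The three examples above can be reproduced with a short Python sketch. The function name self_information and the events dictionary are assumptions made for this illustration; the formula is the one from the start of the lesson, I(x) = log2[1/Pr(x)].

```python
import math


def self_information(probability):
    """Return I(x) = log2(1 / Pr(x)) in bits (illustrative helper)."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be greater than 0 and at most 1")
    return math.log2(1 / probability)


# Probabilities from the coin and dice examples
events = {"heads": 1 / 2, "six": 1 / 6, "snake eyes": 1 / 36}

for name, probability in events.items():
    print(f"I({name}) = {self_information(probability):.3f} bits")
```

Run as written, this should print 1.000, 2.585 and 5.170 bits. Note that snake eyes carries exactly twice the information of a single six, because the probabilities of independent events multiply and the logarithm turns that multiplication into addition.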
Now class, can anyone see a problem with using self-information as a “measuring stick” for information in nature?
A curious and bright student raises a hand.
Yes, the student with the inquisitiveness in the back…
Stands up sheepishly. “Um, is it because self-information deals only with events and not objects?”
YES! Very good! Determining the amount of information in an event is all fine and dandy, but what is really required is the amount of information contained in any event OR object. Also, an object is not necessarily an event. If you do equate objects with events, then on what objective grounds do you do so? IOW, what probability do you assign to an object and how do you obtain it without any subjectivity?
To sum up, self-information is a good start, but not sufficient as a generic “measuring stick” for information in objects. Class dismissed.
“Um, any homework, sir?”
For the keeners, yes. For the rest of you, I hear Game 1 of the Pens-Caps East Semis is on TV Saturday afternoon (originally: Game 3 of the Pens-Flyers East Finals, on TV tonight). I expect a full report tomorrow. Class dismissed!

