Probability

4. Non-Random Probability

In the previous two pages, we've looked at poker, the lottery, and rolls of the dice: all games of chance. But what if we're looking at a non-random game? It may be possible to derive probabilities, but you would need to know the nature of the non-random mechanism.

Baseball

For example, let's look at a baseball player trying to hit a ball. This is a decidedly non-random situation: if the ball is moving through the middle of the strike zone, and the player reads it well and swings his bat at the ideal time and in the ideal place, the probability of hitting the ball is 100%. If, on the other hand, the ballplayer completely misreads the pitch and swings at the wrong time in the wrong place, the probability of hitting the ball is 0%. This is why you cannot generate a reliable prediction for any given ballplayer in any given at-bat: in order to know the probability of him hitting the ball, you need to know factors which cannot be evaluated until after the fact.

"Ah, but we have batting averages," one might answer. That is true; baseball fans have long compiled lists of batting averages which indicate a player's past performance. Baseball statisticians even break them down into specific situations (batting against left-handed pitchers, batting against right-handed pitchers, batting against this particular pitcher, etc.). But you still cannot generate a meaningful statistical prediction for any given at-bat, because there are too many variables you cannot evaluate. At best, you can look at past performance and assume that future performance will match, even though you know that many variables will change from year to year, game to game, or even at-bat to at-bat.

Could it be possible to get highly reliable predictions from baseball statistics? One would have to say yes ... if the game were played by robots. The field of statistical probability is not necessarily unreliable or unscientific, but it requires a high degree of experimental control which is just not possible with human beings playing baseball. If you could produce baseball players who were far more consistent in their behaviour (like robots), a more reliable set of statistical probabilities could emerge. However, that's not much of a solution for baseball, unless we switch to robot players. Barry Bonds is a good start in that direction, but the total elimination of the human factor is still many years away.

This is all rather disappointing compared to our nice clean mathematical analyses of poker games and lottery tickets, but unfortunately, that is the nature of reality. The mechanism of poker games and lottery tickets is incredibly simple: random selection from a precisely defined set. The mechanisms of real events tend to be considerably more complex. If you could nail down all of the variables and ensure that they remain fixed (impossible in the case of baseball games, but possible to within a high degree of accuracy for scientific experiments), you could use past performance to predict future performance. The clean mathematical technique of counting outcomes, however, is extremely difficult to apply to such events, if not impossible.

A Calculated Non-Random Probability

So is it ever possible to evaluate a non-random probability? Yes, but you need to know a fair bit about the mechanism. When the meteorologists say there is a 70% chance of precipitation, they are saying this based on highly detailed measurements of air movements and temperatures, in conjunction with a large body of research into weather patterns and a superb understanding of the mechanism of precipitation. There are also simpler situations where you can generate reliable probability estimates for non-random (or more accurately, partially random) events.

Note: at this point, those of you who are starting to feel tired of the mathematics should probably skip ahead to the next page, because we're about to do some more work with poker and dice, and those of you who easily suffer "math fatigue" will probably start tuning out, if I didn't lose you already back on Page 2.

Poker ... again.

Still here? OK, let's manufacture an example of a non-random probability which can be evaluated: suppose you are playing a modified game of poker where you are required to discard any card which is the same rank as a card you are already holding (for example, if you draw a two of spades and a two of clubs, you have to discard the two of clubs and pick up another card). Given this rule, the probability of drawing a pair, three of a kind, four of a kind, two pair, or a full house would suddenly drop to precisely zero. But the odds of drawing other hands such as straight flushes would improve, because there are fewer possible hands competing with them.

For example, let's take the royal flush: there are still just 4 possible royal flushes, but the total number of combinations has changed because so many hands have been outlawed. You could calculate the number of pairs, three of a kinds, four of a kinds, two-pairs, and full houses and then subtract them all from 2598960, but that would be a lot of work. Luckily, there's a much easier way to calculate the odds of a royal flush with our modified rules. You start with the ten card, of which there are 4 in the deck out of 52 cards, or 1 in 13. To get the matching jack, you have to pick 1 of 48 remaining cards (remember that 1 card has already been drawn and the 3 other ten cards are off-limits, so you've got 52-1-3 cards left). In similar fashion, you have to pick 1 of 44 cards to get the matching queen, 1 of 40 cards for the king, and 1 of 36 cards for the ace. And finally, because the same five cards could arrive in any of 5!=120 orders and we only counted one of them, you have to divide by 120. Therefore, the odds are 1 in 13*48*44*40*36/120, or 1 in 329472.
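The arithmetic above can be double-checked with a short Python sketch. It assumes that every hand of five different ranks is equally likely under the modified rule, and the variable names are invented for illustration. It counts the allowed hands two ways (directly, and card by card as in the text) and then compares the royal flush odds with those of ordinary poker.

```python
from fractions import Fraction
from math import comb, factorial

# Hands allowed under the modified rule: five different ranks,
# with any of the four suits for each rank.
allowed_hands = comb(13, 5) * 4**5

# The card-by-card count from the text: 52 choices, then 48, 44, 40, 36,
# divided by the 5! orders in which the same five cards can arrive.
sequential_count = 52 * 48 * 44 * 40 * 36 // factorial(5)

royal_flushes = 4  # one per suit

# Odds expressed as "1 in N".
modified_odds = Fraction(allowed_hands, royal_flushes)  # modified rule
standard_odds = Fraction(comb(52, 5), royal_flushes)    # ordinary poker

print(allowed_hands, sequential_count)  # the two counts should agree
print(modified_odds)  # 329472
print(standard_odds)  # 649740
```

Under these assumptions the royal flush becomes roughly twice as likely as in the ordinary game, simply because about half of the 2598960 ordinary hands have been outlawed.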

Loaded dice.

Another type of non-random probability is the weighted probability. This is still "random", but unevenly so. The classic example of weighted probability is drawn from gambling (yes, gambling again), and it's known by the popular name "loaded dice". Suppose you had dice which had a small metal weight inside, making them three times more likely to land a six than any other number. You could still calculate odds, but if you assumed that the dice were normal, your predictions would not match the results.

The simplest way to account for weighted probabilities is to "double-count". For example, if sixes are 3 times more likely than any other number, you simply pretend that the die has 3 faces showing a six, for 8 faces in all. So instead of each number having a 1 in 6 probability of coming up, numbers one through five would each have a 1 in 8 chance of coming up, and the number six would have a 3 in 8 chance of coming up.
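The double-counting trick amounts to dividing each face's weight by the total weight. Here is a minimal Python sketch of that idea; the function names `face_probabilities` and `mean_roll` are made up for this illustration, and the weights are the ones from the example above.

```python
from fractions import Fraction

def face_probabilities(weights):
    """Turn relative weights per face into exact probabilities."""
    total = sum(weights.values())
    return {face: Fraction(w, total) for face, w in weights.items()}

def mean_roll(probs):
    """Average value of a single roll, given face probabilities."""
    return sum(face * p for face, p in probs.items())

# A normal die: every face counted once.
fair = face_probabilities({face: 1 for face in range(1, 7)})

# The loaded die: the six is counted three times.
loaded = face_probabilities({1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 3})

print(loaded[1], loaded[6])                # 1/8 3/8
print(mean_roll(fair), mean_roll(loaded))  # 7/2 33/8
```

The average roll moves from 3.5 to 4.125, which is one way to see how predictions made with the normal-die assumption would fail to match the results.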

So what did we learn?

Hopefully, we learned that it's very difficult to generate non-random probability estimates unless you know precisely how the non-random mechanism works. Let's say that in our poker example, we knew that there were some special rules for the draw but we didn't know what those rules were. It would become impossible to determine the odds of drawing any particular hand. Worse yet, suppose we didn't even know how many cards were in the deck. Probability estimating is an extremely complicated business, where the smallest unjustified assumption can produce numbers which are completely wrong and where missing information can make any kind of estimate impossible.
