
The Riddle of Snake Island

Statistics is just a Tool, not the Ultimate Arbiter of Truth
Consultix


Random sampling is preferable to "haphazard" sampling when collecting observations, and infinitely better than no sampling at all. But life doesn't always give you those choices.

I've been looking for a way to illustrate my thoughts on this subject, and I think I've got one. Let me know what you think!

Imagine that you and your colleagues have to visit a tropical island next week. But this ain't no GeekCruise; the island is populated by two kinds of deadly snakes, red ones and blue ones.

Each of you is allowed to take only one bottle of snake-bite antidote with you, so you've each got to pick a color in advance. (You won't meet up with other humans on the island, so "collaborative strategies" in picking antidotes won't work.)

Your colleague, Mr. A, has no other information about the island, so he flips a coin to choose the color of his antivenin.

Mr. B heard from Mr. X that there were lots more blue than red snakes there last week, so Mr. B chooses the blue stuff.

Mr. C also heard Mr. X's report, but he likes contrarian bets and the "law of averages" (aka the "gambler's fallacy"), so he chooses red.

Mr. D, fancying himself a statistician, boldly asserts that "nobody's done a proper random sampling of the population", so he rejects Mr. X's observations, flips a coin, and chooses accordingly.

What should you do, if your intent is to live -- as opposed to defending your actions in a Statistical journal?

My position is that you should always pay attention to your data. If the collection procedures were imperfect, you strive to refine them and beef up the testimony of your data. But if you can't expend that extra time or effort before making an important decision, "flipping a coin" is an insult to the reality of your existing observations. I for one would be taking the blue stuff to the island!
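To put rough numbers on that intuition, here is a minimal sketch. Everything in it is my own assumption, not something the story specifies: that you get bitten once, that the biting snake's color follows the population mix, that Mr. X's report names the true majority color with probability `report_reliability`, and that the majority color makes up `majority_share` of the snakes. The function name and strategy labels are hypothetical too.

```python
def match_probability(strategy: str, report_reliability: float, majority_share: float) -> float:
    """Chance that the antidote you packed matches the snake that bites you.

    Assumptions (illustrative only):
      - report_reliability: probability the report names the true majority color
      - majority_share: fraction of snakes that belong to the true majority color
    """
    # Probability of a match if you pack the color the report named.
    follow = (report_reliability * majority_share
              + (1 - report_reliability) * (1 - majority_share))
    if strategy == "follow":       # Mr. B: trust the report
        return follow
    if strategy == "contrarian":   # Mr. C: bet against the report
        return 1 - follow
    if strategy == "coin":         # Mr. A and Mr. D: ignore the report
        return 0.5
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical inputs: a fairly unreliable report, a clear majority color.
for name in ("follow", "coin", "contrarian"):
    print(f"{name:>10}: {match_probability(name, 0.7, 0.8):.2f}")
```

Under these made-up inputs, following the report comes out ahead of the coin, and the coin comes out ahead of the contrarian bet. More generally, as long as the report is right more often than not and one color really is in the majority, following it beats flipping a coin; the coin only ties when the report is no better than chance.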

The moral of this story is that the field of Statistical Analysis provides scientifically sound procedures for inferring the underlying characteristics of populations -- to a measurable degree of certainty. But its procedures should not be construed as the sole criterion for determining the worthlessness or credibility of data.

For example, in scientific research, it's commonly the case that adding a few more subjects to an experiment can suddenly cause the results to cross the threshold required for a particular (somewhat arbitrary) level of statistical significance (usually .05 or .01). This doesn't mean that the body of data that consistently showed the same pattern suddenly went from meaningless to priceless when those additional subjects were added; it just means that you weren't in a position to offer consensually acceptable proof of its value until the additional observations had been collected.
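A small worked example may make this concrete. The sketch below assumes a deliberately simple setup of my own choosing: each subject either shows the effect or doesn't, the null hypothesis is a 50/50 chance, and the p-value is the exact one-sided binomial tail. The function name and the subject counts are illustrative, not taken from any real study.

```python
from math import comb


def one_sided_p_value(successes: int, trials: int) -> float:
    """Exact P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials


ALPHA = 0.05  # the conventional (somewhat arbitrary) threshold

# Same observed pattern (75% of subjects show the effect), different sample sizes.
for successes, trials in [(9, 12), (12, 16)]:
    p = one_sided_p_value(successes, trials)
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{successes}/{trials} = {successes / trials:.0%}: p = {p:.4f} ({verdict} at {ALPHA})")
```

With 9 of 12 subjects the p-value is roughly 0.07; with 12 of 16 it is roughly 0.04. The observed proportion is identical in both cases. Only the four extra subjects moved the result across the .05 line.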

Having a "proof" is important, but it shouldn't be construed as the sole criterion for paying heed to what's happening in the world.

I'd be packing the blue stuff to the island -- I can't prove that my decision is the best one, but I'm satisfied that I've made the best choice under the circumstances. That's because, to the best of my knowledge, there are lots more blue snakes over there. I want to be ready for them.


© Copyright 1995-2004 Pacific Software Gurus, Inc. All Rights Reserved.
