Flopping Aces

Political Polls and the (Mis)Use of Statistics [Reader Post]

The political campaign season is upon us (Did Obama ever leave it?), and we are currently being inundated with polls on all kinds of subjects, such as the debt ceiling compromise, presidential approval, or who won the Republican debate. Being an informed citizen and knowing how to validly interpret poll results is therefore imperative. With the MSM bias when reporting poll results, this guide is even more important. Being knowledgeable will have you screaming at your TV and/or newspaper.

First, this is NOT meant to be an introduction to statistics; far from it. This is nothing more than some famous and humorous quotes and an explanation of how politicians and the MSM (can) misuse statistics.

Second, let me establish my bona fides, definition 4. I have a Ph.D. in statistics from the Florida State University. I have also done quite a bit of consulting on (among other things) marketing projects, so I have taken samples and formulated questions to ask. The companies with which I consulted are doing well, so I must have done something correct!

Third, take a moment to look at these two very short, excellent articles about polls by Rosslyn Smith: Poll Games: when the goal is not to inform but to persuade, and Poll Games: why one should always follow the link to the poll itself, concentrating on the crosstabs while ignoring the media spin. These articles explain why this post is important.

Fourth, before y’all all go glassy eyed, the subject of “statistics” has NOTHING to do with mathematics. It is quite unfortunate that statistics has been lumped in with mathematics. The reason for the “lumping” (IMHO) is that (1) almost all early statisticians were mathematicians, and (2) before the advent of widely available computers, the mathematical formulas used were nothing more than short-cuts to make statistical calculations easier.

We have all seen the quote attributed to Benjamin Disraeli, Prime Minister of Great Britain under Queen Victoria: “There are three kinds of lies: lies, damned lies, and statistics.” Benjamin Disraeli, British politician (1804 – 1881), and some other attributions.

Here are some more humorous (?) quotes about statistics and statisticians:

Just Remember, This Is A Quick (and dirty) Guide

First, NOTHING is ever proven with statistics. All statistics can ever do is provide additional information to back up a decision and/or judgment.

“Statistics are no substitute for judgment”: Henry Clay.

Statistics may be defined as “a body of methods for making wise decisions in the face of uncertainty.”: W.A. Wallis, famous statistician

When you hear on a TV commercial that something is “clinically proven” to work, that statement refers to a sample whose results a reasonable person (whatever that is) would interpret as “proof.” (See “significance level” and hypothesis testing below.)

Second, there is a difference between a parameter (the unknown quantity of interest, which describes an entire population) and a statistic (calculated from a sample or subset of a population). Populations are usually too large to observe all members, so a sample (a subset) of the population is taken, a statistic is calculated from the sample that estimates the unknown population parameter, and, based on the calculated statistic, a decision is made. So Ernie Banks had it wrong when he said, “Awards mean a lot, but they don’t say it all. The people in baseball mean more to me than statistics.” What are called “statistics” in baseball are actually “parameters.” Think about it: in baseball EVERY at-bat is included, so the numbers describe the entire population rather than a sample. This link, slides 2 through 4, explains the difference between parameters and statistics.

If the sample (subset) is randomly drawn (by random, statisticians mean that every member of the population being studied [referred to as the population of interest] has an equal chance of being included in the sample), then a statistic calculated from the sample can be useful for making a decision. Notice that a statistic is almost never exactly correct, hence the “margin of error” (see below).

When a sample is taken (drawn) from a population, an error occurs – never is the sample an exact subset of the population from which it is taken. So any statistic calculated, by definition, is also in error. Again, hence the “margin of error.”
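To make the parameter-versus-statistic distinction concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the population of 100,000 citizens, the 60% who favor the initiative, and the sample size of 1,000 are all assumptions, not data from any real poll.

```python
import random

random.seed(2011)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 citizens, 60% in favor (1), 40% opposed (0).
population = [1] * 60_000 + [0] * 40_000
parameter = sum(population) / len(population)  # known here only because we built it

# Simple random sample: every member has an equal chance of being drawn.
sample = random.sample(population, 1_000)
statistic = sum(sample) / len(sample)  # the estimate a pollster would report

print(f"population parameter: {parameter:.3f}")
print(f"sample statistic:     {statistic:.3f}")
print(f"sampling error:       {statistic - parameter:+.3f}")
```

In real life the parameter is unknown, so the sampling error on the last line can never be computed directly. That is the gap the “margin of error” tries to describe.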

The larger the sample, the less likely that it will vary from the population from which it is taken. So pollsters like to take “large” samples. Large samples reduce the “margin of error.” But (and there is always a “but”) the larger the sample, the more it costs. So pollsters try to strike a balance between accuracy and cost. That balance is usually expressed in the “margin of error.” Reducing the “margin of error” means increasing costs (and vice versa).
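The trade-off between sample size and margin of error can be sketched with the usual textbook approximation for a proportion, margin = z * sqrt(p(1 - p) / n). This is an assumption on my part about how a typical pollster computes it; a real poll may add weighting or design adjustments. The function name and the sample sizes below are purely illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    p = 0.5 is the worst case (widest margin); z = 1.96 corresponds to a
    95% confidence level under the normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative sample sizes only.
for n in (100, 400, 1_600, 6_400):
    print(f"n = {n:>5}: margin of error is about {margin_of_error(n):.4f}")
```

Because the sample size sits under a square root, cutting the margin of error in half takes roughly four times as many respondents, which is why the cost climbs so quickly.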

Of course, the “margin of error” cannot be meaningfully reduced, regardless of sample size, if the sample is taken from a non-representative population. That is why you should read carefully about the population being studied. For example, there is a difference between a population of “voters” and a population of “likely voters.”

Third, there are four levels of measurement. Measurement Scales and Permissible (valid) statistics are explained. ANY statistic calculated from inappropriately measured data cannot validly be interpreted and is therefore meaningless, especially when trying to make a decision.

Fourth, statistics NEVER shows causality. Causality is a management interpretation. “The invalid assumption that correlation implies cause is probably among the two or three most serious and common errors of human reasoning,” says Stephen Jay Gould, in The Mismeasure of Man.

Margin of Error or Confidence Interval

Technically, there IS a difference between a margin of error and a confidence interval (so, please, all you statisticians out there, don’t write to me – we are basing our decision on only one sample), but for our purposes here we can consider them to be equivalent.

You often hear politicians, their campaign managers, and media types say, “Polls show the race within the ‘margin of error’.” You can interpret it that way if you want, but that is an incorrect interpretation of the margin of error, what is properly known as a “confidence interval.” This link explains what a confidence interval is, and how it is properly interpreted. So the next time you hear that the race is “within the margin of error,” disregard it as wishful thinking. Sample results are what they are. The “margin of error” simply gives you a “feel” for what error could have been committed.

For example, say (for illustrative purposes only) that a poll result is reported as “the citizens are in favor of this initiative, 60% to 40%.” The margin of error is ± 3%, with a confidence level of 95%. What this means is that the pollster is (at least) 95% confident that he/she is correct when he/she states that the true population parameter (unknown) of citizens favoring the initiative is between 57% (60% – 3%) and 63% (60% + 3%). Notice that the sample size does not have to be reported. Notice, also, that the population being studied is assumed to be representative of citizens who will vote on the initiative in question.
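The arithmetic in the example above can be sketched in a few lines of Python. The interval is just the reported percentage plus and minus the margin. The second function is an assumption on my part: it back-calculates roughly how large a simple random sample would need to be, using the standard normal-approximation formula with z = 1.96 for 95% confidence. The function names are hypothetical.

```python
import math

def confidence_interval(p_hat, margin):
    """Lower and upper bounds: reported result plus and minus the margin."""
    return p_hat - margin, p_hat + margin

def implied_sample_size(margin, p=0.5, z=1.96):
    """Smallest simple random sample giving this margin (normal approximation)."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

low, high = confidence_interval(0.60, 0.03)
print(f"95% confidence interval: {low:.0%} to {high:.0%}")

# Worst case (p = 0.5) and the case matching the reported 60% result.
print("implied sample size, p = 0.5:", implied_sample_size(0.03))
print("implied sample size, p = 0.6:", implied_sample_size(0.03, p=0.6))
```

Under those assumptions a ± 3% margin corresponds to a sample of roughly a thousand respondents, which is why the sample size can be inferred even when it is not reported.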

We have not gotten into how the question was phrased, nor how it was asked. That is another subject in and of itself.

Confidence and Significance Level

You often hear someone specify the “confidence level” or “significance level” at 95% or 99%. They are NOT the same thing! Hopefully, this link will explain the difference.

The confidence level refers only to a confidence interval, and refers only to the probability (usually expressed as a percentage) that the calculated confidence interval embraces the (unknown) population parameter.
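One way to get a feel for what a 95% confidence level means is to simulate many polls of a population whose parameter we already know, and count how often the calculated interval embraces it. The sketch below is only an illustration under stated assumptions: a true proportion of 60%, 1,000 respondents per poll, 2,000 simulated polls, and the simple normal-approximation interval. None of these numbers come from a real poll.

```python
import math
import random

def interval_covers(true_p, n, rng, z=1.96):
    """Simulate one poll of n respondents; report whether its interval covers true_p."""
    in_favor = sum(1 for _ in range(n) if rng.random() < true_p)
    p_hat = in_favor / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin <= true_p <= p_hat + margin

def coverage(true_p=0.60, n=1_000, polls=2_000, seed=42):
    """Fraction of simulated polls whose interval embraces the true parameter."""
    rng = random.Random(seed)  # private generator so the run is repeatable
    covered = sum(interval_covers(true_p, n, rng) for _ in range(polls))
    return covered / polls

print(f"share of intervals covering the true value: {coverage():.1%}")
```

The share should land close to 95%, not exactly on it. Any single poll either covers the parameter or it does not; the confidence level describes how the method behaves over many repetitions.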

The significance level refers only to hypothesis testing. A hypothesis is nothing more than a statement or belief held by a manager. That statement is either correct or incorrect. The “significance level” states, unambiguously (usually as a percentage), the risk you are willing to accept of saying that the hypothesis is incorrect when it is actually correct. A 5% significance level therefore goes with being 95% “confident” in that call. There is a lot more to this, but we can ignore it for now.

From this explanation I hope you can see how confusion has arisen. BTW, there is nothing sacred about a 90% or 95% or 99% confidence level or significance level. Those “levels” were chosen for convenience.

Why bother with all of this?

If you are a politician (or any kind of manager, political or otherwise, or just a political junkie) you (are paid to) make decisions. Just guessing won’t do. You had better have some (valid) analysis to back yourself up. You never know when you may encounter someone like me.

Ultimately, the onus is upon you, the information consumer, to make a decision. So being forewarned is to be forearmed.

“Statistics is the grammar of science.” Karl Pearson, famous statistician.

FWIW, here is a great statistics glossary.

But that’s just my opinion.
