Friday, June 13, 2008

Some Thoughts on Polls

Every day, sometimes many times a day, the results of new polls are announced somewhere. The results may be broadcast on television or radio; they may be printed in newspapers, general circulation magazines, or scholarly journals. They may appear on any number of different types of Internet venues. You may hear about the results in a classroom or around the water cooler during a coffee break at work. Friends may call and tell you about the results; strangers may volunteer the results as you find yourselves together in some place. Somehow the information gets "out there." And depending on your time, inclination, interest, and education you may become part of the chain that passes on the poll results.

As Joel Best has pointed out in his work Damned Lies and Statistics, the American people have a fascination with statistics--"I count therefore I am." We use statistics to validate our way of thinking, of acting. We work as if the numbers that bombard us have objective rationality. Do they?

I put up two polls on medical practitioners and sex. One poll was for women, one for men. Each poll gave five statements and asked respondents to check off those statements that applied to them. In the poll introduction I mentioned that everyone should be able to check off three of the five statements. No further instructions were given.

The results of that poll are out there for everyone to see. But what are we really seeing? How valid or how valuable are the results for the two polls? What use can we make of the resulting data?

It's time for true confessions, so let me say that I knew from the outset that the polls I put up were problematic. I actually counted on that fact. I had ulterior motives from the start. Why? Because I was trying to make a point and get us to think.

Polls have become a ubiquitous part of our lives, and yet we know so little about the process of polling and of the possible manipulation inherent in polls. We assume that because a poll is an open process, we know all we need to know about the poll.

1) All polls are flawed to one degree or another. Where the flaw(s) appear can greatly influence how useful the poll will be.

2) Pollster intent can influence the poll right from the start, can "rig" the results before a single question is answered. If a pollster is set on proving that what they believe is true is indeed true, they can design a poll that will show they are right. Or that will show that someone else is wrong. This is one reason why there can be many polls out there on the same topic that point to radically different results.

3) Questions are not neutral. First, how they are worded can influence or force a particular answer. (The classic example is the "Have you stopped beating your wife?" question.) Second, one question can influence how you will answer subsequent questions. The order in which questions appear can also influence reader response.

4) How a poll is taken can influence the results. Is the poll taken face to face with the pollster? Do you know the pollster? Does the pollster know you? Is the poll mailed to you for you to fill out? Is the poll on the Internet? Is the poll taken on the street between an unknown pollster and an unknown poll taker?

5) People lie. There is no way for a pollster to know if people are answering what they really believe to be true or if they are answering what they would like to be true or think should be true. In face-to-face situations there is the added element of not wanting to be thought of as radically different from others, or of making pronouncements that are not thought well of in society today. Example: how many people, in a face-to-face poll, would answer the question "Do you believe that you have racist tendencies?" with "Yes"? They might, however, answer the question "Do you believe that there are people in society with racist tendencies?" with "Yes." Put those questions into an anonymous poll and you might or might not get different answers.

Even innocuous questions are not always answered truthfully. If you are asked to fill in your age, height, weight, occupation and last year of schooling the chances are excellent that the answers to those will be fudged. After all, who will really know the difference? Why not make yourself a bit more "perfect"? Even asking if someone is male or female does not guarantee that the answers will be truthful. Case in point: in the polls I put up there are actually two polls--one for males and one for females. How could I, as pollster, know that it was indeed only females who answered the female questions and only males that answered the male questions? And since there are two polls, how could I as pollster know whether one person voted in both polls?

6) People misunderstand the questions being asked. They may read the question one way while the "real" question is quite different. They answer on the basis of their understanding, which may be incomplete or inaccurate.

7) People may have views that are not represented by the questions asked. They may be willing to answer yes to a question but only with a caveat or under special circumstances (note the comments on the announcement of the polls that said just this), but because there is no way on the poll to qualify the answer, they may pick an answer even though it does not fully represent what they believe. And this means that those taking a poll may answer what are two related questions in diametrically opposing ways. (See the first and last questions on my poll for an example.) And it also means that the numbers that accrue may or may not be representative of the audience being polled.

8) Polls are self-limiting. There may be formatting issues in putting out a poll that limit what can be asked and how. What could be crucial information is not gathered. The polls I put up were limited to five statements to respond to--blogger has a physical limitation of five questions. Even if I wanted to I could not ask more questions that might give me better responses or more detailed responses.

9) How representative or random is a sample? I have almost no information on the readers of this blog. I do not know what segments of the population they come from. I do not know how many from various segments of the population there are. I do not know ages, religious orientation, economic level or any of thousands of things that could influence the answers to a poll or how the poll results could be used.

10) The results of a poll will change depending on whether or not those being polled already know what others have answered. We may be influenced by the answers already given by others. Others who glanced at the poll results may not have noticed that some answers were changed as the poll progressed. Blogger allows you to go back and change your vote. It is possible that some people gave more thought to the statements and went back to change their vote. It is also quite possible that some people changed their vote after seeing how others voted. The very openness of the poll can work to skew results.

There are other factors which can influence polls at all points in the polling process. On this blog the voting is open for everyone to see. As pollster, I cannot remove any votes that I don't agree with--blogger doesn't let me do so. But when polls are taken in other formats, how do we know that the total numbers being reported are true? How do we know that all data is being reported? How do we know that numbers haven't been changed? What is the actual margin of error that a pollster is working with? Can that margin make a real difference in the numbers reported?
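
To give a rough sense of how large that margin can be, here is a minimal Python sketch. It assumes a simple random sample and uses the standard normal approximation for a proportion at roughly 95% confidence. The function name margin_of_error and the sample sizes are illustrative choices, not figures from any real poll. A self-selected blog poll does not meet the random-sample assumption, so for a poll like that the formula would understate the real uncertainty.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p_hat: observed share answering "yes" (between 0 and 1)
    n:     number of respondents
    z:     critical value; 1.96 corresponds to roughly 95% confidence
    Assumes a simple random sample (an assumption, not a given).
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical poll in which 50% answer "yes", at several sample sizes.
for n in (50, 400, 1000):
    moe = margin_of_error(0.5, n)
    print(f"n={n}: 50% +/- {moe * 100:.1f} points")
```

With only 50 respondents the margin is close to 14 points either way, so a reported 50% could plausibly sit anywhere from the mid-30s to the mid-60s. That is more than enough to change the story being told.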

In short, all that I, as pollster, can actually say about the two polls taken is that in a given time period a group of people checked off the answers shown on the poll results. How representative that poll is cannot be known. How accurate the results are also cannot be known. I cannot make any pronouncements that "males feel this way" or that "females feel this way." I cannot quantify frum opinion on the subject polled about.

Then why all the rigmarole involved in asking the questions and putting up the poll to begin with? Because every day you will be faced with the results of polls "out there." Because every day you may/will be asked to make decisions based on the results of polls. Because every day someone will tell you the "truth" about how people act/feel/think based on poll results. Because though the polls on this blog may be dismissed by you as being "only a blog poll," those other polls out there may not be any more accurate or any more useful.

Because we all need to be painfully aware that numbers can lie. We need to understand that statistics can be made to say whatever someone wants those statistics to say. We need to learn to view numbers with a bit more of a jaundiced eye. When it comes to polls and the statistics that result from them, let the buyer beware.

Nonetheless, I found it very interesting that of the 80 "votes" cast, only 12 out of the 28 males and only 29 out of the 52 females indicated that in an emergency they would use a medical practitioner of either sex. I'm truly puzzled trying to envision an emergency scene where the person having the emergency puts up a hand and calls a timeout until a medical practitioner of the "correct" sex can be found.
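
For reference, the shares behind those counts can be worked out directly. This small Python sketch only does the arithmetic on the figures quoted above, taking the counts at face value; the helper name share is illustrative.

```python
def share(count, total):
    """Fraction of respondents who checked a given statement."""
    return count / total


males = share(12, 28)               # emergency statement, male poll
females = share(29, 52)             # emergency statement, female poll
combined = share(12 + 29, 28 + 52)  # both polls taken together

print(f"males:    {males:.1%}")
print(f"females:  {females:.1%}")
print(f"combined: {combined:.1%}")
```

That works out to roughly 43% of the males and 56% of the females, subject to every caveat listed above about what these numbers can and cannot tell us.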


Selena said...

I filled out the poll and I didn't check the "in an emergency I would use a medical practitioner of either sex" because I picked #1 (I never choose a medical practitioner based on sex). I thought the second was redundant.

ProfK said...

Frequently pollsters use this type of redundant-seeming question to check on how "truthful" respondents are being. They reason, as you did, that #1 and #5 are similar in nature. If you answered question #1 then you would also answer question #5. Now how might the pollsters use the discrepancy in answering when examining the poll? One pollster might point out that people are either lying about #1 or are lying about #5, that people say that the sex of the medical practitioner doesn't matter but it really does. Or you might also have the pollster that takes either question out of the context of the poll and uses it to "prove" something.

Anonymous said...

Aren't you basically saying that we shouldn't trust any poll or any statistics? If we don't trust any of them then on what basis are we going to make decisions that require the results of polls or of statistics?

ProfK said...

A healthy dose of skepticism (yes, healthy) is necessary when dealing with polls and statistics. Some polls are more flawed than others; some statistics are more flawed than others. If you know where the flaws occurred you can use any results taking that into consideration. Yes, we need statistics; no, we don't have to believe every word.

Bas~Melech said...

(I could be wrong, but I think that when Blogger tallies the amount of responses, they count how many answers were given, not how many people answered. So since you allowed people to check more than one answer in your polls, your "29/52" might not mean that 29 of 52 distinct people said that.)

ProfK said...

Bas Melech,
Blogger allows you to answer more than one question in a poll if the poster checks off that option. But no single respondent is allowed to vote more than once on any given question. If you try, the "change my vote" option comes up. Your original vote is subtracted from where you put it and is now counted towards whichever answer you are changing to, even if it is the same answer. Blogger keeps track of the ISP ID# that a vote is coming from--only one voting session per ID#. Technically this means that if two people share a computer only one of them is going to be able to vote in a blogger poll.

Anonymous said...

You are right ProfK that blogger does not allow multiple votes from one computer for any given poll. On this poll my wife and I could both vote because you had two separate polls up. On other polls we have to pool our answers. An example of what you mean by a poll being limited.