Friday, March 5, 2010

Polling Data vs Polling figures

If you are gearing up for an upcoming election, the thing you often look to for an idea of how the race is going to turn out is polling data.
I cannot say that I have a degree in election polling, but I can say that I got involved firsthand with political polling in the 2008 election, for both PA and DE. Being the person on the other end of the phone who bothered you while you were eating or running out the door, you see things a little bit differently.
The standard poll sample is often 500 respondents; however, in small states like Delaware, for example, 280-350 is good enough to qualify. The poll must reach a percentage of the state population, and each county or district should make up about the same proportion of the poll as it does of the state population. The poll must also reflect the general makeup of the region: for example, if an area is mostly elderly people, the polling data should reflect that. If it is a college town, it should reflect the younger age group. If you are in an inner city, you should have higher numbers of minority respondents, such as Black or Hispanic voters, depending on where you are sampling. Your sample should also match the male-to-female ratio of the state and the areas being polled. Now, do all polls do a good job at this? For the most part, yes, but sometimes they do slip up on some of these voting blocs. In some cases that can be forgiven; in others it cannot. But you get the point.

A controversial factor is how the calls are conducted. With a robocall, many people are more likely to complete the survey for an automated message, but there is always a risk of less serious responses.
In a person-to-person call, the person conducting the survey can nudge you toward one answer over another. The way questions are worded or read aloud can also push you to answer one way over the other.

Enough of that: it's time to get on to the tricks that pollsters have come up with to mislead with their raw data.

We are going to look at the RCP (Real Clear Politics) average for President Obama's approval vs. disapproval ratings:
RCP Average [2/17 - 3/4] -- [48.7] [45.7] = [+3.0]

Gallup [3/2 - 3/4] [1547 A] [50] [44] = [+6]
Rasmussen Reports [3/2 - 3/4] [1500 LV] [46] [53] = [-7]
Ipsos/McClatchy [2/26 - 2/28] [1076 A] [53] [44] = [+9]
FOX News [2/23 - 2/24] [900 RV] [47] [45] = [+2]
POS (R) [2/17 - 2/18] [900 RV] [48] [48] = [Tie]
Newsweek [2/17 - 2/18] [1009 A] [48] [40] = [+8]

Above we have the 6 latest national polls on President Obama's approval rating. First you see the name of the polling firm, e.g. Gallup. Next you see the dates the poll was conducted: 3/2 - 3/4, which is a 3-day polling cycle. Next you will find a number followed by an abbreviation, e.g. 1547 A; the number is the sample size. The next numbers in line are the approval (positive) number, 50, which is 50%, followed by the disapproval (negative) number, 44, which is 44%. The last number is basic math: 50 - 44 = 6 points. The remainder not shown in the data reflects no opinion or unsure (or, in an election poll, another candidate, usually a 3rd party candidate not polled).

When it comes to the sample size, the old myth is that the higher the sample size, the more accurate the poll. This is true to some degree. With a small sample you usually see, for example, a margin of error of 3.5%-4%, and with a larger sample it is cut down to as little as 1.5%. However, the truth is that once the margin is down to around 2.5%, it is almost a guess whether additional samples are going to make the data more accurate or less accurate; a sample is a sample. If you sampled 25% or 50% of voters, that would be a bit excessive: that would be an "election".
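To put rough numbers on that, here is a minimal sketch of the textbook margin-of-error formula. It assumes a simple random sample, a 95% confidence level (z = 1.96) and the worst case of a 50/50 split; real polling firms apply weighting and their own adjustments, so their published margins can differ. The function name `margin_of_error` is just my own label.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error in percentage points.

    Assumes a simple random sample; z=1.96 corresponds to 95% confidence,
    and proportion=0.5 is the worst case (widest margin).
    """
    return 100 * z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Sample sizes chosen to bracket the polls discussed in this post.
for n in (300, 500, 900, 1547, 4000):
    print(f"n = {n:>4}: +/- {margin_of_error(n):.1f} points")
```

Under these assumptions a 500-person sample comes out to about 4.4 points and Gallup's 1547 to about 2.5. The margin shrinks with the square root of the sample size, so you have to quadruple the sample to cut the margin in half, which is why piling on more calls pays off less and less.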

So what is important here? The most important information when looking at a poll is the abbreviation after the sample size. If you look above, grab any polls with the same abbreviation and the results should be somewhat similar; now compare polls with different abbreviations and the differences get larger. What do these abbreviations mean?

There are 3 common types in official polls.
1. A
Adults: for a poll to be official it must contain only adults in the sample. When you get these annoying calls, the first thing they ask is whether you are old enough to vote, or 18. If you say no, they ask for an adult.
There are polls done without this criterion, but you will not find them posted on official poll averages.
This is the least accurate form of polling.
2. RV
Registered Voters: after you are asked if you are old enough to vote, one of the first questions the surveyor will ask you is whether you are registered to vote, or will register to vote prior to the election being surveyed. If you say no, they say thank you for your time *click*
This polling data is much more accurate than adults-only data, because it eliminates those who cannot vote, whether for their own reasons or for legal reasons, such as criminal conviction laws in some states or not being a citizen.
3. LV
Likely Voters: this polling is the most accurate, especially immediately prior to an election. Toward the end of a survey you are almost always asked how likely you are to vote in the upcoming election. Depending on the survey, you are given either 3 or 5 choices. For example: will not vote, not likely, somewhat likely, likely, very likely.
In this data, if you say either that you will not vote or that you are not likely to vote, your answers will not be counted toward the final results of the poll, but they are still included in the general questions asked. Polling finds that those who are most involved and/or informed in elections are the most likely to vote. So many times there are registered voters who have strong opinions but don't find it necessary to go and vote on election day for whatever reason; statistically, the biggest reason given is that their vote doesn't matter, or that either choice is not going to make a difference.
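As a rough illustration of how an LV screen changes a topline, here is a small sketch. The respondents below are invented for illustration only (they are not real poll data), and the cutoff that drops "will not vote" and "not likely" follows the 5-choice example above; actual firms use their own, often more elaborate, likely-voter models.

```python
# Hypothetical respondents: (answer, stated likelihood of voting).
# These rows are made up for illustration; they are not real poll data.
respondents = [
    ("approve", "very likely"),
    ("disapprove", "likely"),
    ("approve", "not likely"),
    ("disapprove", "very likely"),
    ("approve", "will not vote"),
    ("approve", "somewhat likely"),
]

# Answers that remove a respondent from the likely-voter results.
SCREENED_OUT = {"will not vote", "not likely"}

def likely_voters(rows):
    """Keep only respondents who pass the likely-voter screen."""
    return [row for row in rows if row[1] not in SCREENED_OUT]

def approval_share(rows):
    """Percentage of the given respondents who answered 'approve'."""
    approvals = sum(1 for answer, _ in rows if answer == "approve")
    return 100 * approvals / len(rows)

lv = likely_voters(respondents)
print(f"All respondents: {approval_share(respondents):.0f}% approve")
print(f"Likely voters:   {approval_share(lv):.0f}% approve")
```

In this made-up sample the same set of calls gives two different toplines depending on whether the screen is applied, which is the whole reason the abbreviation matters.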

The Real Clear Politics founder calls Gallup-Rasmussen the most accurate polling combination. These two polls cover a wide range of the spectrum.
Gallup usually polls adults only, except when it teams up with USA Today or a few other groups on RV polls, while Rasmussen almost always polls LV when able to. This is why Rasmussen is the most accurate just prior to elections, while Gallup is the best at spotting future national trends rather than immediate results.

In our RCP sample for today, 3/5/10,
we see Obama up 3 points (+3.0).
A average = Obama is up by 7.67 points
RV average = Obama is up by 1 point
LV (Rasmussen) = Obama is down by 7 points
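For anyone who wants to check the math, the averages above can be reproduced with a short sketch. The figures are copied from the six polls listed earlier in this post; the simple unweighted average is my assumption about how the spread is combined, and the function and variable names are my own.

```python
from collections import defaultdict

# (firm, population, approve, disapprove), copied from the table in this post.
polls = [
    ("Gallup", "A", 50, 44),
    ("Rasmussen Reports", "LV", 46, 53),
    ("Ipsos/McClatchy", "A", 53, 44),
    ("FOX News", "RV", 47, 45),
    ("POS (R)", "RV", 48, 48),
    ("Newsweek", "A", 48, 40),
]

def average_spread(rows):
    """Unweighted mean of (approve - disapprove) across the given polls."""
    return sum(approve - disapprove for _, _, approve, disapprove in rows) / len(rows)

# Group the polls by the abbreviation after the sample size.
by_population = defaultdict(list)
for poll in polls:
    by_population[poll[1]].append(poll)

print(f"All polls: {average_spread(polls):+.2f}")
for population in ("A", "RV", "LV"):
    print(f"{population:>2} average: {average_spread(by_population[population]):+.2f}")
```

Splitting the same six polls by population type turns one +3 headline into three very different stories.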

With elections only 8 months away, campaigns should be focusing on RV and LV polls right now, and in the next couple of months they should be focusing only on LV responses.

Over the past 2 weeks we have seen many liberal/Democratic-leaning firms release their data in either A or RV form, when in some cases they normally use the RV/LV form. This is the pollster trick.