What can we learn from the polls, if anything?

The polling for the 2008 Presidential Election is all over the map, though the polls generally agree that Obama has a lead of anywhere from 4 to 7 points, depending on the poll and how much you trust the average.

I’ve read numerous articles trying to “explain” the polls, but none by an objective observer outside a single polling organization. Last night I found a great article that goes into much more depth and looks at all the polls collectively, without the bias of a pollster defending his or her own poll. The article, written by Jay Cost, explains why the polls differ. It comes from RealClearPolitics.com, a site I visit many times a day.

Here are the important parts:

I’ve received several emails from people asking about the polls. The national polls do seem pretty variable, so I thought I would toss in my two cents on them.

First, we need a short primer on basic statistics. Real Clear Politics offers an unweighted average, or mean, of the polls. As long as there is more than one poll in the average, we can also calculate the standard deviation, which is one of the most important concepts in inferential statistics. The standard deviation simply tells us how much the polls are disagreeing with one another.

With this stuff in mind, let’s focus on some hard numbers. As of this writing, Barack Obama’s share of the vote in the RCP average is 50.3%. His standard deviation is 2.7. For McCain, whose average is 42.5%, the standard deviation is 2.3. For comparative purposes, I looked at the polls RCP was using from its 2004 averages. For roughly the same time in that cycle (10/17/04 to 10/24/04) Bush’s standard deviation was 1.8; Kerry’s was 1.7. This means that there is more disagreement among pollsters now than there was in 2004.

Obviously, there are large discrepancies between the polls. As Cost shows, the pollsters disagree with one another much more than they did at this point in the 2004 election.
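For anyone who wants to see what those two statistics actually are, here is a minimal Python sketch of an unweighted average and a standard deviation. The poll numbers are made up for illustration and are not the actual RCP polls. The sketch also assumes the sample standard deviation (dividing by n - 1), since Cost does not say which formula he used.

```python
from statistics import mean, stdev

# Hypothetical poll shares (percent) for one candidate.
# Illustrative values only, NOT the polls in the RCP average.
example_shares = [47.0, 49.0, 50.0, 51.0, 53.0]


def summarize(shares):
    """Return (unweighted mean, sample standard deviation) of poll shares."""
    if len(shares) < 2:
        # With a single poll there is nothing to disagree with.
        raise ValueError("need at least two polls to compute a standard deviation")
    return mean(shares), stdev(shares)


avg, sd = summarize(example_shares)
print(f"average = {avg:.1f}, standard deviation = {sd:.1f}")
```

The larger the standard deviation, the more the polls disagree with one another, which is why 2.7 and 2.3 this year against 1.8 and 1.7 in 2004 stand out.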

Cost continues:

So, let’s push the analysis a little bit further by looking at specific polls. We can test to see if the polls are separated from the average by a statistically significant amount. Again, since we’re dealing with each candidate’s individual poll positions – we’ll test each candidate’s number in an individual poll against the RCP average. To make sure we dot all our “i’s” and cross all our “t’s,” we’ll supplement the RCP average with a weighted average of the polls, which takes into account the number of observations when averaging the polls together.

Of the fifteen polls in the RCP average, four fall significantly outside the average for Obama and five do so for McCain. Meanwhile, three polls are right at the boundary of significance (one for Obama, two for McCain). The rules of statistics being what they are, we should expect a few polls here or there to fall outside the average by a statistically significant amount. But this is a lot. 40% of all our tests produced results around or outside the acceptable range.

So, we have made three observations: (a) relative to 2004, the standard deviation for Obama and McCain’s polls are high, indicating more disagreement among pollsters at a similar point in this cycle; (b) the shape of the distribution of each candidate’s poll position is not what we might expect; (c) multiple polls are separated from the RCP average by statistically significant differences.

Combined, these considerations suggest that this variation cannot be chalked up to typical statistical “noise.” Instead, it is more likely that pollsters are disagreeing with each other in their sampling methodologies. In other words, different pollsters have different “visions” of what the electorate will look like on November 4th, and these visions are affecting their results.

The bottom line here is that polls showing Obama at +15 could be correct, or they could be very wrong. The same goes for the polls which have Obama at +1: they could be correct or very wrong. Nobody really knows.

It all depends on the sample the pollster decides to use: how many Democrats, Republicans, and Independents they include, based on who they think will turn out to vote.
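To make Cost's test of individual polls against the average a little more concrete, here is a rough Python sketch. Cost does not spell out his exact method, so this is an assumption on my part: it weights each poll by its sample size and then uses a simple one-sample z-test at the usual 95% threshold (1.96), treating the average as a fixed benchmark. The polls, the function names, and the threshold are all hypothetical choices for illustration.

```python
from math import sqrt

# Hypothetical polls: (candidate share in percent, sample size).
# Illustrative values only, NOT the polls in the RCP average.
example_polls = [(53.0, 1000), (50.0, 800), (47.0, 1200), (51.0, 600)]


def weighted_average(polls):
    """Average the shares, weighting each poll by its number of respondents."""
    total_n = sum(n for _, n in polls)
    return sum(share * n for share, n in polls) / total_n


def z_score(share, n, benchmark):
    """Standard errors separating one poll's share from the benchmark.

    Assumes a simple random sample and treats the benchmark as fixed.
    """
    p0 = benchmark / 100.0
    standard_error = sqrt(p0 * (1.0 - p0) / n)
    return (share / 100.0 - p0) / standard_error


def outliers(polls, benchmark, critical=1.96):
    """Return polls whose share differs from the benchmark by more than `critical`."""
    return [(s, n) for s, n in polls if abs(z_score(s, n, benchmark)) > critical]


benchmark = weighted_average(example_polls)
print(f"weighted average = {benchmark:.1f}")
print("outside the expected range:", outliers(example_polls, benchmark))
```

At a 95% threshold you would expect only about one test in twenty to land outside the range by chance. Cost's 40% figure comes from 4 + 5 + 3 = 12 results at or beyond the boundary out of 15 polls x 2 candidates = 30 tests, which is far more than chance alone would explain.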

Here is something else interesting, from the Gallup organization:

PRINCETON, NJ — Gallup finds 13% of registered voters saying they will vote for president for the first time in 2008. That matches the figure Gallup found in its final 2004 pre-election poll.

The current data are based on interviews with more than 2,700 registered voters as part of Oct. 17-19 Gallup Poll Daily tracking. Gallup asked these voters a question it had asked in its 2004 election polling: whether this would be the first time they had voted in a presidential election, or whether they had voted for president before. Despite much discussion of the possibility of large numbers of new voters in 2008, the percentage of “first time” voters in Gallup polling this election cycle is no higher than it was at approximately the same time in 2004.

The vast majority of these “new voters” are in the 18-30 age range. They are the “youth” voters the Obama campaign is counting on in some areas of the country. This is interesting because the “youth” vote didn’t materialize for John Kerry in 2004. It remains to be seen whether the same thing will happen again this year.

Another interesting tidbit from Gallup about the breakdown of early voters:

Gallup reports this morning that about 11 percent of registered voters who plan to vote already have done so, and that they’re split almost evenly between supporters of John McCain and Barack Obama.

The 11 percent early voting rate is just a little higher than the 9 percent who’d voted at this stage in 2004, according to Gallup.

But another 19 percent tell the polling organization they plan to vote before this Election Day, meaning three out of 10 voters would have voted before then.

According to Gallup, the early vote has been split almost evenly between Obama and McCain. This is significant because it does not show a landslide of support in either direction; it suggests the race may be tighter than some polls indicate.

However, it’s difficult to draw concrete conclusions from early voting, and we really won’t know until the rest of us vote on November 4th.