Sampling and the Presidential Election, part 3 of 3.

In the first part of this three-part series, we talked about some basics of sampling, including some definitions. In the second part, we talked about various types of sampling.

In this final installment, we will talk about the Presidential election polls, and important dynamics to consider when taking samples and making inferences. 

Before going further, I want to point out something regarding discretion.  Part of a model might include some judgement calls.  When we take a sample and make a conjecture about the population (called making an inference), a researcher must very often make judgement calls that might be based on anecdotal information or even “gut feel”.  Thus, as you will see, I will give my beliefs on some nuances related to the 2020 Presidential polls.  Those beliefs may be different from yours, and unlike many things in mathematics, it can be hard to refute one theory or another.  This can be an art as well as a science.

I briefly mentioned bias in the last article, and it is important to elaborate on it.  We said last time that bias can occur when the sample is not a good representation of the population.  The most famous case of bias happened in 1948 in the Presidential race between Harry Truman and Thomas Dewey.  In fact, the Chicago Daily Tribune ran the headline “Dewey Defeats Truman”.  Why would it do that when Dewey clearly did not end up winning?  The paper relied on the polls, and the polls it relied on were biased.  How were they biased?  A commonly cited explanation is that back then, for the most part, only affluent people had telephones, and the survey was done by telephone.  Thus, the people giving responses were not a representative subset of the population and were more likely to favor Dewey.

Bias can be a hard thing to anticipate.  Obviously, if it had been anticipated in 1948, it would not have been a problem.  The pollsters would have allowed for it in some way.  Also, even though we mentioned they were not taking representative subsets of the population, bias really occurs only where the misrepresentation affects what is being measured.

For example, if we wanted to know the percent of people who like chocolate ice cream better than vanilla ice cream, and we surveyed 80 males and 20 females, we might not have bias.  I stress ‘might not’ because males and females might have the same ice cream flavor preferences.

But what if we took that same 100 people and asked whether each person prefers a love story or an action film?  If you are trying to generalize to the whole population, you are almost certainly going to get a distorted picture when sampling 80 males and 20 females.
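To put rough numbers on the difference between those two surveys, here is a small Python sketch.  The preference rates and the 50/50 population split are made-up values chosen purely for illustration, not survey results, and the function name `sample_estimate` is my own.

```python
# Hypothetical illustration: an 80/20 male/female sample distorts the
# estimate only when the two groups actually differ on the question asked.

def sample_estimate(share_male, rate_male, rate_female):
    """Expected 'yes' rate in a group with the given share of males."""
    return share_male * rate_male + (1 - share_male) * rate_female

POPULATION_MALE_SHARE = 0.5  # assumption: the population is half male
SAMPLE_MALE_SHARE = 0.8      # the 80 males / 20 females sample

# Assumed rates, invented for this example.
ice_cream = {"rate_male": 0.6, "rate_female": 0.6}  # prefer chocolate
movies = {"rate_male": 0.7, "rate_female": 0.3}     # prefer the action film

for name, rates in [("ice cream", ice_cream), ("movies", movies)]:
    truth = sample_estimate(POPULATION_MALE_SHARE, **rates)
    polled = sample_estimate(SAMPLE_MALE_SHARE, **rates)
    print(f"{name}: population {truth:.2f}, 80/20 sample {polled:.2f}")
```

Under these assumed rates, the ice cream estimate is the same either way (0.60), while the movie estimate moves from 0.50 in the population to 0.62 in the lopsided sample.  The imbalance only matters when the groups differ on the question.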

So, anyway, this poll was heavily slanted towards voters liking Dewey, and it distorted the general sentiment.

To summarize, we want to avoid bias in our polling, and bias is any factor (in this case, wealth) that may cause the sample to misrepresent the population (recall, the population is what is of interest to the researcher).

Let us fast forward from 1948 to 2020.  Dynamics have changed.  You have much more polarization between the parties.  You have a media that many consider much more biased.  I am going to try to stay away from politics here, but since I am hopefully speaking to a common-sense audience, it is obvious what is going on with the media and their bias towards Democrats.  I believe this is relevant to polling today for reasons that I will explain, and that is the reason I mention it.

The three main concepts I want to discuss are bias, honesty, and volatility, and I believe each of them is connected in some way to the media.

Now, one might ask, what is the population in the context of Presidential polls?  There is no rule as to how we define a population; it really depends on what is of interest, and this is a subtle yet important distinction.  Is it registered voters?  Is it likely voters?  You will see some of each.  That said, it is the consensus (which I agree with) that likely voters are better to poll because, of course, they are the ones that are… well… likely to vote.  So, right off the bat, we might have bias if we were to sample registered voters.

Let us assume a reputable polling organization takes a poll of 1000 voters.  Let us also assume there is a decent mixture of Democrats, Republicans, and Independents.  Why might there be bias?

Well, perhaps the most obvious thing to look at is how the sample is split.  Assume that the electorate is made up of 40% Republicans, 40% Democrats, and 20% Independents.  Then without doing any further research, we would want to have our sample split in that fashion (this is called stratified sampling).
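As a rough sketch of what that split means for the 1000-person poll, the short Python snippet below divides the sample across the three groups in the same proportions as the assumed electorate.  The 40/40/20 split is the assumption from the paragraph above, and the function name `proportional_allocation` is my own.

```python
# Proportional allocation: split a sample across strata in the same
# proportions as the (assumed) electorate.

def proportional_allocation(sample_size, shares):
    """Return the number of respondents to draw from each stratum."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("stratum shares must sum to 1")
    return {group: round(sample_size * share) for group, share in shares.items()}

# Assumed electorate split, taken from the example in the text.
electorate = {"Republican": 0.40, "Democrat": 0.40, "Independent": 0.20}

print(proportional_allocation(1000, electorate))
```

With these numbers the poll would target 400 Republicans, 400 Democrats, and 200 Independents.  For shares that do not divide the sample evenly, the rounded counts may not add up exactly to the sample size, so a real design would need to adjust for that.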

Now, I do not want to bog you down with granularity, but we could get very specific if we wanted to.  For example, voters from one party might have a greater propensity than those from the other to switch sides.  But if there is such a tendency for either party, it is probably negligible.

All this said, and without having followed the polls ultra-closely, anecdotally it seems to me that Democrats are slightly more heavily represented than they should be in many polls.  At least I have seen that in some cases.  That is a slight bias.

I mentioned honesty.  The fact is that today, with the ‘deplorables’ label and the general mindset that Fox News is not mainstream (why is it that doctors’ offices and the like almost never show Fox News?), it is believed that some Trump voters do not want to divulge who they like, and I think that is in part because they do not want to be the ‘bad guy’, so to speak.

And finally, volatility.  This is something that is not really written about in textbooks, but I believe it is relevant in today’s political landscape, and I say this largely due to the media.

I believe that the media is the most biased it has been in U.S. history.  What makes this relevant is that the media influences some “soft” voters, and those soft voters, I believe, are more likely to say they are Democrats and vote Republican than vice versa.

Then there is the matter of electoral votes and not just the popular vote.

As of this writing and depending on the poll(s) you might follow, Joe Biden is up about seven points overall, and up in most battleground states by three to seven points.

I believe that with bias, honesty, and volatility all potentially leaning back in Trump’s favor, this vote will be awfully close again. 
