Election polling

by Sujit Rathod
Number of replies: 7

A long post-mortem from the New York Times about election polling in the United States.

1. What epi study design is an election poll?

I'll encourage you to read through and pick out where they are talking about selection bias and where they are talking about measurement bias. There's no shortage of epi-related content here!

2. How would you improve polling for the next election?

In reply to Sujit Rathod

Re: Election polling

by Layth Hanbali
I think this whole moodle page is interesting but I've been a "lurker" - who knew all it would take to get me involved was a juicy election poll question? And on a Friday night too!

1. My gut reaction to this question was that an election poll is a cross-sectional study. But I'm also tempted to think of it as a series of mini cross-sectional studies of various demographic groups, which are then re-combined to better represent what the population looks like (what I mean is: a white, 35-year-old male manual labourer and a black, 75-year-old, university-educated female pensioner are part of the same cross-sectional study, but their "outcomes" are analysed as if they were part of two different studies). Or maybe that's totally normal for a cross-sectional study and I'm just not very creative...
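To make that re-weighting idea concrete, here's a toy post-stratification sketch in Python. Everything in it - the group labels, shares and respondents - is invented for illustration; nothing comes from the article.

```python
from collections import Counter

# Toy post-stratification: respondents in each demographic "cell" are
# re-weighted so the cell counts in proportion to its share of the
# electorate. Shares and respondents below are entirely made up.
population_share = {"young_urban": 0.3, "older_rural": 0.7}

respondents = [
    ("young_urban", "A"), ("young_urban", "A"), ("young_urban", "B"),
    ("older_rural", "B"),  # older rural voters under-respond in this toy sample
]

n = len(respondents)
cell_counts = Counter(cell for cell, _ in respondents)

# weight = population share / sample share, so under-represented cells count more
weights = {cell: population_share[cell] / (cell_counts[cell] / n)
           for cell in cell_counts}

# Weighted estimate of support for candidate A (the raw figure would be 2/4 = 0.5)
support_a = sum(weights[cell] for cell, vote in respondents if vote == "A") / n
print(round(support_a, 2))  # 0.2 once the older rural cell is weighted up
```

The point of the sketch is just that the weighted answer can differ sharply from the raw tally - and that the weights are only as good as the assumed population shares and the assumption that responders within a cell look like non-responders.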

There's a huge amount of stuff that talks about selection bias being the main source of error, some to do with the way polling is conducted, some to do with who is likely to accept the invitation to participate.

- I was struck by the fact that only 6% respond to polls now, compared to more than 50% in the 1980s. But striking as that is, it didn't seem to lead to differential misclassification before 2016. So although a large part of the analysis is about "some types of voters seem less willing to respond to polls than others, perhaps because they are less trusting of institutions, and these voters seem to lean Republican", that doesn't explain the accuracy of models pre-2016. The article does talk about this a bit: "Once researchers analyzed the data, they landed on an explanation for why low response rates were manageable: “Whether somebody was going to participate in a survey was not really related to the things surveys tended to measure”" - which is basically saying that selection bias didn't lead to differential misclassification before 2016.

- I wasn't sure whether this was measurement or selection bias: "Late-deciding voters, accurately identified as undecided in polls, broke strongly for Mr. Trump" - is it measurement bias because the act of sending a survey weeks in advance of the election is a tool that leads to underestimation of the support for Trump, or selection because sending a survey weeks in advance would inherently capture more Biden/Clinton supporters?

- There seemed to be more solutions focusing on fixing the measurement bias than selection bias, despite presenting very few concrete problems inherent to the measurement tools: "polling firms are asking whether they need to accelerate their shift to new research methods, such as surveying people by text message" (although there was mention at the end about the length of surveys)

- Then there are bits of analysis that were truly wild, dismissing selection bias by emphasising existing biases. "A much-hyped theory that Trump supporters lie to pollsters appears to be wrong or insignificant. Polls did not underestimate his support more in liberal areas, where supporting Mr. Trump can be less socially acceptable, than in conservative areas." Why does being open about supporting Trump in one area mean that all of Trump's supporters should be open about supporting him?!

- But basically the whole analysis and the quotes from pollsters don't give me much hope that they know what they're doing: "Decades ago, most people would be happy to answer the door to a stranger or answer the phone to a stranger," said Courtney Kennedy, the director of survey research at Pew, "and those days are long gone" - I have no idea why this is relevant.

The last two bullet points are why I have no hope that polling will improve in the future. But one sentence near the end is, I think, one of the best bits of analysis: "Some pollsters also wonder whether the problems may recede when Mr. Trump is not on the ballot." Going back to the first point I made (why did the low response rate not matter for so long and then suddenly it did?), the most convincing explanation for me is simply: Trump (or Brexit).

The methodology has always been flawed, and will always be flawed, because voting is insanely complex, and reducing the political choices of tens or hundreds of millions of people down to a poll of 1,000 is just bad science because it's so reductionist (it's also a bit sexist and racist: women voters do this, black voters think that, whereas white men get something like 70 different analysis charts). But the political fight had always been between two close-to-centre groups which looked, spoke and legislated quite similarly to each other, so the measurement and selection errors weren't fatal for the polls because there wasn't that much to choose between. Then in 2016, for the first time in decades, the two sides were speaking completely different languages and behaving in totally different ways, and the polarisation actually "meant" something. So those errors were massively exaggerated and exposed things that had been "averaged out" before. Hence, Trump (and Brexit).

Again, this point is made but not really expanded upon: "These voters do not fit any one demographic group, which is part of why they are so difficult to reach. Instead, they appear to be a distinct group of voters within some groups." Assuming that people vote by demographic group is just... ugh.

Anyway, I have no idea if anyone was interested in my 1,000-word personal commentary on election polls, but here it is anyway, in the deep dungeons of Moodle on a Friday night! I'll try to answer #2 before I go.

2. I don't think they can be fixed without surveying 1% of the population (but, like, not through Facebook and Twitter), and I don't think anyone has the time or money for that, so I say we stop polling and just campaign for what we believe in! Eventually we win or we lose, but we regain all the time we would have lost on inherently flawed (but very interesting) data.

^ But that's not going to happen because there's too much money in it, so in reality, we're just going to keep obsessing over those numbers every election until the revolution (or the meteor) comes.
In reply to Layth Hanbali

Re: Election polling

by FATHIMA MINISHA
Hey Layth....
It was a pleasure reading your post!! Thank you for sharing it with us...
I agree with you... polling sounds like a cross-sectional study, as it captures the views of responders at a particular point in time before the election - hence the fact that the poll results lose validity if somebody changes their mind, switches, or just plain decides not to vote. Just as cross-sectional studies are limited in not being able to say what happens after that point in time.
The fact that only 6% respond to polls nowadays is definitely where the selection bias starts. They do point out that this was a major problem in the 2016 election, as they failed to represent significant sections of society - which apparently they tried to rectify this year (just not sure how successfully).
I completely agree with you on why it has the spotlight now when the methodology has always been flawed. Previously, just by pure chance and luck, the polls predicted accurately. It's more in focus now because it actually matters who wins... and relying too much on the poll results might have cost both parties precious campaign strategy.

I have no practical ideas on how to improve the polling for next time. But theoretically, from an epidemiological point of view, I think there is a need to study why exactly only 6% of people respond to polls... that's probably a starting point, along with figuring out what can be done about it. Getting more of the under-represented people to respond to the polls is one way to start handling the selection bias. The non-responders are important in our clinical trials as well. It's recommended to put effort into finding out why people have not responded to questionnaires or follow-up, so as to figure out how much selection bias has occurred in the sample under study, what the reason is, and whether there is any way to account for that loss...

Fathima
In reply to FATHIMA MINISHA

Re: Election polling

by Bethany Evans
Morning folks. @Layth - very much enjoyed your Friday night commentary on the polls article! A few thoughts from me...

(1) Agreed, polls are a cross-sectional study - one that presumably aims to weight responses by demographics to get a picture of the full population. There are some more details on how this is done in another article I read - https://www.vox.com/policy-and-politics/2020/11/10/21551766/election-polls-results-wrong-david-shor

(2) The biggest selection biases appear to be happening where there is some correlation between voting for a given outcome (e.g., Trump or Brexit) and low levels of trust, which translates into non-responding to polling surveys. This suggests that rather than just polling more people using the same methods (phoning people on a landline in 2020 - really!?) alternative approaches are required e.g.,:
(a) In-person, qualitative, surveys: would require a lot of effort to deploy people to the streets to quiz people in different areas - but may get a more representative response from people who wouldn't answer their phone?
(b) Other non-phone-based survey methods: via Facebook, email, text etc., if that feels more "anonymous" and easier to respond to - though honestly I think this would skew more towards millennials who hate answering the phone rather than the voters thought to be missing from current polls in the US
(c) More weighting of responses: I guess that trying to weight for levels of trust in responders and essentially account for Trump supporters who don't respond to surveys or "undecided" voters who later vote Trump might be possible? This may require adding a question to infer trust to some of the polls
(d) Focusing on 'swing' areas - I'm not American and my understanding of exactly how American voting works isn't great, but I feel like there is a subset of states with real potential to swing between either party. I would be interested to see data showing the stickiness of state voters to given parties, and then to focus more detailed polling (e.g., in person or focus groups) on states where outcomes have changed over time.
(e) Other quantitative analyses of voter preferences: so not really polling but is there any useful data that can be gleaned from things like Twitter or other networking sites where people vote and voice political opinions? I'm skeptical because of the existence of bots and the fact those with strong political opinions tend to post most (e.g., responses to Trump's tweets are either idolizing Trump or vilifying him) - but wondering if trends scraping political posts could be used to infer anything?

Final point from me - I feel like election polls are a relatively rare cross-sectional survey in that the results are used almost immediately to (a) inform allocation of scarce resources (e.g., should Biden go to Iowa or Arizona for a rally?) and (b) solicit more funding given the huge costs of campaigning (particularly in the US). Given the challenges with polling I feel like maybe less weighting should be given to polls to guide both of these areas - in line with what Layth was saying - i.e., campaign more for something than against something.
In reply to Bethany Evans

Re: Election polling

by Jone Garcia Lurgain
Morning everyone!

Sujit, thanks a lot for this non-medical Epi post!!

Layth, Fathima and Bethany, thanks for your comments. I really enjoyed reading them and learnt a lot!

Some thoughts from my side:

1. What epi study design is an election poll?

I agree with all. This seems like a population-based cross-sectional study with descriptive data (outcome: presidential voting intention).

I guess another, analytical cross-sectional study could address the correlation between sociodemographic characteristics and political position. In that case we would need to deal with confounders, as you can have a political position and at some point decide to 'punish' your candidate (by voting for another one) because of bad governance, or whatever... I guess that's why the proportion of undecided voters is important in poll designs?


2. How would you improve polling for the next election?

I have added below some reflections and ideas about how to minimize bias and therefore improve polling.

Selection bias:

It seems that the main issue for pollsters is constructing a representative sample of the target population (US eligible voters), due to low response rates (6% now compared with over 50% in the 80s) and, I would add, survey design issues (often linked to costs/budget). Low response rates and survey costs make it almost impossible to eliminate selection bias, so the key point would be how to reduce it:

1. Increase response rates to reduce selection bias. I agree with Fathima that it would be helpful to conduct a more qualitative study to explore who the non-responders are, why people do not respond to telephone surveys (this article gives some reasons: https://www.pewresearch.org/fact-tank/2019/02/27/what-our-transition-to-online-polling-means-for-decades-of-phone-survey-trends/ ), and under what circumstances they would respond to election polls. Perhaps the poll industry needs to change its telephone survey protocols or simply replace the data collection technique (instead of random telephone surveys with live interviewers, use online polls, as proposed by Pew Research Center, or text messages, as proposed by others). Bethany mentioned different options as well!

2. Construct a representative sample to reduce selection bias. Under-representation of specific population groups is one of the major sources of selection bias in all election polls. I think it is important to have up-to-date descriptive data on eligible voters, stratified by age, sex, residency, socio-economic and education levels, ethnic group, political leaning, and also by type of voter (e.g. undecided or late-deciding, refuse-to-vote, new voters).

With this information (and a good budget) you could combine stratified (e.g. by sociodemographic parameters) and multi-stage (e.g. regions, districts, etc.) sampling strategies to get a more representative sample. I would concentrate efforts on the regions with the lowest response rates and on relevant groups (e.g. undecided voters, new voters). For example, I would wonder how the responses of young, newly eligible voters (the millennials) would affect the polls' results.
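As a quick sketch of the proportional-allocation step in a stratified design - how a fixed budget of interviews might be split across strata - here is a toy example. The regions and population shares are illustrative only, not taken from any real poll.

```python
# Proportional allocation: split a fixed budget of 1,000 interviews across
# strata according to their (illustrative) share of the population.
strata_share = {"northeast": 0.18, "midwest": 0.21, "south": 0.38, "west": 0.23}
total_n = 1000

allocation = {stratum: round(share * total_n)
              for stratum, share in strata_share.items()}
print(allocation)  # {'northeast': 180, 'midwest': 210, 'south': 380, 'west': 230}
```

In practice one might deliberately over-sample the hard-to-reach strata beyond their proportional share (and down-weight them later), which is exactly the "concentrate efforts" idea above.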

As Layth mentions, what seems interesting is that some studies found no association between response rate and accuracy (Pew Research Center). I would like to read those studies, as it seems that response rate and sample size are crucial in all quantitative studies, aren't they?

Information bias:

Another information/measurement bias could be related to the survey protocol/questionnaire. Obviously, you need some knowledge of the country's election system (Congress, Senate), but, simplifying, if you ask binary questions (Biden vs Trump) and ignore questions about the Senate candidates, you might be missing information on those voters who vote for both Biden and Republican Senate candidates, and get some misleading results.

I would also like to add some lines about social desirability bias, which is often a source of information bias in election polls. Social desirability bias refers to the tendency of research subjects to give socially desirable responses instead of responses that reflect their true feelings. In the case of this election, different factors might have contributed to this type of bias. As mentioned in the article, “some combination of the coronavirus, Mr. Trump’s reaction to police brutality and his erratic behavior at the first debate had put Mr. Biden within reach of the most lopsided presidential win since Ronald Reagan’s in 1984”. These factors might have led people not to give true responses about their vote, and as a result the polls might underestimate Republican voters in liberal areas, where supporting Trump can be less socially acceptable than in conservative areas.

It seems that media outlets, including the NYT, are re-evaluating how they portray polls, planning to give them less prominent coverage in future. I think this is really interesting, because the media have a massive influence not only on politicians' campaigns, but also on voters' opinions, attitudes and actions. As said before, voters may change their minds over the course of the campaign, so I agree with Fathima about questioning the validity of poll results, and conducting polls close to election day would give more accurate results. However, in my country at least, campaigning and publishing polls close to election day are not allowed by law, as the poll results could influence the final election outcome.

From an epidemiological point of view, polls can never be perfect, and it seems there is an acceptable "range of historical polling errors" in the US, so the point is how to reduce random error and systematic error... I guess the answer is with the help of stats experts (to improve inferences from sample to population) and good insight into the parameters of the population under study, respectively.
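To put a rough number on the random-error part: the textbook 95% margin of error for a proportion from a simple random sample can be sketched as below. This captures sampling error only, not the selection and measurement problems discussed above, which is exactly why real polling errors can exceed it.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Textbook 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# A poll of 1,000 people at 50/50 support carries about +/-3.1 percentage
# points of purely random error, before any systematic bias is considered.
print(round(margin_of_error(0.5, 1000) * 100, 1))  # 3.1
```

Quadrupling the sample only halves this figure (the error shrinks with the square root of n), which is one reason "just poll more people" cannot fix a systematically biased sample.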

I must say that I am not an epidemiologist. Actually, I am a journalist with some years of experience in politics, covering political campaigns in my small Basque Country and in Spain, and "translating" poll results/stats for a wider audience. I have ambivalent feelings about the need for election polls. I guess we could live without them - they are really designed to serve various interests... However, I also think these polls may play a role in encouraging people to participate in elections. In some countries there is a big problem with 'absent voters', which leads to a low participation rate in elections, meaning that governments/decisions may not accurately represent the population. But that is another debate!! Bear in mind, too, that not everyone who responds to a poll later goes out to vote!

Have a good day, guys!
In reply to Jone Garcia Lurgain

Re: Election polling

by Bethany Evans

Hey Jone, 

I like your additional ideas and the reflections on social desirability bias! 

In particular, I'd love to see a breakdown of the election polls by the categories you suggest (age, sex, income, socio-economic status, region etc.) to see if there is any particular demographic that is under-represented with an insufficient sample size in the polls?

I was also wondering, does the US find they get more accurate forecasts from the exit poll data? (e.g., here - https://www.bbc.com/news/election-us-2020-54783016) or are they as subject to bias? If not, I'd be interested to compare the pre-election polls with the exit polls, by categories, to see which differ more from the "actual" results? This could then identify where to follow-up for additional effort in earlier polls, or find ways to adjust based on expected swings later on. Has anyone seen anything like this - a quick look only showed the really high-level exit polls on the BBC.

Beth

In reply to Bethany Evans

Re: Election polling

by Layth Hanbali
I think exit polls have generally performed quite well, which makes sense because they eliminate quite a bit of bias by design. They recruit only those who have actually voted (removing people who answer polls but don't end up voting), and they ask participants to recall an action they took in the immediate past, rather than about potential future intent. The design doesn't eliminate social desirability bias, but it removes most other sources.

I like the idea of comparing sub-groups in exit polls and opinion polls!
In reply to Layth Hanbali

Re: Election polling

by HANY IBRAHIM KHALIFA MAHMOUD
Hi Colleagues,

Thank you very much for the great insights you shared and unique ideas you proposed to improve polling in the next election.

I agree with you all that the election poll as described in the article looks like a cross-sectional study. As most of you mentioned, given the political nature of the study's objectives, we will definitely face different forms of bias and confounding.

If I were to recommend two points for consideration for the next election:

  • I would say the process should be designed as an ecological study rather than a cross-sectional study. Collecting individual responses could be challenging; however, collecting information about the whole population of a specific state could be more feasible, with a higher level of validity.

  • The data collection process should be upgraded: standard interviews, surveys and phone calls are quite subjective, and participants can be misleading. I would rather recommend collecting information through channels that are less prone to bias; for example, using artificial intelligence to analyse which political terms residents of a specific state are searching for on Google, with data collected anonymously. This could reduce the bias and the subjective impact.

These are some preliminary thoughts in addition to all the points we have already mentioned. My impression is that applying scientific research principles to this election polling process would be a real challenge.

What do you think?

Regards,
Hany