Weighting by Recalled Ballot
Should you do it?
With the release by NYT/Siena of a new Pennsylvania poll whose recalled ballot came in at Biden +8 for 2020, compared with his actual 1.2-point win margin, the fierce debate over weighting or setting quotas on recalled ballot has exploded again.
A lot of ink has been spilled by people much smarter than me, like Nate Cohn, but I do think this is an interesting Rorschach test about polling. The basic premise is that pollsters often ask people who they voted for in 2020 and use the answers to ensure the partisanship of the sample accurately reflects what we think Election Day might look like.
Now, in states like Virginia or Georgia, where there is no party registration on the file, the need for recalled ballot as a stand-in for partisanship is a little clearer. Pennsylvania, however, not only has registration on the file but is one of the few states where independents make up a small (but important!) part of the electorate.
The case for using recalled ballot is that who you voted for in the last election is highly correlated with who you will support in the next one. And since we have the election results, we know how many people voted for each side last time, which gives us at least some hard data to go by.
This helps correct for response bias and ensures that neither group is too overrepresented in the poll. It’s also easy and relatively cheap for pollsters to do and doesn’t require a lot of advanced math.
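To make the arithmetic concrete, here is a minimal sketch of weighting to recalled ballot. The targets approximate the 2020 Pennsylvania result (roughly Biden +1.2); the sample shares and the within-group margins are invented for illustration and are not taken from any real poll. It also ignores respondents who did not vote in 2020, which a real design would have to handle.

```python
# Targets: approximate 2020 Pennsylvania result among those who voted.
target_share = {"biden": 0.500, "trump": 0.488, "other": 0.012}

# Hypothetical raw sample: recalled 2020 vote comes in at Biden +8.
sample_share = {"biden": 0.53, "trump": 0.45, "other": 0.02}

# Hypothetical current Democratic margin inside each recalled-vote group.
dem_margin_by_group = {"biden": 0.90, "trump": -0.90, "other": 0.0}


def recalled_vote_weights(sample, target):
    """One weight per recalled-vote group: target share / sample share."""
    return {group: target[group] / sample[group] for group in sample}


def topline_margin(shares, margins):
    """Overall margin as the share-weighted average of group margins."""
    return sum(shares[group] * margins[group] for group in shares)


weights = recalled_vote_weights(sample_share, target_share)

# After weighting, each group's share of the sample equals its target share.
weighted_share = {g: sample_share[g] * weights[g] for g in sample_share}

raw_margin = topline_margin(sample_share, dem_margin_by_group)
weighted_margin = topline_margin(weighted_share, dem_margin_by_group)

print(f"raw margin:      {raw_margin:+.3f}")
print(f"weighted margin: {weighted_margin:+.3f}")
```

With these made-up inputs, Biden recallers are weighted down and Trump recallers are weighted up, and the topline moves from about D+7 to about D+1. That is the whole appeal, and also the whole risk: the topline ends up largely determined by the target you chose.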
The downsides, though, are pretty significant. The first is that you pretty much lock in the key composition of the electorate and have no way to know if something has happened that affects turnout. This was the old debate around setting quotas for party, which I used to never do for this very reason. For example, in PA the partisan composition has varied from D+0 to D+7 in the last few years, and if you set it you are baking a lot of the cake. However, the massive response bias issues have in some ways forced a choice between adjusting for not getting enough hard-to-reach Trump Republicans into the sample and letting the composition float.
However, since party is on the voter file and we have many years of examples, including odd years, low-turnout races, etc., we have a real sense of the range of possible outcomes. The other challenge with doing this with recalled vote in a state is that we really only have two data points (2016 and 2020), and who you voted for in 2020 is not on the voter file.
This is somewhat similar to the arguments around education quotas, where we also have nothing on the file AND, unlike recalled Presidential ballot, the modeled data available usually sucks. I take the same view on recalled ballot as I do on education. I watch it on every survey and will often mention it: the results are X, but please note the sample may be a little overeducated, meaning things may be a little worse in reality. If two nights into the survey it's wildly out of whack, then we may, for example, suppress any additional postgraduate interviews. However, setting hard and fast numbers seems to me a bit too inflexible about the composition of the electorate, though I do think there’s a legitimate debate on both sides of the argument.
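The "watch it, and only step in if it is wildly off" approach can be sketched as a simple check run between nights of interviewing. Everything here is hypothetical: the function name, the 10-point tolerance, the expected education shares, and the interview counts are placeholders, not real Pennsylvania figures or anyone's actual field rules.

```python
def groups_to_suppress(counts, expected_share, tolerance=0.10):
    """Return groups whose share of completed interviews exceeds the
    expected share by more than `tolerance` (a judgment call, not a rule)."""
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        group
        for group, n in counts.items()
        if n / total - expected_share.get(group, 0.0) > tolerance
    ]


# Hypothetical completes after two nights in the field.
completes = {"postgrad": 90, "college": 100, "non_college": 110}

# Hypothetical expected composition of the electorate.
expected = {"postgrad": 0.15, "college": 0.25, "non_college": 0.60}

flagged = groups_to_suppress(completes, expected)
print(flagged)
```

In this made-up case only the postgraduate group is far enough over its expected share to be flagged. The college group is over too, but within the tolerance, so it is left to float.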
However, if I came back with a D+8 party survey or a Biden +8 recalled survey in a state where there is no history of such an outcome, I certainly would weight back at least somewhat toward the norm or, if budget allowed, add interviews to see if it came back at all.
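One way to picture "weight back at least somewhat" is a target that floats freely inside the historical range and gets pulled only partway back when the sample lands outside it. This is a sketch of that idea, not a standard method: the function name and the `pull` parameter are hypothetical, and the D+0 to D+7 range is the Pennsylvania party range mentioned earlier.

```python
def partial_target(sample_margin, hist_low, hist_high, pull=0.5):
    """Return the margin to weight to.

    Inside [hist_low, hist_high] the sample is left to float. Outside it,
    the target moves toward the nearest edge of the range by the fraction
    `pull` (0 = leave it alone, 1 = snap to the edge).
    """
    if hist_low <= sample_margin <= hist_high:
        return sample_margin
    nearest_edge = hist_low if sample_margin < hist_low else hist_high
    return sample_margin + pull * (nearest_edge - sample_margin)


# A D+8 party sample against a historical D+0 to D+7 range.
print(partial_target(8.0, 0.0, 7.0))   # pulled halfway back toward D+7
print(partial_target(5.0, 0.0, 7.0))   # inside the range, left alone
```

How hard to pull is exactly the judgment call described above. A hard quota amounts to always pulling all the way to a single fixed number, which is the inflexibility being argued against.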
Politics: a little science, a lot of art.