The nerds have taken over the political space over the last decade-plus, as the tools that started to revolutionize sports over the previous decade have been brought to bear on politics with wild success. Every major election now attracts a mind-numbing amount of numerical and mathematical analysis, and poll averages, forecasts, and other numerical tools abound. For many, poring over polls has become as much of a pastime as following what the candidates themselves are doing, if not more.
Perhaps surprisingly, there is no single “poll average”; indeed, there seem to be as many different poll averages as there are outlets collating the polls. The two most prominent and widely cited are the ones from RealClearPolitics and FiveThirtyEight, and as the race for the Democratic presidential nomination has progressed I’ve found that neither of them quite fits what I want. RealClearPolitics publishes a straight average of whatever polls they record and deem worthy, usually from the most prominent outlets, over whatever period they choose. The only quality control, if any, is in which polls are included; among those, there is no attempt to control for sample size, methodology, or overall quality, and a poll simply ages out of the average once it gets too old (however “too old” is defined) or the next poll from the same pollster comes along.
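To make the “straight average” idea concrete, here is a minimal Python sketch of how such an average could be computed. It is only an illustration of the behavior described above, not RealClearPolitics’ actual procedure: the function name `straight_average`, the 14-day window, the choice to average each candidate only over the polls that list them, and the pollsters and numbers in the sample data are all assumptions made up for the example.

```python
from datetime import date, timedelta


def straight_average(polls, as_of, max_age_days=14):
    """Unweighted average of the latest in-window poll from each pollster.

    The 14-day window is an assumed stand-in for "too old"; the real
    cutoff used by any given outlet is not specified in the text.
    """
    cutoff = as_of - timedelta(days=max_age_days)

    # Keep only the most recent in-window poll per pollster, so a newer
    # poll replaces that pollster's older one.
    latest = {}
    for poll in polls:
        if poll["end_date"] < cutoff or poll["end_date"] > as_of:
            continue  # aged out of the window, or not yet released
        current = latest.get(poll["pollster"])
        if current is None or poll["end_date"] > current["end_date"]:
            latest[poll["pollster"]] = poll

    # Collect every candidate named in at least one surviving poll.
    candidates = set()
    for poll in latest.values():
        candidates.update(poll["results"])

    # Plain mean for each candidate: no weighting by sample size,
    # methodology, or pollster quality. Assumption: a candidate is
    # averaged only over the polls that actually include them.
    averages = {}
    for name in candidates:
        shares = [p["results"][name] for p in latest.values()
                  if name in p["results"]]
        averages[name] = sum(shares) / len(shares)
    return averages


# Hypothetical polls; pollsters, candidates, and numbers are invented.
polls = [
    {"pollster": "Pollster A", "end_date": date(2020, 2, 10),
     "results": {"X": 30.0, "Y": 20.0}},
    {"pollster": "Pollster A", "end_date": date(2020, 2, 18),
     "results": {"X": 26.0, "Y": 24.0}},
    {"pollster": "Pollster B", "end_date": date(2020, 2, 15),
     "results": {"X": 22.0, "Y": 28.0}},
    {"pollster": "Pollster C", "end_date": date(2020, 1, 20),
     "results": {"X": 40.0, "Y": 10.0}},
]

average = straight_average(polls, as_of=date(2020, 2, 20))
print(sorted(average.items()))
```

In this made-up data, Pollster A’s older poll is replaced by its newer one and Pollster C’s poll has aged out, so only two polls count, each with equal weight no matter how large or well-conducted it was.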
FiveThirtyEight, on the other hand, weights its poll average based on those factors, but the details of their methodology aren’t public, and the average also incorporates their own model’s assumptions about how the race should develop. In the days immediately after a contest, the “average” tries to predict how much of a “bump” candidates will get based on their performance, and states with little recent polling have their “average” extrapolated from larger national trends. Such extrapolations don’t always incorporate mitigating factors or common sense; for example, the current FiveThirtyEight “average” of South Carolina has Mike Bloomberg in fourth place at 9.5%, despite him not actually being on the ballot there. The copious polling conducted in South Carolina that doesn’t include Bloomberg is merely interpreted as failing to catch whatever bump Bloomberg might have received. The result is so complex, with so many moving parts, that it’s hard to accurately call it a “poll average” at all; it’s more an attempt to capture the state of the race based on local and national trends and past history, and FiveThirtyEight themselves readily admit that it’s not really intended to be much more than the backbone of their election forecasts. It’s useful in its own way, but it’s not really the best way of capturing what the polls are actually saying right now, which is what RealClearPolitics and most other media outlets try to do. But is there a middle ground between a straight average of the topline numbers and FiveThirtyEight’s complex model?