With Senator Warnock safely re-elected in Georgia, and the counting done for House seats, we can follow up on the series of stories I posted about the pollsters and pundits leading up to the midterms (see the final preelection story here) and see how their assessments of the election fared. The political sites I was following in the run-up to the election included the following (with abbreviated names for the table headings):
FiveThirtyEight: Senate ratings and House forecast. Provides estimates of election-day vote based on both a polling average and a “deluxe” model that includes historical trends and expert ratings. (538)
Electoral-Vote: Provides a simple last-week polling average for Senate races. (E-V)
RealClearPolitics: Another polling average, but broader than Electoral-Vote’s. (RCP)
270towin: Yet another polling average and algorithm. (270)
Cook Political Report: Expert ratings of each race (solid, likely, lean, or tossup). (Cook)
Sabato's Crystal Ball: Also expert ratings (safe, likely, lean, or tossup). (Sabato)
Inside Elections: Yet more expert ratings (solid, likely, lean, tilt, or tossup). (Inside)
First, let’s take a look at the Senate outcomes. The numbers/predictions from each site are those available on Sunday morning, 11/6, so they would not have included any final tweaks made on Monday 11/7; however, I don’t think any such changes would have been notable. The new first column is the actual vote spread from the final results. I converted the final numbers from 538 into the same format I used for the polling aggregators. The result shown for GA is the election-day result, in which Warnock finished ahead of Walker even though he fell short of the 50% threshold needed to avoid the runoff election.
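Converting 538’s vote-share estimates into this spread notation is simple arithmetic. A minimal Python sketch (the percentages in the example are illustrative, not 538’s actual figures):

```python
def to_spread(dem_pct: float, rep_pct: float) -> str:
    """Convert two-party vote percentages into the 'D +x.x' / 'R +x.x' spread notation."""
    margin = dem_pct - rep_pct
    if margin == 0:
        return "tie"
    party = "D" if margin > 0 else "R"
    return f"{party} +{abs(margin):.1f}"

# Illustrative values only, not 538's published estimates:
print(to_spread(51.2, 47.0))  # D +4.2
print(to_spread(48.0, 48.5))  # R +0.5
```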
Senate Races

| State | Actual vote | E-V | RCP | 270 | 538 (polls) | 538 (deluxe) | Sabato | Cook | Inside |
|-------|-------------|-----|-----|-----|-------------|--------------|--------|------|--------|
| AZ | D +4.9 | D +1 | D +1.0 | D +1.0 | D +4.2 | D +2.0 | lean D | tossup | tilt D |
| FL | R +16.4 | R +6 | R +7.5 | R +7.0 | R +7.7 | R +9.7 | likely R | likely R | likely R |
| GA | D +0.9 | D +1 | R +0.4 | D +0.2 | D +0.3 | R +0.5 | tossup | tossup | tossup |
| IA | R +12.2 | R +12 | R +12 | R +11.3 | R +10.2 | R +13.0 | likely R | likely R | likely R |
| NV | D +0.9 | R +6 | R +2.4 | R +2.2 | R +1.2 | R +1.2 | tossup | tossup | tossup |
| NH | D +9.2 | D +2 | D +0.7 | D +2.4 | D +3.5 | D +3.9 | lean D | lean D | tilt D |
| NC | R +3.2 | R +3 | R +5.2 | R +3.8 | R +3.0 | R +4.8 | lean R | lean R | tilt R |
| OH | R +6.6 | R +9 | R +5.0 | R +2.7 | R +3.4 | R +5.6 | lean R | lean R | lean R |
| PA | D +4.9 | D +1 | R +0.1 | R +0.6 | D +0.8 | D +0.4 | lean D | tossup | tossup |
| UT | R +10.4 | R +6 | R +10 | R +11.3 | R +9.7 | R +9.7 | likely R | likely R | likely R |
| WI | R +1.0 | R +3 | R +2.8 | R +2.8 | R +3.2 | R +4.2 | lean R | lean R | tilt R |
Senate overview: E-V and the 538 polls-only model both correctly forecast the winner of every Senate race I was tracking except NV. 538 was closer to the mark in some races and E-V in others, so both were good indicators of how the election would actually turn out. Taking their estimates at face value, however, would have left you more pessimistic about Democratic chances than necessary, given the misfire in NV.
270 fared slightly worse, incorrectly showing a razor-thin GOP advantage in the PA Senate race. Fetterman’s final vote margin turned out to be stronger than anyone was predicting, but at least E-V and 538 called it in the right direction. RCP did even worse, incorrectly calling for GOP wins in both PA and GA on election day, missing 3 out of 11 races.
The 538 deluxe model (polling + pundit predictions + secret sauce) produced an odd mix of results, incorrectly forecasting the GA result and generally doing worse, though in some cases better, than the polls-only model. The instances where it fared notably better were red-state races where the GOP was favored anyway, with Democratic candidates showing some signs of early promise but losing by large margins (FL, IA). It similarly did better in OH, where the R advantage in this red-trending state showed up in Vance’s larger-than-expected win despite his generally poor campaign. In strongly contested states with Democratic wins (AZ, GA, PA), the deluxe model underperformed relative to the polls-only model, predicting smaller D margins and even indicating a R win in GA. It also fared worse in NC, where the polls-only forecast was notably closer to the actual R margin of victory.
In the race with the narrowest margin (GA), the polling aggregators were generally quite accurate, with none calling for a margin greater than 1%. Elsewhere, though, D wins were larger than the polling averages indicated (AZ, NH, PA). In NV, actually tied with GA for the closest race, the aggregators were simply wrong, and in WI the R win was substantially narrower than predicted. This is consistent with the picture of an overall election in which the Democrats experienced historic success at multiple levels (Senate, House, gubernatorial, takeover of state legislatures) for the party of a relatively unpopular incumbent president in a midterm election.
As for the pundits, we can simply count the number of races they called correctly. For this, I ignored the “tossup” ratings, where the pundits refused to commit one way or the other, and looked only at races where they picked an expected winner (regardless of their degree of confidence in the outcome). Sabato went 9 for 9; Inside Elections 8 for 8; and Cook 7 for 7. So the prize has to go to Sabato, the site most willing to go out on the proverbial limb and call a winner.
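The tossup-excluding tally described above can be sketched in a few lines of Python. The ratings dictionary transcribes Sabato’s Senate column from the table; the scoring function itself is my own illustrative helper, not anything published by the sites:

```python
def score_pundit(ratings: dict, winners: dict) -> tuple:
    """Count correct calls, ignoring tossups; returns (correct, called)."""
    # Keep only committed calls; the rating's last character is the party (D/R).
    called = {race: r[-1] for race, r in ratings.items() if r != "tossup"}
    correct = sum(1 for race, party in called.items() if winners[race] == party)
    return correct, len(called)

# Sabato's final Senate ratings and the actual winners, from the table above:
sabato = {"AZ": "lean D", "FL": "likely R", "GA": "tossup", "IA": "likely R",
          "NV": "tossup", "NH": "lean D", "NC": "lean R", "OH": "lean R",
          "PA": "lean D", "UT": "likely R", "WI": "lean R"}
winners = {"AZ": "D", "FL": "R", "GA": "D", "IA": "R", "NV": "D", "NH": "D",
           "NC": "R", "OH": "R", "PA": "D", "UT": "R", "WI": "R"}
print(score_pundit(sabato, winners))  # (9, 9)
```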
House results: The current balance in the House is now R 222, D 212. The one open seat is in a heavily D district in Virginia, so as soon as a special election is held the balance should be 222-213. The final forecasts from the sites tracked are below. The listed numbers for D and R seats count all those where either party was predicted to have an advantage, regardless of the degree of confidence.
RCP: R 228, D 174, T 33
270: R 222, D 200, T 13
538: R 220, D 205, T 10
Sabato: R 219, D 196, T 20
Inside Elections: R 216, D 199, T 20
Cook: R 212, D 188, T 35
So 270, 538, and Sabato came closest to the final count of Republican seats. RCP and Cook did worst, far understating the final total of Democratic seats, mainly because of the very large numbers of “tossup” seats they projected. Note also that everyone underestimated Democratic strength in the election: the seats counted as tossups went overwhelmingly to the Democrats (as always, some expected D or R wins failed to materialize, but the vast majority of called seats turned out as expected).
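To make the comparison concrete, here is a small Python sketch computing each site’s absolute error on projected Republican seats against the final count of 222, using the figures listed above:

```python
# Projected R seats from each site's final House forecast, vs. the actual 222.
forecast_r = {"RCP": 228, "270": 222, "538": 220,
              "Sabato": 219, "Inside": 216, "Cook": 212}
ACTUAL_R = 222

errors = {site: abs(seats - ACTUAL_R) for site, seats in forecast_r.items()}
for site, err in sorted(errors.items(), key=lambda kv: kv[1]):
    print(f"{site}: off by {err} seat(s)")
```

Note that this only measures R-seat error; it says nothing about how many seats a site left as tossups, which is where RCP and Cook diverged most from the others.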
Next up, the gubernatorial races, with 538 once again converted to the same format as the polling aggregator sites.
Gubernatorial races

| State | Actual vote | RCP | 270 | 538 (polls) | 538 (deluxe) | Sabato | Cook | Inside |
|-------|-------------|-----|-----|-------------|--------------|--------|------|--------|
| AZ | D +0.6 | R +1.8 | R +1.8 | R +1.8 | R +1.6 | tossup | tossup | tossup |
| FL | R +19.4 | R +11.5 | R +11.0 | R +10.7 | R +11.9 | safe R | likely R | likely R |
| GA | R +7.5 | R +8.1 | R +8.2 | R +7.4 | R +7.6 | likely R | lean R | lean R |
| KS | D +2.1 | D +2.5 | D +2.5 | D +6.1 | D +1.5 | tossup | tossup | tossup |
| NV | R +1.4 | R +2.6 | R +1.4 | R +1.8 | R +1.0 | tossup | tossup | tossup |
| OK | R +13.7 | R +2.3 | R +2.4 | R +4.6 | R +9.7 | likely R | likely R | likely R |
| OR | D +3.4 | tie | D +0.4 | D +1.4 | R +0.6 | tossup | tossup | tossup |
| PA | D +14.8 | D +10.7 | D +10.5 | D +10.3 | D +9.7 | likely D | likely D | lean D |
| TX | R +11.0 | R +9.2 | R +7.7 | R +8.9 | R +12.3 | likely R | likely R | solid R |
| WI | D +3.4 | R +0.4 | R +0.4 | R +0.4 | tie | tossup | tossup | tossup |
Gubernatorial overview: The polls did worse in the gubernatorial races than in the Senate races. In two states they were just wrong: in both AZ and WI the final days of polling called for R wins, suggesting a story of surging Rs going into election day, only for AZ to flip to a D governor and for the WI incumbent (Evers) to defend his seat by an unexpectedly strong final margin. Elsewhere, the polls understated D wins in OR and PA.
However, in the two closest states, KS and NV, the polling averages were very close to the final outcome and correctly called for a D and an R win, respectively. Similarly, averages for the GA race were very close to the final margin. Elsewhere, as in the Senate races, the polls underestimated final R margins in red (OK, TX) or trending-red (FL) states.
The 538 deluxe model actually seemed to fare better than the polls-only model in most of these races. In FL, KS, OK, and TX it was notably closer to the final margin. Only in OR did it notably miss, calling for a R win while the polls-only model correctly predicted a D victory; this was partly offset by its not calling a R win in WI, where the polls-only model did. In the other states (GA and NV) both models came equally close to the final margin.
Finally, the pundits were more cautious about calling winners for governor than they were for senator. All three of those I followed had the same 5 out of 5 record on races they actually called. But they get no points for courage in punditry for choosing not to call a winner in so many of the races tracked here.
Final thoughts: The 2022 midterms were generally not good ones for poll haters. In some of the closest Senate and governor’s races, polling averages were less than a point from the final margins. In numerous others, the polls understated the winning candidate’s final margin of victory, but this was true for races won by candidates of both parties. Most of the largest misses occurred in states where one party was a strong favorite going into the election, usually with an incumbent candidate (the PA governor’s race was an exception). The long-observed effect of “undecided” voters coming home to their party in the final days before the election seems a likely culprit.
It may be true that polling is easier without Donald Trump on the ballot (directly). And despite the many challenges noted by today’s political pollsters (plummeting response rates, difficulties polling mobile phone users, distrust/dislike of polling), it can still be very accurate, even in very close races. Certainly there is no evidence in the results reviewed here of a systematic bias in favor of either party. However, it’s possible this was due in part to the unique circumstances of the 2022 midterm (Trump promoting many flawed GOP candidates, for one) rather than methodological success.
In one sense the pundits also did very well, all of them scoring 100% on the races tracked here where they chose to make a call. Where they don’t do so well is on closer races, preferring to retreat to the neutral ground of “tossups” rather than commit one way or the other. So they’ll maintain their reputations going forward (and isn’t that really what any pundit wishes to do?).
I’ll probably do this again in 2024. Hope to see you then.