Predicting Presidential Elections
One of the better known models in this area is the Time for Change Model designed by A. Abramowitz. The model uses three factors—the incumbent president’s net approval rating at the end of June (approval minus disapproval), the change in real GDP for Q2 (as percentage) of the election year annualized, and a first term incumbency advantage (two terms for the incumbent party becomes a disadvantage), to predict the winner of the national popular vote, which can also be used as stand-in for electoral collage.
Code is below. The fit is crazy good, Prob F near zero, R^2 at 90%, all predictors are significant.
import io, statsmodels.formula.api as smf, pandas as pd
s="""year,gdp_growth,net_approval,two_terms,incumbent_vote
2012,1.3,-0.8,0,52
2008,1.3,-37,1,46.3
2004,2.6,-0.5,0,51.2
2000,8,19.5,1,50.3
1996,7.1,15.5,0,54.7
1992,4.3,-18,1,46.5
1988,5.2,10,1,53.9
1984,7.1,20,0,59.2
1980,-7.9,-21.7,0,44.7
1976,3,5,1,48.9
1972,9.8,26,0,61.8
1968,7,-5,0,49.6
1964,4.7,60.3,0,61.3
1960,-1.9,37,1,49.9
1956,3.2,53.5,0,57.8
1952,0.4,-27,1,44.5
1948,7.5,-6,1,52.4
"""
df = pd.read_csv(io.StringIO(s))
regr = 'incumbent_vote ~ gdp_growth + net_approval + two_terms + net_approval*two_terms'
results = smf.ols(regr, data=df).fit()
print ('R^2',np.round( results.rsquared, 2))
R^2 0.87
2016
We used different GDP growth and approval rating scenarios for current adminstration come June; These are growth 1% net popularity 0, growth 3% popularity 10, and growth %5 and popularity 30. The last two cases are pretty out there, yes; Right now Bam has 0 net popularity. We based this on here and here. GDP can get better - maybe.
Based on this, you get
conf = results.conf_int()
pred = [1., 1.0, 0., 1, 1*0]
print (np.dot(pred, conf), np.dot(pred, results.params))
pred = [1., 2.0, 5., 1, 5*1]
print (np.dot(pred, conf), np.dot(pred, results.params))
pred = [1., 3.0, 10., 1, 10*1]
print (np.dot(pred, conf), np.dot(pred, results.params))
[43.15056514 53.07497262] 48.11276887846732
[42.85516551 55.08462623] 48.969895873124074
[42.55976589 57.09427985] 49.82702286778082
For the first scenario Hillary's chances of winning are between 43% and 52%, likely loss. The second one at 2% growth and net popularity 5 is also likely loss. The third is a toss up
[Note after the election: the final numbers were growth 1.2%, net popularity 6, close to the second case, and HC lost the election].
It is interesting to note that Bill Clinton, known as a good campaigner, had significant advantages going into the 1992 election unlike his wife now going up against Trump. It is also interesting so much hinges on a very rough number such as growth and general popularity. But in a way this makes sense; Voting for a single person is a blunt instrument really, hence, the basis people use to judge it is also pretty general. Intuitively it makes sense; if a party stays in da house too long, people want to throw you outa there, if there is no growth, the incumbent is not popular, the climb for the candidate from that party becomes steeper and steeper.
conf = results.conf_int()
net_approv = -10.0; gdp_growth = 0.0
pred = [1., gdp_growth, net_approv, 0, net_approv*0]
print (np.dot(pred, conf), np.dot(pred, results.params))
[47.77661018 50.51619322] 49.146401704326685
2020
The massive fall in GDP QoQ decline due to covid made that part of the pred parameter meaningless. But we can assign a zero to that parameter, effectively taking it out of the equation, then using the remaining incumbency, net popularity params, still can predict an advantage for Biden, as Trump's net popularity was at -15%. Trump lost the election.
conf = results.conf_int()
net_approv = -15.0; gdp_growth = 0.0; two_terms = 0
pred = [1., gdp_growth, net_approv, two_terms, net_approv*two_terms]
print (np.dot(pred, conf), np.dot(pred, results.params))
[47.40553806 49.4575212 ] 48.43152963108934
2024
conf = results.conf_int()
net_approv = -20; gdp_growth = 2.0; two_terms = 0
pred = [1., gdp_growth, net_approv, two_terms, net_approv*two_terms]
print ('Interval:', np.round(np.dot(pred, conf),2),
'Average:', np.round(np.dot(pred, results.params),2))
net_approv = -23; gdp_growth = 1.5; two_terms = 0
pred = [1., gdp_growth, net_approv, two_terms, net_approv*two_terms]
print ('Interval:', np.round(np.dot(pred, conf),2),
'Average:', np.round(np.dot(pred, results.params),2))
net_approv = -20; gdp_growth = 2.7; two_terms = 0
pred = [1., gdp_growth, net_approv, two_terms, net_approv*two_terms]
print ('Interval:', np.round(np.dot(pred, conf),2),
'Average:', np.round(np.dot(pred, results.params),2))
Interval: [47.34 49.93] Average: 48.64
Interval: [47.04 48.91] Average: 47.98
Interval: [47.45 50.47] Average: 48.96
References