Developing an Empirical Model to Forecast United States Presidential Elections: A Machine Learning Approach
Document Type
Article
Publication Date
10-17-2020
Publication Title
Advances in Social Sciences Research Journal
Volume
7
Issue
10
First page number
186
Last page number
198
Abstract
In this paper, we develop and compare two models for forecasting the 2020 U.S. presidential election: one using multiple linear regression (MLR) and one using the machine learning method of extreme gradient boosting (XGBoost). We predict each state’s Republican vote share using seven continuous predictors from 1976 to 2016, as well as a dummy variable for each state. After computing 95% confidence intervals for each prediction, we determine the candidates’ electoral college probabilities. The XGBoost model appears to be a very strong predictor, accounting for 98.6% of the variance with a root mean square error (RMSE) of 3.34%, whereas the MLR model accounts for only 71.8% of the variance and leaves an RMSE of 6.35%. We observe that (1) both models predict a Democratic electoral college landslide in the 2020 election, (2) Georgia, Iowa, Florida, North Carolina, and Ohio are crucial for the Republicans to win, and (3) extreme gradient boosting is an attractive alternative to MLR in election forecasting.
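The two fit statistics quoted in the abstract, share of variance explained and RMSE, can be illustrated with a minimal sketch. This is not the authors' code. The function names and the three vote-share values are hypothetical, chosen only to show how the metrics are computed.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(actual, predicted):
    """Share of variance in `actual` accounted for by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical Republican vote shares in percent (NOT data from the paper).
actual = [40.0, 50.0, 60.0]
predicted = [42.0, 50.0, 58.0]

print(f"RMSE: {rmse(actual, predicted):.2f} percentage points")
print(f"Variance explained: {r_squared(actual, predicted):.1%}")
```

A lower RMSE together with a higher share of variance explained is the basis on which the abstract ranks the XGBoost model above the MLR model.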
Keywords
Presidential election; Electoral college; Forecast; XGBoost; Multiple linear regression
Disciplines
Other Political Science | Political Science | Social and Behavioral Sciences
Language
English
Repository Citation
Loewy, N., Singh, A., Gallagher, T. M. (2020). Developing an Empirical Model to Forecast United States Presidential Elections: A Machine Learning Approach. Advances in Social Sciences Research Journal, 7(10), 186-198.
http://dx.doi.org/10.14738/assrj.710.9210