This article discusses the different methods used to forecast Presidential elections, including pre-election polls, qualitative scoring, and predictive modeling based on long-term patterns. It also examines the accuracy of these forecasts and explores the reasons for their variability.
FORECASTING PRESIDENTIAL ELECTIONS
N. R. Miller
POLI 423
Forecasting Election Outcomes
• Let us consider how, how accurately, and how far in advance the outcome of Presidential elections can be predicted.
• We need to distinguish between
  • short-term forecasts (made shortly before the Presidential election), and
  • long-term forecasts (made at the outset of the general election campaign or even earlier).
• We will consider three types of forecasts:
  • pre-election polls and surveys;
  • a more or less qualitative scoring method; and
  • predictive models based on long-term patterns in aggregate data.
Pre-Election Polls
• In the relatively distant past, pre-election polls taken even shortly before election day have sometimes been disastrously and famously wrong. Most notably,
  • a Literary Digest poll predicted the defeat of President Roosevelt in 1936, and
  • the Gallup Poll (and almost all other polls) predicted the defeat of President Truman in 1948.
The Literary Digest Polls
• The Literary Digest collected from telephone companies and state motor vehicle departments lists with the names and addresses of telephone subscribers and registered automobile owners.
• The magazine then mailed out many millions of “sample ballots” to these names and addresses.
• Many people did not respond, of course, but the Digest still received millions of ballots, on the basis of which it forecast the outcome of the election.
• This procedure was quite successful in predicting the Republican presidential victories in the 1920s and also Franklin Roosevelt’s victory in 1932.
• But in 1936 it predicted that President Roosevelt would be badly defeated for re-election by Alf Landon, when in fact he won in one of the greatest Presidential landslides of all time.
• Can you account for why the Literary Digest forecasts that had previously been successful failed so badly in 1936?
Gallup and Other Polls in 1948
• Gallup (and other) polls still used quota sampling (rather than random sampling).
• There were relatively few polling organizations, each of which took relatively few polls.
• Gallup averaged together the results of its several most recent polls.
• Gallup stopped polling altogether about two weeks before the election.
• There were two third-party candidacies, support for which was hard to predict.
• In the election itself, Truman ran relatively weakly in the Northeast but much more strongly in the farm belt and the West.
• Hence the famous photo of Truman holding up the Chicago Tribune’s “Dewey Defeats Truman” headline.
Polling and Survey Research Today
• Today most commercial and media polls use well-tested methodologies,
  • based on some variant of random sampling, and with
  • considerable care going into the wording of the questions and their assembly into a questionnaire.
• Most such polls are very accurate, giving very similar estimates.
• However, all such polls are subject to sampling error, stated in terms of a margin of error of ± X% that depends largely on sample size.
• They are also subject to “house effects” pertaining to mode of interview, call-back procedures, various adjustments, and (especially) their “likely voter” screen.
• Pre-election polls ask something like “If the election were held today, who would you vote for?”
• Such polls taken many months in advance of an election bear no reliable relation to the actual outcome.
• Hence the question: why are American Presidential election campaign polls so variable [over time] when voters [in the aggregate] are so predictable?
Lichtman’s Keys to the White House
• Lichtman developed this system in 1981, in collaboration with Volodia Keilis-Borok, a [Soviet] world-renowned authority on the mathematics of prediction models.
• It presumes that presidential elections are primarily referenda on how well the party holding the White House has governed during its term, taking account of more than economic performance.
• The Keys system further presumes that “by default” the incumbent party (the party that controls the White House) wins unless six or more keys are false [turned against the incumbent].
• The keys give specificity to this idea of how presidential elections work, assessing the performance, strength, and unity of the party holding the White House to determine whether or not it has crossed the threshold that separates victory from defeat.
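The decision rule on this slide can be sketched as a short function. This is only an illustration: the function name, the `threshold` parameter, and the representation of the keys as a list of true/false values are assumptions made here, and the example values are placeholders rather than real assessments.

```python
def incumbent_party_predicted_to_win(keys, threshold=6):
    """Apply the default rule: the incumbent party wins unless
    `threshold` or more keys are false.

    `keys` is a list of booleans, one per key (True = key favors the
    incumbent party). The name and signature are hypothetical.
    """
    false_keys = sum(1 for key in keys if not key)
    return false_keys < threshold


# Placeholder key values for illustration only (not real judgments).
example_keys = [True] * 8 + [False] * 5
print(incumbent_party_predicted_to_win(example_keys))
```

Assigning true or false to each key is the qualitative part of the method. The code covers only the final counting step.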
Polling and Survey Research Today
• Today most commercial and media polls use well-tested methodologies,
  • based on some variant of random sampling, and with
  • considerable care going into the wording of the questions and their assembly into a questionnaire.
• Most such polls are very accurate, giving very similar estimates.
• However, all such polls are subject to sampling error, stated in terms of a margin of error of ± X% that depends largely on sample size.
• In general, the margin of error is inversely proportional to the square root of sample size.
• Margin of error = X%: if we took a great many random samples of this size from the same population, 95% of them would come within ±X percentage points of the “true value” (population parameter).
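The square-root relationship can be illustrated with the standard textbook formula for the 95% margin of error of a proportion from a simple random sample. This is a sketch under that assumption: real polls use weighting and other adjustments that the formula ignores, and the function name and defaults are chosen here for illustration.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error (as a proportion) for a simple
    random sample of size n. p=0.5 is the worst case; z=1.96 is the
    normal critical value for 95% confidence."""
    return z * math.sqrt(p * (1 - p) / n)


for n in (250, 1000, 4000):
    print(f"n = {n:5d}: +/- {100 * margin_of_error(n):.1f} points")
```

Under these assumptions, quadrupling the sample size halves the margin of error. A sample of 1,000 gives roughly ±3 points, and a sample of 4,000 gives about half that.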
Polling and Survey Research Today (cont.)
• Different polls are also subject to “house effects” pertaining to mode of interview, call-back procedures, various adjustments, and (especially) their “likely voter” screen.
• Poll averages/aggregates are more accurate than individual polls, e.g., the Real Clear Politics Poll Average.
• But all polls ask (literally or in effect): “If the election were held today, how would you vote?”
• And voters may change their voting intentions.
• The claim of predictive models (e.g., Lichtman) is that they can predict how voters will vote (in the aggregate) better than voters themselves can.
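A small simulation suggests why an average of several polls tends to land closer to the true value than any single poll. It is a sketch under strong assumptions: every poll is an independent simple random sample from the same population, with no house effects and no change in voter intentions. The true share, sample size, and function names are all hypothetical.

```python
import random


def simulate_poll(true_share, n, rng):
    """Share of n simulated respondents who back the candidate."""
    return sum(rng.random() < true_share for _ in range(n)) / n


def compare_single_vs_average(true_share=0.52, n=500, k=5, trials=300, seed=0):
    """Mean absolute error of one poll vs. the average of k polls."""
    rng = random.Random(seed)
    single_err = 0.0
    average_err = 0.0
    for _ in range(trials):
        polls = [simulate_poll(true_share, n, rng) for _ in range(k)]
        single_err += abs(polls[0] - true_share)
        average_err += abs(sum(polls) / k - true_share)
    return single_err / trials, average_err / trials


single, averaged = compare_single_vs_average()
print(f"mean error of one poll:       {single:.4f}")
print(f"mean error of 5-poll average: {averaged:.4f}")
```

Averaging reduces only sampling error. A bias shared by all of the polls, such as a common flaw in their likely-voter screens, would not be averaged away.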
Predictive Models
• Forecasting models constructed by social scientists seek
  • to predict the percent of the popular vote received by the Presidential candidate of the party that controls the White House, and
  • to do this using information that is available well before the election (e.g., in mid-summer).
• All such models use information pertaining to
  • the recent performance of the economy, and
  • the recent popularity of the incumbent President.
• The models differ with respect to the exact measures of economic performance and Presidential popularity that are used and with respect to what other variables [if any] are also used.
PS #1: Problem 2
The predictive models constructed by Abramowitz, Fair, Lewis-Beck, and others all use information concerning (i) the performance of the economy and (ii) the popularity of the incumbent President (and perhaps other information) that is available long before the election in order to predict the percent of the popular vote received by the Presidential candidate of the party that controls the White House. (The models differ with respect to the exact measures of economic performance and Presidential popularity that are used and with respect to what other variables [if any] are also used.)
Attached you will find data for economic performance and Presidential popularity for each Presidential election from 1948 through 2008, as well as the percent of the popular vote received by the incumbent party’s Presidential candidate for 1948-2008. Also find and fill in the data for “Streak” and “Electoral Votes.”
Can you devise some kind of rule or formula based on this data to predict the percent of the popular vote won by the incumbent party candidate? According to your rule or formula, is President Obama likely to be re-elected?
• YEAR is the Presidential election year.
• INC is 1 if the incumbent President is running for re-election; 0 if it is an “open-seat” election.
• GDP is real annualized GDP growth over the Fall, Winter, and Spring quarters preceding the election, e.g., from October 1, 2011 through June 30, 2012 (from U.S. Department of Commerce, Bureau of Economic Analysis, http://www.bea.gov/national/index.htm#gdp).
• UNEMP is the Unemployment Rate in July preceding the election (from U.S. Department of Labor, Bureau of Labor Statistics, http://data.bls.gov/timeseries/LNS14000000).
• ΔUN is the change in the Unemployment Rate from July of the year preceding the election to July of the election year.
• STRK is “streak,” i.e., the number of consecutive elections won by the incumbent party.
• PRES is the incumbent President’s approval rating in the first Gallup Poll taken after June 30 of the election year (Gallup Reports and http://www.gallup.com/).
• PV is the percent of the two-party (i.e., excluding Perot, Nader, etc.) popular vote won by the incumbent party candidate.
• EV is the number of electoral votes won by the incumbent party candidate (including “faithless electors”). The total number of electoral votes was 531 prior to 1960, 537 in 1960, and 538 since 1964.
* 39 electoral votes were cast for States Rights Democrat candidate J. Strom Thurmond.
** 14 “unpledged” electoral votes were cast for Harry F. Byrd.
*** 45 electoral votes were cast for American Independent Party candidate George C. Wallace.
Summary Statistics: Incumbent vs. Open-Seat Elections (excluding 2012)
The Regression Equation: Predicting the PV in 2012 on the Basis of PRES Only
PV by GDP and PRES (and INC)
• Multiple Regression Equations:
  PV = 36.13 + 0.467 x GDP + 0.298 x PRES
    R² = 0.736; Adj. R² = 0.695
    2012: PV = 36.13 + 0.467 x 2.6 + 0.298 x 47 = 51.35
  PV = 35.48 + 0.439 x GDP + 0.278 x PRES + 2.74 x INC
    R² = 0.792; Adj. R² = 0.740
    2012: PV = 35.48 + 0.439 x 2.6 + 0.278 x 47 + 2.74 = 52.42
• For comparison:
  PV = 36.12 + 0.337 x PRES
    2012: PV = 51.96
  PV = 47.45 + 1.18 x GDP
    2012: PV = 47.45 + 1.18 x 2.6 = 50.52
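The 2012 arithmetic can be checked by plugging the slide's inputs (GDP = 2.6, PRES = 47) into the fitted equations. The sketch below does this for the GDP-and-PRES equation and the PRES-only equation. The coefficients are copied from the slide; the function names are made up here for illustration.

```python
def pv_from_gdp_and_pres(gdp, pres):
    """Two-variable equation from the slide: PV on GDP and PRES."""
    return 36.13 + 0.467 * gdp + 0.298 * pres


def pv_from_pres_only(pres):
    """Comparison equation from the slide: PV on PRES alone."""
    return 36.12 + 0.337 * pres


# 2012 inputs as used on the slide.
GDP_2012 = 2.6   # real annualized GDP growth, percent
PRES_2012 = 47   # Gallup approval rating, percent

print(f"GDP + PRES: {pv_from_gdp_and_pres(GDP_2012, PRES_2012):.2f}")
print(f"PRES only:  {pv_from_pres_only(PRES_2012):.2f}")
```

Both equations put the incumbent party's share of the two-party vote slightly above 50%. A point prediction this close to 50% should be read alongside the unexplained variance implied by the R² values.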
Predictive Models Presented at APSA 2008
Political Scientist      Predicted Obama Vote
Brad Lockerbie           58%
Thomas Holbrook          55.5%
Alan Abramowitz          54.3%
Christopher Wlezien      52.2%
Alfred Cuzan             51.9%
Helmut Norpoth           50.1%
Michael Lewis-Beck       50.07%
James Campbell           Wait till Labor Day
Note: The Department of Commerce has revised second-quarter growth from 1.9% to 3.3%.
Alan Abramowitz 2012
  PV = 47.3 + (.107*NETAPP) + (.541*Q2GDP) + (4.4*TERM1INC)
• PV stands for the predicted share of the major party vote for the party of the incumbent president; NETAPP stands for the incumbent president’s net approval rating (approval - disapproval) in the final Gallup Poll in June; Q2GDP stands for the annualized growth rate of real GDP in the second quarter of the election year; and TERM1INC stands for the presence or absence of a first-term incumbent in the race.
• In order to incorporate this polarization effect in the Time for Change Model, I added a new predictor (POLARIZATION) for elections since 1996.** The estimates for the revised model are as follows:
  PV = 46.9 + (.105*NETAPP) + (.635*Q2GDP) + (5.22*TERM1INC) - (2.76*POLARIZATION)
• **For elections since 1996, the polarization variable takes on the value 1 when there is a first-term incumbent running or when the incumbent president has a net approval rating of greater than zero; it takes on the value -1 when there is not a first-term incumbent running and the incumbent president has a net approval rating of less than zero.
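The two equations quoted above can be written as functions to show how the POLARIZATION term changes a forecast. The coefficients are taken from the slide. The function names and the example inputs are hypothetical and are not actual 2012 values.

```python
def time_for_change(netapp, q2gdp, term1inc):
    """Original equation quoted on the slide.

    netapp: net approval (approval minus disapproval), in points
    q2gdp: annualized Q2 real GDP growth, in percent
    term1inc: 1 if a first-term incumbent is running, else 0
    """
    return 47.3 + 0.107 * netapp + 0.541 * q2gdp + 4.4 * term1inc


def time_for_change_polarized(netapp, q2gdp, term1inc, polarization):
    """Revised equation with the POLARIZATION predictor (1 or -1 for
    elections since 1996, as defined on the slide)."""
    return (46.9 + 0.105 * netapp + 0.635 * q2gdp
            + 5.22 * term1inc - 2.76 * polarization)


# Hypothetical inputs for illustration only.
netapp, q2gdp, term1inc, polarization = 0.0, 1.5, 1, 1

print(f"original: {time_for_change(netapp, q2gdp, term1inc):.2f}")
print(f"revised:  {time_for_change_polarized(netapp, q2gdp, term1inc, polarization):.2f}")
```

With a first-term incumbent running, POLARIZATION equals 1, so the revised equation subtracts 2.76 points. That offsets part of the larger first-term coefficient (5.22 rather than 4.4).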