
  1. Expert Forecasts: Their Uses and Limits

  2. I. Who Counts as an Expert Forecaster? • Expertise: • Subject-specific (generally discipline and subfield, but some problems are interdisciplinary) • Requires hundreds or thousands of hours of work/practice • Accepted as expert by other experts (problems?) • Forecasting: Claims have the form of a forecast (future outcome predicted from available data) • Reproducible Findings: • Claims are derived from evidence using valid (correct) and reliable (reproducible) methods • Claims are falsifiable • Claims are more accurate than chance or simple models

  3. II. Obstacles to Expert Forecasting • A. The Elicitation Problem • Asking the right question – open-ended questions generally result in open-ended, vague, non-falsifiable, and even out-of-expertise predictions • Stating predictions in the form of easily-observed variables • Ensuring that both/all branches of conditional forecasts are accounted for • Getting estimates of likely error or known biases, rather than just a point forecast

  4. B. The Specificity Problem • Vague terms: Beginning/termination points of the forecast are not specified, concepts which combine many variables are used in place of single variables, adjectives are nonspecific, etc. Example = horoscopes. • Direction without a scale: Saying something will “increase” necessitates having a measure of its current level. Example = “will undermine/enhance national security.”

  5. C. The Contingency Problem • Single-issue vs. “package deal” predictions: Many forecasts take the form “If A and B and C and D – and not E, F, or G – then H.” Problem: the forecast has so many conditions that its success or failure may never be observed, since A–G are unlikely to line up in exactly the required way. Forecasts of “A” alone are more testable.

  7. D. The Bias Problem • Marcus (2008): Physicians give overly optimistic estimates of patient survival time. Why? • Physicians have emotions (stress) and fear giving bad news but not good news • Evidence: the bias is more pronounced when the physician has a closer relationship to the patient

  8. D. The Bias Problem 1. Asymmetric Loss Bias: Bias can be created by “asymmetric loss functions” – that is, when an overly optimistic/pessimistic prediction that turns out to be incorrect carries greater costs than if the forecaster had erred in the opposite direction.

  9. Asymmetric Loss Functions -- Examples From Alexander and Christakis (2008) • “For example, government experts making budget forecasts may be influenced by political incentives, as the costs of wrongly projecting a surplus may lead to public disapproval, while wrongly projecting a deficit may lead to an impression of exceptional government performance” (describing Elliot et al, 2005). • “In his study of the market for single family homes in a 1965 California town, Varian (1974) noticed that assessors faced a significantly higher cost if they happened to overestimate the value of a house. While in the case of an underestimate, the assessor's office faced the cost in the amount of the underestimate, conversely, in the case of the overestimate by an identical amount, the assessor's office faced a possibility of a lengthy and costly appeal process. Since this classic study, loss functions have become an important aspect of the study of expert forecasts.”
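
The asymmetric-loss idea can be made concrete with a little arithmetic. Below is a minimal sketch in Python, assuming a “lin-lin” loss in which overestimates cost three times as much per unit as underestimates (loosely modeled on the Varian appraisal example); all function names and numbers are illustrative, not from the slides.

```python
import numpy as np

# Illustrative asymmetric ("lin-lin") loss: overestimates cost c_over per
# unit of error, underestimates cost c_under. Numbers are assumptions.
def expected_linlin_loss(forecast, outcomes, c_over=3.0, c_under=1.0):
    err = forecast - outcomes                        # err > 0 = overestimate
    loss = np.where(err > 0, c_over * err, -c_under * err)
    return loss.mean()

rng = np.random.default_rng(0)
true_values = rng.normal(loc=100.0, scale=10.0, size=100_000)  # possible outcomes

for f in (90.0, 93.3, 100.0, 107.0):
    print(f"forecast={f:6.1f}  expected loss={expected_linlin_loss(f, true_values):6.2f}")
```

The expected loss bottoms out near the 25th percentile of the outcome distribution (the c_under/(c_over + c_under) quantile), not the mean: facing this loss function, a rational forecaster reports a systematically “pessimistic” number, which is exactly the bias pattern described above.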

  10. 2. Affective Bias • People tend to overestimate the duration (and possibly the amount) of pain/pleasure or sadness/happiness they will feel if some event comes to pass. • Key cause is “focalism”: people focus on the event being predicted and forget all the other things they’ll be doing when/after it comes to pass.

  11. Example (Press and Academic Takes) • “How we forecast our feelings, and whether those predictions match our future emotional states, had never been the stuff of laboratory research. But in scores of experiments, Gilbert, Wilson, Kahneman and Loewenstein have made a slew of observations and conclusions that undermine a number of fundamental assumptions...” —New York Times, 2003

  12. Example (Press and Academic Takes) • Gilbert et al (1998): “People are generally unaware of the operation of the system of cognitive mechanisms that ameliorate their experience of negative affect (the psychological immune system), and thus they tend to overestimate the duration of their affective reactions to negative events. This tendency was demonstrated in 6 studies in which participants overestimated the duration of their affective reactions to the dissolution of a romantic relationship, the failure to achieve tenure, an electoral defeat, negative personality feedback, an account of a child’s death, and being rejected by a prospective employer.”

  13. 3. Political Bias: A Product of Both Asymmetric Loss and Affective Biases • The 51/49 Principle – Incentive to misrepresent certainty of forecast • Groupthink – Leader’s preferred focus becomes attractive to group members: those with “bad” forecasts are ostracized, fired, or executed • The “File Drawer Effect” – Even when private forecasts are unbiased, public ones may be biased by selective release • The Precautionary Principle – Exaggerate magnitude of consequences (often combined with 51/49 principle) • Source Bias – Media/public demand for “both sides” can allow a “side” with a poor forecasting record to continue to publicly forecast (e.g. most pundits)

  14. Exaggerated Confidence Gives Influence (Or Repeat Business) Tschoegl and Armstrong (2007): • “Experimental studies have shown that authentic dissent, such as that of the hedgehogs, is more effective than devil’s advocacy.” • “The effectiveness of the hedgehogs’ authentic dissent stems from the users’ realization that the hedgehogs are not just making up their arguments… it is not unusual for a user of forecasts to ask the forecaster, ‘Is that the house opinion or is it yours?’ If the forecaster wants the user to consider his forecast seriously, he has to maintain it seriously; a forecaster who qualifies his forecast with the words, ‘I don’t think this is really likely but you should think about it’ will find his forecasts ignored.”

  16. III. Evaluating Expert Forecasts

  17. A. Accounting for Task Complexity • Difficulty: Task complexity appears to have more effect than forecaster differences → some tasks are harder to predict than others • Example – Stewart, Roebber, and Bosart (1997) study of student, professor, and local media weather forecasts: All experts did better forecasting temperature than precipitation. Why?

  18. Statistical models: Look at R²
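
As a hedged illustration of what “look at R²” means here: fit the same simple statistical model to a nearly linear task and to a nonlinear, interactive one, and compare the variance explained. The synthetic data below stand in for the temperature/precipitation contrast; they are not the study’s numbers.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Share of outcome variance explained by the forecasts."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 500)

temp = 2.0 * x + rng.normal(0, 1, x.size)           # nearly linear task
precip = np.sin(x) * x + rng.normal(0, 2, x.size)   # nonlinear, interactive task

# Fit the same simple linear model to both tasks and compare R^2.
for name, y in (("temperature", temp), ("precipitation", precip)):
    slope, intercept = np.polyfit(x, y, 1)
    print(f"{name}: R^2 = {r_squared(y, slope * x + intercept):.2f}")
```

The linear model explains most of the variance in the linear task and little in the nonlinear one, which is the sense in which precipitation is the “harder” forecasting problem.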

  20. A. Accounting for Task Complexity • Difficulty: Task complexity appears to have more effect than forecaster differences → some tasks are harder to predict than others • Example – Stewart, Roebber, and Bosart (1997): All experts did better forecasting temperature than precipitation. Why? • Temperature = fewer, more linear predictors • Precipitation = many interacting, nonlinear predictors • Did the experts outperform the model? • All of them did – even the undergraduate!

  21. B. Experts vs. Nonexperts: Does expertise matter for political forecasting? (Tetlock) • Cases: Evaluated 82,361 predictions from 284 people who were professionals tasked with “commenting or offering advice on political and economic trends.” (Real experts?) • Method: Forecasters asked to make predictions both in and out of their areas of expertise, as were non-experts. Comparison of subjective vs. objective probabilities used to determine accuracy.

  22. 3. Major Results a. Overall accuracy: • Experts (in own area): Worse than flipping a coin (more precisely, choosing randomly between three possible outcomes) • Dilettantes (experts out of area): About the same • Well-informed non-experts: About the same • Uninformed non-experts: Somewhat worse • Simple statistical models outperform the experts
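
A sketch of how such accuracy comparisons are scored, using the Brier score over three possible outcomes (the three-outcome setup the slide alludes to). The forecasts below are invented for illustration; they are not Tetlock’s data.

```python
import numpy as np

def brier(probs, outcome_index):
    """Brier score for one question: lower is better, 0 is perfect."""
    actual = np.zeros(len(probs))
    actual[outcome_index] = 1.0
    return float(np.sum((np.asarray(probs) - actual) ** 2))

# (expert's probabilities over three outcomes, index of realized outcome)
cases = [
    ([0.7, 0.2, 0.1], 2),   # confident and wrong
    ([0.6, 0.3, 0.1], 0),   # confident and right
    ([0.8, 0.1, 0.1], 1),   # confident and wrong
]

expert = np.mean([brier(p, k) for p, k in cases])
chance = np.mean([brier([1/3, 1/3, 1/3], k) for _, k in cases])
print(f"expert Brier: {expert:.3f}   chance Brier: {chance:.3f}")
```

Overconfident misses are penalized quadratically, which is how a confident expert can end up scoring worse than the uniform chance baseline.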

  23. Menand (2005) • “(M)ore than a hundred studies … have pitted experts against statistical or actuarial formulas, and in almost all of those studies the people either do no better than the formulas or do worse. In one study, college counselors were … asked to predict … freshman grades in college. The counselors had access to test scores, grades, the results of personality and vocational tests, and personal statements from the students, whom they were also permitted to interview. Predictions that were produced by a formula using just test scores and grades were more accurate. … In one (study), data from a test used to diagnose brain damage were given to a group of clinical psychologists and their secretaries. The psychologists’ diagnoses were no better than the secretaries’.”

  24. b. Some experts were more successful than others • Moderates do better than “boomsters” and “doomsters” • Cognitive style matters more than experience, education, or topic: • Use of qualifiers like “however,” thinking about strategic interaction, awareness of trade-offs, and believing that other sensible people might reach different conclusions → “integrative cognitive complexity” → more success! • “Foxes,” who use multiple methods and models, are more open to opposing arguments, take their time making decisions, see both sides of issues, and revise their predictions using new evidence, are more accurate than “hedgehogs,” who focus on one area only, hold a unified ideology or theory, and are reluctant to revisit decisions or forecasts

  25. Tschoegl and Armstrong (2007) Review Tetlock “…Tetlock… employ[ed] the metaphor that Isaiah Berlin (1953)…used in his essay on Tolstoy: ‘The fox knows many things, but the hedgehog knows one big thing.’ Foxes draw on many ideas and sources of information; hedgehogs interpret the world using their favorite theory or dogma. Foxes are more tolerant of ambiguity and uncertainty than hedgehogs, who tend to be confident in the rightness of their view of the world.”

  26. Tschoegl and Armstrong (2007) Review Tetlock • “Tetlock argues that there is an inverse relationship between what works best in forecasting and what works best in the media. The fox may make the more accurate forecasts, but it is the dramatic, single-minded, combative hedgehog ideologue who makes the best TV, especially when matched against an opposite number in a point-counterpoint debate.”

  27. Tschoegl and Armstrong (2007) Review Tetlock • “Tetlock’s results suggest that talking to both a pessimistic hedgehog and an optimistic one introduces little bias as the mean of the hedgehogs’ forecasts is little different from the mean of the foxes’ forecasts. One of us (Tschoegl) worked for six years … generating economic forecasts of the Japanese economy… He quickly realized that his clients talked to and read the reports of several forecasters from a variety of stockbrokers and research institutes, Japanese and foreign. The fund managers could not speak to everyone and read everything, so their problem was to find the smallest number of sources that would enable them to get both a mean and standard deviation. Although none used the following rule of thumb, it is not a bad characterization of the fund managers’ information gathering strategies: • Talk to four foxes, one pessimistic hedgehog and one optimistic hedgehog. Add the forecasts, divide by six to get the mean, and then use the mean and the six forecasts to calculate the standard deviation.”
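
The fund managers’ rule of thumb from the quote is simple arithmetic; here it is as a sketch with invented GDP-growth forecasts (in percent), purely to show the mechanics.

```python
import statistics

# Four foxes plus one pessimistic and one optimistic hedgehog (invented numbers).
foxes = [1.8, 2.1, 2.0, 1.9]
forecasts = foxes + [0.5, 3.5]       # pessimistic hedgehog, optimistic hedgehog

mean = statistics.mean(forecasts)    # central estimate, anchored by the foxes
stdev = statistics.stdev(forecasts)  # spread, supplied mostly by the hedgehogs
print(f"mean = {mean:.2f}%, stdev = {stdev:.2f}%")
```

The two offsetting hedgehogs roughly cancel in the mean (matching Tetlock’s observation above) but widen the standard deviation, giving the user a cheap error bar around the foxes’ consensus.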

  29. IV. Improving Expert Forecasts • Tetlock, p. 235, footnote 17, 2nd para • Tschoegl and Armstrong (2007): “What does help? First, there is safety in numbers, as long as the user of forecasts does this in an objective manner: forecasters should draw upon foxes as well as hedgehogs and combine their forecasts. However, putting experts in unstructured groups unaided by any formal techniques only makes things worse in that they become more confident without any gain in accuracy. Thus the automatic reaction to difficult problems of ‘let’s form a committee or study group’ of experts might make people feel better in the short run, but is unlikely to produce good forecasts (or policies).”

  30. A. The Delphi method: Structured Groups of Experts • Experts answer questionnaires in series of “rounds.” • Wikipedia: “After each round, a facilitator provides an anonymous summary of the experts’ forecasts from the previous round as well as the reasons they provided for their judgments. Thus, experts are encouraged to revise their earlier answers in light of the replies of other members of their panel. It is believed that during this process the range of the answers will decrease and the group will converge towards the "correct" answer. Finally, the process is stopped after a pre-defined stop criterion (e.g. number of rounds, achievement of consensus, stability of results) and the mean or median scores of the final rounds determine the results.”
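
A minimal simulation of the round structure just described, assuming (as an illustration, not part of the method’s definition) that panelists revise 40% of the way toward the facilitator’s median summary each round and that the stop criterion is stability of results.

```python
import numpy as np

rng = np.random.default_rng(42)
estimates = rng.normal(loc=50.0, scale=15.0, size=9)   # round-0 answers from 9 experts

for rnd in range(1, 6):                                # pre-defined maximum of 5 rounds
    summary = np.median(estimates)                     # facilitator's anonymous summary
    estimates += 0.4 * (summary - estimates)           # panelists revise toward it
    print(f"round {rnd}: median={summary:.1f}, spread={estimates.std():.2f}")
    if estimates.std() < 1.0:                          # stop criterion: stability
        break

print(f"final Delphi estimate: {np.median(estimates):.1f}")
```

The spread shrinks each round while the median stays close to the panel’s central judgment, which is the convergence toward a consensus answer that the quote describes.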

  31. B. “The Wisdom of Crowds” 1. Opinion Polls: Frequently wrong, but right when outcome is obvious. Example: Forecasting Election Winners (Oct)

  32. 2. An Experiment in Sweden (Predict Vote Shares of Seven Parties). “Public” = Forecast Questionnaire (217 Respondents)
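
The aggregation in this kind of questionnaire study is just averaging. A sketch with fabricated data of the right shape (217 respondents × 7 parties), not the Swedish study’s actual responses:

```python
import numpy as np

rng = np.random.default_rng(7)
true_shares = np.array([30.0, 25.0, 12.0, 10.0, 9.0, 8.0, 6.0])  # hypothetical result

# 217 noisy individual forecasts, clipped and renormalized to sum to 100%.
individual = np.clip(true_shares + rng.normal(0, 5, size=(217, 7)), 0.1, None)
individual = 100 * individual / individual.sum(axis=1, keepdims=True)

crowd = individual.mean(axis=0)                        # the "public" forecast
indiv_err = np.abs(individual - true_shares).mean()    # typical respondent's error
crowd_err = np.abs(crowd - true_shares).mean()         # crowd-average error
print(f"avg individual error: {indiv_err:.2f} pts   crowd error: {crowd_err:.2f} pts")
```

Idiosyncratic errors partly cancel in the average, so the crowd mean beats the typical individual respondent, which is the mechanism behind “wisdom of crowds” forecasts.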

  33. C. Prediction Markets And Bookies • You can “buy” stock in a candidate (real-money futures contracts) or in some political outcome → similar to gambling • Theory: people who invest money have a financial stake in the outcome, so they have incentives to weigh information carefully (invisible hand) • Findings: • Oddsmakers generally predict well • Electoral stock markets outperform public pundits

  35. 1996: Widespread Agreement on Outcome: But Pundits Were Less Stable Indicators

  36. 2000: A Weakness in the Market Revealed

  37. 2004: Market Predicts Correctly (Barely)

  38. Congress 2006: Blue (Democratic House/Democratic Senate) comes from behind Black (Democratic House/Republican Senate) and Red (Republican House/Republican Senate)

  39. 2008: McCain is Written Off as a Nominee

  40. 2012 Forecast:

  43. C. Prediction Markets And Bookies • You can “buy” stock in a candidate (real-money futures contracts) or in some political outcome → similar to gambling • Theory: people who invest money have a financial stake in the outcome, so they have incentives to weigh information carefully (invisible hand) • Findings: • Oddsmakers generally predict well • Electoral stock markets outperform public pundits • Even when informed, markets follow polls • Both polls and simple models outperform markets
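
For reference, here is how market prices are read as forecasts: a winner-take-all contract pays $1 if the event occurs, so its price in cents is taken as the market’s probability. The prices below are invented, and real complementary contracts usually sum past 100 cents (the bookmaker’s margin), so they are renormalized.

```python
def implied_probability(price_cents: float) -> float:
    """A $1-payoff contract trading at price_cents implies P(event)."""
    return price_cents / 100.0

# Invented prices for two complementary winner-take-all contracts.
book = {"Candidate A wins": 62.0, "Candidate B wins": 41.0}

total = sum(implied_probability(p) for p in book.values())
for event, price in book.items():
    raw = implied_probability(price)
    print(f"{event}: raw={raw:.2f}, normalized={raw / total:.2f}")
```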
