Introduction to Probability and Probabilistic Forecasting

Introduction to Probability and Probabilistic Forecasting. Simon Mason, International Research Institute for Climate Prediction, The Earth Institute of Columbia University. AMS Short Course on Probabilistic Forecasting, San Diego, CA, January 9, 2005.

  1. Introduction to Probability and Probabilistic Forecasting. Simon Mason, International Research Institute for Climate Prediction, The Earth Institute of Columbia University. AMS Short Course on Probabilistic Forecasting, San Diego, CA, January 9, 2005. Linking Science to Society

  2. Questions about the Future
     We make forecasts to answer questions about the future:
     • Will it rain in San Diego next weekend?
     • Will it snow in San Diego next weekend?
     • Will the Red Sox win the 2005 World Championship?
     • Will Dick Cheney die a pauper?
     • Will this surfer live to 50?

  3. Probability
     In most situations the future is uncertain. Where the answer to a question about the future is uncertain, we tend to use probabilities to express the uncertainty in the outcome.

  4. Probability and Events
     We refer to a specific outcome, or a specific combination of outcomes, as an event, and refer to the probability of this event. What do these terms mean?
     • event: a predefined outcome that forms the subject of a forecast (an outcome of interest)
       – examples:
         - rain in San Diego this afternoon;
         - tornado touchdown anywhere in Iowa tomorrow;
         - average January temperature at LAX below 10°C;
         - NIÑO3 anomaly of more than +2°C by October;
         - global warming of more than +1°C by 2050.

  5. Probability and Events
     Events can be:
     - elementary (“hot”); or
     - compound (“hot and dry”; rain two days in a row).
     Events either occur or do not occur: there are only these two possible outcomes (although there may be some uncertainty as to whether the outcome has occurred). If the event does not occur, its complement occurs.
     Events need to be well-defined to avoid ambiguity. (What does “unfaithful” mean? What does “in San Diego” mean?)

  6. Notation
     An elementary event is often denoted by the letter $E$, and its complement by $\bar{E}$. To distinguish one elementary event from another, subscripts may be used:
     - first elementary event: $E_1$
     - second elementary event: $E_2$
     A compound event occurs when the first AND the second elementary event occur (or, more generally, when all elementary events occur): $E_1 E_2$.

  7. Probability and Events
     Uncertainty is often expressed using phrases such as “it is likely”, “the chances are”, etc. Different degrees of uncertainty can be indicated: “possibly” indicates higher uncertainty than “probably”. Compare:
     - will it rain in San Diego next weekend?
     - will it snow in San Diego next weekend?
     • probability: a quantitative measure of the uncertainty in the event.

  8. Probability and Uncertainty
     Probabilities are used where there is uncertainty. Apart from ambiguity, there are two sources of uncertainty:
     • our understanding is limited: we do not know for certain what will happen;
     • there is some inherent randomness in the outcome: we cannot know for certain what will happen.

  9. Probability and Uncertainty
     • When the probability is 1, the event will definitely occur. It is impossible for the event not to happen.
     • When the probability is 0, the event will definitely not occur. It is impossible for the event to happen.
     • When the probability is between 0 and 1, the event may or may not happen.
     But how can we quantify uncertainty?

  10. Probability
      • When the probability is close to 1, the event is more likely to occur than not to occur.
      • When the probability is close to 0, the event is more likely not to occur than to occur.
      • When the probability is 0.5, the event is just as likely to happen as not to happen.

  11. Odds
      • When the probability of an event, $E$, is 0.75, the probability that the event will not happen (the probability of its complement, $\bar{E}$) is: $1 - 0.75 = 0.25$
      • When the probability is 0.75, the event is three times more likely to happen than not to happen: the odds are $P(E) / P(\bar{E}) = 0.75 / 0.25 = 3$.
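
The relationship between probability and odds on this slide can be sketched in a few lines of Python. The function names are illustrative and do not come from the presentation; the only value taken from the slide is the probability of 0.75.

```python
def probability_to_odds(p: float) -> float:
    """Return the odds in favour of an event with probability p (0 <= p < 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    # Odds compare the probability of the event with that of its complement.
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Invert the conversion: non-negative odds back to a probability."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


if __name__ == "__main__":
    print(probability_to_odds(0.75))  # 3.0, i.e. three times more likely than not
    print(odds_to_probability(3.0))   # 0.75
```

A probability of 0.5 gives odds of 1: the event is just as likely to happen as not.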

  12. Probability
      But how do we determine how likely the event is compared to its complement? How do we obtain / calculate probabilities?

  13. Interpretations of Probability: I
      How do we obtain / calculate probabilities?
      • What is the probability that it will rain in San Diego (at Lindbergh Field) on January 31, 2005?
      • How often has it rained on the same day in previous years (1927–2003)? This is climatology.

  14. Interpretations of Probability: I
      • What is the probability that it will rain in San Diego (at Lindbergh Field) on January 31, 2005?

  15. Probability as Relative Frequency
      What is the probability that it will rain in San Diego on January 31, 2005?
      • Look for similar / identical situations.
      • Repeat the experiment many times; only “unimportant” things are allowed to change.
      Note that there may be sampling errors in the relative frequencies: uncertainty about the uncertainty! (The distribution of these sampling errors can be obtained using the binomial distribution.)
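
To give a feel for the size of these sampling errors, here is a minimal sketch using the binomial distribution. It assumes the years are independent trials with a fixed underlying probability. The sample size of 77 corresponds to the 1927–2003 record; the probability of 0.22 is borrowed from a later slide purely as an illustrative value, and the function names are hypothetical.

```python
import math


def relative_frequency_standard_error(p: float, n: int) -> float:
    """Standard error of a relative frequency estimated from n independent trials."""
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k wet days in n years if the true probability is p."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def probability_count_within(n: int, p: float, low: int, high: int) -> float:
    """Probability that the number of wet days lies in the range [low, high]."""
    return sum(binomial_pmf(k, n, p) for k in range(low, high + 1))


if __name__ == "__main__":
    n_years = 77      # length of the 1927-2003 record
    p_assumed = 0.22  # illustrative value only (assumption)
    se = relative_frequency_standard_error(p_assumed, n_years)
    print(f"standard error of the relative frequency: {se:.3f}")
    # Chance that the observed count falls between 12 and 22 wet days.
    print(f"P(12 <= wet days <= 22): {probability_count_within(n_years, p_assumed, 12, 22):.2f}")
```

With these assumed values the standard error is roughly 0.05, so a climatological probability estimated from 77 years is itself uncertain by several percentage points.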

  16. Interpretations of Probability: II
      How do we obtain / calculate probabilities?
      • What is the probability that it will rain in San Diego (at Lindbergh Field) tomorrow?
      • How often has it rained on the same day in previous years with similar atmospheric conditions? Only unimportant things are allowed to change.
      • This experiment has no precedent: today’s initial conditions are “important”, and they are unique.

  17. Probability as Subjective Belief
      The probability that it will rain in San Diego is best estimated by conditioning upon the current atmospheric state, a set of conditions that are unique.
      • Make a forecast based on the physics of the atmosphere, and expert knowledge / experience.
      • Produce an ensemble of forecasts based on sampling of known uncertainties in the physics of the atmosphere and / or in the initial conditions (Bright).
      • The probability now represents the degree to which we believe that it will rain in San Diego tomorrow.

  18. Interpretations of Probability
      So two interpretations of probability are:
      • relative frequency interpretation: how often the event has occurred in similar situations in the past;
      • subjective interpretation: how confident we are that the event will occur this time.
      But all probabilities could be defined as subjective because of the subjectivity in defining which situations are “similar”.

  19. Probability as Relative Frequency
      The 77-year climatology does not provide a good estimate of the probability that it will rain in San Diego tomorrow because there are some “important” differences between the 77 instances of January 10 and January 10, 2005.
      Sometimes we can improve upon climatological forecasts because of access to “important” information. But how do we know whether the information is important? Is the probability of the event different when these conditions are present compared to when they are not?

  20. Conditional Probabilities
      The relative frequency of rainfall on January 10 could be obtained by considering only those January 10s on which January 9 rainfall occurrence was the same as on January 9, 2005.
      - If January 9 is wet: P(January 10 is wet)?
      - If January 9 is dry: P(January 10 is wet)?
      This conditional probability differs from the probability of the compound event, $P(E_1 E_2)$, because we know that $E_2$ has (or has not) occurred already.
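
The idea of restricting the record to years that match the known condition can be sketched as follows. The ten-year record below is invented for illustration and is not the San Diego data; the function name is likewise hypothetical.

```python
from typing import Sequence, Tuple


def conditional_relative_frequency(
    pairs: Sequence[Tuple[bool, bool]], given_wet_jan9: bool
) -> float:
    """Relative frequency of a wet January 10 among years matching the January 9 condition.

    Each pair is (wet on January 9, wet on January 10) for one year.
    """
    # Reduce the sample space to the years in which the condition holds.
    subset = [wet_jan10 for wet_jan9, wet_jan10 in pairs if wet_jan9 == given_wet_jan9]
    if not subset:
        raise ValueError("no past cases match the condition")
    return sum(subset) / len(subset)


if __name__ == "__main__":
    # Hypothetical record: 4 wet and 6 dry January 9s.
    record = [
        (True, True), (True, False), (True, True), (True, True),
        (False, False), (False, False), (False, True),
        (False, False), (False, False), (False, False),
    ]
    unconditional = sum(wet_jan10 for _, wet_jan10 in record) / len(record)
    print(unconditional)                                  # 0.4
    print(conditional_relative_frequency(record, True))   # 0.75
    print(conditional_relative_frequency(record, False))  # 1 out of 6
```

Note that each conditional estimate rests on fewer cases (4 and 6 here) than the unconditional one (10), which is the price paid for using the extra information.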

  21. Conditional Probabilities

  22. Conditional Probabilities
      Venn diagram showing compound event. For conditional probabilities, the outcome of Jan 9 is known already, and so the sample space is reduced.

  24. Conditional Probabilities
      What is the probability that it will rain in San Diego (at Lindbergh Field) on January 10, 2005, given that it has (or has not) rained on January 9, 2005?

  25. Conditional Probabilities
      What is the probability that it will rain in San Diego (at Lindbergh Field) on January 10, 2005, given that it has rained on January 9, 2005?

  26. Conditional Probabilities
      What is the probability that it will rain in San Diego (at Lindbergh Field) on January 10, 2005, given that it has not rained on January 9, 2005?

  27. Updating Probabilities
      Based on the occurrence of January 9 rainfall, the probability of rainfall on January 10 has been updated from 0.22 to 0.54.
      What if we now obtain a model forecast, $E_3$, that states it will rain tomorrow? All we know about the model is that it has given a correct forecast 90% of the time over the last few days. How can we update our probability for rain tomorrow?

  28. Bayes’ Theorem
      What is the probability that it will rain in San Diego (at Lindbergh Field) on January 10, 2005, given that it has rained on January 9, 2005 AND that the model forecasts rain? (To simplify, conditions on $E_2$ are dropped.)
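
For reference, Bayes’ theorem for this question can be written as below. This is a reconstruction in the notation used elsewhere in the slides ($E_1$: rain on January 10; $E_3$: the model forecasts rain; conditioning on $E_2$ dropped, as stated above), and is not copied from the original slide equations.

```latex
P(E_1 \mid E_3) =
  \frac{P(E_3 \mid E_1)\, P(E_1)}
       {P(E_3 \mid E_1)\, P(E_1) + P(E_3 \mid \bar{E}_1)\, P(\bar{E}_1)}
```

Here $P(E_1)$ and $P(\bar{E}_1)$ are the prior probabilities of rain and no rain, and $P(E_3 \mid E_1)$ and $P(E_3 \mid \bar{E}_1)$ are the likelihoods of a rain forecast given that it does or does not rain.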

  33. Bayes’ Theorem
      All terms on the right are unknown. The priors, at least, are known …

  34. Bayes’ Theorem
      $P(E_3 \mid E_1)$ and $P(E_3 \mid \bar{E}_1)$ are likelihoods: they tell us how likely it is that rain was forecast, assuming that it will / will not rain, respectively. Or: how often are rain days successfully forecast / dry days unsuccessfully forecast?

  35. Bayes’ Theorem
      We do not have exact values for the likelihoods on the right side of the equation, but if we assume that the model has no bias then, given that it has been correct 90% of the time, we can infer that 90% of the rain days have been forecast.

  36. Bayes’ Theorem

  37. Bayes’ Theorem
      Bayes’ theorem allows us to update probabilities (posterior probabilities):
      • The prior probabilities are simply the best estimate of the probabilities before considering the new information. They may already have been previously updated.
      • The likelihoods indicate how likely the new information is, assuming a specific outcome. For example: how likely is it that the forecast would be for wet conditions assuming that it is going to be wet / dry? (I.e., the hit and false alarm rates of the ROC.)
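
The update described on the Bayes’ theorem slides can be sketched numerically as follows. The prior of 0.54 and the hit rate of 0.9 are taken from the earlier slides; the false alarm rate of 0.1 is an assumption made here for illustration (the transcript does not state it), and the function name is hypothetical.

```python
def bayes_update(prior: float, hit_rate: float, false_alarm_rate: float) -> float:
    """Posterior probability of rain given that the model forecasts rain.

    prior            -- P(rain) before seeing the forecast
    hit_rate         -- P(rain forecast | rain), a likelihood
    false_alarm_rate -- P(rain forecast | no rain), a likelihood
    """
    # Total probability of a rain forecast, summed over both outcomes.
    evidence = hit_rate * prior + false_alarm_rate * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("a rain forecast has zero probability under these inputs")
    return hit_rate * prior / evidence


if __name__ == "__main__":
    prior = 0.54             # from the slides: updated on January 9 rainfall
    hit_rate = 0.9           # from the slides: 90% of rain days forecast
    false_alarm_rate = 0.1   # assumption, not stated in the transcript
    posterior = bayes_update(prior, hit_rate, false_alarm_rate)
    print(f"posterior probability of rain: {posterior:.2f}")
```

Under these assumed likelihoods the posterior comes out at roughly 0.91. A larger false alarm rate would pull it back towards the prior, and a forecast with equal hit and false alarm rates leaves the prior unchanged.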

  38. Conditional Probabilities
      A problem with conditional probabilities is that the sample space is reduced, and so the errors in estimating the relative frequencies increase. These errors increase as the number of conditions is increased, and it is easy to reach the extreme case of having no previous cases with only “unimportant” differences from which to calculate the relative frequencies. (With $n$ binary conditions, the number of possible states is $2^n$.)
      In numerical weather prediction the infinite dimensions of the current atmospheric state are important, and so the current initial conditions are unique.
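
A rough sketch of how quickly the available sample thins out as binary conditions are added. It assumes, purely for simplicity, that all $2^n$ states are equally likely; the 77-year record length comes from the earlier slides, and the function name is hypothetical.

```python
def expected_cases_per_state(n_years: int, n_conditions: int) -> float:
    """Average number of past cases per state with n binary conditions.

    Simplifying assumption: the 2**n possible states are equally likely.
    """
    if n_conditions < 0:
        raise ValueError("n_conditions must be non-negative")
    return n_years / 2 ** n_conditions


if __name__ == "__main__":
    for n in range(0, 8):
        print(n, 2 ** n, expected_cases_per_state(77, n))
```

With a 77-year record, seven binary conditions already give 128 possible states, so on average there is less than one past case per state.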

  39. Conditional Probabilities
      What if the probability of rainfall tomorrow depends on how much rainfall there is today rather than just its occurrence? In this case the outcome depends on a continuous variable rather than on another event measured on a binary scale.
      Jolliffe: some statistical models for calculating probabilities of events that are functions of continuous variables.

  40. Conditional Probabilities
      Similarly, many forecast verification procedures are based on conditional probabilities:
      • reliability: given a forecast of 90% chance of rain, how often does rain occur? $P(E \mid F = f)$
      • NB: this involves a subjective interpretation of probability. We are verifying forecasts with similar levels of confidence, not forecasts with similar boundary / initial conditions.

  41. Conditional Probabilities
      • resolution: can we expect a different outcome given a different forecast?
      Note that reliability and resolution are often confused.
      - Resolution: is $P(E)$ conditional upon the forecast?
      - Reliability: if $F = f$, does $P(E \mid F = f) = f$?

  42. $\mathrm{REL} = (BC)^2$; $\mathrm{REL} = (EC)^2$; $\mathrm{RES} = (AC)^2$
      Note that the y-axis gives the conditional probability of the event given the forecast.

  43. Reliability and Resolution
      RELIABILITY: Are the forecast probabilities correct? Do the forecast probabilities reflect an appropriate level of confidence?
      RESOLUTION: Does the outcome depend on the forecast? Do different forecast probabilities imply actual differences in the probability of an event?
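
The distinction between reliability and resolution can be made concrete with a small sketch. It groups forecasts by their stated probability and computes the two terms in the form used by the standard decomposition of the Brier score; the slides do not name that decomposition, so treat the exact formulas as an assumption. The data and function name are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def reliability_resolution(
    forecasts: Sequence[float], outcomes: Sequence[int]
) -> Tuple[float, float]:
    """Return (reliability, resolution) for probability forecasts of a binary event.

    reliability: weighted mean of (f - P(E | F = f))**2, where 0 is perfectly reliable
    resolution:  weighted mean of (P(E | F = f) - P(E))**2, where larger is better
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and of equal length")
    n = len(forecasts)
    base_rate = sum(outcomes) / n  # unconditional relative frequency, P(E)

    # Collect the outcomes that followed each distinct forecast probability.
    groups: Dict[float, List[int]] = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[f].append(o)

    reliability = 0.0
    resolution = 0.0
    for f, obs in groups.items():
        cond_freq = sum(obs) / len(obs)  # estimate of P(E | F = f)
        reliability += len(obs) * (f - cond_freq) ** 2
        resolution += len(obs) * (cond_freq - base_rate) ** 2
    return reliability / n, resolution / n


if __name__ == "__main__":
    # Hypothetical outcomes: rain on 9 of the first 10 days and 1 of the last 10.
    outcomes = [1] * 9 + [0] + [1] + [0] * 9
    sharp = [0.9] * 10 + [0.1] * 10  # forecasts that match the observed frequencies
    flat = [0.5] * 20                # always forecasting the base rate
    print(reliability_resolution(sharp, outcomes))
    print(reliability_resolution(flat, outcomes))
```

In this invented example both forecast sets are reliable, but only the first has resolution: forecasting the base rate every day is correct on average yet never distinguishes wet days from dry ones.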
