
A Brief Introduction to Statistical Forecasting

Kevin Werner


Presentation Transcript


  1. A Brief Introduction to Statistical Forecasting Kevin Werner

  2. Outline • Principal Component Theory • Applications • Z Score • VIPER

  3. Basic Forecast Methods • Simulation modeling [diagram labels: Snow, Rainfall, Heat, Snowpack, Soil water, Runoff] • Statistical regression [scatterplot, S Fork Rio Grande, Colo: Apr-Jul streamflow % avg vs. May 1 snowpack % avg] Credit: Tom Pagano

  4. The General Linear Regression Model $Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$ where: Y = dependent variable, X_i = independent variables, b_i = regression coefficients, n = number of independent variables Credit: Dave Garen

  5. The Problem If X’s are intercorrelated, they contain redundant information, and the b’s cannot be meaningfully estimated. However, we don’t want to have to throw out most of the X’s but prefer to retain them for robustness. Credit: Dave Garen

  6. Example Streamflow = b0 + b1 * (SNOTEL A) + b2 * (SNOTEL B) -> SNOTEL sites are very well correlated with each other -> An optimal b1 and b2 will be difficult to determine since the correlation is so strong
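
The instability described on slides 5 and 6 can be illustrated with a small simulation. This is only a sketch on synthetic data: the site values, noise levels, and the helper name `fit_two_site_regression` are assumptions made for illustration, not anything taken from the presentation. It refits the two-site regression many times with fresh noise in the streamflow and compares how much b1 alone moves with how much the sum b1 + b2 moves.

```python
import numpy as np

def fit_two_site_regression(site_a, site_b, flow):
    """Least-squares fit of flow = b0 + b1*site_a + b2*site_b."""
    design = np.column_stack([np.ones_like(site_a), site_a, site_b])
    coeffs, *_ = np.linalg.lstsq(design, flow, rcond=None)
    return coeffs

rng = np.random.default_rng(0)
n_years = 30
site_a = rng.normal(20.0, 5.0, n_years)           # synthetic snow values (assumed)
site_b = site_a + rng.normal(0.0, 0.2, n_years)   # nearly identical neighboring site
site_correlation = np.corrcoef(site_a, site_b)[0, 1]

# Refit with new streamflow noise each time and record the coefficients.
fits = []
for _ in range(200):
    flow = 10.0 + 2.0 * site_a + 2.0 * site_b + rng.normal(0.0, 5.0, n_years)
    fits.append(fit_two_site_regression(site_a, site_b, flow))
fits = np.array(fits)

spread_b1 = fits[:, 1].std()                    # how much b1 alone varies
spread_sum = (fits[:, 1] + fits[:, 2]).std()    # how much b1 + b2 varies
print(f"correlation between sites: {site_correlation:.3f}")
print(f"std of b1 across refits:   {spread_b1:.2f}")
print(f"std of b1 + b2:            {spread_sum:.2f}")
```

In a setup like this the combined effect b1 + b2 is estimated far more tightly than either coefficient on its own, which is the sense in which the individual b's "cannot be meaningfully estimated".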

  7. The Solution Possibilities: 1) Pre-combine X’s into composite index(es), e.g., Z-score method 2) Principal components regression These are similar in concept but differ in the mathematics. Credit: Dave Garen

  8. Principal Components Analysis Principal components regression is just like standard regression except the independent variables are principal components rather than the original X variables. Principal components are linear combinations of the X’s. Credit: Dave Garen

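  9. Principal Components Analysis Each principal component is a weighted sum of all the X’s: $PC_1 = e_{11} X_1 + e_{12} X_2 + \dots + e_{1n} X_n$, $PC_2 = e_{21} X_1 + e_{22} X_2 + \dots + e_{2n} X_n$, ..., $PC_n = e_{n1} X_1 + e_{n2} X_2 + \dots + e_{nn} X_n$ Credit: Dave Garen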

  10. Principal Components Analysis The e’s are called eigenvectors, derived from a matrix equation whose input is the correlation matrix of all the X’s with each other. Principal components are new variables that are not correlated with each other. The principal components transformation is equivalent to a rotation of axes. Credit: Dave Garen
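
A minimal sketch of the transformation described on slide 10, assuming NumPy and synthetic predictors (the function name `principal_components` and the data are illustrative assumptions). It takes the eigenvectors of the correlation matrix of the X's, forms each principal component as a weighted sum of the standardized X's, and then checks that the resulting components are uncorrelated with each other.

```python
import numpy as np

def principal_components(x):
    """Return eigenvalues, eigenvectors, and PC scores for predictor matrix x.

    x has one row per year and one column per predictor.
    """
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)   # standardize each column
    corr = np.corrcoef(x, rowvar=False)                # correlation matrix of the X's
    eigenvalues, eigenvectors = np.linalg.eigh(corr)   # eigh: for symmetric matrices
    order = np.argsort(eigenvalues)[::-1]              # largest variance first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    scores = z @ eigenvectors                          # each PC is a weighted sum of X's
    return eigenvalues, eigenvectors, scores

# Four synthetic, strongly intercorrelated predictors (illustrative data only).
rng = np.random.default_rng(1)
common = rng.normal(size=40)
x = np.column_stack([common + 0.3 * rng.normal(size=40) for _ in range(4)])

eigenvalues, eigenvectors, scores = principal_components(x)
pc_corr = np.corrcoef(scores, rowvar=False)
print(np.round(eigenvalues, 3))   # variance carried by each component
print(np.round(pc_corr, 3))       # close to the identity matrix
```

Because the eigenvector matrix is orthonormal, multiplying by it only rotates the axes, which matches the "rotation of axes" description on the slide.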

  11. Principal Components Analysis Credit: Dave Garen

  12. Principal Components Analysis The eigenvectors (weights) are based solely on the intercorrelations among the X’s and have no knowledge of Y (in contrast to Z-score, for which the opposite is true). Principal components can be used for purely descriptive purposes, but we want to use them as independent variables in a regression. Credit: Dave Garen

  13. Credit: Dennis Hartmann

  14. Principal Components Analysis -- Example Independent Variables: • X1 – X5: Snow water equivalent at 5 stations • X6 – X10: Water year to date precipitation at 5 stations • X11: Antecedent streamflow • X12: Climate teleconnection index Credit: Dave Garen

  15. Correlation Matrix Credit: Dave Garen

  16. First Five Eigenvectors Credit: Dave Garen

  17. Principal Components Regression Procedure • Try the PC’s in order • Test for regression coefficient significance (t-test) • Stop at first insignificant component • Transform regression coefficients to be in terms of original variables • Sign test – coefficient signs must be same as correlation with Y Credit: Dave Garen
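
The procedure on slide 17 might be sketched as follows. This is an illustration under stated assumptions, not the implementation used in any operational tool: the function name `pc_regression`, the synthetic data, and the fixed threshold `t_critical=2.0` (a rough stand-in for a two-sided 5% t-test) are all hypothetical. The sketch only reports the outcome of the sign test, since the slide does not say what action follows a failure.

```python
import numpy as np

def pc_regression(x, y, t_critical=2.0):
    """Principal components regression with an in-order t-test stopping rule.

    t_critical=2.0 approximates a two-sided 5% t-test (assumption).
    """
    n_obs, n_vars = x.shape
    x_mean = x.mean(axis=0)
    x_std = x.std(axis=0, ddof=1)
    z = (x - x_mean) / x_std

    eigenvalues, eigenvectors = np.linalg.eigh(np.corrcoef(x, rowvar=False))
    eigenvectors = eigenvectors[:, np.argsort(eigenvalues)[::-1]]
    scores = z @ eigenvectors

    kept, pc_intercept, pc_coeffs = 0, y.mean(), np.zeros(0)
    for k in range(1, n_vars + 1):                 # try the PCs in order
        design = np.column_stack([np.ones(n_obs), scores[:, :k]])
        coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
        residuals = y - design @ coeffs
        sigma2 = residuals @ residuals / (n_obs - k - 1)
        std_err = np.sqrt(sigma2 * np.linalg.inv(design.T @ design)[k, k])
        if abs(coeffs[k] / std_err) < t_critical:  # stop at first insignificant PC
            break
        kept, pc_intercept, pc_coeffs = k, coeffs[0], coeffs[1:]

    # Transform the coefficients back to the original X variables.
    coefficients = (eigenvectors[:, :kept] @ pc_coeffs) / x_std
    intercept = pc_intercept - coefficients @ x_mean

    # Sign test: each coefficient should share the sign of corr(X_j, Y).
    corr_with_y = np.array([np.corrcoef(x[:, j], y)[0, 1] for j in range(n_vars)])
    sign_ok = np.sign(coefficients) == np.sign(corr_with_y)

    return {
        "n_components": kept,
        "intercept": intercept,
        "coefficients": coefficients,
        "sign_ok": sign_ok,
        "fitted": pc_intercept + scores[:, :kept] @ pc_coeffs,
    }

# Synthetic example: four correlated predictors driven by one common signal.
rng = np.random.default_rng(2)
common = rng.normal(size=30)
x = np.column_stack([10 + 3 * common + rng.normal(size=30) for _ in range(4)])
y = 50 + 10 * common + rng.normal(0.0, 3.0, 30)

result = pc_regression(x, y)
print(result["n_components"], np.round(result["coefficients"], 3), result["sign_ok"])
```

The final model is expressed in terms of the original X variables, so `result["intercept"] + x @ result["coefficients"]` reproduces the fitted values obtained in principal component space.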

  18. Summary • Principal components analysis is a standard multivariate statistical procedure • Can be used for descriptive purposes to reduce the dimensionality of correlated variables • Can be taken a step further to provide new, non-correlated independent variables for regression • PC’s taken in order, subject to t-test and sign test • Final model is expressed in terms of original X variables Credit: Dave Garen

  19. Soil Moisture at the interannual timescale • Another example demonstrating the importance of land surface processes in the climate system (Werner, 1999) • GCM run with and without an active land surface model in South America to explore the importance of land surface processes for climate variability in the Nordeste region • Both simulations include a full atmospheric model, a slab ocean model (no ocean dynamics), and a dynamic land surface model; in the Data Land simulation the dynamic land surface model is active everywhere except tropical South America

  20. Soil Moisture at the interannual timescale • Modeled variability • Full dynamic land surface model simulation contains variability resembling observed variability with connection between NH and SH SSTs. • Fixed land surface model shows no connected variability between NH and SH SSTs

  21. Resources • Dave Garen VIPER slides • Dennis Hartmann lecture notes (http://www.atmos.washington.edu/~dennis/)

  22. What does z-score regression do? 1. Combines predictors into weighted indices, emphasizing good stations, minimizing bad ones. 2. Compensates for missing data with remaining data. 3. Regresses index against target predictand Credit: Tom Pagano
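
One way to picture the three steps on slide 22 is the sketch below. It is an assumption-laden illustration rather than VIPER's actual algorithm: the slide says only that good stations are emphasized, so weighting each station by its squared correlation with the target is a hypothetical choice, as are the function name `z_score_index` and the synthetic data. Missing values are handled by averaging over whichever stations are available in a given year, which assumes at least one station reports every year.

```python
import numpy as np

def z_score_index(predictors, target):
    """Combine station z-scores into one weighted index, skipping missing values."""
    mean = np.nanmean(predictors, axis=0)
    std = np.nanstd(predictors, axis=0, ddof=1)
    z = (predictors - mean) / std

    n_stations = predictors.shape[1]
    weights = np.empty(n_stations)
    for j in range(n_stations):
        ok = ~np.isnan(predictors[:, j])
        r = np.corrcoef(predictors[ok, j], target[ok])[0, 1]
        weights[j] = r ** 2        # ASSUMPTION: weight = squared correlation with target

    available = ~np.isnan(z)
    weighted = np.where(available, z * weights, 0.0)
    weight_sum = (available * weights).sum(axis=1)   # only stations present that year
    return weighted.sum(axis=1) / weight_sum, weights

# Synthetic data: two good stations, one noisy station, a few missing values.
rng = np.random.default_rng(3)
signal = rng.normal(size=25)
predictors = np.column_stack([
    signal + 0.2 * rng.normal(size=25),
    signal + 0.3 * rng.normal(size=25),
    signal + 3.0 * rng.normal(size=25),   # the "bad" station
])
predictors[[2, 7], 0] = np.nan
predictors[11, 1] = np.nan
target = 100 + 20 * signal + rng.normal(0.0, 5.0, 25)

index, weights = z_score_index(predictors, target)
slope, intercept = np.polyfit(index, target, 1)   # regress target on the index
print(np.round(weights, 2), round(slope, 2), round(intercept, 2))
```

Because the index is a weighted average of whatever z-scores exist in a given year, a missing station simply drops out of that year's average instead of forcing the year to be discarded.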

  23. What is a z-score? A z-score is a “normalized anomaly”: Z = (value - average) / standard deviation Credit: Tom Pagano

  24. What is a z-score? A z-score is a “normalized anomaly”: Z = (value - average) / standard deviation Credit: Tom Pagano

  25. What is a z-score? A z-score is a “normalized anomaly”: Z = (value - average) / standard deviation. Examples: avg = 135, stdev = 30; avg = 60, stdev = 15 Credit: Tom Pagano

  26. What is a z-score? A z-score is a “normalized anomaly”: Z = (value - average) / standard deviation. Examples: avg = 135, stdev = 30; avg = 60, stdev = 15. Z = (90 – 60)/15 = +2 Credit: Tom Pagano
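
The arithmetic on slide 26 can be checked in a few lines of Python. The function name `z_score` and the three-year `history` list are hypothetical; the record was chosen only so that its mean and sample standard deviation match the slide's 60 and 15.

```python
from statistics import mean, stdev

def z_score(value, average, standard_deviation):
    """Normalized anomaly: distance from the average in standard deviations."""
    return (value - average) / standard_deviation

# Numbers from the slide: value 90, average 60, standard deviation 15.
slide_z = z_score(90, 60, 15)
print(slide_z)  # 2.0

# The same calculation starting from a (hypothetical) historical record.
history = [45, 60, 75]
record_z = z_score(90, mean(history), stdev(history))
print(record_z)
```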

  27. How good are the results? Under conditions of serially complete data and relatively “normal” conditions, PCA and Z-Score are effectively indistinguishable.* Skill and behavior are similar to the official published outlooks.** However… Any tool is a weapon if you hold it right. (aka “A fool with a tool is still a tool”) Credit: Tom Pagano * VIPER technical note – 1 basin ** Pagano dissertation – 29 basins

  28. Super Quick Primer on VIPER

  29. The VIPER Main Interface: Layout and interpretation Credit: Tom Pagano

  30. The VIPER Main Interface: Layout and interpretation • Selecting predictors and predictands • Global month changes Credit: Tom Pagano

  31. The VIPER Main Interface: Layout and interpretation • Selecting predictors and predictands • Global month changes • Predictor quality, availability • Historical statistics Credit: Tom Pagano

  32. The VIPER Main Interface: Layout and interpretation • Selecting predictors and predictands • Forecast vs observed time series • Station availability, weights • Global month changes • Predictor quality, availability • Historical statistics Credit: Tom Pagano

  33. The VIPER Main Interface: Layout and interpretation • Selecting predictors and predictands • Forecast vs observed time series • Station availability, weights • Global month changes • Predictor quality, availability • Fcst vs obs scatterplot • Helper variable scatterplot / Forecast progression • Historical statistics Credit: Tom Pagano

  34. The VIPER Main Interface: Layout and interpretation • Selecting predictors and predictands • Forecast vs observed time series • Station availability, weights • Global month changes • Predictor quality, availability • Fcst vs obs scatterplot • Helper variable scatterplot / Forecast progression • Settings • Probability bounds • Historical statistics Credit: Tom Pagano

  35. The VIPER Main Interface: Layout and interpretation • Selecting predictors and predictands • Forecast vs observed time series • Station availability, weights • Global month changes • Predictor quality, availability • Fcst vs obs scatterplot • Helper variable scatterplot / Forecast progression • Settings • Probability bounds • Historical statistics • There’s more if you scroll right: Relate any variable to another Credit: Tom Pagano
