Bayesian Hierarchical Modeling of Hydroclimate Problems. Balaji Rajagopalan Department of Civil, Environmental and Architectural Engineering And Cooperative Institute for Research in Environmental Sciences (CIRES) University of Colorado Boulder, CO, USA
Bayesian Hierarchical Modeling of Hydroclimate Problems Balaji Rajagopalan Department of Civil, Environmental and Architectural Engineering And Cooperative Institute for Research in Environmental Sciences (CIRES) University of Colorado Boulder, CO, USA Bayes by the Bay Conference, Pondicherry January 7, 2013
Co-authors & Collaborators • Upmanu Lall and Naresh Devineni – Columbia University, NY • Hyun-Han Kwon – Chonbuk National University, South Korea • Carlos Lima – Universidade de Brasília, Brazil • Pablo Mendoza, James McCreight & Will Kleiber – University of Colorado, Boulder, CO • Richard Katz – NCAR, Boulder, CO • NSF, NOAA, US Bureau of Reclamation and Korean Science Foundation
Outline • Bayesian Hierarchical Modeling • Introduction from GLM • Hydroclimate Applications • BHM • Contrast with near Bayesian models currently in vogue • Stochastic Rainfall Generator • BHM (Lima and Lall, 2009, WRR) • Latent Gaussian Process Model (Kleiber et al., 2012, WRR) • Riverflow Forecasting (Kwon et al., 2009, Hydrologic Sciences) • Seasonal Flow • Flow extremes • PaleoReconstruction of Climate (Devineni and Lall, 2012, J. Climate)
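Linear Regression Models Suppose the model relating the regressors to the response is $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \epsilon_i$, $i = 1, 2, \ldots, n$. In matrix notation this model can be written as $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$
Linear Regression Models where $\mathbf{y}$ is the $n \times 1$ vector of responses, $\mathbf{X}$ is the $n \times p$ matrix of regressors with a leading column of ones ($p = k + 1$), $\boldsymbol{\beta}$ is the $p \times 1$ vector of coefficients, and $\boldsymbol{\epsilon}$ is the $n \times 1$ vector of errors
Linear Regression Models We wish to find the vector of least squares estimators that minimizes: $L = \sum_{i=1}^{n} \epsilon_i^2 = (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^{\top}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})$. The resulting least squares estimate is $\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}$
12-1 Multiple Linear Regression Models 12-1.4 Properties of the Least Squares Estimators Unbiased estimators: $E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta}$ Covariance matrix: $\mathbf{C} = (\mathbf{X}^{\top}\mathbf{X})^{-1}$
12-1 Multiple Linear Regression Models 12-1.4 Properties of the Least Squares Estimators Individual variances and covariances: $V(\hat{\beta}_j) = \sigma^2 C_{jj}$, $\operatorname{cov}(\hat{\beta}_i, \hat{\beta}_j) = \sigma^2 C_{ij}$ for $i \neq j$, where $\mathbf{C} = (\mathbf{X}^{\top}\mathbf{X})^{-1}$. In general, $\operatorname{Cov}(\hat{\boldsymbol{\beta}}) = \sigma^2 (\mathbf{X}^{\top}\mathbf{X})^{-1} = \sigma^2 \mathbf{C}$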
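As a small illustration of the least squares estimate, the normal equations (X'X) beta = X'y can be solved directly. The sketch below uses only the Python standard library; the helper names `solve_linear_system` and `ols` and the toy data are assumptions made for this example, not part of the talk.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """Return beta_hat solving the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data generated exactly from y = 1 + 2 x
x_values = [0.0, 1.0, 2.0, 3.0, 4.0]
X = [[1.0, x] for x in x_values]  # leading column of ones for the intercept
y = [1.0 + 2.0 * x for x in x_values]
beta_hat = ols(X, y)
```

Because the toy response is an exact linear function of the regressor, `beta_hat` recovers the intercept and slope used to generate it.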
Generalized Linear Model (GLM) Bayesian Perspective • Linear regression is not appropriate when the dependent variable y is not Normal and transformations of y to Normal are not possible • This arises in several situations (rainfall occurrence; number of wet/dry days; etc.) • Hence, GLM • A linear model is fitted to a ‘suitably’ transformed variable of y • A linear model is fitted to the ‘parameters’ of the assumed distribution of y • Likelihood
Generalized Linear Model (GLM) Bayesian Perspective • Exponential family PDF with its parameters; the commonly used distributions arise from it: Normal, Exponential, Gamma, Binomial, Poisson, etc. • Noninformative prior on β • Assuming a Normal distribution for Y, g(.) is the identity and the GLM reduces to linear regression
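For reference, the GLM structure described on this slide can be written in a standard textbook form. The symbols θ, φ, a(.), b(.), c(.) and the flat prior are the usual generic notation and are assumptions here, since the slide's own equations did not survive.

```latex
f(y \mid \theta, \phi) = \exp\!\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\},
\qquad E(Y) = \mu,
\qquad g(\mu) = \mathbf{x}^{\top}\boldsymbol{\beta},
\qquad p(\boldsymbol{\beta}) \propto 1 .
```

With Y Normal and g the identity, μ = x'β, which is the linear regression model.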
Generalized Linear Model (GLM) Bayesian Perspective • Log and logit – Canonical Link Functions
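The two link functions named here, and their inverses, can be sketched in a few lines. The linear predictor value below is a made-up number for illustration only.

```python
import math


def logit(p):
    """Logit link: maps a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse logit: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def log_link(mu):
    """Log link: maps a positive mean to the real line."""
    return math.log(mu)


def inv_log_link(eta):
    """Inverse log link: maps a linear predictor back to a positive mean."""
    return math.exp(eta)


# Hypothetical linear predictor: intercept + slope * covariate
eta = -0.5 + 0.8 * 1.2
p_wet = inv_logit(eta)          # e.g. probability of a wet day (Binomial/Bernoulli)
mean_count = inv_log_link(eta)  # e.g. expected number of wet days (Poisson)
```

The logit link is the usual choice for occurrence (0/1) data and the log link for counts.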
Generalized Linear Model (GLM) Bayesian Perspective Inverse Chi-Square
Summary • GLM is hierarchical • Specific distribution • Link function • With one simple step, i.e., providing priors and computing the likelihood/posterior, a GLM becomes a BHM • Assuming a Normal distribution for the dependent variable and uninformative priors, BHM collapses to a standard linear regression model • Thus BHM is a generalized framework • Uncertainty in the model parameters and model structure is automatically obtained.
Generalized Linear Model (GLM) Example - Bayesian Hierarchical Model • Hard to sample from the posterior directly • Use MCMC
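To make the MCMC step concrete, here is a minimal random-walk Metropolis sketch for a one-parameter Bernoulli model with a logit link. It is not the sampler used in the talk: the data (30 wet days out of 100), the Normal(0, 10²) prior, the step size and the function names are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(beta, n_wet, n_days, prior_sd=10.0):
    """Unnormalized log posterior for a Bernoulli model with p = inv_logit(beta)."""
    log_p = -math.log1p(math.exp(-beta))  # log(p)
    log_q = -math.log1p(math.exp(beta))   # log(1 - p)
    log_lik = n_wet * log_p + (n_days - n_wet) * log_q
    log_prior = -0.5 * (beta / prior_sd) ** 2  # weakly informative Normal prior
    return log_lik + log_prior


def metropolis(n_wet, n_days, n_iter=5000, step=0.5, seed=1):
    """Random-walk Metropolis sampler for beta."""
    rng = random.Random(seed)
    beta = 0.0
    current = log_posterior(beta, n_wet, n_days)
    samples = []
    for _ in range(n_iter):
        proposal = beta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, n_wet, n_days)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
        samples.append(beta)
    return samples


samples = metropolis(n_wet=30, n_days=100)
kept = samples[1000:]  # discard burn-in
posterior_mean_p = sum(1.0 / (1.0 + math.exp(-b)) for b in kept) / len(kept)
```

In practice, hierarchical models with many parameters are usually fitted with dedicated software; the sketch only shows the accept/reject mechanics.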
Stochastic Weather Generators Precipitation Occurrence, Rain Onset Day (Lima and Lall, 2009) Precipitation Occurrence and Amounts (Kleiber et al., 2012)
Users are most interested in sectoral/process outcomes (streamflows, crop yields, risk of disease X, etc.) • Need for a robust spatial weather generator • [Figure: historical data; synthetic series conditional on climate information; process model; frequency distribution of outcomes]
Need for Downscaling • Seasonal climate forecasts and future climate model projections often have coarse scales: • Spatial: regional • Temporal: seasonal, monthly • Process models (hydrologic models, ecological models, crop growth models) often require daily weather data for a given location • There is a scale mismatch! • Stochastic Weather Generators can help bridge this scale gap.
Precipitation Occurrence • 504 stations in Brazil (latitude & longitude shown in figure) • Lima and Lall (WRR, 2009) • Modeling of rainfall occurrence (0 = dry, 1 = rain, threshold P = 0.254 mm) using a probabilistic model (logistic regression):
Modeling Occurrence at a Site • where yst(n) is a non-homogeneous Bernoulli random variable for station s, day n and year t, equal to 1 for a wet state or 0 for a dry state. • pst(n) is the rainfall probability for station s and day n of year t. The seasonal cycle is modeled through Fourier harmonics:
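A seasonal cycle built from Fourier harmonics inside a logistic model can be sketched as follows. The exact form used by Lima and Lall (2009) is not recoverable from the slide, so the number of harmonics, the 365-day period, the coefficient values and the function name are assumptions.

```python
import math


def occurrence_probability(day, intercept, harmonics, period=365.0):
    """Rainfall probability for a given day of year under a logistic model.

    `harmonics` is a list of (a_k, b_k) pairs for k = 1, 2, ...; each pair
    contributes a_k*cos(2*pi*k*day/period) + b_k*sin(2*pi*k*day/period).
    """
    eta = intercept
    for k, (a_k, b_k) in enumerate(harmonics, start=1):
        angle = 2.0 * math.pi * k * day / period
        eta += a_k * math.cos(angle) + b_k * math.sin(angle)
    return 1.0 / (1.0 + math.exp(-eta))  # inverse logit


# Hypothetical coefficients for one station and year
coeffs = [(1.2, -0.4), (0.3, 0.1)]
probs = [occurrence_probability(n, -1.0, coeffs) for n in range(1, 366)]
max_probability = max(probs)
day_of_max = max(range(1, 366), key=lambda n: probs[n - 1])
```

The maximum probability and the day on which it occurs are the two summaries that the later slides use to characterize rainfall onset.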
Results from Site #3 Outlier?
Bayesian Hierarchical Model (BHM) • But rainfall occurrence is correlated in space; how to model it? Partial pooling with a BHM • Shrinks parameters towards a common mean, reducing uncertainty since more information is used to estimate the model parameters • Parameter uncertainties are fully accounted for during simulations
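Shrinkage towards a common mean can be illustrated with the simplest conjugate case: a Normal station-level estimate combined with a Normal population distribution whose mean and variance are treated as known. This is a simplification of the full BHM, and the function name and numbers are hypothetical.

```python
def partial_pool(site_estimate, site_var, group_mean, group_var):
    """Posterior mean and variance of a site parameter under a Normal-Normal model.

    The posterior mean is a precision-weighted average of the site's own
    estimate and the common (group) mean.
    """
    site_precision = 1.0 / site_var
    group_precision = 1.0 / group_var
    weight = site_precision / (site_precision + group_precision)
    post_mean = weight * site_estimate + (1.0 - weight) * group_mean
    post_var = 1.0 / (site_precision + group_precision)
    return post_mean, post_var


# Hypothetical station: noisy local estimate 2.0, common mean 0.0
post_mean, post_var = partial_pool(site_estimate=2.0, site_var=1.0,
                                   group_mean=0.0, group_var=1.0)
```

The pooled estimate lies between the site estimate and the common mean, and its variance is smaller than the site's own variance, which is the uncertainty reduction the slide refers to.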
Bayesian Hierarchical Model (BHM) Likelihood Function Posterior Distribution – Bayes theorem MCMC to obtain posterior distribution
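One generic way to write the likelihood and posterior for the occurrence model is shown below. It assumes the daily occurrences are conditionally independent given the parameters; θ (station-level parameters) and φ (hyperparameters) are notation introduced for this sketch, not symbols taken from the slide.

```latex
L(\mathbf{y} \mid \boldsymbol{\theta}) = \prod_{s}\prod_{t}\prod_{n}
  p_{st}(n)^{\,y_{st}(n)} \left[ 1 - p_{st}(n) \right]^{1 - y_{st}(n)},
\qquad
p(\boldsymbol{\theta}, \boldsymbol{\phi} \mid \mathbf{y}) \propto
  L(\mathbf{y} \mid \boldsymbol{\theta}) \, p(\boldsymbol{\theta} \mid \boldsymbol{\phi}) \, p(\boldsymbol{\phi}) .
```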
Clusters on average day of max probability [Figure panels: Max Probability of Rainfall; Day of Max Probability of Rainfall] • Max probability of rainfall is correlated with climate variables – ENSO, etc. • Characterize rainfall ‘onset’ • Prediction of ‘onset’ • Lima and Lall (2009, WRR)
Space-time Precipitation Generator Latent Gaussian Process (Kleiber et al., 2012, WRR)
Latent Gaussian Process • Fit a GLM for precipitation occurrence and amounts at each location independently • Occurrence: logistic regression-based • Amounts: Gamma distribution with a link function • Spatial process to smooth the GLM coefficients in space • Almost Bayesian Hierarchical Modeling • Alpha, gamma – shape and scale parameters of the Gamma
Latent Gaussian Process Occurrence Model
Latent Gaussian Process • Parameter Estimation MLE, two step
GLM + Latent Gaussian Process Kleiber et al. (2012)
For Max and Min Temperature Models Conditioned on Precipitation Model - Using Latent Gaussian Process Kleiber et al. (2013, Annals of App. Statistics, in press)
Seasonal average and maximum Streamflow Forecasting (Kwon et al.,2009, Hydrologic Sciences)
Identify Predictors • Correlate seasonal streamflow with large-scale climate variables from preceding seasons • JJA flow with MAM climate • Select regions of strong correlation as predictors (Grantz et al., 2005) • Streamflow Forecasting at Three Gorges Dam
BHM for Seasonal Streamflow • Model • Data showed mild nonlinearity, hence quadratic terms in the model • Prior: half-Cauchy with parameter 25, “mildly informative” (Gelman, 2006, Bayesian Analysis) • MCMC is used to obtain the posterior distributions
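The half-Cauchy prior with parameter 25 can be written down directly. The sketch assumes the prior is placed on a non-negative scale parameter, which is the setting discussed in Gelman (2006); the slide itself does not show which parameter receives it, and the function names are hypothetical.

```python
import math


def half_cauchy_pdf(x, scale=25.0):
    """Density of a half-Cauchy distribution on x >= 0 with the given scale."""
    if x < 0.0:
        return 0.0
    return 2.0 / (math.pi * scale * (1.0 + (x / scale) ** 2))


def half_cauchy_cdf(x, scale=25.0):
    """Cumulative distribution function of the half-Cauchy distribution."""
    if x < 0.0:
        return 0.0
    return 2.0 / math.pi * math.atan(x / scale)


density_at_zero = half_cauchy_pdf(0.0)
median_check = half_cauchy_cdf(25.0)  # the scale parameter is the median
```

The heavy tail leaves room for large values, while half of the prior mass sits below the scale parameter, which is why such a prior is described as only mildly informative.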
Predictors 2, 3, 4 and 5 show tighter bounds • Uncertainty in predictors (i.e. model) is obtained and propagated in the forecasts • You can use PCA or stepwise selection etc. to reduce the number of predictors (this can be crude) • Streamflow Forecasting at Three Gorges Dam
Maximum Seasonal Streamflow Extreme Value Analysis – Floods (Kwon et al.,2010, Hydrologic Sciences)
American River at Fair Oaks - Ann. Max. Flood 100 yr flood estimated from 21 & 51 yr moving windows
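A moving-window estimate of the 100-year flood can be sketched as below. The slide does not say which distribution or fitting method was used, so the Gumbel distribution fitted by the method of moments is an assumption, and the annual maximum series is synthetic rather than the American River record.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_return_level(sample, return_period=100.0):
    """T-year return level from a Gumbel distribution fitted by the method of moments."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)
    scale = sd * math.sqrt(6.0) / math.pi
    loc = mean - EULER_GAMMA * scale
    # Quantile with non-exceedance probability 1 - 1/T
    return loc - scale * math.log(-math.log(1.0 - 1.0 / return_period))


def moving_window_return_levels(series, window, return_period=100.0):
    """Return level re-estimated in every consecutive window of the given length."""
    return [
        gumbel_return_level(series[i:i + window], return_period)
        for i in range(len(series) - window + 1)
    ]


# Synthetic annual maxima (hypothetical units), drawn from a stationary Gumbel
rng = random.Random(0)
annual_max = [1000.0 - 400.0 * math.log(-math.log(rng.random())) for _ in range(90)]

levels_21 = moving_window_return_levels(annual_max, 21)
levels_51 = moving_window_return_levels(annual_max, 51)
```

Even for a stationary synthetic series the short-window estimates fluctuate from window to window, so sampling variability should be kept in mind when reading moving-window plots as evidence of nonstationarity.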
Floods • The time varying (nonstationary) nature of hydrologic (flood) frequency (few examples) • Climate Variability and Climate Change • Climate Mechanisms that lead to changes in flood statistics • Adaptation Strategy • ‘Adaptive’ Flood Risk Estimation • Nonstationary Flood Frequency Estimation • Seasonal to Inter-annual Forecasts & Climate Change • Improved Infrastructure Management • Summary / Climate Questions and Issues related to Hydrologic Extremes
[Figure panels: flood mean given DJF NINO3 and PDO; flood variance given DJF NINO3 and PDO; axes: NINO3, PDO] • Derived using weighted local regression with 30 neighbors • Correlations: Log(Q) vs DJF NINO3: -0.34; Log(Q) vs DJF PDO: -0.32 • Jain & Lall, 2000
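A conditional mean and variance based on the 30 nearest neighbors can be sketched with a tricube-weighted local average. This is a local-constant simplification of weighted local regression, not necessarily the estimator of Jain & Lall (2000); the synthetic grid, the slopes and the function name are assumptions.

```python
import math


def knn_local_moments(points, values, query, k=30):
    """Tricube-weighted local mean and variance at `query` from the k nearest points."""
    pairs = sorted(
        (math.hypot(p[0] - query[0], p[1] - query[1]), v)
        for p, v in zip(points, values)
    )[:k]
    # Bandwidth slightly beyond the k-th neighbor so the weights never all vanish
    bandwidth = pairs[-1][0] * 1.0001 + 1e-12
    weights = [(1.0 - (d / bandwidth) ** 3) ** 3 for d, _ in pairs]
    total = sum(weights)
    mean = sum(w * v for w, (_, v) in zip(weights, pairs)) / total
    var = sum(w * (v - mean) ** 2 for w, (_, v) in zip(weights, pairs)) / total
    return mean, var


# Synthetic (NINO3, PDO) grid with a hypothetical negative dependence of log flow
grid = [i * 0.25 - 2.0 for i in range(17)]
points = [(nino3, pdo) for nino3 in grid for pdo in grid]
log_q = [10.0 - 0.3 * nino3 - 0.3 * pdo for nino3, pdo in points]

mean_warm, var_warm = knn_local_moments(points, log_q, (1.0, 1.0))
mean_cold, var_cold = knn_local_moments(points, log_q, (-1.0, -1.0))
```

Evaluating the function over a grid of (NINO3, PDO) values would give surfaces of the kind described in the figure panels.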
Russian River, CA Flood Event of 18-Feb-04 • Atmospheric river generates flooding • [Figure: Russian River flooding in Monte Rio, California, 18 February 2004; photo courtesy of David Kingsmill] • [Figure: GPS IWV data (cm, inches) from near CZD, 14-20 Feb 2004; map labels: Bodega Bay, Cloverdale, CZD, atmospheric river] • 10” rain at CZD in ~48 hours • Slide from Paul Neiman’s talk
Flood Estimation Under Nonstationarity • Significant interannual/interdecadal variability of floods • Stationarity assumptions (i.i.d) are invalid • Large scale climate features in the Ocean-Atmosphere-Land system orchestrate floods at all time scales • Need tools that can capture the nonstationarity • Incorporate large scale climate information • Year-to-Year time scale (Climate Variability) • Flood mitigation planning, reservoir operations • Interdecadal time scale (Climate Variability and Change) • Facility design, planning and management