
Bayesian Spatial Modeling of Extreme Precipitation Return Levels

Daniel COOLEY, Douglas NYCHKA, and Philippe NAVEAU (2007, JASA).

Presentation Transcript


  1. Bayesian Spatial Modeling of Extreme Precipitation Return Levels Daniel COOLEY, Douglas NYCHKA, and Philippe NAVEAU (2007, JASA)

  2. Background • July 28, 1997, a rainstorm in Fort Collins, Colorado • killed five people • caused $250 million in damage • 1976 Big Thompson flood near Loveland, Colorado • Killed 145 people • 1965 South Platte flood • $600 million in damages around Denver

  3. Extreme precipitation events • Understanding their frequency and intensity is important for public safety and long-term planning • Challenges • Limited temporal records • Distributions must be extrapolated to locations where observations are not available • Data • Precipitation amounts at some stations • Possibly some other covariates

  4. Measure of extreme events • Return level • The r-year return level is the quantile that has probability 1/r of being exceeded in a particular year: P(X > t_r) = 1/r, where t_r is the r-year return level • Precipitation return levels • Given in the context of the duration of the precipitation event • The r-year return level of a d-hour (e.g., 6- or 24-hour) duration interval is reported • The standard levels for the NWS’s most recent data products are quite extensive, with duration intervals ranging from 5 minutes to 60 days and with return levels for 2–500 years • This article focuses on providing return level estimates for daily precipitation (24 hours)

  5. Most recent precipitation atlas for Colorado • Produced in 1973 • the atlas provides point estimates of 2-, 5-, 10-, 25-, 50-, and 100-year return levels for duration intervals of 6 and 24 hours. • Shortcoming • it does not provide uncertainty measures of its point estimates

  6. Extreme value theory (EVT) • Statistical models for the tail of a probability distribution • Univariate case: generalized extreme value (GEV) distribution • Given iid continuous data Z_1, Z_2, ..., Z_n and letting M_n = max(Z_1, Z_2, ..., Z_n), it is known that if the distribution of the suitably normalized M_n converges as n → ∞, then it converges to a GEV distribution
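
For reference, the GEV family named on this slide is usually written as below. The symbols μ (location), σ (scale), and ξ (shape) are standard notation assumed here; they do not appear in the transcript.

```latex
G(z) = \exp\left\{ -\left[ 1 + \xi \left( \frac{z - \mu}{\sigma} \right) \right]^{-1/\xi} \right\},
\qquad 1 + \xi \left( \frac{z - \mu}{\sigma} \right) > 0, \quad \sigma > 0 .
```

The case ξ = 0 is read as the limit ξ → 0, which gives the Gumbel form exp{−exp[−(z − μ)/σ]}.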

  7. Generalized Pareto distribution (GPD) • Using the maxima only disregards other extreme data that could provide additional information. • GPD • based on the exceedances above a threshold • Exceedances (the amounts that observations exceed a threshold u) should approximately follow a GPD as u becomes large and sample size increases

  8. GPD • Tail of the distribution • Scale parameter • Shape parameter controls the tail
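
The formula on this slide did not survive in the transcript. The standard GPD tail approximation, which is presumably what the bullets annotate, is sketched below; the symbols σ_u (scale, depending on the threshold u) and ξ (shape) are assumed notation.

```latex
P(Z - u > z \mid Z > u) \approx \left( 1 + \frac{\xi z}{\sigma_u} \right)^{-1/\xi},
\qquad z > 0, \quad 1 + \frac{\xi z}{\sigma_u} > 0 .
```

With ξ > 0 the tail is heavy (polynomial decay), ξ → 0 gives an exponential tail, and ξ < 0 gives a tail with a finite upper bound.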

  9. More EVT • Exceedance rate
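
The slide's formulas are missing from the transcript. A hedged reconstruction of how the exceedance rate usually enters a GPD analysis is given below, with ζ_u = P(Z > u) denoting the exceedance rate and n_y the number of observations per year; both symbols are assumed notation.

```latex
P(Z > z + u) = \zeta_u \left( 1 + \frac{\xi z}{\sigma_u} \right)^{-1/\xi},
\qquad
z_r = u + \frac{\sigma_u}{\xi} \left[ \left( r \, n_y \, \zeta_u \right)^{\xi} - 1 \right].
```

Here z_r is the level exceeded on average once every r years, obtained by setting the tail probability equal to 1/(r n_y) and solving for z + u.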

  10. Extremes of spatial data • Weather describes the state of the atmosphere at a given time • Extreme weather events can be modeled by theory on the dependence of extreme observations • Climate at a given location is the distribution of weather over a long period of time • Climatological quantities, such as return levels, and their spatial dependence must be modeled outside of the framework above • How does the distribution of precipitation vary over space?

  11. Goal • Let Z(x) denote the total precipitation for a given period of time (e.g., 24 hours) at location x • The goal is to provide inference for the probability P(Z(x) > z + u) for all locations x in a particular domain and for u large • Given this function, one can compute return levels and other summary measures • Produce a return level map with measures of uncertainty
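
As an illustration of the step from the tail probability to a return level, here is a minimal Python sketch. It assumes the standard GPD tail form P(Z > z + u) = ζ (1 + ξ z/σ)^(−1/ξ); the function names, the parameter values, and the choice of 214 observations per year (the days from Apr 1 to Oct 31) are illustrative assumptions, not values from the paper.

```python
import math


def gpd_tail_prob(z, sigma, xi, zeta):
    """P(Z > z + u) for an excess z >= 0 over the threshold u (assumed GPD tail)."""
    if abs(xi) < 1e-12:
        # Exponential limit as the shape parameter tends to zero.
        return zeta * math.exp(-z / sigma)
    base = 1.0 + xi * z / sigma
    if base <= 0.0:
        # Beyond the finite upper endpoint that exists when xi < 0.
        return 0.0
    return zeta * base ** (-1.0 / xi)


def return_level(r, u, sigma, xi, zeta, n_per_year):
    """Level exceeded on average once every r years, given n_per_year observations."""
    m = r * n_per_year * zeta  # expected number of threshold exceedances in r years
    if abs(xi) < 1e-12:
        return u + sigma * math.log(m)
    return u + sigma / xi * (m ** xi - 1.0)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the functions.
    print(return_level(r=25, u=0.55, sigma=0.4, xi=0.1, zeta=0.02, n_per_year=214))
```

In a Bayesian analysis the same function would be evaluated once per posterior draw of the parameters, which is what yields an uncertainty band around each return level.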

  12. Basic idea • In the GPD model, we add a spatial component by considering all parameters to be functions of a location x in the study area. • We assume that the values of these parameters result from a latent spatial process that characterizes the extreme precipitation and arises from climatological and orographic effects. • The dependence of the parameters characterizes the similarity of climate at different locations

  13. A Bayesian study • A study of 24-hour precipitation extremes for the Front Range region of Colorado • Estimate potential flooding • Apr 1 – Oct 31 • 75% of Colorado’s population lives in this area

  14. Study Region

  15. Data • 56 weather stations • Daily total precipitation amounts during 1948-2001 • 21 stations have over 50 years of data • 14 stations have fewer than 20 years of data • All stations have some missing values • Covariates • Elevation • Mean precipitation (MSP) • Remark: covariate information is needed for the entire region to interpolate over the study region and produce a precipitation map

  16. Boulder Station

  17. Data Precision • Boulder Station • prior to 1971, precipitation was recorded to the nearest 1/100th of an inch (.25 mm) • after 1971, recorded to the nearest 1/10th of an inch (2.5 mm) • All but three stations similarly switched their level of precision around 1970 • Low precision data is a discretization of the high precision data

  18. Treatment of discretization • True value is uniformly distributed around the observed value • What is the effect of such an assumption? • Adjust the likelihood • d is the length of the interval
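
The adjusted likelihood itself is not in the transcript. One way to read the bullets is to average the GPD density over the rounding interval of length d centred on the recorded value. The sketch below implements that reading; it is an assumption and may differ from the authors' adjustment. The function names are hypothetical.

```python
import math


def gpd_cdf(z, sigma, xi):
    """GPD distribution function for an excess z over the threshold."""
    if z <= 0.0:
        return 0.0
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-z / sigma)
    base = 1.0 + xi * z / sigma
    if base <= 0.0:
        return 1.0  # past the upper endpoint when xi < 0
    return 1.0 - base ** (-1.0 / xi)


def rounded_likelihood(z_obs, d, sigma, xi):
    """Average GPD density over [z_obs - d/2, z_obs + d/2].

    z_obs is the recorded excess over the threshold and d is the width of
    the rounding interval (for example 0.1 inch for low-precision records).
    """
    lower = z_obs - d / 2.0
    upper = z_obs + d / 2.0
    return (gpd_cdf(upper, sigma, xi) - gpd_cdf(lower, sigma, xi)) / d
```

As d shrinks, this quantity approaches the ordinary GPD density at z_obs, so high-precision records are treated essentially as exact.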

  19. How to choose the threshold u? • Bias-variance trade off • If u is large, distribution is close to GPD • If u is large, less data can be used • Finally, the threshold is taken as 0.55 inches • a threshold sensitivity analysis of model runs indicates that the shape parameter is more consistently estimated above this threshold • 7789 exceedances (2% of the original data)

  20. Residual dependence • Assumption • the precipitation observations are conditionally independent spatially and temporally given the stations’ parameters • the spatial dependence is accounted for in the stations’ parameters • This conditional independence may not be true, though.

  21. Temporal independence • Temporal dependence • When dependence is short range and extremes do not occur in clusters, maxima still converge to GEV in distribution • If a station had consecutive days that exceeded the threshold, we declustered the data by keeping only the highest measurement • Declustering did not change the results much
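
The declustering rule described on this slide can be sketched as follows. The function name and the treatment of missing days (a missing value ends a run) are assumptions made for the example.

```python
def decluster(daily, u):
    """Keep only the largest value in each run of consecutive exceedance days.

    daily: precipitation amounts in date order, with None for missing days.
    u:     threshold.
    Returns a list of (day_index, value) pairs, one per run.
    """
    peaks = []
    best = None  # (index, value) of the largest exceedance in the current run
    for i, value in enumerate(daily):
        if value is not None and value > u:
            if best is None or value > best[1]:
                best = (i, value)
        else:
            # A non-exceedance (or a missing day) closes the current run.
            if best is not None:
                peaks.append(best)
                best = None
    if best is not None:
        peaks.append(best)
    return peaks


if __name__ == "__main__":
    series = [0.1, 0.6, 0.9, 0.7, 0.0, 0.8, None, 1.2]
    print(decluster(series, 0.55))
```

For the series in the usage example, the three-day run collapses to its largest value and the two isolated exceedances are kept unchanged.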

  22. Spatial dependence • The authors tested for spatial dependence in the annual maximum residuals of the stations • there was a low level of dependence between stations within 24 km (15 miles) of one another and no detectable dependence beyond this distance. • there are very few stations within this distance that record data for the same time period

  23. Seasonal effects • Restricting our analysis to the nonwinter months reduces seasonality • inspecting the data from several sites showed no obvious seasonal effect

  24. Model for Threshold Exceedance • Hierarchical model • Layer 1: data at each station • Layer 2: the latent process that drives the climatological extreme precipitation for the region • Layer 3: the prior distributions of the parameters that control the latent process

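  25. Data layer for return level • A GPD distribution • Reparametrization: the scale parameter is modeled on the log scale • Let Z_k(x_i) be the kth recorded precipitation amount at location x_i; given that it exceeds the threshold, its excess over the threshold has the GPD density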
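
The density on this slide is missing from the transcript. As a sketch, the GPD log-density with the scale written as σ = exp(φ) can be coded as below. The log parametrization is inferred from the later slide on the log-transformed scale parameter, and the function names are hypothetical.

```python
import math


def gpd_logpdf(z, phi, xi):
    """Log GPD density of an excess z >= 0, with scale sigma = exp(phi)."""
    if z < 0.0:
        return float("-inf")
    sigma = math.exp(phi)
    if abs(xi) < 1e-12:
        # Exponential limit as the shape parameter tends to zero.
        return -phi - z / sigma
    arg = xi * z / sigma
    if 1.0 + arg <= 0.0:
        return float("-inf")  # outside the support
    return -phi - (1.0 / xi + 1.0) * math.log1p(arg)


def station_loglik(excesses, phi, xi):
    """Sum of log-densities for one station, assuming conditional independence."""
    return sum(gpd_logpdf(z, phi, xi) for z in excesses)
```

Working with φ rather than σ removes the positivity constraint, which is convenient when the parameter is later given a Gaussian process prior.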

  26. Process layer • A structure that relates the parameters of the data layer to the orography and climatology of the region. • Spatial (longitude/latitude) space → climate (elevation/MSP) space • Stations are sparse in the spatial space • Stations far away spatially can be close in the climate space • MSP: mean precipitation

  27. Scale parameter • Log-transformed scale parameter: a Gaussian process with a mean function and a covariance function
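
The mean and covariance functions on this slide did not survive in the transcript. To illustrate what a Gaussian process over station locations involves, the sketch below builds a covariance matrix from an exponential covariance function, a common choice that is assumed here and not taken from the slide. The parameter names beta0 (sill) and beta1 (inverse range) and the station coordinates are hypothetical, and the coordinates are assumed to be already rescaled to comparable units.

```python
import math


def exp_cov_matrix(points, beta0, beta1):
    """Covariance matrix with entries beta0 * exp(-beta1 * distance).

    points: station coordinates, e.g. (elevation, MSP) pairs in climate space.
    """
    n = len(points)
    return [
        [beta0 * math.exp(-beta1 * math.dist(points[i], points[j])) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    stations = [(0.0, 0.0), (3.0, 4.0), (0.5, 0.2)]  # hypothetical rescaled coordinates
    for row in exp_cov_matrix(stations, beta0=2.0, beta1=0.1):
        print([round(v, 3) for v in row])
```

Because distance is measured between whatever coordinates are supplied, the same function works in longitude/latitude space or in climate space; only the input points change.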

  28. Shape parameter • A single value for the entire study region with a Unif(-Inf, Inf) prior • Two values • One for the mountain stations • One for the plain stations • A Gaussian process with structure similar to the scale parameter

  29. Process layer

  30. Priors of the parameters that control the latent process • Prior independence • Regression parameter: noninformative • Spatial parameter • Noninformative leads to improper posterior • Informative priors from MLE • Shape parameter

  31. Priors

  32. Model for Exceedance Rate • To know the return level, we need to know both the model parameters and the exceedance rate • Assume each station’s number of exceedances is binomial with an exceedance probability parameter • Logit transformation • Model the logit-transformed parameter as a Gaussian process • Similar prior specification
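
A minimal sketch of the data-level piece of this model: a binomial log-likelihood for one station's exceedance count, written in terms of the logit-transformed probability. The function names and the counts in the usage example are hypothetical.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Map a real number back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def binom_loglik(n_exceed, n_obs, eta):
    """Binomial log-likelihood of n_exceed exceedances in n_obs observations.

    eta is the logit of the exceedance probability.
    """
    zeta = inv_logit(eta)
    log_choose = (
        math.lgamma(n_obs + 1)
        - math.lgamma(n_exceed + 1)
        - math.lgamma(n_obs - n_exceed + 1)
    )
    return log_choose + n_exceed * math.log(zeta) + (n_obs - n_exceed) * math.log(1.0 - zeta)


if __name__ == "__main__":
    # Hypothetical station: 20 exceedances in 1000 observed days.
    print(binom_loglik(20, 1000, logit(0.02)))
```

The logit scale is unbounded, so a Gaussian process prior can be placed on eta across stations without any constraint handling.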

  33. MCMC • Metropolis within Gibbs • Proposal distribution is obtained using normal approximation or random walk • Three parallel chains • Each chain has 20,000 iterations • 2,000 burn-in steps • Test for convergence: Gelman's convergence statistic (R-hat) below 1.05 • Draws are used to perform spatial interpolation and inference
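
The convergence check is read here as the Gelman-Rubin potential scale reduction factor computed across the parallel chains; that reading is an assumption, since the slide gives only the 1.05 cutoff. The sketch below computes the basic form of the statistic for one scalar parameter. The function name is hypothetical, and burn-in draws are assumed to have been dropped already.

```python
import math
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for one scalar parameter.

    chains: list of equal-length lists of post-burn-in draws, one per chain.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain sample variance.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: sample variance of the chain means.
    between_over_n = statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between_over_n
    return math.sqrt(pooled / within)


if __name__ == "__main__":
    import random

    random.seed(0)
    # Three hypothetical chains drawn from the same distribution.
    chains = [[random.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
    print(gelman_rubin(chains) < 1.05)
```

Values near 1 indicate that the chains agree with one another; chains stuck in different regions inflate the between-chain term and push the statistic well above the cutoff.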

  34. Point estimate for log-transformed GPD scale parameter

  35. Point estimate for 25-year return level for daily precipitation

  36. 0.025 and 0.975 quantile of the 25-year return level

  37. Sensitivity analysis • Sensitivity of the inference to prior of • Ran Model 7 with • Original prior for : Unif[6/7,12] • Alternative prior : Unif[0.214,6] • Posterior of is sensitive to the prior • But the product is less sensitive, and it is what is important for interpolation

  38. Conclusions • A Bayesian analysis for spatial extremes • Model for exceedances • Model for threshold exceedance rate parameter • By performing the spatial analysis on locations defined by climatological coordinates, the authors were able to better model regional differences for this geographically diverse study area. • Produce a map of return levels with features not well shown by the 1973 atlas • an east–west region of higher return levels north of the Palmer Divide • a region of lower return levels around Greeley • region-wide uncertainty measures
