Efficient estimation of flood quantiles using GEV distribution fit on censored flood records with Bayesian GLS regression for shape parameter. GLS model error analysis and regional estimation of parameters.
GEV Flood Quantile Estimators with Bayesian Shape-Parameter GLS Regression. Dirceu Silveira Reis Jr., Jery R. Stedinger, and Eduardo Savio Martins. Fundação Cearense de Meteorologia e Recursos Hídricos (FUNCEME), Fortaleza, Brazil, and Cornell University, Ithaca, NY, USA.
The Challenge: Wish to estimate extreme flood quantiles (such as the 99th percentile) and the flood risk distribution by fitting a 3-parameter GEV distribution to annual maximum flood records of length 25 to 100 years. Some records have censored values: only know that floods were below or above a threshold, due to “zero” values and recording limitations for historical records.
The Problem: Would like to use maximum likelihood estimators (MLEs) because of their asymptotic properties and their ability to incorporate censored data. • But absurd MLEs of $\kappa$ can result in moderate samples • Large uncertainty in extreme flood quantile estimates • BEWARE: Statisticians often use $\xi = -\kappa$.
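Because the sign convention changes which tail is heavy, a minimal sketch of the GEV quantile function in the hydrologic parameterization may help. It assumes the standard form x_p = loc + (scale/kappa)[1 - (-ln p)^kappa] with the Gumbel limit at kappa = 0; the function name, argument names, and parameter values are illustrative and not taken from the slides.

```python
import math

def gev_quantile(p, loc, scale, kappa):
    """Quantile x_p of the GEV distribution, hydrologic sign convention.

    kappa < 0 gives the heavy (unbounded) upper tail; the shape parameter
    commonly used by statisticians is the negative of kappa.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    y = -math.log(p)              # -ln p, always positive
    if abs(kappa) < 1e-9:         # Gumbel limit as kappa -> 0
        return loc - scale * math.log(y)
    return loc + (scale / kappa) * (1.0 - y ** kappa)

# Made-up parameters: the 100-year flood is the p = 0.99 quantile.
q100_heavy = gev_quantile(0.99, loc=100.0, scale=50.0, kappa=-0.2)
q100_gumbel = gev_quantile(0.99, loc=100.0, scale=50.0, kappa=0.0)
print(q100_heavy > q100_gumbel)   # negative kappa inflates the upper quantile
```

With location and scale held fixed, a small change in kappa moves the 100-year flood substantially, which is why the precision of the shape estimate dominates the uncertainty in extreme quantiles.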
To reduce uncertainty • Use regression to derive a regional shape parameter $\kappa$ and its variance using basin characteristics. • At-site $\kappa$ estimates obtained with L-moments. • Small-sample variance and cross-correlations of $\kappa$-estimators derived by Monte Carlo analysis. • Combine at-site data with regional information, expressed as a prior probability distribution on $\kappa$, to define Generalized Maximum Likelihood Estimators (GMLE)
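The at-site L-moment step can be sketched as follows. This assumes the usual unbiased probability-weighted moment estimators and Hosking's polynomial approximation for the GEV shape parameter (kappa ≈ 7.8590c + 2.9554c², with c = 2/(3 + t3) - ln 2/ln 3); the function names and the flood values are hypothetical.

```python
import math

def sample_l_moments(x):
    """Return (l1, l2, t3) computed from unbiased probability-weighted moments."""
    xs = sorted(x)
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 observations")
    b0 = sum(xs) / n
    b1 = sum(j * xs[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * xs[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3 / l2

def gev_from_l_moments(x):
    """L-moment estimates (loc, scale, kappa), hydrologic sign convention."""
    l1, l2, t3 = sample_l_moments(x)
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    kappa = 7.8590 * c + 2.9554 * c * c
    if abs(kappa) < 1e-6:                      # Gumbel limit
        scale = l2 / math.log(2.0)
        loc = l1 - 0.5772156649 * scale
        return loc, scale, 0.0
    g = math.gamma(1.0 + kappa)
    scale = l2 * kappa / ((1.0 - 2.0 ** (-kappa)) * g)
    loc = l1 - scale * (1.0 - g) / kappa
    return loc, scale, kappa

# Hypothetical annual maximum floods (illustrative values only).
floods = [120, 85, 230, 310, 95, 150, 410, 175, 60, 140, 200, 265, 130, 90, 520]
loc_hat, scale_hat, kappa_hat = gev_from_l_moments(floods)
print(loc_hat, scale_hat, kappa_hat)
```

These at-site kappa estimates are the kind of dependent variable a regional regression would use; their sampling variances and cross-correlations would have to be supplied separately, for example by Monte Carlo simulation.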
Outline • GLS Regression Model • Classical model error estimator • Bayesian analysis • GLS Regional $\kappa$ for Illinois River Basin • GEV for Illinois Sites with Regional $\kappa$ • Conclusions
GLS Regression GOAL: Obtain efficient estimators of a hydrologic statistic as a function of physiographic basin characteristics. MODEL: $\kappa = \alpha + \beta_1 \log(\mathrm{Area}) + \beta_2 (\mathrm{Slope}) + \dots + \mathrm{Error}$
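GLS Model: Why use Generalized Least Squares? Different record lengths $m_i$ => different precisions, PLUS cross-correlation among estimators. Basic model for $\kappa$: $\Lambda(\sigma_\delta^2) = \mathrm{Cov}\{\varepsilon_i, \varepsilon_j\} = \sigma_\delta^2 I + \Sigma(\hat\kappa)$, where $\Sigma(\hat\kappa)$ is computed using an average $\kappa$ with record lengths $m_i$ and cross-correlations $\rho_{i,j}$ between concurrent flows.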
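GLS Analysis: Solution. GLS regression model (Stedinger & Tasker, 1985, 1989): $\hat\kappa = X\beta + \varepsilon$, with parameter estimator $b$ for $\beta$: $\{X^T \Lambda(\hat\sigma_\delta^2)^{-1} X\}\, b = X^T \Lambda(\hat\sigma_\delta^2)^{-1} \hat\kappa$. Estimate model-error variance $\sigma_\delta^2$ using moments: $(\hat\kappa - Xb)^T \Lambda(\hat\sigma_\delta^2)^{-1} (\hat\kappa - Xb) = n - p$, with $\Lambda(\sigma_\delta^2) = \sigma_\delta^2 I + \Sigma(\hat\kappa)$; $n$ = dimension of the $\hat\kappa$ vector; $p$ = dimension of $b$.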
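The two coupled equations of the GLS solution can be sketched numerically as follows. This is a minimal illustration, not the authors' code: it assumes the sampling covariance matrix `Sigma` is already available, solves the moment equation by bisection (the quadratic form decreases as the model error variance grows), and returns zero when no positive root exists. All names and the synthetic data are hypothetical.

```python
import numpy as np

def gls_fit(y, X, Sigma, s2):
    """GLS estimate b, its covariance, and the residual quadratic form for a given s2."""
    Lam = s2 * np.eye(len(y)) + Sigma           # Lambda = s2 * I + Sigma
    Li_X = np.linalg.solve(Lam, X)              # Lambda^{-1} X
    XtLiX = X.T @ Li_X                          # X^T Lambda^{-1} X
    b = np.linalg.solve(XtLiX, Li_X.T @ y)      # GLS normal equations
    resid = y - X @ b
    Q = resid @ np.linalg.solve(Lam, resid)     # (y - Xb)^T Lambda^{-1} (y - Xb)
    return b, np.linalg.inv(XtLiX), Q

def moment_model_error_variance(y, X, Sigma, tol=1e-10, max_iter=200):
    """Solve Q(s2) = n - p for s2 >= 0 by bisection; return 0 if Q(0) <= n - p."""
    n, p = X.shape
    g = lambda s2: gls_fit(y, X, Sigma, s2)[2] - (n - p)
    if g(0.0) <= 0.0:
        return 0.0                              # sampling error alone explains the scatter
    lo, hi = 0.0, 1.0
    while g(hi) > 0.0:                          # expand until the root is bracketed
        lo, hi = hi, 2.0 * hi
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Synthetic illustration: 30 sites, intercept plus one basin characteristic.
rng = np.random.default_rng(0)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Sigma = np.diag(rng.uniform(0.005, 0.02, size=n))   # made-up sampling variances
y = X @ np.array([-0.1, 0.05]) + rng.normal(scale=0.2, size=n)
s2_hat = moment_model_error_variance(y, X, Sigma)
b_hat, cov_b, Q = gls_fit(y, X, Sigma, s2_hat)
print(s2_hat, b_hat)
```

The early return at zero is the behaviour that makes the moment estimator awkward when sampling variance dominates: the estimate sits on the boundary and carries no measure of its own precision.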
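Measures of Precision. Model error variance: $\mathrm{Var}[\delta] = E\{[\kappa_i - x_i^T \beta]^2\} = \sigma_\delta^2$. Parameter (sampling) error variance: $\mathrm{Var}[b] = (X^T \Lambda^{-1} X)^{-1}$. Variance of prediction for a new site: $VP_{new} = E\{[\kappa_0 - x_0^T b]^2\} = \sigma_\delta^2 + x_0^T (X^T \Lambda^{-1} X)^{-1} x_0$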
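As a small worked illustration of the prediction variance, the sketch below evaluates the two terms separately for a new site. It assumes the model error variance `s2` and the sampling covariance `Sigma` are given; the function name and the numbers are made up.

```python
import numpy as np

def prediction_variance(x0, X, Sigma, s2):
    """Return (VP_new, sampling term) for a new site with covariate vector x0."""
    Lam = s2 * np.eye(X.shape[0]) + Sigma
    cov_b = np.linalg.inv(X.T @ np.linalg.solve(Lam, X))   # Var[b]
    sampling = float(x0 @ cov_b @ x0)                       # x0^T Var[b] x0
    return s2 + sampling, sampling

# Made-up example: 5 sites, intercept plus log(area), diagonal sampling covariance.
X = np.column_stack([np.ones(5), np.log([50.0, 120.0, 300.0, 800.0, 1500.0])])
Sigma = np.diag([0.020, 0.015, 0.010, 0.008, 0.006])
x_new = np.array([1.0, np.log(400.0)])
vp_new, sampling_part = prediction_variance(x_new, X, Sigma, s2=0.01)
print(vp_new, sampling_part)
```

The model error term sets a floor on the prediction variance that longer records at the gauged sites cannot remove; only the second term shrinks as more or better data enter the regression.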
Pseudo $R^2$ for GLS: Not interested in the total error, which includes sampling error that we cannot hope to explain; how much of the critical model error can we explain, where $\mathrm{Var}[\delta] = \sigma_\delta^2$? Consider the GLS model:
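One common way to formalize this question, offered here as an assumption rather than as the slide's own equation, compares the estimated model error variance of a regression with $k$ explanatory variables, $\hat\sigma_\delta^2(k)$, against that of a constant-only model, $\hat\sigma_\delta^2(0)$:

```latex
R_\delta^2 = 1 - \frac{\hat\sigma_\delta^2(k)}{\hat\sigma_\delta^2(0)}
```

Under this definition a value near 1 means the basin characteristics account for almost all of the site-to-site variation in the true statistic, while sampling error is excluded from both the numerator and the denominator.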
Key Lesson: For a range of problems, GLS provides a flexible procedure for regionalization when only imperfect estimates of hydrologic parameters are available. It generally provides unbiased estimators of the model error variance and Var(b), and efficient estimators $b$ of $\beta$.
Moment Estimator’s Drawbacks • Estimated $\sigma_\delta^2$ often equals zero when sampling variance is the dominant error. • Moments procedure does not provide a natural measure of precision of the model error variance $\sigma_\delta^2$. • Asymptotic MLE approximations have trouble with the bound at zero.
Likelihood function for model error variance $\sigma_\delta^2$ (Tibagi River, Brazil, n=17). Maximum of likelihood may be at zero, but larger values are very probable. Zero is clearly not in the middle of the likely range of values.
Advantages of Bayesian Analysis: Provides the posterior distribution of the parameters $\beta$ and the model error variance $\sigma_\delta^2$, and the predictive distribution for the dependent variable. The Bayesian approach is a natural solution to the problem.
Bayesian GLS Model • Prior distribution: $\xi(\beta, \sigma_\delta^2)$ • Parameters $\beta$ are multivariate normal (Q) • Model error variance $\sigma_\delta^2$: exponential dist. ($\lambda$); $E[\sigma_\delta^2]$ = = 24 • Likelihood function: assume data are multivariate $N[X\beta, \Lambda(\sigma_\delta^2)]$
Quasi-Analytic Bayesian GLS • Joint posterior distribution of $\beta$ and $\sigma_\delta^2$ • Marginal posterior of $\sigma_\delta^2$, where the normal likelihood and prior are integrated analytically over $\beta$ to determine f in closed form.
Quasi-Analytic Result: From the joint posterior distribution, can compute the marginal posterior of $\beta$ and its moments by one-dimensional numerical integrations.
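The quasi-analytic idea can be sketched as follows. To keep the example short it assumes a flat prior on $\beta$ (a limiting case, not the multivariate normal prior named in the slides) and an exponential prior with rate `lam` on the model error variance. Integrating $\beta$ out analytically leaves a one-dimensional function of the model error variance, which is then normalized and summarized on a grid. The function names, the grid truncation at `s2_max`, and the value of `lam` are all illustrative assumptions.

```python
import numpy as np

def log_marginal_posterior(s2, y, X, Sigma, lam):
    """Unnormalized log posterior of s2 with beta integrated out (flat prior on beta)."""
    Lam = s2 * np.eye(len(y)) + Sigma
    Li_X = np.linalg.solve(Lam, X)
    XtLiX = X.T @ Li_X
    b = np.linalg.solve(XtLiX, Li_X.T @ y)      # GLS estimate at this s2
    resid = y - X @ b
    Q = resid @ np.linalg.solve(Lam, resid)
    _, logdet_Lam = np.linalg.slogdet(Lam)
    _, logdet_XtLiX = np.linalg.slogdet(XtLiX)
    # exponential prior on s2 times the beta-integrated normal likelihood
    return -lam * s2 - 0.5 * (logdet_Lam + logdet_XtLiX + Q)

def trapezoid(values, grid):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

def posterior_summary(y, X, Sigma, lam, s2_max, n_grid=2001):
    """Normalized posterior density of s2 on a grid, plus its mean and variance."""
    grid = np.linspace(0.0, s2_max, n_grid)
    logf = np.array([log_marginal_posterior(s, y, X, Sigma, lam) for s in grid])
    f = np.exp(logf - logf.max())               # rescale before exponentiating
    f = f / trapezoid(f, grid)                  # 1-D numerical normalization
    mean = trapezoid(grid * f, grid)
    var = trapezoid((grid - mean) ** 2 * f, grid)
    return grid, f, mean, var

# Synthetic case in which the data show no scatter beyond sampling error.
rng = np.random.default_rng(1)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Sigma = 0.02 * np.eye(n)                        # made-up sampling covariance
y = X @ np.array([-0.1, 0.05])                  # exactly on the regression line
grid, f, post_mean, post_var = posterior_summary(y, X, Sigma, lam=10.0, s2_max=0.5)
print(post_mean, post_var)
```

In this synthetic case a moment estimator would return exactly zero, whereas the posterior still assigns weight to positive values and yields both a positive mean and a variance for the model error variance.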
Example: Illinois River Basin. Stations: 62 stations in midwest USA. Record length: 14 to 90 years. Covariates: drainage area, main channel slope, length, area of lakes, forest cover, soil permeability index, and dummy variables Z1 and Z2 for regions.
Best Regional Models of $\kappa$, Illinois River basin (USA). ASV = average sampling variance = $\mathrm{Avg}\{x^T \mathrm{Var}[b]\, x\}$. AVP = average variance of prediction = $\sigma_\delta^2 + \mathrm{Avg}\{x^T \mathrm{Var}[b]\, x\}$
What did we learn? • OLS results in too large a model error variance estimate because it includes sampling error. • Weighted & Generalized Least Squares produce different results: cross-correlation matters. • The choice of moment versus Bayesian estimator of model error variance was important: different model error variances result, and different models are selected. • Obtained an informative regional distribution. Pseudo-$R^2$ = 55%.
GEV $\kappa$-estimates using prior information (variances in parentheses). Sites are in Illinois Basin. Geophysical prior has mean = -0.10, var = $(0.122)^2$.
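A prior with these moments can be combined with the GEV likelihood as sketched below. The sketch assumes the prior is a Beta(6, 9) density rescaled to the interval (-0.5, 0.5), chosen here because it reproduces the stated mean of -0.10 and variance of about 0.122²; the function names, the flood values, and the trial parameters are hypothetical.

```python
import math

def prior_moments(p=6.0, q=9.0):
    """Mean and variance of a Beta(p, q) variable shifted to (-0.5, 0.5)."""
    mean = p / (p + q) - 0.5
    var = p * q / ((p + q) ** 2 * (p + q + 1.0))
    return mean, var

def log_beta_prior(kappa, p=6.0, q=9.0):
    """Log density of the shifted Beta(p, q) prior on kappa."""
    if not -0.5 < kappa < 0.5:
        return -math.inf
    log_norm = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return ((p - 1.0) * math.log(0.5 + kappa)
            + (q - 1.0) * math.log(0.5 - kappa) - log_norm)

def gev_log_likelihood(x, loc, scale, kappa):
    """GEV log-likelihood, hydrologic sign convention (Gumbel when kappa = 0)."""
    if scale <= 0.0:
        return -math.inf
    total = 0.0
    for xi in x:
        z = (xi - loc) / scale
        if abs(kappa) < 1e-9:
            total += -math.log(scale) - z - math.exp(-z)
        else:
            y = 1.0 - kappa * z
            if y <= 0.0:                      # observation outside the support
                return -math.inf
            total += (-math.log(scale) + (1.0 / kappa - 1.0) * math.log(y)
                      - y ** (1.0 / kappa))
    return total

def generalized_log_likelihood(x, loc, scale, kappa):
    """Objective a GMLE would maximize: log-likelihood plus log prior on kappa."""
    return gev_log_likelihood(x, loc, scale, kappa) + log_beta_prior(kappa)

mean, var = prior_moments()
print(mean, math.sqrt(var))                   # prior mean and standard deviation

# Hypothetical floods and trial parameter values.
floods = [120, 85, 230, 310, 95, 150, 410, 175, 60, 140, 200, 265, 130, 90, 520]
print(generalized_log_likelihood(floods, loc=130.0, scale=80.0, kappa=-0.1))
```

Maximizing `generalized_log_likelihood` over location, scale, and kappa with a general-purpose numerical optimizer would give the GMLE. The prior term rules out values of kappa outside (-0.5, 0.5) and pulls short-record estimates toward the prior mean.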
Conclusions • GMLE with regional prior: for a site with n = 90, reduced the uncertainty (variance) in the 100-year flood by 25%; for a site with n = 27, reduced the 100-year flood estimate by 50% and its variance by a factor of 60. • GLS reflects precision due to different record lengths plus cross-correlations among hydrologic statistics; thus provides a more realistic description of sampling errors. • Bayesian GLS: • quasi-analytic procedure • full realistic posterior of $\beta$ and $\sigma_\delta^2$ plus moments • provides regional estimates of $\kappa$ and their precision
References
Coles, S.G., and M.J. Dixon, Likelihood-Based Inference for Extreme Value Models, Extremes, 2(1), 5-23, 1999.
Martins, E.S., and J.R. Stedinger, Generalized Maximum Likelihood GEV Quantile Estimators for Hydrologic Data, Water Resour. Res., 36(3), 737-744, 2000.
Martins, E.S., and J.R. Stedinger, Cross-correlation among estimators of shape, Water Resour. Res., 38(11), doi:10.1029/2002WR001589, 2002.
Reis, D.S., Jr., Flood Frequency Analysis Employing Bayesian Regional Regression and Imperfect Historical Information, Ph.D. Thesis, Cornell University, Ithaca, NY, USA, 2005.
Reis, D.S., Jr., J.R. Stedinger, and E.S. Martins, Bayesian GLS Regression with application to LP3 Regional Skew Estimation, Proceedings World Water & Environmental Resources Congress 2003, Editors P. Bizier and P. DeBarry, Philadelphia, PA, American Society of Civil Engineers, June 23-26, 2003.
Reis, D.S., Jr., J.R. Stedinger, and E.S. Martins, Bayesian GLS Regression with application to LP3 Regional Skew Estimation, accepted, Water Resour. Res., May 2005.
Stedinger, J.R., and G.D. Tasker, Regional Hydrologic Analysis, 1. Ordinary, Weighted and Generalized Least Squares Compared, Water Resour. Res., 21(9), 1421-1432, 1985. [Correction, Water Resour. Res., 22(5), 844, 1986.]