Errors and uncertainties in measuring and modelling surface-atmosphere exchanges Andrew D. Richardson University of New Hampshire NSF/NCAR Summer Course on Flux Measurements Niwot Ridge, July 17 2008
Outline • Introduction to errors in data • Errors in flux measurements • Different methods to quantify flux errors • Implications for modeling
Errors are unavoidable, but errors don’t have to cause disaster (Gare Montparnasse, Paris, 22 October 1895)
Errors in data • Why do we have measurement errors? • Instrument errors, glitches, bugs • Instrument calibration errors • Imperfect instrument design, less-than-ideal application • Instrument resolution • “Problem of definition”: what are we trying to measure, anyway? • Errors are unavoidable and inevitable, but they can always be reduced • Errors are not necessarily bad, but not knowing what they are, or having an unrealistic view of what they are, is bad
Contemporary political perspective “There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.” Donald Rumsfeld (February 12, 2002)
What is measurement uncertainty? • A measurement is never perfect - data are not “truth” (corrupted truth?) • “Uncertainty” describes the inevitable error of a measurement: x′ = x + δ + ε • x′ is what is actually measured; it includes both systematic (δ) and random (ε) components • typically, ε is assumed Gaussian, and is characterized by its standard deviation, σ(ε)
Types of errors • Random error • Unpredictable, stochastic • Scatter, noise, precision • Cannot be corrected (because they are stochastic) • Example: noisy analyzer (electrical interference) • Systematic error • Deterministic, predictable • Bias, accuracy • Can be corrected (if you know what the correction is) • Example: mis-calibrated analyzer (bad zero or bad span)
Propagation of errors • Random errors: • true value x; measure xi′ = x + εi, where εi is a random variable ~ N(0, σi) • ‘average out’ over time, thus errors accumulate ‘in quadrature’ • expected error on (x1′ + x2′) is √(σ1² + σ2²), which is σ√2 when σ1 = σ2 = σ • Systematic errors: • fixed biases don’t average out, but rather accumulate linearly • measure xi′ = x + δi, where δi is not a random variable • expected error on (x1′ + x2′) is just (δ1 + δ2) • So random and systematic errors are fundamentally different in how they affect data and interpretation
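A minimal sketch (not part of the original talk, illustrative values only) showing the quadrature vs. linear accumulation described above: two measurements with hypothetical random errors (sigma1, sigma2) and fixed biases (delta1, delta2) are summed many times, and the random error of the sum matches √(σ1² + σ2²) while the bias matches δ1 + δ2.

```python
# Monte Carlo check: random errors add in quadrature, biases add linearly.
import numpy as np

rng = np.random.default_rng(42)
sigma1, sigma2 = 1.0, 2.0      # hypothetical random-error standard deviations
delta1, delta2 = 0.5, 0.3      # hypothetical fixed biases
n = 100_000

x1 = 10.0 + delta1 + rng.normal(0.0, sigma1, n)   # x1' = x1 + d1 + e1
x2 = 20.0 + delta2 + rng.normal(0.0, sigma2, n)   # x2' = x2 + d2 + e2
total = x1 + x2

print("std of the sum    :", total.std())               # ~ sqrt(s1^2 + s2^2) = 2.24
print("quadrature rule   :", np.hypot(sigma1, sigma2))
print("bias of the sum   :", total.mean() - 30.0)       # ~ d1 + d2 = 0.8
```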
Precision and Accuracy: Target analogy • Accuracy: how close to center • Precision: how close together High accuracy, low precision High precision, low accuracy
Evaluating Errors • Random errors and precision • Make repeated measurements of the same thing • What is the scatter in those measurements? • Systematic errors and accuracy • Measure a reference standard • What is the bias? • Not always possible to quantify some systematic errors (know they’re there, but don’t have a standard we can measure)
What can we do about errors? • (Honestly) quantify sources of error • Random • Systematic • Minimize/eliminate them • Identify ways to reduce specific sources of error • Examples: more frequent calibration, better QC procedures, reduce instrument noise • Correct for biases where possible • Evaluate reductions in error
Why do we care about quantifying errors? • How much confidence do we have in the data? • How certain are we about a particular measurement? • How close to the “true” value are we? • Are our data biased? In which direction? • Errors influence our interpretation of data • Errors reduce the usefulness of the information in our data • Errors in data propagate to subsequent analyses
Random errors lead to “statistical universes” • The observed data are just one realization, drawn from a “statistical universe of data sets” (Press et al. 1992) • Different realizations of the random draw lead to different estimates of “true” model parameters (or other statistics calculated from data) • Parameter estimates are therefore themselves uncertain (but we want to describe their distributions!) A statistical universe
Monte Carlo Example • Two options for propagating errors • Complicated mathematics, based on theory and first principles • “Monte Carlo” simulations • The Monte Carlo recipe (see the sketch below): 1. Characterize uncertainty 2. Generate synthetic data (model + uncertainty), i.e. a new realization from the “statistical universe” 3. Estimate statistics or parameters, P, of interest 4. Repeat (2 & 3) many times 5. Posterior evaluation of distribution of P
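A minimal sketch of that recipe (not from the talk; the linear "model" and noise level are purely hypothetical): generate synthetic noisy data from a known model, re-estimate the parameter of interest, repeat many times, and inspect the resulting distribution of estimates.

```python
# Monte Carlo propagation of measurement error into a model parameter.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
true_slope = 2.0
y_model = true_slope * x                 # the "truth" produced by the model
sigma = 0.3                              # step 1: characterized measurement uncertainty

estimates = []
for _ in range(5000):                    # step 4: repeat many times
    y_synth = y_model + rng.normal(0.0, sigma, x.size)   # step 2: synthetic data
    slope_hat = np.polyfit(x, y_synth, 1)[0]             # step 3: estimate P
    estimates.append(slope_hat)

estimates = np.array(estimates)          # step 5: posterior evaluation of P
print(f"slope = {estimates.mean():.3f} +/- {estimates.std():.3f}")
```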
Viva Las Vegas! • “Offered the choice between mastery of a five-foot shelf of analytical statistics books and middling ability at performing statistical Monte Carlo simulations, we would surely choose to have the latter skill.” • William H. Press, • Numerical Recipes in Fortran 77 Why you should learn a programming language (even fossil languages like BASIC): It’s really easy and fast to do MC simulations. In spreadsheet programs, it is extremely tedious (and slow), especially with large data sets (like you have with eddy flux data!).
What we want to know • What are the characteristics of the error • What are the sources of error, and do the sources of error change over time? • How big are the errors (10±5 or 10.0001±0.0001), and which are the biggest sources? • Are the errors systematic, random, or some combination thereof? • Can we correct or adjust for errors? • Can we reduce the errors (better instruments, more careful technician, etc.?)
Characteristics of interest • How is the error distributed? • What is the (approximate) pdf • What are its moments? • First moment: mean (average value) • Second moment: standard deviation (how variable) • Third moment: skewness (how symmetric) • Fourth moment: kurtosis (how peaky)
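A minimal sketch (not from the talk) of computing those four moments for a sample of errors; the residual array here is simulated Laplace noise purely for illustration.

```python
# The four moments of an error sample, via numpy / scipy.stats.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
errors = rng.laplace(loc=0.0, scale=1.5, size=10_000)   # simulated residuals

print("mean     :", errors.mean())           # first moment (average value)
print("std dev  :", errors.std(ddof=1))      # second moment (how variable)
print("skewness :", stats.skew(errors))      # third moment (how symmetric)
print("kurtosis :", stats.kurtosis(errors))  # fourth moment (excess; 0 = Gaussian, 3 = Laplace)
```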
Random error distributions • Assumptions: • Normal distribution • Constant variance • Independent errors • Reality: • Other distributions are possible! • Variance may not be constant • Independence? Normal Laplace Lognormal Uniform
Constant variance? • Homoscedastic (constant variance) vs. heteroscedastic (nonconstant variance) • We typically assume homoscedasticity, but commonly the error variance scales with the magnitude of the measurement • Large outliers are more likely when the error variance is large
Independence • Measurement errors in period (t) are uncorrelated with errors in period (t-1) • Independent • coin toss (white noise) • Negative autocorrelation • growth rate estimated from size measurements • Positive autocorrelation • weekly biomass estimates confounded by seasonal variation in other factors (e.g. summer drought and shrinking xylem) • Difficult to detect or test for independence without a very good underlying model (with a poor model, apparent autocorrelation may be due to model structure and not error structure!)
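A minimal sketch (not from the talk) of a quick lag-1 autocorrelation check on a residual series, keeping in mind the caveat above that apparent autocorrelation may reflect model structure rather than the errors themselves.

```python
# Lag-1 autocorrelation of a residual series: ~0 for white noise, >0 for AR(1).
import numpy as np

def lag1_autocorr(e):
    """Lag-1 autocorrelation of a 1-D residual series (no NaN handling)."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    return np.sum(e[1:] * e[:-1]) / np.sum(e * e)

rng = np.random.default_rng(2)
white = rng.normal(size=1000)            # independent errors
ar1 = np.empty(1000)                     # positively autocorrelated errors
ar1[0] = 0.0
for t in range(1, 1000):
    ar1[t] = 0.7 * ar1[t - 1] + rng.normal()

print(lag1_autocorr(white))   # ~ 0
print(lag1_autocorr(ar1))     # ~ 0.7
```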
Take-home messages • For modelling, • More noise in data = more uncertainty in estimated model parameters • Systematic error in data = biased estimates of model parameters • Monte Carlo simulations as an easy way to evaluate impact of errors
Why characterize flux measurement uncertainty? • Uncertainty information needed to compare measurements, measurements and models, and to propagate errors (scaling up in space and time) • Uncertainty information needed to set policy & for risk analysis (what are confidence intervals on estimated C sink strength?) • Uncertainty information needed for all aspects of data-model fusion (correct specification of cost function, forward prediction of states, etc.) • Small uncertainties are not necessarily good; large uncertainties are not necessarily bad: biased prediction is ok if ‘truth’ is within confidence limits; if truth is outside of confidence limits, uncertainties are under-estimated
Challenge of flux data • Complex • Multiple processes (but only measure NEE) • Diurnal, synoptic, seasonal, annual scales of variation • Gaps in data (QC criteria, instrument malfunction, unsuitable weather, etc.) • Multiple sources of error and uncertainty (known unknowns as well as unknown unknowns!) • Random errors are large but tolerable • Systematic errors are evil, and the corrections for them are largely uncertain (sometimes even in sign) • Many sources of systematic error (sometimes in different directions)
Systematic errors vs. random errors • Systematic errors: • examples: nocturnal biases, imperfect spectral response, advection, energy balance closure • operate at varying time scales: fully systematic vs. selectively systematic • variety of influences: fixed offset vs. relative offset • cannot be identified through statistical analyses • can be corrected for (but the corrections themselves are uncertain) • uncorrected systematic errors will bias DMF analyses • Random errors: • examples: surface heterogeneity and time-varying footprint, turbulence sampling errors, measurement equipment (IRGA and sonic anemometer) • stochastic; characteristics of the pdf can be estimated via statistical analyses (but may be time-varying) • affect all measurements • cannot be corrected for • limit agreement between measurements and models, but should not bias results
Why focus on random errors? • Systematic errors can’t be identified through analysis of data • Systematic errors are harder to quantify (leave that to the geniuses) • For modeling, must correct for systematic errors first (or assume they are zero) • Knowing something about random errors is much more important from modeling perspective
Two methods • Repeated measurements of the “same thing” • Paired towers (rarely applicable) • Hollinger et al. 2004 GCB; Hollinger and Richardson 2005 Tree Phys • Paired observations (applicable everywhere) • Hollinger and Richardson 2005 Tree Phys; Richardson et al. 2006 AFM • Comparison with “truth” • Model residuals (assume model = “truth”) • Richardson et al. 2005 AFM; Hagen et al. 2006 JGR; Richardson et al. 2008 AFM
Paired tower approach Repeated measurements: Use simultaneous but independent measurements from two towers, x1 and x2 Howland: Main and West towers - same environmental conditions - located in similar patches of forest - non-overlapping footprints (independent turbulence) Main West ~800m
Paired measurements Assume we have measurements x1, x2 from two towers var(x1 – x2) = var(x1) + var(x2) – 2 covar(x1, x2) Since x1 and x2 are assumed independent, covar(x1, x2) = 0 Also, var(x1) = var(x2) = var(e), where e is the random error So var(x1 – x2) = 2 var(e) And thus σ(e) = σ(x1 – x2)/√2 Use multiple pairs x1, x2 to infer distribution of e!
Alternatively… • Earlier, suggested quantifying random error by the standard deviation (s) of multiple independent measurements of the same thing (xi): s = √( Σ(xi – x̄)² / (n – 1) ) • For i = 1, 2, this reduces to s = |x1 – x2|/√2, or s² = (x1 – x2)²/2 • If following this approach, the mean s across multiple x1, x2 pairs is calculated as s̄ = √( mean of s² ) (i.e. as a geometric mean, or square root of the mean variance).
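A minimal sketch (not from the talk) of the two-tower calculation above, using simulated stand-ins for the paired fluxes: the random-error standard deviation is recovered from the differences as σ(e) = σ(x1 – x2)/√2.

```python
# Inferring sigma(e) from paired, simultaneous flux measurements.
import numpy as np

rng = np.random.default_rng(3)
true_flux = rng.uniform(-20.0, 5.0, 2000)                   # hypothetical half-hourly NEE
x1 = true_flux + rng.normal(0.0, 2.0, true_flux.size)       # tower 1 = truth + random error
x2 = true_flux + rng.normal(0.0, 2.0, true_flux.size)       # tower 2 = truth + random error

d = x1 - x2
sigma_e = d.std(ddof=1) / np.sqrt(2.0)    # sigma(e) = sigma(x1 - x2) / sqrt(2)
print(f"inferred sigma(e) = {sigma_e:.2f}")   # ~ 2.0
```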
Another paired approach… • Two tower approach can only rarely be used • Alternative: substitute time for space • Use x1, x2 measured 24 h apart under similar environmental conditions • PPFD, VPD, Air/Soil temperature, Wind speed • Tradeoff: tight filtering criteria = not many paired measurements, poor estimates of statistics; loose filtering = other factors confound uncertainty estimate
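A minimal sketch (not from the talk) of the time-for-space substitution: pair each half hour with the same half hour 24 h later and keep the pair only if conditions are similar. The column names (PPFD, TA, WS, NEE) and tolerances are illustrative placeholders, not the published filtering criteria.

```python
# Build 24-h-apart measurement pairs under "similar" environmental conditions.
import numpy as np
import pandas as pd

def daily_pairs(df, tol_ppfd=75.0, tol_ta=3.0, tol_ws=1.0):
    """Return NEE differences for pairs 24 h apart with similar conditions."""
    lag = df.shift(-48)                                   # 48 half-hour records = 24 h
    similar = (
        ((df["PPFD"] - lag["PPFD"]).abs() < tol_ppfd)
        & ((df["TA"] - lag["TA"]).abs() < tol_ta)
        & ((df["WS"] - lag["WS"]).abs() < tol_ws)
    )
    return (df["NEE"] - lag["NEE"])[similar].dropna()

# Usage (half_hourly_df is a hypothetical half-hourly data frame):
# diffs = daily_pairs(half_hourly_df)
# sigma_e = diffs.std(ddof=1) / np.sqrt(2.0)
```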
Model residuals • Common in many fields (less so in “flux world”) to conduct posterior analyses of residuals to investigate pdf of errors, homoscedasticity, etc. • Disadvantage: • Model must be ‘good’ or uncertainty estimates confounded by model error • Advantages: • can evaluate asymmetry in error distribution (not possible with paired approach) • many data points with which to estimate statistics
A double exponential (Laplace) pdf better characterizes the uncertainty Strong central peak & heavy tails (leptokurtic) • non-Gaussian pdf Better: double-exponential pdf, f(x) = exp(–|x|/β) / (2β) The double-exponential is characterized by the scale parameter β
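A minimal sketch (not from the talk) of fitting that distribution to a sample of flux errors: for a zero-centered Laplace, the maximum-likelihood estimate of β is simply the mean absolute deviation, and σ = √2·β. The error sample here is simulated.

```python
# Maximum-likelihood estimate of the Laplace scale parameter beta.
import numpy as np

def laplace_scale(errors):
    """ML scale parameter of a zero-centered Laplace distribution."""
    return np.mean(np.abs(errors))

rng = np.random.default_rng(4)
errors = rng.laplace(loc=0.0, scale=1.2, size=5000)   # simulated flux errors
beta = laplace_scale(errors)
print(f"beta = {beta:.2f}, sigma = {np.sqrt(2.0) * beta:.2f}")
```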
The standard deviation of the uncertaintyscales with the magnitude of the flux • Larger fluxes are more uncertain than small fluxes • Relative error decreases with flux magnitude (even when flux = 0 there is still some uncertainty) • Large errors are not uncommon • 95% CI = ± 60% • 75% CI = ± 30% To obtain maximum likelihood parameter estimates, cannot use OLS: must account for the fact that the flux measurement errors are non-Gaussian and have non-constant variance.
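A minimal sketch (not from the talk) of how this scaling could be examined: bin paired-measurement differences on the mean flux and compute σ(e) per bin. All data and bin edges here are synthetic and purely illustrative.

```python
# Does sigma(e) grow with flux magnitude? Bin paired differences by flux.
import numpy as np

def binned_sigma(flux_mean, diffs, edges):
    """sigma(e) per flux bin, from paired differences: sigma = std(diff)/sqrt(2)."""
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = (flux_mean >= lo) & (flux_mean < hi)
        out.append(np.std(diffs[sel], ddof=1) / np.sqrt(2.0) if sel.sum() > 10 else np.nan)
    return np.array(out)

rng = np.random.default_rng(5)
flux = rng.uniform(-25.0, 5.0, 5000)
sigma_true = 0.5 + 0.1 * np.abs(flux)                      # heteroscedastic: grows with |flux|
e1 = rng.laplace(0.0, sigma_true / np.sqrt(2.0), flux.size)  # scale = sigma/sqrt(2) -> std = sigma
e2 = rng.laplace(0.0, sigma_true / np.sqrt(2.0), flux.size)
edges = np.array([-25, -20, -15, -10, -5, 0, 5])
print(binned_sigma(flux, e1 - e2, edges))   # sigma increases with |flux|
```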
Generality of results • Scaling of uncertainty with flux magnitude has been validated using data from a range of forested CarboEurope sites: y-axis intercept (base uncertainty) varies among sites (factors: tower height, canopy roughness, average wind speed), but slope constant across sites (Richardson et al., 2007) • Similar results (non-Gaussian, heteroscedastic) have been demonstrated for measurements of water and energy fluxes (H and LE) (Richardson et al., 2006) • Results are in agreement with predictions of Mann and Lenschow (1994) error model based on turbulence statistics (Hollinger & Richardson, 2005; Richardson et al., 2006)
Generality of results • s(H) = 19.5 W m-2 • s(LE) = 16.5 W m-2 • s(FCO2) = 2.0 µmol m-2 s-1 Uncertainties of all fluxes increase with flux magnitude.
Comparison of approaches • Error estimates vary by ≈10% across models, and are ≈20% lower for the paired approach than for the ‘best’ model • Errors are “more Gaussian” for large uptake fluxes and “less Gaussian” for fluxes ≈ 0 µmol m-2 s-1
Uncertainty at various time scales • Systematic errors accumulate linearly over time (constant relative error) • Random errors accumulate in quadrature (so relative uncertainty decreases as flux measurements are aggregated over longer time periods) • role of Central Limit Theorem as fluxes are aggregated • Monte Carlo simulations suggest that the uncertainty in annual NEE integrals is ± 30 g C m-2 y-1 at 95% confidence (combination of random measurement error and associated uncertainty in gap filling) • Biases due to advection, etc., are probably much larger than this but remain very hard to quantify
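A minimal back-of-the-envelope sketch (not from the talk; all numbers illustrative) contrasting how the two error types grow when half-hourly fluxes are summed to an annual total: random error grows as √n, systematic error as n.

```python
# Random vs. systematic error growth when aggregating half-hourly fluxes.
import numpy as np

n = 17520                         # half-hour periods in a year
sigma_halfhour = 2.0              # hypothetical random error per half hour
bias_halfhour = 0.05              # hypothetical fixed bias per half hour

random_err_annual = np.sqrt(n) * sigma_halfhour     # accumulates in quadrature
systematic_err_annual = n * bias_halfhour           # accumulates linearly

print("random error on annual sum     :", random_err_annual)      # ~ 265
print("systematic error on annual sum :", systematic_err_annual)  # 876
```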
"To put the point provocatively, providing data and allowing another researcher to provide the uncertainty is indistinguishable from allowing the second researcher to make up the data in the first place." • Raupach et al. (2005). Model data synthesis in terrestrial carbon observation: methods, data requirements and data uncertainty specifications. Global Change Biology 11:378-97.
Why does it matter for modeling? • Cost function (Bayesian or not) depends on error structure • likelihood function: the probability of actually observing the data, given a particular parameterization of model • appropriate form of likelihood function depends on pdf of errors • maximum likelihood optimization: determine model parameters that would be most likely to generate the observed data, given what is known or assumed about the measurement error • Ordinary least squares generates ML estimates only when assumptions of normality and constant variance are met
Maximum likelihood paradigm: “what model parameter values are most likely to have generated the observed data, given the model and what is known about measurement errors?” Assumptions about errors affect specification of the ML cost function: • For Gaussian data (“weighted least squares”): minimize Σ [ (yi – ŷi)² / σi² ] • For double exponential data (“weighted absolute deviations”): minimize Σ [ |yi – ŷi| / βi ] Other cost functions are possible; the appropriate choice depends on the error structure!
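A minimal sketch (not from the talk) of the two cost functions above, where y are observations, yhat are model predictions, and sigma / beta are per-observation uncertainties; the usage line with model(p), nee_obs and beta_obs is hypothetical.

```python
# Negative log-likelihood cost functions (up to additive constants).
import numpy as np

def cost_gaussian(y, yhat, sigma):
    """Weighted least squares: appropriate for Gaussian errors with std sigma."""
    return np.sum(((y - yhat) / sigma) ** 2)

def cost_laplace(y, yhat, beta):
    """Weighted absolute deviations: appropriate for Laplace errors with scale beta."""
    return np.sum(np.abs(y - yhat) / beta)

# Either function can be handed to a minimizer, e.g.:
# from scipy.optimize import minimize
# result = minimize(lambda p: cost_laplace(nee_obs, model(p), beta_obs), p0)
```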