Some thoughts on error handling for FTIR retrievals. Prepared by Stephen Wood and Brian Connor, NIWA with input and ideas from others. Overview/Introduction.
Overview/Introduction
With the possibility of archiving profiles in an NDSC database, we need to consider how we present the errors in these profiles. We need either some consistency in how this is done, or a way of flagging how the archived errors were arrived at. This raises a number of issues. This presentation aims to outline some of them and, hopefully, to promote a lively discussion!
Recap of error terms
• Measurement error: $S_{meas} = G_y S_\varepsilon G_y^T$
• Smoothing error: $S_{smooth} = (A - I) S_x (A - I)^T$
• Model parameter error: $S_{par} = G_y \left( \sum_b K_b S_b K_b^T \right) G_y^T$
• Forward model error (from things not modelled properly in the forward model): difficult to quantify
For reference: make the distinction between the covariance values given to the retrieval, $S_a$ and $S_e$, which "define" the retrieval and how it maps things (i.e. $K_x$, $K_b$, $G_y$, $A$), and more realistic estimates of covariances such as $S_x$, the best estimate of the covariance of x. This applies to all the error calculations.
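As a rough illustration of the three quantifiable terms in the recap, here is a minimal NumPy sketch. The function names, the toy dimensions and every number in it are assumptions made up for illustration; in particular the gain matrix is a placeholder and not the output of any real retrieval.

```python
import numpy as np

def measurement_error(G_y, S_eps):
    """S_meas = G_y S_eps G_y^T."""
    return G_y @ S_eps @ G_y.T

def smoothing_error(A, S_x):
    """S_smooth = (A - I) S_x (A - I)^T."""
    A_minus_I = A - np.eye(A.shape[0])
    return A_minus_I @ S_x @ A_minus_I.T

def parameter_error(G_y, K_b_list, S_b_list):
    """S_par = G_y (sum over b of K_b S_b K_b^T) G_y^T."""
    m = G_y.shape[1]
    inner = np.zeros((m, m))
    for K_b, S_b in zip(K_b_list, S_b_list):
        inner += K_b @ S_b @ K_b.T
    return G_y @ inner @ G_y.T

# Toy setup (all numbers invented): 3 profile levels, 5 spectral points.
rng = np.random.default_rng(0)
n_levels, n_points = 3, 5
K_x = rng.normal(size=(n_points, n_levels))
G_y = 0.8 * np.linalg.pinv(K_x)       # placeholder gain matrix, shape (3, 5)
A = G_y @ K_x                          # averaging kernel, here 0.8 * I
S_eps = 0.01 * np.eye(n_points)        # assumed realistic noise covariance
S_x = np.diag([0.04, 0.09, 0.16])      # assumed best-estimate covariance of x
K_b = rng.normal(size=(n_points, 1))   # one scalar model parameter
S_b = np.array([[0.0025]])             # its assumed variance

S_meas = measurement_error(G_y, S_eps)
S_smooth = smoothing_error(A, S_x)
S_par = parameter_error(G_y, [K_b], [S_b])
```

All three results are square matrices on the profile grid, so they can be compared or summed level by level.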
Expand on model parameter errors
There are several model parameters which can be treated independently in error calculations, rather than lumping them all together in one vector b. Each b can be a single parameter (e.g. a line strength, SZA) or a vector (e.g. a temperature profile). Whether or not a model parameter has temporarily been put in the vector of retrieved values, the first step in error evaluation is calculating $K_b$ for that b. If b can be put in as a retrieved variable, then $K_b$ can often be calculated directly by the retrieval code. If this can't be done, $K_b$ can be calculated from forward model runs with $K_b = [F(x, b + \Delta b) - F(x, b)] / \Delta b$ (but note the catch with sfit2 normalisation).
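The finite-difference route to $K_b$ could be sketched as below. This is only an illustration: `toy_forward_model` and `LINE_SHAPE` are hypothetical stand-ins for a real forward model run, and the sketch does not deal with the sfit2 normalisation issue mentioned above.

```python
import numpy as np

def finite_difference_kb(forward_model, x, b, delta_b):
    """One-sided difference K_b = [F(x, b + db) - F(x, b)] / db.

    b may be a scalar or a vector; column j of the result is the
    derivative of the spectrum with respect to element j of b.
    """
    b = np.atleast_1d(np.asarray(b, dtype=float))
    delta_b = np.broadcast_to(np.asarray(delta_b, dtype=float), b.shape)
    base = forward_model(x, b)
    K_b = np.empty((base.size, b.size))
    for j in range(b.size):
        b_pert = b.copy()
        b_pert[j] += delta_b[j]
        K_b[:, j] = (forward_model(x, b_pert) - base) / delta_b[j]
    return K_b

# Hypothetical toy forward model, linear in b so the difference is exact.
LINE_SHAPE = np.array([0.1, 0.5, 1.0, 0.5, 0.1])

def toy_forward_model(x, b):
    return 1.0 - b[0] * x.sum() * LINE_SHAPE

x = np.array([1.0, 2.0, 3.0])
K_b = finite_difference_kb(toy_forward_model, x, b=1.0e-2, delta_b=1.0e-3)
```

For a real, non-linear forward model the choice of the step `delta_b` matters: too large and the derivative is biased, too small and numerical noise dominates.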
Combining model parameter errors
Once you have a $K_b$ (vector or matrix), estimating the error involves estimating $S_b$ (single value or square matrix) and then calculating $G_y K_b S_b K_b^T G_y^T$. These contributions can then be summed over all b. While the various $S_b$ might have different dimensions, all the products $K_b S_b K_b^T$ have the same dimensions. Since the measurement error has the same dependence on $G_y$, a further possibility is to add all the $K_b S_b K_b^T$ terms directly to $S_\varepsilon$ (the measurement covariance) and transform with $G_y$, giving a combined error that covers both measurement and model parameter errors.
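A small sketch of this combination, under the assumption that each $K_b$ is stored as a 2-D array (spectral points by parameter elements) and each $S_b$ as a square 2-D array. The parameter names (`K_sza`, `K_T`) and all numbers are invented for illustration.

```python
import numpy as np

def combined_error(G_y, S_eps, K_b_list, S_b_list):
    """G_y (S_eps + sum_b K_b S_b K_b^T) G_y^T."""
    S_spectral = S_eps.astype(float).copy()
    for K_b, S_b in zip(K_b_list, S_b_list):
        # Each product is (n_points x n_points) whatever the size of b.
        S_spectral += K_b @ S_b @ K_b.T
    return G_y @ S_spectral @ G_y.T

rng = np.random.default_rng(1)
n_levels, n_points = 4, 6
G_y = rng.normal(size=(n_levels, n_points))
S_eps = 0.01 * np.eye(n_points)

K_sza = rng.normal(size=(n_points, 1))      # scalar parameter
S_sza = np.array([[0.01]])
K_T = rng.normal(size=(n_points, n_levels)) # vector parameter (profile)
S_T = 4.0 * np.eye(n_levels)

S_combined = combined_error(G_y, S_eps, [K_sza, K_T], [S_sza, S_T])

# Same result from transforming each term separately and summing.
S_separate = G_y @ S_eps @ G_y.T + sum(
    G_y @ K @ S @ K.T @ G_y.T for K, S in [(K_sza, S_sza), (K_T, S_T)]
)
```

The two routes agree because the transformation by $G_y$ is linear; the combined route just needs one matrix product per retrieval instead of one per parameter.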
How have we been doing error analysis and characterisation?
• Perturbation methods. Run the retrieval with some input changed by a typical amount and evaluate the change in the retrieval (difficult to generalise to multiple dimensions).
• Tools developed to aid error evaluation. For sfit2, one is written in IDL (sfit2ers, error simulation) and an equivalent MATLAB tool has been produced. These evaluate some error terms for a typical case, with the intention of applying the results to all retrievals. There are limitations to this:
  • Not all errors are handled, usually only measurement and smoothing error.
  • The quality of spectra may vary and so differ from the typical case.
  • The simulation of what the retrieval does may not be perfect.
Batched error calculations
We have tried one approach at Lauder for producing error and kernel calculations for each retrieval in a dataset. We run batches of sfit2, saving the K-files (which contain $K_x$, $S_e^{-1}$, $S_a^{-1}$) from each retrieval. A post-processing tool (in IDL) then picks this information up and does the kernel and error calculations. As it doesn't handle all model parameter errors, it currently has no more capability than an error code built into sfit2 could have, but it is more easily modified and extended to evaluate more errors on individual retrievals and to match the differing requirements of various projects. It could read in information quantifying other errors to include in the calculations.
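The kind of per-retrieval post-processing described here might look roughly like the following. This is not the Lauder IDL tool: it assumes the standard linear optimal-estimation expressions for the gain matrix and averaging kernel, it skips reading K-files (their format is not given here), and the batch contents are random placeholders.

```python
import numpy as np

def gain_and_kernel(K_x, Se_inv, Sa_inv):
    """Assumed standard form: G_y = (K^T Se^-1 K + Sa^-1)^-1 K^T Se^-1, A = G_y K."""
    S_hat_inv = K_x.T @ Se_inv @ K_x + Sa_inv
    G_y = np.linalg.solve(S_hat_inv, K_x.T @ Se_inv)
    return G_y, G_y @ K_x

def process_retrieval(K_x, Se_inv, Sa_inv, S_eps=None, S_x=None):
    """Kernel and error terms for one retrieval.

    S_eps and S_x are the 'realistic' covariances; if they are not
    supplied, the values used by the retrieval itself are substituted.
    """
    G_y, A = gain_and_kernel(K_x, Se_inv, Sa_inv)
    if S_eps is None:
        S_eps = np.linalg.inv(Se_inv)
    if S_x is None:
        S_x = np.linalg.inv(Sa_inv)
    A_minus_I = A - np.eye(A.shape[0])
    return {
        "G_y": G_y,
        "A": A,
        "S_meas": G_y @ S_eps @ G_y.T,
        "S_smooth": A_minus_I @ S_x @ A_minus_I.T,
    }

# Placeholder batch standing in for the saved K-file contents.
rng = np.random.default_rng(2)
batch = [
    {"K_x": rng.normal(size=(8, 4)),
     "Se_inv": 100.0 * np.eye(8),
     "Sa_inv": 4.0 * np.eye(4)}
    for _ in range(3)
]
results = [process_retrieval(**r) for r in batch]
```

Keeping the realistic covariances as optional arguments is one way to let the error evaluation use a different S from the retrieval, and further model parameter terms could be added to the returned dictionary in the same way.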
Which errors should be included in an archive?
• It is unrealistic to expect a full error analysis to be included with each retrieval in an archive. Such a complete analysis would include errors from:
  • A priori uncertainty (smoothing error)
  • Random spectral noise (measurement error)
  • Interfering gases
  • ILS and how it is modelled
  • Other instrumental effects (detector non-linearities, channelling, filter or background shape)
  • Spectroscopic data (line strengths, widths)
  • The assumed value of solar zenith angle
  • Assumed pressure or temperature profiles
  • The refraction and ray tracing calculations
  • Forward model errors
• Which of these should be included in an archived error analysis?
Our suggestion for deciding what errors to include
The idea of not including smoothing error in archives is a good one, as long as the information needed to evaluate it, or to apply similar smoothing to comparative data, is included. One suggestion is to include only those errors that contribute random or uncorrelated errors to the measured spectrum and hence to the retrieved profiles. This includes measurement noise, temperature uncertainties and interfering gases. Other errors produce fully correlated errors in the spectrum and are likely to produce a similar error in retrieved profiles. These could be characterised in metadata within the archive, giving likely magnitudes and effects on the profile, and not included in the error calculation with each retrieval. One example is line parameter errors.
Traps
There are some easy mistakes that can be made.
• Consider a model parameter b, such as an interfering gas. The error is $S_{par} = G_y (K_b S_b K_b^T) G_y^T$.
• Should $S_b$ be the variability covariance of b, or the remaining uncertainty (covariance) in b?
• For example, think about how T error is handled. Is the covariance needed the full variability in T, or the uncertainty in the T profiles that are used?
Specific problems/issues with sfit2ers
Last year a number of bugs and issues with sfit2ers were found and fixed. Since then a few more have surfaced which have not yet been addressed:
• Point spacing dependence. For use as an evaluation tool, sfit2ers needs to be independent of having a measured spectrum and so sets its own point spacing, but this means it does not match what sfit2 does with a real spectrum. The current point spacing is too dense and will be changed. Is the solution to have a switch to choose how point spacing is set?
• Evaluation of off-diagonal values of $S_x$ (climatological covariance) doesn't match the way $S_a$ is evaluated internally in sfit2.
• Input of non-diagonal $S_e$ doesn't work for multiple windows yet.
Discussion points from IRWG '06
• Several error sources to consider and evaluate
• An inclusive error analysis tool? (difficult)
• An estimate for each retrieval?
• Error evaluation might use different S to retrieval
• Have considered errors as independent, but …
• Suggestion of a "how-to" guide and examples
• Do data users also need a guide, an error "tool"?
• Small group to produce a guide?
How many errors is enough?
• Noise: $S_{meas} = G_y S_\varepsilon G_y^T$
• FM model parameters: $S_{par} = G_y \left( \sum_b K_b S_b K_b^T \right) G_y^T$
  • ILS, SZA, other gases, line parameters ... which?
  • Are they fit routinely or fixed? Consider both, but treat accordingly.
• Derivatives $K_b$ may be easy (from code) or difficult (finite difference of two FM runs).
• Temperature: no derivatives in SFIT2; I have a finite difference method set up.
Decisions about archiving errors
• How much error analysis is desirable or adequate for archiving?
• How much should errors be broken down in an archive, or is it OK to record just one covariance? Perhaps two: total random and total systematic? In some cases the difference is clear, in others not.
• Perhaps a description field for errors.
• There seems to be good support for leaving out smoothing error, but providing $A$ and $x_a$ so the user can calculate it.
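To show what a data user could do with an archived $A$ and $x_a$, here is a minimal sketch. It assumes the usual linear smoothing relation $x_s = x_a + A(x_c - x_a)$ for bringing a comparison profile $x_c$ to the retrieval's resolution, and reuses the smoothing error formula from the recap; the function names and all numbers are invented for illustration.

```python
import numpy as np

def smooth_comparison_profile(x_comp, x_a, A):
    """Apply the retrieval's smoothing to a comparison profile:
    x_s = x_a + A (x_comp - x_a)."""
    return x_a + A @ (x_comp - x_a)

def smoothing_error_from_archive(A, S_x):
    """S_smooth = (A - I) S_x (A - I)^T, with S_x chosen by the user."""
    A_minus_I = A - np.eye(A.shape[0])
    return A_minus_I @ S_x @ A_minus_I.T

# Toy archive entry (invented values) on a 3-level grid.
A = np.array([[0.6, 0.2, 0.0],
              [0.2, 0.5, 0.2],
              [0.0, 0.2, 0.4]])
x_a = np.array([1.0, 2.0, 3.0])

# Comparison profile, e.g. from another instrument, on the same grid.
x_comp = np.array([1.5, 2.0, 2.0])
x_smoothed = smooth_comparison_profile(x_comp, x_a, A)

# The user's own estimate of the true variability of x.
S_x_user = np.diag([0.25, 0.25, 0.25])
S_smooth = smoothing_error_from_archive(A, S_x_user)
```

Either route lets the user handle smoothing without the archive committing to one particular $S_x$, which is the argument for archiving $A$ and $x_a$ rather than a precomputed smoothing error.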