3D-Var Revisited and Quality Control of Surface Temperature Data. Xiaolei Zou, Department of Meteorology, Florida State University, zou@met.fsu.edu. June 11, 2009. Outline. Part I: 3D-Var Formulation, Statistical Formulation, Analysis, Practical Applications. Part II: Motivation
3D-Var Revisited and Quality Control of Surface Temperature Data Xiaolei Zou Department of Meteorology Florida State University zou@met.fsu.edu June 11, 2009
Outline Part I: • 3D-Var Formulation • Statistical Formulation • Analysis • Practical Applications Part II: • Motivation • EOF analysis • QC for Ts
Facts 1) All background fields, observation operators and observations have errors. 2) There is no truth. Errors in the background, observation operator and observations can only be estimated approximately. The Goal Produce the best analysis by combining all available information.
Questions 1) What is the measure of the best analysis? 2) How to combine all available information?
Variational Formulation A scalar cost function is defined: where
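The cost function itself did not survive on this slide. A commonly used form, consistent with the three sources of information named in the deck (the background xb, the observations yobs and the observation operator H), is sketched below. The symbols O and F for the observation and observation-operator error covariances are assumed notation, not taken from the slide; B is the background error covariance.

```latex
J(\mathbf{x}_0) =
  \frac{1}{2}\,(\mathbf{x}_0-\mathbf{x}_b)^{T}\,\mathbf{B}^{-1}\,(\mathbf{x}_0-\mathbf{x}_b)
  + \frac{1}{2}\,\bigl(H(\mathbf{x}_0)-\mathbf{y}^{obs}\bigr)^{T}\,(\mathbf{O}+\mathbf{F})^{-1}\,\bigl(H(\mathbf{x}_0)-\mathbf{y}^{obs}\bigr)
```

The first term penalizes departure from the background and the second penalizes misfit to the observations, each weighted by the inverse of its error covariance.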
Statistical Formulation Write the PDFs for all three sources of information as: Available information PDF Joint PDF: PDF of the a posteriori state of information
The Bayes Theorem The marginal PDF of the a posteriori state of information: is the PDF of the a posteriori state of information in model space.
Application of Bayes Theorem to Data Assimilation Data assimilation derives some features of the PDF, which is the a posteriori state of information in model space. • The maximum likelihood estimate ~ analysis • The covariance matrix of this estimate ~ analysis error covariance A
Assuming All Errors Are Gaussian: The PDF for yobs: The PDF for xb: The PDF for H(x0):
Maximum Likelihood Estimate The PDF of the a posteriori state of information in model space: Maximizing this PDF (statistical estimate) is equivalent to minimizing the cost function (variational calculus).
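The Gaussian PDFs and the a posteriori PDF were images that did not survive. The following is a sketch of the standard argument under the Gaussian assumption, not a transcription of the slides. The covariance symbols O (observation error) and F (observation-operator error) and the name σ for the a posteriori PDF are assumed notation.

```latex
\begin{aligned}
p_b(\mathbf{x}_0) &\propto \exp\Bigl\{-\tfrac{1}{2}(\mathbf{x}_0-\mathbf{x}_b)^{T}\mathbf{B}^{-1}(\mathbf{x}_0-\mathbf{x}_b)\Bigr\},\\
p_o(\mathbf{y}) &\propto \exp\Bigl\{-\tfrac{1}{2}(\mathbf{y}-\mathbf{y}^{obs})^{T}\mathbf{O}^{-1}(\mathbf{y}-\mathbf{y}^{obs})\Bigr\},\\
p_H(\mathbf{y}\mid\mathbf{x}_0) &\propto \exp\Bigl\{-\tfrac{1}{2}\bigl(\mathbf{y}-H(\mathbf{x}_0)\bigr)^{T}\mathbf{F}^{-1}\bigl(\mathbf{y}-H(\mathbf{x}_0)\bigr)\Bigr\},\\
\sigma(\mathbf{x}_0) &\propto p_b(\mathbf{x}_0)\int p_o(\mathbf{y})\,p_H(\mathbf{y}\mid\mathbf{x}_0)\,d\mathbf{y},\\
-\ln\sigma(\mathbf{x}_0) &= \tfrac{1}{2}(\mathbf{x}_0-\mathbf{x}_b)^{T}\mathbf{B}^{-1}(\mathbf{x}_0-\mathbf{x}_b)
 + \tfrac{1}{2}\bigl(H(\mathbf{x}_0)-\mathbf{y}^{obs}\bigr)^{T}(\mathbf{O}+\mathbf{F})^{-1}\bigl(H(\mathbf{x}_0)-\mathbf{y}^{obs}\bigr) + \text{const}.
\end{aligned}
```

The integral over observation space is a convolution of two Gaussians, which is why the two covariances add. The right-hand side of the last line is the scalar cost function up to a constant, so the state that maximizes σ is the state that minimizes the cost function.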
Gaussian and Non-Gaussian signals The signals are sampled at 10000 points. PDFs are constructed at an interval of
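A minimal sketch of how such empirical PDFs could be built from 10000 samples. The bin interval on the slide is cut off, so `BIN_WIDTH` below is a hypothetical placeholder, and the exponential signal is only an illustrative choice of a non-Gaussian signal.

```python
import numpy as np

def empirical_pdf(signal, bin_width):
    """Histogram-based PDF estimate with fixed-width bins."""
    lo = np.floor(signal.min() / bin_width) * bin_width
    hi = np.ceil(signal.max() / bin_width) * bin_width
    n_bins = max(int(round((hi - lo) / bin_width)), 1)
    edges = lo + bin_width * np.arange(n_bins + 1)
    # density=True normalizes so that the histogram integrates to one
    density, _ = np.histogram(signal, bins=edges, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, density

rng = np.random.default_rng(0)
n_samples = 10000                                       # as stated on the slide
gaussian = rng.normal(size=n_samples)                   # Gaussian signal
non_gaussian = rng.exponential(size=n_samples) - 1.0    # skewed signal (illustrative)

BIN_WIDTH = 0.25  # hypothetical: the slide's value is truncated
g_centers, g_pdf = empirical_pdf(gaussian, BIN_WIDTH)
ng_centers, ng_pdf = empirical_pdf(non_gaussian, BIN_WIDTH)
```

Plotting `g_pdf` and `ng_pdf` against their bin centers makes the asymmetry of the non-Gaussian signal visible.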
3D-Var & 3D-Var Analysis The 3D-Var data assimilation solves a general inverse problem using the maximum likelihood estimate under the assumptions that all errors are Gaussian. The 3D-Var analysis is the maximum likelihood estimate if all errors are Gaussian.
Zero Gradient: A Necessary Condition The gradient expression involves both a linear operator and the nonlinear operator H.
Analytical Expression of Solution with a Linear Model H is linear:
Analytical Expression of Solution with an Approximate Linear Model
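The closed-form expressions on these two slides were lost. As a hedged illustration, the sketch below uses the standard analytical solution for a linear observation operator and checks that the gradient of the cost function vanishes there. `R` stands for the combined observation and observation-operator error covariance (assumed notation), and all numbers are hypothetical toy values.

```python
import numpy as np

def analysis_linear(xb, B, H, R, yobs):
    """Closed-form 3D-Var analysis for a linear observation operator H."""
    innovation = yobs - H @ xb
    S = H @ B @ H.T + R                      # innovation covariance
    return xb + B @ H.T @ np.linalg.solve(S, innovation)

def cost_gradient(x, xb, B, H, R, yobs):
    """Gradient of the quadratic 3D-Var cost function at x."""
    return np.linalg.solve(B, x - xb) + H.T @ np.linalg.solve(R, H @ x - yobs)

# Hypothetical toy problem: three state variables, two observations.
xb = np.array([280.0, 282.0, 285.0])         # background
B = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.5],
              [0.2, 0.5, 1.0]])              # background error covariance
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 0.5]])              # linear observation operator
R = np.diag([0.25, 0.25])                    # observation-side error covariance
yobs = np.array([281.0, 284.5])

xa = analysis_linear(xb, B, H, R, yobs)
grad_at_xa = cost_gradient(xa, xb, B, H, R, yobs)
```

When H is nonlinear but well approximated by its linearization about the background, the same formula is commonly applied with H replaced by the Jacobian at xb and the innovation computed as yobs - H(xb).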
Analysis Error When linear approximation is valid, the a posteriori PDF is approximately Gaussian, with the analysis as its mean and the following covariance matrix:
3D-Var Analysis A⁻¹ is referred to as an information content matrix. When the analysis error is small, the value of ||A⁻¹|| is large and the information content is large. The information content of the 3D-Var analysis is greater than the information content in either the background field or the observations that were assimilated.
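The covariance formula on the preceding slide was lost. The sketch below assumes the standard linear-Gaussian result, in which the inverse analysis error covariance is the sum of the background and observation information matrices, and checks the information-content statement numerically. `R` (combined observation-side error covariance) and the toy matrices are assumptions.

```python
import numpy as np

def information_matrices(B, H, R):
    """Return (A_inv, B_inv, obs_info) for a linear observation operator H."""
    B_inv = np.linalg.inv(B)
    obs_info = H.T @ np.linalg.solve(R, H)   # information carried by the observations
    A_inv = B_inv + obs_info                 # assumed standard form of the inverse of A
    return A_inv, B_inv, obs_info

# Hypothetical toy matrices.
B = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.5],
              [0.2, 0.5, 1.0]])
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 0.5]])
R = np.diag([0.25, 0.25])

A_inv, B_inv, obs_info = information_matrices(B, H, R)
A = np.linalg.inv(A_inv)                     # analysis error covariance

norm_A_inv = np.linalg.norm(A_inv, 2)        # spectral norms as a scalar measure
norm_B_inv = np.linalg.norm(B_inv, 2)
norm_obs = np.linalg.norm(obs_info, 2)
```

Because the two information matrices add, the spectral norm of the inverse of A is at least as large as that of either term, and the analysis error variances on the diagonal of A do not exceed those of B.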
3D-Var Practice
• Develop System
    Decision on variables and resolutions
    Estimate of background error covariance
• Assimilate Data
    Decision on observations to be assimilated
    Understanding of the observations
    Estimate of observation errors
    Comparison between observations and background
    Development of the observation operator
    Estimate of model errors
• Obtain Solution
    Minimization (preconditioning, scaling)
    Advanced computing (parallelization, data-intensive computing platforms)
What does 3D-Var data assimilation involve? What data to assimilate? Which model to use? Choice of analysis variable. What background to start with? How to quantify it? How to estimate elements in B? Where to find their values? [Diagram: the 3D-Var analysis links model space and observed space.]
What needs to be done before and after conducting 3D-Var experiments? [Diagram: Input Data, Quality Control, 3D-Var, Output Analysis, Diagnosis of Analysis]
What needs to be done before and after conducting 3D-Var experiments?
• Quality Control
    Knowing the data
    Knowing the major differences between the data and the background field
    Removing erroneous data
    Eliminating data that render errors non-Gaussian
• Diagnosis of 3D-Var analyses
    Check the convergence
    Examine the analysis increments
    Estimate analysis errors
    Assess forecast impact
    Provide physical and dynamical explanations for the numerical results obtained
When Working with Real Data, the key things are • Knowing the data before inputting them into a 3D-Var system, through careful QC! • Knowing the system after a 3D-Var experiment, through careful analysis of the 3D-Var results!
Differences between model and obs. before and after a 3D-Var experiment: pb - pobs and pa - pobs
Motivations
• Surface data are abundant
• Very little surface data are assimilated in operational systems
• Surface data are important to thunderstorm prediction
Challenges
• Existing data assimilation systems have short or no memory of surface data
• The diurnal cycle dominates the variability of surface variables and is not described with sufficient accuracy in the large-scale analysis that is used as background in mesoscale forecasts
• Background errors are non-Gaussian
A Total of 3197 Surface Stations The number of missing data at each station in January 2008 is indicated by the color bar.
Improving Surface Data Assimilation
Key steps:
1) Inclusion of more surface data
    Improved QC
2) Vertical interpolation based on the atmospheric structures within the boundary layer
    Surface layer
    Mixed layer
3) Incorporation of dynamic constraint
EOF Modes for Ts Constructed from Station Observations [Figure: panels for the first through sixth EOF modes]
EOF Modes for Ts Constructed from Station Observations (cont.) [Figure: panels for the seventh through tenth EOF modes]
Explained Variances [Figure: surface data (blue), NCEP analysis (red)]
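A minimal sketch of how EOF modes, principal components (PCs) and explained variances can be obtained from a time-by-station matrix of surface temperature with a singular value decomposition. The synthetic data (20 stations, hourly values for 31 days, a diurnal and a slower oscillation plus noise) are hypothetical and only stand in for the station observations; this is not necessarily the procedure used for the slides.

```python
import numpy as np

def eof_analysis(data):
    """EOF decomposition of a (time, station) array via SVD.

    Returns the EOF patterns (one per row), the principal components
    (one per column) and the fraction of variance explained by each mode.
    """
    anomalies = data - data.mean(axis=0)          # remove each station's time mean
    U, s, Vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = Vt                                     # spatial patterns
    pcs = U * s                                   # time series of each mode
    explained = s**2 / np.sum(s**2)
    return eofs, pcs, explained

# Hypothetical synthetic data set.
rng = np.random.default_rng(1)
hours = np.arange(31 * 24)
n_stations = 20
diurnal = np.sin(2.0 * np.pi * hours / 24.0)
slow = np.sin(2.0 * np.pi * hours / (24.0 * 10.0))
ts = (280.0
      + np.outer(slow, rng.uniform(1.0, 8.0, n_stations))
      + np.outer(diurnal, rng.uniform(2.0, 6.0, n_stations))
      + rng.normal(scale=0.5, size=(hours.size, n_stations)))

eofs, pcs, explained = eof_analysis(ts)
```

Applying the same function to the analysis interpolated to the station locations would give a second set of explained variances to compare mode by mode.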
Dominant Oscillations in January 2008 [Figure: dominant period of each EOF mode for the observations (Obs.) and the NCEP analysis; panels: longer-period oscillation, diurnal oscillation, shorter-period oscillation; x-axis: EOF mode; y-axis: period (unit: hour or day)]
Diurnal Oscillation and Longer-Period Oscillations: phase difference, amplitude difference
PC Differences between Surface Data and NCEP Analysis [Figure: panels for the second through sixth modes; x-axis: time (unit: day); blue line: first week; red line: last week]
Frequency Distributions of Diurnal Cycle Modes, January 2008 [Figure: panels for the second through sixth modes, first week and last week; x-axis: Tobs-TNCEP (unit: K); y-axis: frequency]
Frequency Distributions (modes 2-6) [Figure: sum of modes 2-6 for the first week, the last week and the entire month; x-axis: Tobs-TNCEP (unit: K); y-axis: frequency]
Statistical Measures: mean, variance, skewness, kurtosis
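The formulas on this slide were lost. The sketch below uses the usual moment-based definitions, which is an assumption. In particular the kurtosis here is not the excess kurtosis, so a Gaussian sample gives a value near 3 and a skewness near 0.

```python
import numpy as np

def statistical_measures(x):
    """Mean, variance, skewness and kurtosis from central moments."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    dev = x - mean
    variance = np.mean(dev**2)
    skewness = np.mean(dev**3) / variance**1.5
    kurtosis = np.mean(dev**4) / variance**2     # equals 3 for a Gaussian
    return mean, variance, skewness, kurtosis

# Small symmetric sample, easy to verify by hand.
sample = [-2.0, -1.0, 0.0, 1.0, 2.0]
mean, variance, skewness, kurtosis = statistical_measures(sample)
```

Departures of the skewness from 0 and of the kurtosis from 3 are a simple way to quantify how non-Gaussian a set of observation-minus-background differences is.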
QC Procedure
Step 1:
• Historical extremum check: T > the average of the NCEP analysis at each station plus (minus) 15 times its variance
• Temporal consistency check: T > 50℃ in a 24-hour interval
• Bi-weight check: Z-score > 3
• Spatial consistency check: T > the average of the linear fit to highly correlated stations plus (minus) 4 times its variance
QC Procedure (cont.) Step 2: The Z-score of the difference between the station observation and the background field must be less than 4. Step 3: The Z-score of the difference between the station observation and the background field, excluding the contribution from the diurnal cycle, must be less than 2.
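A hedged sketch of a bi-weight Z-score check like the one named in Step 1. It uses the common bi-weight mean and bi-weight standard deviation with censor parameter c = 7.5, which is an assumption because the slides do not give the formula. The temperatures are hypothetical.

```python
import numpy as np

def biweight_zscores(x, c=7.5):
    """Z-scores based on the bi-weight mean and bi-weight standard deviation."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))             # median absolute deviation
    if mad == 0.0:
        return np.zeros_like(x)                  # no spread: nothing to flag
    w = np.clip((x - med) / (c * mad), -1.0, 1.0)  # |w| = 1 gives zero weight
    one_minus = 1.0 - w**2
    bw_mean = med + np.sum((x - med) * one_minus**2) / np.sum(one_minus**2)
    bw_std = (np.sqrt(x.size * np.sum((x - med)**2 * one_minus**4))
              / abs(np.sum(one_minus * (1.0 - 5.0 * w**2))))
    return (x - bw_mean) / bw_std

def flag_outliers(x, threshold):
    """True where the absolute bi-weight Z-score exceeds the threshold."""
    return np.abs(biweight_zscores(x)) > threshold

# Hypothetical surface temperatures (K); the last value is a gross error.
temps = np.array([270.1, 271.3, 269.8, 270.6, 271.0, 270.2, 269.5, 270.9, 295.0])
flags = flag_outliers(temps, threshold=3.0)      # Step 1 threshold
```

Because the median and the weights limit the influence of extreme values, the gross error does not inflate the spread estimate used to detect it. If Steps 2 and 3 use the same kind of Z-score (the slides do not say), the function could be applied to observation-minus-background differences with thresholds of 4 and 2.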
[Figure: observations (Obs.) and background after Step 2 and after Step 3]
Frequency Distribution before and after QC [Figure: panels for the first week, the last week and the entire month; x-axis: Tobs-TNCEP (unit: K); y-axis: frequency]