Data Quality Assessment: What Is It, Why Use It, and What’s in It For Me? Presenters: Jill Lundell, Debbie Lacroix, Berta Oates Date: May 13, 2009
What Is A Data Quality Assessment (DQA)? The scientific and statistical evaluation of environmental data to determine whether they meet the planning objectives of the project, and thus are of the right type, quality, and quantity to support their intended use.
What Does That Mean? • Data Quality Assessment is performed after the data are collected • Data Quality Assessment should answer two primary questions • Are the numbers reliable? • What conclusions can be drawn from the data?
Are The Numbers Reliable? • Data verification and data validation are performed to determine if the numbers are reliable • Data Quality Assessment can also be used to determine the reliability of an analytical method (such as XRF) if it is built into the sampling design
What Conclusions Can be Drawn from the Data? • Review the Objectives and Sampling Design • Conduct a Preliminary Data Review • Select a Statistical Method • Verify the Assumptions of the Statistical Method • Draw Conclusions from the Data
Review the Project Objectives (Data Quality Objectives) It is essential to keep in mind the primary and secondary objectives of the project during the entire DQA process to ensure appropriate tests are used and applicable conclusions are drawn from the data
Conduct a Preliminary Data Review • Review the validated data to determine completeness and reliability • Construct graphs and summary statistics to get a feel for the structure of the data • This step is essential to ensure the data user applies the appropriate statistical tests and methods to the data
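As a rough illustration of the summary-statistics part of a preliminary data review, the sketch below computes a few basic statistics with Python's standard library. The function name `summarize` and the concentration values are hypothetical and invented for illustration; they are not taken from the project described in these slides.

```python
import statistics

def summarize(values):
    """Return basic summary statistics for a list of validated results."""
    ordered = sorted(values)
    # Quartile cut points (default "exclusive" method of statistics.quantiles).
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return {
        "n": len(ordered),
        "min": ordered[0],
        "q1": q1,
        "median": statistics.median(ordered),
        "q3": q3,
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
    }

# Hypothetical concentrations (mg/kg), made up for this example.
results = [1.2, 0.8, 1.5, 2.1, 0.9, 1.1, 7.4, 1.3]
summary = summarize(results)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A mean well above the median, as in this made-up data set, is one simple hint that the data may be skewed or contain an unusually high value worth examining before a statistical test is chosen.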
Select a Statistical Test or Method • This selection should be based on the objectives of the project • Typically there are several statistical tests or methods that can be used to answer the questions of the study • Results of the preliminary data review will aid the data user in determining which tests should be used
Verify the Assumptions of the Statistical Test or Method • All statistical tests and methods have assumptions that must be met in order to obtain reliable and defensible results • Determine the distribution of the data and the presence of outliers • The preliminary data analysis should provide the information needed to determine if the assumptions are met
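One common way to screen for potential outliers is the interquartile-range (IQR) fence rule. The sketch below is only an illustration of that general rule; the slides do not say which outlier tests were used, and the function name `iqr_outliers`, the multiplier `k=1.5`, and the data are assumptions made for the example.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (a screening rule only)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical concentrations (mg/kg), made up for this example.
results = [1.2, 0.8, 1.5, 2.1, 0.9, 1.1, 7.4, 1.3]
flagged = iqr_outliers(results)
print(flagged)
```

A flagged value is a candidate for closer review, not an automatic exclusion; its effect on the chosen test still has to be evaluated.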
Perform the Tests and Draw Conclusions from the Data • Perform calculations, evaluate the results, and draw conclusions • If project objectives are carefully defined and the sampling design is deftly planned, a great deal more can be gleaned from the data than is discussed in the DQA guidance documents (EPA QA/G-9R and QA/G-9S)
Why Use DQA? • Ensures data are used to their full extent and appropriate decisions are made with the data • Ensures conclusions are defensible • This is particularly important if a site is under close scrutiny • Saves stakeholders money in the long and/or short term
Review of A Completed DQA • Demonstrates components of a DQA • Highlights the importance of careful examination and analysis of data • Provides a real-world example of the DQA process
Site Background • Several large soil piles were discovered at a facility • Origin of the piles and contamination levels at the site were unknown • Area had been open to public recreational use for several years • Litigation risk was very high
Primary Objectives • Determine the nature and extent of contamination in the soil piles and surrounding soils, and determine the risk to human health • Determine if action is required and, if so, determine the appropriate action
Secondary Objectives • Determine if soils in piles are different from surrounding soils • Determine if chemicals are present that can help predict the presence of other chemicals of interest (indicator chemicals) • Determine the accuracy and applicability of field methods in the area
Secondary Objectives • Developed as a result of having the data analysts involved during DQO and sampling scheme development • Data analysts proposed options to the client of which the client was previously unaware • Several were developed to aid development of sampling plans for neighboring cleanup sites
Sampling Plan • Sampling plan was developed to allow all primary and secondary objectives to be met • Each composite sample within the cluster was split: laboratory analysis was performed on one portion of the sample and field analysis was performed on the other portion • This allowed for a determination of how well field screening methods performed compared to fixed laboratory methods
Sampling Plan • Additional field samples were collected between the clusters of fixed laboratory samples to provide additional insight along the length of the piles • The comparison of fixed laboratory methods and field screening methods allowed for better interpretation of these data
Primary Objectives • None of the soils were contaminated to an extent that posed a risk to human health • One point of elevated contamination was discovered where the two piles met, but it was defensibly determined that it also did not pose a threat to human health • Results and methods were defensible
Secondary Objectives • Soils in the piles were compared to soils on the banks of the outfall and creek as well as other soils surrounding the piles • Soils in the piles did have higher levels of contamination than surrounding soils; however, soil piles did not pose a threat to human health
Secondary Objectives • Correlation analysis was performed to determine if indicator chemicals were present • It was of particular interest to know if the presence of Uranium-235 or Uranium-238 could be used to determine the presence of PCBs • A usable correlation was not found because PCBs were detected in very few samples
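To make the idea of a correlation analysis concrete, the sketch below computes a Pearson correlation coefficient between two paired sets of results. This is a generic illustration, not the authors' procedure; the slides do not state which correlation measure was used, and the function name `pearson_r` and all values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of paired values; None if either variable is constant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # The coefficient is undefined when one variable does not vary at all.
        return None
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired results, made up for this example.
candidate_indicator = [1.0, 2.0, 3.0, 4.0]
chemical_of_interest = [2.1, 3.9, 6.2, 8.0]
print(pearson_r(candidate_indicator, chemical_of_interest))
```

When a chemical is detected in only a handful of samples, most of its paired values carry no usable variation, so a coefficient computed this way says little about whether one chemical predicts another.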
Secondary Objectives • Field and fixed laboratory methods were compared to determine how field data could be used to aid in sampling other sites • A multi-tiered approach was used to determine the strength of the relationship between the two methods in the area
Field and Fixed Laboratory Results Comparison • Correlation analysis was used to determine if field and fixed lab methods directly correlated • False-negative and false-positive rates were determined around field detection limits, background levels, and no action limits • Means, standard deviations, and upper confidence limits (UCLs) were computed from both sets of data and compared • Bubble plots were generated to determine how field and lab measurements corresponded in the soils
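The false-negative and false-positive rates mentioned above can be sketched as follows, assuming the fixed laboratory result is treated as the reference value and a single threshold (for example a hypothetical action limit) defines "above" and "below". The function name `decision_error_rates`, the threshold, and the paired values are all invented for illustration.

```python
def decision_error_rates(field, lab, threshold):
    """Return (false_positive_rate, false_negative_rate) for paired split samples.

    Assumption: the lab result is the reference. A false negative is a field
    result below the threshold when the lab result is at or above it; a false
    positive is the reverse. A rate is None when it has no samples to be
    computed from.
    """
    pairs = list(zip(field, lab))
    lab_above = [f for f, l in pairs if l >= threshold]
    lab_below = [f for f, l in pairs if l < threshold]
    false_neg = sum(1 for f in lab_above if f < threshold)
    false_pos = sum(1 for f in lab_below if f >= threshold)
    fn_rate = false_neg / len(lab_above) if lab_above else None
    fp_rate = false_pos / len(lab_below) if lab_below else None
    return fp_rate, fn_rate

# Hypothetical split-sample results (same units), made up for this example.
field_results = [5, 12, 8, 15, 3, 11]
lab_results = [6, 9, 11, 14, 2, 13]
fp_rate, fn_rate = decision_error_rates(field_results, lab_results, threshold=10)
print(fp_rate, fn_rate)
```

Repeating the calculation with the threshold set to a field detection limit, a background level, or a no-action limit gives one pair of rates per decision point.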
Field and Fixed Lab Methods Comparison Results • Field results and laboratory results were not usably correlated in a mathematical sense • Several of the analytes could not be detected at, or below, background with field methods • Means, standard deviations, and UCLs did not compare well for the analytes of primary interest • Bubble plots indicated that for most analytes of concern, higher concentrations in the lab methods were associated with higher concentrations in the field methods
Defensibility • Site had a high risk of litigation so defensibility was very important! • Data were analyzed to ensure the clustered design was handled appropriately • Appropriate methods were used to handle undetected data (using the detection limit or ½ of the detection limit for undetected values is not a defensible method) • Outliers were identified and their impact discussed through all phases of statistical analysis
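One way to see why simple substitution for undetected values is hard to defend is to check how much an estimate moves with the substitution choice. The sketch below only illustrates that sensitivity; it does not implement the methods the authors used for undetected data, which the slides do not name. The function name `substituted_means` and all numbers are hypothetical.

```python
def substituted_means(detects, n_nondetect, detection_limit):
    """Mean of a data set when non-detects are replaced by 0, DL/2, or DL."""
    n = len(detects) + n_nondetect
    total = sum(detects)
    return {
        "zero": total / n,
        "half_dl": (total + n_nondetect * detection_limit / 2) / n,
        "full_dl": (total + n_nondetect * detection_limit) / n,
    }

# Hypothetical data: two detected values and six non-detects, made up here.
means = substituted_means(detects=[2.0, 3.0], n_nondetect=6, detection_limit=1.0)
print(means)
```

With most results undetected, the estimated mean in this made-up case more than doubles depending on an arbitrary choice of substitute value, which is the kind of weakness that invites challenge under close scrutiny.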
What’s in it for Me? • Saves stakeholders money on current and/or future projects • Results can withstand close scrutiny • Reduces the risk of having to resample a site • Minimizes the chance of remediating a clean site or failing to remediate a contaminated area