Key Factors Influencing Data Quality Region V ERP Auto Body Training Chicago, IL November 18, 2009
Purpose of Presentation • Review data quality indicator (DQI) concepts • Review data quality objective (DQO) concepts • Illustrate data quality issues that might arise in this project
Two Types of Data • Data whose quality you can affect (primary data), e.g.: • New random inspection data collected by your program • New certification data collected by your program • Data whose collection quality you cannot affect (secondary data), e.g.: • Data collected by others To be accepted, all data must meet your agreed-upon quality objectives.
Review of Data Quality Indicators (DQIs) • Definition • Everyday example • Examples meaningful to this project
Precision Measure of agreement among repeated measurements of the same property under identical or substantially similar conditions. Source: EPA Introduction to Data Quality Indicators http://epa.gov/quality/trcourse.html
Examples of Precision Issues • Measuring a child: How did she get shorter? • Ambiguous questions: “Has the facility made efforts to train its paint technicians?” • Statistical sampling: Confidence level and margin of error
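The confidence level and margin of error in the last bullet can be sketched with the standard formula for a sample proportion. This is an illustration only, assuming a simple random sample of inspection results; the function name and numbers are mine, not from the EPA material:

```python
import math

def margin_of_error(p, n, z=1.645):
    """Margin of error for a sample proportion p observed in n random
    inspections, at the confidence level implied by z
    (z = 1.645 for 90% confidence, 1.96 for 95%)."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical example: 60% compliance observed in 100 random inspections
moe = margin_of_error(0.60, 100)  # about +/- 0.081 at 90% confidence
```

Note how precision improves only with the square root of n: quadrupling the number of inspections halves the margin of error.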
Sensitivity Measure of the capability of a method or instrument to discriminate between measurement responses representing different levels of the variable of interest. • How fine are the units of measurement? Source: EPA Introduction to Data Quality Indicators http://epa.gov/quality/trcourse.html
Examples of Sensitivity Issues • Cooking: In your dish, can you taste the difference one grain of salt makes? 1 cup? • Quantitative questions: "How many barrels of MeCl does your shop use in a year?" vs. "Is less than 180 gallons of MeCl used per year?"
Bias Systematic or persistent distortion of a measurement process that causes errors in one direction. Source: EPA Introduction to Data Quality Indicators http://epa.gov/quality/trcourse.html
Examples of Bias Issues • “Dewey Defeats Truman”: A telephone poll is biased in favor of telephone owners • Data collector’s style: "Harsh" inspector in State A vs. "Easy" inspector in State B • Different data collectors: SBEAPs collecting baseline inspection data; EPA collecting post-baseline inspection data • Interested party: Facility-reported data, relative to inspector-collected data
Representativeness Degree to which a sample accurately and precisely represents the larger population or context. Lack of representativeness can be a source of bias. Source: EPA Introduction to Data Quality Indicators http://epa.gov/quality/trcourse.html
Examples of Representativeness Issues • Mixing: Stir a fluid before taking a sample (e.g., cooking a dish) • Defining your Indicator: Hazardous waste questions – consider defining waste type • RCRA, Used Oil, PCBs, Petroleum Contaminated wastes • RCRA Only • Randomness: Random sample is representative (but of what?) • What population are you trying to measure?
Completeness Measure of the amount of valid data obtained from a measurement system, compared to the amount that was expected. • Incompleteness can be a source of bias Source: EPA Introduction to Data Quality Indicators http://epa.gov/quality/trcourse.html
Examples of Completeness Issues • Cooking: Do you have enough of every ingredient to make the dish? • Universe: Have all eligible facilities been identified? • Response rate: • What percentage of surveys is returned? • How many questions are left blank?
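The two response-rate bullets above amount to two simple ratios: the share of the facility universe that returned a form, and the share of answers filled in on the returned forms. A minimal sketch (the function, field names, and data are illustrative assumptions, not part of the training material):

```python
def completeness(forms, universe_size, n_questions):
    """Return (response_rate, item_rate) for a survey.
    forms: list of dicts mapping question -> answer, with None for blanks.
    response_rate: fraction of the facility universe that responded.
    item_rate: fraction of answers filled in across returned forms."""
    response_rate = len(forms) / universe_size
    answered = sum(1 for f in forms for a in f.values() if a is not None)
    item_rate = answered / (len(forms) * n_questions) if forms else 0.0
    return response_rate, item_rate

# Hypothetical example: 2 of 4 eligible shops returned forms,
# and 3 of the 4 answers across those forms were filled in
forms = [{"q1": "yes", "q2": None}, {"q1": "no", "q2": "12 gal"}]
rate, items = completeness(forms, universe_size=4, n_questions=2)
```

Both rates matter when setting a completeness DQO: a high return rate with many blank answers can still leave a measure unusable.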
Comparability Measure of confidence that the underlying assumptions behind two data sets are similar enough that the data sets can be compared and/or combined to inform decisions. Key comparisons in this project: • Region wide baseline performance compared over time Source: EPA Introduction to Data Quality Indicators http://epa.gov/quality/trcourse.html
Examples of Comparability Issues • Timing: Compare data collected in the spring with data collected in the fall? • Bias: Are data collectors collecting data the same way?
Data Quality Objectives (DQOs) • Role of DQOs: Identify minimum standards for data acceptability • How many DQOs? Set a DQO for each critical DQI issue • Perfection? Not necessary or expected • KEY FOR DQOs: Sufficient for your needs and achievable by all data collectors
DQO Examples • Completeness: 90% of certification forms returned; 95% of responses completed for each measure • Precision: 90% confidence that survey results are accurate within +/- 10% • Representativeness: Fluid will be mixed thoroughly before an analytical sample is taken • Sensitivity: MeCl will be reported in gallons or pounds per year
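The precision DQO above (90% confidence within +/- 10%) implies a minimum number of random inspections. A rough sketch using the worst-case proportion p = 0.5, which gives the largest required sample; the function name and figures are illustrative, not from the EPA material:

```python
import math

def sample_size(moe, z=1.645, p=0.5):
    """Random inspections needed so a sample proportion falls within
    +/- moe of the true value, at the confidence level implied by z
    (z = 1.645 for 90% confidence). p = 0.5 is the worst case."""
    return math.ceil(z**2 * p * (1 - p) / moe**2)

sample_size(0.10)  # 68 inspections meet a 90% confidence, +/- 10% DQO
sample_size(0.05)  # tightening to +/- 5% roughly quadruples the sample
```

This is where the "achievable with available resources" question bites: halving the margin of error about quadruples the inspection workload.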
How Strict Should DQOs Be? Depends on: • Data Use: What kinds of decisions will they inform? Planning? Enforcement? Budgetary? • Types of Analyses: Do the DQOs support the questions you want to answer? Think ahead! • Resources: What can be achieved with available resources?
Rules of Thumb for DQOs • No surprises: Make sure quality will be good enough for your needs. • Transparency: Report all unresolved, important quality issues. • Achievability: If DQOs are too onerous, data won't be collected or will be rejected.
For more information • Tara Acker, NEWMOA • taraacker@gmail.com • (413) 549-5309 • or • Renee Bashel, WI Department of Commerce • renee.bashel@wisconsin.gov • (608) 264-6153