Uncertainty • How “certain” of the data are we? • How much “error” does it contain? • Also known as: • Quality Assurance / Quality Control • QAQC
Definitions • Rigor • Manage uncertainty from collection to publication and dissemination • Due diligence • Document the uncertainties as best you can
Staircase of Knowledge • [Figure: staircase rising from Observation and Measurement through Data, Information, Knowledge, and Understanding to Wisdom; step annotations: Verification, Selection, Testing, Organization, Interpretation, Comprehension, Integration, Judgment; axes: Increasing Subjectivity and Human Value Added] • Source: Environmental Monitoring and Characterization, Artiola, Pepper, and Brusseau
No GIS Data is Perfect • What is the uncertainty of the data? • Gross Errors: • Datums • Wrong area • Accuracy: • how well does it represent reality? • Precision: • how repeatable is it? • How much have things changed? • Was there bias in the sampling? • Visit the site to see what really happened!
Required Uncertainty • “All models are wrong but some are useful” • George E. P. Box • “All data is wrong but some is useful” • Jim Graham • If a DEM is accurate to 30 meters: • You can’t use it to design a road • You can use it to predict large landslides
Accuracy and Precision • Uncertainty: • If I just use my GPS to get somewhere, how close will I get? • Accuracy (correctness): • Does the GPS take me to the correct location? • Precision (repeatable & exactness): • If I do this over and over again, how close do I get to the same place?
Accuracy and Precision • [Figure panels: High Accuracy, Low Precision; Low Accuracy, High Precision] • Source: http://en.wikipedia.org/wiki/Accuracy_and_precision
Bias (Accuracy) • [Figure labels: Truth, Mean, Bias] • Bias = distance of the mean from the truth
Standard Deviation (Precision) • Each band represents one standard deviation (Source: Wikipedia) • Standard Error: standard deviation of the sample mean
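The two slides above can be illustrated with a small calculation. This is a minimal sketch, assuming repeated GPS readings at a benchmark whose true position is known; the coordinate values and variable names are invented for illustration.

```python
import math
import statistics

# Hypothetical repeated GPS eastings (meters) taken at one benchmark.
# The "true" easting and every reading are made-up illustration values.
true_easting = 500000.0
readings = [500003.1, 500001.8, 500004.2, 500002.5, 500003.4]

mean_reading = statistics.mean(readings)

# Accuracy: bias is the distance of the mean from the truth.
bias = mean_reading - true_easting

# Precision: sample standard deviation of the repeated readings.
std_dev = statistics.stdev(readings)

# Standard error of the mean shrinks as more readings are taken.
std_error = std_dev / math.sqrt(len(readings))

print(f"bias = {bias:.2f} m")
print(f"standard deviation = {std_dev:.2f} m")
print(f"standard error = {std_error:.2f} m")
```

A large bias with a small standard deviation corresponds to the "low accuracy, high precision" case.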
Sources of Uncertainty • Real World • Uncertainty? • Protocol Errors, Sampling Bias, and Instrument Error • Measurements • Storage • Unintended Conversions • Digital Copy • Uncertainty increases with processing, human errors • Processing • Incorrect method, interpretation errors • Analysis • Results • Representation errors • Decisions • Human errors
Protocol • Rule #1: Have one! • Step by step instructions on how to collect the data • Calibration • Equipment required • Training required • Steps • QAQC • See Globe Protocols: • http://www.globe.gov/sda/tg00/aerosol.pdf
Protocol Error • Is there a protocol? • What is being measured? • Is it complete: • How large? • How small? • Unexpected circumstances (illness, weather, accidents, equipment failures, changing ecosystems) • Image: engadget.com
Sampling Bias • How was the sampling done? • Whales below water? • Plant seeds? • Small streams? • Night vs. Day? • Time of Year? • “Most data is collected near a road, a port-a-potty, and a restaurant!” • Tom Stohlgren • Image: saawinternational.org
User Measurement Errors • Wrong Datum • Data in wrong field/attribute • Transcription errors • Observer error: • Accuracy: How close to “truth”? • Precision: How repeatable? • Drift: Changes over time
Instrument Errors • GPS has “Delusion of Precision” • Calibration & Drift • Take calibration measurements throughout the sampling period • Humans as instruments: • DBH • Weight • Humans are almost always involved! • You can calibrate everything!
Calibration • Sample a portion of the study area repeatedly and/or with higher precision • GPS: benchmarks, higher resolution • Measurements: lasers, known distances • Identifications: experts, known samples • Image: ecd.com
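One way calibration measurements taken throughout the sampling period might be used is to correct for drift. The sketch below assumes the instrument's offset changes linearly between two calibration checks against a known reference. That linear model, the function name `drift_corrected`, and all numbers are assumptions for illustration, not something the slides specify.

```python
def drift_corrected(reading, t, t_start, t_end, ref_true, ref_start, ref_end):
    """Remove a linearly interpolated instrument offset from a reading.

    ref_start and ref_end are what the instrument reported for a known
    reference (true value ref_true) at times t_start and t_end.
    Assumes the drift between the two checks is linear in time.
    """
    offset_start = ref_start - ref_true
    offset_end = ref_end - ref_true
    fraction = (t - t_start) / (t_end - t_start)
    offset = offset_start + fraction * (offset_end - offset_start)
    return reading - offset


# Hypothetical field day: a scale read a 10.00 kg reference weight as
# 10.02 kg at hour 0 and as 10.08 kg at hour 8.
corrected = drift_corrected(
    reading=25.40, t=4.0,
    t_start=0.0, t_end=8.0,
    ref_true=10.00, ref_start=10.02, ref_end=10.08,
)
print(f"corrected reading = {corrected:.2f} kg")
```

More calibration checks during the day would allow piecewise correction and would also show whether the linear assumption holds.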
Storage Errors: Excel • 10/2012 -> Oct-2012 • However, Excel stores 10/1/2012! • 1.00000000000001 -> 1 • However, Excel stores 1.00000000000001 • 1.000000000000001 -> 1 • Excel stores 1
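The numeric behavior above is consistent with a limit of 15 significant digits. The sketch below is not Excel itself: it uses Python floats formatted to 15 significant digits as a stand-in, and the helper name `keep_15_digits` is hypothetical.

```python
def keep_15_digits(text):
    """Parse text as a float and keep at most 15 significant digits.

    This imitates a 15-significant-digit limit; it is an illustration,
    not a reimplementation of any spreadsheet.
    """
    return format(float(text), ".15g")


for typed in ("1.00000000000001", "1.000000000000001"):
    print(f"typed {typed} -> kept {keep_15_digits(typed)}")
```

The first value has 15 significant digits and survives; the second has 16 and collapses to 1.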
Storage Errors: Database • Dates: • 2012 -> 2012-01-01 00:00:00.00 • January 1st at midnight, exactly • Numbers • Varies with the database
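The same silent gain in apparent precision can be reproduced outside a database. This minimal sketch uses Python's standard `datetime` module as an analogy; a particular database may behave differently.

```python
from datetime import datetime

# A year-only value is parsed into a full timestamp: the missing month,
# day, and time are filled in with defaults, not left unknown.
year_only = datetime.strptime("2012", "%Y")
print(year_only.isoformat(sep=" "))

# The same thing happens to a month/year value such as 10/2012.
month_year = datetime.strptime("10/2012", "%m/%Y")
print(month_year.isoformat(sep=" "))
```

Once stored this way, nothing in the value itself records that only the year (or the month) was actually known.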
Documenting Uncertainty • Record accuracy and precision in metadata! • Add uncertainty to your outputs • Data sources • Sampling Procedures and Bias • Processing methods • Estimated uncertainty • Accuracy and precision • 95% confidence interval
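A 95% confidence interval for a mean could be estimated as sketched below. This assumes the measurement errors are roughly normal and uses the normal approximation; for small samples a t-distribution would give a wider interval. The DBH values and the function name are invented for illustration.

```python
import math
import statistics


def confidence_interval_95(samples):
    """Return (low, high) for the mean using a normal approximation."""
    mean = statistics.mean(samples)
    std_error = statistics.stdev(samples) / math.sqrt(len(samples))
    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
    return mean - z * std_error, mean + z * std_error


# Hypothetical repeated DBH measurements of one tree, in centimeters.
dbh_cm = [30.1, 29.8, 30.4, 30.0, 29.7]
low, high = confidence_interval_95(dbh_cm)
print(f"DBH = {statistics.mean(dbh_cm):.1f} cm (95% CI {low:.2f} to {high:.2f} cm)")
```

Reporting the interval with the value, and recording how it was derived in the metadata, lets others judge whether the data is fit for their purpose.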
FGDC Standards • Federal Geographic Data Committee FGDC-STD-007.3-1998 • Geospatial Positioning Accuracy Standards • Part 3: National Standard for Spatial Data Accuracy • Root Mean Squared Error (RMSE) from HIGHER accuracy source • Accuracy reported as 95% confidence interval http://www.fgdc.gov/standards/projects/FGDC-standards-projects/accuracy/part3/chapter3 Section 3.2.1
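The horizontal accuracy statistic from the standard can be sketched as follows. It uses the NSSDA multiplier of 1.7308 on the radial RMSE, which applies when the errors in x and y are about equal. The function name and the check-point coordinates are hypothetical, and the example uses far fewer points than the 20 check points the standard calls for.

```python
import math


def nssda_horizontal_accuracy(dataset_xy, reference_xy):
    """Horizontal accuracy at the 95% confidence level (NSSDA).

    dataset_xy: coordinates from the data being tested.
    reference_xy: the same points from a HIGHER accuracy source.
    Assumes RMSE in x and RMSE in y are approximately equal.
    """
    squared_errors = [
        (xd - xr) ** 2 + (yd - yr) ** 2
        for (xd, yd), (xr, yr) in zip(dataset_xy, reference_xy)
    ]
    rmse_r = math.sqrt(sum(squared_errors) / len(squared_errors))
    return 1.7308 * rmse_r


# Made-up check points, in meters.
tested = [(103.0, 204.0), (300.0, 400.0), (498.0, 601.0)]
surveyed = [(100.0, 200.0), (300.0, 400.0), (500.0, 600.0)]
accuracy = nssda_horizontal_accuracy(tested, surveyed)
print(f"Tested {accuracy:.2f} meters horizontal accuracy at 95% confidence level")
```

The vertical counterpart in the same standard multiplies the vertical RMSE by 1.9600.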
Significant Digits (Figures) • Limits the precision • Can be interpreted by others as the precision and accuracy of the data! • Which “feels” more accurate: • 1.234323 • 1.2
Significant Digits (Figures) • How many significant digits are in: • 12 • 12.00 • 12.001 • 12000 • 0.0001 • 0.00012 • 123456789 • Only applies to measured values, not exact values (e.g., 2 oranges)
Significant Digits • Cannot create precision: • 1.0 * 2.0 = 2.0 • 12 * 11 = 130 (not 132) • 12.0 * 11 = 130 (still not 132) • 12.0 * 11.0 = 132 • Can keep digits for calculations, report with appropriate significant digits
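The rule of calculating with full precision and reporting with the appropriate significant digits can be sketched as follows. The helper `round_sig` is a hypothetical name, and choosing the number of digits (that of the least precise input) is left to the caller.

```python
import math


def round_sig(value, digits):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


# Keep all digits during the calculation.
product = 12.0 * 11  # 132.0

# Report with the digits of the least precise input.
print(round_sig(product, 2))  # 11 has two significant digits -> 130.0
print(round_sig(product, 3))  # 12.0 * 11.0, three digits each -> 132.0
```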
Rounding • If you have 2 significant digits: • 1.11 -> ? • 1.19 -> ? • 1.14 -> ? • 1.16 -> ? • 1.15 -> ? • 1.99 -> ? • 1.155 -> ?
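The values above can be checked with Python's `decimal` module. This is a sketch only: the slide does not say which tie-breaking rule it intends, so both round-half-up and round-half-even are shown. Values are passed as strings because binary floats cannot represent numbers such as 1.15 exactly. The helper name `round_to_tenths` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_to_tenths(text, rounding):
    """Round a decimal string to one decimal place with the given rule."""
    return Decimal(text).quantize(Decimal("0.1"), rounding=rounding)


for text in ("1.11", "1.19", "1.14", "1.16", "1.15", "1.99", "1.155"):
    half_up = round_to_tenths(text, ROUND_HALF_UP)
    half_even = round_to_tenths(text, ROUND_HALF_EVEN)
    print(f"{text} -> half-up {half_up}, half-even {half_even}")
```

For these seven values the two rules agree. They differ only on exact ties whose preceding digit is even, such as 1.25, which becomes 1.3 under half-up and 1.2 under half-even.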
Other Approaches • Confidence Intervals • ± some range • Min/Max • Need a confidence interval • “Delusion of Precision” • Defined by the manufacturer