How to describe Accuracy And why does it matter. Jon Proctor, PhotoTopo GIS In The Rockies: October 10, 2013. introduction. Accuracy. We think we understand it But, there are more details than you think… This discussion will help you understand what is needed to describe accuracy
How to describe Accuracy, and why does it matter • Jon Proctor, PhotoTopo • GIS In The Rockies: October 10, 2013
Introduction • Accuracy. • We think we understand it • But there are more details than you think… • This discussion will help you understand what is needed to describe accuracy • And why it matters…
Horizontal Accuracy • Horizontal accuracy shall be tested by comparing the planimetric coordinates of well-defined points in the dataset with coordinates of the same points from an independent source of higher accuracy. (NSSDA) • http://www.fgdc.gov/standards/projects/FGDC-standards-projects/accuracy/part3/chapter3
What to measure • Measure and record the x and y coordinates of the feature in the product. • Ensure that the coordinates are in the same projection as the control • [Figure: control chip beside the product image, marking the GCP (control) coordinate and the observed location]
Accuracy Spreadsheet • For each point • Record the control ID and the observed (product) x and y coordinates • List the control x and y coordinates • Calculate the difference (observed – control) for x and y • Plot the errors • Calculate the radius
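The spreadsheet steps can be sketched in a few lines of Python. This is only an illustration: the check-point IDs and coordinates below are made up, and the sketch assumes the observed and control coordinates are already in the same projection and in meters.

```python
import math

# Hypothetical check points (not from the presentation):
# (control ID, observed x, observed y, control x, control y), all in meters.
check_points = [
    ("GCP01", 500003.0, 4100004.0, 500000.0, 4100000.0),
    ("GCP02", 501000.0, 4101005.0, 501000.0, 4101000.0),
    ("GCP03", 502006.0, 4102008.0, 502000.0, 4102000.0),
]

def error_table(points):
    """Return one row per point: (control ID, dx, dy, radius)."""
    rows = []
    for control_id, obs_x, obs_y, ctl_x, ctl_y in points:
        dx = obs_x - ctl_x          # difference = observed - control
        dy = obs_y - ctl_y
        radius = math.hypot(dx, dy)  # radial error, always >= 0
        rows.append((control_id, dx, dy, radius))
    return rows

rows = error_table(check_points)
for control_id, dx, dy, radius in rows:
    print(f"{control_id}: dx={dx:+.2f} m  dy={dy:+.2f} m  r={radius:.2f} m")
```

The resulting dx, dy, and radius columns are the raw material for every statistic discussed later (RMSE, StDev, CE90).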
Is this the accuracy? • So far, we only know the errors of the points that were measured • We don’t have the accuracy for the product • How do we describe the accuracy?
What is Positional Accuracy? • Positional accuracy is a statistical measure of a feature’s location. • We are not saying that the product is off by 4 meters to the East. • Rather, we have 4 meters of uncertainty. • And we need to describe that uncertainty. • It represents the probability that a feature is within a given distance of its true location. • Probability and distance • Not a specific direction, not a specific distance • This statistical description can be represented in numerous ways, such as: • CE90, CE95, RMSE, StDev, and many more…
What is needed to Describe accuracy? • We need 3 descriptors: • Measure: 4, 8.5, 10 • Unit: meter, feet • Statistical description: RMSEr, StDevr, CE90
Definitions • RMSE: Root Mean Square error • RMS of the absolute error • StDev: Standard Deviation • Deviation from the average • For non-biased datasets (where the average is 0), StDev and RMSE will be the same
RMSE vs StDev • RMSEr • Measure radial distance from control (0,0) to data point • StDevr • Measure radial value from cluster center to data point
RMSE vs StDev • For an unbiased dataset, where the center of the scatter plot is near 0,0 • Measures are almost exactly the same • RMSE ≈ StDev
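A small numeric check of the RMSE and StDev relationship, as a sketch only. It works on a single axis with made-up error values and uses the population form of the standard deviation (dividing by n); those choices are assumptions for illustration, not something the slides specify.

```python
import math

def mean(values):
    return sum(values) / len(values)

def rmse(values):
    """Root mean square of the errors, measured from zero (the control)."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def stdev_population(values):
    """Root mean square of the deviations, measured from the sample average."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

# Hypothetical x errors in meters: one set centered on 0, one shifted by 3 m.
unbiased = [-2.0, -1.0, 0.0, 1.0, 2.0]
biased = [v + 3.0 for v in unbiased]

for name, errors in (("unbiased", unbiased), ("biased", biased)):
    print(f"{name}: mean={mean(errors):.2f}  "
          f"RMSE={rmse(errors):.3f}  StDev={stdev_population(errors):.3f}")
```

With this definition, RMSE² = StDev² + mean², so the two agree when the average error is 0 and RMSE grows larger than StDev as the bias grows.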
(more) Definitions • CE90 • A CE90 of 10.0 meters is an accuracy in which 90% of the well-defined, measured image points are statistically expected to be within 10.0 meters of their surveyed locations. • This assumes that the survey or truth point is (much) more accurate than the dataset being sampled. • CE95 • The distance for a 95% probability • CEP • Circular Error Probable. • Circle of Equal Probability. • The distance for a 50% probability. • Or CE50
Example of 10m CE90 • But what if we forget the statistical descriptor of CE90? • If the accuracy was only described as 10m, what would the scatter look like?
Why it matters • These plots are all at 10m • The scatter size changes based on the statistical descriptor • CE99.99 • CE99 • CE95 • CE90 • CE50 • RMSEr • RMSExy • Without the descriptor, we would not know the tolerance for error on the project
Why it matters (2) • Another way to show why the statistical description matters is shown here. • These are equivalent values for this error scatter plot • 20.00 m CE99.99 • 14.14 m CE99 • 12.91 m CE95 • 10.00 m CE90 • 8.27 m CE50 • 6.59 m RMSEr • 4.66 m RMSExy
xy compared to r • When calculating RMSE or StDev, you can measure the x, y offset or the radial offset, and use these offsets in the equations • RMSEr equals the horizontal radial RMSE • Use notation that clarifies which value is being measured, such as: • RMSE(xy), StDev(xy) • RMSE(r), StDev(r) • Or: • RMSExy, StDevxy • RMSEr, StDevr
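For reference, the per-axis and radial RMSE can be written out as follows. The symbols Δx, Δy, and n are notation introduced here for illustration (the observed minus control differences and the number of check points); the relationship between the radial and per-axis values is the one used by the NSSDA.

```latex
\Delta x_i = x_{\mathrm{observed},i} - x_{\mathrm{control},i}, \qquad
\Delta y_i = y_{\mathrm{observed},i} - y_{\mathrm{control},i}

\mathrm{RMSE}_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \Delta x_i^2}, \qquad
\mathrm{RMSE}_y = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \Delta y_i^2}

\mathrm{RMSE}_r = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\Delta x_i^2 + \Delta y_i^2\right)}
               = \sqrt{\mathrm{RMSE}_x^2 + \mathrm{RMSE}_y^2}
```

When the x and y errors have the same RMSE, this reduces to RMSEr = √2 × RMSEx, which is why the radial value is always the larger of the two.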
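xy Compared to r offset with biased dataset • Measure xy or r? • Note: radial values are always positive • can’t have a negative radius • RMSEr = $\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\Delta x_i^2 + \Delta y_i^2\right)}$, where Δx and Δy are the (observed – control) differences for each of the n points • Note: histogram for RMSE is only for the x measures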
xy Compared to r offset with unbiased dataset • With an unbiased dataset, the x values are centered on 0 • Radial values are all positive
Types of Error • Blunder: A gross error. A careless mistake. An obvious error. Blunders are individual errors that affect each measurement differently. Blunders that can be identified should be disregarded, or removed from the report or solution. • Marking a control point at the wrong corner of an intersection. • Bad image correlation • Systematic: An error that tends to shift all measurements in a systematic way. If the systematic error is identified, it can be removed from the results, or the project can be re-processed with the corrected information. • A bold or shadowed line used to depict a feature • A burr at the end of a measuring stick • Sensor calibration error • Incorrect interior orientation • Incorrect re-projection parameters • Random: An error that shifts measurements in a haphazard or inconsistent manner. • Rounding of significant figures • Coarse DEM postings, not capturing terrain change • Noise in signal processing • The goal is to identify and remove the blunders, and remove systematic errors. As a result, only random errors are left in the project.
If we could say the positional accuracy was 4 meters to the east, then we could just shift the dataset to correct for it, and improve the accuracy. • But in a good project, you will remove the blunders and the bias. • All that is left is random error
Random error at 10m CE90 • Example of Random error • Observations = • 10 • 100 • 1,000 • 10,000 • Distribution is circular and non-biased
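A scatter like this can be simulated as a rough sketch. The code assumes the random error is circular normal (independent x and y errors with the same standard deviation and no bias), in which case the CE90 radius is about 2.146 times the per-axis standard deviation. The seeds and function names are arbitrary choices for illustration.

```python
import math
import random

# For a circular normal distribution, P(r <= R) = 1 - exp(-R^2 / (2 sigma^2)),
# so the 90% radius is sigma * sqrt(-2 ln 0.10), about 2.146 * sigma.
CE90_FACTOR = math.sqrt(-2.0 * math.log(0.10))

def simulate_offsets(n, ce90=10.0, seed=0):
    """Draw n unbiased (dx, dy) offsets whose theoretical CE90 is `ce90`."""
    sigma = ce90 / CE90_FACTOR
    rng = random.Random(seed)
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n)]

def fraction_within(offsets, distance):
    """Share of points whose radial error is within `distance`."""
    inside = sum(1 for dx, dy in offsets if math.hypot(dx, dy) <= distance)
    return inside / len(offsets)

for n in (10, 100, 1_000, 10_000):
    offsets = simulate_offsets(n, seed=n)
    print(f"{n:>6} observations: {fraction_within(offsets, 10.0):.1%} within 10 m")
```

With few observations the share inside 10 m can sit well away from 90%; it settles close to 90% only as the number of observations grows.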
Conclusions • Specify a measure: 4, 8.5, 10 • Specify a unit: meters, feet • Use a statistical description • RMSE • StDev • CE50 • CE90 • Etc… • And, be precise • RMSExy vs RMSEr • StDevxy vs StDevr • This is important because the conversion factor from Xr to CE90 is different from the conversion factor from Xxy to CE90
Conversion factors • Since these descriptors are all describing 2-dimensional error offsets, we can convert between the different values.
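One common way to derive such conversion factors is to assume a circular normal error distribution (unbiased, with equal and independent x and y errors). The sketch below uses that assumption; the function names are hypothetical, and factors derived under other assumptions will differ.

```python
import math

def ce_factor(probability):
    """Multiplier k such that CE(probability) = k * sigma, where sigma is the
    per-axis standard deviation of a circular normal distribution."""
    return math.sqrt(-2.0 * math.log(1.0 - probability))

def convert_ce(value, from_prob, to_prob):
    """Convert a circular error between two probability levels."""
    return value * ce_factor(to_prob) / ce_factor(from_prob)

def ce_to_rmse(value, prob):
    """Return (RMSExy, RMSEr) for an unbiased circular normal distribution."""
    rmse_xy = value / ce_factor(prob)   # per-axis RMSE equals sigma
    rmse_r = rmse_xy * math.sqrt(2.0)   # radial RMSE
    return rmse_xy, rmse_r

rmse_xy, rmse_r = ce_to_rmse(10.0, 0.90)
print(f"10 m CE90 -> RMSExy = {rmse_xy:.2f} m, RMSEr = {rmse_r:.2f} m")
print(f"10 m CE90 -> CE99 = {convert_ce(10.0, 0.90, 0.99):.2f} m")
```

Under this assumption, 10 m CE90 corresponds to about 4.66 m RMSExy and 6.59 m RMSEr, so the factor to CE90 is about 2.146 from RMSExy but only about 1.518 from RMSEr.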
3 methods for measuring CE90 • Direct rank • XY • Radius • Show the difference with 5, 10, and 20 check points • Really, we should say “I am 73% confident that 90% of the well-defined points are within 10 meters of their true location” • Where my confidence level is based on the sample size, and variance measured from the population
DRAFT ASPRS accuracy standards • “Tested __ (meters, feet) Non-vegetated Vertical Accuracy (NVA) at 95 percent confidence level in all open and non-vegetated land cover categories combined using RMSEz x 1.96,” and • “Tested __ (meters, feet) Vegetated Vertical Accuracy (VVA) at the 95th percentile in all vegetated land cover categories combined using the absolute value 95th percentile error.”
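The two reporting statements translate into two different calculations, sketched here. The elevation errors are made up, and the percentile uses linear interpolation between ranks, which is an assumption for illustration; the standard itself should be consulted for the exact percentile method.

```python
import math

def nva_95(dz):
    """Non-vegetated Vertical Accuracy: RMSEz x 1.96."""
    rmse_z = math.sqrt(sum(e * e for e in dz) / len(dz))
    return 1.96 * rmse_z

def vva_95th_percentile(dz):
    """Vegetated Vertical Accuracy: 95th percentile of the absolute errors.
    Assumption: linear interpolation between the two closest ranks."""
    ordered = sorted(abs(e) for e in dz)
    pos = 0.95 * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

# Hypothetical elevation errors (observed - control), in meters.
open_terrain = [0.05, -0.08, 0.10, -0.03, 0.07, -0.12, 0.02, 0.09]
vegetated = [0.10, -0.25, 0.40, 0.15, -0.05, 0.60, 0.20, -0.30]

print(f"NVA = {nva_95(open_terrain):.3f} m at 95 percent confidence level")
print(f"VVA = {vva_95th_percentile(vegetated):.3f} m at the 95th percentile")
```

The first figure relies on the errors being normally distributed; the second makes no distribution assumption, which is why it is used for vegetated terrain.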
Variation in estimated CE90 • These plots show variation in the estimated CE90 based on 3 different methods and a sample size of 20 • Estimated CE90 ranges from 6.5 to 12 m in these 5 samples
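The sampling variation can be illustrated with a small simulation. This is a sketch under stated assumptions: the true error is circular normal with a CE90 of 10 m, and only two estimators are shown (a direct-rank estimate and an RMSE-based estimate), which are plausible readings of the methods named in the slides rather than the presenter's exact procedures. The seed is arbitrary.

```python
import math
import random

# CE90 = factor * per-axis sigma for a circular normal distribution.
CE90_FACTOR = math.sqrt(-2.0 * math.log(0.10))

def direct_rank_ce90(radii):
    """Smallest observed radius that contains at least 90% of the points."""
    ordered = sorted(radii)
    k = (9 * len(ordered) + 9) // 10   # ceil(0.9 * n) in integer arithmetic
    return ordered[k - 1]

def rmse_based_ce90(offsets):
    """CE90 from the pooled per-axis RMSE, assuming circular normal errors."""
    n = len(offsets)
    rmse_xy = math.sqrt(sum(dx * dx + dy * dy for dx, dy in offsets) / (2 * n))
    return CE90_FACTOR * rmse_xy

def draw_sample(n, true_ce90, rng):
    sigma = true_ce90 / CE90_FACTOR
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n)]

rng = random.Random(2013)
estimates = []
for i in range(5):
    offsets = draw_sample(20, 10.0, rng)
    radii = [math.hypot(dx, dy) for dx, dy in offsets]
    estimates.append((direct_rank_ce90(radii), rmse_based_ce90(offsets)))
    print(f"sample {i + 1}: direct rank = {estimates[-1][0]:.1f} m, "
          f"RMSE-based = {estimates[-1][1]:.1f} m")
```

Each run of 20 check points comes from the same 10 m CE90 population, yet the estimates differ from sample to sample and between methods, which is the reason a confidence level belongs alongside the reported value.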