1 / 45

Accuracy, Support, and Interoperability

Explore the concepts of accuracy, support, and interoperability in spatial data, addressing challenges in data integration and error management. Learn about novel methods like co-Kriging and Pycnophylactic interpolation.

Download Presentation

Accuracy, Support, and Interoperability

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Accuracy, Support, and Interoperability Michael F. Goodchild University of California Santa Barbara

  2. The traditional view • Every object has a true position and set of attributes • with enough time and resources we could build a perfect geographic database in any thematic domain • It is the responsibility of the appropriate government agency to construct and disseminate the database

  3. A contemporary view • There will be many potential sources of data on any theme • varying in format • varying in the meaning of terms • varying in all dimensions of quality • positional accuracy, attribute accuracy, logical consistency, lineage, currency • varying in spatial support • sampling design

  4. Sampling a field • f(x) • point sample • weather data • linear transect • bathymetry • reporting zone • social data • averaging proportions/ratios • integrating densities • Accuracy of measurement of f • f* = f + δf • Accuracy of measurement of x

  5. point sample transect sample reporting zone sample

  6. For example… • f(x) = elevation • DEM • a raster of point samples • DLG • digitized isolines • Spot heights • irregularly spaced point samples • Triangular mesh (TIN), elevation polygons, splines, means over cells, finite elements,…

  7. Support • The objects used to characterize a field • points, lines, areas • possibly overlapping • though not normally in a single dataset

  8. A traditional solution • Force everything to a common support • downscaling? • DLG to raster point sample • congruent with DEM • contour-to-grid (CTOG) interpolation • Spot heights to raster point sample • congruent with DEM • Weighted average • weights should vary spatially

  9. A spatial web solution Query/analysis environment DEM data DLG data Spot height data

  10. The areal interpolation problem • Given attributes of a set of reporting zones • e.g. counties • Estimate attributes of a set of incompatible reporting zones • e.g. watersheds • e.g. to integrate county data with watershed data

  11. 1 target zone 4 source zones 15% of B B A C 10% of A D 5% of C 50% of D PopTARGET = 0.10 PopA + 0.15 PopB + 0.05 PopC + 0.50 PopD

  12. Newer methods • co-Kriging • hard data • point observations of variable z • sparse but accurate • soft data • point observations of a covariate y • dense but inaccurate or imperfectly correlated • elevation field used to help interpolate a temperature field • Pycnophylactic interpolation • hard data are areal, e.g. population count • interpolate a field of density • ensure integral over areas equals hard data

  13. Organizing the methods • Assumptions about characteristics of fields • homogeneous over zones • smoothly varying • Dependence on scale and region • A comprehensive solution • offered as a remotely invokable Grid service

  14. The nominal case • c(x) • every point in the plane assigned to a class • the area-class map • maps of land cover, land use, vegetation class, habitat, ownership, county name

  15. Soils Classification Schemes • Petén – Simmons (1959). USDA. • Only data set with soil attributes for Drainage and Fertility • Mexico – Modified FAO/UNESCO (1969) • Belize – Wright (1959). British Honduras Land Use Survey Team.

  16. Empirical comparison • Areas of overlap • Aij = area that is Class i on Map A and Class j on Map B • permute classes to obtain diagonal or block-diagonal matrix • Areas of adjacency • Lij = length of boundary that is Class i on Map A and Class j on Map B

  17. Drainage Finished product

  18. Fertility Finished product

  19. Positional uncertainty • The fundamental item of geographic information <x,z> • uncertainty in x • Geographic location • absolute • stored at point, object, data set level • return absolute position

  20. Measurement of position • Position measured • x = f(m) • Position interpolated • between measured locations • surveyed straight lines • registered images • The inverse function • m = f -1(x)

  21. Errors of position • Location distorted by a vector field • x' = x + (x) • (x) varies smoothly • Database with objects of mixed lineage • different vector fields for each group of objects • lineage may not be apparent • e.g. not all houses share same lineage

  22. Absolute and relative error • Two points x1, x2 • perfect correlation of errors, (x1) = (x2) • no error in distance • zero correlation of errors • maximum error in distance • Absolute error for a single location • measured by (x) • Relative error for pairs of locations • value depends on error correlations

  23. Implications • Most GIS operations involve more than one point • e.g. distance, area measurement, optimum routing • knowledge of error correlations is essential if error is to be propagated into products • joint distributions are needed • statistics such as the confusion matrix provide only marginal distributions

  24. The inverse f -1 • An error is discovered in x • error at x1 is correlated with error at x2 • both errors are attributed to some erroneous measurement m • to determine the effects of correcting x1 on the value of x2 it is necessary to know f and its inverse f -1

  25. Definitions • Coordinate-based GIS • locations represented by x • f, f -1 and m are lost during database creation • Measurement-based GIS • f and m available • x may be determined on the fly • f -1 may be available

  26. Partial correction • The ability to propagate the effects of correcting one location to others • preserving the shapes of buildings and other objects • avoiding sharp displacements in roads and other linear features • Partial correction is impossible in coordinate-based GIS • major expense for large databases

  27. Updating a street database through transactions

  28. The geodetic model • Equator, Poles, Greenwich • Sparse, high-accuracy points • First-order network • Dense, lower-accuracy points • Second-order network • Interpolated positions of even lower accuracy • Locations at each level inherit the errors of their parents

  29. Formalizing measurement-based GIS • Structured as a hierarchy • levels indexed by i • locations at level i denoted by x(i) • locations at level (i+1) derived through equations of the form x(i+1) = f(m,x(i)) • locations at level 0 anchor the tree • locations established independently (GPS but not DGPS) are at level 0

  30. An example • A utility database • Pipe's location is measured at 3 ft from a property boundary • m = {3.0,L} • property at level 3, pipe at level 4 • Property location is later revised or resurveyed • new m = {2.9,L} • effects are propagated to dependent object

  31. Beyond the geodetic model • National database of major highways • 100m uncertainty in position • sufficient for agency • relative accuracies likely higher, e.g. highways are comparatively straight, no sudden 100m offsets • Local agency database • 1m accuracy required • two trees with different anchors

  32. Merging trees • Link with a pseudo-measurement • displacement of 0 • standard error of 100m • revisions of the more accurate anchor can now be inherited by the less accurate tree • but will normally be inconsequential

  33. Conclusions • Almost universal adoption of coordinate-based GIS • assumes it is possible to know location exactly • design precision greatly exceeds actual accuracy • in practice exact location is not knowable • attempts at partial correction lead to unacceptable topological and geometrical distortions

  34. Measurement-based GIS • Retains measurements and derivation functions • may obtain absolute locations on the fly • Supports incremental update and correction • Supports merger of databases with different inheritance hierarchies • Legacy GIS designs are not optimal

More Related