
Visualization Tools Change the Whole Game on Locational Quality

Standardization and Automation. Pat Garvey, Office of Environmental Information.


Presentation Transcript


  1. Visualization Tools Change the Whole Game on Locational Quality: Standardization and Automation • Pat Garvey, Office of Environmental Information

  2. The Game Has Changed in Web Mapping Data Quality • With new web mapping tools, close is not always good enough • Visualization demands a new level of accuracy • Users are not forgiving of “close enough” results

  3. Old Visualization – Just Road Map

  4. Next Improvement - Satellite Imagery

  5. Now – Aerial Photography and Street Level Views

  6. But it is not “real time” • Results from a recent explosion and fire!

  7. The New Game • Pin Point Accuracy • Will anything less be acceptable?

  8. Old Geocoding • The past trend in geocoding was to “geocode” to the street.

  9. New Geocoding • The new trend is to now geocode to the “rooftop” to provide a more accurate location. • Most urban areas now have “rooftop” accuracy.

  10. Need for Good Facility Locations • Visualization – map facilities of environmental interest • Analysis – identify associations between facilities and environmental conditions • What regulated facilities discharge into an impaired (polluted) water? • What air emission stacks are upwind from schools? • What regulated facilities are upstream from drinking water intakes? • What are the socio-economic characteristics of neighborhoods around Superfund sites?

  11. Special Activities • Visualization of Facilities • Envirofacts • TRI Explorer (future) • Cross Media Analysis – TRI/AFS/NEI/PCS/RCRA • RSEI Support (Risk-Screening Environmental Indicators) • RCRA/TRI Waste Management Studies • Homeland Security • 6 Layers (and counting)

  12. March 2006 Locational Data Cleanup Effort This map shows both U.S. and international points that were in the locational database prior to the March 2006 mass cleanup. The mass cleanup removed the international land and sea sites so that only points in the U.S. remained in the database. It also removed ZIP Code centroid points previously entered by contractors.

  13. June 2008 Locational Data Cleanup Effort Here are the results of the cleanup effort!

  14. Why a New QA Approach is Needed Results from a Zip Code Search

  15. Steps to Improvement • Standardize – Identify trusted sources/standards and adopt them • Trust but Verify – Take data from all known sources but perform “quality” checks on everything • Automate Processes • With over 2.5 million facilities, automated verification is needed • Establish Feedback Loops

  16. Facility Locations • Facilities can have two types of location: Address and Geographic Coordinate(s) • A common way to represent an Address is: • Street Address (House number, Street Name, Street Type, Direction) • City (really post office) • State • Postal Code (5 or 9 digit Zip Code) • A common way to represent a Geographic Coordinate is with latitude and longitude: • Latitude – angle, measured at the center of the earth, between the equatorial plane and the point on the earth’s surface (think north/south) • Longitude – angle, measured at the center of the earth, between the prime meridian plane (through Greenwich, UK) and the point on the earth’s surface (think east/west)
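The two location types on this slide can be sketched as a small data structure. This is only an illustration: the class and field names below are assumptions made for the example, not the actual FRS schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Address:
    street: str       # house number, street name, street type, direction
    city: str         # really the post office name
    state: str        # two-letter state abbreviation
    postal_code: str  # 5 or 9 digit Zip Code


@dataclass(frozen=True)
class Coordinate:
    latitude: float   # degrees, north positive / south negative
    longitude: float  # degrees, east positive / west negative

    def __post_init__(self) -> None:
        # Reject values that cannot be a latitude or longitude at all.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")


@dataclass(frozen=True)
class FacilityLocation:
    # A facility may have an address, a coordinate, or both.
    address: Optional[Address] = None
    coordinate: Optional[Coordinate] = None
```

Keeping the range check inside `Coordinate` means an obviously impossible value (for example, a latitude of 138.5) is rejected at the point of entry.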

  17. Address Problems • Inconsistent Address Formats (123 S Main St versus 123 South Main Street) • Misspelled Cities, Street Names • Bad Zip Codes • Post office versus locality (Orlando example) • Missing data • Multiple accepted cities for a single address (example Boston and Andover) • Address may change over time • Change in Zip Code • County annexations by City

  18. Address Solution • Need a definitive source for any valid street address: city/post office name(s), street name and type, zip code, county name, congressional district • With a valid street address and city, validate/find the zip; • conversely, with a valid street address and zip, validate/find the city. • Solution: USPS ZIP 4 Dataset (updated monthly – cost $900/year) • List of all addresses in the country • For each address, provides preferred and alternate post offices, 9 digit zip code, county, congressional district
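The two-way lookup described on this slide (street plus city gives the zip; street plus zip gives the city) can be sketched with two dictionaries. The rows below are invented placeholders and the row layout is an assumption; the real ZIP 4 dataset has its own format, which is not reproduced here.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical rows: (street, zip5, preferred post office, alternate post offices)
Row = Tuple[str, str, str, Tuple[str, ...]]

SAMPLE_ROWS = [
    ("100 EXAMPLE ST", "00001", "ALPHAVILLE", ("BETATOWN",)),
    ("200 SAMPLE AVE", "00002", "BETATOWN", ()),
]


def build_indexes(rows: Iterable[Row]):
    """Build (street, city) -> zip and (street, zip) -> preferred city maps."""
    by_street_city: Dict[Tuple[str, str], str] = {}
    by_street_zip: Dict[Tuple[str, str], str] = {}
    for street, zip5, preferred, alternates in rows:
        by_street_zip[(street, zip5)] = preferred
        # Both preferred and alternate post offices are accepted city names.
        for city in (preferred, *alternates):
            by_street_city[(street, city)] = zip5
    return by_street_city, by_street_zip


def find_zip(street: str, city: str, by_street_city) -> Optional[str]:
    """Return the zip for a street and city, or None if the pair is unknown."""
    return by_street_city.get((street, city))


def find_city(street: str, zip5: str, by_street_zip) -> Optional[str]:
    """Return the preferred post office for a street and zip, or None."""
    return by_street_zip.get((street, zip5))
```

A lookup that returns `None` flags the address for review, and an alternate post office name still resolves to the same zip.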

  19. Address Formatting Solution • Need a method/procedure to consistently format a street address • Solution: USPS Certified Address Parser (software) – cost $1,200/year • Corrects commonly misspelled cities, streets • Provides standardized address formats • Parses house number, pre and post directions, street name, street type, post office name, state and zip code into separate elements • Validates address against ZIP 4 dataset • Some addresses are too bad to fix! • Spelling is just too bad • Not a real address (state-wide permits)
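To show the kind of standardization involved, here is a toy normalizer that maps the two spellings from the Address Problems slide to one form. It is not the USPS certified parser; the abbreviation table and function name are assumptions for this sketch, and a real parser handles far more cases (for example, a street actually named "South Street").

```python
import re

# Small, illustrative abbreviation table (an assumption, not a USPS list).
ABBREVIATIONS = {
    "NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W",
    "STREET": "ST", "AVENUE": "AVE", "ROAD": "RD",
}


def normalize_street(street: str) -> str:
    """Upper-case, drop punctuation, and abbreviate common words."""
    cleaned = re.sub(r"[^\w\s]", " ", street.upper())
    return " ".join(ABBREVIATIONS.get(token, token) for token in cleaned.split())


print(normalize_street("123 South Main Street"))  # 123 S MAIN ST
print(normalize_street("123 S Main St."))         # 123 S MAIN ST
```

Once both spellings reduce to the same string, duplicate facility records can be compared directly.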

  20. Bad Addresses • Almost ½ of all Facility addresses have problems (examples) • Chisolm Feeders, Not Available, KS 67073 • Tri Cities Termite, Unknown, Dover, DE 19091

  21. Geographic Coordinate Problems • Different formats (DMS, DD) • Different Datums (NAD27, NAD83, WGS84) • Different Collection Methods with differing accuracies • Missing metadata • Data transcription errors (one degree ≈ 111 km at the equator)
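The format difference and the cost of a transcription error can both be made concrete. The sketch below converts degrees/minutes/seconds (DMS) to decimal degrees (DD) and estimates how far a point moves when a degree value is mistyped. It uses the slide's 111 km per degree figure and a simple spherical approximation for east/west distances; the function names are assumptions for this example.

```python
import math

KM_PER_DEGREE = 111.0  # approximate figure quoted on the slide


def dms_to_dd(degrees: int, minutes: int, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in [0, 60)")
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    # South and west are negative by convention.
    return -value if hemisphere.upper() in ("S", "W") else value


def north_south_error_km(degree_error: float) -> float:
    """Distance moved by an error in latitude."""
    return KM_PER_DEGREE * abs(degree_error)


def east_west_error_km(degree_error: float, latitude: float) -> float:
    """Distance moved by an error in longitude; shrinks toward the poles."""
    return KM_PER_DEGREE * abs(degree_error) * math.cos(math.radians(latitude))
```

For example, `dms_to_dd(38, 30, 0, "N")` gives 38.5, and a single mistyped degree of latitude moves a facility about 111 km.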

  22. Geographic Coordinate Solutions • Standardize on one format (DMS) • Standardize on one datum (NAD83) • Use Federal Standard NADCON grids of verified locations to convert between NAD27 and NAD83 (more accurate than mathematical datum transformations) • Verify state or EPA program provided geographic coordinates against facility address • Use Tele Atlas and 2008 TIGER spatial layers to spatially “look up” the state, county, zip code, post office and verify against standardized address. (In the future this will use Google reverse geocoding)
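The spatial “look up” step can be sketched as a point-in-polygon test: find which zip code polygon contains the coordinate, then compare it with the zip on the standardized address. The ray-casting test below is a standard technique swapped in for illustration; the polygons are invented squares, not Tele Atlas or TIGER data, and the function names are assumptions.

```python
from typing import Dict, List, Optional, Tuple

Polygon = List[Tuple[float, float]]  # list of (longitude, latitude) vertices


def point_in_polygon(lon: float, lat: float, polygon: Polygon) -> bool:
    """Ray casting: count edge crossings of a ray running east from the point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge straddles the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def spatial_lookup(lon: float, lat: float, layer: Dict[str, Polygon]) -> Optional[str]:
    """Return the key of the first polygon containing the point, if any."""
    for key, polygon in layer.items():
        if point_in_polygon(lon, lat, polygon):
            return key
    return None


def coordinate_matches_zip(lon: float, lat: float, zip5: str,
                           layer: Dict[str, Polygon]) -> bool:
    """True when the coordinate falls inside the polygon for the address zip."""
    return spatial_lookup(lon, lat, layer) == zip5


# Hypothetical zip code layer made of two adjacent squares.
ZIP_LAYER = {
    "00001": [(-77.0, 38.0), (-76.0, 38.0), (-76.0, 39.0), (-77.0, 39.0)],
    "00002": [(-76.0, 38.0), (-75.0, 38.0), (-75.0, 39.0), (-76.0, 39.0)],
}
```

A mismatch does not say which value is wrong, only that the address and the coordinate disagree and the record needs a closer look.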

  23. Address Geocoding • Geocoding – with a valid street address, geocoding software returns a geographic coordinate along with a “corrected address” • Google Geocoder picked because it was easy to program and because it was free under the EPA/Google license. • Google returns geocoding accuracy (building, address, intersection, street, zip, city, etc.) • We use only geocodes with building, address, intersection, or street accuracy
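The accuracy rule on this slide amounts to a simple filter. In the sketch below the accuracy labels are the words used on the slide, not the actual field values of any Google API, and the result type is an assumption for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Accuracy levels the slide says are accepted.
ACCEPTED_ACCURACY = frozenset({"building", "address", "intersection", "street"})


@dataclass(frozen=True)
class GeocodeResult:
    latitude: float
    longitude: float
    accuracy: str           # e.g. "building", "street", "zip", "city"
    corrected_address: str


def usable_geocodes(results: Iterable[GeocodeResult]) -> List[GeocodeResult]:
    """Keep only results at building, address, intersection, or street accuracy."""
    return [r for r in results if r.accuracy.strip().lower() in ACCEPTED_ACCURACY]
```

Results at zip or city accuracy are dropped because they are centroids of an area, which is the kind of point the earlier cleanup removed.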

  24. Geocoding Problems/Solutions • The geocoder is too optimistic; when confused, it sometimes returns addresses in San Francisco • “Corrected” address is verified against the ZIP 4 dataset • FRS geocoded addresses set a minimum standard for a facility geographic coordinate – they are mixed in with EPA and State provided coordinates to determine the “best” coordinate based on metadata and accuracy.
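One way to picture the “best” coordinate selection is as a ranking over candidate coordinates. The slides do not give the actual FRS rules, so the ranking below (smallest stated accuracy wins, candidates with missing accuracy metadata rank last) is purely an assumption for illustration, as are the names.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    source: str                  # e.g. "FRS geocode", "state", "EPA program"
    latitude: float
    longitude: float
    accuracy_m: Optional[float]  # stated horizontal accuracy; None if metadata is missing


def best_pick(candidates: List[Candidate]) -> Optional[Candidate]:
    """Pick the candidate with the smallest stated accuracy (assumed rule)."""
    def rank(candidate: Candidate) -> float:
        # Missing metadata ranks behind every candidate with a stated accuracy.
        return math.inf if candidate.accuracy_m is None else candidate.accuracy_m

    return min(candidates, key=rank, default=None)
```

Under this rule the FRS geocode acts as the floor: a program coordinate replaces it only if it documents a better accuracy.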

  25. Automated Processes • Current FRS processes modified to identify changes in either address or geographic coordinates. • Modifications in address kick off address validation/standardization and new geocode • Modifications in a program geographic coordinate kick off geographic validation • Geographic validation incorporated into “best pick” algorithm
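The change-driven workflow on this slide can be sketched as a comparison between the old and new versions of a record. The record type and the action names are hypothetical; they stand in for whatever the FRS processes actually call these steps.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FacilityRecord:
    address: str
    coordinate: Optional[Tuple[float, float]]  # (latitude, longitude) from a program


def actions_for_change(old: FacilityRecord, new: FacilityRecord) -> List[str]:
    """List the follow-up steps triggered by a change to a facility record."""
    actions: List[str] = []
    if old.address != new.address:
        # An address change restarts validation/standardization and geocoding.
        actions += ["validate_address", "geocode_address"]
    if old.coordinate != new.coordinate:
        actions.append("validate_coordinate")
    if actions:
        # Any change can alter which coordinate is the best one.
        actions.append("rerun_best_pick")
    return actions
```

An unchanged record produces an empty list, so only edited records are reprocessed.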

  26. Final QA Process

  27. Results of QA

  28. Future Improvements • Provide Feedback to data providers • Provide Regional Data Stewards with QA findings on problems in the addresses and geographic coordinates they supplied • Provide Feedback to the public • Attach QA findings to all latitude/longitude (L/L) coordinates • Use corrected address values
