Assessing Geocoding Quality: Florida Registry Experience

This overview examines the quality of geocoded data in the Florida registry, including how geocoding errors were identified and how the registry monitors for problems. It covers the components of geocoding quality (match rate, precision, and accuracy) and their impact on data analysis.

andreah

Presentation Transcript


  1. Assessing Quality of Geocoded Data: The Florida Registry Experience

  2. Overview • What is geocoding quality? • Florida’s geocoding experience • Identifying geocoding errors • Results: before and after improved geocoding • Monitoring for geocoding problems

  3. What is Geocoding? • Spatially enable • Assign geocode • Latitude/Longitude • FIPS—Census Units • Match address to street file • Batch (automated) • Interactive (manual 5-10%)

  4. Geocoding Quality Components • Match rate • Coverage, % with spatial location • Precision • Scale • County center versus census block • NAACCR Items #366, #364, #365: GIS Coordinate Quality, Census Tract Certainty • Accuracy • Correct location
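
The first two components above (match rate and precision) are straightforward to tabulate from a geocoded file. The following is a minimal sketch, not the registry's actual procedure: the record layout, the `precision` labels, and the function name `summarize_quality` are all assumptions made for illustration.

```python
from collections import Counter


def summarize_quality(records):
    """Return (match rate, counts by precision level).

    Each record is a dict with a hypothetical 'precision' field that is
    None when the address could not be geocoded at all.
    """
    total = len(records)
    matched = [r for r in records if r["precision"] is not None]
    match_rate = len(matched) / total if total else 0.0
    by_precision = Counter(r["precision"] for r in matched)
    return match_rate, dict(by_precision)


# Illustrative records only; these are not registry data.
records = [
    {"id": 1, "precision": "street"},
    {"id": 2, "precision": "parcel"},
    {"id": 3, "precision": "zip_centroid"},
    {"id": 4, "precision": None},
]

match_rate, by_precision = summarize_quality(records)
print(f"Match rate: {match_rate:.0%}")
print(by_precision)
```

Note that this measures coverage and scale only. Accuracy (whether the assigned location is correct) cannot be read off the geocoder's own output and needs an external check such as a gold-standard file.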

  5. Geocoding Match • Software • Deterministic, Probabilistic • Parsing algorithm, Assumptions (ties) • “Black box” • Underlying street files • Quality of address data • Batch versus manual • Example: “133 NE 2nd, Miami, FL” • Did you mean: 133 NE 2nd St, Miami; 133 SE 2nd Ave, Miami; 133 NW 2nd Ave, Miami; 133 SW 2nd St, Miami; 133 SW 2nd Ave, Miami; 133 SE 2nd St, Miami

  6. Geocoding Precision • Parcel match • “gold standard” • Match to building footprint • Street level match • Most common • Interpolate along street segment • Centroid • Center of polygon • Block, tract, zipcode, county • Population center, physical
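
"Interpolate along street segment" can be illustrated with simple linear interpolation between the two ends of a segment's address range. This is a sketch under stated assumptions: the function name, the argument layout, and the coordinates are hypothetical, and production geocoders typically also handle odd/even street sides, setbacks from the centerline, and curved segments, none of which is modeled here.

```python
def interpolate_address(house_number, range_start, range_end, start_point, end_point):
    """Estimate (lon, lat) for a house number on a straight street segment.

    range_start/range_end are the address numbers at the two ends of the
    segment; start_point/end_point are the matching (lon, lat) tuples.
    """
    if range_end == range_start:
        fraction = 0.5  # degenerate range: fall back to the segment midpoint
    else:
        fraction = (house_number - range_start) / (range_end - range_start)
    # Clamp so an out-of-range number cannot land beyond the segment ends.
    fraction = min(max(fraction, 0.0), 1.0)
    lon = start_point[0] + fraction * (end_point[0] - start_point[0])
    lat = start_point[1] + fraction * (end_point[1] - start_point[1])
    return lon, lat


# Illustrative segment: numbers 100-200 running east along a fixed latitude.
lon, lat = interpolate_address(150, 100, 200, (-80.1930, 25.7750), (-80.1920, 25.7750))
print(round(lon, 4), round(lat, 4))
```

Because the position is inferred from the address range rather than observed, a street-level match can be offset from the true building location, which is why the parcel match is listed above as the "gold standard".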

  7. Geocoding Accuracy

  8. FCDS Geocoding • Proprietary, local vendor • Problems found via use • Reported county does not match geocoded county • Representativeness of cases • Cases assigned to invalid or zero population block groups • Problems found via scrutiny • Cases in nautical areas (not islands) • Vendor assumptions

  9. Geocoding Project • Test file • Created “gold standard” files • FIPS (cancer cases) • Long/Lat (well locations) • Selected a vendor • Based on logistics rather than quality • New vendor re-geocoded entire registry • Compared results: before and after

  10. Old versus Improved Vendor: County Match Problem

  11. Old versus Improved Vendor: County Match Problem

  12. Old versus Improved Vendor: % Matched

  13. Old versus Improved Vendor: Representativeness of Cases • Environmental Health re-geocoded our data • Census Data: 96% Black • Old Geocoding Vendor: 15% Black Cases • New Geocoding Vendor: 85% Black Cases

  14. Old versus Improved Vendor: Nautical, Invalid, Zero pop • Cases assigned to the sea • 0 cases from new vendor • Cases assigned to invalid block groups • 0 cases from new vendor • Cases assigned to 0, 1, 10 population block groups • 5,765 cases • 743 cases (3+ more years of data) • SF1 vs. SF3; Overlay

  15. Specificity? [Figure panels: Old Data; Improved Data]

  16. Sensitivity? [Figure panels: Old Data; Improved Data]

  17. Validity? Oral Cancer by SES • Old Data: Wealthy 37.3 (ref); Mid High 40.1 (RR 1.08); Mid Low 45.4 (RR 1.22); Poorest 49.2 (RR 1.32) • New Data: Wealthy 34.0 (ref); Mid High 36.6 (RR 1.08); Mid Low 39.1 (RR 1.15); Poorest 46.3 (RR 1.36)
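
The RR values on the Validity slide are consistent with each group's rate divided by the Wealthy (reference) rate. The sketch below recomputes them from the rates as listed; the function name `rate_ratios` is hypothetical, the two series are labeled only by the order in which the slide lists them, and the slide does not state the units of the rates.

```python
def rate_ratios(rates, reference="Wealthy"):
    """Divide each group's rate by the reference group's rate, rounded to 2 places."""
    ref_rate = rates[reference]
    return {group: round(rate / ref_rate, 2) for group, rate in rates.items()}


# Rates copied from the slide, in the order the two series appear.
first_listed = {"Wealthy": 37.3, "Mid High": 40.1, "Mid Low": 45.4, "Poorest": 49.2}
second_listed = {"Wealthy": 34.0, "Mid High": 36.6, "Mid Low": 39.1, "Poorest": 46.3}

first_rr = rate_ratios(first_listed)
second_rr = rate_ratios(second_listed)
print(first_rr)
print(second_rr)
```

Both series show a gradient of increasing rate with decreasing SES, with the Poorest-versus-Wealthy ratio moving from 1.32 to 1.36 between the two.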

  18. Monitoring Geocoding Quality • % County match • Florida zipcodes; military addresses geocoded to NJ • % Contiguous counties • Incorrect FIPS • Nautical FIPS • # Zero Pops • Representativeness
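
Several of the monitoring indicators above reduce to simple counts over the geocoded file. The following is a minimal sketch, assuming each case carries a reported county FIPS code, a geocoded county FIPS code, and a geocoded block group ID. The field names, the adjacency table, the block group IDs, and the function name `monitor_geocoding` are hypothetical and are not FCDS's actual checks.

```python
def monitor_geocoding(cases, neighbors, zero_pop_block_groups):
    """Compute % county match, % contiguous-county mismatch, and zero-pop count."""
    n = len(cases)
    exact = 0
    contiguous = 0
    zero_pop = 0
    for case in cases:
        reported = case["reported_county"]
        geocoded = case["geocoded_county"]
        if reported == geocoded:
            exact += 1
        elif geocoded in neighbors.get(reported, set()):
            # Mismatch, but only across a shared county border.
            contiguous += 1
        if case["block_group"] in zero_pop_block_groups:
            zero_pop += 1
    return {
        "pct_county_match": 100.0 * exact / n if n else 0.0,
        "pct_contiguous": 100.0 * contiguous / n if n else 0.0,
        "zero_pop_cases": zero_pop,
    }


# Illustrative inputs only. County codes are 5-digit FIPS strings;
# the block group IDs are made up for this example.
neighbors = {"12086": {"12011", "12021", "12087"}}
zero_pop_block_groups = {"120110009999"}
cases = [
    {"reported_county": "12086", "geocoded_county": "12086", "block_group": "120860001001"},
    {"reported_county": "12086", "geocoded_county": "12011", "block_group": "120110002001"},
    {"reported_county": "12086", "geocoded_county": "34005", "block_group": "340050003001"},
    {"reported_county": "12011", "geocoded_county": "12011", "block_group": "120110009999"},
]

report = monitor_geocoding(cases, neighbors, zero_pop_block_groups)
print(report)
```

Separating contiguous-county mismatches from all other mismatches helps distinguish borderline addresses from gross errors, such as the third case here, a Florida-reported case geocoded to an out-of-state county.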

  19. Impact • Fewer, smaller, lower risk clusters • Greater % ungeocodable • More accurate • Less specific • Ungeocodable • Rural, Poor, Old • Potential bias • Manual geocoding

  20. Addressing Ungeocodables • Address quality? • Implemented edits • Software development • Improve matching algorithm • Specific to our data • Link with administrative databases • DMV, Medicaid, Medicare • Geo-imputation • Kevin Henry • Requires institutional priority!

  21. Acknowledgements • Dr. Greg Kearny • Environmental Health, FL DOH • N. Dean Powell • FCDS • Jackie Button • FCDS • Dr. Monique Hernandez • FCDS • We acknowledge the CDC for financial support under cooperative agreement U58/DP000844 • Contents are the responsibility of the authors and do not represent the views of CDC, FL DOH, or FCDS
