Preserving Geospatial Data at Sub-National Level: Challenges and Solutions

Explore geospatial data preservation challenges at the state level using the North Carolina experience. Learn about risks, value in older data, and potential solutions for sustainable data archiving.

Presentation Transcript


  1. Geospatial Data Preservation Challenges at the Sub-National Level: The North Carolina Experience • Steve Morris, Head of Digital Library Initiatives, North Carolina State University Libraries • Cambridge Conference, July 18, 2007

  2. Outline • Project background • Targeted geospatial content • Risks to data • Value in older data • Challenges (Technical and organizational) • Solutions (?) • Next steps

  3. NC Geospatial Data Archiving Project • Partnership between university library (NCSU) and NC Center for Geographic Information & Analysis • Part of the Library of Congress National Digital Information Infrastructure and Preservation Program (NDIIPP) • Focus on state and local geospatial content in North Carolina (state demonstration) • Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories • Objective: engage existing state/federal geospatial data infrastructures in preservation • Serve as catalyst for discussion within industry

  4. NCGDAP Goals • Repository Goal • Capture at-risk data • Explore technical and organizational challenges • Project End Goal • Data Producers: Improved temporal data management practices • Archives: More efficient means of acquiring and preserving data; Progress towards best practices • Temporal data management vs. long-term preservation

  5. Collection Focus: State and Local Government Geospatial Data • 96 of 100 North Carolina Counties have GIS systems as do many municipalities • Over 30 state agency data producers • Exceptional value • Detailed, current, accurate • Exceptional risk • Inconsistent or nonexistent archiving practices • Complicated formats and complex objects Source: NC OneMap

  6. Carrboro, NC: Population 17,797 (2005 est.) • 22 downloadable GIS data layers • 10 web mapping applications • 3 OGC WMS services (web services) • 9 downloadable PDF map layers

  7. NCGDAP Data Types – Vector GIS • County, municipal, state • Detailed, accurate, current • Frequently updated • Cadastral (tax parcels) • Street centerlines • Zoning • Topographic contours • School, sheriff, fire • Voting precincts • More …

  8. NCGDAP Data Types – Digital Orthophotography • All 100 NC counties with orthos • 1-5 flight years per county • 30-300 GB per flight

  9. NCGDAP Data Types – Cartographic • GIS Software • Software project file (.mxd, .apr, …) • Data layer file (.avl, .lyr, …) • PDF map exports • Web Services-based representations Note: Percentages based on the actual number of respondents to each question

  10. Other Data Types – Place-based Data • Oblique Imagery • Mobile, LBS, and social networking applications • Long-term cultural heritage value in non-overhead imagery: more descriptive of place and function • Street View Images • Tax Dept. Photos • Road Videologs

  11. Digital Preservation Points of Failure • Data is not saved, or … • can’t be found, or … • media is obsolete, or … • media is corrupt, or … • format is obsolete, or … • file is corrupt, or … • meaning is lost • Solutions: Migration, Emulation, Encapsulation, XML
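
The "file is corrupt" failure point above is commonly caught with checksum (fixity) checks, which the slide itself does not describe. The following is a minimal Python sketch under that assumption: the function names `sha256_of`, `build_manifest`, and `find_problems` are hypothetical, and the manifest is simply an in-memory dictionary rather than any format used by NCGDAP.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_problems(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Compare files on disk against a stored manifest.

    Returns {relative_path: "missing" or "changed"} for every mismatch.
    """
    problems = {}
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            problems[rel_path] = "missing"
        elif sha256_of(target) != expected:
            problems[rel_path] = "changed"
    return problems
```

A manifest built at ingest time and re-checked on a schedule would flag both missing and silently altered files; it says nothing about whether the format is still readable, which is a separate failure point.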

  12. Risks to Geospatial Data • Producer focus on current data • Data overwrite as common practice • Future support of data formats in question • No open, supported format for vector data • Shift to web services-based access • Data becoming more ephemeral • Inadequate or nonexistent metadata • Impedes discovery and use • Increasing use of spatial databases for data management • The whole is greater than the sum of the parts
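
One simple alternative to the data overwrite practice listed above is to copy the current dataset into a dated folder before each update. The sketch below is only an illustration of that idea, not a procedure described in the presentation; the function name `snapshot` and the `archive_root/YYYY-MM-DD/` layout are assumptions.

```python
import shutil
from datetime import date
from pathlib import Path


def snapshot(current: Path, archive_root: Path, when: date) -> Path:
    """Copy the current dataset directory into archive_root/YYYY-MM-DD/.

    Refuses to overwrite an existing snapshot for the same date, so an
    archived copy is never replaced by a later state of the data.
    """
    target = archive_root / when.isoformat() / current.name
    if target.exists():
        raise FileExistsError(f"snapshot already exists: {target}")
    shutil.copytree(current, target)  # creates missing parent directories
    return target
```

For example, `snapshot(Path("parcels"), Path("archive"), date.today())` would leave the working `parcels` directory free to be updated while the dated copy stays untouched. A file copy like this does not capture a spatial database as a whole, which is the harder case the slide points to.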

  13. Value in Older Data: Solving Business Problems • Land use change analysis • Site location analysis • Real estate trends analysis • Disaster response • Resolution of legal challenges • Impervious surface maps [Figure: Suburban development, 1993/2002, near Mecklenburg-Cabarrus County border]

  14. Value in Older Data: Cultural Heritage Future uses of data are difficult to anticipate (as with Sanborn Maps)

  15. Challenge: Vector Data Formats • No widely-supported, open vector formats for geospatial data • Spatial Data Transfer Standard (SDTS) not widely supported • Geography Markup Language (GML) – diversity of application schemas and profiles a challenge for “permanent access” • Spatial Databases • The whole is more than the sum of the parts, and the whole is very difficult to preserve • Can export individual data layers for curation, but relationships and context are lost • Some thinking of using the spatial database as the primary archival platform

  16. Challenge: Cartographic Representation Counterpart to the map is not just the dataset but also models, symbolization, classification, annotation, etc.

  17. Challenge: Geospatial Web Services • How to capture records from decision-making processes? • Possible: Atlas collections from automated image capture • Web 2.0 impact: Emerging tiling and caching schemes (archive target?)
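
The "atlas collections from automated image capture" idea above could be approached by splitting an area of interest into page extents and requesting one map image per page from a WMS. The sketch below only builds the request URLs, using the standard WMS 1.1.1 GetMap parameters; the endpoint `https://example.org/wms`, the layer name, and the function names are hypothetical, and the slide does not say how such capture would actually be done.

```python
from urllib.parse import urlencode


def getmap_url(base_url, layer, bbox, width=1024, height=1024):
    """Build a WMS 1.1.1 GetMap request URL for one bounding box.

    bbox is (minx, miny, maxx, maxy) in EPSG:4326 (longitude, latitude).
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the default style
        "SRS": "EPSG:4326",
        "BBOX": ",".join(f"{v:.6f}" for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


def atlas_pages(bbox, rows, cols):
    """Split (minx, miny, maxx, maxy) into rows x cols page extents."""
    minx, miny, maxx, maxy = bbox
    dx = (maxx - minx) / cols
    dy = (maxy - miny) / rows
    return [
        (minx + c * dx, miny + r * dy, minx + (c + 1) * dx, miny + (r + 1) * dy)
        for r in range(rows)
        for c in range(cols)
    ]


# Hypothetical endpoint and layer name, for illustration only.
urls = [
    getmap_url("https://example.org/wms", "parcels", page)
    for page in atlas_pages((-79.1, 35.9, -79.0, 36.0), rows=2, cols=2)
]
```

Images captured this way record only the rendered map at one moment; the underlying data and the service's styling rules are not preserved.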

  18. Challenge: Preservation Metadata Results from a 2006 survey of all 100 NC counties and 25 largest NC municipalities

  19. Challenge: Data Capture • 2006 Frequency of Capture Survey targeting North Carolina counties and municipalities • Response: yes = 65.3%, no = 34.7%* (out of 57.6% response rate)
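
To make the percentages above concrete, the short calculation below converts them to approximate counts. It assumes this is the same 2006 survey mentioned on the previous slide, sent to 125 agencies (100 counties plus the 25 largest municipalities); the slide does not state the sample size, so the counts are an inference rather than reported figures.

```python
surveyed = 100 + 25      # assumption: 100 counties + 25 largest municipalities
response_rate = 0.576    # from the slide
yes_share = 0.653        # from the slide, as a share of respondents

respondents = round(surveyed * response_rate)
yes = round(respondents * yes_share)
no = respondents - yes

# Shares recomputed from the inferred counts, as percentages.
yes_pct_of_respondents = round(100 * yes / respondents, 1)
no_pct_of_respondents = round(100 * no / respondents, 1)
yes_pct_of_surveyed = round(100 * yes / surveyed, 1)
```

Under that assumption the recomputed shares match the slide's 65.3% and 34.7%, and the "yes" answers come from a minority of all agencies surveyed, since the percentages are based on respondents only.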

  20. Data Capture Survey Results: Overview • Two-thirds of responding agencies create and retain periodic snapshots • Long-term retention more common in counties with larger populations • Storage environments vary, with servers and CD-ROMs most common • Offsite storage (or both onsite and offsite) is used by nearly half of the respondents • Popularity of historic images has resulted in scanning and geo-referencing of hardcopy aerial photos among one-third of the respondents

  21. Solutions: Content Exchange Infrastructure • Volume of state/federal requests for local data (“contact fatigue”) spurs rethinking of archive strategy for data acquisition • Leveraging more compelling business reasons to put the data in motion (disaster preparedness, highway construction, census, …) • Content exchange networks: • Minimize need to make contact • Add technical, administrative, descriptive metadata • Establish rights and provenance

  22. Informing and Leveraging Other Infrastructure • NC GIS Inventory • Efficient data identification • Adding preservation elements • Orthophoto Data Distribution System • Efficient transfer of large quantities of imagery • NC OneMap Data Download and Viewer • Public access • Data visualization • Street Centerline Data Distribution System • Efficient transfer of data from 100 counties, with metadata and clarified rights

  23. Solutions: Engaging Standards Efforts • Partnered with EDINA (UK) and NARA to approach the Open Geospatial Consortium (OGC) in 2005-2006 • Working Group charter approved by OGC Technical Committee plenary Dec. 2006

  24. Points of Engagement with the OGC • GML for archiving • Geo Rights Management – adding archive use cases • Content packaging • Saving data state in web services interactions • Content replication (OGC/Open Grid Forum talks) • Persistent identifiers • Data versioning (metadata and catalog support) • Cartographic representation • Cross-fertilize between library/archives and geospatial communities

  25. Role of Commercial Data Providers Project Status Cultivating a commercial market for older data. Part of “permanent access” is marketing, advertising, and putting older data into the path of the user

  26. Signs of Hope • Software vendors are more keenly aware of temporal data management as a customer problem • Consulting firms increasingly see temporal data management and archiving as a business opportunity • Innovative practices emerging at local and state level to complement and inform national level activities • Viral adoption of archiving practices vs. mandated archiving practices: which will have more effect?

  27. Next Steps • Technical • Refining repository ingest workflow (currently using DSpace) • Further investigation into use of METS (Metadata Encoding and Transmission Standard) and PREMIS (PREservation Metadata: Implementation Strategies) • Content exchange tests with other organizations • Organizational • OGC Data Preservation Working Group • Engaging State Archives: Local records outreach and records retention practices • Work towards formulating best practices for data capture by local agencies • Content exchange networks

  28. Questions? • Steve Morris, Head, Digital Library Initiatives, NCSU Libraries • ph: (919) 515-1361 • Steven_Morris@ncsu.edu • http://www.lib.ncsu.edu/ncgdap

  29. Note: Percentages based on the actual number of respondents to each question