
Risks to Digital Geospatial Data

Long-Term Preservation of At-Risk Digital Geospatial Data: The North Carolina Geospatial Data Archiving Project. Steve Morris, NCSU Libraries.


Presentation Transcript


  1. Long-Term Preservation of At-Risk Digital Geospatial Data: The North Carolina Geospatial Data Archiving Project. Steve Morris, NCSU Libraries

  2. Risks to Digital Geospatial Data .shp .mif .gml .e00 .dwg .dgn .bsb .bil .sid Note: Percentages based on the actual number of respondents to each question

  3. Risks to Digital Geospatial Data • Producer focus on current data • Also, archiving data does not guarantee “permanent access” • Future support of data formats in question • Need to migrate formats or allow for emulation • Data failure • “Bit rot”, media failure • Preservation metadata requirements • Descriptive, administrative, technical, DRM • Shift to “streaming data” for access
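
The “bit rot” and media-failure risk named in slide 3 is commonly addressed with periodic fixity checks. The slides do not describe a specific mechanism, so the following is only a minimal sketch under assumptions: SHA-256 as the digest, a manifest held as an in-memory dictionary, and the hypothetical helper names `sha256_of`, `build_manifest`, and `audit`.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large rasters need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root with a previously stored manifest."""
    current = build_manifest(root)
    shared = manifest.keys() & current.keys()
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(k for k in shared if manifest[k] != current[k]),
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

Running `audit` on a schedule and treating any non-empty "missing" or "changed" list as a trigger to restore from another copy is one way such a check could be used. Note that a multi-file format such as a shapefile (.shp, .shx, .dbf) is only intact if every component passes.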


  8. Today’s geospatial data as tomorrow’s cultural heritage

  9. Time series – vector data: Parcel Boundary Changes 2001-2004, North Raleigh, NC

  10. Time series – Ortho imagery: Vicinity of Raleigh-Durham International Airport 1993-2002

  11. NC Geospatial Data Archiving Project • Partnership between university library (NCSU) and state GIS agency (NCCGIA) • Focus on state and local geospatial content in North Carolina (state demonstration) • Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventory information • Objective: engage existing state/federal geospatial data infrastructures in preservation

  12. Local Government GIS in NC • Data resources are highly distributed and subject to frequent update • More detailed, current, accurate than federal/state data resources • North Carolina local agency GIS environment • 100 counties, 95 with GIS • 85 counties with high resolution orthophotography • Growing number of municipal systems • Hundreds of millions of dollars investment

  13. NCGDAP Targeted Content • Resource Types • GIS “vector” (point/line/polygon) data • Digital orthophotography • Digital maps • Tabular data (e.g. assessment data) • Content Producers • Mostly state, local, regional agencies • Some university, not-for-profit, commercial • Selected local federal projects

  14. Work plan in a Nutshell • Work from existing data inventories • NC OneMap Data Sharing Agreements as the “blanket”, individual agreements as the “quilt” • Partnership: work with existing geospatial data infrastructures (state and federal) • Technical approach: blend emerging digital library technologies with geospatial technologies • Metadata: METS, FGDC, PREMIS?, GeoDRM? • Repository: DSpace and others

  15. Big Challenges • Format migration paths • Management of data versions over time • Preservation metadata • Harnessing geospatial web services • Preserving cartographic representation • Keeping content repository-agnostic • Preserving spatial databases • More …

  16. Vector Data Format Issues • Vector data much more complicated than image data • ‘Archiving’ vs. ‘Permanent access’ • An ‘open’ pile of XML might make an archive, but if using it requires a team of programmers to do digital archaeology, then it does not provide permanent access • Piles of XML need to be widely understood piles • GML: need widely accepted application schemas (like OSMM?) • The spatial database conundrum • Export feature classes, and lose topology, annotation, relationships, etc. • … or use the spatial database as the primary archival platform (some are now thinking this way)

  17. Managing Time-versioned Content • Many local agency data layers continuously updated • E.g., some county cadastral data updated daily; older versions not generally available • Individual versioned datasets will wander off from the archive • How do users “get current metadata/DRM/object” from a versioned dataset found “in the wild”? • How do we certify concurrency and agreement between the metadata and the data?
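
Slide 17 leaves open how a user holding a stray copy of a versioned dataset would find its current metadata. One possible approach, which is not stated in the slides, is a registry keyed by content digest. In the sketch below every name (`VersionRecord`, `VersionRegistry`, `dataset_id`, `metadata_url`) is hypothetical, and a real dataset would be hashed from its files rather than from an in-memory byte string.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class VersionRecord:
    dataset_id: str    # hypothetical persistent identifier for the data layer
    snapshot: date     # date on which this version was captured
    metadata_url: str  # hypothetical location of this version's metadata


class VersionRegistry:
    """In-memory lookup from content digest to the archived version record."""

    def __init__(self) -> None:
        self._by_digest: dict[str, VersionRecord] = {}
        self._by_dataset: dict[str, list[VersionRecord]] = {}

    def register(self, content: bytes, record: VersionRecord) -> str:
        """Store a version under the SHA-256 digest of its content."""
        digest = hashlib.sha256(content).hexdigest()
        self._by_digest[digest] = record
        self._by_dataset.setdefault(record.dataset_id, []).append(record)
        return digest

    def identify(self, content: bytes) -> Optional[VersionRecord]:
        """Return the record matching a copy found in the wild, if any."""
        return self._by_digest.get(hashlib.sha256(content).hexdigest())

    def latest(self, dataset_id: str) -> Optional[VersionRecord]:
        """Return the most recent registered version of a dataset."""
        versions = self._by_dataset.get(dataset_id, [])
        return max(versions, key=lambda r: r.snapshot, default=None)
```

A matching digest also gives a weak form of the agreement check the slide asks about: the metadata record and the bytes in hand are known to belong to the same snapshot. The lookup only works for unmodified copies; any edit or re-export changes the digest.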

  18. Preservation Metadata Issues • FGDC Metadata • Many flavors, incoming metadata needs processing • Cross-walk elements to PREMIS, MODS? • Metadata wrapper • METS (Metadata Encoding and Transmission Standard) vs. other industry solutions • Need a geospatial industry solution for the ‘METS-like problem’ • GeoDRM a likely trigger: a wrapper to enforce licensing (MPEG-21 references in OGC Web Services 3)

  19. Harnessing Geospatial Web Services

  20. Harnessing Geospatial Web Services • Automated content identification • ‘capabilities files,’ registries, catalog services • WMS (Web Map Service) for batch extraction of image atlases? • last ditch capture option • preserve cartographic representation • retain records of decision-making process • … feature services (WFS) later. • Rights issues in the web services space are ambiguous … GeoDRM in development
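
The batch extraction of image atlases through WMS mentioned in slide 20 could look roughly like the sketch below, which splits an extent into a grid and builds one WMS 1.1.1 GetMap request per tile. The endpoint `https://example.org/wms`, the layer name `orthos_2002`, the extent, and the helper names are all placeholders, not details from the presentation. Fetching and storing the images is left out.

```python
from typing import Iterator
from urllib.parse import urlencode

BBox = tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def tile_bboxes(minx: float, miny: float, maxx: float, maxy: float,
                cols: int, rows: int) -> Iterator[BBox]:
    """Split an extent into a cols x rows grid, row by row from the south."""
    dx = (maxx - minx) / cols
    dy = (maxy - miny) / rows
    for r in range(rows):
        for c in range(cols):
            yield (minx + c * dx, miny + r * dy,
                   minx + (c + 1) * dx, miny + (r + 1) * dy)


def getmap_url(endpoint: str, layer: str, bbox: BBox,
               srs: str = "EPSG:4326", size: tuple[int, int] = (1024, 1024),
               fmt: str = "image/png") -> str:
    """Build a WMS 1.1.1 GetMap URL for one tile."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(f"{v:.6f}" for v in bbox),
        "WIDTH": size[0],
        "HEIGHT": size[1],
        "FORMAT": fmt,
    }
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint, layer, and extent: a 2 x 2 atlas of one-degree square.
urls = [
    getmap_url("https://example.org/wms", "orthos_2002", bbox)
    for bbox in tile_bboxes(-79.0, 35.0, -78.0, 36.0, cols=2, rows=2)
]
```

Because GetMap returns a rendered image rather than the underlying features, this captures the cartographic representation but not the data, which is consistent with the slide's framing of WMS as a last-ditch capture option. Keeping the request URLs alongside the images would also record how each tile was produced.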

  21. Preserving Cartographic Representation

  22. Preserving Cartographic Representation • The true counterpart of the old map is not the GIS dataset, but rather the cartographic representation that builds on that data: • Intellectual choices about symbolization, layer combinations • Data models, analysis, annotations • Cartographic representation typically encoded in proprietary files (.avl, .lyr, .apr, .mxd) that do not lend themselves well to migration • Symbologies have meaning to particular communities at particular points in time; preserving information about symbol sets and their meaning is a different problem

  23. Repository Architecture Issues • Interest in how geospatial content interacts with widely available digital repository software • Focus on salient, domain-specific issues • Challenge: remain repository agnostic • Avoid “imprinting” on repository software environment • Preservation package should not be the same as the ingest object of the first environment • Tension between exploiting repository software features vs. becoming software dependent

  24. Preserving Spatial Databases • Spatial databases in general vs. ESRI Geodatabase “format” • Not just data layers and attributes, but also topology, annotation, relationships, behaviors • ESRI Geodatabase archival issues • XML Export, Geodatabase History, File Geodatabase, Geodatabase Replication • Growing use of geodatabases by municipal, county agencies • Some looking to Geodatabase as archival platform (in addition to feature class export)

  25. NCGDAP Philosophy of Engagement • Provide feedback to producer organizations/inform state geospatial infrastructure • Take the data in the manner in which it can be obtained • “Wrangle” and archive data • Note the ‘Project’ in ‘North Carolina Geospatial Data Archiving Project’: the process, the learning experience, and the engagement with geospatial data infrastructures are more important than the archive

  26. Questions? Contact: Steve Morris, Head of Digital Library Initiatives, NCSU Libraries, Steven_Morris@ncsu.edu, Phone: (919) 515-1361, NCGDAP website: http://www.lib.ncsu.edu/ncgdap/
