Explore digital geospatial data preservation issues, technical solutions, and organizational/cultural approaches in the NCGDAP project, a partnership that aims to safeguard state and local geospatial content in North Carolina.
Preservation Strategies in the North Carolina Geospatial Data Archiving Project (NCGDAP) NCSU Libraries Steve Morris, Head of Digital Library Initiatives Digital Preservation in State Government: Best Practices Exchange 2006
Overview • Digital geospatial data preservation issues • Technical solutions • Organizational/cultural solutions Note: Percentages based on the actual number of respondents to each question
NC Geospatial Data Archiving Project • Partnership between university library (NCSU) and state agency (NCCGIA), with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP) • One of 8 initial NDIIPP partnerships (only state project) • Focus on state and local geospatial content in North Carolina (state demonstration) • Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories • Objective: engage existing state/federal geospatial data infrastructures in preservation
Targeted Content • Resource Types • GIS “vector” data • Digital orthophotography • Digital maps • Tabular data • Content Producers • Mostly state, local, regional • Some university, commercial • Selected local federal projects
Today’s geospatial data as tomorrow’s cultural heritage Future uses of data are difficult to anticipate (as with Sanborn Maps).
Risks to Digital Geospatial Data • Producer focus on current data • Time-versioned content generally not archived • Future support of data formats in question • Vast and complex range of data formats in use • Shift to “streaming data” for access • Archives have been a by-product of providing access • Preservation metadata requirements • Descriptive, administrative, technical, DRM • Geodatabases • Complex functionality
Different Ways to Approach Preservation • Technical solutions: How do we preserve acquired content over the long term? • Cultural/Organizational solutions: How do we make the data more preservable, and more likely to be preserved, at the point of production?
Vector Data Format Options • Option A: use an open format and have a really unfortunate transformation and limited vendor support for the output object • Option B: use closed format but retain the original content and count on short- and medium-term vendor support. • Option C: do both to buy time and look for an open, ASCII solution. (watch GML activity) No sweet spot, just an evolving and changing mix of flawed options that are used in combination.
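Option C (keep the original and also produce an open-format copy) can be sketched as a small manifest that records both files with checksums, so the link between them survives outside any one tool. This is a minimal illustration, not the project's actual packaging; the function names and manifest keys are hypothetical, and the open-format copy is assumed to have been produced by a separate conversion step.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(original: Path, open_copy: Path) -> dict:
    """Describe the retained original and its open-format derivative.

    The key names are illustrative assumptions, not a standard.
    """
    return {
        "original": {"file": original.name, "sha256": sha256_of(original)},
        "open_copy": {
            "file": open_copy.name,
            "sha256": sha256_of(open_copy),
            "derived_from": original.name,
        },
    }
```

Recording a checksum for each file lets a later custodian confirm that neither copy has changed, whichever of the two formats is still readable.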
Preserving Cartographic Representation Counterpart to the map is not just the dataset but also models, symbolization, classification, annotation, etc.
Preserving Geodatabases • Spatial databases in general vs. ESRI Geodatabase “format” • Not just data layers and attributes but also topology, annotation, relationships, behaviors • Growing use of geodatabases by municipal, county agencies • Some looking to Geodatabase as archive platform (in addition to feature class export) • ESRI Geodatabase archiving approaches • Feature Class Export, XML Export, Geodatabase History, File Geodatabase, Geodatabase Replication
Harnessing Geospatial Web Services Image atlases from WMS services? Capturing cartographic representation? Recording records from decision-making processes? Later: data transfer via WFS and GML? Other?
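As a rough illustration of the image-atlas idea, the sketch below splits a bounding box into a grid and builds one WMS 1.1.1 GetMap request URL per cell. It only constructs URLs and makes no network calls. The endpoint, layer name, grid size, and function names are assumptions chosen for the example, not details from the project.

```python
from urllib.parse import urlencode


def getmap_url(base_url, layer, bbox, size=(512, 512), srs="EPSG:4326"):
    """Build a WMS 1.1.1 GetMap URL for one bounding box (minx, miny, maxx, maxy)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": size[0],
        "HEIGHT": size[1],
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)


def atlas_urls(base_url, layer, bbox, rows, cols):
    """Split bbox into a rows x cols grid and return one GetMap URL per cell."""
    minx, miny, maxx, maxy = bbox
    dx = (maxx - minx) / cols
    dy = (maxy - miny) / rows
    urls = []
    for r in range(rows):
        for c in range(cols):
            cell = (minx + c * dx, miny + r * dy,
                    minx + (c + 1) * dx, miny + (r + 1) * dy)
            urls.append(getmap_url(base_url, layer, cell))
    return urls


# Hypothetical endpoint and layer name, for illustration only.
urls = atlas_urls("https://example.org/wms", "orthos", (-84, 34, -78, 36), 2, 3)
```

Images captured this way record how the service rendered the data at the time of capture, which is one way to hold on to cartographic representation that the underlying dataset alone does not carry.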
Project Repository Approach • Interest in how geospatial content interacts with widely available digital repository software • Focus on salient, domain-specific issues • Challenge: remain repository agnostic • Avoid “imprinting” on repository software environment • Preservation package should not be the same as the ingest object of the first environment • Tension between exploiting repository software features vs. becoming software dependent
Organizational/Cultural Approaches • Provide feedback to producer organizations/inform state geospatial infrastructure • Take the data as is, in the manner in which it can be obtained • “Wrangle” and archive data • Note the ‘Project’ in ‘North Carolina Geospatial Data Archiving Project’: the process, the learning experience, and the engagement with industry and infrastructure are more important than the archive
Points of Engagement with Spatial Data Infrastructure • Framework data communities • Snapshot frequency, naming schemes, classification, GML application schemas, format strategies • Metadata standards and outreach • Persistent identifiers, versioning, feedback on metadata quality • Content replication/transfer • For data improvement projects, disaster preparedness, aggregation by regional service providers, … and archives • Where does archiving and preservation fit in?
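To make the snapshot naming point concrete, here is one possible scheme, sketched under stated assumptions: producer, layer, and an ISO 8601 date joined by underscores. The pattern and the function names are hypothetical and are not a convention adopted by the project or by any framework data community.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lowercase the text and collapse runs of other characters into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def snapshot_name(producer: str, layer: str, snapshot_date: date) -> str:
    """Build a snapshot name of the form producer_layer_YYYY-MM-DD (illustrative)."""
    return f"{slug(producer)}_{slug(layer)}_{snapshot_date.isoformat()}"


# Hypothetical producer and layer, for illustration only.
names = [
    snapshot_name("Wake County", "Parcels", date(2006, 3, 1)),
    snapshot_name("Wake County", "Parcels", date(2005, 9, 15)),
]
```

Because the date is written year first, sorting the names as plain text also sorts the snapshots of a layer chronologically, which keeps time-versioned content easy to list without extra metadata.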
Points of Engagement with the Open Geospatial Consortium (OGC) • Geography Markup Language (GML) for archiving (PDF/A version of GML?) • GeoDRM • Adding preservation use cases • Content Packaging • Will there be an industry solution? • Web Map Context Documents • Can we save data state as well as application state? • Content Replication • Is this a layer in the overall architecture? • Persistent Identifiers
Points of Engagement with Industry • Software vendors • Better support for temporal data management • Tools for retrospective data conversion • Web mashup and open source communities • WMS caching schemes • Standard tiling schemes with temporal component? • Data vendors • Cultivate market for older data (scaled pricing?) • Tech transfer on archiving practices?
Project Status Cultivating a market for older data.
Project Status Cultivating tools for retrospective conversion.
Conclusion • Geospatial data is complex, introducing manifold challenges to ingest processes and repository development • Vector data and spatial databases are especially complex • Geospatial data exists in very large quantities and is subject to frequent update • Need to engage industry in the solution • Need to engage point of production
Questions? Contact: Steve Morris Head, Digital Library Initiatives NCSU Libraries Steven_Morris@ncsu.edu Web site: http://www.lib.ncsu.edu/ncgdap/