Collection Building Processes within the North Carolina Geospatial Data Archiving Project (NCGDAP)
NCSU Libraries
Steve Morris, Head of Digital Library Initiatives
Digital Preservation in State Government: Best Practices Exchange 2006
Overview
• Project context and targeted content
• Engaging spatial data infrastructure
• Data exchange methods
• Rights issues
NC Geospatial Data Archiving Project
• Partnership between university library (NCSU) and state agency (NCCGIA), with Library of Congress under the National Digital Information Infrastructure and Preservation Program (NDIIPP)
• One of 8 initial NDIIPP partnerships (only state project)
• Focus on state and local geospatial content in North Carolina (state demonstration)
• Tied to NC OneMap initiative, which provides for seamless access to data, metadata, and inventories
• Objective: engage existing state/federal geospatial data infrastructures in preservation
Scale of the Problem
• County Digital Orthophotos
  • 88 counties with estimated 154 flights by 2006
  • Estimated 30 GB/flight – 4.6 TB total
• County, City, COG Vector Data
  • Variable mix of layers; some continuous update
  • 92 of 100 counties with GIS systems
  • 51 municipalities with GIS systems
• State Agency Data
  • 1993 and 1998 statewide orthos – 800 GB
  • Terabytes of vector data and other imagery
  • 17-20 TB of LIDAR data
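The orthophoto total on this slide follows from simple arithmetic on the two estimates given (154 flights, roughly 30 GB per flight). A minimal sketch of that calculation is below; the function name and the choice between decimal and binary terabytes are assumptions made for illustration, since the slide does not say which unit convention it uses.

```python
# Estimates taken from the slide; everything else here is illustrative.
FLIGHTS = 154        # estimated county orthophoto flights by 2006
GB_PER_FLIGHT = 30   # estimated size of one flight, in GB


def total_terabytes(flights: int, gb_per_flight: float, gb_per_tb: int = 1000) -> float:
    """Return the total data volume in TB for the given number of flights."""
    return flights * gb_per_flight / gb_per_tb


if __name__ == "__main__":
    # Decimal terabytes (1 TB = 1000 GB)
    print(f"{total_terabytes(FLIGHTS, GB_PER_FLIGHT):.1f} TB")
    # Binary terabytes (1 TB = 1024 GB), shown for comparison
    print(f"{total_terabytes(FLIGHTS, GB_PER_FLIGHT, 1024):.1f} TB")
```

Either convention lands close to the 4.6 TB figure quoted on the slide.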
Earlier NCSU Acquisition Efforts
• NCSU University Extension project 2000-2001
  • Target: County/city data in eastern NC
  • “Digital rescue” not “digital preservation”
• Project learning outcomes
  • Confirmed concerns about long term access
  • Need for efficient inventory/acquisition
  • Wide range in rights/licensing
  • Need to work within statewide infrastructure
  • Acquired experience; unanticipated collaboration
What is Spatial Data Infrastructure?
• National Spatial Data Infrastructure
  • Content standards, metadata standards
  • Data discovery methods (Geospatial One-Stop, NSDI Clearinghouse Z39.50 search)
  • Cultivating development of framework data layers
• State Spatial Data Infrastructures
  • State level metadata development, content standards, data sharing, data clearinghouses
  • NC OneMap in North Carolina, from 2003
SDIs have not been addressing archiving and preservation …
SDI and Collection Building
• Leverage data sharing agreements
• Data content standards – adherence by agencies makes archiving easier
• Archive feedback to metadata outreach efforts
• Content exchange networks (not a well developed part of SDI)
• Goals
  • Make data more preservable at point of production
  • Make preservation a seamless part of the data production process
Transfer Modes - Conventional
• CD/DVD
  • 230 CD-ROMs for 1999 Wake County orthophotos
• External drives
  • Becoming more routine
• FTP
  • Bandwidth intensive: restricted to off hours, or not done
• WAN (Wide Area Network)
  • Network incompatibilities, network load
• Web Download
  • Complex interfaces make automation difficult
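To give a rough sense of why network transfer is described here as bandwidth intensive, the sketch below estimates how long a single orthophoto flight (the deck's estimate is about 30 GB per flight) would take to move over a link. The link speeds and the 80% efficiency factor are hypothetical values chosen for illustration; they do not come from the presentation.

```python
def transfer_hours(size_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move size_gb gigabytes over a link.

    link_mbps is the nominal link speed in megabits per second.
    efficiency is an assumed fraction of nominal bandwidth actually achieved.
    """
    size_megabits = size_gb * 1000 * 8          # GB -> megabits (decimal units)
    seconds = size_megabits / (link_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical link speeds, for illustration only.
    for label, mbps in [("1.5 Mbps", 1.5), ("10 Mbps", 10), ("100 Mbps", 100)]:
        print(f"{label}: {transfer_hours(30, mbps):.1f} hours per 30 GB flight")
```

Under these assumptions a slow link needs days per flight, which is consistent with the slide's remark that FTP transfers were pushed to off hours or not attempted.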
Complex download interfaces make automated web capture difficult
Transfer Modes - Web Services
• WMS (Web Map Service)
  • Can only capture derived static images, losing the underlying data intelligence
  • Possible use for agent-based image atlas creation
• WFS (Web Feature Service)
  • Transfers actual vector data as GML
  • Not widely deployed; variation in configuration
  • Scalability for bulk transfer questionable
• Federal Enterprise Architecture Geospatial Profile suggests WMS, WFS, FTP
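The contrast between the two services shows up in the requests themselves: a WMS GetMap request asks for a rendered image of a bounding box, while a WFS GetFeature request asks for the features of a named type, returned as GML. The sketch below only builds the request URLs, using the standard key-value parameters of WMS 1.1.1 and WFS 1.1.0. The endpoint URLs, layer name, and feature type name are hypothetical placeholders, not services named in the presentation.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL; the response would be a static image."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


def wfs_getfeature_url(base_url, type_name):
    """Build a WFS 1.1.0 GetFeature URL; the response would be GML features."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "1.1.0",
        "REQUEST": "GetFeature",
        "TYPENAME": type_name,
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoints and names, for illustration only.
    print(wms_getmap_url("http://example.org/wms", "orthos",
                         (-84.5, 33.8, -75.4, 36.6), 800, 400))
    print(wfs_getfeature_url("http://example.org/wfs", "parcels"))
```

Real servers vary in supported versions, coordinate systems, and output formats, which is the configuration variation the slide points to.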
Harnessing Geospatial Web Services
• Image atlases from WMS services?
• Capturing cartographic representation?
• Recording records from decision-making processes?
• Later: data transfer via WFS & GML? Other?
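One way to picture an image atlas built from a WMS service is as a regular grid of map extents, each of which would become one image request. The sketch below only computes the grid of bounding boxes; it does not contact any service. The function name, the grid dimensions, and the sample extent are assumptions made for illustration and are not described in the presentation.

```python
def atlas_tiles(bbox, rows, cols):
    """Split a bounding box into a rows x cols grid of tile bounding boxes.

    bbox is (minx, miny, maxx, maxy). Returns a list of
    (row, col, (minx, miny, maxx, maxy)) tuples, row 0 at the bottom.
    """
    minx, miny, maxx, maxy = bbox
    dx = (maxx - minx) / cols   # tile width
    dy = (maxy - miny) / rows   # tile height
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tile_bbox = (
                minx + c * dx,
                miny + r * dy,
                minx + (c + 1) * dx,
                miny + (r + 1) * dy,
            )
            tiles.append((r, c, tile_bbox))
    return tiles


if __name__ == "__main__":
    # Hypothetical extent and grid size, for illustration only.
    for row, col, tile in atlas_tiles((0, 0, 4, 2), rows=2, cols=4):
        print(row, col, tile)
```

Each tile extent could then be paired with a capture date so that repeated harvests of the same grid record how the cartographic representation changes over time.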
SDI Approach to Transfer Problem?
• 20 different NC state agencies ask for local data, including at least three units from one agency
• Data also transferred to federal agencies (Census, FEMA, …) for data improvement
• … and to regional agencies (RPOs, MPOs, COGs) for data aggregation and projects
• Archive development should piggyback on existing data transfers
• Grass roots effort in NC to coordinate acquisition, formalized in working group, starting April 2006
Intellectual Property Rights Issues
• Subject to Public Records Law
  • Public record: no privacy issues …
  • … but records for some individuals may be filtered
• Disclaimer viewing important (liability)
• Restrictions on commercial reuse – desire for downstream control of data
• Great deal of variation in access/use policy
• Trust between agencies is important
  • Becoming more common to share data but not sign formal agreements
Obtaining NC Local GIS Data
Source: NC OneMap Data Inventory 2004
Note: Percentages based on the actual number of respondents to each question
Rights issues in the web services space are ambiguous; e.g., 39 NC counties allow GIS-based access to ArcIMS, but extraction rights are not clear
Conclusion
• State and local geospatial data are a very large scale problem (data quantity; number of agencies)
• Need to engage spatial data infrastructure in collection processes, piggyback on existing processes
• Web services processes may play a role in future collection building efforts
• Great deal of variation in interpretation of public records law, even in one state
Questions?
Contact: Steve Morris, Head, Digital Library Initiatives, NCSU Libraries
Steven_Morris@ncsu.edu
Web site: http://www.lib.ncsu.edu/ncgdap/
“Web mash-ups” and the New Mainstream Geospatial Web Services
• How does temporal data fit into emerging WMS caching and tiling schemes?
• Capture of tiles and caches for archive?
Needed: Efficient Content Replication
• Content replication also needed for:
  • Disaster preparedness
  • State and federal data improvement projects
  • Aggregation by regional geospatial web service providers
• WFS, e.g.: efficiency in complete content transfer?
• Rsync-like function, plus: rights management, inventory processes, metadata management, informed by data update cycles
• Archiving delta files vs. complete replication – need to avoid requiring “digital archaeology” in the future
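The "rsync-like function" mentioned on this slide can be pictured as comparing checksum manifests of a dataset at two points in time, so that only added or changed files need to be transferred. The sketch below is a simplified illustration of that idea, not a description of any tool used in the project; the function names and the choice of SHA-256 are assumptions. It also keeps a full manifest for each snapshot, which speaks to the slide's concern about deltas that later require "digital archaeology" to interpret.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifests(old: dict, new: dict) -> dict:
    """Report which files were added, removed, or changed between snapshots."""
    old_keys, new_keys = set(old), set(new)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "changed": sorted(k for k in old_keys & new_keys if old[k] != new[k]),
    }


if __name__ == "__main__":
    # Hypothetical manifests, for illustration only.
    previous = {"parcels.shp": "aaa", "roads.shp": "bbb"}
    current = {"parcels.shp": "aaa", "roads.shp": "ccc", "zoning.shp": "ddd"}
    print(diff_manifests(previous, current))
```

Rights management, inventory processes, and metadata handling, which the slide lists alongside the replication function, are outside the scope of this sketch.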
Local Government GIS: Archival Issues
• Data resources are highly distributed and subject to frequent update
• More detailed, current, accurate than federal/state data resources
• North Carolina local agency GIS environment
  • 100 counties, 92 with GIS
  • 80 counties with high resolution orthophotography
  • Growing number of municipal systems
  • Value: $162 million plus investment (est. in 2003)
Geospatial Web Services Rights Issues
Example: Desktop GIS-accessible ArcIMS
• 39 of 100 NC counties have desktop GIS-accessible ArcIMS services
• It is difficult to know how many of these counties actually expect users to either:
  • A) access data through desktop GIS for viewing only, or
  • B) extract and download data
NC OneMap Data Sharing Agreements
• NCCGIA working to clarify legal issues surrounding redistribution
• Striking MOAs with local agencies as part of NC OneMap framework for open access
• One of the stipulations: “…AGREE that the data shared under this agreement may be further redistributed with applicable metadata by either agency WITHOUT FEES in the public domain and without restriction, unless otherwise noted herein and/or unless otherwise subject to public laws of governing authorities …”
• As of 2004, MOAs distributed to 55 counties in draft form, 24 signing with option for redistribution
County Digital Orthophotography Specifics
Source: NC OneMap Data Inventory 2004
Note: Percentages based on the actual number of respondents to each question
Content Organization (tiling, etc.)
• State agency data
  • Vector data: statewide, river basin, quarter quadrangles, counties
  • Imagery: quarter quadrangles, local images
• County data
  • Vector data: county, tax map units
  • Orthophotos: tax map units, county mosaics
  • Increasing: spatial databases (SDE, PostGIS, etc.)
• Municipal data
  • Vector data: city, tax map units
Versioning and Updating
• Orthophotos
  • County digital orthophotos reflown every 2-7 years
  • Statewide digital orthophoto plan: every 5 years (alternating B&W and color infrared)
• Vector Data
  • State agency vector data: some static, some periodically updated, relatively fewer continuously updated
  • County/City/COG vector data: many data layers continuously or periodically updated
  • Old versions supplanted, exist on relatively inaccessible backups