
Making and Identifying Digital Objects: the CUGIR 2.0 Approach

Making and Identifying Digital Objects: the CUGIR 2.0 Approach. Metadata Working Group 10/25/02. Jon Corson-Rikert Elaine Westbrooks Adam Chandler. Overview. CUGIR 1.0 Moving Towards Solutions CUL Internal Grant Web Map Cap Grant Web Mapping Geodatabase-ORACLE SDE CUGIR 2.0 Risks


Presentation Transcript


  1. Making and Identifying Digital Objects: the CUGIR 2.0 Approach Metadata Working Group 10/25/02 Jon Corson-Rikert Elaine Westbrooks Adam Chandler

  2. Overview • CUGIR 1.0 • Moving Towards Solutions • CUL Internal Grant • Web Map Cap Grant • Web Mapping • Geodatabase-ORACLE SDE • CUGIR 2.0 • Risks • Goals & Summary

  3. CUGIR 1.0 Assumptions • 1995 assumptions are inadequate • Geographic data in NY available by county or USGS quadrangle map • Hosts a limited set of themes • Hosts single versions of themes • Digital resources indexed and retrieved based on file naming conventions

  4. CUGIR 1.0 Implementation • File naming convention combines location, thematic content, and data format • 001hya.gz • Same naming conventions for metadata • Data, metadata & documentation zipped together • Directory structure • One directory per county (62) • Not practical for all data

  5. Stresses to the CUGIR 1.0 Structure • Asymmetrical growth • Tompkins County is unique • File naming conventions stretched, adapted • Proliferation of related data themes • 7 themes could be hydrography • Inability to handle versioning • NAD27 vs NAD83 wetlands data

  6. Final Motivations for CUGIR 2.0 (1) • Availability of new data series: watersheds • Requests to handle more localized data • Errors in zipping data, metadata • Difficulty maintaining large numbers of metadata files in multiple formats

  7. Final Motivations for CUGIR 2.0 (2) • Difficulty collecting web usage statistics • Need to improve info retrieval • Need to improve CUGIR website/interface • Need to temporarily restrict access to data under requirements of the Patriot Act

  8. Solutions- Library Internal Grant (2001-02) • Enhancing access to CUGIR: converting FGDC metadata to MARC, Dublin Core • Goals: • Make metadata discoverable via OPAC, WorldCat, & OAI • Provide persistent URLs for CUGIR metadata, data • Modeled on Nelson/Maly paper, “Smart Objects for Digital Libraries” SODA
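
The FGDC-to-Dublin-Core conversion named on this slide can be pictured as a crosswalk table from FGDC element paths to Dublin Core element names. The following is a minimal sketch, not the project's actual crosswalk: the four mappings, the function name, and the sample record are assumptions chosen for illustration, using element names from the FGDC CSDGM standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical crosswalk: FGDC CSDGM element path -> Dublin Core element.
# The real CUGIR crosswalk is not given in the slides.
FGDC_TO_DC = {
    "idinfo/citation/citeinfo/title": "title",
    "idinfo/citation/citeinfo/origin": "creator",
    "idinfo/citation/citeinfo/pubdate": "date",
    "idinfo/descript/abstract": "description",
}


def fgdc_to_dublin_core(fgdc_xml: str) -> dict:
    """Return a dict of Dublin Core element name -> list of values."""
    root = ET.fromstring(fgdc_xml.strip())
    record = {}
    for path, dc_name in FGDC_TO_DC.items():
        # Collect every non-empty occurrence of the FGDC element.
        values = [el.text.strip() for el in root.findall(path)
                  if el.text and el.text.strip()]
        if values:
            record[dc_name] = values
    return record


# Made-up FGDC record, for illustration only.
SAMPLE_FGDC = """
<metadata>
  <idinfo>
    <citation><citeinfo>
      <origin>Example County Planning Department</origin>
      <pubdate>2000</pubdate>
      <title>Roads, Example County</title>
    </citeinfo></citation>
    <descript><abstract>Made-up abstract for illustration.</abstract></descript>
  </idinfo>
</metadata>
"""

if __name__ == "__main__":
    print(fgdc_to_dublin_core(SAMPLE_FGDC))
```

Keeping the crosswalk as data rather than code means a MARC mapping could be added as a second table without touching the conversion function.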

  9. Solutions- FGDC Web Map CAP Grant 2001-02 • Supplementing CUGIR metadata with online linkage pointers to web mapping service(s) • Users can display a map in a standard web browser without requiring a data download or GIS software • Interoperation with other web mapping services (WMS) • Vendor-neutral WMS format for HTTP requests • Display data in CUGIR from other WM services • Make CUGIR data available for other WM services
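
The "vendor-neutral WMS format for HTTP requests" amounts to a URL with a fixed set of query parameters. The sketch below builds a GetMap request using the parameter names from the OGC WMS 1.1.1 specification; the host, layer name, bounding box, and default image size are placeholders, not CUGIR's actual service.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layers, bbox, width=600, height=400,
                     srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL.

    bbox is (minx, miny, maxx, maxy) in the coordinate system named by srs.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint and layer name.
    print(build_getmap_url("http://example.org/wms", ["roads"],
                           (-76.7, 42.2, -76.2, 42.6)))
```

Because any WMS client can issue the same request, the same URL pattern serves both directions listed on the slide: pulling maps from other services into CUGIR and exposing CUGIR layers to them.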

  10. Changes Under the Hood • Relational database to support the OAI buckets • Currently MySQL; ORACLE next • Primary table uses the 3 column unique key for buckets • Mapsheet (location) – Monroe County, Town of Danby • Coverage (data theme) – Roads, Landfills • Data format – Shapefile, Arc Export, DRG, DEM, CAD • Updated versions treated as new data themes • Census 1995, 1998, and now 2000 data • Dynamically generate web pages • Java, JDBC, Tomcat Servlets and Java Server Pages
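
A three-column unique key of this kind can be sketched with an in-memory SQLite database standing in for MySQL or ORACLE. The table and column names below are assumptions; only the three key components (mapsheet, coverage, data format) and the treatment of updated versions as new themes come from the slide.

```python
import sqlite3


def create_buckets_table(conn):
    """Create a buckets table keyed on (mapsheet, coverage, data_format)."""
    conn.execute("""
        CREATE TABLE buckets (
            bucket_id   INTEGER PRIMARY KEY,
            mapsheet    TEXT NOT NULL,  -- location, e.g. 'Monroe County'
            coverage    TEXT NOT NULL,  -- data theme, e.g. 'Roads'
            data_format TEXT NOT NULL,  -- e.g. 'Shapefile'
            UNIQUE (mapsheet, coverage, data_format)
        )
    """)


def add_bucket(conn, mapsheet, coverage, data_format):
    """Insert a bucket; return its id, or None if the key already exists."""
    try:
        cur = conn.execute(
            "INSERT INTO buckets (mapsheet, coverage, data_format) "
            "VALUES (?, ?, ?)",
            (mapsheet, coverage, data_format),
        )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_buckets_table(conn)
    print(add_bucket(conn, "Monroe County", "Roads", "Shapefile"))
    # Same location and theme in another format is a separate bucket.
    print(add_bucket(conn, "Monroe County", "Roads", "Arc Export"))
    # An updated version is stored as a new theme rather than a new row version.
    print(add_bucket(conn, "Monroe County", "Census 2000", "Shapefile"))
    # An exact duplicate of the three-column key is rejected.
    print(add_bucket(conn, "Monroe County", "Roads", "Shapefile"))
```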

  11. Bucket is the Digital Object • Web interface in CUGIR guides users to buckets table via searching/browsing • Functionality to display metadata, preview map, download data • Different “verbs” in the SODA model • NSDI Clearinghouse, OAI, and OPAC searches lead to same database table
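
One way to read the "verbs" idea is that every entry point (web interface, Clearinghouse, OAI, OPAC) resolves to the same bucket, and a verb selects what the bucket returns. The sketch below is an illustration under assumptions: the verb names, field names, and URLs are hypothetical and are not taken from the SODA model or from CUGIR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    """One digital object; field names are illustrative."""
    bucket_id: int
    metadata_url: str
    preview_url: str
    download_url: str


# Hypothetical verb names mapped to the bucket field each one exposes.
VERB_TO_FIELD = {
    "metadata": "metadata_url",  # display metadata
    "preview": "preview_url",    # preview map
    "download": "download_url",  # download data
}


def handle(bucket: Bucket, verb: str) -> str:
    """Return the resource a verb selects from a bucket."""
    try:
        field = VERB_TO_FIELD[verb]
    except KeyError:
        raise ValueError(f"unsupported verb: {verb!r}") from None
    return getattr(bucket, field)


if __name__ == "__main__":
    b = Bucket(1, "http://example.org/meta/1",
               "http://example.org/map/1", "http://example.org/data/1")
    for verb in VERB_TO_FIELD:
        print(verb, "->", handle(b, verb))
```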

  12. Datafiles and Metadata in CUGIR 1.5 • Buckets table stores server, directory, data file name, and metadata file names • Data and metadata can be moved individually or en masse without requiring change to any of the front-end access modalities • Still requires that files exist on disk • We need to migrate away from file-based data storage for CUGIR 2.0
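
The point about moving files without touching the front end follows from the indirection: clients hold a bucket id, and the physical location is looked up at request time. A minimal sketch, again using in-memory SQLite as a stand-in; the column names, hosts, and directory are assumptions, and only the file name 001hya.gz appears in the slides.

```python
import sqlite3


def make_db():
    """Create a locations table holding server, directory, and file name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE buckets (
            bucket_id INTEGER PRIMARY KEY,
            server    TEXT NOT NULL,
            directory TEXT NOT NULL,
            data_file TEXT NOT NULL
        )
    """)
    return conn


def resolve(conn, bucket_id):
    """Turn a persistent bucket id into the current download URL."""
    row = conn.execute(
        "SELECT server, directory, data_file FROM buckets WHERE bucket_id = ?",
        (bucket_id,),
    ).fetchone()
    if row is None:
        raise KeyError(bucket_id)
    server, directory, data_file = row
    return f"http://{server}/{directory.strip('/')}/{data_file}"


def move_server(conn, old_server, new_server):
    """Relocate every file on one server en masse; bucket ids are unchanged."""
    conn.execute("UPDATE buckets SET server = ? WHERE server = ?",
                 (new_server, old_server))


if __name__ == "__main__":
    conn = make_db()
    conn.execute("INSERT INTO buckets VALUES (1, 'old.example.org', "
                 "'data/001', '001hya.gz')")
    print(resolve(conn, 1))
    move_server(conn, "old.example.org", "new.example.org")
    print(resolve(conn, 1))  # same id, new location
```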

  13. GIS Technology for Data Storage • CUGIR is small potatoes • Largest current vector data ~30,000 records statewide (2000 Census Blocks) • Not considered “large” until > 10,000,000 records • Image data is larger, but services are available on national scale at the level of detail we plan for NY • Technology driven by military requirements

  14. Storing CUGIR data in GeoDatabase (1) • For each data theme, merge individual county or quad data files into large statewide datasets • Data stored in Oracle tables managed by ESRI’s ArcSDE software • ArcSDE manages spatial indices and rewrites SQL to allow Oracle to do primary filtering based on spatial area of interest – very fast • Serve to web via ArcIMS • Datasets grouped into logical map services • 2000 Census data, agricultural data, elevation data
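
Primary filtering by area of interest is commonly done with a cheap bounding-box overlap test that discards most features before any exact geometry comparison. The sketch below shows that generic idea in plain Python; it is not ArcSDE's actual index or the SQL it generates, and the feature names and coordinates are made up.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def overlaps(a: BBox, b: BBox) -> bool:
    """True when two boxes share any area or edge."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def primary_filter(features, area_of_interest: BBox):
    """Keep names of features whose bounding box overlaps the area of interest.

    A real system would follow this with an exact geometry test on the
    survivors (the secondary filter).
    """
    return [name for name, box in features if overlaps(box, area_of_interest)]


if __name__ == "__main__":
    # Made-up features with made-up extents.
    features = [
        ("parcel_a", BBox(0, 0, 2, 2)),
        ("parcel_b", BBox(5, 5, 6, 6)),
        ("parcel_c", BBox(1.5, 1.5, 3, 3)),
    ]
    print(primary_filter(features, BBox(1, 1, 2.5, 2.5)))
```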

  15. Storing CUGIR Metadata in GeoDatabase (2) • Metadata Server included in ArcIMS product • Includes Z39.50 service manager • “NSDI Clearinghouse Node in a Box” • Not clear yet whether it will do what we need • Needs to support all elements and make them searchable • We need dynamic export to multiple formats (xml) • Adoption depends on how much spatial localization of metadata is important

  16. Advantages of GeoDatabases to Users (1) • Seamless coverage • Original geographic unit of compilation or distribution no longer relevant • Easily overlay data gathered by quad or watershed with data from county-based sources • Extraction server permits download of dynamically-generated zip files by arbitrary area as well as by attribute • Data may be accessed directly from users’ ESRI desktop mapping applications

  17. Advantages of GeoDatabases to Library (1) • No longer maintaining thousands of data files • No longer maintaining 4x metadata files (text, html, sgml, xml) • Improved management and backup of data via Oracle and ArcSDE tools

  18. Advantages of GeoDatabases to Library (2) • Support for serving raster & vector data • Raster catalogs of aerial photos even if not spatially rectified • Comparable speed to wavelet-compressed file formats • Access GIS data from ordinary Oracle queries • Gazetteer service to link data with maps using place names and geographic features • Supplement EnCompass functionality

  19. Risks of GeoDatabase Approach • Many pieces must work together • Oracle, ArcSDE, ArcIMS, Apache, Tomcat Servlet Engine • Version inconsistencies may inhibit upgrades • Complex to set up • Web mapping is a very different interface requiring lots of client-side programming • ArcSDE and ORACLE require training • Uncertain server load from web mapping

  20. Summary and Goals • Migrating away from the notion of CUGIR as a geographic data repository • Was a web front end with fixed back-end deliverables • CUGIR 2.0 will be a spatially-enabled service provider • Buckets are persistent digital objects providing access to dynamically updated metadata, data, and maps • CUGIR data may appear “live” in CUL/external solutions • Allow users to create their own digital resources drawing on a mix of CUGIR and externally-served data
