Alexandria Digital Library (ADL) What is ADL? What is it supposed to do? Alexandria digital Library – Davidson Library, UCSB
ADL Mission
• To provide a distributed, spatially searchable digital library of geographically referenced materials.
• The library's components may be distributed (spread across the Internet) or coexist within a single network or desktop.
• "Geographically referenced" means that every information object in the library is associated with one or more regions ("footprints") on the surface of the Earth.
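In the simplest case a footprint can be modeled as a latitude/longitude bounding box, and a spatial search as an overlap test between boxes. The sketch below is illustrative only: the `Footprint` class and its `overlaps` method are hypothetical names, not ADL's software, the coordinates are invented, and boxes that cross the antimeridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Hypothetical bounding-box footprint, in decimal degrees."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other: "Footprint") -> bool:
        # Two boxes overlap when their extents intersect on both axes.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


# Illustrative query region and item footprints (invented coordinates).
query = Footprint(west=-118.0, south=32.0, east=-116.0, north=34.0)
item = Footprint(west=-117.25, south=32.5, east=-117.0, north=32.75)
far_item = Footprint(west=10.0, south=43.0, east=11.0, north=44.0)

print(query.overlaps(item))      # item lies inside the query region
print(query.overlaps(far_item))  # item lies on another continent
```

A real library would likely support richer footprint shapes (points, polygons); the bounding box is only the easiest shape to reason about.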
Alexandria Digital Library (ADL)
• NSF-funded digital library project, 1994-98
  • New method to organize and search for information
  • Focused on geographical information
  • Internet searching and data delivery
• Operational library, 1999-present
  • 2.8 million bibliographic records
  • 5.5 million place name records
  • 7.5 terabytes of online data
  • Available to the public via the Internet
What Is Spatial Information?
• Museum artifacts
• Art about …
• Zoological habitat study
• Geographical data archives
• Botanical survey
• Earth science data
• Books about …
• Archeological digs
ADL Library of Distributed Spatial Information Objects
• If it has a latitude and longitude, then it can be in the ADL library.
• "What information do you have about here?"
• Sources shown: museum artifacts, earth art, other digital archives, zoological habitat study, botanical study, ocean science data, archeological dig
ADL Organization
The ADL project has:
• An operational library run by the Davidson Library,
• A research component (ADEPT) funded by NSF and others, and
• A gazetteer (place name index and geocoder) run by the Davidson Library
Operational Partners
• Implementers
  • AUT (Auckland University of Technology): software implementation and content builder
  • DLESE (Digital Library for Earth System Education): software implementation and content builder
  • CNR (Center for National Research, Pisa, Italy)
• Content builders
  • ADEPT: educational classroom content
  • CASS (Center for the Analysis of Sacred Sites): video, sound, imagery, text
  • ESSW: MODIS real-time spacecraft imagery
  • Scripps: SIOExplorer oceanographic data
Alexandria Digital Library (ADL) History
Prototypes
• Rapid Prototype (CD-ROM + ArcView)
• Java Application
• MARC & FGDC Union Catalog
• Web Version 1
• Search Optimized Fields, AKA “Search Buckets”
• Java Application
• CDL Web Client
MARC & FGDC Web Prototype (1995)
Java Application Prototype (1997)
“Webclient” Interface (2002) 1/2
“Webclient” Interface (2002) 2/2
ADL - Web Gazetteer Printed Report
Alexandria Digital Library (ADL) Current ADL Architecture
Common Features of the Prototypes
• Map
• Place name search
• Search definition frame/panel/tab
• Vocabulary support where appropriate
• Standardized citation & metadata display/formatting
ADL Architecture Goals (1/2)
• Catalog separate from the data distribution
• Metadata-agnostic search methodology
• Data center reliability
• Collection-level metadata
• Search buckets
  • Strongly typed aggregated search fields based on library concepts
  • Facilitate quick/easy ingest of collections
  • Abstract, searchable indexes
ADL Architecture Goals (2/2)
• Digital library for georeferenced information
  • distributed
  • heterogeneous
  • rich services
  • scalable
  • many providers
  • collections, large and small
• Standard components, interfaces
Components/services
• collection registry: collection-level search
• thesaurus: shared vocabularies
• library: content; item-level search, metadata management, data access
• gazetteer: maps placenames to locations
• map: background imagery, layering capability
• (diagram) library content: collections, each containing items
*many interconnections between services*
Library Server Architecture
• client interface (XML / Java, HTTP, RMI)
• middleware
  • access control; query fan-out; query result caching & ranking
  • collection referencing & registration
• collection interface (XML / Java)
• behind the collection interface: internal collections, generic database driver, Z39.50 driver, proxy driver, collection aggregator
• other components shown: item tracker, user interface, metadata mapper, harvest loader
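The middleware duties named on this slide (query fan-out, result caching, ranking) can be sketched in a few lines. This is a minimal illustration, not ADL's middleware: the `Middleware` class, the driver functions, the item identifiers, and the scores are all hypothetical, access control is omitted, and the fan-out is sequential for simplicity.

```python
from typing import Callable, Dict, List, Tuple

Result = Tuple[str, float]             # (item identifier, relevance score)
Driver = Callable[[str], List[Result]]  # one search function per collection


class Middleware:
    """Hypothetical middleware: fans a query out, ranks, and caches."""

    def __init__(self, drivers: Dict[str, Driver]) -> None:
        self.drivers = drivers
        self._cache: Dict[str, List[Result]] = {}
        self.driver_calls = 0  # counts driver invocations, to show caching

    def search(self, query: str) -> List[Result]:
        if query in self._cache:       # cached result: no fan-out needed
            return self._cache[query]
        merged: List[Result] = []
        for driver in self.drivers.values():  # fan-out to every collection
            self.driver_calls += 1
            merged.extend(driver(query))
        merged.sort(key=lambda r: r[1], reverse=True)  # rank by score
        self._cache[query] = merged
        return merged


# Stand-in collection drivers returning invented results.
def maps_driver(query: str) -> List[Result]:
    return [("map-1", 0.9), ("map-2", 0.4)]


def photos_driver(query: str) -> List[Result]:
    return [("photo-7", 0.7)]


mw = Middleware({"maps": maps_driver, "photos": photos_driver})
first = mw.search("san diego")
second = mw.search("san diego")  # served from the cache
print(first)
```

Because every collection sits behind the same driver signature, the client sees one search call regardless of how many collections answer it.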
Architecture - Buckets
What is a bucket? (1/3)
• Strongly typed aggregated search fields based on library concepts
• Similar to Dublin Core, but buckets define allowable content and search semantics, and are optimized for geospatial searching
• Facilitate quick/easy ingest of collections
• Abstract, searchable indexes:
  • Location, Time, Type, Format, Originator, Assigned terms, Subject-related text, and Identifiers
What is a bucket? (2/3)
• Strongly typed, abstract metadata category with defined search semantics to which source metadata is mapped
• Key properties
  • name
    • Coverage date
  • semantic definition
    • The time period to which the item is relevant.
  • data type (strictly observed)
    • calendar date or range of calendar dates
  • syntactic representation (strictly observed)
    • ISO 8601
What is a bucket? (3/3)
• Source metadata is mapped to buckets
  • buckets hold not just simple values
    • “2001-09-08”
  • but rather, explicit descriptions of those values
    • (FGDC, 1.3, “Time period of content”, “2001-09-08”)
  • multiple values may be mapped per bucket
• Bucket definition includes search semantics
  • defines query terms
    • ISO 8601 date range
  • defines query operators
    • contains, overlaps, is-contained-in
  • semantics are slightly fuzzy in certain cases to accommodate multiple implementations
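The "Coverage date" bucket described above can be sketched as a typed date range with the three query operators. This is a minimal sketch under stated assumptions: `DateRange` and `BucketValue` are hypothetical names, only the `YYYY-MM-DD` and `YYYY-MM-DD/YYYY-MM-DD` forms of ISO 8601 are parsed, and which operand plays the "query" role for each operator is my reading, not something the slides specify.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    """Hypothetical value type for the Coverage date bucket."""
    begin: date
    end: date

    @classmethod
    def parse(cls, text: str) -> "DateRange":
        # Accepts "YYYY-MM-DD" or "YYYY-MM-DD/YYYY-MM-DD" only.
        first, _, last = text.partition("/")
        begin = date.fromisoformat(first)
        end = date.fromisoformat(last) if last else begin
        if end < begin:
            raise ValueError("range ends before it begins")
        return cls(begin, end)

    def contains(self, other: "DateRange") -> bool:
        return self.begin <= other.begin and other.end <= self.end

    def overlaps(self, other: "DateRange") -> bool:
        return self.begin <= other.end and other.begin <= self.end

    def is_contained_in(self, other: "DateRange") -> bool:
        return other.contains(self)


@dataclass(frozen=True)
class BucketValue:
    """A mapped value that keeps a description of where it came from."""
    standard: str    # source metadata standard
    field_id: str    # field number within that standard
    field_name: str  # human-readable field name
    value: str       # the value as written in the source metadata


# The example from the slide, kept with its provenance.
mapped = BucketValue("FGDC", "1.3", "Time period of content", "2001-09-08")
item_range = DateRange.parse(mapped.value)
query = DateRange.parse("2001-01-01/2001-12-31")
print(query.contains(item_range), item_range.overlaps(query))
```

Strict typing is what makes the operators meaningful: a value that does not parse as a date is rejected at ingest rather than silently compared as text.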
Standard buckets
ADL                     Dublin Core
Subject-related text    DC.Subject
Title                   DC.Title
Assigned term           DC.Subject (qualified)
Originator              DC.Creator + DC.Publisher
Geographic location     DC.Coverage.Spatial
Coverage date           DC.Coverage.Temporal
Object type             DC.Type
Format                  DC.Format
Identifier              DC.Identifier
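Read in the other direction, this correspondence suggests how a metadata mapper could route Dublin Core elements into buckets. The sketch below assumes that use: `DC_TO_BUCKET` and `map_to_buckets` are hypothetical names, the qualified `DC.Subject` to "Assigned term" case is left out for brevity, and the sample record values are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Dublin Core element -> ADL bucket, taken from the table above.
DC_TO_BUCKET = {
    "DC.Subject": "Subject-related text",
    "DC.Title": "Title",
    "DC.Creator": "Originator",
    "DC.Publisher": "Originator",
    "DC.Coverage.Spatial": "Geographic location",
    "DC.Coverage.Temporal": "Coverage date",
    "DC.Type": "Object type",
    "DC.Format": "Format",
    "DC.Identifier": "Identifier",
}

Pair = Tuple[str, str]  # (source element, value)


def map_to_buckets(record: List[Pair]) -> Dict[str, List[Pair]]:
    """Group a record's values by bucket, keeping the source element."""
    buckets: Dict[str, List[Pair]] = defaultdict(list)
    for element, value in record:
        bucket = DC_TO_BUCKET.get(element)
        if bucket is not None:  # unmapped elements are simply skipped
            buckets[bucket].append((element, value))
    return dict(buckets)


# Illustrative record; "Example Creator" is an invented value.
record = [
    ("DC.Title", "Imperial Beach, Ca."),
    ("DC.Creator", "Example Creator"),
    ("DC.Publisher", "Stephen P. Teale Data Center"),
    ("DC.Rights", "not mapped to any bucket"),
]
buckets = map_to_buckets(record)
print(buckets["Originator"])
```

Two source elements landing in one bucket (Creator and Publisher in Originator) is the "multiple values may be mapped per bucket" case.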
Bucket Motivation
• Heterogeneous metadata
• Uniform client services
• Spatial search requires
  • Strongly typed search fields
  • Optimized for geospatial searching
Summary
• A bucket is a strongly typed, abstract metadata category with defined search semantics to which source metadata is mapped
• Supports discovery/search across distributed, heterogeneous collections that use metadata structures of their choosing
• Supports high-level searching across collections and supports “drill-down” searching to the item-level metadata elements
Benefits of the Architecture
• Standard, readily optimized search methodology
• Simplifies design:
  • Provides a client with a standard API for searching different data sources.
  • Provides a way to discover changed data locations.
• Scalability
  • Scale by upgrading the database
  • Scale by distributing the databases
ADL Metadata Metadata Ingest
Collection Ingest Procedure
Metadata Processing: San Diego DRG
Extract Metadata
• Programming is used to automate repetitive and time-consuming processes, to extract portions of metadata, and to change the format of metadata.
• ACCESS, PERL, SQL, UNIX shell script
Query Geodex Records (1:24,000; -118 to -116, 34 to 32)
• Total records: ~330,000
• By scale: 40,000
• Within San Diego boundary area: 700
• Eliminate duplicate and dirty: 89
Clean Metadata
• A_1, a1, a.1, A-1
Massage
• Geodex record “TITLE”: Imperial Beach, Ca. ; 32117-E1
• ADL record, sys. control num.: o32117e1
• ADL title: Imperial Beach, Ca. Digital Raster Graphic, DRG, of 7.5 minute topographic quadrangle.
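The extraction funnel on this slide (filter by scale, filter by bounding area, drop duplicates) can be sketched as follows. The slide names ACCESS, PERL, SQL, and UNIX shell scripts as the tools actually used; Python is swapped in here purely for illustration. The record layout, the `normalize_id` and `select_records` helpers, and the sample records are hypothetical, and treating "A_1, a1, a.1, A-1" as spelling variants of one identifier is my interpretation of the slide.

```python
import re
from typing import Dict, List


def normalize_id(raw: str) -> str:
    """Lowercase and strip punctuation so spelling variants compare equal."""
    return re.sub(r"[^a-z0-9]", "", raw.lower())


def select_records(records: List[Dict], scale: int, west: float,
                   east: float, south: float, north: float) -> List[Dict]:
    """Keep records at the given scale, inside the box, without duplicates."""
    seen = set()
    kept = []
    for rec in records:
        if rec["scale"] != scale:
            continue  # wrong map scale
        inside = (west <= rec["lon"] <= east
                  and south <= rec["lat"] <= north)
        if not inside:
            continue  # outside the boundary area
        key = normalize_id(rec["id"])
        if key in seen:
            continue  # duplicate under a different spelling
        seen.add(key)
        kept.append(rec)
    return kept


# Invented sample records; scale 24000 stands for 1:24,000.
records = [
    {"id": "A_1", "scale": 24000, "lon": -117.1, "lat": 32.6},
    {"id": "a.1", "scale": 24000, "lon": -117.1, "lat": 32.6},   # duplicate
    {"id": "B-2", "scale": 100000, "lon": -117.0, "lat": 33.0},  # wrong scale
    {"id": "C3", "scale": 24000, "lon": -120.0, "lat": 35.0},    # outside box
]
kept = select_records(records, 24000, west=-118, east=-116,
                      south=32, north=34)
print([rec["id"] for rec in kept])
```

Applying the cheap filters (scale, area) before deduplication mirrors the order of the counts on the slide.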
Processes: ADL Metadata
Organize Metadata
• Separate shared and unique values for every record.
  • Shared (Parent): Publisher, e.g. Stephen P. Teale Data Center
  • Unique (Child): Title of particular DRG, e.g. Digital Raster Graphic, DRG of Otay Mesa, CA, 7.5 minute topographic quadrangle.
• Assign ADL control number
Create Metadata
• Creation of values for required fields for which we don’t have info/metadata.
  • Search (visit Teale web pages for DRG production information)
  • Original cataloging (access path)
  • Calculation (determining resolution and footprint)
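Separating shared (parent) values from unique (child) values can be sketched as below. This is one possible approach, not the procedure the ADL staff used: `split_shared` is a hypothetical helper, and the second title is an illustrative variation on the Otay Mesa title shown on the slide.

```python
from typing import Dict, List, Tuple


def split_shared(records: List[Dict[str, str]]
                 ) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Return (parent, children): fields identical in every record go to
    the parent; each child keeps only its record-specific fields."""
    if not records:
        return {}, []
    first = records[0]
    parent = {
        key: value for key, value in first.items()
        if all(key in rec and rec[key] == value for rec in records)
    }
    children = [
        {key: value for key, value in rec.items() if key not in parent}
        for rec in records
    ]
    return parent, children


records = [
    {"publisher": "Stephen P. Teale Data Center",
     "title": "Digital Raster Graphic, DRG of Otay Mesa, CA, "
              "7.5 minute topographic quadrangle."},
    {"publisher": "Stephen P. Teale Data Center",
     "title": "Digital Raster Graphic, DRG of Imperial Beach, CA, "
              "7.5 minute topographic quadrangle."},
]
parent, children = split_shared(records)
print(parent)
print(len(children))
```

Storing the shared values once on the parent keeps the child records small and means a correction to, say, the publisher is made in one place.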
Collection-level metadata
Object Type            Count
cartographic works     324,876
maps                   324,876
images                 2,014,799
photographs            484,083
aerial photographs     484,083
Alexandria Digital Library Future Directions
Core Service Directions
• Lowering the barrier
  • metadata management services
  • OAI harvest loader
  • improved packaging
• Service aggregation via harvesting
• Content-based searches, ranking
  • text IR, image texture
• Collection discovery
• Display results over the map - layering
• Storage of user result sets on server
The Ideal ADL Entry Portal
The Portal will be:
• Easy to use - allows patrons to search the collection without knowing keywords or jargon
• Flexible - allows users with differing levels of geographic knowledge to find the data they seek in a minimal amount of time
• Help oriented - if a user does not find what s/he wants, we in MIL will find out and use that knowledge to develop the collection
• Dynamic - so that the user will want to return to see the latest features, collections and tools
• Educational - so that the user can learn to use the site more effectively
• Interesting - uncluttered, new data, featured events