CS 502 Computing Methods for Digital Libraries
Cornell University – Computer Science
Herbert Van de Sompel, herbertv@cs.cornell.edu
Lab: federated searching: searching distributed data & searching harvested data
Access to PC: orpcuser orpcpw
[Diagram: federated services drawing on A&I, image, FTXT, OPAC, and e-print resources]
federated searching
• Distributed search approach ~ Z39.50, SDLIP, ...
  • today: MetaLib
  • commercial product by Ex Libris
  • searches repositories using “whichever” technique
  • normalizes results before presenting them to the user
  • can merge results after initial presentation
• Harvesting approach ~ OAI
  • today: ARC, the first OAI service provider
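The difference between the two approaches can be sketched in a few lines. This is a toy illustration, not how MetaLib or ARC are implemented: the two in-memory "repositories", their field names, the sample records, and every function name below are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Normalized record presented to the user (hypothetical shape)."""
    title: str
    author: str
    source: str


# Invented stand-ins for remote repositories, each with its own native fields.
REPOSITORIES = {
    "opac": [{"245": "Sleep medicine", "100": "Kryger, Meir"}],
    "eprints": [{"dc:title": "Sleep and breathing", "dc:creator": "Kryger, Meir"}],
}

# Per-repository mapping from normalized field names to native ones.
FIELD_MAPS = {
    "opac": {"title": "245", "author": "100"},
    "eprints": {"title": "dc:title", "author": "dc:creator"},
}


def normalize(source, raw):
    """Convert one native record into the common Record shape."""
    fields = FIELD_MAPS[source]
    return Record(raw[fields["title"]], raw[fields["author"]], source)


def search_repository(source, term):
    """Simulate a remote search against a repository's native title field."""
    title_field = FIELD_MAPS[source]["title"]
    return [r for r in REPOSITORIES[source] if term.lower() in r[title_field].lower()]


def distributed_search(term):
    """Distributed approach: query every repository at search time,
    then normalize and merge the hits."""
    hits = []
    for source in REPOSITORIES:
        hits.extend(normalize(source, raw) for raw in search_repository(source, term))
    return sorted(hits, key=lambda r: r.title)


def harvest():
    """Harvesting approach, step 1: copy and normalize all metadata in advance."""
    return [normalize(s, raw) for s, records in REPOSITORIES.items() for raw in records]


def harvested_search(index, term):
    """Harvesting approach, step 2: search only the local index."""
    hits = [r for r in index if term.lower() in r.title.lower()]
    return sorted(hits, key=lambda r: r.title)
```

In this sketch both paths return the same records; what differs is when the repositories are contacted: at query time for the distributed approach, ahead of time for the harvesting approach.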
MetaLib
• Goal: Unique, consistent interface across library resources (think Google for library collection)
• Broadcast searching over a large collection of heterogeneous resources:
  • Different metadata syntax (MARC, EAD, Dublin Core, TEI, unspecified, ...)
  • Different protocols (Z39.50, HTTP, native ALEPH, screen scraping)
• Linking via integrated SFX server
• User personalization
• Administration of resources
[Diagram: MetaLib architecture]
• Application level: Information Gateway
• Technology level: Universal Gateway
• User, Admin
• Resources’ administering
• Accurate, target-sensitive searching
• Customized, personalized services
• Context-sensitive linking
The Information Gateway: The KnowledgeBase
The KnowledgeBase includes all the library resources
• Cataloging information (per collection):
  • Name of collection, owner, subject, services, language
• Configuration information (per MetaLib resource):
  • Interfacing protocol, internal format, rules of conversion
[Diagram labels: Collections, Resources]
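One way to picture the two kinds of KnowledgeBase information is as a pair of small record types. The sketch below is only an assumed data model: the class names, attribute names, and lookup methods are hypothetical, and the protocol, format, and service values in the usage note are placeholders rather than anything documented for MetaLib.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogingInfo:
    """Descriptive information, kept per collection."""
    name: str
    owner: str
    subjects: list
    services: list
    language: str


@dataclass
class ConfigurationInfo:
    """Technical information, kept per searchable resource."""
    protocol: str
    internal_format: str
    conversion_rules: dict = field(default_factory=dict)


@dataclass
class KnowledgeBase:
    """Toy registry that keeps both kinds of information under one key."""
    collections: dict = field(default_factory=dict)
    resources: dict = field(default_factory=dict)

    def add(self, key, cataloging, configuration):
        self.collections[key] = cataloging
        self.resources[key] = configuration

    def by_subject(self, subject):
        """Return the keys of all collections cataloged under a subject."""
        return sorted(k for k, c in self.collections.items() if subject in c.subjects)

    def protocol_for(self, key):
        """Return the interfacing protocol configured for a resource."""
        return self.resources[key].protocol
```

A usage example, with the protocol, format, services, and conversion rule chosen purely for illustration:

```python
kb = KnowledgeBase()
kb.add(
    "qe2",
    CatalogingInfo(
        name="Queen Elizabeth II Library",
        owner="Memorial University of Newfoundland",
        subjects=["Computer Science", "Pure Science", "Humanities"],
        services=["search"],          # placeholder value
        language="English",
    ),
    ConfigurationInfo(
        protocol="Z39.50",            # assumed, not stated in the slides
        internal_format="USMARC",     # assumed, not stated in the slides
        conversion_rules={"245": "title"},
    ),
)
kb.by_subject("Humanities")   # keys of matching collections
kb.protocol_for("qe2")        # protocol used to reach the resource
```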
KnowledgeBase: Example of cataloged collection (USMARC)
245 Title: Queen Elizabeth II Library
270 Location: Memorial University of Newfoundland | St. John’s, Newfoundland | Canada | AC15S7
307 Access Times: Monday-Thursday 8:30-20:45; Closed Saturdays
520 Description: Main library covers humanities, science, computers, physical ed, social sciences, and engineering
546 Language: English
531 Access: Open to the public
901 Administrator: Iscott@mun.ca
650 Subject: Computer Science
650 Subject: Pure Science
650 Subject: Humanities
[Diagram: MetaLib architecture (Universal Gateway, Information Gateway, User, Admin), now labeled “Database of catalogs and databases”, with accurate, target-sensitive searching; customized, personalized services; and context-sensitive linking]
Search Processed to Conform to Information Resources
[Diagram: the Universal Gateway reaches diverse information resources via ALEPH, Z39.50, HTTP, and other protocols]
Search Command Adapted to Various Resources
User input: Author: Kryger, Meir; Title: sleep
• ALEPH (KOBV): WAU=Kryger, Meir AND WTI=sleep
• Z39.50 (Library of Congress, MedLine):
  • 1=Kryger, Meir AND 4=sleep
  • 1003=Kryger-M? AND 4=sleep
• HTTP (PubMed): (%20sleep[TITL]%20)%20AND%20(%20Kryger[AUTH]%20Meir[AUTH]%20)
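The query adaptation on this slide can be sketched as one small translator per protocol. The function names and the dispatch table are hypothetical; the output strings are modeled only on the slide's examples, and the sketch makes no claim about how MetaLib builds them internally. The `1003=Kryger-M?` variant, which also reshapes the author name, is left out.

```python
def to_aleph(author, title):
    """ALEPH-style command, as in the slide: WAU=<author> AND WTI=<title>."""
    return f"WAU={author} AND WTI={title}"


def to_z3950(author, title):
    """Z39.50-style query using numeric use attributes (1 and 4, as in the slide)."""
    return f"1={author} AND 4={title}"


def to_pubmed_http(author, title):
    """PubMed-style URL fragment: each author word tagged [AUTH], title tagged [TITL]."""
    words = author.replace(",", " ").split()
    author_part = " ".join(f"{word}[AUTH]" for word in words)
    query = f"( {title}[TITL] ) AND ( {author_part} )"
    # Only spaces are escaped, matching the string shown on the slide.
    return query.replace(" ", "%20")


# Hypothetical dispatch table: protocol name -> translator.
TRANSLATORS = {
    "ALEPH": to_aleph,
    "Z39.50": to_z3950,
    "HTTP": to_pubmed_http,
}


def translate(protocol, author, title):
    """Adapt one user query to the syntax expected by a given protocol."""
    return TRANSLATORS[protocol](author, title)


if __name__ == "__main__":
    for protocol in TRANSLATORS:
        print(protocol, "->", translate(protocol, "Kryger, Meir", "sleep"))
```

Keeping one translator per protocol means a new target only needs a new entry in the table; the user-facing query (author plus title) never changes.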
The Universal Gateway enables the use of basic components via API
Universal Gateway functions:
• FIND
• PRESENT
• COMBINESET
• FINDDUPLICATES
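The slide names the four functions but not their signatures, so the following is only a guess at how such an API might behave, loosely modeled on result-set style searching. The class, method names, parameters, duplicate criterion, and sample records are all invented for illustration.

```python
class UniversalGatewaySketch:
    """Toy in-memory stand-in for a gateway exposing four basic functions."""

    def __init__(self, records):
        self._records = list(records)  # each record is a dict with title/author/source
        self._sets = {}                # named result sets
        self._next_id = 1

    def _store(self, records):
        name = f"set{self._next_id}"
        self._next_id += 1
        self._sets[name] = list(records)
        return name

    def find(self, field, term):
        """FIND: run a search and return the name of the new result set."""
        hits = [r for r in self._records if term.lower() in r[field].lower()]
        return self._store(hits)

    def present(self, set_name, start=1, count=10):
        """PRESENT: return a slice of a result set (1-based start)."""
        return self._sets[set_name][start - 1:start - 1 + count]

    def combine_set(self, first, second):
        """COMBINESET: union of two result sets, keeping first-seen order."""
        combined = list(self._sets[first])
        combined += [r for r in self._sets[second] if r not in combined]
        return self._store(combined)

    def find_duplicates(self, set_name):
        """FINDDUPLICATES: group records sharing title and author, ignoring case."""
        groups = {}
        for record in self._sets[set_name]:
            key = (record["title"].lower(), record["author"].lower())
            groups.setdefault(key, []).append(record)
        return [group for group in groups.values() if len(group) > 1]


# Invented sample records, not real catalog data.
SAMPLE = [
    {"title": "Sleep medicine", "author": "Kryger, Meir", "source": "KOBV"},
    {"title": "Sleep Medicine", "author": "Kryger, Meir", "source": "Library of Congress"},
    {"title": "Library systems", "author": "Doe, Jane", "source": "KOBV"},
]
```

A usage example: `gw = UniversalGatewaySketch(SAMPLE)`, then `gw.find("title", "sleep")` and `gw.find("title", "library")` each return a set name; passing both names to `gw.combine_set` yields a merged set that `gw.present` can page through and `gw.find_duplicates` can scan.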
URLs
• Distributed approach: MetaLib, http://metalib01.exlibris-usa.com/V
• Harvesting approach: ARC, http://arc.cs.odu.edu/
Pop quiz: reference linking papers
• Go to http://63.70.76.27:8080/cs502/
• Log on
  • Box 1: Firstname
  • Box 2: Lastname
  • Box 3: netid
• Click take
• Take Quiz
• Submit all responses at once
Make-up pop quiz: SODA and FEDORA papers
• Go to http://63.70.76.27:8080/cs502/
• Log on
  • Box 1: Firstname
  • Box 2: Lastname
  • Box 3: netid
• Click take
• Take Quiz
• Submit all responses at once