An introduction to the NSDL
William Y. Arms
Cornell University
Acknowledgement and Disclaimer
The NSDL is a program of the National Science Foundation's Directorate for Education and Human Resources, Division of Undergraduate Education.
The NSDL Core Integration is a collaboration between the University Corporation for Atmospheric Research (Dave Fulker), Columbia University (Kate Wittenberg) and Cornell University (Bill Arms).
The ideas discussed in this talk do not represent the official views of the NSF or the Core Integration team.
The NSDL project
1996 Vision articulated by NSF's Division of Undergraduate Education
1997 National Research Council workshop
1998 Preliminary grants through Digital Libraries Initiative 2
1998 SMETE-Lib workshop
1999 NSDL Solicitation
2000 6 Core Integration System projects + 23 others funded
2001 Further collection and service projects + 1 Large Core Integration System project (total about $25 million/year)
2002 Formal release
2006 End of formative phase
Continuing questions
(a) Science education: How broadly defined?
(b) Funding: How much with how few dollars?
(c) Education: How can the NSDL have an impact?
(d) Management: How can a diverse community provide shared services?
Science education: scope of a digital library
[Figure: three categories of material, labeled "Scientific and technical information in digital form", "Materials used in education", and "Materials tailored to education"]
NSDL collections funded by the NSF
(a) Focused collections
NSDL collections funded by the NSF
(b) Aggregates and federations
How big might the NSDL be?
All branches of science, all levels of education, very broadly defined.
Five year targets
• 1,000,000 different users
• 10,000,000 digital objects
• 10,000 to 100,000 independent sites
The Core Integration task ...
... to provide a coherent set of services across great diversity.
Resources
Core Integration
Budget: $4 million
Staff: 25 - 30
Management: Diffuse
How can a small team, without direct management control, create a very large-scale digital library?
Approaches to interoperability
The conventional approach
• Wise people develop standards: protocols, formats, etc.
• Everybody implements the standards.
• This creates an integrated, distributed system.
Unfortunately ...
• Standards are expensive to adopt.
• Concepts are continually changing.
• Systems are continually changing.
Interoperability is about agreements
Technical agreements cover formats, protocols, security systems so that messages can be exchanged, etc.
Content agreements cover the data and metadata, and include semantic agreements on the interpretation of the messages.
Organizational agreements cover the ground rules for access, for changing collections and services, payment, authentication, etc.
The challenge is to create incentives for independent digital libraries to adopt agreements.
Function versus cost of acceptance
[Figure: chart with axes "Cost of acceptance" and "Function"; regions labeled "Few adopters" and "Many adopters"]
Example: textual mark-up
[Figure: chart with axes "Cost of acceptance" and "Function"; points labeled SGML, XML, HTML, and ASCII]
Levels of interoperability
Level: Federation
  Agreements: Strict use of standards (syntax, semantic, and business)
  Example: AACR, MARC, Z 39.50
Level: Harvesting
  Agreements: Digital libraries expose metadata; simple protocol and registry
  Example: Open Archives metadata harvesting
Level: Gathering
  Agreements: Digital libraries do not cooperate; services must seek out information
  Example: Web crawlers and search engines
Metadata is expensive
The NSDL cannot afford to create it manually.
The metadata repository
The metadata repository is a resource for service providers. It holds information about every collection and item known to the NSDL.
[Figure: diagram with components labeled "Services", "Users", "Metadata repository", and "Collections"]
Metadata strategy
• Support eight standard formats
• Collect all existing metadata in these formats
• Provide crosswalks to Dublin Core
• Expose records in the metadata repository for others to harvest
• Concentrate on collection-level metadata
• Use automatic generation to augment item-level metadata
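The crosswalk step in the strategy above can be sketched in a few lines of Python. This is a minimal illustration, not the NSDL's crosswalk: the mapping table `MARC_TO_DC`, the function `crosswalk`, and the sample record are all assumptions made up for this example, and a real crosswalk would handle subfields, indicators, and many more tags.

```python
# Illustrative crosswalk from a few MARC 21 tags to Dublin Core elements.
# The mapping below is a simplified assumption for this sketch; it is not
# the NSDL's actual crosswalk.
MARC_TO_DC = {
    "245": "title",
    "260": "publisher",
    "520": "description",
    "650": "subject",
}


def crosswalk(marc_fields):
    """Convert (tag, value) pairs into a dict of Dublin Core element lists.

    Tags with no entry in MARC_TO_DC are dropped, which is the usual
    price of mapping a rich format onto a simple one.
    """
    dc = {}
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is not None:
            dc.setdefault(element, []).append(value)
    return dc


# Hypothetical input record, represented as simple (tag, value) pairs.
record = [
    ("245", "Introduction to plate tectonics"),
    ("650", "Geology"),
    ("650", "Plate tectonics"),
    ("999", "local field with no Dublin Core equivalent"),
]
dc_record = crosswalk(record)
print(dc_record)
```

Repeated source fields (here, two subject headings) become repeated Dublin Core values, while the unmapped local field is lost. That loss is why the strategy keeps the original metadata as well as the Dublin Core version.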
NSDL metadata options
Eight standard formats
• Dublin Core
• Dublin Core + DC-Ed extensions
• LTSC (IMS)
• ADL (SCORM)
• MARC 21
• Content Standard for Digital Geospatial Metadata (FGDC)
• Global Information Locator Service (GILS)
• Encoded Archival Description (EAD)
For additional information on supported formats:
• http://128.253.121.110/NSDLMetaWG/IntroPage.html
The metadata repository as a resource
Records will be exposed through the Open Archives Initiative harvesting protocol.
The Core Integration team will provide some services based on the metadata repository.
The architecture encourages others to build services.
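To make the harvesting idea concrete, here is a minimal Python sketch of what a service provider might do with exposed records. It assumes the request and response shapes of version 2.0 of the OAI protocol (OAI-PMH), which may differ from the version in use when this talk was given. The base URL, the record identifier, and the sample response are hypothetical, and the sketch parses a canned response rather than contacting a server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of unqualified Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for the given repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def extract_titles(response_xml):
    """Return the text of every Dublin Core title in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]


# Hypothetical response, trimmed to the parts this sketch reads.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Introduction to plate tectonics</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

url = list_records_url("http://example.org/oai")  # hypothetical endpoint
titles = extract_titles(SAMPLE_RESPONSE)
print(url)
print(titles)
```

The point of the design is the low cost of acceptance: a collection only has to answer simple HTTP requests with XML records, and any third party can build a service on the results.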
Information retrieval
• Basic metadata search
• Basic content search
• Combining metadata and content
James Allan, Bruce Croft (University of Massachusetts, Amherst)
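The talk does not say how metadata and content evidence are combined. One simple possibility, shown here purely as an assumption, is a weighted sum of two match scores. The functions `term_overlap` and `combined_score`, the `metadata_weight` parameter, and the sample items are all hypothetical; a production search engine would use a proper retrieval model instead of term overlap.

```python
def term_overlap(query, text):
    """Fraction of distinct query terms that occur in text (case-insensitive)."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def combined_score(query, metadata_text, content_text, metadata_weight=0.5):
    """Weighted sum of the metadata match and the content match.

    metadata_weight is a tuning knob assumed for this sketch: 1.0 gives a
    pure metadata search, 0.0 a pure content search.
    """
    metadata_score = term_overlap(query, metadata_text)
    content_score = term_overlap(query, content_text)
    return metadata_weight * metadata_score + (1.0 - metadata_weight) * content_score


# Hypothetical items: (name, metadata text, full-text content).
items = [
    ("A", "Plate tectonics tutorial", "plate boundaries and moving continents"),
    ("B", "Earth science unit", "an overview of rocks and minerals"),
]
query = "plate tectonics"
ranked = sorted(
    items,
    key=lambda item: combined_score(query, item[1], item[2]),
    reverse=True,
)
print([name for name, _, _ in ranked])
```

An item with sparse or missing metadata can still be found through its content, and good metadata can lift an item whose text never uses the query terms.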
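How search service fits into the NSDL
Provides search and discovery functionality to portals
[Figure: architecture diagram with components labeled "Metadata repository", "Search and Discovery Services", "Portal" (three times), and "Content"; connections labeled "OAI", "SDLIP?", and "http?"]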
Extending the architecture to support federations
• Extending the spectrum of search interoperability
  • collections with non-DC metadata schemas
  • distributed and heterogeneous collections
  • richer search functionality
    • geospatial search, thesaurus/concept space search, ...
• Supporting the creation of new and personalized collections
• Providing access to thesaurus and gazetteer services
Terry Smith, Jim Frew (University of California, Santa Barbara)
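The ADEPT approach to search interoperability
[Figure: diagram whose labels include "metadata repository", "portal", "provider", "metadata", "ADEPT collection discovery", "ADEPT client", and "ADEPT per collection"; steps labeled "1. map", "2. harvest & interpret", and "3. h & i"; connections labeled "harvest" and "OAI"]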
User profiles and authentication
• User authentication
• User registry
• Affiliations
• Privacy
• User preferences
User interfaces and portals
• Enable customizable user interface
• Rights management
Kate Wittenberg, David Millman (Columbia University)
Conclusion
The NSDL cannot do everything.
Opportunities for the NSDL
• Categories of material that have been given lower priority by libraries and publishers, e.g., datasets, software, and other dynamic content, ...
• Materials that are accessible for automatic processing, e.g., scientific web sites and databases, image collections, ...
• Materials designed for education, e.g., learning objects, curricula, problem sets, ...
Less opportunity for the NSDL
• Conventional scientific literature with restricted access