FLOW: Federating Libraries on the Web

FLOW: Federating Libraries on the Web. ACM/IEEE Joint Conference on Digital Libraries: Portland, July 17, 2002


Presentation Transcript


  1. FLOW: Federating Libraries on the Web. ACM/IEEE Joint Conference on Digital Libraries, Portland, July 17, 2002. Anna Keller Gold (UC San Diego Libraries); Karen Baker (Scripps Institution of Oceanography, LTER); Kim Baldridge (San Diego Supercomputer Center); Jean-Yves LeMeur (European Center for Nuclear Research, CERN)

  2. Outline: • In theory: defining repository success and developing system requirements to match • In practice: field report and local observations • Next steps: developing for the future JCDL, July 17 2002

  3. 1. In theory: • The individual, team and network have document management needs in common • Building successful research repositories entails active participation by relevant research communities in the full range of repository activities (“GSD”): • Gather • Share • Discover

  4. Repository success depends on a good match between technical and social design. • E.g., institutional vs. disciplinary repositories • Good social design remains an unsolved research problem. See the call for participants in the October 2002 conference addressing the cultural and management aspects of repository building, with an emphasis on institutionally based repositories: http://www.arl.org/ir2002.html

  5. FLOW hypothesis: Repository success depends on addressing divergent roles of repository participants and multiple levels of organization, including: • Divergence among ingrained, more-or-less well-functioning workflows and practices • Multiple (and differing) motivations for participation by individuals, groups, networks, institutions, disciplines

  6. Practices and motivations, e.g. of: • Individuals • Research groups • Institutions • Disciplines

  7. Practices as individuals: • Notebooks, articles, office files • Mail, email, in-person: circulate preprints by mail, email • Personal web pages (multi-format links) • Personal databases (e.g. flat files, citation managers: can extract from, download and import to) • Deposit to/extract from disciplinary repositories (e.g. arXiv)

  8. Motivations as individuals: • Tenure (maintain lists of peer-reviewed publications; track citation counts) • Manage knowledge for easy retrieval and discovery • Exchange with key colleagues • Participate in building shared knowledge

  9. Practices of research groups: • Internal databases (shared) • Web sites with lists

  10. Motivations of research groups: • Manage knowledge • Track output (for funding agencies) • Track impact (greater exposure leads to greater impact) • Discovery

  11. Practices of institutions and organizations: • Publish (e.g. technical reports, conference proceedings, journals) • Create internal databases • Establish repositories • Establish libraries • Hybrid library/repositories

  12. Motivations of institutions: • Sharing, discovery, and reputation • Management and reporting (including accountability to funding agencies) • Archiving

  13. Practices of disciplines: • Professional society databases, portals • Establish disciplinary repositories (may be distributed & federated or centralized, e.g. NCSTRL, arXiv)

  14. Motivations of disciplines: • Sharing • Discovery

  15. FLOW: • The distinctive document management tools and practices used within each layer (individuals, group, center, network, discipline) represent boundaries across which information could flow openly if technology and metadata could provide an enabling digital framework (“metadata grid”)

  16. 2. Practice: • Field report of progress in creating a prototype repository at the San Diego Supercomputer Center using CERN’s CDSware • Goal is to prototype a system that reconciles the divergent practices and motivations of target repository participants

  17. CDSware: reasons for selection • Proven institutional implementation at CERN • Extended features fully implemented (personalization, review) • OAI compliant • Supports hybrid repository / bibliography • Technical support and active development • Open source

  18. CDSware: • CERN implementation of CDSware manages over 350 collections of data, consisting of over 550,000 bibliographic records, including 220,000 full-text documents: preprints, articles, books, journals, photographs… http://cdsware.cern.ch/

  19. CDSware: • Configurable portal-like interface for hosting various kinds of collections • Powerful search engine with Google-like syntax • User personalization, including document baskets and email notification alerts • Electronic submission and upload of various types of documents • Runs an OAI data and service provider, enabling metadata exchange between heterogeneous repositories • Automated citation recognition and linking
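The OAI data provider mentioned on this slide answers plain HTTP requests defined by the OAI Protocol for Metadata Harvesting. As a rough illustration of what a harvester sends, the sketch below builds `ListRecords` request URLs. The base URL and the function names are assumptions made for this example; the slides do not give the actual endpoint, and this is not CDSware code.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the slides do not state the repository's OAI base URL.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    `verb` and `metadataPrefix` are required by the protocol;
    `from` and `set` are optional selective-harvesting arguments.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def resume_url(base_url, token):
    """Build the follow-up request for a partial list.

    In OAI-PMH, resumptionToken is an exclusive argument: it is sent
    with the verb only, without metadataPrefix, from, or set.
    """
    return base_url + "?" + urlencode({"verb": "ListRecords", "resumptionToken": token})


if __name__ == "__main__":
    print(list_records_url(BASE_URL, from_date="2002-07-17"))
    print(resume_url(BASE_URL, "token-123"))
```

A harvester would fetch the first URL, parse the XML response, and keep requesting `resume_url(...)` for as long as the response carries a non-empty resumption token.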

  20. CDSware: • MySQL database server (adaptable to Oracle) • Apache/{PHP,Python} web application server • Compile-time configuration via GNU Autoconf and WML • Runtime configuration via MySQL configuration tables • Integrates with other platform-independent services, e.g. CDS Conversion Server, which converts file formats • Extensible: enables the integration of any other installation-specific application

  21. CDSware status: • CDSware is a major revision and repackaging of CDS (CERN Document Server) • First public release planned for July 2002 • Announce and users mailing lists released June 2002 • News: http://cdsware.cern.ch/news/

  22. Why another repository? • Repositories and their design diverge in important ways: • How things get in • How things get out • Who can put things in (and take out) • What things can be put in • What linkages they have to other systems • What protocols/standards they follow

  23. Comparing repository tools

  24. Comparing repository tools

  25. Comparing repository tools

  26. CDSware at SDSC: • How things get in: • One-by-one item deposits • Batch uploading from local collections • Goal: to also populate the collection via intelligent spidering of designated open collections/documents (ResearchIndex does this now)

  27. CDSware at SDSC: • How things get out: • Extract to bibliographic software • Extract as XML • Extract as MARC 21 records • Extract as DC • Batch or single item extraction
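To make the "extract as XML" and "extract as DC" options concrete, here is a minimal sketch that serializes one record as unqualified Dublin Core inside the `oai_dc` container. The two namespace URIs are the standard ones; the `record_to_dc` function, the dictionary layout, and the choice of fields are assumptions for illustration and do not reflect CDSware's internal record format.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use the conventional prefixes instead of ElementTree's ns0/ns1.
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# Illustrative subset of the 15 Dublin Core elements.
DC_FIELDS = ("title", "creator", "date", "type", "identifier")


def record_to_dc(record):
    """Serialize a dict-based record (hypothetical layout) as oai_dc XML."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field in DC_FIELDS:
        values = record.get(field, [])
        if isinstance(values, str):
            values = [values]  # allow a single value or a list of values
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{field}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    talk = {
        "title": "FLOW: Federating Libraries on the Web",
        "creator": ["Gold, Anna Keller", "Baker, Karen"],
        "date": "2002-07-17",
        "type": "Presentation",
    }
    print(record_to_dc(talk))
```

Dublin Core elements are repeatable, which is why each creator becomes its own `dc:creator` element. A batch extraction would apply the same function to each record in turn.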

  28. CDSware at SDSC: • Who can put things in (or take out) • Organization affiliates (tracked by personnel database) • Registered affiliates (voluntary deposits), associated by research collaboration, or just research interest • Any interested parties (extract only)

  29. CDSware at SDSC: • What things can be put in: • Digital objects plus metadata • Metadata only • Document-like objects • Event records • People records (and associations with organizations and research groups)

  30. CDSware at SDSC: • What (data) linkages with other systems? • Now: personnel database at SDSC • Future: • NSF grants database • OpenURL • Storage Resource Broker (SRB)

  31. CDSware at SDSC: • What protocols / standards followed: • OAI Protocol for Metadata Harvesting (OAI-PMH) • MARC 21 • Z39.80 (article databases, bibliographic software) • DC

  32. Design decisions: • People and digital objects: • Q: Are “creators” authors or people? A: Both. • Integration with personnel database (also enables organization views – “all the people associated with XYZ research group”) • Incorporate records for non-document objects (groups, people, grants) • Allow hybrid system of metadata with or without associated digital objects • End-user uploading from EndNote or similar commercial citation management software is a goal • Genre-based views for public; organization views for center
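The design decisions on this slide (creators as people, links to groups, metadata with or without a digital object) can be pictured as a small relational schema. The sketch below is only an assumption about how such tables might look: every table name, column name, and placeholder value is invented for illustration, and it uses Python's built-in `sqlite3` rather than the MySQL server CDSware runs on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE research_groups (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE membership (
    person_id INTEGER REFERENCES people(id),
    group_id  INTEGER REFERENCES research_groups(id)
);
-- has_object = 0 models a metadata-only record (the hybrid case).
CREATE TABLE records (
    id INTEGER PRIMARY KEY, title TEXT NOT NULL, has_object INTEGER NOT NULL
);
-- A creator is a person row, so authors and personnel share one table.
CREATE TABLE creators (
    record_id INTEGER REFERENCES records(id),
    person_id INTEGER REFERENCES people(id)
);
""")

# Placeholder data, not taken from the presentation.
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [(1, "Person A"), (2, "Person B"), (3, "Person C")])
conn.executemany("INSERT INTO research_groups VALUES (?, ?)",
                 [(1, "XYZ research group"), (2, "Other group")])
conn.executemany("INSERT INTO membership VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])
conn.executemany("INSERT INTO records VALUES (?, ?, ?)",
                 [(1, "Preprint with PDF", 1), (2, "Citation only", 0)])
conn.executemany("INSERT INTO creators VALUES (?, ?)", [(1, 1), (2, 3)])


def people_in_group(db, group_name):
    """Organization view: all the people associated with one research group."""
    rows = db.execute(
        """SELECT p.name FROM people p
           JOIN membership m ON m.person_id = p.id
           JOIN research_groups g ON g.id = m.group_id
           WHERE g.name = ? ORDER BY p.name""", (group_name,))
    return [name for (name,) in rows]


def records_for_group(db, group_name):
    """Titles of records whose creators belong to the given group."""
    rows = db.execute(
        """SELECT DISTINCT r.title FROM records r
           JOIN creators c ON c.record_id = r.id
           JOIN membership m ON m.person_id = c.person_id
           JOIN research_groups g ON g.id = m.group_id
           WHERE g.name = ? ORDER BY r.title""", (group_name,))
    return [title for (title,) in rows]
```

Calling `people_in_group(conn, "XYZ research group")` gives the organization view quoted on the slide, while `records_for_group` shows how the same link lets a center list the output of one group.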

  33. Accomplishments: • Formed interdisciplinary team • Assessed available repository software and design choices • Demonstrated upload from test citation management file • Integrated repository database with internal “people” table linking people with organizations • Grounded in both local practices and management demands

  34. Next steps: • Complete demonstration of submit and upload functions from citation management software and grants database • Populate database using both individual and batch submissions • Demonstrate internal views of data for program administrators

  35. Conclusion: • Further work needed to address integration of repository building with researcher workflow. • Further assess centrality of people and organizations in digital libraries / repositories. • Further assess prospect of creating a metadata grid in which participation and flow are multilateral and multidirectional. In short – continued work toward…

  36. D-Repository Grail: • Accommodate current practices at all levels and • Enhance participation at all stages of research / learning process.

  37. Acknowledgements: • Programming support: Frank Sudholt and Josh Polterock (SDSC) • Integrative Biosciences at SDSC • NSF (DBI and OPP)

  38. References and more information • CDSware: • http://cdsware.cern.ch/ • CDS at SDSC: • agold@ucsd.edu
