Taking Ownership of Electronic Journals & Books: A Tale of Two Repositories CNI Spring 2012 Meeting Alan Darnell Director, Scholars Portal
420,000 FTE http://www.ocul.on.ca
Background • since 2002, 21 Ontario university libraries have collaborated to acquire and manage shared collections of electronic journals and electronic books • licensing happens nationally and provincially through the Canadian Research Knowledge Network and OCUL • repositories are managed by Scholars Portal, a unit of the University of Toronto Library system
Scope of Content • Journals is a repository of over 25.9M full text documents from 11,400 journals, supplemented by article metadata from JSTOR and Project MUSE, bringing total citations to 31.5M and 12,740 journals • Books is a repository of over 460,000 titles, over 100,000 current and close to 360,000 digitized from various Canadian collections participating in the OCA
Content by Publisher [charts: Journals; Books]
Costs • about $29.6M of journal content added in 2011, and cumulatively well over $150M since 2002 • Scholars Portal operating costs are $2.9M annually, with one third of these resources devoted to managing Journals and Books
Goals • Aggregate content for enhanced discovery • Create framework to support long-term preservation of licensed content • Reduce cost through collaborative purchasing and shared infrastructure
View ISO16363 Preservation Metadata
Personal Accounts for Annotations and Bookmarks
Digitized Books with Enhanced Metadata
Isn’t it all just digital content? • Services have broad similarities • License content • Secure local loading and preservation rights • Transfer content from publisher • Develop metadata crosswalk and data loader • Load content and perform QA • Set up entitlements • Distribute metadata to allow for discovery • Gather statistics
It’s all in the details • small differences throughout that workflow combine to make ebooks significantly more effort to manage • and to produce poorer results when measured in terms of enhanced discovery, long-term preservation and cost savings • the following slides highlight some of those differences by looking at a few key elements of both services
Purchasing content
Journals: • Big deals still prevalent • Wide buy-in from libraries • Deal directly with publishers • Annual renewals
Books: • Some big package purchases but more one-off purchases • Wide variations in adoption among libraries • Strong role for aggregators and agents
Licensing and DRMs
Journals: • Very standard license models (OCUL model license) • Wide use of “perpetual access” clauses • Transformation rights generally accepted • No DRMs; unlimited use; ILL rights
Books: • Licenses are wildly different from publisher to publisher • Few specific options for “perpetual access” • Transformation rights unclear (e.g. images) • Common requirements for DRMs (downloading, printing, copy and paste, concurrent use, watermarks)
Publisher Support Infrastructure
Journals: • Established processes to feed journal content to various channels (A&I, discovery systems) • High volume, fast turnaround • Metadata packaged with content • Direct from publisher • Standard formats (e.g. NLM DTD)
Books: • Uneven quality in supporting distribution channels • Slow turnaround • Gap in metadata workflow from publishers to libraries • Intermediaries are common • Internal practices and corporate standards
Entitlements Management
Journals: • Generally straightforward; can be managed at title and year level (12,000 titles) • Some complications with changes in title ownership and appearance of articles in more than one publisher/provider
Books: • Entitlements must be handled at title level (100,000s) • Cherry-picking from collections is common • Tracking DRM and DRM rolling walls • Entitlement is not simply “on” or “off”
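The contrast between the two entitlement models can be made concrete with a small sketch. It assumes, purely for illustration, that a journal entitlement is a title plus a range of years, while a book entitlement is a single title carrying its own DRM terms. The class names, fields, and sample identifiers are hypothetical and do not describe the Scholars Portal data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JournalEntitlement:
    """Hypothetical journal entitlement: one title, an inclusive range of years."""
    issn: str
    first_year: int
    last_year: Optional[int] = None  # None means the subscription is ongoing

    def covers(self, year: int) -> bool:
        if year < self.first_year:
            return False
        return self.last_year is None or year <= self.last_year


@dataclass(frozen=True)
class BookEntitlement:
    """Hypothetical book entitlement: one title plus the DRM terms attached to it."""
    isbn: str
    concurrent_users: Optional[int] = None  # None means unlimited
    allow_download: bool = False
    allow_print: bool = False

    def permits(self, action: str, active_users: int = 0) -> bool:
        # A concurrent-use cap blocks every action once the seats are taken.
        if self.concurrent_users is not None and active_users >= self.concurrent_users:
            return False
        if action == "read":
            return True
        if action == "download":
            return self.allow_download
        if action == "print":
            return self.allow_print
        return False


# Placeholder identifiers, not real holdings.
journal = JournalEntitlement(issn="1234-5679", first_year=2002)
book = BookEntitlement(isbn="9780306406157", concurrent_users=1, allow_print=True)

print(journal.covers(2011))
print(book.permits("read", active_users=1))
```

The journal check is a single yes/no answer per title and year, whereas the book check depends on the action requested and on how many readers are already active, which is one way to read "not simply on or off".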
Quality Control
Journals: • Ensure completeness at volume and issue level • Gaps at the article level identified by end-users • Easy resolution with publishers through reference to dataset as shipped
Books: • Completeness has to be at the title and chapter level • Matching to MARC records via ISBN is problematic • ISTC not in wide use • Match to cover images • Unreliability of title lists
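One small part of the ISBN matching problem is that the same identifier can arrive as a hyphenated ISBN-10 in one source and a bare ISBN-13 in another. The sketch below shows one possible normalization step, using the standard ISBN check-digit rules; the function names are illustrative. Normalization alone would not resolve the harder case where print and electronic editions carry different ISBNs.

```python
import re
from typing import Optional


def isbn13_check_digit(first12: str) -> str:
    """Compute the ISBN-13 check digit from the first 12 digits (weights 1, 3, 1, 3, ...)."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def normalize_isbn(raw: str) -> Optional[str]:
    """Return the ISBN as 13 bare digits, or None if it fails validation."""
    s = re.sub(r"[^0-9Xx]", "", raw).upper()
    if len(s) == 10:
        if not re.fullmatch(r"\d{9}[\dX]", s):
            return None
        # ISBN-10 check: weighted sum (weights 10 down to 1) must be divisible by 11.
        total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
        if total % 11 != 0:
            return None
        core = "978" + s[:9]
        return core + isbn13_check_digit(core)
    if len(s) == 13 and s.isdigit():
        return s if isbn13_check_digit(s[:12]) == s[12] else None
    return None


# A hyphenated ISBN-10 and its ISBN-13 form normalize to the same key.
print(normalize_isbn("0-306-40615-2"))
print(normalize_isbn("978-0-306-40615-7"))
```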
Preservation Issues
Journals: • Clear license language on perpetual access and transformation rights • Organizational commitment to preservation of digital copies • Fairly uniform data formats • Publishers have legal authority to grant transformation rights
Books: • DRM restrictions are antithetical to preservation (watermarks, concurrent use) • Ebook content does not always replicate print book content (e.g. image rights) • Print-based preservation strategies prevail, but e-only books are the near future
Metadata Standards
Journals: • Journal metadata is XML-based and increasingly converging on NLM DTD • Metadata and data packaged in ways that make linking easy; common source • NLM is a common format for both metadata-only and full-text • DOI is a reliable identifier across publishers
Books: • No dominant XML-based metadata format for ebooks (ONIX is not uniformly used by scholarly publishers) • No dominant XML format for ebook full-text (ePub is still a format for trade publishing) • DOI assignment is hit and miss (book and chapter level) • MARC is a foreign standard for publishers
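To illustrate why a common XML format with a reliable DOI simplifies loading, here is a minimal crosswalk sketch that reads a few fields from an NLM-style article header. The sample record, DOI, and output field names are invented for illustration, and the loader actually used by Scholars Portal is not described in these slides.

```python
import xml.etree.ElementTree as ET

# Invented sample record using NLM-style element names.
SAMPLE = """<article>
  <front>
    <journal-meta>
      <journal-title>Example Journal</journal-title>
      <issn pub-type="ppub">1234-5679</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1234/example.2011.001</article-id>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
      <volume>12</volume>
      <issue>3</issue>
      <fpage>45</fpage>
    </article-meta>
  </front>
</article>"""


def crosswalk(xml_text: str) -> dict:
    """Map an NLM-style header to a flat dict (hypothetical target schema)."""
    root = ET.fromstring(xml_text)
    meta = "front/article-meta/"
    return {
        "doi": root.findtext(meta + "article-id[@pub-id-type='doi']"),
        "title": root.findtext(meta + "title-group/article-title"),
        "journal": root.findtext(".//journal-title"),
        "issn": root.findtext("front/journal-meta/issn"),
        "volume": root.findtext(meta + "volume"),
        "issue": root.findtext(meta + "issue"),
        "start_page": root.findtext(meta + "fpage"),
    }


record = crosswalk(SAMPLE)
print(record["doi"], record["title"])
```

With no equivalent dominant format for ebooks, a comparable mapping would likely have to be written per publisher or per intermediary.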
Accessibility Issues
Journals: • Provincial standard is based on WCAG Level 2 • Most PDFs, though not tagged, are readable with screen reading software • Full downloads also allow for ingest into Kurzweil and other adaptive technologies
Books: • Online page readers with no embedded text are invisible to screen readers • Chapter downloads are more effective but rarely allowed • Full book downloads require controlled access • Older digitized materials can be difficult to read with adaptive technologies
Use
Journals: • ~50,000 daily visits • Close to 1M article downloads monthly in peak periods • Split of ~50/50 between publisher and SP • Visitor flow: vast majority of traffic comes from OpenURL resolvers • All content represented in OpenURL KBs
Books: • ~1,800 daily visits • Books accessed in monthly period? • Much lower ratio of SP use compared to publisher • Visitor flow: vast majority of traffic comes from library catalogues • Only 4-5 libraries have loaded MARC records for SP content
Use Drivers
Journals: • OpenURL resolvers • A-Z lists • Importance of being present in big KBs • Issue of “dual access” • Google indexing has a small role for OCUL users (more for external users) • Not present directly in discovery layers
Books: • Library catalogues • Quality of MARC records is an issue for many • Publishers don’t provide high quality MARC records • Sourcing records and then linking is an issue • Google indexing of metadata • OpenURL resolvers • Working to get presence • Discovery layer indexing of public domain content
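Since OpenURL resolvers drive most journal traffic, it may help to see what such a link looks like. The sketch below builds an OpenURL 1.0 (Z39.88-2004) key/encoded-value query for a journal article. The resolver base URL and the citation values are placeholders; a real link would point at a library's own resolver.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_openurl(base_url: str, *, issn: str, volume: str, spage: str,
                  date: str, atitle: Optional[str] = None) -> str:
    """Build an OpenURL 1.0 KEV link for a journal article citation."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.issn", issn),
        ("rft.volume", volume),
        ("rft.spage", spage),
        ("rft.date", date),
    ]
    if atitle:
        params.append(("rft.atitle", atitle))
    return f"{base_url}?{urlencode(params)}"


# Placeholder resolver address and citation values.
url = build_openurl(
    "https://resolver.example.org/openurl",
    issn="1234-5679", volume="12", spage="45", date="2011",
    atitle="An Example Article",
)
print(url)

# Round-trip the query string to show what a resolver would receive.
query = parse_qs(urlsplit(url).query)
print(query["rft.issn"], query["rft.atitle"])
```

A resolver can only turn such a citation into a link if the target's holdings are listed in its knowledge base, which is why presence in the big KBs matters for both services.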
Hope for EBooks? • Secure agreements with publishers to load all content, and not just currently subscribed content • Establish presence in major commercial KBs • Deal with rights issues related to indexing in discovery systems and bypass dependence on MARC • Resist DRM encumbered content – look for other models to deal with lost income due to course adoption • Insist that publishers support ePub 3 for accessible content
Questions? http://journals.scholarsportal.info http://books.scholarsportal.info