CrossRef Search Overview of Issues 2002 CrossRef Annual Member Meeting
Objective of This Presentation • To inform CrossRef members about the issues and get feedback • To demonstrate a prototype of cross-publisher full-text search functionality (using DOIs to link to the full text at publisher web sites)
Topics of Discussion • Status • Motivation • Policy issue descriptions • Prototype description • Demonstration
Status • Under discussion by board for over a year • Market research and feasibility study completed in April 2002 • CrossRef Search technical group formed in April 2002: chose prototype vendor and specifications • CrossRef Search policy issues group formed July 2002 – prepared report on issues for the Board • Policy report and prototype completed in September 2002
Status • CrossRef Search is under discussion and no decision has been made about whether CrossRef will proceed with this project • Extensive review and analysis have been done • No decisions have been made about policy, business and functionality issues • Board will discuss CrossRef Search tomorrow at the board meeting
Motivation • For publishers to do collectively what they can’t do individually • Improve access to scholarly content through full-text search • Market-led initiative: growing demand from librarians and researchers for cross-source and multi-disciplinary searching • Limitation of popular web search services: uneven quality of material
Who Benefits? • Scholars: searching and accessing high-quality, full-text content is easier • Publishers: content is easier to search and discover
Guiding Principles • CrossRef Search will only go ahead if it’s useful to scholars and librarians • CrossRef Search requires the broad support of the CrossRef membership and will be optional • CrossRef Search should not divert resources from CrossRef’s core reference linking service • Adequate funding and dedicated staff are required • External consultants used ($60,000 spent on consulting) • Prototype development outsourced to FAST
Policy Issues • Competition • Governance • Functionality
Policy Issues: Competition • How to handle potential competition with existing member services • Whether a generic search site should be available in addition to search from member sites • Whether libraries and secondary publishers will be able to integrate the search service into their offerings • What value-added services participating members can add to the basic service
Policy Issues: Governance • If the project goes forward, how will decisions be made and how will the service be structured • Deciding on the critical mass of participation for a viable service • Creating the terms and conditions of participation • Working out a realistic business model • Exploring start-up funding sources outside the membership
Preliminary Business Model • Possible fees: per article indexed, per search completed, search volume, click-through rates, licensing fees from libraries and others, end-user fees • Estimated start-up funding required: $1.5 million • Projected first-year revenue from participant fees: $1.7 million (article deposit fees, annual fee) • Projected first-year income: $400K
Policy Issues: Functionality • What to show in search results: abstract? hits in context? • Whether to allow searching by subject area • The appropriate type and use of search statistics to collect and share • What to do about material without DOIs or print-only material
The Prototype • Articles from six publishers: Blackwell Publishing, Elsevier Science, Nature Publishing, Springer Verlag, University of Chicago Press, John Wiley & Sons • Close to 700,000 articles • Range from social sciences and humanities to medicine • Includes backfiles and current material • Search engine provided by Fast • Front ends built by Wiley and CrossRef (mockup generic site)
About Fast • Selected from six vendors to create prototype • No cost for prototype if chosen to do a full system implementation • No hardware cost if project doesn’t go forward • Well known for web search engine alltheweb.com and vendor to major companies such as eBay, Dell, IBM and Reuters • Met scheduled delivery date with agreed-upon functionality
Hardware & Software • 3 servers located at Fast data center: relay node for data collection; search node; document processing and administration • Search functionality uses Fast Data Search 3.1 • Front ends built with Python, PHP, Perl, BBEdit and Visual Studio
Data • Prototype data delivered to Fast via ftp, scp, and physical media (crawling not explored) • Data for prototype is static; the production process of updating records has not been explored • Standard data format not requested: publishers sent their own format of headers in SGML and XML and full text in SGML, XML, HTML, or PDF • Data was not validated
DOI Issues • Publishers sent articles without DOIs, generally non-peer-reviewed content • Articles without DOIs were not indexed • Some problems resolving DOIs: DOIs improperly registered, or registered URLs not working
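The DOI handling described on this slide can be illustrated with a minimal sketch. It is not the prototype's code: the record layout (`title`, `doi`), the function names, and the `https://doi.org/` resolver prefix are all assumptions made for illustration. It shows only the rule stated above, that articles without a DOI are left out of the index, and how a DOI can be turned into a link to the publisher's full text.

```python
from urllib.parse import quote

# Assumption: the public DOI resolver prefix; the slides do not name one.
RESOLVER = "https://doi.org/"


def doi_link(doi):
    """Build a resolver URL for a DOI, percent-encoding unsafe characters."""
    return RESOLVER + quote(doi.strip(), safe="/")


def select_indexable(records):
    """Split records into those with a DOI (indexed) and those without (skipped)."""
    indexed, skipped = [], []
    for rec in records:
        doi = (rec.get("doi") or "").strip()
        if doi:
            # Keep the record and attach the link a search result would use.
            indexed.append({**rec, "doi": doi, "link": doi_link(doi)})
        else:
            skipped.append(rec)
    return indexed, skipped


# Hypothetical sample records, not data from the prototype.
records = [
    {"title": "Example article A", "doi": "10.1234/example.a"},
    {"title": "Example editorial", "doi": None},
]
indexed, skipped = select_indexable(records)
```

A link built this way still fails if the DOI was registered improperly or points at a dead URL, which is the resolution problem the slide reports; detecting that would need a separate check against the resolver.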
Prototype • Functionality is for demonstration purposes and does not imply that policy decisions have been reached • Publishers participated in the prototype on a preliminary basis only • Prototype has provided a lot of information to help determine costs for full system and highlighted a number of “problem areas”
Next Steps • Board will discuss policy issues and next steps tomorrow • This is a big project and will not be undertaken lightly • Decision will be difficult for many publishers