Metasearching: The Promise and Peril Roy Tennant
Outline • Why Metasearching? • The Problem • The Promise • Principles • Metasearching in Libraries Today • Issues • Present Challenges and Possible Futures
The Problem • Most users do not care where the information they need comes from, or who provides it…nor should they have to • But our systems presently require them to know: • How to select one or more databases • How to get to them • How to use the unique search options for each • How can we create systems that minimize what the user needs to know to get what they want?
The Promise of Metasearching • The “Holy Grail” of resource discovery: simple-to-use, one-stop shopping • The simplification of a formerly complex activity (put the complexity in the back end, not the front) • Allows the user to focus on evaluating results, not figuring out where to search
Principles • Only librarians like to search, everyone else likes to find • All things being equal, one place to search is better than two or more • “Good enough” is often just that • The size of the result set isn’t as important as how the results are displayed (e.g., relevance)
Principles • Our ability to create effective one-stop searching is dependent on our ability to appropriately target user needs • Services should be placed as close to the user as possible
Source: ARL Statistics http://searchlight.cdlib.org/cgi-bin/searchlight
Lessons from SearchLight • Metasearching is not for everyone or every purpose… • …but metasearching is still worth doing (it serves particular needs and audiences) • For a large research library, metasearching is best focused on particular needs (e.g., “a few good things”) or subject areas (e.g., Biology)
CDL Metasearch Infrastructure Project • Web site
[Screen mockup: UCSC FindIt, Basic Search results page. “Search less, find more...” FindIt is a service of the UC Libraries, powered by the CDL.] • Search box: giant squid (More search options | Search tips | Ask a Librarian for help with research or using FindIt) • Best bets for finding articles related to giant squid: BIOSIS Previews, Expanded Academic ASAP, Lexis-Nexis • Search for giant squid found 3,345,452 results. The system retrieved 60 results and is displaying 1-50. If you want to wait longer you may wish to try to get more results. • To save time, search in only one place: Google (1,234,132), Britannica Online (1,203), Expanded Academic ASAP (345) • For background information about giant squid, try Encyclopaedia Britannica • Sort results by: Relevance | Title | Source | Year • Result columns: No., Author, Title, Year, Source, Actions (View full text, [details], [basket]) • 1. Watson JD; Crick, FH. Molecular structure of nucleic acids. A structure for deoxyribose nucleic acid. 1953. Nature [via Expanded Academic ASAP] • 2. Miller GA. The magical number seven plus or minus two: some limits on our capacity for processing information. 1956. Psychol Rev [via Expanded Academic ASAP] • 3. Bush, Vannevar. As we may think. 1945. The Atlantic [via Google]
[Architecture diagram] • User: initiates search • Interface software: sends search to the metasearch software; sends search to the Database Advisor Service • Metasearch software: launches multiple searches, receives results, merges and dedupes results, builds display, sends merged display to the interface software • Database Advisor Tool / Database Advisor Service: performs search, identifies top 2-3 DBs, writes out file referenced by results page
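The "launches multiple searches" and "receives results" steps in the diagram can be sketched roughly as follows. This is a minimal illustration, not the CDL implementation: the `metasearch` function, the shape of `targets` (a name mapped to a search callable), and the timeout value are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def metasearch(query, targets, timeout=2.0):
    """Send one query to every target in parallel; keep what returns in time.

    `targets` is a hypothetical mapping of target name -> callable that
    takes the query string and returns a list of record dicts.
    Returns (merged_records, names_of_targets_that_failed_or_timed_out).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(targets)))
    futures = {pool.submit(search, query): name
               for name, search in targets.items()}

    # Wait up to `timeout` seconds in total, then go with whatever arrived.
    done, _ = wait(futures, timeout=timeout)

    merged, failed = [], []
    for future, name in futures.items():  # keeps the order of `targets`
        if future not in done:
            failed.append(name)  # too slow for this results page
            continue
        try:
            records = future.result()
        except Exception:
            failed.append(name)  # the target raised an error
            continue
        # Tag each record with the target it came from.
        merged.extend(dict(record, source=name) for record in records)

    pool.shutdown(wait=False)  # do not block on stragglers
    return merged, failed
```

Giving up on slow targets after a fixed wait is one way to get the behavior shown in the mockup, where the system displays the results retrieved so far and offers to wait longer for more.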
Technical Underpinnings • Structured query/response methods: • Z39.50 • SRU/SRW, the “next generation” (XML Web Services) version of Z39.50 • XML Gateways (proprietary XML APIs) • Unstructured query/response: • URL packing and HTML screen scraping • Record merging and de-duping • Ranking (mostly a dream) • OpenURL support (e.g., SFX)
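As an illustration of a structured query method, an SRU search is an ordinary HTTP GET whose parameters carry a CQL query. The sketch below only builds the request URL. The parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) come from the SRU specification; the helper name `sru_search_url` and the base URL are made up for the example.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, start_record=1,
                   maximum_records=10, version="1.1"):
    """Build an SRU searchRetrieve URL for a CQL query.

    `base_url` is whatever SRU endpoint a target exposes; the one used in
    the usage note below is a placeholder, not a real service.
    """
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,                  # CQL, e.g. dc.title = "giant squid"
        "startRecord": start_record,         # 1-based position of first record
        "maximumRecords": maximum_records,   # page size requested
    }
    return base_url + "?" + urlencode(params)
```

For example, `sru_search_url("http://sru.example.org/catalog", 'dc.title = "giant squid"')` yields a URL that any HTTP client can fetch, and the response is XML that can be parsed rather than screen-scraped.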
Software Provider Issues • Access management • Search mapping • Unreliability of targets • Systems that don’t support an API (that must be screen-scraped) • Inadequate result data for good: • Deduping • Ranking
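To see why inadequate result data hurts deduping, consider a crude match key built from a normalized title and the year. This is only a sketch under assumed field names (`title`, `year`, `source`); real systems use richer matching. A record that lacks a year cannot be matched here, so it survives as an apparent duplicate.

```python
import re


def match_key(record):
    """Return a (title, year) match key, or None if the data is too thin."""
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    year = str(record.get("year", "")).strip()
    if not title or not year:
        return None  # not enough data to match safely
    return (title, year)


def dedupe(records):
    """Collapse records with the same match key, keeping the first seen.

    Every target that supplied a copy is remembered in `sources`, so no
    provider's name is dropped when its record is merged away.
    """
    merged = []
    by_key = {}
    for record in records:
        key = match_key(record)
        if key is not None and key in by_key:
            by_key[key]["sources"].append(record.get("source"))
            continue
        kept = dict(record, sources=[record.get("source")])
        merged.append(kept)
        if key is not None:
            by_key[key] = kept
    return merged
```

Given two copies of "As we may think" (1945) from different targets plus a third copy with no year, `dedupe` returns two records: the merged pair and the unmatched one.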
Database Provider Issues • Access control (robust authentication and authorization) • Load • Inappropriate searches (searching databases that don’t apply) • Branding and “unfair” deduping
Library Issues • Selecting the right system • Cost (both upfront and ongoing) • System design and implementation • System maintenance • Ability to add new resources/targets • Ease of interface changes • Ease of upgrades
User Issues • What must I go through before hitting the search button? • How difficult is it to review results? • Are results ranked by relevance? (that will be my assumption) • Will I get buried? (too many sources, too many results?) • Do I have methods to easily focus in on what I want? • Once I find what I want, can I get to the full-text with a click? • Can I copy a citation and put it in my paper?
Present Challenges & Possible Futures • Software still needs improvement (duh) • Some databases are still not searchable • If you create a “family” of portals, how does one find the right portal to search? A meta-metasearch? • We can learn from other systems (e.g., RedLightGreen) • Standards are on the way (e.g., NISO)