
Leveraging Publisher’s Search Engines to Deliver Relevant Results to Users

Presented by Abe Lederman, President and CTO, Deep Web Technologies, LLC. 28th Annual Scholarly Publishing Meeting – Virginia – June 9, 2006.


Presentation Transcript


  1. Leveraging Publisher’s Search Engines to Deliver Relevant Results to Users
     Presented by Abe Lederman, President and CTO, Deep Web Technologies, LLC
     28th Annual Scholarly Publishing Meeting – Virginia – June 9, 2006

  2. Abe’s Background
     • Earned B.S. and M.S. degrees in Computer Science, MIT
     • 18 years’ experience developing sophisticated information retrieval applications
     • Cofounded Verity, 1988
     • Consulted to LANL, 1994-2000
     • Deployed first “federated search” portal in the Federal government, 1999
     • Founded Deep Web Technologies (DWT), 2002
     DWT is a New Mexico-based company focused on providing state-of-the-art software solutions that search, retrieve, aggregate, and analyze content from web-based databases.

  3. The Problem: Searching a large number of sources can lead to a flood of results

  4. Relevance ranking begins as soon as the user clicks the Search button

  5. Ranking Recipe
     INGREDIENTS
     • Source Selection
     • Query Language
     • Search Conductor
     • Ranking Algorithms
     MIX WELL AND SERVE UP RELEVANT RESULTS

  6. [Diagram: Search Conductor, Source Selection Optimizer, Source Descriptions, Previous Results]

  7. Powerful Query Language
     • Takes advantage of search capabilities of each source
     • Supports full Boolean operators where possible
     • Supports fielded search
     • Translates natural language questions into query syntax
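The fielded and Boolean bullets on this slide can be illustrated with a minimal sketch. It assumes each source advertises its own field syntax and Boolean operator; the names `Term`, `SourceSyntax`, and `translate`, and the `ti:(...)` / `au:(...)` templates, are hypothetical and do not describe any real publisher's query syntax or DWT's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Term:
    field: str  # e.g. "title" or "author" (hypothetical field names)
    text: str


@dataclass
class SourceSyntax:
    """Hypothetical description of what one source's search form accepts."""
    name: str
    field_templates: Dict[str, str] = field(default_factory=dict)
    and_operator: str = "AND"
    supports_boolean: bool = True


def translate(terms: List[Term], syntax: SourceSyntax) -> str:
    """Render one logical query in the syntax of a particular source."""
    parts = []
    for term in terms:
        # Fall back to a quoted phrase when the source has no fielded search.
        template = syntax.field_templates.get(term.field, '"{text}"')
        parts.append(template.format(text=term.text))
    # Sources without Boolean support just receive space-separated phrases.
    joiner = f" {syntax.and_operator} " if syntax.supports_boolean else " "
    return joiner.join(parts)


query = [Term("title", "federated search"), Term("author", "Lederman")]
fielded_source = SourceSyntax("A", {"title": "ti:({text})", "author": "au:({text})"})
plain_source = SourceSyntax("B", supports_boolean=False)

print(translate(query, fielded_source))
print(translate(query, plain_source))
```

The point of the sketch is that one logical query is kept separate from its per-source rendering, so a capable source gets fielded Boolean syntax while a simpler one still gets a usable query.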

  8. Search Conductor
     [Flowchart: Select sources to search; Perform Search; Enough good results? (YES / NO); Deliver results to user; Can I get more results from “good” sources? (YES / NO); Get Next Results]
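One possible reading of this flowchart, written as a loop. The transcript does not preserve the arrows, so the control flow here is an assumption: results are delivered once enough good ones have arrived, and otherwise only sources whose last batch contained good results are asked for their next page. `conduct_search`, `make_source`, `is_good`, and the `score` field are hypothetical names for illustration.

```python
from typing import Callable, Dict, List

Result = dict  # hypothetical result record, e.g. {"title": ..., "score": ...}
FetchPage = Callable[[int], List[Result]]


def conduct_search(
    sources: Dict[str, FetchPage],
    is_good: Callable[[Result], bool],
    enough: int = 10,
    max_pages: int = 5,
) -> List[Result]:
    """Keep paging through productive sources until enough good results arrive."""
    results: List[Result] = []
    good_count = 0
    active = dict(sources)  # sources still worth asking
    page = 0
    while active and page < max_pages:
        still_good = {}
        for name, fetch_page in active.items():
            batch = fetch_page(page)
            good_in_batch = sum(1 for r in batch if is_good(r))
            results.extend(batch)
            good_count += good_in_batch
            if good_in_batch:
                still_good[name] = fetch_page  # a "good" source: ask it again
        if good_count >= enough:
            break  # enough good results: deliver to the user
        active = still_good
        page += 1
    return results


def make_source(pages: List[List[Result]]) -> FetchPage:
    """Build a fake source that serves canned result pages."""
    return lambda page: pages[page] if page < len(pages) else []


demo_sources = {
    "A": make_source([[{"score": 0.9}, {"score": 0.8}], [{"score": 0.7}]]),
    "B": make_source([[{"score": 0.1}], [{"score": 0.95}]]),
}
demo_results = conduct_search(demo_sources, lambda r: r["score"] >= 0.5, enough=3)
```

In the demo, source B returns nothing good on its first page, so under this reading it is not queried again even though its second page holds a strong result. That trade-off (fewer wasted requests versus occasionally missed results) is inherent in the assumed policy.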

  9. Challenges in Organizing and Ranking Results
     • Multi-tier Relevance Ranking
     • User-driven Ranking
     • Clustering of Results

  10. Multi-tier Relevance Ranking
      • QuickRank – Ranks results based on occurrence of search terms in title, author, and snippet
      • MetaRank – Ranks results utilizing custom algorithms applied to meta-data
      • DeepRank – Downloads and indexes full-text documents
      HEAVY LIFTING REQUIRED!
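The QuickRank tier is described only as ranking by occurrence of search terms in title, author, and snippet. A minimal sketch of that idea follows; the field weights, the tokenizer, and the function names are assumptions made for illustration, not the actual QuickRank algorithm.

```python
import re
from typing import Dict, List

# Assumed weights: a hit in the title counts more than one in the snippet.
FIELD_WEIGHTS = {"title": 3.0, "author": 2.0, "snippet": 1.0}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def quick_rank_score(result: Dict[str, str], query: str) -> float:
    """Weighted count of query-term occurrences across the three fields."""
    terms = set(tokenize(query))
    score = 0.0
    for field_name, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(result.get(field_name, ""))
        hits = sum(1 for token in tokens if token in terms)
        score += weight * hits
    return score


def quick_rank(results: List[Dict[str, str]], query: str) -> List[Dict[str, str]]:
    """Return results ordered from highest to lowest score."""
    return sorted(results, key=lambda r: quick_rank_score(r, query), reverse=True)


sample_results = [
    {"title": "Deep web search", "author": "A. Smith",
     "snippet": "Federated search across sources"},
    {"title": "Cooking basics", "author": "B. Jones", "snippet": "A recipe book"},
    {"title": "Federated search portals", "author": "", "snippet": "search"},
]
ranked = quick_rank(sample_results, "federated search")
```

Because it needs only the fields already present in a result list, a scorer like this can run as results stream in, which is consistent with the slide reserving the heavy lifting for the full-text tier.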

  11. User-driven Ranking
      Desired: Blending (weighting) of above criteria
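Blending ranking criteria by user-chosen weights can be sketched as a normalized weighted sum. The transcript does not list the criteria shown on this slide, so `relevance` and `recency` below are placeholder names, and the assumption that each criterion score lies between 0 and 1 is also ours.

```python
from typing import Dict, List


def blended_score(criteria_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-criterion scores (each assumed to be in [0, 1])."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    weighted_sum = sum(
        weight * criteria_scores.get(name, 0.0) for name, weight in weights.items()
    )
    return weighted_sum / total_weight


def rank_blended(results: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Order results by their blended score, best first."""
    return sorted(
        results, key=lambda r: blended_score(r["scores"], weights), reverse=True
    )


candidates = [
    {"id": "a", "scores": {"relevance": 0.9, "recency": 0.2}},
    {"id": "b", "scores": {"relevance": 0.5, "recency": 1.0}},
]
by_relevance = rank_blended(candidates, {"relevance": 3, "recency": 1})
by_recency = rank_blended(candidates, {"relevance": 1, "recency": 3})
```

Dividing by the total weight keeps scores comparable when the user changes the weights, so the same two results swap places in the demo purely because the blend changed.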

  12. Clustering

  13. Attributes of Successful Federated Search
      • Powerful query language that takes advantage of publisher search capabilities
      • Source selection optimizer that reduces unnecessary searches
      • Search conductor that gets more results from sources bringing back good results
      • A tool that highlights the best search results
      • Caching of search results

  14. Advice for Publishers
      • Use good search engines with good relevance ranking
      • Return 100 or more results at a time
      • Return meta-data (author, journal, snippet) as part of result list
      • Provide access to your content through XML Gateway or Web Services
      • Speed up search time

  15. Thank You!
      Abe Lederman
      301 N Guadalupe, Ste 201
      Santa Fe, NM 87501
      abe@deepwebtech.com
      www.deepwebtech.com
