
Coverage and Independence: Measuring Quality in Web Search Results

Panagiotis Takis Metaxas, Lilia Ivanova, Eni Mustafaraj. Department of Computer Science, Wellesley College, USA.


Presentation Transcript


  1. Coverage and Independence: Measuring Quality in Web Search Results. Panagiotis Takis Metaxas, Lilia Ivanova, Eni Mustafaraj. Department of Computer Science, Wellesley College, USA

  2. Precision and Recall in Traditional IR

  3. Precision and Recall in Web IR
     • High precision is easy to achieve but does not convey useful information
     • Recall is uninteresting and cannot be computed accurately because of the enormous size of the web
     • 85% of web searchers never look past the top 10!

  4. But what is Quality?

  5. Quality when searching controversial issues?

  6. Quality when searching political issues? But Google is usually so good at finding info… Why does it do that?

  7. Define Search Quality in a web-meaningful way
     • Comprehensive Coverage = lack of bias towards some search results
     • For a controversial issue (at a minimum): cover the pro, con and balanced opinions
     • For k opinions and top-N results:
       • expected number of results per opinion: N/k
       • Coverage Bias = total distance from N/k

  8. Define Search Quality in a web-meaningful way
     • Comprehensive Coverage = lack of bias towards some search results
     • (bad coverage) 0 ≤ C ≤ 1 (good coverage)
     • Now we can talk about, e.g., 60% coverage
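
The coverage definition on slides 7 and 8 can be illustrated with a minimal Python sketch. The slides give Coverage Bias as the total distance from N/k and state that C lies between 0 and 1, but they do not show how the bias is normalized. The normalization below (dividing by the worst-case bias, reached when all N results support a single opinion) is an assumption made for illustration, as are the function names and the requirement that every result is assigned to exactly one opinion.

```python
def coverage_bias(counts):
    """Total distance of the per-opinion result counts from the even share N/k.

    counts[i] is the number of top-N results assigned to opinion i.
    """
    n = sum(counts)
    k = len(counts)
    expected = n / k  # expected number of results per opinion
    return sum(abs(c - expected) for c in counts)


def coverage(counts):
    """Coverage C in [0, 1]; 1 means results are spread evenly over opinions.

    ASSUMPTION (not from the slides): C = 1 - bias / worst_case_bias, where
    the worst case puts all N results on one opinion, giving 2N(k-1)/k.
    """
    n = sum(counts)
    k = len(counts)
    if k < 2 or n == 0:
        raise ValueError("need at least two opinions and one result")
    worst_case = 2 * n * (k - 1) / k
    return 1 - coverage_bias(counts) / worst_case


# Hypothetical top-10 result list classified as pro / con / balanced.
print(round(coverage([4, 3, 3]), 3))
print(round(coverage([10, 0, 0]), 3))
```

Under this assumed normalization, a top-10 list split 4/3/3 across three opinions scores close to 1, while a list where all ten results support one opinion scores 0.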

  9. Define Search Quality in a web-meaningful way
     • Independent search results = results that are not dependent due to spamming
       • u: URL Dependency
       • r: Redirection Dependency
       • c: Content Dependency
       • l: Link Dependency

  10. Example of Dependent Results: Google’s “HGH benefits” (table annotations: redirection dependency, URL dependency). Table 1: Top-10 results of Google when given the query “HGH benefits” for August 2007 and September 2007. For each entry we have calculated the size of the backGraph as (|V|, |E|) revealed by the Google API and the change between these two dates.

  11. Example of Dependent Results: Yahoo’s “Is ADHD a real disease” (table annotations: content dependency, link dependency). Table 4: Top-10 results of the Yahoo search engine when given the query “Is ADHD a real disease” (August and September 2007).

  12. Define Search Quality in a web-meaningful way
     • Independent search results = results that are not dependent due to spamming
       • u: URL Dependency
       • r: Redirection Dependency
       • c: Content Dependency
       • l: Link Dependency
     • (total dependence) 0 ≤ Independence ≤ 1 (total independence)
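
The slide bounds independence between 0 and 1 but the transcript does not preserve the formula. As a rough sketch only, one way to obtain a score in that range is to take the fraction of top-N results that carry none of the four dependency labels (u, r, c, l). This proportion is an assumption for illustration, not the authors' definition, and the function name and input format are hypothetical.

```python
# The four dependency labels named on the slide.
DEPENDENCY_TYPES = {"u", "r", "c", "l"}  # URL, redirection, content, link


def independence(result_flags):
    """Fraction of results with no dependency label.

    result_flags holds one set per search result, containing the dependency
    labels a human or tool assigned to that result (empty set = independent).
    ASSUMPTION: a simple proportion; the slides do not give the formula.
    """
    if not result_flags:
        raise ValueError("need at least one result")
    for flags in result_flags:
        unknown = set(flags) - DEPENDENCY_TYPES
        if unknown:
            raise ValueError(f"unknown dependency labels: {sorted(unknown)}")
    dependent = sum(1 for flags in result_flags if flags)
    return 1 - dependent / len(result_flags)


# Hypothetical labelling of four results: two flagged, two clean.
print(independence([{"u"}, {"r", "c"}, set(), set()]))
```

With this sketch, a result list in which every entry is flagged scores 0 (total dependence) and a list with no flags scores 1 (total independence), matching the endpoints on the slide.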

  13. Evaluating Quality of 3 Search Results
     • Query with commercial interest: “Human Growth Hormone (HGH) benefits”
     • Query with medical interest: “Is ADHD a real disease?”
     • Query with political interest: “Morality of abortions”

  14. Evaluating Quality of 3 Search Results (charts: Coverage of Google, Coverage of Yahoo, Independence). Our results show low coverage for controversial questions that are not highly pursued and higher coverage for an issue that is highly pursued (“Abortion”). They also show high independence of results that are not highly pursued and higher independence for an issue that is highly pursued (“Abortion”). There is significant overlap between the top-10 results returned by Yahoo and Google!

  15. Comparing visible neighborhoods (figure regions: Google, Both, Yahoo)

  16. Thank you! pmetaxas@wellesley.edu. Coverage and Independence: Measuring Quality in Web Search Results. Panagiotis Takis Metaxas, Department of Computer Science, Wellesley College, USA

  17. Example of Dependent Results: Yahoo’s “HGH benefits”

  18. Example of Dependent Results: Google’s “Is ADHD a real disease”

  19. Example of Dependent Results: Google’s “Morality of Abortion”

  20. Example of Dependent Results: Yahoo’s “Morality of Abortion”
