
Partitioning Search-Engine Returned Citations for Proper-Noun Queries



Presentation Transcript


  1. Partitioning Search-Engine Returned Citations for Proper-Noun Queries. Reema Al-Kamha. Supported by NSF.

  2. The Problem
     • Search engines return too many citations
       • Example: “Bonnie Lake”
       • Google returns around 800 citations
       • Citations ranked best first
       • Many refer to the same object
     • Can we partition by same object?
     • Proper-Noun Queries
       • Discard citations not of the right kind
       • Partition the rest by same object
       • Retain the best-first ranking

  3. “Bonnie Lake” Query to Google

  4. The Interface

  5. “Bonnie Lake” Query Result

  6. Solution
     • Classification
       • Group 1: those of the chosen kind
       • Group 2: those not of the chosen kind
     • Partition
       • Three facets
         • Attributes
         • Links
         • Page Similarity
       • Sub-facets for each facet
       • Confidence Matrix for each sub-facet
       • (Weighted) Mean for each facet
       • Final Confidence Matrix
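A minimal sketch of the "(weighted) mean" step, assuming each sub-facet yields a square matrix of pairwise confidences in [0, 1]. The function name `weighted_mean`, the two link sub-facet matrices, and the weights are all hypothetical; the slides do not give concrete values or say how weights are chosen.

```python
from typing import List

Matrix = List[List[float]]


def weighted_mean(matrices: List[Matrix], weights: List[float]) -> Matrix:
    """Element-wise weighted mean of equally sized square matrices."""
    if not matrices or len(matrices) != len(weights):
        raise ValueError("need exactly one weight per matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for m, w in zip(matrices, weights)) / total
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical sub-facet matrices for the Links facet over three citations.
# Entry [i][j] is the confidence that citations i and j refer to the same object.
link_together = [[1.0, 0.9, 0.0],
                 [0.9, 1.0, 0.0],
                 [0.0, 0.0, 1.0]]
common_prefix = [[1.0, 0.3, 0.0],
                 [0.3, 1.0, 0.0],
                 [0.0, 0.0, 1.0]]

# Assumed weights: direct links count twice as much as a shared URL prefix.
links_facet = weighted_mean([link_together, common_prefix], [2.0, 1.0])
print(links_facet[0][1])
```

The same helper could, under the same assumption, combine the three facet matrices into the final confidence matrix.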

  7. Attributes
     • Attribute(s) (One-to-One): latitude and longitude
     • Single Attribute (Functional Determination): province with a lake’s name
     • Multiple Attributes (Functional Determination): campground name and highway with a lake’s name
     • Attributes (Nonfunctional Determination): country with a lake’s name
     • Distinguishing Attribute: state for a lake

  8. Links
     • Returned citations that link together
     • Returned citations that have a common URL prefix: same host, same file name, and same URL
       • Example of same host:
         http://www.cs.byu.edu/info/dwembley.html
         http://www.cs.byu.edu/info/directory.php
       • Example of same file name:
         http://sunsite.unc.edu/javafaq/oldnews.html
         http://helios.oit.unc.edu/javafaq/oldnews.html
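The host and file comparisons can be sketched with Python's standard `urllib.parse`. This is an illustration only: the function name `url_relation` is hypothetical, and treating "same file name" as "identical URL path on a different host" is an assumption that happens to fit the two example pairs on the slide.

```python
from urllib.parse import urlparse


def url_relation(a: str, b: str) -> str:
    """Classify how two citation URLs relate: same URL, host, file, or none."""
    if a == b:
        return "same URL"
    pa, pb = urlparse(a), urlparse(b)
    # Host names are case-insensitive, so compare them lowercased.
    if pa.netloc.lower() == pb.netloc.lower():
        return "same host"
    # Assumption: "same file" means an identical, non-trivial path.
    if pa.path == pb.path and pa.path not in ("", "/"):
        return "same file"
    return "unrelated"


print(url_relation("http://www.cs.byu.edu/info/dwembley.html",
                   "http://www.cs.byu.edu/info/directory.php"))
print(url_relation("http://sunsite.unc.edu/javafaq/oldnews.html",
                   "http://helios.oit.unc.edu/javafaq/oldnews.html"))
```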

  9. Confidence Matrix for Returned Citations that Link Together [figure not preserved; surviving labels: 1, 4]

  10. Page Similarity
     • Similarity between each pair of returned citations
     • Similarity between each pair of citation-referenced documents
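The slides do not say which similarity measure is used. Purely as an illustration, the sketch below assumes cosine similarity over word counts, a common choice for comparing short texts such as citation snippets or page bodies. The function name and the sample strings are hypothetical.

```python
import math
import re
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of two texts using lowercase word counts."""
    a = Counter(re.findall(r"[a-z0-9]+", text_a.lower()))
    b = Counter(re.findall(r"[a-z0-9]+", text_b.lower()))
    # Dot product over the words the two texts share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Hypothetical citation snippets, not taken from the presentation.
print(cosine_similarity("Bonnie Lake campground in Washington",
                        "Camping at Bonnie Lake, Washington"))
```

A score like this could fill one cell of a page-similarity confidence matrix, again only under the stated assumption.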

  11. Confidence Matrix for Similarity between two Citation-Referenced Documents

  12. Modified Confidence Matrix for Similarity between two Citation-Referenced Documents

  13. Final Matrix. Pairs: 1,4; 3,5; 5,8; 7,8. Partitions: {1,4} {3,5,7,8} {2} {6}
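One way to read this slide is that pairs judged to be the same object are merged transitively into partitions. The sketch below assumes exactly that, using a small union-find over eight citations. The function name `partition` and the choice of eight citations are assumptions inferred from the numbers on the slide, and the author's actual procedure may differ.

```python
from typing import Dict, List, Tuple


def partition(n: int, pairs: List[Tuple[int, int]]) -> List[List[int]]:
    """Group citations 1..n so that every listed pair ends up together."""
    parent = list(range(n + 1))  # index 0 is unused

    def find(x: int) -> int:
        # Follow parent pointers to the root, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the best-ranked (lowest-numbered) citation as the root.
            parent[max(ra, rb)] = min(ra, rb)

    groups: Dict[int, List[int]] = {}
    for c in range(1, n + 1):
        groups.setdefault(find(c), []).append(c)
    # Order partitions by their best-ranked member.
    return sorted(groups.values(), key=lambda g: g[0])


print(partition(8, [(1, 4), (3, 5), (5, 8), (7, 8)]))
# prints [[1, 4], [2], [3, 5, 7, 8], [6]]
```

The groups match the partitions on the slide. Here they are ordered by best-ranked member, which is one possible way to retain the best-first ranking.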

  14. “Bonnie Lake”—Results

  15. Measurements
     • Classification (percent correctly classified)
     • Number of Partitions (precision and recall)
     • Each Partition (precision and recall)
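As a hedged illustration of the per-partition measurement, the sketch below computes precision and recall for one system-produced partition against a hand-labeled one, using the standard set-overlap definitions. The function name and both sets are hypothetical. The slides do not say how partitions are matched to the reference, nor how precision and recall apply to the number of partitions.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[int], gold: Set[int]) -> Tuple[float, float]:
    """Precision and recall of one predicted partition against a gold partition."""
    overlap = len(predicted & gold)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical data: the system grouped citations 3, 5, 7, 8,
# while a human judge says only 3, 5, 8 refer to the same object.
p, r = precision_recall({3, 5, 7, 8}, {3, 5, 8})
print(p, r)  # prints 0.75 1.0
```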

  16. Current Implementation Status
     • Interface
     • Google connection
     • Citations retrieval
     • Page retrieval

  17. Contribution
     • Solve one type of object-identity problem
     • Provide an additional tool for search engine queries
