In Situ Evaluation of Entity Ranking and Opinion Summarization using findilike
Kavita Ganesan & ChengXiang Zhai
University of Illinois at Urbana-Champaign
www.findilike.com
What is findilike?
• Preference-driven search engine
• Currently works in the hotels domain
• Finds & ranks hotels based on user preferences:
   - Structured: price, distance
   - Unstructured: "friendly service", "clean", "good views" (based on existing user reviews) [UNIQUE]
• Beyond search: support for analysis of hotels
   - Opinion summaries
   - Tag cloud visualization of reviews
…What is findilike?
• Developed as part of PhD work: a new system (Opinion-Driven Decision Support System, UIUC, 2013)
• Tracked ~1000 unique users from Jan-Aug '13
• Working on speed & reaching out to more users
2 components that can be evaluated through natural user interaction
1. Ranking entities based on unstructured user preferences: Opinion-Based Entity Ranking (Ganesan & Zhai, 2012)
2. Summarization of reviews: generating short phrases summarizing key opinions (Ganesan et al., 2010, 2012)
Evaluation of entity ranking
• Retrieve results with two ranking models (e.g., Base and Dirichlet LM)
• Interleave the two result lists using balanced interleaving (Joachims, 2002)
• A click indicates a preference for the model that contributed the clicked result
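The interleaving step above can be sketched as follows. This is an illustrative implementation of balanced interleaving, not findilike's actual code; class and method names are ours, and a live system would randomize which ranking leads on each query rather than take a boolean flag.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Minimal sketch of balanced interleaving (Joachims, 2002). */
public class BalancedInterleaver {

    /**
     * Merges rankings a and b into one list of up to k results.
     * Pointers into a and b are kept balanced: the ranking that has
     * contributed fewer results so far supplies the next one.
     * aFirst breaks ties (randomized per query in a live system).
     */
    public static List<String> interleave(List<String> a, List<String> b,
                                          int k, boolean aFirst) {
        LinkedHashSet<String> merged = new LinkedHashSet<>(); // dedups, keeps order
        int ia = 0, ib = 0;
        while (merged.size() < k && (ia < a.size() || ib < b.size())) {
            boolean takeA;
            if (ia >= a.size())      takeA = false;
            else if (ib >= b.size()) takeA = true;
            else                     takeA = (ia < ib) || (ia == ib && aFirst);
            // advance the chosen pointer; the set silently skips duplicates
            if (takeA) merged.add(a.get(ia++));
            else       merged.add(b.get(ib++));
        }
        return new ArrayList<>(merged);
    }
}
```

Because results shared by both rankings are added only once, a click can be credited to whichever model ranked the clicked document higher, which is what makes the click a preference signal.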
Snapshot of pairwise comparison results for entity ranking
[Table: algorithm pairs drawn from Dirichlet LM, Base, and PL2, with the number of queries on which each algorithm in the pair was preferred]
• Base model better, but Dirichlet LM not too far behind
• Base model better, and PL2 not too good
Evaluation of review summarization
• Randomly mix the top N phrases from two algorithms (Algo1, Algo2)
• Monitor click-through on a per-entity basis
• More clicks on phrases from Algo1 than from Algo2 → Algo1 is better
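A minimal sketch of this mixing-and-tallying scheme follows. The class, method, and label names ("algo1", "algo2") are ours, not findilike's; the point is only that each displayed phrase stays tagged with its source algorithm so clicks can be attributed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Sketch of interleaved summary evaluation; names are illustrative. */
public class SummaryInterleaver {

    /** A summary phrase tagged with the algorithm that produced it. */
    public record Phrase(String text, String algo) {}

    /** Randomly mixes the top-n phrases from two algorithms for display. */
    public static List<Phrase> mix(List<String> algo1, List<String> algo2,
                                   int n, Random rng) {
        List<Phrase> mixed = new ArrayList<>();
        for (String p : algo1.subList(0, Math.min(n, algo1.size())))
            mixed.add(new Phrase(p, "algo1"));
        for (String p : algo2.subList(0, Math.min(n, algo2.size())))
            mixed.add(new Phrase(p, "algo2"));
        Collections.shuffle(mixed, rng); // hide which algorithm produced what
        return mixed;
    }

    /** Tallies clicked phrases by source algorithm; more clicks wins. */
    public static Map<String, Integer> tally(List<Phrase> clicked) {
        Map<String, Integer> counts = new HashMap<>();
        for (Phrase p : clicked) counts.merge(p.algo(), 1, Integer::sum);
        return counts;
    }
}
```

Tallying per entity, as the slide notes, avoids letting one heavily viewed hotel dominate the comparison.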
How to submit a new algorithm?
1. Extend the existing code (write Java-based code)
2. Test on the mini testbed: sample code, test data & gold standard, and an evaluator (nDCG, ROUGE) produce a local performance report
3. Submit the code to get online performance
More information about evaluation… eval.findilike.com
Thanks! Questions?
Links
• Evaluation: http://eval.findilike.com
• System: http://www.findilike.com
• Related papers: kavita-ganesan.com
References
• Ganesan, K. A., C. X. Zhai, and E. Viegas. Micropinion Generation: An Unsupervised Approach to Generating Ultra-Concise Summaries of Opinions. In Proceedings of the 21st International Conference on World Wide Web (WWW '12), 2012.
• Ganesan, K. A., and C. X. Zhai. Opinion-Based Entity Ranking. Information Retrieval, vol. 15, issue 2, 2012.
• Ganesan, K. A., C. X. Zhai, and J. Han. Opinosis: A Graph-Based Approach to Abstractive Summarization of Highly Redundant Opinions. In Proceedings of the 23rd International Conference on Computational Linguistics (COLING '10), 2010.
• Joachims, T. Optimizing Search Engines Using Clickthrough Data. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '02), 2002.
Evaluating Review Summarization: Mini Testbed
• Base code to extend
• Set of sample sentences
• Gold standard summary for those sentences
• ROUGE toolkit to evaluate the results
• Data set based on Ganesan et al. 2010
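For reference, here is a stripped-down version of what a ROUGE-style unigram-recall score computes. The testbed uses the full ROUGE toolkit, which additionally handles higher-order n-grams, stemming, stopword removal, and multiple references; this sketch only shows the core overlap idea.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified ROUGE-1 recall: fraction of reference unigrams
 *  that also appear in the candidate summary (with clipping). */
public class Rouge1 {

    public static double recall(String candidate, String reference) {
        Map<String, Integer> refCounts = counts(reference);
        Map<String, Integer> candCounts = counts(candidate);
        int overlap = 0, total = 0;
        for (Map.Entry<String, Integer> e : refCounts.entrySet()) {
            total += e.getValue();
            // clip each token's credit at its candidate count
            overlap += Math.min(e.getValue(),
                                candCounts.getOrDefault(e.getKey(), 0));
        }
        return total == 0 ? 0.0 : (double) overlap / total;
    }

    /** Lowercased whitespace-token counts. */
    private static Map<String, Integer> counts(String text) {
        Map<String, Integer> c = new HashMap<>();
        for (String tok : text.toLowerCase().split("\\s+"))
            if (!tok.isEmpty()) c.merge(tok, 1, Integer::sum);
        return c;
    }
}
```

For example, the candidate "clean rooms" recovers 2 of the 3 reference unigrams in "very clean rooms".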
Evaluating Entity Ranking: Mini Testbed
• Base code to extend
• Terrier index of hotel reviews
• Gold standard ranking of hotels
• Code to generate nDCG scores
• Raw unindexed data set for reference
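The nDCG evaluator can be sketched as follows, using the common exponential-gain formulation; whether the testbed's evaluator uses this exact variant is an assumption on our part.

```java
import java.util.Arrays;

/** Minimal nDCG@k over graded relevance labels (higher = more relevant). */
public class Ndcg {

    /** gains[i] is the graded relevance of the item ranked at position i. */
    public static double ndcg(int[] gains, int k) {
        double dcg = dcgAt(gains, k);
        // ideal DCG: the same labels sorted into descending order
        int[] ideal = gains.clone();
        Arrays.sort(ideal);
        for (int i = 0; i < ideal.length / 2; i++) {
            int t = ideal[i];
            ideal[i] = ideal[ideal.length - 1 - i];
            ideal[ideal.length - 1 - i] = t;
        }
        double idcg = dcgAt(ideal, k);
        return idcg == 0 ? 0.0 : dcg / idcg;
    }

    /** DCG@k = sum over top k of (2^gain - 1) / log2(rank + 1). */
    private static double dcgAt(int[] gains, int k) {
        double dcg = 0;
        for (int i = 0; i < Math.min(k, gains.length); i++)
            dcg += (Math.pow(2, gains[i]) - 1) / (Math.log(i + 2) / Math.log(2));
        return dcg;
    }
}
```

A perfectly ordered ranking scores 1.0; moving a relevant hotel down the list lowers the score, with errors near the top penalized most.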
Building a new ranking model
• Extend Terrier's WeightingModel class
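Since the slide does not show the extension itself, here is a self-contained sketch that mirrors the shape of Terrier's WeightingModel extension point. In real Terrier the abstract class lives in org.terrier.matching.models and the framework supplies collection statistics via instance fields; the class names and the TF-IDF formula below are purely illustrative.

```java
/** Stand-in for Terrier's WeightingModel: score one (term, document) pair. */
abstract class SketchWeightingModel {
    public abstract double score(double tf, double docLength);
    public abstract String getInfo();
}

/** Illustrative subclass: length-normalized TF times smoothed IDF. */
public class SimpleTfIdf extends SketchWeightingModel {
    private final double numDocs;  // collection size (set by framework in Terrier)
    private final double docFreq;  // documents containing the current term

    public SimpleTfIdf(double numDocs, double docFreq) {
        this.numDocs = numDocs;
        this.docFreq = docFreq;
    }

    @Override
    public double score(double tf, double docLength) {
        double normTf = tf / docLength;                        // length-normalized TF
        double idf = Math.log((numDocs + 1) / (docFreq + 1));  // smoothed IDF
        return normTf * idf;
    }

    @Override
    public String getInfo() { return "SimpleTfIdf"; }
}
```

With the real API the subclass is dropped on Terrier's classpath and selected by name in its configuration, so a new ranking model needs no changes to the rest of the system.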