Recommending Questions Using the MDL-based Tree Cut Model Yunbo CAO, Huizhong DUAN, Chin-Yew LIN, Yong YU, and Hsiao-Wuen HON Natural Language Computing Group Microsoft Research Asia
Community-based Q&A Service • Question Search • Other Aspects about Hamburg or Berlin • More Aspects (NOT DISCOVERED) • How far is it from Berlin to Hamburg? • Where to see between Hamburg and Berlin? • …
Question Recommendation • The problem • You ask: • Any cool clubs in Berlin or Hamburg? • We recommend: • How far is it from Berlin to Hamburg? • Where to see between Hamburg and Berlin? • Any good hostels in Hamburg or Berlin? • The principle of question recommendation • A good recommendation should be different from the queried question in question focus but similar in question topic.
Outline • Question recommendation • Our approach • A walk-through of our approach • The uses of the MDL-based tree cut model • The flow of question recommendation • Related work • Experimental results • Conclusions
Our Approach • The Principle: A good recommendation should be different from the queried question in question focus but similar in question topic. • How can we discriminate question topic from question focus? • Query: Any cool clubs in Hamburg or Berlin? • Topic terms: cool clubs, Hamburg, Berlin • Related question: Where to see in Hamburg or Berlin? • Topic terms: where to see, Hamburg, Berlin • Question focus (cool clubs vs. where to see): different • Question topic (Hamburg, Berlin): same or close
Specificity – Weighing Terms • Travel @Yahoo! Answers sub-categories: Asia Pacific, China, Japan, …, Europe, … • China: • Anyone know where to see the Dragon Boat Festival in Beijing? • Where is a good (less expensive) place to shop in Beijing? • What's the cheapest way to get from Beijing to Hong Kong? • Europe: • How far is it from Berlin to Hamburg? • What is the cheapest way from Berlin to Hamburg? • Where to see between Hamburg and Berlin? • How long does it take from Hamburg to Berlin on the train? • The specificity of a topic term is the inverse entropy of the distribution of the topic term over the sub-categories.
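The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `specificity`, the small smoothing constant `epsilon` (added so that a term confined to one sub-category does not divide by zero), and all the counts are assumptions made for the example.

```python
import math

def specificity(subcategory_counts, epsilon=0.001):
    """Inverse entropy of a term's distribution over sub-categories.

    subcategory_counts maps a sub-category name to the number of
    questions in it that contain the term. epsilon is an assumed
    smoothing constant, not something stated on the slide.
    """
    total = sum(subcategory_counts.values())
    if total == 0:
        raise ValueError("term does not occur in any sub-category")
    entropy = 0.0
    for count in subcategory_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log(p)
    return 1.0 / (entropy + epsilon)

# Hypothetical counts: "Berlin" is concentrated in one sub-category,
# "where to see" is spread evenly across all of them.
berlin = {"Europe": 98, "Asia Pacific": 1, "China": 1}
where_to_see = {"Europe": 34, "Asia Pacific": 33, "China": 33}

print(specificity(berlin) > specificity(where_to_see))
```

A term that appears almost only under one sub-category has low entropy and therefore high specificity, which is why place names such as Berlin score higher than generic phrases such as "where to see".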
Order Topic Terms by Specificity • Query: Any cool clubs in Hamburg or Berlin? • Topic Terms: cool clubs, Hamburg, Berlin • Topic Chain: Hamburg → Berlin → cool clubs • Related questions: Where to see in Hamburg or Berlin? • How far is it from Berlin to Hamburg? • Topic Terms: where to see, Hamburg, Berlin • Topic Chain: Hamburg → Berlin → where to see • Hamburg → Berlin → how far • Question Topic: Hamburg, Berlin • Question Focus: cool clubs, where to see, how far
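Building a topic chain amounts to sorting a question's topic terms by specificity. The sketch below assumes descending order (most specific term first), which matches the chains shown on the slide; the function name `topic_chain` and the specificity values are hypothetical.

```python
def topic_chain(term_specificity):
    """Order topic terms from most to least specific.

    term_specificity maps each topic term to its specificity score.
    Descending order is an assumption that matches the slide's example.
    """
    ranked = sorted(term_specificity.items(), key=lambda kv: kv[1], reverse=True)
    return [term for term, _ in ranked]

# Hypothetical specificity values for the terms of the example query.
query_terms = {"cool clubs": 1.2, "Hamburg": 9.5, "Berlin": 8.7}
chain = topic_chain(query_terms)
print(" -> ".join(chain))  # Hamburg -> Berlin -> cool clubs
```

With this ordering the head of the chain carries the question topic and the tail carries the question focus, so the remaining problem is deciding where to cut the chain.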
Scoring the Candidates • The recommendation score over a queried question and a recommendation candidate is defined as [equation not preserved in this transcript] • Diagram labels: Question Topic, Question Focus
The MDL-based Tree Cut Model • The MDL principle • Model description length: uniform prior • Parameter description length: number of parameters • Data description length: minus log likelihood • The tree cut model (Li and Abe, 1998)
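The three description lengths listed above can be combined into a single score per candidate cut. The sketch below follows one reading of Li and Abe (1998): the parameter cost is taken as (k/2)·log2|S| with k free parameters and sample size |S|, and each class spreads its probability uniformly over its leaves. Those two choices, the function name `description_length`, and the leaf counts are assumptions for illustration; the slide only says "number of parameters" and "minus log likelihood". The model description length is omitted because a uniform prior makes it the same for every cut.

```python
import math

def description_length(cut, sample_size):
    """Parameter plus data description length of one tree cut.

    cut is a list of (class_frequency, num_leaves) pairs, one per class
    (node) in the cut. Assumes P(leaf) = P(class) / num_leaves.
    """
    k = len(cut) - 1  # free parameters: class probabilities sum to one
    param_len = 0.5 * k * math.log2(sample_size)
    data_len = 0.0
    for freq, num_leaves in cut:
        if freq > 0:
            p_leaf = (freq / sample_size) / num_leaves
            data_len -= freq * math.log2(p_leaf)
    return param_len + data_len

# Hypothetical tree with four leaves whose counts are 3, 3, 1, 1.
candidate_cuts = {
    "root": [(8, 4)],                          # everything merged
    "mid": [(6, 2), (2, 2)],                   # two internal nodes
    "leaves": [(3, 1), (3, 1), (1, 1), (1, 1)],  # nothing merged
}
best = min(candidate_cuts, key=lambda name: description_length(candidate_cuts[name], 8))
print(best)
```

The cut with the smallest total length is selected: merging nodes lowers the parameter cost, while keeping them apart lowers the data cost when their frequencies genuinely differ.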
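Reduction of Topic Terms • [Figure: modifier tree for "hotel" before and after the reduction; term frequencies in parentheses] • Before: hotel (3983); western (40), nice (224), beachfront (5), affordable (248), suite (3); good (14), inexpensive (12), nice (2), embassy (1), great (3), good (3) • After: hotel (3983); western (66), nice (224), beachfront (11), suite (6), affordable (248); good (14), inexpensive (12)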
Determining the Cut • Query: Any cool clubs in Berlin or Hamburg? • Tree nodes: Berlin, Hamburg, cool club, where to see, how far, good hostel, fun club • Related questions: • Where to see between Hamburg and Berlin? • How far is it from Berlin to Hamburg? • Any good hostels in Hamburg or Berlin? • What are the best/most fun clubs in Hamburg?
Flow of Question Recommendation • STEP 1: Retrieve Related Questions • Query: Any cool clubs in Berlin or Hamburg? • Related Questions (from the index): 1. Where to see between Hamburg and Berlin? 2. How far is it from Berlin to Hamburg? 3. Any good hostels in Hamburg or Berlin? 4. What are the best/most fun clubs in Hamburg? • STEP 2: Discriminate Question Topic from Question Focus • Tree nodes: Berlin, Hamburg, cool club, where to see, how far, good hostel, fun club • STEP 3: Rank Questions on the basis of the cut • Recommendation: 1. Where to see between Hamburg and Berlin? 2. How far is it from Berlin to Hamburg? 3. Any good hostels in Hamburg or Berlin? • Search: 1. What are the best/most fun clubs in Hamburg?
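The routing in this flow can be illustrated with a small sketch. It assumes the cut has already split every question into topic terms and focus terms, and it uses a deliberately naive stand-in for focus similarity (two focus terms count as similar if they share a word). The function names `shares_word` and `route`, the dictionary layout, and that similarity rule are all hypothetical; the actual scoring function is not recoverable from this transcript.

```python
def shares_word(a, b):
    """Naive focus similarity: True if the two terms share any word."""
    return bool(set(a.split()) & set(b.split()))

def route(query, candidates):
    """Split related questions into recommendations and search results.

    A candidate sharing the query's topic goes to 'recommend' when its
    focus differs and to 'search' when its focus is similar.
    """
    recommend, search = [], []
    for cand in candidates:
        if not set(query["topic"]) & set(cand["topic"]):
            continue  # unrelated topic: neither list
        similar_focus = any(
            shares_word(f, g) for f in query["focus"] for g in cand["focus"]
        )
        (search if similar_focus else recommend).append(cand["text"])
    return recommend, search

query = {"text": "Any cool clubs in Berlin or Hamburg?",
         "topic": ["Hamburg", "Berlin"], "focus": ["cool club"]}
candidates = [
    {"text": "Where to see between Hamburg and Berlin?",
     "topic": ["Hamburg", "Berlin"], "focus": ["where to see"]},
    {"text": "How far is it from Berlin to Hamburg?",
     "topic": ["Hamburg", "Berlin"], "focus": ["how far"]},
    {"text": "Any good hostels in Hamburg or Berlin?",
     "topic": ["Hamburg", "Berlin"], "focus": ["good hostel"]},
    {"text": "What are the best/most fun clubs in Hamburg?",
     "topic": ["Hamburg"], "focus": ["fun club"]},
]
recommend, search = route(query, candidates)
print(recommend)
print(search)
```

On this toy input the first three questions land in the recommendation list and the "fun clubs" question lands in the search list, mirroring the split shown on the slide.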
Related Work • Question search (Jeon et al., 2005; Sneiders, 2002; Lai et al., 2002; Burke et al., 1997) • Finds semantically equivalent questions given queries • Satisfies a different user need than question recommendation • Query suggestion (Cucerzan & White, 2007; Jensen et al., 2006; Fonseca et al., 2003) • Suggests related queries through query log mining • Query logs are usually absent for questions • Query substitution (Jones et al., 2006) • Generates queries by replacing query terms • New queries are close to the original queries
Data and Evaluation Measures • The data • Resolved questions from Yahoo! Answers: 314,616 about ‘travel’ and 210,785 about ‘computers & internet’ • The test set was developed via human judgments
Experimental Results (Basic) • Travel • Computers & Internet
Experimental Results (Basic) What's a good but cheap hotel/motel/anything in downtown Chicago?
Effectiveness of MDL • The baseline methods • First = our approach without the MDL-based reduction of topic terms • Second = our approach without the MDL-based discrimination between question topic and question focus • Third = our approach without either the MDL-based reduction of topic terms or the MDL-based discrimination between question topic and question focus • The use of the MDL is significant • The vocabulary size is 289,251 before the reduction of topic terms and 173,202 after it, a reduction of about 40%. • The contribution of the MDL-based selection of substitutions is statistically significant
Conclusions • Studied question recommendation by identifying question topics and question foci • Used the MDL-based tree cut model for • Reducing the set of topic terms • Discriminating question topics from question foci • Empirically verified the effectiveness of our approach to question recommendation