
Introducing the Web Intelligence (WIT) Group


Presentation Transcript


  1. Introducing the Web Intelligence (WIT) Group Microsoft Research Asia

  2. TALK OUTLINE • Introducing WIT – Web InTelligence Group • SQuAD • Summary

  3. Mission Statement Enable synergetic collaboration between people and between people and computers to enlighten them and enrich their lives http://research.microsoft.com/en-us/groups/WIT/

  4. Vision – a Web with Intelligence Satisfy user needs, simplify key tasks, promote serendipitous discovery, and foster task-oriented social networks

  5. Web InTelligence group (WIT) Members: Youngin Song, Chin-Yew Lin, Yunbo Cao, Wei Lai, Bo Wang, Tetsuya Sakai. Callouts: “I’m the manager!” “I’m the FIRST Korean researcher at MSRA!” “I’m the SECOND Japanese researcher at MSRA!”

  6. WIT spun off from the Natural Language Computing group in June 2009! Callouts: “I joined MSRA in April 2009!” “I joined MSRA in May 2009!”

  7. WIT research topics • Sentiment analysis • Social question answering and summarisation • Expert and social search • User intent/activity recognition and prediction • Inarticulate user assistance • Information access evaluation

  8. TALK OUTLINE • Introducing WIT – Web InTelligence Group • SQuAD • Summary

  9. Mining Community Knowledge: Social Q&A and Its Application. Web Intelligence (WIT), Microsoft Research Asia. Chin-Yew LIN, cyl@microsoft.com

  10. Search vs. Question Answering (QA) Understanding what users want is difficult! User intention

  11. QA Complements Search • Query length: short: length <= 2, long: length >= 3 • Query frequency: high: freq > 100K, mid: between 1K and 50K, low: freq < 300
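The query buckets on slide 11 can be written down as a small classifier. This is only a sketch: the slide does not say how length is measured, so counting whitespace-separated tokens is an assumption, as is treating "between 1K and 50K" as inclusive. The function names are hypothetical. Note that the slide's thresholds leave gaps (300 to 999 and 50,001 to 100,000), which the sketch reports as `None` rather than guessing a bucket.

```python
from typing import Optional


def length_bucket(query: str) -> str:
    """Return 'short' for queries of at most 2 tokens, 'long' for 3 or more."""
    # Assumption: length is counted in whitespace-separated tokens.
    return "short" if len(query.split()) <= 2 else "long"


def frequency_bucket(freq: int) -> Optional[str]:
    """Map a query frequency to 'high', 'mid', or 'low' using the slide's thresholds."""
    if freq > 100_000:
        return "high"
    if 1_000 <= freq <= 50_000:  # assumption: both bounds inclusive
        return "mid"
    if freq < 300:
        return "low"
    # Frequencies in the gaps between the slide's ranges fall in no bucket.
    return None
```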

  12. Scalable Question Answering & Distillation Goal: • Create a scalable question answering service Methods: • Index all question and answer pairs (QnA) and their authors on the web • Enrich QnA through summarization • Expand QnA database by auto-posting questions to and acquiring answers from community QnA services • Refine QnA through Wiki-style online collaboration Motivations: • Leverage and add value to search • Leverage questions that have already been answered • Leverage people’s knowledge and their networks
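The first method on slide 12, indexing QnA pairs together with their authors, could look roughly like the in-memory sketch below. The slides do not describe the actual index, so the class names, fields, and tokenizer here are all hypothetical; a real web-scale service would use a proper search engine rather than Python dictionaries.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class QnAPair:
    question: str
    answer: str
    asker: str      # hypothetical field: author of the question
    answerer: str   # hypothetical field: author of the answer


def tokenize(text: str) -> list:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


@dataclass
class QnAIndex:
    pairs: list = field(default_factory=list)
    by_term: dict = field(default_factory=lambda: defaultdict(set))
    by_author: dict = field(default_factory=lambda: defaultdict(set))

    def add(self, pair: QnAPair) -> int:
        """Store a pair and index it by question terms and by both authors."""
        pair_id = len(self.pairs)
        self.pairs.append(pair)
        for term in tokenize(pair.question):
            self.by_term[term].add(pair_id)
        for author in (pair.asker, pair.answerer):
            self.by_author[author].add(pair_id)
        return pair_id

    def search(self, query: str) -> list:
        """Return stored pairs whose question contains every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        ids = set.intersection(*(self.by_term.get(t, set()) for t in terms))
        return [self.pairs[i] for i in sorted(ids)]
```

Keeping a separate author index is what would later allow the "leverage people's knowledge and their networks" motivation to be served from the same store.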

  13. CampusCS

  14. Baidu Zhidao (百度知道) 17,012,767 resolved questions in two years’ operation. 8,921,610 are knowledge related. 96.7% of questions are resolved. 10,000,000 daily visitors. 71,308 new questions per day. 3.14 answers per question. http://www.searchlab.com.cn (中国人搜索行为研究/User Research Lab of Chinese Search)

  15. A Traditional QA Architecture • A QA system gives direct answers to a question instead of documents • Falcon QA system (LCC) • Moldovan et al. ACL 2000 • Surdeanu et al. IEEE Trans. PDS 2002 • Best QA system in TREC 8 & 9 • Average question answering time • TREC 8: 48 seconds • TREC 9: 94 seconds • Not scalable [Figure labels: Traditional IR; Falcon QA system module analysis: processing time]

  16. Community Question and Answering Yahoo! Answers has 19,041,128 resolved questions in 26 categories adding about 48K questions per day. (August 24, 2007) http://weblogs.hitwise.com/leeann-prescott/2006/12/yahoo_answers_captures_96_of_q.html

  17. Community QnA in Details Topic Context 1 Context 2

  18. Online Discussion Forum topic Q Q Q Q

  19. FAQ Context dependent About 28,424,184 results on Live Search using query: “FAQ travel” (Google: about 64,200,000)

  20. Challenges ACL 2008 SIGIR 2008 AAAI 2008 COLING 2008 NSF QGSTEC Workshop 2008 WWW 2008 ACL 2008

  21. List of Papers Accepted • Recommending Questions Using the MDL-based Tree Cut Model – Cao et al.; WWW 2008 • Searching Questions by Identifying Question Topic and Question Focus – Duan et al.; ACL 2008 • Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums – Ding et al.; ACL 2008 • Finding Question Answer Pairs from Online Forums – Cong et al.; SIGIR 2008 • Question Utility: A Novel Static Ranking of Question Search – Song et al.; AAAI 2008 • Answer Summarization: Understanding and Summarizing Answers in Community-Based Question Answering Services – Liu et al.; COLING 2008 • Automatic Question Generation from Queries – Lin; NSF Workshop on Question Generation Shared Task and Evaluation Challenge 2008

  22. Question Mining & Answering (ACL 2008 & SIGIR 2008) • Extract question and answer pairs • Community QnA • Create a resolved question list • Extract & index question, best answer, and other answers • Live QnA, Yahoo! Answers, Baidu Zhidao, … • Forum • Extract and index threads and postings, find questions and their answers

  23. QA Pairs in Online Forums CONTEXT Questions Answers

  24. Question Search & Recommendation (ACL 2008 & WWW 2008) • Query • We would like to know what will be available to see in the Forbidden City because we understand that it will be under repairs. • Question search • Is it true that the Forbidden City is undergoing renovation & we won't be allow to enter? • Question recommendation • Would you get a lower price by not needing a guide for the Forbidden City and etc? • Can anybody recommend a budget hotel near Forbidden City? • Question = Topic + Focus + Others (TFO) • Search: same topic, similar foci • Recommend: same topic, different foci How can we discriminate topic from focus?
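The search/recommend distinction on slide 24 can be illustrated with a toy rule. This is a minimal sketch, not the method from the cited papers (which the slides only name): it assumes topic and focus terms have already been extracted and hand-assigned, compares foci with Jaccard overlap, and uses an arbitrary threshold. All names, the threshold, and the example term sets are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    topic: frozenset  # assumption: topic terms are already identified
    focus: frozenset  # assumption: focus terms are already identified


def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap of two term sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relate(query: Question, candidate: Question, focus_threshold: float = 0.5) -> str:
    """Label a candidate as a 'search' hit, a 'recommend' hit, or 'unrelated'."""
    if query.topic != candidate.topic:
        return "unrelated"
    # Same topic: similar foci means search, different foci means recommendation.
    if jaccard(query.focus, candidate.focus) >= focus_threshold:
        return "search"
    return "recommend"


# Hand-assigned, illustrative term sets for the slide's Forbidden City example.
topic = frozenset({"forbidden city"})
query = Question("what will be available to see ... under repairs",
                 topic, frozenset({"renovation", "access"}))
similar = Question("is it undergoing renovation ... allowed to enter",
                   topic, frozenset({"renovation", "access"}))
hotel = Question("budget hotel near Forbidden City",
                 topic, frozenset({"hotel", "budget"}))
```

With these hand-assigned sets, `relate(query, similar)` falls on the search side and `relate(query, hotel)` on the recommendation side, mirroring the slide's examples.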

  25. Identifying Topic and Focus • Specificity: the inverse of the entropy of the topic term's distribution over the sub-categories • Order topic terms by their specificity Example questions, China: “Anyone know where to see the Dragon Boat Festival in Beijing?” “Where is a good (Less expensive) place to shop in Beijing?” “What's the cheapest way to get from Beijing to Hong Kong?” Example questions, Europe: “How far is it from Berlin to Hamburg?” “What is the cheapest way from Berlin to Hamburg?” “Where to see between Hamburg and Berlin?” “How long does it take from Hamburg to Berlin?” [Figure: category tree with nodes Travel @Yahoo! Answers, Asia Pacific, China, Japan, …, Europe, …]
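Slide 25 defines specificity as the inverse of the entropy of a term's distribution over sub-categories. A minimal sketch of that definition follows. The small constant `epsilon` is an assumption added here to avoid division by zero when a term occurs in only one category; its value is arbitrary. The counts in the usage lines are read off the slide's seven example questions and are purely illustrative.

```python
import math


def specificity(category_counts: dict, epsilon: float = 0.001) -> float:
    """Inverse entropy of a term's distribution over sub-categories.

    category_counts maps a sub-category name to the number of questions in
    that sub-category containing the term. epsilon is an assumed smoothing
    constant that keeps the result finite when the entropy is zero.
    """
    total = sum(category_counts.values())
    if total <= 0:
        raise ValueError("term must occur in at least one category")
    entropy = 0.0
    for count in category_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log(p)
    return 1.0 / (entropy + epsilon)


# "Beijing" occurs only in China questions; "cheapest" occurs in both categories.
term_counts = {
    "beijing": {"China": 3},
    "cheapest": {"China": 1, "Europe": 1},
}
ordered = sorted(term_counts, key=lambda t: specificity(term_counts[t]), reverse=True)
```

A term concentrated in one sub-category has near-zero entropy and therefore high specificity, which is why `beijing` sorts ahead of `cheapest` and is the better topic candidate.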

  26. Question Utility (AAAI 2008) • Motivation • How useful is a question? • How should we rank questions without queries? • Definition • How likely is a question to be asked again? [Formula annotations:] • The probability of generating query Q’ from question Q (relevance score) • The prior probability of question Q, reflecting a static rank of the question, i.e. question utility
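The formula on slide 26 did not survive extraction, but its two annotated terms suggest a Bayes-style ranking in which a question's score is the relevance term P(Q'|Q) multiplied by the prior P(Q). The sketch below assumes exactly that combination and nothing more; how each probability is estimated is not shown on the slide, so both are taken as inputs, and the function name is hypothetical.

```python
import math


def rank_questions(relevance: dict, utility: dict) -> list:
    """Rank question ids by log P(Q'|Q) + log P(Q), best first.

    relevance maps question id -> P(Q'|Q) for the current query Q'.
    utility maps question id -> prior P(Q), the static question utility.
    Questions with a non-positive value for either term are skipped.
    """
    scores = {}
    for qid, rel in relevance.items():
        prior = utility.get(qid, 0.0)
        if rel > 0 and prior > 0:
            # Summing logs is equivalent to multiplying the two probabilities.
            scores[qid] = math.log(rel) + math.log(prior)
    return sorted(scores, key=scores.get, reverse=True)
```

Because the prior does not depend on the query, it can be computed offline, which is what makes it usable as a static rank.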

  27. Answer Summarization (COLING 2008) • Example: “Where to stay in Paris?” • 2,645 answers (Yahoo! Answers 03/04/09) • Is the “best answer” the best answer? • Question clustering • Find similar questions • Answer summarization • Aggregate answers for a question cluster [Figure labels: Answer Taxonomy; Question Taxonomy]

  28. Travel FAQ • Microsoft Travel Guide • http://travel.msra.cn

  29. TALK OUTLINE • Introducing WIT – Web InTelligence Group • SQuAD • Summary

  30. Knowledge Distillation & Dissemination Mixed Mode Question Answering Knowledge Distillation and Dissemination

  31. Q&A = Knowledge = Power • Q&A is a complement to web keyword search • Q&A can enhance existing QnA and search services • Leverage existing knowledge in the question and answer forms and their authors • Acquire or elicit human knowledge automatically • Question and Answer = Knowledge • Knowledge = Power

  32. Discussion
