
Quality-aware Collaborative Question Answering: Methods and Evaluation


Presentation Transcript


  1. Quality-aware Collaborative Question Answering: Methods and Evaluation. Maggy Anastasia Suryanto, Ee-Peng Lim (Singapore Management University); Aixin Sun (Nanyang Technological University); Roger H.L. Chiang (University of Cincinnati). WSDM 2009, Barcelona, Spain

  2. Outline • Motivation and objectives • Quality-aware QA framework • Expertise-based methods • Experimental Setup • Results • Conclusions

  3. Collaborative Question Answering • Finding answers to questions using community QA portals. (Diagram: a community QA portal comprising a question interface, an answer interface, a search engine, and a question-and-answer database.)

  4. Collaborative QA • Simple idea: use the search engine provided by community QA portals. • Limitations: • It assumes that related questions are already available. • Search engines do not guarantee answer relevance and quality. • Users can vote for best answers, but votes are unreliable. • Users may not be experts. • Collaborative QA therefore needs to address quality issues (the answer quality problem).

  5. Research Objectives • Develop methods to find good answers for a given question using the QA database of a community QA portal. • Benefits: • Better answers compared with traditional QA methods. • Fewer duplicate questions.

  6. Quality-Aware Framework (Diagram: a good answer to a question must be both a relevant answer and a quality answer; answer quality derives from answer content quality and user expertise.)

  7. Quality-Aware Framework • Overall answer score: score(q, a) = rscore(q, a) × qscore_<model>([q,] a), where rscore(q, a) is the answer relevance score and qscore_<model> is the model-specific answer quality score (question-dependent models also take q as input). (Diagram: the user question q is run against the QA portal's search engine and QA database to retrieve candidate answers and their questions; each candidate answer gets a relevance score rscore and a quality score qscore, and answers are selected by the overall score.)
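A minimal sketch of this ranking step in Python, assuming rscore and qscore have already been computed for each candidate answer (function names, dict keys, and the example scores below are illustrative, not from the paper):

```python
def rank_answers(candidates):
    # Overall score from the framework: score(q, a) = rscore(q, a) * qscore(q, a).
    # Each candidate is a dict with precomputed 'rscore' (relevance to the
    # user question q) and 'qscore' (answer quality).
    for c in candidates:
        c["score"] = c["rscore"] * c["qscore"]
    return sorted(candidates, key=lambda c: c["score"], reverse=True)

# Illustrative usage with made-up scores: the relevant-but-poor answer
# is outranked by a slightly less relevant but high-quality one.
candidates = [
    {"answer": "try rebooting lol", "rscore": 0.8, "qscore": 0.2},
    {"answer": "Update the NIC driver via Device Manager...", "rscore": 0.7, "qscore": 0.9},
]
print(rank_answers(candidates)[0]["answer"])
```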

  8. Expertise-based Methods (Diagram: a quality answer depends on content quality, modeled by the NT method [Jeon et al., 2006], and on user expertise, which splits into asking expertise and answering expertise; the expertise-based methods EXHITS, EXHITS_QD, EX_QD, and EX_QD' differ in whether expertise is question dependent and whether peer expertise dependency is used.)

  9. Users, Questions and Answers

  10. Question Independent Expertise • EXHITS [Jurczyk and Agichtein, 2007a, 2007b]: • Expert askers have their questions answered by expert answerers. • Expert answerers answer questions posted by expert askers. • Content quality is not considered. (Asking expertise and answering expertise reinforce each other, HITS style; see the sketch below.)
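A rough sketch of this mutual reinforcement over the asker-answerer graph, assuming a simple edge list of (asker, answerer) pairs; this illustrates the idea behind EXHITS, not the paper's exact formulation:

```python
from collections import defaultdict

def exhits(qa_pairs, iterations=20):
    # qa_pairs: list of (asker, answerer) pairs, one per answered question.
    # Asking and answering expertise start uniform and reinforce each other.
    ask = defaultdict(lambda: 1.0)
    ans = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        new_ask, new_ans = defaultdict(float), defaultdict(float)
        for asker, answerer in qa_pairs:
            # Expert askers have questions answered by expert answerers.
            new_ask[asker] += ans[answerer]
            # Expert answerers answer questions posed by expert askers.
            new_ans[answerer] += ask[asker]
        # L1-normalize so scores stay comparable across iterations.
        for d in (new_ask, new_ans):
            total = sum(d.values()) or 1.0
            for user in d:
                d[user] /= total
        ask, ans = new_ask, new_ans
    return ask, ans
```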

  11. Question Dependent Expertise • EXHITS_QD: • Expert askers have q-related questions with good answers posted by expert answerers. • Expert answerers post good answers to q-related questions from expert askers. (The mutual reinforcement is weighted by answer content quality and answer relevance to q.)

  12. Question Dependent Expertise • EX_QD: • Non-peer-expertise-dependent counterpart of EXHITS_QD. • Expert askers ask many q-related questions that attract many good answers.

  13. Question Dependent Expertise • EX_QD': • EX_QD without using answer quality to measure asker expertise. (A sketch contrasting EX_QD and EX_QD' follows.)
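A hedged sketch contrasting EX_QD- and EX_QD'-style question-dependent expertise, assuming each record carries a relevance score of the archived question to q and a quality score of its answer (the record format and aggregation are illustrative, not the paper's equations):

```python
def question_dependent_expertise(records, use_quality_for_asker=True):
    # records: dicts with 'asker', 'answerer', 'relevance' (of the archived
    # question to the test question q) and 'quality' (of the answer).
    # use_quality_for_asker=True mimics EX_QD; False mimics EX_QD', where
    # asker expertise ignores answer quality.
    ask, ans = {}, {}
    for r in records:
        credit = r["relevance"] * (r["quality"] if use_quality_for_asker else 1.0)
        ask[r["asker"]] = ask.get(r["asker"], 0.0) + credit
        # Answerer expertise always rewards good answers to q-related questions.
        ans[r["answerer"]] = ans.get(r["answerer"], 0.0) + r["relevance"] * r["quality"]
    return ask, ans
```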

  14. Experimental Setup • Answer relevance is computed with: • the Yahoo! Answers search engine, or • a query likelihood retrieval model with Jelinek-Mercer background smoothing (λ = 0.2); see the sketch below.
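A small sketch of query likelihood scoring with Jelinek-Mercer smoothing, assuming λ weights the background model as in the usual Zhai-Lafferty formulation (the paper may place λ on the other component) and assuming simple bag-of-words token lists:

```python
import math
from collections import Counter

def ql_score(query_tokens, doc_tokens, corpus_counts, corpus_len, lam=0.2):
    # log p(query | doc) with Jelinek-Mercer smoothing:
    #   p(w|d) = (1 - lam) * p_ml(w|d) + lam * p(w|corpus)
    # corpus_counts is a Counter over all tokens in the collection.
    doc_counts = Counter(doc_tokens)
    doc_len = max(len(doc_tokens), 1)
    score = 0.0
    for w in query_tokens:
        p_doc = doc_counts[w] / doc_len
        p_bg = corpus_counts[w] / corpus_len
        # Tiny epsilon guards against log(0) for words unseen in the corpus.
        score += math.log((1 - lam) * p_doc + lam * p_bg + 1e-12)
    return score
```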

  15. Baseline Methods • BasicYA: • Uses question relevance ranking by Yahoo! Answers. • Returns the best answers only. • Search options: • BasicYA(s+c): question subject and content. • BasicYA(b+s+c): best answer + question subject + content. • BasicQL: • Query likelihood model. • Search options: • BasicQL(s): question subject. • BasicQL(s+c): question subject and content.

  16. Baseline Method • NT: • qscore_nt(a) = p(good|a). • 9 non-text features [Jeon et al., 2006]: • proportion of best answers given by the answerer; • answer length; • # stars given by the asker to the answer if it is selected as the best answer (zero otherwise); • # answers the answerer has provided so far; • # categories in which the answerer is declared a top contributor (capped at 3); • # times the answer is recommended by other users; • # times the answer is dis-recommended by other users; • # answers for the question associated with the answer; • # points the answerer receives from answering, giving best answers, voting, and signing in.
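Since NT scores an answer as p(good|a) from non-text features, here is a minimal sketch using logistic regression (equivalent to a binary maximum-entropy classifier) as a stand-in; the exact model and feature encoding in [Jeon et al., 2006] may differ, and the training values below are made up:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# One row per answer: the 9 non-text features in the order listed above,
# e.g. [best_answer_ratio, answer_length, stars, answers_given,
#       top_contributor_cats, recommendations, disrecommendations,
#       answers_to_question, answerer_points]. Values are illustrative.
X_train = np.array([[0.40, 120, 3, 57, 1, 5, 0, 8, 2400],
                    [0.05,  15, 0,  3, 0, 0, 2, 8,  150]])
y_train = np.array([1, 0])  # 1 = good quality, 0 = bad quality

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

def qscore_nt(features):
    # qscore_nt(a) = p(good | a), read off the positive-class probability.
    return clf.predict_proba(np.array([features]))[0, 1]
```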

  17. QA Dataset • Randomly select 50 popular test questions in the computer and internet domain. • For each test question, get the top 20 questions and their best answers from Yahoo! Answers → 1000 answers. • Annotators label each of the 1000 answers: • good vs. bad quality; • used for training the NT method. • The 50 test questions are divided into: • Cat A (23): with ≥4 bad quality answers; • Cat B (27): with <4 bad quality answers.

  18. Steps to construct QA Dataset (Flowchart: the dataset construction pipeline, starting from the 50 popular test questions.)

  19. QA Dataset Statistics

  20. Relevance and Quality Judgement • 9 annotators → 3 groups. • Top 20 answers for each test question, pooled across all methods → 8617 question/answer pairs. • Each question/answer pair is labelled: • {relevant, irrelevant} to the test question; • {good, bad} quality answer. • A label is accepted when ≥2 annotator groups agree.

  21. Summary of Methods Used (Table: the compared methods, with annotations marking variants that give little or no weight to asking expertise.)

  22. Evaluation of Methods • Best Answers vs. All Answers options (the latter marked with *). • Top 20 answers are judged. • P_q@k: precision of quality at top k. • P_r@k: precision of relevance at top k. • P@k: precision of both quality and relevance at top k. • k = 5, 10, 20. (A helper computing these metrics is sketched below.)
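A small helper for the three precision-at-k metrics, assuming each judged answer carries boolean relevance and quality labels (the dict keys and parameter names are illustrative):

```python
def precision_at_k(ranked, k, need_relevant=True, need_quality=True):
    # ranked: answers in ranked order, each a dict with boolean
    # 'relevant' and 'good' labels from the annotators.
    top = ranked[:k]
    hits = sum(1 for a in top
               if (a["relevant"] or not need_relevant)
               and (a["good"] or not need_quality))
    return hits / k

# P_r@k = precision_at_k(ranked, k, need_quality=False)
# P_q@k = precision_at_k(ranked, k, need_relevant=False)
# P@k   = precision_at_k(ranked, k)
```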

  23. Compare Basic and NT Methods • BasicYA and BasicQL perform more poorly in Cat A → poor precision in quality. • BasicQL(s) is generally better than the other Basic methods. • NT is better than BasicQL(s) in Cat A. • NT* is better than NT → the all-answers option helps.

  24. Performance of Expertise Methods • An answerer's asking expertise is important: σ = 0.8 is better than σ = 1. • Question-dependent expertise is better than question-independent expertise. • Peer expertise dependency is not essential. • EX_QD and EX_QD' are the best: • much better than NT in Cat A; • better than BasicQL in Cat B.

  25. Performance of Expertise Methods • The all-answers option is better than the best-answer option: non-best answers can also be of good quality. • Results are consistent when a stricter judgement is imposed.

  26. Conclusions • Collaborative QA is a viable alternative to traditional QA. • Quality is an essential criterion for ranking answers. • Question-dependent expertise improves answer quality measurement. • Other extensions: • questions/answers from other domains; • personalized answers vs. best answers.

  27. Related Work • Jeon et al., 2006: • measurement of content quality. • Jurczyk and Agichtein, 2007a, 2007b: • proposed answering and asking expertise. • Bian et al., 2008: • combines content quality and relevance; • user expertise not considered. • Expert finding: • finds experts on a given topic by constructing user profiles from the answers users have posted [Liu and Croft, 2005].
