
An Overview of Opinionated Tasks and Corpus Preparation



  1. An Overview of Opinionated Tasks and Corpus Preparation Hsin-Hsi Chen Department of Computer Science and Information Engineering National Taiwan University Taipei, Taiwan http://research.nii.ac.jp/ntcir/ntcir-ws6/opinion/ntcir5-opinionws-en.html

  2. What is an opinion? • An opinion is subjective information. • An opinion usually contains an opinion holder, an attitude, and a target, though none of these is obligatory. • A sentential clause, or a meaningful unit (in Chinese), is the smallest unit of an opinion.

  3. Why is opinion processing important? • Information on the Internet is growing explosively, and extracting opinions by hand is hard. • Public opinion is an important indicator for companies and governments. • Opinions change over time, so tracking them automatically is an important issue.

  4. Fact-based vs. Opinion-based • Examples: • Circular vs. Happy • He is an engineer. vs. He thinks that his boss is a kind person. • Why is the sky blue? vs. Do people support the government?

  5. Previous Work (1) • English: • Sentiment words (Wiebe et al., Kim and Hovy, Takamura et al.) • Opinion sentence extraction (Riloff and Wiebe, Kim and Hovy) • Opinion document extraction (Wiebe et al., Pang et al.) • Opinion summarization: reviews and products (Hu and Liu, Dave et al.)

  6. Previous Work (2) • Japanese • Opinion extraction (Kobayashi et al.: reviews, at the word/sentence level) • Opinion summarization (Morinaga et al.: product reputations; Seki, Eguchi, and Kando) • Chinese • Opinion extraction (Ku, Wu, Li and Chen) • Opinion summarization (Ku, Li, Wu and Chen) • News and Blog Corpora (Ku, Liang and Chen) • Korean?

  7. Corpus Preparation (1) • Quantity • How much material should we collect? • Words/Sentences/Documents • Source • Which sources should we pick? Should we mine opinions from general documents or from obviously opinionated ones (e.g., discussion groups)? • News, Reviews, Blogs, …

  8. Corpus Preparation (2) • Different granularity • Word level • Sentence level • Clause level • Document level • Multi-documents (summarization) • Different sources • Different languages

  9. Previous Work (Corpus Preparation 1/5) • Example: NRRC Summer Workshop on Multiple-Perspective QA • People involved: 1 researcher, 3 graduate students, 6 professors • Collected 270,000 documents over an 11-month period; retrieved documents relevant to 8 topics, with more than 200 documents per topic • Workshop: MPQA (Multi-Perspective Question Answering) • Host: Northeast Regional Research Center (NRRC), 2002 • Leader: Prof. Janyce Wiebe • Participants: Eric Breck, Chris Buckley, Claire Cardie, Paul Davis, Bruce Fraser, Diane Litman, David Pierce, Ellen Riloff, Theresa Wilson

  10. Previous Work (Corpus Preparation 2/5) • Source: news documents (World News Connection, WNC) • In related work at the word level: 2,615 words

  11. Previous Work (Corpus Preparation 3/5) • Example: Using the NTCIR Corpus (Chinese) • Reusable • NTCIR-2, news documents • Retrieved documents relevant to 6 topics • On average, 34 documents per topic • At the word level: 838 words • Experiments using NTCIR-3 are ongoing

  12. Previous Work (Corpus Preparation 4/5)

  13. Previous Work (Corpus Preparation 5/5) • Example: Using reviews from the Web (Japanese) • Specific domains: cars and games • 15,000 reviews (230,000 sentences) for cars; 9,700 reviews (90,000 sentences) for games • Using topic words (e.g., names of car and game companies) • Semi-automatic methods for collecting opinion terms (with patterns)

  14. Corpus Annotation • Annotation types (1) • Support/Non-support • Sentiment/Non-sentiment • Positive/Neutral/Negative • Strong/Medium/Weak • Annotation types (2) • Opinion holder/Attitude/Target • Nested opinions

  15. Previous Work (Corpus Annotation 1/4) • Example: NRRC Summer Workshop on Multiple-Perspective QA (English) • Total 114 documents annotated • 57 with deep annotations, 57 with shallow annotations • 7 annotators

  16. Previous Work (Corpus Annotation 2/4) • Tags • Opinion: on=implicit/formally declared • Fact: onlyfactive=yes/no • Subjectivity: strength=high/medium/low • Attitude: neg-attitude/pos-attitude • Writer: opinion holder information

  17. Previous Work (Corpus Annotation 3/4) • Example: Using the NTCIR Corpus (Chinese) • A total of 204 documents were annotated • 3 annotators • Using XML-style tags (see the sketch below) • Types are defined, but not strength (considering the agreement issue)
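Slide 17 does not reproduce the actual tag set, so the element and attribute names below are invented; this is only a sketch of what an XML-style opinion annotation might look like, loaded in Python:

```python
# Hypothetical XML-style opinion tag (names invented for illustration;
# the slide does not give the real NTCIR annotation scheme).
import xml.etree.ElementTree as ET

sent = ET.fromstring(
    '<opinion holder="the public" polarity="positive" target="the policy">'
    'Most citizens support the new policy.'
    '</opinion>'
)
print(sent.get("holder"), sent.get("polarity"))  # the public positive
```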

  18. Previous Work (Corpus Annotation 4/4)

  19. Corpus Evaluation (1) • How do we choose materials? • Filter out candidates whose annotations are too diverse among annotators? (agreement?) • How many annotators are needed per candidate? (more annotators, lower agreement) • How do we build the gold standard? • Voting • Use instances with consistent annotations

  20. Corpus Evaluation (2) • How do we evaluate a corpus for a subjective task? • Agreement (is it enough?) • Kappa value (to what agreement level? Ranges below follow Landis and Koch, 1977) • Almost perfect agreement (0.81–1.00) • Substantial agreement (0.61–0.80) • Moderate agreement (0.41–0.60) • Fair agreement (0.21–0.40) • Slight agreement (0.00–0.20) • Less than chance agreement (< 0)

  21. Kappa coefficient (wiki) • Cohen's kappa coefficient is a statistical measure of inter-rater agreement. • It is generally thought to be a more robust measure than simple percent agreement calculation since κ takes into account the agreement occurring by chance. • Cohen's kappa measures the agreement between two raters who each classify N items into C mutually exclusive categories. • The first evidence of Cohen's Kappa in print can be attributed to Galton (1892).

  22. Kappa coefficient (wiki) • The equation for κ is: κ = (Pr(a) − Pr(e)) / (1 − Pr(e)) • Pr(a) is the relative observed agreement among raters • Pr(e) is the hypothetical probability of chance agreement • If the raters are in complete agreement, then κ = 1 • If there is no agreement among the raters (other than what would be expected by chance), then κ ≤ 0.

  23. Kappa coefficient • Two raters are asked to classify objects into categories 1 and 2. The table below contains cell probabilities for a 2-by-2 table. • P0 = P11 + P22, the observed level of agreement • This value needs to be compared to the value you would expect if the two raters were totally independent • Pe = PA(1)·PB(1) + PA(2)·PB(2), where PA(i) and PB(i) are the two raters' marginal probabilities for category i http://www.childrensmercy.org/stats/definitions/kappa.htm

  24. Example • Hypothetical example: 29 patients are examined by two independent doctors (see table). 'Yes' denotes that a doctor diagnosed the patient with disease X; 'No' denotes that the doctor did not. • P0 = P11 + P22 = (10 + 12)/29 = 0.76 • Pe = PA(Yes)·PB(Yes) + PA(No)·PB(No) = 0.586 × 0.345 + 0.414 × 0.655 = 0.474 • Kappa = (0.76 − 0.474)/(1 − 0.474) = 0.54 http://www.dmi.columbia.edu/homepages/chuangj/kappa/
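The arithmetic above can be checked in a few lines of Python. The off-diagonal counts (7 and 0) are an assumption: they are inferred from the marginals quoted on the slide (0.586/0.414 for doctor A, 0.345/0.655 for doctor B), since the table itself did not survive the transcript.

```python
# Cohen's kappa from a C x C contingency table of raw counts.
def cohens_kappa(table):
    n = sum(sum(row) for row in table)
    c = len(table)
    p_obs = sum(table[i][i] for i in range(c)) / n                # Pr(a)
    row = [sum(r) for r in table]                                 # rater A marginals
    col = [sum(table[i][j] for i in range(c)) for j in range(c)]  # rater B marginals
    p_exp = sum(row[i] * col[i] for i in range(c)) / n ** 2       # Pr(e)
    return (p_obs - p_exp) / (1 - p_exp)

#            Doctor B: Yes  No
table = [[10, 7],   # Doctor A: Yes
         [0, 12]]   # Doctor A: No
print(round(cohens_kappa(table), 2))  # 0.54
```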

  25. Online Kappa Calculator http://justus.randolph.name/kappa

  26. Previous Work: Corpus Evaluation • Different languages/annotation schemes may yield different agreement levels. • Kappa: 0.32–0.65 (factivity only, English) • Kappa: 0.40–0.68 (word level, Chinese) • Annotators with different backgrounds may reach different levels of agreement.

  27. What is needed for this task? • What kind of documents? News? Others? • All relevant documents? • Provide only the type of documents, or fully annotated documents for training? • Provide some sentiment words as clues? • At what granularity? Word, clause, sentence, document, or multi-document? • In which language? Mono-lingual, multi-lingual, or cross-lingual?

  28. Natural Language Processing Lecture 15: Opinionated Applications Hsin-Hsi Chen Department of Computer Science and Information Engineering National Taiwan University Taipei, Taiwan

  29. Opinionated Applications • Opinion extraction • Sentiment word mining • Opinionated sentence extraction • Opinionated document extraction • Opinion summarization • Opinion tracking • Opinionated question answering • Multi-lingual/Cross-lingual opinionated issues

  30. Opinion Mining • Opinion extraction identifies opinion holders, extracts the relevant opinion sentences, and decides their polarity. • Opinion summarization recognizes the major events embedded in documents and summarizes the supportive and non-supportive evidence. • Opinion tracking captures subjective information from various genres and monitors the development of opinions along spatial and temporal dimensions.

  31. Opinion extraction • Extract opinion evidence from words, sentences, and documents, and then determine their polarities. • The composition of opinions in documents closely mirrors the composition of semantics: • Word -> Sentence -> Document • The algorithm is designed around this composition across granularities.

  32. Seeds • Sentiment words in the General Inquirer (GI) and the Chinese Network Sentiment Dictionary (CNSD) are collected as seeds. • GI is in English, while CNSD is in Chinese; GI is translated into Chinese. • A total of 10,542 qualified seeds are collected in NTUSD.

  33. Statistics of Seeds

  34. Thesaurus Expansion • The seed vocabulary is enlarged using • 同義詞詞林 (Tongyici Cilin, a Chinese synonym thesaurus) • 中央研究院中英雙語知識本體詞網 (The Academia Sinica Bilingual Ontological WordNet) • Words in the same cluster may not always have the same opinion tendency: • 寬恕 (forgive) vs. 姑息 (appease) • How to distinguish words with different polarities within the same cluster/synset? • Opinion tendency of a word and its strength

  35. Sentiment Tendency of a Character (raw score)

  36. Sentiment Tendency of a Character (normalization)

  37. Sentiment Tendency of a Word • The sentiment degree of a Chinese word w is the average of the sentiment scores of its component characters c1, c2, ..., cp. • A positive score denotes a positive word. • A negative score denotes a negative word. • A score of zero denotes a non-sentiment or neutral word.
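A minimal sketch of this averaging rule follows. The values in char_score are invented for illustration; the character-scoring formulas themselves (slides 35–36) were figures that did not survive the transcript.

```python
# Word sentiment as the average of its characters' sentiment scores
# (slide 37). char_score values are invented for illustration.
char_score = {"寬": 0.8, "恕": 0.6, "姑": -0.2, "息": -0.4}

def word_sentiment(word: str) -> float:
    # Unseen characters contribute a neutral score of 0.
    return sum(char_score.get(c, 0.0) for c in word) / len(word)

print(word_sentiment("寬恕"))  # ~0.7  -> positive word
print(word_sentiment("姑息"))  # ~-0.3 -> negative word
```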

  38. Opinion Extraction at Sentence Level

  39. Opinion Extraction at Document Level
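Slides 38–39 were diagrams in the original deck. As a rough stand-in, here is a minimal sketch of the word -> sentence -> document composition described on slide 31; the aggregation rules (summing word scores, majority vote over sentences) are assumptions, not the authors' exact algorithm.

```python
# Composing opinion scores across granularities (per slide 31).
# Aggregation rules are illustrative assumptions.
def sentence_sentiment(word_scores):
    # Sentence score: sum of its words' sentiment scores.
    return sum(word_scores)

def document_polarity(sentence_scores):
    # Document polarity: majority vote over opinionated sentences.
    pos = sum(1 for s in sentence_scores if s > 0)
    neg = sum(1 for s in sentence_scores if s < 0)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(document_polarity([0.7, -0.3, 0.2]))  # positive
```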

  40. Evaluation Corpus Preparation • Sources: TREC (English; news) / NTCIR (Chinese; news) / blogs (Chinese; casual writing) • The corpus is prepared for multi-genre and multi-lingual issues. • The corpus is prepared to evaluate opinion extraction, summarization, and tracking.

  41. Opinion Summarization • Find the important topics of a document set. • Find sentences relevant to the important topics. • Find the opinions embedded in those sentences. • Summarize the opinions on the important topics.
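A naive end-to-end sketch of these four steps, using frequency-based topic words. The tokenize and word_sentiment arguments are assumed (e.g., the character-average scorer sketched under slide 37); this is not the authors' actual system.

```python
# Four-step opinion summarization (slide 41), naive illustration.
from collections import Counter

def summarize_opinions(documents, tokenize, word_sentiment, n_topics=5):
    # 1. Important topics: the most frequent words in the document set.
    counts = Counter(w for doc in documents for w in tokenize(doc))
    topics = [w for w, _ in counts.most_common(n_topics)]
    summary = {t: {"supportive": [], "non-supportive": []} for t in topics}
    for doc in documents:
        for sent in doc.split("."):
            for t in topics:
                if t in sent:  # 2. Sentence is relevant to topic t.
                    # 3. Opinion embedded in the sentence.
                    score = sum(word_sentiment(w) for w in tokenize(sent))
                    # 4. Group supportive vs. non-supportive evidence.
                    if score > 0:
                        summary[t]["supportive"].append(sent)
                    elif score < 0:
                        summary[t]["non-supportive"].append(sent)
    return summary
```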

  42. Opinion Tracking • Opinion tracking is a kind of graph-based opinion summarization. • We are concerned with how opinions change over time. • An opinion tracking system tells how people change their opinions as time goes by. • To track opinions, opinion extraction and summarization are necessary. • Opinion extraction tells the changes in opinion polarity, while opinion summarization tells the correlated events.
