
Overarching Goals

From Linguistic Annotations to Knowledge Objects Bonnie Dorr Saif Mohammad Boyan Onyshkevych 11/14/2008. Overarching Goals. Produce knowledge elements Build an explicit model of the world based on explicit and implicit language data Enable higher-order reasoning



  1. From Linguistic Annotations to Knowledge Objects • Bonnie Dorr, Saif Mohammad, Boyan Onyshkevych • 11/14/2008

  2. Overarching Goals • Produce knowledge elements • Build an explicit model of the world based on explicit and implicit language data • Enable higher-order reasoning • operate on knowledge units rather than on annotated raw text • infer relationships, states of affairs, sentiments, beliefs, etc.

  3. Current and Next Phases • Phase I (Linguistic Annotation of semantics): Annotate raw text with entity descriptions, co-reference information, semantic categories, lexical-semantic relations, temporal information, thematic role information, modality, etc. • Local information on single sentences and documents • Phase II (Knowledge Units): Automatically produce language-independent structured representations of knowledge (entities, relations, events, opinions, scenarios, etc.) derived from unstructured text and speech in a wide variety of languages and genres. • Knowledge units based on aggregate information across multiple documents

  4. HLT COE Team: Project Organization • Center of Excellence: Meaning Specification, Assessment, Coordination • BBN (Ramshaw, Habash): Temporal Annotation; Coreference (complex) • CMU (Mitamura, Levin, Nyberg): Coreference; Entity relations; Committed Belief • JHU/CLSP (Yarowsky): Latent property extraction (e.g., gender/age/1st-lang/occupation) • UMCP (Dorr, Mohammad): Lexical semantic features (e.g., synonymy, antonymy) and use of belief for detection of contradiction, sentiment, entailment • Columbia (Rambow, Passonneau): Dialogic Content; Committed Belief • UMBC (Nirenburg, McShane): Modality: epistemic, belief, volitive, etc. • Affiliated Efforts: Ed Hovy; Martha Palmer; George Wilson (Mitre)

  5. Definitions and Examples • Linguistic annotations: tags on raw data • From: <Email>EZB</Email><Name>Ed Z Boss</Name> To: <Email>SAS</Email><Name>Sara A Secky</Name> <Request-Action>Please <CB>request</CB> a meeting with <Name>Bob</Name> and <Name>Marla</Name> at <Time>2pm</Time> <Date>tomorrow</Date></Request-Action>. • Knowledge Objects: representational entities over which a system may make inferences • Knowledge objects may be derived from linguistic annotations or from other indicators • Systems might ultimately infer person-person relationships, event-event relationships, sentiment, and other important information about the world. • Example knowledge objects: P1=Sara; P2=Ed; P3=Bob; P4=Marla; E1=Meet(P2,P3,P4); T(E1)=<11/20/2001 14:00>; Relation: Subordinate(P1,P2) [Conf=0.9]
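The knowledge objects in the email example can be sketched as a small data model. This is only an illustration, not the project's actual representation: the class names `Person`, `Event`, and `Relation`, their fields, and the helper `describe` are assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Person:
    pid: str   # identifier such as "P1"
    name: str


@dataclass(frozen=True)
class Event:
    eid: str                          # identifier such as "E1"
    kind: str                         # event type, e.g. "Meet"
    participants: Tuple[str, ...]     # person identifiers
    time: Optional[datetime] = None   # T(E1) on the slide


@dataclass(frozen=True)
class Relation:
    kind: str                 # relation type, e.g. "Subordinate"
    args: Tuple[str, ...]     # person identifiers, in order
    confidence: float         # the [Conf=...] value


# Values taken from the slide's example.
people: Dict[str, Person] = {
    p.pid: p
    for p in (
        Person("P1", "Sara"),
        Person("P2", "Ed"),
        Person("P3", "Bob"),
        Person("P4", "Marla"),
    )
}
e1 = Event("E1", "Meet", ("P2", "P3", "P4"), datetime(2001, 11, 20, 14, 0))
subordinate = Relation("Subordinate", ("P1", "P2"), 0.9)


def describe(rel: Relation, people: Dict[str, Person]) -> str:
    """Render a relation with names substituted for identifiers."""
    names = ", ".join(people[a].name for a in rel.args)
    return f"{rel.kind}({names}) [Conf={rel.confidence}]"


print(describe(subordinate, people))
```

Keeping the confidence on the relation itself, rather than in a side table, lets downstream reasoning filter or rank inferred relations directly.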

  6. Phase I: Linguistic Annotation • Modality • Sheikh Mohamed announced that "we want[modality=volitive; value=1] to make Dubai a new trading center." • Temporal Types and Relations • Sheikh Mohamed announced[Past.Say,Before<writer>] that "we want[Present.State,After,Before,Concurrent(announced)] to make[Unspec.State,After(want)] Dubai a new trading center." • Committed Belief • Sheikh Mohamed announced[CB] that "we want[NCB] to make[NA] Dubai a new trading center." • Dialog Acts • Please let me know what you’d like me to do [Request: answer to [or(M1.5,M1.6)]] #flink1.5(commission to pay Pasadena now) or (commission to pay Pasadena after Jul-Aug)
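The inline `token[...]` notation used in the modality and committed-belief examples can be read mechanically. The sketch below is a rough illustration under stated assumptions: the function name `parse_inline` and the choice to split features on `;` are hypothetical, and the pattern handles only flat brackets attached directly to a word, not the nested dialog-act form.

```python
import re

# A word immediately followed by a flat [ ... ] annotation.
ANNOTATED = re.compile(r"(\w+)\[([^\]]*)\]")


def parse_inline(text):
    """Return (token, features) pairs for each annotated token in text."""
    results = []
    for match in ANNOTATED.finditer(text):
        token, body = match.group(1), match.group(2)
        features = {}
        for part in body.split(";"):
            part = part.strip()
            if not part:
                continue
            key, sep, value = part.partition("=")
            # Bare labels such as CB or NCB become boolean flags.
            features[key.strip()] = value.strip() if sep else True
        results.append((token, features))
    return results


modality = 'we want[modality=volitive; value=1] to make Dubai a new trading center.'
belief = 'Sheikh Mohamed announced[CB] that "we want[NCB] to make[NA] Dubai a new trading center."'

print(parse_inline(modality))
print(parse_inline(belief))
```

Values are kept as strings; converting `value` to a number is left to whichever layer knows the annotation scheme.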

  7. Annotation Data and Task • Selection and manual annotation of 10,000 words of English and Arabic covering a variety of topics and genres, with some parallel views on the same topics. • Rule of thumb: When annotating an Arabic document with a parallel English translation, the representations produced for those two documents should be fundamentally the same. • The representations should not be just a syntactic parse, but should contain meaning units that go beyond surface-form issues.

  8. Annotation Corpora Features • Multilingual: IAMTC, IBM Hand-aligned corpus, Harmony • Multi-translation: IAMTC • Multi-document for same entities: Enron, AQUAINT, Arabic Gigaword • Conversation: Switchboard, Ontonotes news conversation, Enron, Harmony • Persuasion: Enron, Indianapolis museum request to join society • Correspondence: Enron, Indianapolis museum request to join society • Instructions: Enron, Harmony • Opinions: Switchboard

  9. Detailed Example: Modality • The minister, {who has his own website}, also said: "I want [TYPE=VOLITIVE, VALUE=.8] Dubai to be the best [TYPE=EVALUATIVE, VALUE=1] place in the world for state-of-the-art technology companies." • The minister {who has a personal website on the internet}, further said that he wanted [TYPE=VOLITIVE, VALUE=1] Dubai to become the best [TYPE=EVALUATIVE, VALUE=1] place in the world for the advanced (hitech) technological companies. • Equivalence: Strict = 50%, Loose = 100% • Note: {} units omitted for simplicity
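One way to read the two equivalence figures is sketched below. The slide does not define "strict" and "loose", so the definitions used here are assumptions: strict requires the modality type and value to match, loose requires only the type to match. Under that reading the two sentences above give the slide's 50% and 100%. The function name `equivalence` is hypothetical.

```python
def equivalence(first, second):
    """Compare two aligned lists of (modality_type, value) annotations.

    Returns (strict, loose) as fractions of the aligned pairs.
    Assumed definitions: strict = type and value match, loose = type matches.
    """
    if not first or len(first) != len(second):
        raise ValueError("annotation lists must be non-empty and aligned")
    pairs = list(zip(first, second))
    strict = sum(1 for a, b in pairs if a == b)
    loose = sum(1 for a, b in pairs if a[0] == b[0])
    return strict / len(pairs), loose / len(pairs)


# Modality annotations from the two sentences on the slide.
first_sentence = [("VOLITIVE", 0.8), ("EVALUATIVE", 1.0)]
second_sentence = [("VOLITIVE", 1.0), ("EVALUATIVE", 1.0)]

strict, loose = equivalence(first_sentence, second_sentence)
print(f"Strict = {strict:.0%}, Loose = {loose:.0%}")
```

The sketch assumes the annotations are already aligned one to one; aligning units across translations is a separate problem that it does not address.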

  10. Detailed Example: Temporal Parse • E1: The minister who has(يملك) his own website also said(واضاف) I want(اريد) Dubai to be the best place(تصبح) in the world for companies. • E2: The minister who has(يملك) a personal website further said(واضاف) he wanted(اريد) Dubai to become the best place(تصبح) in the world for companies. • Columns: Time Unit, Type, Relation, Parent • has(يملك): Present.State*, After/Before/Conc, said • said(واضاف): Past.Say*, After, <announced> • want(اريد): Present.State*+, After/Before/Conc, said • place(تصبح): Unspec.State*, After, want

  11. Detailed Example: Committed Belief • E1: The minister, who has(يملك) his own website(موقعا), also said(واضاف): "I want(اريد) Dubai to be(تصبح) the best place in the world for state-of-the-art technology companies." • E2: The minister who has(يملك) a personal website(موقعا) on the internet, further said(واضاف) that he wanted(اريد) Dubai to become(تصبح) the best place in the world for the advanced (hitech) technological companies. • Columns: Belief Unit, Type • has(يملك): CB • website(الانترنت)*: CB • said(واضاف): CB • want(اريد): NCB • be(تصبح): NA

  12. Detailed Example: Dialog Acts • #<M1.5>#Do you want me to pay Pasadena on Friday for these things? • #<M1.6>#or do you want me to hold off until I finish July and August [Request-info-either/or: commission to pay Pasadena or to delay paying Pasadena] #flink1.5(commission to pay Pasadena now) or (commission to pay Pasadena after Jul-Aug) • . . . • #<M1.11>#Please let me know what you’d like me to do [Request: answer to [or(M1.5,M1.6)]] #flink1.5(commission to pay Pasadena now) or (commission to pay Pasadena after Jul-Aug)

  13. Overall Assessment of Linguistic Annotations

  14. Relation Between Linguistic Annotations • Committed Belief ties into TMR modalities (belief, epistemic, etc.) • Possible to map TMR onto a temporal parse to link concepts, modality, and time: • SAID(واضاف) [Past.Say, After(<announced>)] • HAS(يملك) [Present.State, After(SAID)] • WANT(اريد) [Present.State, After/Before/Concurrent(SAID), volitive=.8] • PLACE(تصبح) [Unspec.State, After(WANT), evaluative=1]
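The linking of concepts, modality, and time can be pictured as merging separate annotation layers keyed by the same unit. The sketch below is an assumption-laden illustration, not the project's mapping procedure: the dictionary layout, the feature names `type`, `relation`, and `parent`, and the function `merge_layers` are all hypothetical. The values come from the slide.

```python
def merge_layers(*layers):
    """Combine per-unit feature dictionaries from several annotation layers."""
    merged = {}
    for layer in layers:
        for unit, features in layer.items():
            # Later layers add to, and may override, earlier features.
            merged.setdefault(unit, {}).update(features)
    return merged


# Temporal parse layer: type, relation, and parent unit.
temporal = {
    "SAID": {"type": "Past.Say", "relation": "After", "parent": "<announced>"},
    "HAS": {"type": "Present.State", "relation": "After", "parent": "SAID"},
    "WANT": {
        "type": "Present.State",
        "relation": "After/Before/Concurrent",
        "parent": "SAID",
    },
    "PLACE": {"type": "Unspec.State", "relation": "After", "parent": "WANT"},
}

# Modality layer: modality type mapped to its value.
modality = {
    "WANT": {"volitive": 0.8},
    "PLACE": {"evaluative": 1.0},
}

linked = merge_layers(temporal, modality)
for unit, features in linked.items():
    print(unit, features)
```

Units that appear in only one layer, such as SAID and HAS here, pass through unchanged.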

  15. Preliminary Automatic Results(Committed Belief, Latent Properties)

  16. Phase II: Moving Toward Knowledge Units • Linguistic Annotations: Lex. Relations, Dialog Units, Comm. Belief, Modality, Temporal Units • Attitudes, Sentiment, Beliefs, Intention • Knowledge Units: Person, Event, Personal Attributes, Relations, Time(line) • CONFIDENCE

  17. Matrix: Linguistic Annotation and Knowledge Units

  18. Matrix: Linguistic Annotation and Knowledge Units (continued)

  19. Entity-Centric Presentation Metaphor • FROM Linguistic Annotations: discourse units, lexical relations, named entities, committed belief, modality • TO Knowledge Objects: attributes, person-person/event-event relations, sentiment and/or attitudes, beliefs, intention/motivation, state of affairs • TASK: Populate descriptive facets (either pre-defined or dynamically created) associated with a person or event by reasoning over knowledge objects. • EXAMPLES: • Person-person relations: oppose, support, contradict, refute (related to sentiment, attitudes).  • Event-Event relations: precursor, causal, consequence, super-event, etc. • Personal Attributes: Name, Age, Height, DOB • Likes, Dislikes [Sentiment/attitudes, relations such as oppose, contradict, etc.] • People You Know [Relational information, e.g., sibling, co-worker, etc.] • Groups [Purpose/attitudes/sentiments of groups] • Status [State of affairs] • Activities [Purpose/attitudes/sentiments of groups] • Goals [Desired states of affairs, intentions/motivations]

  20. Exercise in October From dialogic text, can we derive: • A set of entities • A set of events • A plan or desired state of affairs • A set of relations among entities (or events) • Times of events (or plans) • Biographical information about entities

  21. Derivation of knowledge units may involve: • Mapping from a particular linguistic annotation type • Mapping from combinations of linguistic annotation types • None of the above: It might be possible to derive knowledge units directly from text (e.g., inferring personal attributes from lexical or non-lexical information).

  22. Building a Plan from Knowledge Units • Email 1: Subject: Meeting with Bob and Marla; Date: Mon, 19 Nov 2001 08:38:25; From: Ed Z. Boss; To: Sara A Secky; "Please request a meeting with Bob and Marla at 2pm tomorrow. - Ed" • Email 2: Subject: Re: Meeting with Bob and Marla; Date: Mon, 19 Nov 2001 08:50:05; From: Sara A Secky; To: Ed Z. Boss; "How about Wednesday at 3pm? - Sara" • Email 3: Subject: Re: Meeting with Bob and Marla; Date: Mon, 19 Nov 2001 09:20:50; From: Ed Z. Boss; To: Sara A Secky; "Fine. See you then." • Persons: P1=Sara; P2=Ed; P3=Bob; P4=Marla • P1: Gender: M, Age: 50 • P2: Gender: F, Age: 32 • Relation: Subordinate(P1,P2) [Conf=0.9] • Relation: Subordinate(P1,P3) [Conf=0.7] • Relation: Subordinate(P1,P4) [Conf=0.7] • E1=Meet(P2,P3,P4) • T(E1)=<11/20/2001 14:00> • T(E1)=<11/21/2001 15:00>
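The two T(E1) values follow from resolving the relative time expressions in the thread against each message's send time. A minimal sketch with Python's standard `datetime` module is given below; the helper names `tomorrow_at` and `next_weekday_at` are hypothetical, and treating the final "Fine. See you then." as acceptance of the counter-proposal is an interpretation, not something the slide spells out.

```python
from datetime import datetime, timedelta

WEDNESDAY = 2  # datetime.weekday() numbers Monday as 0


def tomorrow_at(sent, hour):
    """Resolve 'at <hour> tomorrow' relative to the send time."""
    day = sent + timedelta(days=1)
    return day.replace(hour=hour, minute=0, second=0, microsecond=0)


def next_weekday_at(sent, weekday, hour):
    """Resolve '<weekday> at <hour>' to the next such day after the send time."""
    days_ahead = (weekday - sent.weekday()) % 7 or 7
    day = sent + timedelta(days=days_ahead)
    return day.replace(hour=hour, minute=0, second=0, microsecond=0)


first_sent = datetime(2001, 11, 19, 8, 38, 25)   # "at 2pm tomorrow"
reply_sent = datetime(2001, 11, 19, 8, 50, 5)    # "How about Wednesday at 3pm?"

proposed = tomorrow_at(first_sent, 14)
counter = next_weekday_at(reply_sent, WEDNESDAY, 15)

# Assumed reading: the last message accepts the counter-proposal.
plan_time = counter

print("Proposed T(E1):", proposed.strftime("%m/%d/%Y %H:%M"))
print("Final T(E1):   ", plan_time.strftime("%m/%d/%Y %H:%M"))
```

A fuller treatment would also carry a confidence on each resolved time, in the same way the relations above carry one.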

  23. Corpus Example Involving Dialog Units • M1.1. Kim: • M1.2. I have completed the invoices for April, May and June • M1.3. and we owe Pasadena each month for a total of $3,615,910.62. • M1.4. I am waiting to hear back from Patti on May and June to make sure they are okay with her. • M1.5. Do you want me to pay Pasadena on Friday for these months • M1.6. or do you want me to hold off until I finish July and August? • M1.7. Again, I do not have all of the information for July and August, • M1.8. so I cannot give you any numbers. • M1.9. If I go by what is currently in the system as a guide, Pasadena would owe Enron a little over $1 mil. • M1.10. I need to forecast the money today, • M1.11. so please let me know what you . . . • M4.1. Patti is the one with the details, • M4.2. I’m just the deal maker • M4.3. and don’t have access to any of the systems. • M4.4. All I know is what fixed priced baseload deals we have. • M4.9. Kim

  24. Annotation Example for Exercise • Relation: Subordinate(P1,P2) • What would the final plan be and how do P1, P2, E1, E2 play a role? • Diagram nodes: P1, P2 • Legend: B = State, C = Current time, CB = committed belief • Request • Modality = volitive(.7)

  25. Resulting Knowledge Unit Diagram from October Exercise 2 (diagram; labels listed as extracted) • Entities: P1 = Kim Ward; P2 = Megan; P3 = Patti; P4 = Janinie; Org1: ___; Org2: Enron • Attribute slots for each person (unfilled): Age, Gender, Ed, First Lang, Nationality • Plan: Pay1(Org2, Org1, Amt1); NotTime(Now, Pay1); Time(> T1, Pay1) • T1: <Jul-Aug> • Amt1: F($Y, $X) • Other node and edge labels: Deal-maker; Communicate with; Subordinate (Confidence 0.6); Refuse request; HQ Financial Services and forecaster; No access; Loc: Pasadena; Loc: ___; Owe $X; Owe $Y; Employed-By; Used by Enron Fin Systems; 474; 305

  26. Summary and Next Steps • Phase I has resulted in linguistic annotations with relatively high language equivalences for multi-translation and multi-language cases. • Preliminary results for automatic annotation of latent properties and committed belief indicate that these are promising avenues for continuing research. • Phase II will focus on automatically induced knowledge units that may be derived from linguistic annotations or from independent properties of the input text. • We expect that the automatically produced knowledge objects will be crucial for language analysis systems and will improve the performance of those systems. • Areas of focus for upcoming Phase II may include personal attributes (latent properties), person-person relations, and state of affairs (which may include belief/intention/sentiment). • Confidence values are another critical aspect of information that may enable a more focused analysis of incoming data.

  27. References • Mohammad, Saif, Bonnie J. Dorr, and Graeme Hirst. "Towards Antonymy-Aware Natural Language Applications," Proceedings of NSF Symposium on Semantic Knowledge Discovery: Organization and Use, New York University, November 2008. • McShane, Marjorie, Sergei Nirenburg, and Stephen Beale. "Paraphrasing for Memory Management in Conversational Agents," to appear in Proceedings of AAAI Fall Symposium on Naturally Inspired AI, Arlington, VA, November 2008. • Mohammad, Saif, Bonnie Dorr, and Graeme Hirst. "Computing Word-Pair Antonymy," Proceedings of EMNLP-2008. • McShane, Marjorie, Sergei Nirenburg, and Stephen Beale. "Resolving Paraphrases to Support Modeling Language Perception in an Intelligent Agent," presented at STEP 08, Venice, September 2008. • Dorr, Bonnie J., David Farwell, Rebecca Green, Nizar Habash, Stephen Helmreich, Eduard Hovy, Lori Levin, Keith J. Miller, Teruko Mitamura, Owen Rambow, Florence Reeder, and Advaith Siddharthan. "Interlingual Annotation of Parallel Text Corpora: A New Framework for Annotation and Evaluation," under review for JNLE, 2008. • Nyberg, Eric, Eric Riebling, Richard C. Wang, and Robert Frederking. "Towards Enhanced Interoperability for Large HLT Systems: UIMA for NLP," LREC 2008, Marrakech, Morocco, May 31, 2008.

  28. Reserve Slides

  29. Detailed Example: Modality • The minister, {who has his own website}, also said: "I want Dubai to be the best place in the world for state-of-the-art technology companies." • ASSERTIVE-ACT-70(say); MINISTER-67(minister); MODALITY-71(want) [TYPE=VOLITIVE, VALUE=.8]; MODALITY-6() [TYPE=EVALUATIVE, VALUE=1]; GEO-POL-ENT-74(Dubai); FOR-PROF-CORP-79(company); TECHNOLOGY-78(technology) • The minister {who has a personal website on the internet}, further said that he wanted Dubai to become the best place in the world for the advanced (hitech) technological companies. • ASSERTIVE-ACT-941(say); MINISTER-936(minister); DISCOURSE-940(further); MODALITY-942(want) [TYPE=VOLITIVE, VALUE=1]; CHANGE-EVENT-944(become); MODALITY-2() [TYPE=EVALUATIVE, VALUE=1]; GEO-POL-ENT-943(Dubai); FOR-PROF-CORP-949(company) • Equivalence: Strict = 50%, Loose = 100% • Note: {} units omitted for simplicity

  30. Detailed Example: Temporal Parse • E1: The minister who has(يملك) his own website also said(واضاف) I want(اريد) Dubai to be the best place(تصبح) in the world for companies. • E2: The minister who has(يملك) a personal website further said(واضاف) he wanted(اريد) Dubai to become the best place(تصبح) in the world for companies. • Columns: Time Unit, Type, Relation, Parent • has(يملك): Present.State*, After/Before/Concurrent, said • said(واضاف): Past.Say*, After, <announced> (prev sent)* • want(اريد): Present.State*+, After/Before/Concurrent, said • place(تصبح): Unspec.State*, After, want • Timeline labels, in extracted order (axis: TIME): <announced>, said, has, want, place


  33. Manual Agreement Percentages for Modality

  34. Language Equivalences: Modality

  35. Manual Agreement Percentages: Temporal Parsing

  36. Language Equivalences: Temporal Parsing

  37. Language Equivalences: Committed Belief
