Semantic Annotation & Utility Evaluation Meeting: Feb 14, 2008 • Project Organization: Who is here? • Agenda • Meaning Layers and Applications • Ongoing work
Project Organization • CMU (Mitamura, Levin, Nyberg): Coreference, Entity relations, Factiveness • BBN (Ramshaw, Habash): Temporal Annotation, Coreference (complex) • Evaluation: Bonnie Dorr, David Yarowsky, Keith Hall, Saif Mohammad • UMBC (Nirenburg, McShane): Modality (polarity, epistemic, belief, deontic, volitive, potential, permissive, evaluative) • Columbia (Rambow, Passonneau): Dialogic Content, Factiveness • Affiliated Efforts: Ed Hovy, Martha Palmer, George Wilson (Mitre)
Who is here? • Kathy Baker (DoD) • Mona Diab (Columbia) • Bonnie Dorr (UMD) • Jason Duncan (DoD) • Tim Finin (JHU/APL) • Nizar Habash (Columbia) • Keith Hall (JHU) • Eduard Hovy (USC/ISI) • Lori Levin (CMU) • James Mayfield (JHU/APL) • Marjorie McShane (UMBC) • Teruko Mitamura (CMU) • Saif Mohammad (UMD) • Smaranda Muresan (UMD) • Sergei Nirenburg (UMBC) • Eric Nyberg (CMU) • Doug Oard (UMD) • Boyan Onyshkevych (DoD) • Martha Palmer (Colorado) • Rebecca Passonneau (Columbia) • Owen Rambow (Columbia) • Lance Ramshaw (BBN) • Gary Strong (DoD) • Clare Voss (ARL) • Ralph Weischedel (BBN) • George Wilson (Mitre) • David Yarowsky (JHU)
Semantic Annotation & Utility Evaluation Meeting: Today’s Plan • MORNING: • Site presentations should include an overview of the phenomena covered and utility-motivating examples extracted from the target corpus. Discussion of annotation conventions and interoperability issues should wait until the afternoon. • Discussion will be seeded by preliminary analysis from the hosts. The primary goal of discussion is to flesh out our collective assessment of what additional capabilities would become possible if a machine could reach near-human performance on annotation of these meaning layers, relative to applications operating on text without such meaning-layer analysis. • AFTERNOON: • Compatibility, interoperability, and database population, including integration into a larger KB environment. • Participants should bring thoughts on specific compatibility/interoperability and database-population issues relative to their meaning layers. (Slides forwarded in advance.)
Semantic Annotation & Utility Evaluation Meeting Agenda • 9:00 Boyan Onyshkevych - remarks on annotation and utility; goals for future • 9:30 David or Bonnie - Utility analysis overview • 9:45 Brief presentation of meaning layers and utility discussion: • Factiveness: Columbia, CMU • Factiveness utility discussion [Yarowsky] • Coreference: CMU, BBN • Coreference utility discussion [Mohammad] • Propbanking/Ontonotes: Palmer, Hovy • Propbanking/Ontonotes utility discussion [Dorr] • 11:00 Break • 11:15 Continuation • Temporal Annotation: BBN • Temporal Annotation utility discussion [Dorr] • Dialogic Content: Columbia • Dialogic Content utility discussion [Hall] • Modality: UMBC • Modality utility discussion [Yarowsky] • 12:15 Working Lunch: Continue discussion (from above) • 1:15 Compatibility, Interoperability, Database population [Mayfield] • 2:00 Discussion about interoperability and database population • 3:30 Break • 3:45 Future Plans: Immediate follow-on for Y1 completion, Broader goals for Y2 • 4:45 Wrap-up
Techniques, Issues, Applications: Contradiction/Redundancy • Techniques: • Identifying contradictory word pairs: “X has only a token presence in Europe” vs. “X has a large presence in Europe” • Issues: • Identifying word pairs that are not antonyms but convey opposite points when taken in context (as above) • Identifying which pairs are not contradictions because they refer to different entities • Applications: Knowledge discovery, KB population, question answering, summarization, coreference filtering • Capabilities beyond IR/keyword matching: • KB population: Finding evidence that supports or refutes a hypothesis • QA: Conveying opposite views on the same point of discussion • Knowledge Discovery: Identifying new information (anomalies) when analyzing extended-time events
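The contradiction check described on this slide can be sketched roughly as follows. This is only a toy illustration, not the technique used by any of the participating sites: the `CONTRASTING_PAIRS` lexicon and the `is_contradiction` function are hypothetical, and the entity check is reduced to a plain string comparison.

```python
# Hypothetical hand-built lexicon of word pairs that contrast in context.
# "token" and "large" are not antonyms, but they convey opposite points
# in the slide's example.
CONTRASTING_PAIRS = {frozenset({"token", "large"})}


def is_contradiction(entity_a, sentence_a, entity_b, sentence_b):
    """Flag two sentences as contradictory if they concern the same entity
    and differ by a known contrasting word pair."""
    # Sentences about different entities are not treated as contradictions.
    if entity_a != entity_b:
        return False
    words_a = set(sentence_a.lower().split())
    words_b = set(sentence_b.lower().split())
    only_a = words_a - words_b  # words unique to the first sentence
    only_b = words_b - words_a  # words unique to the second sentence
    return any(
        frozenset({x, y}) in CONTRASTING_PAIRS for x in only_a for y in only_b
    )
```

A real system would need entity resolution rather than string equality, which is exactly the second issue the slide raises.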
Techniques, Issues, Applications: Coreference Resolution • Techniques (CMU, BBN): • Member and subset, member-base, reference type: “Militant leaders including Bill Andres have received death threats.” • Resolution of pronominal coreference ambiguity: “Darius and his son ruled Persia. He was born in August.” • Issues: • Teasing out the information on the entity of interest from information about other entities sharing the same name • Determining that the person mentioned here is the same as the person we know from earlier • Applications: Biography creation, information retrieval, KB population, question answering, inferencing • Capabilities beyond IR/keyword matching: • Biography creation and SN Analysis: Identifying more information about an entity through coreferential inferencing than is available by keyword matching • Knowledge Discovery: Determining that a person mentioned in one part of the text (or in a different text) is a person who is currently being tracked
Techniques, Issues, Applications: Dialogic Analysis • Techniques (Columbia): • Thread-wide annotation: Information-Fact (Meetings run late on Mondays.), Information-Opinion (His progress on that project is slow.), External-event-planning (The project report is due tomorrow.), Social (Did you see the game last night?). • Dialog Function Units: INFORM (The meeting is at 10), REJECT (I don’t know), REQUEST (When is it due?), COMMIT (I will get back to you on that). • Belief: Non-committed belief (I am not sure), Committed belief (I am certain). • Issues: • “If” clauses are complex: “If J dies, I will cry.” Purely hypothetical, but the causal link could be a committed belief. • Missing emails in a thread. • Applications: Deception Detection, SN Analysis, Sentiment • Capabilities beyond IR/keyword matching: • Deception: Determine if person X is consistently telling person Y something that isn’t true. • SN Analysis: Determine structure of network from information beyond metadata, e.g., bureaucratic structure from communication pattern. • Sentiment: Determine opinion of X, beyond who said what to whom.
Techniques, Issues, Applications: Temporal Ordering • Techniques (BBN): • Time unit identification: temporally salient phrase, e.g., “last night” • Temporal type assignment: Event, Say, Be, Date, Range • Inherent time: hypothetical, partially specified, past, current, future • Temporal parent assignment: relative clause, conjunction, etc. • Temporal relationship assignment: before, after, etc.: “After Obama’s presentation [e1] yesterday evening [t1], Clinton made [e2] a few remarks.” during(e1, t1), after(e2, e1) • Issues: • Temporal coreference still being worked out (Monday is his return day.) • Applications: Biography Creation, SN Analysis, KB population, self-learning tutor/guide, knowledge discovery, MT • Capabilities beyond IR/keyword matching: • KB population: Queries should take time into consideration • Knowledge discovery: Identifying unusual/anomalous events • MT: Generation of appropriate tense depends on temporal analysis
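As a rough illustration of how the slide's relations, during(e1, t1) and after(e2, e1), might be stored and queried, here is a minimal sketch. The `TemporalRelation` class and `happens_after` function are hypothetical names chosen for this example; they are not part of the BBN annotation scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalRelation:
    relation: str  # e.g. "during", "after", "before"
    source: str    # first argument, e.g. "e2"
    target: str    # second argument, e.g. "e1"


# Relations taken from the slide's example sentence.
RELATIONS = [
    TemporalRelation("during", "e1", "t1"),  # presentation during yesterday evening
    TemporalRelation("after", "e2", "e1"),   # remarks after the presentation
]


def happens_after(later, earlier, relations):
    """Return True if `later` follows `earlier` through a chain of
    after/before relations (other relation types are ignored here)."""
    # Edge a -> b means "a is after b".
    edges = {}
    for r in relations:
        if r.relation == "after":
            edges.setdefault(r.source, set()).add(r.target)
        elif r.relation == "before":
            edges.setdefault(r.target, set()).add(r.source)
    # Depth-first search from `later` looking for `earlier`.
    stack, seen = [later], set()
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt == earlier:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False
```

Chaining the relations this way is one possible reading of "queries should take time into consideration": an ordering that is never annotated directly can still be answered from the annotated pairs.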
Techniques, Issues, Applications: Factiveness, Confidence • Techniques (CMU, Columbia): • Deducing the probability of truth of a statement based on text analysis • Analysis of other aspects of the "assertional force" or conditional truth of a statement. For example: “Guzman lives in Lima”, “Guzman must live in Lima”, “I doubt Guzman lives in Lima”, “If … then Guzman lives in Lima” • Issues: • Knowledge representation for truth status and probability • Integrating and modifying the truth value of individual assertions relative to other facts, either elsewhere in the document or in existing databases • Resolving ambiguities such as "must" as requirement vs. confidence estimation • Applications: KB population, text mining and visualization, question answering, sentiment and deception analysis • Capabilities beyond IR/keyword matching: • Text Mining: Filtering imported textual assertions based on truth status (e.g. negated, conditional) and assigning confidence values to the imported knowledge • KB Population: Determine which systems onsite are vulnerable to threat • Sentiment: “Should the US continue fighting?”
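The text-mining capability on this slide (filtering assertions by truth status and attaching confidence values) could look something like the sketch below. The `TruthStatus` categories, the `PLACEHOLDER_CONFIDENCE` numbers, and `filter_for_import` are all assumptions made for illustration; the slide does not specify a representation, and the confidence values are arbitrary placeholders rather than measured results.

```python
from dataclasses import dataclass
from enum import Enum


class TruthStatus(Enum):
    ASSERTED = "asserted"        # "Guzman lives in Lima"
    INFERRED = "inferred"        # "Guzman must live in Lima" (epistemic reading of "must")
    DOUBTED = "doubted"          # "I doubt Guzman lives in Lima"
    CONDITIONAL = "conditional"  # "If ... then Guzman lives in Lima"


# Arbitrary placeholder confidences, NOT values from the source.
PLACEHOLDER_CONFIDENCE = {
    TruthStatus.ASSERTED: 0.9,
    TruthStatus.INFERRED: 0.7,
    TruthStatus.DOUBTED: 0.2,
}


@dataclass(frozen=True)
class Assertion:
    text: str
    status: TruthStatus


def filter_for_import(assertions, min_confidence=0.5):
    """Drop conditional assertions, attach a confidence to the rest,
    and keep only those at or above the threshold."""
    kept = []
    for a in assertions:
        if a.status is TruthStatus.CONDITIONAL:
            continue  # truth depends on an unresolved condition
        confidence = PLACEHOLDER_CONFIDENCE[a.status]
        if confidence >= min_confidence:
            kept.append((a.text, confidence))
    return kept


EXAMPLES = [
    Assertion("Guzman lives in Lima", TruthStatus.ASSERTED),
    Assertion("Guzman must live in Lima", TruthStatus.INFERRED),
    Assertion("I doubt Guzman lives in Lima", TruthStatus.DOUBTED),
    Assertion("If … then Guzman lives in Lima", TruthStatus.CONDITIONAL),
]
```

Calling `filter_for_import(EXAMPLES)` with the default threshold keeps the plain and "must" statements and drops the doubted and conditional ones. Deciding whether "must" signals confidence or requirement is the ambiguity listed under Issues, and this sketch simply assumes it has already been resolved.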
Techniques, Issues, Applications: Modality • Techniques (UMBC): • Assessment of a broad set of "modality" conditions of textual statements: polarity, epistemic, belief, deontic, volitive, potential, permissive, evaluative, epiteuctic, etc.: • He is trying to get Hamas to co-exist with Israel (volitive) • Conservative Israelis are skeptical (belief & uncertainty) • Analysis of potential linguistic indicators for each modality type and performance of disambiguation when multiple types are possible • Issues: • Relationship to (and inter-rater agreement with) factiveness analysis • Coverage and inherent ambiguity • Applications: Knowledge discovery, KB population, text mining, visualization, question answering, sentiment and deception analysis • Capabilities beyond IR/keyword matching: • Knowledge discovery: adding modality “status” attributes to extracted facts and supporting decisions based on these distinctions (e.g. desire, intention, expectation) • KB population: Determine if a particular country succeeded in building weapons (epiteuctic modality). • Sentiment analysis: Determine whether a particular political group is skeptical (belief & uncertainty)
Ongoing work • Analysis of intra-site and cross-site annotation agreement rates • Additional rounds of utility analysis • Initial assessment of computational feasibility