Temporal Ordering of Events in the News Domain Preethi Raghavan
Motivation • Users have temporal information needs • Query: “Prime Minister United Kingdom 2000” • Query: “Prime Minister United Kingdom immediately before 2000” Problem • Traditional information retrieval systems do not exploit the temporal content in documents Possibilities • Integrate a temporal dimension into an information retrieval framework • Question answering • Relative ordering of events in multi-document summarization
TimeBank Corpus Characteristics • News reports annotated using the TimeML specification • 186 documents, with a total of 68.5K words • 10% of the corpus is held out as test data • TimeML annotations • EVENT: typically verbs • TIMEX3: temporal expressions • TLINK: relates events using temporal relations modeled after Allen’s Interval Algebra (James F. Allen: Maintaining Knowledge about Temporal Intervals. Communications of the ACM, 1983)
Example Unordered Events in a Document • New evidence is suggesting that a series of bombings in Atlanta and last month’s explosion at an Alabama women's clinic might be related • In 1996, a bomb blast shocks the Olympic games • One person is killed
Simplified Sample TimeML Annotation A bomb <EVENT eid="e138">blast</EVENT> <EVENT eid="e11">shocks</EVENT> the Olympic games. <TLINK relType="BEFORE" eventID="e138" relatedToEvent="e11"/>
Methodology • Infer partial order by learning the relation between event pairs in a document • Collapsed labels used: • BEFORE = {IBEFORE, BEFORE} • AFTER = {IAFTER, AFTER} • OVERLAPS = {SIMULTANEOUS, INCLUDES, INCLUDED_BY, DURING, BEGINS, ENDS, ENDED_BY, BEGUN_BY, IDENTITY} • For instance, in document d1 • e2 BEFORE e3 • e2 AFTER e1 • e3 OVERLAPS e4 • Infer global temporal order using the proposed approaches • d1: e1, e2, e3
Event Pairs Classification: Feature Set • Training data: 3000 event pairs • Testing data: 481 event pairs • Features: • Event Class: Occurrence (bombing, discovered), Reporting (say) • Tense: Present, Past etc. • Aspect: Progressive, Perfective etc. • Polarity: Positive, Negative • Event Phrase • Temporal Expression occurring in the same sentence as the event • Same aspect, Same tense
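As a rough sketch of how the listed features might be assembled for one event pair before being passed to a classifier, consider the following. The `Event` fields, the feature names, and the attribute values in the example are all hypothetical; they are illustrative and not taken from TimeBank or from the original experiments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """Hypothetical container for the per-event attributes named on the slide."""
    eid: str
    event_class: str              # e.g. OCCURRENCE, REPORTING
    tense: str                    # e.g. PRESENT, PAST
    aspect: str                   # e.g. PROGRESSIVE, PERFECTIVE
    polarity: str                 # POS or NEG
    phrase: str                   # the event phrase itself
    timex: Optional[str] = None   # temporal expression in the same sentence


def pair_features(a, b):
    """Build a flat feature dictionary for the ordered event pair (a, b)."""
    feats = {}
    for prefix, ev in (("e1", a), ("e2", b)):
        feats[prefix + "_class"] = ev.event_class
        feats[prefix + "_tense"] = ev.tense
        feats[prefix + "_aspect"] = ev.aspect
        feats[prefix + "_polarity"] = ev.polarity
        feats[prefix + "_phrase"] = ev.phrase
        feats[prefix + "_timex"] = ev.timex or "NONE"
    # Pairwise agreement features.
    feats["same_tense"] = a.tense == b.tense
    feats["same_aspect"] = a.aspect == b.aspect
    return feats


# Illustrative attribute values only.
blast = Event("e138", "OCCURRENCE", "NONE", "NONE", "POS", "blast", "1996")
shocks = Event("e11", "OCCURRENCE", "PRESENT", "NONE", "POS", "shocks", "1996")
feats = pair_features(blast, shocks)
```

A dictionary like `feats` can be one-hot encoded and fed to any maximum-entropy (multinomial logistic regression) implementation.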
Event Pair Classification Results • Event-Event Relation using 13 Labels • Event-Event Relation using 3 Labels
Event Pair Classification Results • MaxEnt, overall accuracy 56.1% • (Majority classifier 52.4%) • Other Experiments • Experiments in Mani et al. use 6 disjunctive labels; overall accuracy 62.5% • Collapsing BEFORE and AFTER into the same category will increase accuracy
Event Pair Classification Results • TimeBank + Aquaint Corpus (6234 Event-Event pairs) • 6 labels • (BEGINS, SIMULTANEOUS, BEFORE, IBEFORE, ENDS, INCLUDES) • MaxEnt overall accuracy 62.179% • 2 labels • (BEFORE, OVERLAPS) • MaxEnt overall accuracy 69.711%
Inferring Global Temporal Order • Ordering of events as a Temporal Directed Acyclic Graph (TDAG) • Nodes: Events • Edges: Temporal relation between events • Cycles are prohibited, since the graph encodes order • Coarse annotation scheme • Does not capture overlap • Only captures precedence relations
Problem • Given a partial ordering of event pairs, how do we generate a TDAG to establish global ordering?
Greedy Approach • Greedy Algorithm (1) Sort edges according to scores. (2) Start with an empty graph. (3) Add the current largest edge into the graph. (4) Apply transitive closure and constraints. (5) Repeat (3) and (4) until all edges are considered.
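The five steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function name `greedy_tdag` is hypothetical, each edge `(u, v, score)` is taken to mean "u BEFORE v", the only constraint enforced is acyclicity, and edges already implied by transitivity are skipped as a design choice.

```python
from itertools import product


def greedy_tdag(scored_edges):
    """Greedily build an acyclic precedence graph from scored edges.

    scored_edges: iterable of (u, v, score), read as "u BEFORE v".
    Returns (accepted_edges, closure), where closure is the transitive
    closure of the accepted edges as a set of (earlier, later) pairs.
    """
    closure = set()   # step 2: start with an empty graph
    accepted = []
    # Step 1: sort edges by score, highest first.
    for u, v, score in sorted(scored_edges, key=lambda e: e[2], reverse=True):
        # Constraint: reject self-loops and edges that would close a cycle.
        if u == v or (v, u) in closure:
            continue
        # Assumption: an edge already implied by transitivity is not re-added.
        if (u, v) in closure:
            continue
        # Step 3: add the current largest edge.
        accepted.append((u, v, score))
        # Step 4: update the transitive closure. Every ancestor of u
        # (and u itself) now precedes every descendant of v (and v itself).
        preds = {a for (a, b) in closure if b == u} | {u}
        succs = {b for (a, b) in closure if a == v} | {v}
        closure |= set(product(preds, succs))
    # Step 5: the loop ends once all edges have been considered.
    return accepted, closure


# Three pairwise predictions that together would form a cycle.
candidates = [("e1", "e2", 0.9), ("e2", "e3", 0.8), ("e3", "e1", 0.7)]
accepted, closure = greedy_tdag(candidates)
```

In this example the lowest-scoring edge (e3 before e1) is rejected because the closure already contains e1 before e3, inferred from the two higher-scoring edges.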
Integer Linear Programming • For a document with N event pairs, each pair (i, j) can be related in the graph as • i BEFORE j • i AFTER j • i not connected to j • Given the probability scores for the relation assigned to each event pair • Objective: • Optimize the score of a TDAG by maximizing the sum of the scores of all edges in the graph
ILP Constraints • No cycles • Enforce transitivity • Connectivity constraint
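One common way to write such a program is sketched below. The slides name only the objective and the three constraint types, so the specific encoding is an assumption: the binary variable $x_{ij}$ equals 1 when event $i$ is BEFORE event $j$ (equivalently, $j$ AFTER $i$), $p_{ij}$ is the classifier score for that relation, and a pair with $x_{ij} = x_{ji} = 0$ is not connected. The connectivity constraint is written here as "every event takes part in at least one edge", which is one possible reading.

```latex
\begin{align*}
\max_{x} \quad & \sum_{i \neq j} p_{ij}\, x_{ij} \\
\text{s.t.} \quad
& x_{ij} \in \{0, 1\} && \forall\, i \neq j \\
& x_{ij} + x_{ji} \le 1 && \forall\, i < j
  && \text{(at most one direction per pair)} \\
& x_{ij} + x_{jk} - x_{ik} \le 1 && \forall\, \text{distinct } i, j, k
  && \text{(transitivity)} \\
& \sum_{j \neq i} \left( x_{ij} + x_{ji} \right) \ge 1 && \forall\, i
  && \text{(connectivity)}
\end{align*}
```

Under this encoding no separate cycle constraint is needed: if a cycle $i_1 \to i_2 \to \dots \to i_k \to i_1$ existed, transitivity would force $x_{i_1 i_k} = 1$ while the closing edge gives $x_{i_k i_1} = 1$, violating the one-direction constraint.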
Inferring Global Temporal Order • TDAG generated using ILP
Observations • ILP generates a feasible solution, but not necessarily an optimal one • In certain cases, it recognizes the presence of a link but cannot accurately predict its direction • A single wrongly inferred relation may lead to multiple wrong inferences • Against the reference TDAG: • ILP gives 80% accuracy • Greedy gives 60% accuracy
Conclusions • Accuracy for 6 disjunctive labels matches the baseline by Mani et al. for event pair relation classification • Global ordering helps infer new relations between events • The inferred relations could also be used to enlarge the training data and learn from a larger corpus
References • Philip Bramsen, Pawan Deshpande, Yoong Keok Lee, Regina Barzilay. Inducing Temporal Graphs. EMNLP (2006) • Inderjeet Mani, Marc Verhagen, Ben Wellner, Chong Min Lee, James Pustejovsky. Machine Learning of Temporal Relations. ACL (2006) • J. Pustejovsky, J. Castano, R. Ingria, R. Sauri, R. Gaizauskas, A. Setzer, G. Katz. TimeML: Robust Specification of Event and Temporal Expression in Text. IWCS (2003) • J. F. Allen. Towards a General Theory of Action and Time. Artificial Intelligence, July 1984 • www.timeml.org/site/timebank/timebank.html • Mixed Integer Programming Solver: CPLEX • Modeling tool: AMPL