Overview of the merger prototype
Overview • Background: The MUMIS project • Cross-document annotation merging • Alignment of parallel fragments • Unification of aligned fragments • Clean-up of unified fragments • Reasoning • Evaluation • Future work & Conclusions
The MUMIS project • Semantic access to a multimedia database. • Subject: Soccer • Corpus: Video recordings of matches, formal texts, ‘ticker’ texts. • Approach: • Extract knowledge from textual sources • Align this (time-based) knowledge with video • Do retrieval on the annotation, returning the corresponding video fragments to the user • Main subject of this presentation: Merging the annotations resulting from separate texts into one cross-document annotation.
Merging • Intention of merging: • Start with various texts • Annotate each text individually • Combine the annotations • Example match: Netherlands – Yugoslavia (European Championship 2000)
Two types of text in merger: • Formal texts • Ticker texts
Example formal text • Netherlands-Yugoslavia • Final score: 6-1 • Referee: Garcia Aranda • Goals: • 24' Patrick Kluivert • 90' Marc Overmars • 91' Savo Milosevic • Substitutions: • 53' out : Nisa Saveljic in : Jovan Stankovic • 58' out : Patrick Kluivert in : Roy Makaay • Yellow Cards: • Paul Bosvelt
Example ticker text (BBC) • 19 mins: Bergkamp scuffs his left-foot shot but still forces Kralj into a diving save low down to his left. • 20 mins: Edgar Davids wastes the best chance of the game so far when he blazes over with just the goalkeeper to beat after being put through by Bergkamp. • 24 mins: Kluivert puts Holland in front after latching onto a wonderful chip from Bergkamp and then planting a right-foot shot past Kralj from eight yards. • 25 mins: Boudewijn Zenden comes close to doubling Holland's lead when he fires in low, right-foot shot which Kralj just about hangs onto.
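Ticker entries carry a minute stamp, which is what makes them alignable with other sources and with video. A minimal sketch of reading such lines into (minute, text) records follows; the `TickerEntry` type, the `parse_ticker` function and the regular expression are illustrative assumptions, not part of the MUMIS tooling.

```python
import re
from typing import List, NamedTuple


class TickerEntry(NamedTuple):
    minute: int  # match minute stated at the start of the ticker line
    text: str    # free-text description of what happened


# Assumed line shape, based on the BBC example: "<number> mins: <text>"
TICKER_LINE = re.compile(r"^\s*(\d+)\s*mins?:\s*(.+)$")


def parse_ticker(lines: List[str]) -> List[TickerEntry]:
    """Turn raw ticker lines into time-stamped entries, skipping other lines."""
    entries = []
    for line in lines:
        match = TICKER_LINE.match(line)
        if match:
            entries.append(TickerEntry(int(match.group(1)), match.group(2).strip()))
    return entries


sample = [
    "19 mins: Bergkamp scuffs his left-foot shot but still forces Kralj "
    "into a diving save low down to his left.",
    "24 mins: Kluivert puts Holland in front after latching onto a wonderful "
    "chip from Bergkamp and then planting a right-foot shot past Kralj from eight yards.",
]
entries = parse_ticker(sample)
for entry in entries:
    print(entry.minute, entry.text[:40])
```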
Example of parallel fragments • BBC - 15: • Van der Sar pulls of great save to block Mijatovic's shot after Savo Milosevic has cut through the Dutch defence like a knife. • Guardian - 17: • Mijatovic, played in with a quick square ball from Milosevic, finds himself one-on-one with van der Sar 10 yards out. He picks his spot, but unfortunately for Mijatovic, it's the spot occupied by van der Sar. A great save and Yugoslavia should be one-nil up. • Kickers 15: • Milosevic auf Mijatovic, doch der Stuermer vom AC Florenz scheitert aus 12 Metern freistehend an van der Sar. • WEBTEC 15: • Milosevic filtreert door de Nederlandse defensie door één beweging en legt af voor Mijatovic. Deze laatste trapt op van der Sar.
Merging process: overview • 2-document alignment • N-document alignment • Unification of events from separate sources • Special situations
Merging process: 2-document alignment • Step 1 of the merging process: merge the annotations of 2 texts
Merging process: 2-document alignment • [Figure: fragments of Source A and Source B]
Merging process: 2-document alignment • The strongest binding is selected, ruling out certain other bindings.
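The selection step can be sketched as a greedy loop: take the strongest remaining binding, then discard every binding that conflicts with it. The slides do not say how binding strength is computed or exactly which bindings are ruled out, so the sketch below assumes precomputed scores and a one-to-one constraint (each fragment binds to at most one fragment in the other source). The function name `align_two` and the scores are hypothetical.

```python
from typing import List, Tuple

# (score, index of fragment in source A, index of fragment in source B)
Binding = Tuple[float, int, int]


def align_two(bindings: List[Binding]) -> List[Tuple[int, int, float]]:
    """Greedily pick the strongest bindings under a one-to-one constraint."""
    chosen = []
    used_a, used_b = set(), set()
    # Strongest binding first; ties are broken by the fragment indices.
    for score, a, b in sorted(bindings, reverse=True):
        if a in used_a or b in used_b:
            continue  # ruled out by a stronger binding selected earlier
        chosen.append((a, b, score))
        used_a.add(a)
        used_b.add(b)
    return sorted(chosen)


# Invented candidate bindings between three fragments of A and three of B.
candidates = [(0.9, 0, 0), (0.8, 0, 1), (0.7, 1, 1), (0.4, 2, 1), (0.3, 2, 2)]
alignment = align_two(candidates)
print(alignment)  # [(0, 0, 0.9), (1, 1, 0.7), (2, 2, 0.3)]
```

A greedy choice is simple but not guaranteed to maximise the total score; an assignment solver could replace it if that matters.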
Merging process: N-document alignment • Given the 2-document alignments for each pair of sources, find the N-document alignment in which all fragments describing the same scene, across all separate sources, are aligned.
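One way to picture this step is to treat every fragment as a node and every 2-document alignment as an edge; each connected component is then one scene. The sketch below uses a small union-find for that. It assumes pairwise alignments are simply closed transitively, which the slides do not confirm, and it does not handle the case where two fragments from the same source end up in one scene. The fragment labels reuse the source names and minutes from the examples; the alignment pairs themselves are illustrative.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

Fragment = Tuple[str, int]  # (source name, minute stamp)


def n_document_alignment(
    pairwise: Iterable[Tuple[Fragment, Fragment]]
) -> List[List[Fragment]]:
    """Group fragments into scenes by joining all pairwise alignments."""
    parent: Dict[Hashable, Hashable] = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairwise:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    scenes = defaultdict(set)
    for node in parent:
        scenes[find(node)].add(node)
    return sorted(sorted(scene) for scene in scenes.values())


pairs = [
    (("BBC", 15), ("Guardian", 17)),
    (("Guardian", 17), ("Kickers", 15)),
    (("Kickers", 15), ("WEBTEC", 15)),
    (("BBC", 24), ("Formal", 24)),
]
scenes = n_document_alignment(pairs)
for scene in scenes:
    print(scene)
```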
Merging and reasoning: types of rules • Within events or scenes: Player1 and Player2 will not be the same person; a player performing a save will not score a goal in the same scene; etc. • Role of teams and events: offensive vs. defensive • Combinations of events that probably have the same player: ShotOnGoal+Goal, Penalty+HitThePost • Terminology of authors may vary: Cross vs. Pass, Save vs. Clearance
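Two of these rule types lend themselves to a short sketch: mapping author-specific terms onto one event type, and copying a player between events that probably share one. The mapping direction (Cross to Pass, Clearance to Save), the dictionary-based event format and the function names are assumptions made for illustration; the slides only state that such rules exist.

```python
from typing import Dict, List, Optional

# Assumed canonical forms for terms that authors use interchangeably.
SYNONYMS = {"Cross": "Pass", "Clearance": "Save"}

# Event combinations that probably involve the same player (from the slide).
SAME_PLAYER_PAIRS = [("ShotOnGoal", "Goal"), ("Penalty", "HitThePost")]

Event = Dict[str, Optional[str]]  # {"type": ..., "player": ... or None}


def normalize(event_type: str) -> str:
    """Map a source-specific term onto its assumed canonical event type."""
    return SYNONYMS.get(event_type, event_type)


def fill_missing_players(scene: List[Event]) -> List[Event]:
    """Normalize event types, then share the player within paired events."""
    events = [dict(event, type=normalize(event["type"])) for event in scene]
    by_type = {event["type"]: event for event in events}
    for first, second in SAME_PLAYER_PAIRS:
        a, b = by_type.get(first), by_type.get(second)
        if a is None or b is None:
            continue
        if a["player"] is None:
            a["player"] = b["player"]
        elif b["player"] is None:
            b["player"] = a["player"]
    return events


# Hypothetical extraction for the 24th-minute scene, with one player missing.
scene_24 = [
    {"type": "Cross", "player": "Bergkamp"},
    {"type": "ShotOnGoal", "player": None},
    {"type": "Goal", "player": "Kluivert"},
]
completed = fill_missing_players(scene_24)
print(completed)
```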
Reasoning: mistakes in IE • The information extraction (IE) component sometimes makes mistakes. Example rules have been applied to correct some of these.
Reasoning: mistakes in IE • Fix: The goal attributed to Kralj (the Yugoslavian keeper) is removed
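A fix of this kind could be expressed with the rule that a player performing a save will not score a goal in the same scene. The sketch below drops any Goal whose player also has a Save in that scene. Whether the prototype removed Kralj's goal through this rule or through another one (for example, one based on team roles) is not stated, and the input scene is a hypothetical extraction result, not the actual annotation.

```python
from typing import List, Tuple

Event = Tuple[str, str]  # (event type, player)


def remove_impossible_goals(scene: List[Event]) -> List[Event]:
    """Drop Goal events credited to a player who makes a Save in the scene."""
    savers = {player for event_type, player in scene if event_type == "Save"}
    return [
        (event_type, player)
        for event_type, player in scene
        if not (event_type == "Goal" and player in savers)
    ]


# Hypothetical faulty extraction: the keeper is credited with a goal.
extracted = [("ShotOnGoal", "Kluivert"), ("Save", "Kralj"), ("Goal", "Kralj")]
cleaned = remove_impossible_goals(extracted)
print(cleaned)  # [('ShotOnGoal', 'Kluivert'), ('Save', 'Kralj')]
```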
Evaluation: What do we want to know? • Quality of the merger in itself • The advantages and disadvantages of merging
Evaluation: Quality of the merger • Quality of alignments • Quality of unification • The effect of the quality of the original information extraction on both
Evaluation: Approach • Create gold standard annotations for single sources • Create gold standard merged annotation of all sources • Run merger in different conditions • Compare everything with everything
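Comparing a merger run against the gold standard can be sketched as set comparison over aligned fragment pairs. The slides do not name the measures used, so precision, recall and F1 are an assumption here, and the pairs below are toy data, not results from the evaluation.

```python
from typing import Iterable, Tuple

Pair = Tuple[int, int]  # (fragment index in source A, fragment index in source B)


def score_alignment(predicted: Iterable[Pair], gold: Iterable[Pair]):
    """Return (precision, recall, F1) of predicted pairs against gold pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: 2 of the 3 predicted pairs occur among the 4 gold pairs.
gold_pairs = {(0, 0), (1, 1), (2, 2), (3, 4)}
predicted_pairs = {(0, 0), (1, 1), (2, 3)}
precision, recall, f1 = score_alignment(predicted_pairs, gold_pairs)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.5 0.571
```

Running the same function on alignments built from machine IE and from manual IE would give the two conditions a common yardstick.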
Evaluation: Results • Alignments based on machine IE
Evaluation: Results • Alignments based on manual IE
Evaluation: Conclusions • Quality of alignments is pretty good. • Better IE improves alignments. • Low-quality IE does not degrade alignments too much.
Extra example – the source • Events: Pass (Milosevic), ShotOnGoal (Mijatovic), Save (Van der Sar) • BBC - 15: • Van der Sar pulls of … • Milosevic has cut … • Guardian - 17: • Mijatovic, played in with a quick square ball from Milosevic, finds himself one-on-one with van der Sar 10 yards out. He picks his spot, but unfortunately for Mijatovic, it's the spot occupied by van der Sar. A great save and Yugoslavia should be one-nil up. • Kickers 15: • Milosevic auf Mijatovic, doch der Stuermer vom AC Florenz scheitert aus 12 Metern freistehend an van der Sar. • WEBTEC 15: • Milosevic filtreert door de Nederlandse defensie door één beweging en legt af voor Mijatovic. Deze laatste trapt op van der Sar.
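Unifying the aligned fragments of this scene could look like the sketch below: the events extracted from each source are pooled, and for each event type the most frequently named player is kept. Majority voting is an assumed strategy, not necessarily what the prototype does, and the per-source event lists are hypothetical, since the slides show only the source texts and the resulting events.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, str]  # (event type, player)


def unify(per_source: Dict[str, List[Event]]) -> Dict[str, str]:
    """Pool events from all sources and keep the majority player per type."""
    votes = defaultdict(Counter)
    for events in per_source.values():
        for event_type, player in events:
            votes[event_type][player] += 1
    return {
        event_type: counts.most_common(1)[0][0]
        for event_type, counts in votes.items()
    }


# Hypothetical per-source extractions for the 15th-minute scene.
extractions = {
    "BBC": [("Save", "Van der Sar"), ("ShotOnGoal", "Mijatovic")],
    "Guardian": [
        ("Pass", "Milosevic"),
        ("ShotOnGoal", "Mijatovic"),
        ("Save", "Van der Sar"),
    ],
    "Kickers": [("Pass", "Milosevic")],
}
unified = unify(extractions)
print(unified)
```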