Text Analysis Conference: Knowledge Base Population 2013
Hoa Trang Dang, National Institute of Standards and Technology
TAC KBP Goals
• Goal: Populate a knowledge base (KB) with information about entities as found in a collection of source documents, following a specified schema for the KB
• KBP 2009-2011: Focus on augmenting an existing KB. Decompose KBP into two tasks:
  • Entity-Linking: Link each given named entity mention to a node in the reference KB (or create a new node)
  • Slot-Filling: Learn attributes about target entities from the source documents and add new information about the entity to the reference KB
• KBP 2012: Combine entity-linking and slot-filling to build a KB from scratch -> Cold Start
• KBP 2013:
  • Conversational, informal data (discussion fora)
  • Temporal constraints for Slot Filling (2011 pilot)
  • Sentiment analysis for Slot Filling
TAC KBP 2013 Track Participants
• Track coordinators
  • Hoa Dang (Slot Filler Validation)
  • Jim Mayfield (Entity Linking, Cold Start KBP)
  • Margaret Mitchell (Sentiment Slot Filling)
  • Mihai Surdeanu (English Slot Filling and Temporal Slot Filling)
• LDC linguistic resource providers: Joe Ellis, Jeremy Getman, Justin Mott, Xuansong Li, Kira Griffitt, Stephanie M. Strassel, Jonathan Wright
• Coordinators emeritus: Ralph Grishman, Heng Ji
• Advisor: Boyan Onyshkevych
• 45 teams
• 14 countries (21 USA, 9 China, 3 Spain, 2 Germany, ...)
6 (8) TAC KBP 2013 Tracks
• Entity-Linking
  • English
  • Chinese
  • Spanish
• Slot-Filling (English)
  • Regular
  • Sentiment
  • Temporal
• Slot Filler Validation Task
• Cold Start (English)
Entity Linking and Slot Filling Tracks
• Goal: Augment a reference knowledge base (KB) with info about query entities (PER, ORG, GPE) as found in a diverse collection of documents
• Reference KB: Oct 2008 Wikipedia snapshot. Each KB node corresponds to a Wikipedia page and contains:
  • Infobox
  • Wiki_text (free text not in infobox)
• English source documents:
  • 1M News docs
  • 1M Web docs
  • 99K Discussion Forum docs (threads)
• Chinese source documents: 2M news, 800K Web
• Spanish source documents: 900K news
Entity-Linking Evaluation Results
• English
  • Participants: 26 teams
  • Highest F1: 0.721 (0.730 in 2012)
  • Median F1: 0.583 (0.536 in 2012)
• Chinese
  • Participants: 4 teams
  • Highest F1: 0.622 (0.740 in 2012)
  • Median F1: 0.619 (0.617 in 2012)
• Spanish
  • Participants: 3 teams
  • Highest F1: 0.709 (0.641 in 2012)
  • Median F1: 0.651 (0.612 in 2012)
Regular Slot Filling Evaluation Results
• Participants: 18 teams
• Human F1: 0.685 (0.814 in 2012)
• Highest System F1: 0.373 (0.517 in 2012)
• 2nd Highest System F1: 0.339 (0.296 in 2012)
• Median System F1: 0.150 (0.099 in 2012)
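The slides report F1 scores without spelling out the formula. As a rough illustration only, the sketch below computes precision, recall, and F1 as their harmonic mean from raw counts. The function name and the counts are hypothetical, and the official TAC KBP scorer may differ in detail (for example, in how equivalent fillers are grouped).

```python
def precision_recall_f1(num_correct, num_submitted, num_gold):
    """Return (precision, recall, F1) from raw counts.

    num_correct:   system answers judged correct
    num_submitted: all answers the system returned
    num_gold:      all correct answers known to the assessors
    """
    precision = num_correct / num_submitted if num_submitted else 0.0
    recall = num_correct / num_gold if num_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, not taken from the evaluation.
p, r, f1 = precision_recall_f1(num_correct=30, num_submitted=60, num_gold=100)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")  # P=0.500 R=0.300 F1=0.375
```

Because F1 is a harmonic mean, it stays close to the lower of precision and recall, so a system cannot score well by maximizing only one of them.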
Sentiment Slot Filling Track
• Sentiment analysis for KBP:
  • Holder (PER, ORG, GPE)
  • Target (PER, ORG, GPE)
  • Polarity (positive, negative)
• Implemented as regular slot filling, with a different set of slots:
  • {per,org,gpe}:positive-towards
  • {per,org,gpe}:negative-towards
  • {per,org,gpe}:positive-from
  • {per,org,gpe}:negative-from
• Participants: 3 teams
• Evaluation results:
  • Human F1: 0.727
  • Highest System F1: 0.132
  • Median System F1: 0.014
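One way to picture how a holder/target/polarity triple maps onto the slot names above is the sketch below. It assumes that a "-towards" slot is filled on the holder with the target as filler, and a "-from" slot is filled on the target with the holder as filler; that reading of the slot names, the `Sentiment` class, and the example entities are all assumptions for illustration, not part of the task definition shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sentiment:
    holder: str
    holder_type: str  # "per", "org", or "gpe"
    target: str
    target_type: str  # "per", "org", or "gpe"
    polarity: str     # "positive" or "negative"


def sentiment_to_slots(s):
    """Return (query_entity, slot_name, filler) triples for one sentiment.

    Assumed reading: "<type>:<polarity>-towards" hangs off the holder,
    "<type>:<polarity>-from" hangs off the target.
    """
    if s.polarity not in ("positive", "negative"):
        raise ValueError(f"unknown polarity: {s.polarity!r}")
    return [
        (s.holder, f"{s.holder_type}:{s.polarity}-towards", s.target),
        (s.target, f"{s.target_type}:{s.polarity}-from", s.holder),
    ]


# Hypothetical example entities.
example = Sentiment("Alice Smith", "per", "Acme Corp", "org", "negative")
for triple in sentiment_to_slots(example):
    print(triple)
# ('Alice Smith', 'per:negative-towards', 'Acme Corp')
# ('Acme Corp', 'org:negative-from', 'Alice Smith')
```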
Temporal Slot Filling Track
• Find tightest temporal constraints [T1 T2 T3 T4] on a given relation
  • Relation is true for a period beginning between T1 and T2
  • Relation is true for a period ending between T3 and T4
• Participants: 5 teams
• Evaluation results:
  • Human Accuracy: 0.688
  • Highest System Accuracy: 0.331
  • Median System Accuracy: 0.148
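The four-tuple can be read as two intervals: one bounding the start of the relation and one bounding its end. The sketch below checks whether a concrete (start, end) period is consistent with a given [T1 T2 T3 T4]. Treating `None` as "no bound" and the example dates are assumptions made for illustration; the slide does not say how unknown bounds are represented or how accuracy is scored.

```python
from datetime import date


def _within(lower, value, upper):
    """True if lower <= value <= upper, where None means unbounded (assumption)."""
    lower = date.min if lower is None else lower
    upper = date.max if upper is None else upper
    return lower <= value <= upper


def satisfies(constraints, start, end):
    """Check a relation period against constraints (T1, T2, T3, T4).

    The period must begin between T1 and T2 and end between T3 and T4.
    """
    t1, t2, t3, t4 = constraints
    return _within(t1, start, t2) and _within(t3, end, t4)


# Hypothetical constraints: started some time in 2001, ended on or after 2005-06-01.
constraints = (date(2001, 1, 1), date(2001, 12, 31), date(2005, 6, 1), None)
print(satisfies(constraints, date(2001, 3, 15), date(2009, 1, 20)))  # True
print(satisfies(constraints, date(2002, 1, 1), date(2009, 1, 20)))   # False
```

"Tightest" then means the narrowest [T1, T2] and [T3, T4] intervals that the source documents still support.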
Slot Filler Validation Track (SFV)
• Task: Determine whether or not a candidate slot filler is correct
• Objective: Improve precision without excessive reduction of recall
• Participants: 5 teams
• Some SFV runs had overwhelmingly positive impact on individual SF runs!
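To make the precision/recall trade-off concrete, here is a minimal sketch of validation as a simple confidence-threshold filter over candidate fillers. This is not how any participating system worked (the slides do not describe their methods); the confidence scores, the `correct` labels, and the threshold are hypothetical. Recall here is measured only against the correct fillers present in the candidate list.

```python
def filter_fillers(candidates, threshold):
    """Keep candidate fillers whose confidence is at least the threshold."""
    return [c for c in candidates if c["confidence"] >= threshold]


def precision_recall(kept, all_candidates):
    """Precision of the kept fillers, and recall relative to the correct
    fillers in the original candidate list."""
    total_correct = sum(c["correct"] for c in all_candidates)
    kept_correct = sum(c["correct"] for c in kept)
    precision = kept_correct / len(kept) if kept else 0.0
    recall = kept_correct / total_correct if total_correct else 0.0
    return precision, recall


# Hypothetical slot-filling output with made-up confidences and judgments.
candidates = [
    {"filler": "A", "confidence": 0.9, "correct": True},
    {"filler": "B", "confidence": 0.8, "correct": True},
    {"filler": "C", "confidence": 0.6, "correct": False},
    {"filler": "D", "confidence": 0.4, "correct": True},
    {"filler": "E", "confidence": 0.2, "correct": False},
]

print(precision_recall(candidates, candidates))  # (0.6, 1.0)
kept = filter_fillers(candidates, threshold=0.7)
print(precision_recall(kept, candidates))        # precision 1.0, recall 2/3
```

In this toy run the filter raises precision from 0.6 to 1.0 while giving up one of the three correct fillers, which is the kind of exchange the track objective describes.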
Cold Start KBP Track
• Goal: Build a KB from scratch, containing all targeted info about all entities as found in a relatively closed-domain corpus of documents
• KB schema: Same entity types and slots as regular slot-filling task
• Source document collection:
  • 50K Web pages from small-town publications (from TREC KBA document stream)
• Required capabilities:
  • Entity-linking: Grounding all named entity mentions in docs to KB nodes
  • Slot-filling: Learning attributes about all named entities
• Post-submission evaluation queries traverse KB starting from a single entity node (entity mention):
  • 0-hop: Find all children of Michael Jordan
  • 1-hop: Find date of birth of each of the children of Michael Jordan
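The 0-hop and 1-hop queries can be sketched as lookups over a small in-memory KB. The dictionary layout, the entity IDs, and the placeholder names and dates below are hypothetical and only illustrate the traversal; the actual Cold Start KB submission format is not shown in these slides.

```python
# Hypothetical KB: entity ID -> name and slot fills. Entity-valued slots
# hold other entity IDs; string-valued slots hold literal values.
kb = {
    "E1": {"name": "Parent A", "slots": {"per:children": ["E2", "E3"]}},
    "E2": {"name": "Child B", "slots": {"per:date_of_birth": ["1990-01-01"]}},
    "E3": {"name": "Child C", "slots": {"per:date_of_birth": ["1992-05-17"]}},
}


def zero_hop(kb, entity_id, slot):
    """0-hop query: all fillers of one slot for one entity node."""
    return kb.get(entity_id, {}).get("slots", {}).get(slot, [])


def one_hop(kb, entity_id, slot0, slot1):
    """1-hop query: follow slot0 to related entities, then read slot1 on each."""
    return {
        related: zero_hop(kb, related, slot1)
        for related in zero_hop(kb, entity_id, slot0)
    }


print(zero_hop(kb, "E1", "per:children"))
# ['E2', 'E3']
print(one_hop(kb, "E1", "per:children", "per:date_of_birth"))
# {'E2': ['1990-01-01'], 'E3': ['1992-05-17']}
```

A 1-hop answer can only be right if the 0-hop step found and linked the intermediate entities correctly, which is one plausible reason 1-hop scores tend to be lower than 0-hop scores.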
Cold Start Evaluation Results (Preliminary)
• Participants: 3 teams
• 0-hop queries:
  • Highest F1: 0.384 (0.497 in 2012)
• 1-hop queries:
  • Highest F1: 0.145 (0.255 in 2012)
• Combined 0-hop and 1-hop F1:
  • Highest F1: 0.278 (~0.352 in 2012)
TAC KBP Discussion/Planning Sessions
• Monday, November 18 (2:15-3:10pm):
  • English Slot Filling
  • Slot Filler Validation
  • Temporal Slot Filling?
  • +Spanish Slot Filling?
  • +Event identification and argument extraction?
• Tuesday, November 19 (3:00-4:00pm):
  • Cold Start
  • English Entity Linking (as queries in Cold Start framework?)
  • Cross-Lingual Spanish and Chinese Entity Linking
  • +Discussion forum