Global and Local Wikification (GLOW) in TAC KBP Entity Linking Shared Task 2011
Lev Ratinov, Dan Roth
Visit our demo: http://cogcomp.cs.illinois.edu/demo/wikify/

Vision: aggregate information about an entity from multiple documents

TAC QUERY
• ID=2012
• “Ford”
• “The Ford Presidential Library is named after President Gerald Ford”

KBP TAC Knowledgebase: … Michael Jordan (basketball) Michael Jackson (singer) Gerald Ford (president) …

1) MENTION IDENTIFICATION
“The [Ford]m1 Presidential Library is named after President [Gerald Ford]m2”
• We have explored two strategies:
• Simple Query Identification (SIQI): mark the expressions in the text that match the query form exactly.
• Named Entity Query Identification (NEQI): identify the named entities in the text that match the query form approximately, and normalize the spelling using Wikipedia (this poster illustrates NEQI). This is similar to query expansion.

3) GLOW OUTPUT RECONCILIATION
(m1, http://en.wikipedia.org/wiki/Ford_Motor_Company, 0.1, -0.1)
(m2, http://en.wikipedia.org/wiki/President_Gerald_Ford, 0.2, 0.7)
QUERY MAPPING: Gerald Ford (president)

Given a set of mentions linked to the query, we need to provide a single Wikipedia title. However, each mention can be assigned a different title. We use the ranker scores and the linker scores to make the decision.
• The “with linker” strategy discards mentions assigned a negative linker score (which means the objective function increases if we map these mentions to NULL).
• The “no linker” strategy uses all mentions.
The decision on the single best-matching title is based on ranker scores.
• The “Max” strategy uses the single mention with the highest ranker score.
• The “Sum” strategy sums the ranker scores of all the mentions assigned to the same title.
The poster figure illustrates the four resulting strategies, the mentions each one uses, and the resulting ranker scores for each title.
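The four reconciliation strategies described above (with or without the linker, Max or Sum) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GLOW implementation: the names `Mention` and `reconcile` are hypothetical, the last two numbers of each poster tuple are assumed to be the ranker score and the linker score in that order, and returning `None` when every mention is discarded is an assumption about how an unlinkable query would be reported.

```python
from typing import List, NamedTuple, Optional


class Mention(NamedTuple):
    """One linked mention (hypothetical container, not a GLOW type)."""
    mention_id: str
    title: str
    ranker_score: float   # assumed: third field of the poster tuples
    linker_score: float   # assumed: fourth field of the poster tuples


def reconcile(mentions: List[Mention], use_linker: bool,
              aggregate: str) -> Optional[str]:
    """Choose a single title for a query from its linked mentions.

    use_linker: if True, discard mentions with a negative linker score.
    aggregate:  "max" scores a title by its best-ranked mention,
                "sum" adds the ranker scores of mentions sharing a title.
    Returns None if no mention survives (an assumption of this sketch).
    """
    if aggregate not in ("max", "sum"):
        raise ValueError("aggregate must be 'max' or 'sum'")

    kept = [m for m in mentions
            if not use_linker or m.linker_score >= 0]
    if not kept:
        return None

    scores = {}
    for m in kept:
        if m.title not in scores:
            scores[m.title] = m.ranker_score
        elif aggregate == "max":
            scores[m.title] = max(scores[m.title], m.ranker_score)
        else:
            scores[m.title] += m.ranker_score

    # The title with the highest aggregated ranker score wins.
    return max(scores, key=scores.get)


# The two mentions from the poster example, under the field-order assumption.
poster_mentions = [
    Mention("m1", "Ford_Motor_Company", 0.1, -0.1),
    Mention("m2", "President_Gerald_Ford", 0.2, 0.7),
]

if __name__ == "__main__":
    for use_linker in (True, False):
        for aggregate in ("max", "sum"):
            label = ("with linker" if use_linker else "no linker")
            print(label, aggregate,
                  reconcile(poster_mentions, use_linker, aggregate))
```

Under these assumptions all four strategies select the same title for the poster example; the strategies only diverge when several mentions share a title or when a highly ranked mention has a negative linker score.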
The hollow circles indicate the discarded mentions, while the full circles indicate mentions that contribute to the final title ranking scores.

Example of the information a single passage provides about an entity:
“It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”. Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997.”
• Is a Macintosh font
• Has a distinctive N
• Used in Mac OS 7.6
• …

Task methodology: map queries to a TAC entity database
KBP TAC Knowledgebase: … Michael Jordan (basketball) Michael Jackson (singer) Gerald Ford (president) …
TAC QUERY (ID=2012, Form=“Ford”, Text=“The Ford Presidential Library is named after President Gerald Ford”)
TAC QUERY (ID=2017, Form=“Michael”, Text=“This video shows Michael Jackson performing Billie Jean”)

Our approach: use the GLOW “disambiguation to Wikipedia” system
Local and Global Algorithms for Disambiguation to Wikipedia. L. Ratinov, D. Downey, M. Anderson, and D. Roth (ACL 2011)

2) GLOW DISAMBIGUATION
• GLOW problem formulation: bipartite matching
• Γ* is a solution to the problem, a set of mention-title pairs (m, t).
• Evaluate the local matching quality using Φ(m, t).
• Evaluate the global structure based on (a) pair-wise coherence scores Ψ(t_i, t_j) and (b) an approximate solution Γ’. Γ’ allows disambiguating the mentions independently while taking the global structure into account.

• Experiments, Results (TAC 2011 Test Data)
• Conclusions:
• It is possible to apply a “disambiguation to Wikipedia” system directly to the TAC KBP Entity Linking task. We did not train our system on TAC data.
• NEQI mention identification gains 4 B³ F1 points over SIQI.
• All reasonable output reconciliation policies performed comparably.

This research is supported by the Defense Advanced Research Projects Agency (DARPA) Machine Reading Program under Air Force Research Laboratory (AFRL) prime contract no. FA8750-09-C-0181 and by the Army Research Laboratory (ARL) under agreement W911NF-09-2-0053. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the view of DARPA, AFRL, ARL, or the US government.