
CSE 454 Advanced Internet Systems Features for Relation Extraction

Dan Weld


Presentation Transcript


  1. CSE 454 Advanced Internet Systems: Features for Relation Extraction. Dan Weld

  2. Preprocessed Data Files. Each line corresponds to a sentence. "John likes eating sausage."

  3. Preprocessed Data Files. Each line corresponds to a sentence. "John likes eating sausage." • Grade school: “9 parts of speech in English” • Noun • Verb • Article • Adjective • Preposition • Pronoun • Adverb • Conjunction • Interjection • But: plurals, possessive, case, tense, aspect, …

  4. Preprocessed Data Files. Each line corresponds to a sentence. "John likes eating sausage."

  5. Learning Relational Extractors. Training set, with labels: “Citigroup has taken over EMI, the British …” (+); “Citigroup’s acquisition of EMI comes just ahead of …” (+); “Google’s Adwords system has long included … Youtube.” (-). Extractor: the input is text; the output is R(a,b) tuples.

  6. Learning Relational Extractors. Training set, with labels: “Citigroup has taken over EMI, the British …” (+); “Citigroup’s acquisition of EMI comes just ahead of …” (+); “Google’s Adwords system has long included … Youtube.” (-). Each example is a tuple <X1, …, Xk, Y>, where X1 … Xk are features and Y is the label (+ or -).
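The <X1, …, Xk, Y> representation can be made concrete with a small learner. The following is a minimal sketch, not the course's actual learner: it assumes three hypothetical binary features (one per cue phrase), assumes the +, +, - labels align with the three training sentences in order, and swaps in a plain perceptron as the simplest classifier that fits the format.

```python
from typing import List, Tuple

# One training example: (<X1, ..., Xk>, Y) with Y in {+1, -1}.
Example = Tuple[List[int], int]


def train_perceptron(examples: List[Example], epochs: int = 10) -> List[float]:
    """Learn weights with the perceptron update rule; the last weight is the bias."""
    k = len(examples[0][0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        for x, y in examples:
            xb = x + [1]  # append a constant bias feature
            score = sum(wi * xi for wi, xi in zip(w, xb))
            if y * score <= 0:  # misclassified or on the boundary: update
                w = [wi + y * xi for wi, xi in zip(w, xb)]
    return w


def predict(w: List[float], x: List[int]) -> int:
    """Return +1 or -1 for feature vector x."""
    score = sum(wi * xi for wi, xi in zip(w, x + [1]))
    return 1 if score > 0 else -1


# Hypothetical features (an assumption, not from the slides):
# [contains "taken over", contains "acquisition", contains "included"]
TRAINING_SET: List[Example] = [
    ([1, 0, 0], +1),  # Citigroup has taken over EMI ...
    ([0, 1, 0], +1),  # Citigroup's acquisition of EMI ...
    ([0, 0, 1], -1),  # Google's Adwords system has long included ... Youtube.
]

weights = train_perceptron(TRAINING_SET)
```

Calling `predict(weights, x)` on each training vector reproduces its label, which is all this toy set can show; a real extractor would use many more features and examples.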

  7. Features. Example: “Citigroup has taken over EMI, the British …” Candidate features Xi: • NER tag of Arg1 • NER tag of Arg2 • Does word-53 (acquire) appear in span? Consider all words? Just use verbs & prepositions? • Does bigram-199 (take over) appear in span? Trigrams?
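The feature list above could be computed roughly as follows. This is a sketch under stated assumptions: the function name `span_features`, the feature-name strings, and the NER tags in the usage lines are all hypothetical, and the code uses surface word forms rather than the lemmas ("acquire", "take over") the slide's numbered features suggest.

```python
from typing import Dict, List


def span_features(tokens: List[str], arg1_idx: int, arg2_idx: int,
                  ner_tags: List[str]) -> Dict[str, int]:
    """Binary features for one (Arg1, Arg2) pair, drawn from the span between them."""
    lo, hi = sorted((arg1_idx, arg2_idx))
    span = [t.lower() for t in tokens[lo + 1:hi]]  # words strictly between the arguments
    feats = {
        f"arg1_ner={ner_tags[arg1_idx]}": 1,
        f"arg2_ner={ner_tags[arg2_idx]}": 1,
    }
    for word in span:                      # unigram indicators
        feats[f"word={word}"] = 1
    for first, second in zip(span, span[1:]):  # bigram indicators
        feats[f"bigram={first}_{second}"] = 1
    return feats


# Usage on the slide's sentence; the NER tags are assumed for illustration.
tokens = ["Citigroup", "has", "taken", "over", "EMI"]
ner = ["ORG", "O", "O", "O", "ORG"]
feats = span_features(tokens, 0, 4, ner)
```

Restricting `span` to verbs and prepositions, or adding trigrams, would be small changes to the two loops; which choice works best is the open question the slide raises.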

  8. Outside the Span. Birthplace relation: • “Dan had lunch in Boston” • “Returning to his birthplace, Dan had lunch in Boston” • “Dan had lunch in Boston, his birthplace.”
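In the second and third sentences the cue word "birthplace" sits outside the span between the two arguments, so span-only features miss it. One common remedy is to add features from a window of k tokens on either side of the arguments. The sketch below assumes a hypothetical helper `window_features` and a window size of 3; neither comes from the slides.

```python
from typing import Dict, List


def window_features(tokens: List[str], arg1_idx: int, arg2_idx: int,
                    k: int = 3) -> Dict[str, int]:
    """Binary features for up to k tokens left of the first argument and right of the second."""
    lo, hi = sorted((arg1_idx, arg2_idx))
    left = tokens[max(0, lo - k):lo]
    right = tokens[hi + 1:hi + 1 + k]
    feats = {}
    for word in left:
        feats[f"left={word.lower()}"] = 1
    for word in right:
        feats[f"right={word.lower()}"] = 1
    return feats


before = "Returning to his birthplace , Dan had lunch in Boston".split()
after = "Dan had lunch in Boston , his birthplace .".split()

feats_before = window_features(before, before.index("Dan"), before.index("Boston"))
feats_after = window_features(after, after.index("Dan"), after.index("Boston"))
```

Here `feats_before` contains `left=birthplace` and `feats_after` contains `right=birthplace`, while the plain sentence "Dan had lunch in Boston" yields no window features at all.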

  9. Proximity. Birthplace relation: “Dan, who was very tired from deadlines and cranky because of problems with his boss, was born in Boston”

  10. Proximity. Birthplace relation: “Dan, who was very tired from deadlines and cranky because of problems with his boss, was born in Boston” Dependency graph (figure): nsubj(born, Dan); prep_in(born, Boston); rcmod(Dan, tired); prep_from(tired, deadlines); prep_from(tired, cranky)

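  11. Proximity. Birthplace relation: “Dan, who was very tired from deadlines and a screaming baby, was born in Boston” Dependency graph (figure): nsubj(born, Dan); prep_in(born, Boston); rcmod(Dan, tired); prep_from(tired, deadlines); prep_from(tired, baby); an edge from baby to screaming (label not shown)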
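The point of the Proximity slides is that "Dan" and "Boston" are far apart in the word sequence but close in the dependency graph. A minimal sketch of that idea follows, using the edges listed for slide 10 and a breadth-first search for the shortest path. The function name and the arrow notation in the output are assumptions, and words double as node identifiers, which only works when each word occurs once.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# (head, relation, dependent) edges as listed on slide 10.
EDGES: List[Tuple[str, str, str]] = [
    ("born", "nsubj", "Dan"),
    ("born", "prep_in", "Boston"),
    ("Dan", "rcmod", "tired"),
    ("tired", "prep_from", "deadlines"),
    ("tired", "prep_from", "cranky"),
]


def shortest_dep_path(edges: List[Tuple[str, str, str]],
                      start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search over the undirected dependency graph.

    Returns alternating words and direction-marked relation labels, or None.
    """
    adj: Dict[str, List[Tuple[str, str]]] = {}
    for head, label, dep in edges:
        adj.setdefault(head, []).append((dep, f"-{label}->"))  # head to dependent
        adj.setdefault(dep, []).append((head, f"<-{label}-"))  # dependent to head
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, arrow in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [arrow, nxt]))
    return None


path = shortest_dep_path(EDGES, "Dan", "Boston")
# path == ["Dan", "<-nsubj-", "born", "-prep_in->", "Boston"]
```

The path passes through "born" only, skipping the whole relative clause, so a feature built from it stays the same no matter how long the clause about deadlines grows.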

  12. Parsing Ambiguity. (S (NP Papa) (VP (VP (V ate) (NP (Det the) (N caviar))) (PP (P with) (NP (Det a) (N spoon)))))

  13. Parsing Ambiguity: Prepositional Phrase Attachment. “Please Don’t Eat Me!” (S (NP Papa) (VP (V ate) (NP (NP (Det the) (N caviar)) (PP (P with) (NP (Det a) (N spoon))))))
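The two attachments can be counted mechanically. The sketch below is a CKY-style chart that counts distinct parses under a toy grammar; the grammar and lexicon are assumptions chosen to match the node labels on the two slides, not a grammar given in the lecture.

```python
from collections import defaultdict
from typing import List

# Toy binary grammar (assumed): (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "Papa": ["NP"], "ate": ["V"], "the": ["Det"], "a": ["Det"],
    "caviar": ["N"], "spoon": ["N"], "with": ["P"],
}


def count_parses(words: List[str], root: str = "S") -> int:
    """Count distinct parse trees rooted at `root` that cover all of `words`."""
    n = len(words)
    # chart[(i, j, symbol)] = number of subtrees with that label over words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for symbol in LEXICON[word]:
            chart[(i, i + 1, symbol)] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point
                for parent, left, right in BINARY:
                    chart[(i, j, parent)] += chart[(i, k, left)] * chart[(k, j, right)]
    return chart[(0, n, root)]


ambiguous = count_parses("Papa ate the caviar with a spoon".split())
unambiguous = count_parses("Papa ate the caviar".split())
```

Under this grammar `ambiguous` is 2 (one parse per attachment) and `unambiguous` is 1; each additional trailing prepositional phrase multiplies the possibilities further.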

  14. Extracting grammatical relations from statistical constituency parsers [de Marneffe et al. LREC 2006]. Example: “Bills on ports and immigration were submitted by Senator Brownback” (figure: constituency parse alongside its typed dependencies). Typed dependencies: nsubjpass(submitted, Bills); auxpass(submitted, were); agent(submitted, Brownback); nn(Brownback, Senator); prep_on(Bills, ports); cc_and(ports, immigration) • Exploit the high-quality syntactic analysis done by statistical constituency parsers to get the grammatical relations [typed dependencies] • Dependencies are generated by pattern-matching rules

  15. Preprocessed Data Files (S (NP (NNP John)) (VP (VBZ likes) (S (VP (VBG eating) (NP (NN sausage))))) (. .))
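A line in this bracketed format can be read with a short recursive parser. The sketch below is one way to do it without external libraries; the function names `parse_tree` and `tagged_words` are hypothetical, and the code assumes well-formed input with balanced parentheses.

```python
import re
from typing import List, Tuple, Union

Tree = Tuple[str, list]  # (label, children); a child is a Tree or a word string


def parse_tree(text: str) -> Tree:
    """Parse one bracketed constituency tree into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read() -> Union[Tree, str]:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok  # a leaf word
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(read())
        pos += 1  # consume the closing ")"
        return (label, children)

    return read()


def tagged_words(tree: Tree) -> List[Tuple[str, str]]:
    """Collect (word, part-of-speech tag) pairs from the preterminal nodes, left to right."""
    label, children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return [(children[0], label)]
    pairs = []
    for child in children:
        pairs.extend(tagged_words(child))
    return pairs


line = "(S (NP (NNP John)) (VP (VBZ likes) (S (VP (VBG eating) (NP (NN sausage))))) (. .))"
tree = parse_tree(line)
pairs = tagged_words(tree)
# pairs == [("John", "NNP"), ("likes", "VBZ"), ("eating", "VBG"), ("sausage", "NN"), (".", ".")]
```

The same tree structure also gives the constituents themselves, so features such as "Arg1 is inside an NP" can be read off without re-parsing the sentence.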

  16. Mintz features

  17. Why Extract Temporal Information? • Many relations and events are temporally bounded: a person's place of residence or employer; an organization's members; the duration of a war between two countries; the precise time at which a plane landed; … • Temporal information distribution: one of every fifty lines of database application code involves a date or time value (Snodgrass, 1998); each news document in PropBank (Kingsbury and Palmer, 2002) includes eight temporal arguments. (Slide from Dan Roth, Heng Ji, Taylor Cassidy, Quang Do, TIE Tutorial)

  18. Time-intensive Slot Types. (Slide from Dan Roth, Heng Ji, Taylor Cassidy, Quang Do, TIE Tutorial)

  19. Temporal Expression Examples. Reference date = December 8, 2012. (Slide from Dan Roth, Heng Ji, Taylor Cassidy, Quang Do, TIE Tutorial)

  20. Temporal Expression Extraction • Rule-based (Strötgen and Gertz, 2010; Chang and Manning, 2012; Do et al., 2012) • Machine learning: Risk Minimization Model (Boguraev and Ando, 2005); Conditional Random Fields (Ahn et al., 2005; UzZaman and Allen, 2010) • State of the art: about 95% F-measure for extraction and 85% F-measure for normalization. (Slide from Dan Roth, Heng Ji, Taylor Cassidy, Quang Do, TIE Tutorial)
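To illustrate what rule-based normalization means, here is a minimal sketch that resolves a few relative expressions against the reference date used on the Temporal Expression Examples slide (December 8, 2012). The rules, the function name `normalize`, and the test expressions are assumptions made for illustration; they are not taken from the cited systems, which cover far more patterns.

```python
import re
from datetime import date, timedelta
from typing import Optional

REFERENCE_DATE = date(2012, 12, 8)  # reference date from the examples slide

# Hypothetical rule table: day offsets for simple deictic expressions.
RELATIVE_DAYS = {"today": 0, "yesterday": -1, "tomorrow": 1}


def normalize(expression: str, ref: date = REFERENCE_DATE) -> Optional[str]:
    """Map a temporal expression to an ISO-style value, or None if no rule applies."""
    text = expression.lower().strip()
    if text in RELATIVE_DAYS:
        return (ref + timedelta(days=RELATIVE_DAYS[text])).isoformat()
    match = re.fullmatch(r"(\d+) (day|week)s? ago", text)
    if match:
        count, unit = int(match.group(1)), match.group(2)
        days = count * (7 if unit == "week" else 1)
        return (ref - timedelta(days=days)).isoformat()
    if text == "last year":
        return str(ref.year - 1)  # year granularity only
    return None


results = {e: normalize(e) for e in ["yesterday", "2 weeks ago", "last year", "soon"]}
# results == {"yesterday": "2012-12-07", "2 weeks ago": "2012-11-24",
#             "last year": "2011", "soon": None}
```

Extraction (finding the expression in text) and normalization (mapping it to a value) are separate steps, which is why the slide reports separate F-measures for them; this sketch covers only the second.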

  21. Ordering events in discourse. (1) John entered the room at 5:00pm. (2) It was pitch black. (3) It had been three days since he’d slept. Timeline (figure labels): State: John slept; Time: 3 days; Event: John entered the room; Time: 5pm; Time: now; State: pitch black. (Slide from Dan Roth, Heng Ji, Taylor Cassidy, Quang Do, TIE Tutorial)

  22. Ordering events in time • Speech (S), Event (E), and Reference (R) time (Reichenbach, 1947) • Tense relates R and S; grammatical aspect relates R and E • R is associated with temporal anaphora (Partee 1984) • Order events by comparing R across sentences • Example: “By the time Boris noticed his blunder, John had (already) won the game” • See Michaelis (2006) for a good explanation of tense and grammatical aspect. (Slide from Dan Roth, Heng Ji, Taylor Cassidy, Quang Do, TIE Tutorial)
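The Boris and John example can be worked through in a few lines. The sketch below encodes two tenses in Reichenbach's terms (simple past: E = R < S; past perfect: E < R < S) and orders two events under the simplifying assumption that they share the same reference time R, as "by the time" suggests here. The table layout and function name are hypothetical, and real discourse requires comparing distinct R values across sentences.

```python
from typing import Dict, Tuple

# (E relative to R, R relative to S): -1 means "before", 0 means "simultaneous".
TENSES: Dict[str, Tuple[int, int]] = {
    "simple_past": (0, -1),    # E = R < S, e.g. "Boris noticed"
    "past_perfect": (-1, -1),  # E < R < S, e.g. "John had won"
}


def order_with_shared_reference(tense_a: str, tense_b: str) -> str:
    """Order event A relative to event B, assuming both use the same reference time R."""
    a_vs_r = TENSES[tense_a][0]
    b_vs_r = TENSES[tense_b][0]
    if a_vs_r < b_vs_r:
        return "before"
    if a_vs_r > b_vs_r:
        return "after"
    # Both at R means simultaneous; both before R leaves their order undetermined.
    return "same" if a_vs_r == 0 else "unknown"


# "John had won" (past perfect) versus "Boris noticed" (simple past):
win_vs_notice = order_with_shared_reference("past_perfect", "simple_past")
# win_vs_notice == "before"
```

The result says the winning precedes the noticing, matching the intuitive reading of the sentence; the "unknown" case is a reminder that tense alone does not always fix the order.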

  23. High-Level Architecture (diagram; component labels): Text; Distant Supervision; Manual Labeling; Feature Markup; KB; Training Data; Wikifier; Slot Patterns; Extractor; Learner; Inference; Manual Generation; Tuples

  24. Teams • Named Entity Linking (1) • Time (1) • Distant Supervision (1) • InstaRead (1) • Relation-Specific (3-5)
