Comparing Information Extraction Pattern Models
Mark Stevenson and Mark A. Greenwood
Natural Language Processing Group, University of Sheffield, UK
Information Extraction Patterns
• A popular approach to Information Extraction uses lexico-syntactic patterns that match text and identify items of interest
• Several recent approaches have been based on extraction patterns derived from dependency parses
• Unsupervised approaches to learning extraction patterns extract all possible patterns and then try to identify the useful ones
Example sentence: “Microsoft, forced to recruit after Adams unexpectedly resigned, last week hired Boor as interim replacement.”
[Figure: full dependency parse of the example sentence, rooted at hire/V]
[Figure: the same parse reduced to hire/V with nsubj Microsoft/N and nobj Boor/N, plus resign/V with nsubj Adams/N]
Predicate Argument Model
• Pattern consists of a subject-verb-object tuple; Yangarber (2003); Stevenson and Greenwood (2005)
[Figure: example dependency tree (hire/V: nsubj IBM/N, nobj Smith/N, after resign/V; resign/V: nsubj Jones/N)]
Chain Model
• Extraction patterns are chain-shaped paths in the dependency tree rooted at a verb; Sudo et al. (2001), Sudo et al. (2003)
[Figure: example dependency tree (hire/V: nsubj IBM/N, nobj Smith/N, after resign/V; resign/V: nsubj Jones/N)]
Linked Chain Model
• Patterns are chains or any pair of chains sharing their root; Greenwood et al. (2005)
[Figure: example dependency tree (hire/V: nsubj IBM/N, nobj Smith/N, after resign/V; resign/V: nsubj Jones/N)]
Subtree Model
• Patterns are any subtree of the dependency tree; Sudo et al. (2003)
• By its definition it contains all the patterns proposed by the previous two models
[Figure: example dependency tree (hire/V: nsubj IBM/N, nobj Smith/N, after resign/V; resign/V: nsubj Jones/N)]
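The first three model definitions can be made concrete with a minimal sketch that enumerates patterns over the slide's example tree. Everything here is an assumption for illustration: the tree is a reconstruction of the figure, the helper names (`svo_patterns`, `paths_from`, `chains`, `linked_chains`) are hypothetical, an SVO tuple with a missing argument is allowed, and a linked chain is taken to be a pair of chains from the same verb that leave through different children.

```python
from itertools import combinations

# Toy tree reconstructed from the slide figure (an assumption):
# node -> list of (relation, child)
TREE = {
    "hire/V": [("nsubj", "IBM/N"), ("nobj", "Smith/N"), ("after", "resign/V")],
    "resign/V": [("nsubj", "Jones/N")],
}
VERBS = ["hire/V", "resign/V"]


def svo_patterns():
    """One (subject, verb, object) tuple per verb; a missing argument is None."""
    out = []
    for verb in VERBS:
        args = dict(TREE.get(verb, []))
        out.append((args.get("nsubj"), verb, args.get("nobj")))
    return out


def paths_from(node):
    """All downward paths that start at node and contain at least one edge."""
    out = []
    for _, child in TREE.get(node, []):
        out.append((node, child))
        out.extend((node,) + path for path in paths_from(child))
    return out


def chains():
    """Chain model: every downward path rooted at a verb."""
    return [path for verb in VERBS for path in paths_from(verb)]


def linked_chains():
    """Linked chain model: all chains, plus pairs of chains that share a
    root verb and leave it through different children."""
    out = list(chains())
    for verb in VERBS:
        paths = paths_from(verb)
        out.extend((p, q) for p, q in combinations(paths, 2) if p[1] != q[1])
    return out


print(len(svo_patterns()), len(chains()), len(linked_chains()))
```

On this toy tree the chain `("hire/V", "resign/V", "Jones/N")` links the two events, which no single SVO tuple does.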
Comparing Models
• The models identify different parts of a sentence
• Example: “Smith joined Acme Inc. as CEO”
• SVO model identifies the link between “Smith” and “Acme Inc.”
• Chain model identifies the link between “Acme Inc.” and “CEO”
• Linked chain and subtree models could identify both links
• But there is a price to be paid
• Models generate different numbers of patterns for a given dependency tree
• More patterns probably require more memory and processing
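The comparison above can be checked on a small example. The sketch below assumes a particular parse of “Smith joined Acme Inc. as CEO” in which “CEO” attaches beneath “Acme Inc.”; that parse, the node labels and the helper names are all hypothetical, and a different parser output would change the result.

```python
from itertools import combinations

# Assumed parse (hypothetical): node -> list of children
TREE = {"join/V": ["Smith/N", "Acme Inc./N"], "Acme Inc./N": ["CEO/N"]}
ROOT = "join/V"


def paths_from(node):
    """All downward paths that start at node and contain at least one edge."""
    out = []
    for child in TREE.get(node, []):
        out.append((node, child))
        out.extend((node,) + path for path in paths_from(child))
    return out


def any_links(patterns, a, b):
    """True if some pattern (a set of nodes) contains both items."""
    return any(a in p and b in p for p in patterns)


root_paths = paths_from(ROOT)
svo = [{"Smith/N", ROOT, "Acme Inc./N"}]
chain = [set(p) for p in root_paths]
linked = chain + [
    set(p) | set(q) for p, q in combinations(root_paths, 2) if p[1] != q[1]
]

for name, patterns in [("SVO", svo), ("chain", chain), ("linked chain", linked)]:
    print(
        name,
        any_links(patterns, "Smith/N", "Acme Inc./N"),
        any_links(patterns, "Acme Inc./N", "CEO/N"),
    )
```

Under this assumed parse only the linked chain model yields patterns containing both pairs of items.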
Model Complexity
• Let T be a dependency tree consisting of N nodes, and let V be the set of verb nodes in T
• SVO patterns: linear
• Let d(v) be the number of nodes in the subtree rooted at a node v ∈ V, i.e. v together with its descendants
• Chains: linear, polynomial in the worst case
• Let C(v) denote the set of child nodes of a verb v and c_i its i-th child, so that C(v) = {c_1, c_2, …, c_|C(v)|}
• Linked chains: polynomial
• The number of subtrees can be defined recursively: exponential
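The growth rates can be illustrated with a minimal counting sketch. The closed forms below are one reading of the definitions on this slide, not necessarily the authors' exact formulas: chains are counted as d(v) − 1 per verb, linked pairs as the sum of d(c_i)·d(c_j) over pairs of children of a verb, and subtrees as connected node sets with at least two nodes rooted anywhere in the tree. The function names are hypothetical.

```python
from math import prod


def d(tree, node):
    """Number of nodes in the subtree rooted at node (node plus descendants)."""
    return 1 + sum(d(tree, child) for child in tree.get(node, []))


def count_chains(tree, verbs):
    """Each verb contributes one chain per descendant."""
    return sum(d(tree, v) - 1 for v in verbs)


def count_linked_pairs(tree, verbs):
    """Pairs of chains from the same verb that go through different children."""
    total = 0
    for v in verbs:
        sizes = [d(tree, child) for child in tree.get(v, [])]
        total += sum(
            sizes[i] * sizes[j]
            for i in range(len(sizes))
            for j in range(i + 1, len(sizes))
        )
    return total


def rooted_subtrees(tree, node):
    """Connected subtrees whose top node is `node`, including the single node.
    Each child is either left out or contributes one of its own rooted subtrees."""
    return prod(1 + rooted_subtrees(tree, c) for c in tree.get(node, []))


def count_subtrees(tree, nodes):
    """All connected subtrees with at least two nodes."""
    return sum(rooted_subtrees(tree, n) - 1 for n in nodes)


def star(k):
    """A single verb with k leaf children: a worst case for subtree growth."""
    leaves = [f"arg{i}" for i in range(k)]
    return {"verb": leaves}, ["verb"], ["verb"] + leaves


for k in (5, 10, 20):
    tree, verbs, nodes = star(k)
    print(k, count_chains(tree, verbs), count_linked_pairs(tree, verbs),
          count_subtrees(tree, nodes))
```

For a verb with k leaf children these counts are k chains, k(k − 1)/2 linked pairs and 2^k − 1 subtrees, which is the linear, polynomial and exponential pattern described above.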
Experiments
• Aim to identify how well each pattern model captures the relations occurring in an IE corpus
• Extract patterns from a parsed corpus and, for each model, check whether some pattern contains the items participating in each relation
• We do NOT attempt to extract the relations, only to determine whether they can be represented
Corpora
• Used corpora representing two extraction tasks
• Management succession, e.g. “Stevens succeeds Fred Casey who retired from the OCC in June”
• Various biomedical texts, e.g. “Expression of sigma(K)-dependent cwlH gene depended on gerE”
Parsers
• MINIPAR
• Machinese Syntax Parser
• Stanford Parser
Evaluating Expressivity
• Coverage: proportion of relations in the corpus for which there exists a pattern that includes both items participating in that relation
• Analysis showed that parsers often failed to generate a parse which included all words in the sentence
• For some relations it may therefore be impossible to generate a pattern which covers them
• No model can outperform the subtree model
• Bounded coverage: proportion of those relations in the corpus which can be represented (given a dependency parse) for which there exists a pattern that includes both participating items
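The two measures can be sketched as follows. This is an illustration under stated assumptions rather than the evaluation code used in the experiments: a relation is treated as representable when both of its items appear in the sentence's parse, a pattern is reduced to the set of nodes it contains, and the function name `evaluate` and the toy data are hypothetical.

```python
def evaluate(relations, patterns, parsed_nodes):
    """Return (coverage, bounded_coverage).

    relations:    list of (sentence_id, item_a, item_b)
    patterns:     sentence_id -> list of node sets, one per extracted pattern
    parsed_nodes: sentence_id -> set of nodes present in the dependency parse
    """
    covered = 0
    representable = 0
    for sid, a, b in relations:
        pair = {a, b}
        # Assumption: representable means the parse contains both items.
        if pair <= parsed_nodes.get(sid, set()):
            representable += 1
        if any(pair <= nodes for nodes in patterns.get(sid, [])):
            covered += 1
    coverage = covered / len(relations) if relations else 0.0
    bounded = covered / representable if representable else 0.0
    return coverage, bounded


# Toy data (hypothetical): four relations, one lost to an incomplete parse.
relations = [(1, "A", "B"), (1, "B", "C"), (2, "D", "E"), (3, "F", "G")]
parsed_nodes = {1: {"A", "B", "C", "v1"}, 2: {"D", "E", "v2"}, 3: {"F", "v3"}}
patterns = {1: [{"v1", "A", "B"}], 2: [{"v2", "D", "E"}], 3: [{"v3", "F"}]}

coverage, bounded = evaluate(relations, patterns, parsed_nodes)
print(coverage, bounded)
```

In the toy data two of four relations are covered, and three are representable, so bounded coverage is higher than plain coverage.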
Management Succession Results
• SVO and chains do not cover many of the relations
• Subtree and linked chain models have roughly the same coverage
Biomedical Results
• Larger difference between linked chains and the simpler models on biomedical text
• SVO and chains consistently perform badly; linked chains do well
• Bounded coverage results for all models are lower on the biomedical corpora
• Parsers are not generally well adapted to deal with these sorts of text; more parsing errors?
• Nominalisations appear more common in these texts
• Example: “the DNA-dependent assembly of regulon into rings”
[Figure: dependency fragment over assembly/N, regulon/N, dependent/A, DNA/N, rings/N]
Results Summary
• Average coverage for each pattern model over all texts
• No statistically significant difference between (1) SVO and chains or (2) linked chains and subtrees
Summary
• Comparison of four models for Information Extraction patterns based on dependency trees
• Trade-off between pattern complexity and tractability
• Linked chain model performs well
• But it may have problems with certain linguistic constructions (such as nominalisations)