A Method for Automatically Constructing Case Frames for English
Daisuke Kawahara and Kiyotaka Uchimoto
National Institute of Information and Communications Technology
(LREC2008, 2008/05/29)
Background
• NLP analyzers so far
  • (Mainly) supervised, (relatively) knowledge-poor
  • e.g., PP-attachment or parsing
    Mary ate the salad with a fork
    Mary ate the salad with mushrooms
  • Only 1.5% of bilexical dependencies were learned [Bikel, 04]
• Toward knowledge-oriented NLP
  • Automatically compile case frames and integrate them into NLP analyzers/applications
Related work
• Subcategorization frames
  • [Brent, 93] [Ushioda et al., 93] [Manning, 93] [Briscoe and Carroll, 97] [Korhonen, 02] …
  e.g., She greeted me.
  • NP(sbj) greet NP(obj)
  e.g., She gave him a book.
  • NP(sbj) give NP(obj) NP(obj)
Related work (cont.)
• (Handmade) frames
  • FrameNet [Baker et al., 98], PropBank [Palmer et al., 05]
• Japanese case frames
  • Semantics-based: [Haruno, 95] [Utsuro et al., 96]
  • Example-based: [Kawahara and Kurohashi, 06]
Construction of case frames for Japanese [Kawahara and Kurohashi, LREC2006]
Case markers: ga: nominative, wo: accusative, ni: dative, de: instrument
Construction of case frames for English
Pipeline: 100M sentences (English Gigaword) → filtering and parsing (MSTParser; 47M sents.) → predicate-argument structures → clustering (WordNet) → case frames for 10K predicates

Predicate-argument structures
sbj:you pred:borrow obj:idea pp:from:artist
sbj:she pred:borrow obj:idea pp:over:year
sbj:i pred:borrow obj:dollar pp:from:friend
sbj:farmer pred:borrow obj:money pp:for:supply
sbj:he pred:borrow obj:money pp:from:company

Grouped predicate-argument structures
sbj:{you,she} pred:borrow obj:idea pp:from:artist pp:over:year
sbj:i pred:borrow obj:dollar pp:from:friend
sbj:{farmer,he} pred:borrow obj:money pp:for:supply pp:from:company

Case frames after clustering
sbj:{you,she} pred:borrow obj:idea pp:from:artist pp:over:year
sbj:{farmer,he} pred:borrow obj:{money,dollar} pp:for:supply pp:from:{company,friend}
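The slot notation on this slide can be read mechanically. The following is a minimal Python sketch, not the authors' code, of how two predicate-argument structures in that notation might be parsed and pooled into one line such as `sbj:{you,she}`. The function names `parse_pa`, `merge`, and `render` are hypothetical, and the sketch ignores instance frequencies.

```python
from collections import defaultdict

def parse_pa(text):
    """Parse 'sbj:you pred:borrow obj:idea pp:from:artist' into (predicate, {slot: word})."""
    pred, args = None, {}
    for token in text.split():
        # Split at the last colon: 'pp:from:artist' -> slot 'pp:from', word 'artist'.
        slot, _, word = token.rpartition(":")
        if slot == "pred":
            pred = word
        else:
            args[slot] = word
    return pred, args

def merge(structures):
    """Pool the instances of several structures slot by slot, keeping first-seen order."""
    frame = defaultdict(list)
    for args in structures:
        for slot, word in args.items():
            if word not in frame[slot]:
                frame[slot].append(word)
    return dict(frame)

def render(pred, frame):
    """Write a pooled frame back in the slide's notation (predicate first)."""
    parts = []
    for slot, words in frame.items():
        inst = words[0] if len(words) == 1 else "{" + ",".join(words) + "}"
        parts.append(f"{slot}:{inst}")
    return f"pred:{pred} " + " ".join(parts)

pred, a1 = parse_pa("sbj:you pred:borrow obj:idea pp:from:artist")
_, a2 = parse_pa("sbj:she pred:borrow obj:idea pp:over:year")
print(render(pred, merge([a1, a2])))
# pred:borrow sbj:{you,she} obj:idea pp:from:artist pp:over:year
```

Splitting at the last colon keeps the preposition inside the slot name, which matches the slide's treatment of `pp:from` and `pp:over` as distinct case slots.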
Specification of our case frames
• Case slots
  • surface cases (dependency labels) and prepositions
  • sbj, obj, obj2, pp:for, pp:in, …
• Instances
  • words
  • several semantic markers
    • <time>, <num>, <clause>
Details of case frame construction
• Use only reliable parses
  • Sentence length <= 20 words
  • MSTParser [McDonald et al., 06]
• Extract predicate-argument structures
  • From labeled dependency parses
• Group and cluster p-a structures
  • Grouping by a dominant case slot
    • pre-defined order: obj, sbj, pp:*
  • Clustering based on WordNet
• Labeled dependency acc.: 89.9% → 91.5%
• Complete rate: 36.3% → 56.4%
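The filtering and grouping steps listed here could look roughly like the sketch below. It is an illustration under assumptions rather than the method as implemented: whitespace tokenization for the length check, the alphabetical tie-break among `pp:*` slots, and the names `short_enough`, `dominant_slot`, and `group_by_dominant` are all hypothetical.

```python
from collections import defaultdict

SLOT_ORDER = ["obj", "sbj"]  # pp:* slots come last, following the pre-defined order

def short_enough(sentence, max_len=20):
    """Keep sentences of at most max_len words (whitespace tokens: an assumption)."""
    return len(sentence.split()) <= max_len

def dominant_slot(args):
    """Return the first slot present in the order obj, sbj, pp:*; None if no slot qualifies."""
    for slot in SLOT_ORDER:
        if slot in args:
            return slot
    pp_slots = sorted(s for s in args if s.startswith("pp:"))
    return pp_slots[0] if pp_slots else None  # tie-break among pp:* is an assumption

def group_by_dominant(pred, structures):
    """Group structures that share the predicate and the word in the dominant slot."""
    groups = defaultdict(list)
    for args in structures:
        slot = dominant_slot(args)
        if slot is not None:
            groups[(pred, slot, args[slot])].append(args)
    return dict(groups)

borrow = [
    {"sbj": "you", "obj": "idea", "pp:from": "artist"},
    {"sbj": "she", "obj": "idea", "pp:over": "year"},
    {"sbj": "i", "obj": "dollar", "pp:from": "friend"},
    {"sbj": "farmer", "obj": "money", "pp:for": "supply"},
    {"sbj": "he", "obj": "money", "pp:from": "company"},
]
for key, members in group_by_dominant("borrow", borrow).items():
    print(key, len(members))
```

With the five `borrow` structures from the earlier slide, this yields three groups keyed on `obj` (idea, dollar, money), which a later WordNet-based clustering step would then merge further.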
Clustering of case frames
CF1 sbj:{i} obj:{dollar} pp:from:{friend}
CF2 sbj:{farmer, he} obj:{money} pp:from:{company} pp:for:supply
[Figure: CF1 and CF2 annotated with instance counts (CF1: 1, 8, 3; CF2: 1, 1, 10, 5, 3) and similarity values (0.73, 1.0, 0.73, 0.82)]
• similarity between instances (words)
• ratio of common cases
• similarity between case frames
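The formulas behind the three quantities named on this slide are not given in the transcript, so the sketch below only shows one plausible way they could fit together. Everything in it is an assumption: the toy word similarities stand in for a WordNet-based measure, the frequency-weighted best-match average and the product combination are illustrative choices, and the counts are illustrative too. It is not expected to reproduce the values in the figure.

```python
from collections import Counter

# Toy word similarities standing in for a WordNet-based measure (hypothetical values).
TOY_SIM = {
    frozenset(["dollar", "money"]): 0.8,
    frozenset(["friend", "company"]): 0.5,
}

def word_sim(a, b):
    """Similarity between two instances (words); identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset([a, b]), 0.0)

def slot_sim(c1, c2):
    """Frequency-weighted average of each word's best match in the other slot."""
    total = weight = 0.0
    for src, dst in ((c1, c2), (c2, c1)):
        for w, n in src.items():
            total += n * max(word_sim(w, v) for v in dst)
            weight += n
    return total / weight

def common_ratio(f1, f2):
    """Share of all instance occurrences that fall in slots present in both frames."""
    shared = f1.keys() & f2.keys()
    in_shared = sum(sum(f[s].values()) for f in (f1, f2) for s in shared)
    overall = sum(sum(c.values()) for f in (f1, f2) for c in f.values())
    return in_shared / overall

def frame_sim(f1, f2):
    """Mean slot similarity over shared slots, scaled by the ratio of common cases."""
    shared = f1.keys() & f2.keys()
    if not shared:
        return 0.0
    avg = sum(slot_sim(f1[s], f2[s]) for s in shared) / len(shared)
    return avg * common_ratio(f1, f2)

# Case frames as {slot: Counter of instances}; counts here are illustrative.
cf1 = {"sbj": Counter(i=1), "obj": Counter(dollar=8), "pp:from": Counter(friend=3)}
cf2 = {"sbj": Counter(farmer=1, he=1), "obj": Counter(money=10),
       "pp:from": Counter(company=5), "pp:for": Counter(supply=3)}
print(round(frame_sim(cf1, cf2), 3))
```

In a clustering loop, pairs of case frames whose `frame_sim` exceeds some threshold would be merged; the threshold and the merge order are left open here.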
Results
• Obtained case frames for 9,300 verbs
• Evaluated case frames of 20 verbs
• Criteria:
  • Verb usage is disambiguated by dominant arguments
  • Case frames must have obligatory case slots
  • Case slots, except a dominant one, may contain an ineligible example
• Accuracy: 88.4%
Conclusion and future work
• Constructed broad-coverage case frames for English
  • Described real use of English verbs
• Future work
  • Use more sophisticated methods for extracting reliable parses [Kawahara and Uchimoto, 08]
  • Integrate case frames into parsing (and other applications)
    • cf. [Zeman, 02] for subcategorization frames and [Kawahara and Kurohashi, 06] for case frames