Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution
Ryu Iida, Kentaro Inui and Yuji Matsumoto
Nara Institute of Science and Technology
{ryu-i,inui,matsu}@is.naist.jp
June 20th, 2006
Zero-anaphora resolution • Zero-anaphor = a gap with an anaphoric function • Zero-anaphora resolution is becoming important in many applications • In Japanese, even obligatory arguments of a predicate are often omitted when they are inferable from the context • 45.5% of the nominative arguments of verbs are omitted in newspaper articles
Zero-anaphora resolution (cont’d)
• Three sub-tasks:
  • Zero-pronoun detection: detect a zero-pronoun
  • Antecedent identification: identify the antecedent from the set of candidate antecedents for a given zero-pronoun
  • Anaphoricity determination: classify whether a given zero-pronoun is anaphoric or non-anaphoric
• Anaphoric zero-pronoun (antecedent: John-ni):
  Mary-wa John-ni (φ-ga) tabako-o yameru-youni it-ta
  Mary-TOP John-DAT (φ-NOM) smoking-OBJ quit-COMP say-PAST
  [Mary asked John to quit smoking.]
• Non-anaphoric zero-pronoun:
  (φ-ga) ie-ni kaeri-tai
  (φ-NOM) home-DAT want to go back
  [(φ = I) want to go home.]
Previous work on anaphora resolution • The research trend has been shifting from rule-based approaches (Baldwin, 95; Lappin and Leass, 94; Mitkov, 97, etc.) to empirical, or learning-based, approaches (Soon et al., 2001; Ng, 04; Yang et al., 05, etc.) • A cost-efficient way of achieving performance comparable to the best-performing rule-based systems • Learning-based approaches represent the problems of anaphoricity determination and antecedent identification as sets of feature vectors and apply machine learning algorithms to them
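To make the feature-vector view concrete, here is a hypothetical minimal sketch (not the authors' system, and not from the slides): an (anaphor, candidate antecedent) pair is encoded as binary features and scored with a linear model trained by a simple perceptron. The feature names and dictionary fields are made up for illustration.

```python
# Hypothetical sketch: encode an (anaphor, candidate antecedent) pair as
# binary features and learn a linear model with a simple perceptron.
# Feature names and dictionary fields are illustrative only.

def extract_features(anaphor, candidate):
    return {
        "same_sentence": anaphor["sent_id"] == candidate["sent_id"],
        "candidate_is_dative": candidate["case"] == "DAT",
        "candidate_precedes_anaphor": candidate["position"] < anaphor["position"],
    }

def score(weights, features):
    return sum(weights.get(name, 0.0) for name, fired in features.items() if fired)

def train_perceptron(examples, epochs=10):
    """examples: list of (feature_dict, label) with label in {+1, -1}."""
    weights = {}
    for _ in range(epochs):
        for features, label in examples:
            prediction = 1 if score(weights, features) >= 0 else -1
            if prediction != label:          # perceptron update on mistakes
                for name, fired in features.items():
                    if fired:
                        weights[name] = weights.get(name, 0.0) + label
    return weights
```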
Syntactic pattern features
• Useful clues for both anaphoricity determination and antecedent identification
• Questions:
  • How to encode syntactic patterns as features
  • How to avoid the data sparseness problem
[Figure: dependency tree of the example sentence, with John-ni (John-DAT) marked as Antecedent, φ-ga (φ-NOM) as zero-pronoun, and yameru-youni (quit-COMP) and it-ta (say-PAST) as predicates]
Talk outline • Zero-anaphora resolution: Background • Selection-then-classification model (Iida et al., 05) • Proposed model • Represents syntactic patterns based on dependency trees • Uses a tree mining technique to seek useful sub-trees and solve the data sparseness problem • Incorporates syntactic pattern features in the selection-then-classification model • Experiments on Japanese zero-anaphora • Conclusion and future work
Selection-then-Classification Model (SCM) (Iida et al., 05)
Example: “A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, …”
• Candidate anaphor: USAir
• Candidate antecedents: federal judge, order, USAir Group Inc, suit, …
• Step 1: the tournament model (Iida et al., 03) compares the candidate antecedents pairwise and selects the most likely candidate antecedent (here: USAir Group Inc)
• Step 2: the anaphoricity determination model scores the pair (USAir Group Inc, USAir):
  • if score ≥ θ_ana, USAir is anaphoric and USAir Group Inc is its antecedent
  • if score < θ_ana, USAir is non-anaphoric
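The two-step decision can be summarised as the following minimal sketch. The function names, the threshold constant and the sequential tournament loop are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of the SCM decision procedure, assuming two pre-trained
# scoring components (names and threshold are illustrative).

THETA_ANA = 0.0  # anaphoricity threshold, tuned on held-out data

def resolve(anaphor, candidates, tournament_compare, anaphoricity_score):
    """Return the chosen antecedent of `anaphor`, or None if non-anaphoric."""
    if not candidates:
        return None
    # Step 1: tournament model selects the most likely candidate antecedent
    # by pairwise comparisons; the winner of each match stays in.
    best = candidates[0]
    for candidate in candidates[1:]:
        best = tournament_compare(best, candidate, anaphor)
    # Step 2: anaphoricity determination model decides whether the anaphor
    # is anaphoric, given its most likely candidate antecedent.
    if anaphoricity_score(best, anaphor) >= THETA_ANA:
        return best   # anaphoric: `best` is the antecedent
    return None       # non-anaphoric

# Example call with the slide's example (the two scoring functions assumed):
# resolve("USAir", ["federal judge", "order", "USAir Group Inc", "suit"],
#         tournament_compare, anaphoricity_score)
```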
Training the anaphoricity determination model
• Anaphoric instances: each anaphoric noun phrase (ANP) is paired with its antecedent (e.g. NP4 out of the set of candidate antecedents NP1 … NP5)
• Non-anaphoric instances: each non-anaphoric noun phrase (NANP) is paired with the candidate antecedent selected by the tournament model (e.g. NP3 out of NP1 … NP5)
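A minimal sketch of this instance-generation step, assuming annotated NP dictionaries and an existing tournament selector; the field names are hypothetical, not data structures from the paper.

```python
# Sketch of generating training instances for the anaphoricity
# determination model (field names and selector are assumptions).

def make_anaphoricity_instances(noun_phrases, select_with_tournament):
    instances = []
    for np in noun_phrases:
        candidates = np["candidate_antecedents"]
        if np["is_anaphoric"]:
            # positive instance: the ANP paired with its annotated antecedent
            instances.append((np["antecedent"], np, +1))
        elif candidates:
            # negative instance: the NANP paired with the candidate
            # the tournament model would pick
            best = select_with_tournament(candidates, np)
            instances.append((best, np, -1))
    return instances
```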
Talk outline • Zero-anaphora resolution: Background • Selection-then-classification model (Iida et al., 05) • Proposed model • Represents syntactic patterns based on dependency trees • Uses a tree mining technique to seek useful sub-trees and solve the data sparseness problem • Incorporates syntactic pattern features in the selection-then-classification model • Experiments on Japanese zero-anaphora • Conclusion and future work
New model
• Same two-step architecture as the SCM: the tournament model selects the most likely candidate antecedent (USAir Group Inc), then the anaphoricity determination model decides whether the candidate anaphor (USAir) is anaphoric (score ≥ θ_ana) or non-anaphoric (score < θ_ana)
• Both steps are augmented with syntactic pattern features
Use of syntactic pattern features • Encoding parse tree features • Learning useful sub-trees
Encoding parse tree features
• Start from the dependency tree of the example sentence: Mary-wa (Mary-TOP), John-ni (John-DAT, Antecedent), φ-ga (φ-NOM, zero-pronoun), tabako-o (smoking-OBJ), yameru-youni (quit-COMP, predicate), it-ta (say-PAST, predicate)
• Keep only the nodes relevant to the anaphoric relation: the antecedent, the zero-pronoun and the predicates
• Generalise those nodes to their roles (Antecedent, zero-pronoun, predicate) while retaining the functional words (ni, ga, youni, ta)
Encoding parse trees
Example: LeftCand Mary-wa (Mary-TOP), RightCand John-ni (John-DAT), zero-pronoun φ-ga (φ-NOM), predicates yameru-youni (quit-COMP) and it-ta (say-PAST)
Three sub-trees are extracted:
• T_L: relates LeftCand, the zero-pronoun and the predicates
• T_I: relates LeftCand and RightCand via the predicate
• T_R: relates RightCand, the zero-pronoun and the predicates
Encoding parse trees • Antecedent identification
• Each instance is a single tree: a new root node whose children are the three sub-trees (T_L, T_I, T_R) and nodes 1 … n encoding lexical, grammatical, semantic, positional and heuristic binary features
• The label of the instance indicates which candidate, left or right, is the better antecedent
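A toy sketch of assembling one such instance as a single tree is shown below. The nested-tuple representation, the S-expression printer and the feature names are illustrative assumptions and not necessarily the exact input format expected by the BACT tool.

```python
# Toy sketch: build an antecedent-identification instance as one tree
# (root -> three sub-trees + one node per fired binary feature).

def encode_instance(t_left, t_inter, t_right, fired_features):
    feature_nodes = [("F", [(name, [])]) for name in fired_features]
    return ("root", [t_left, t_inter, t_right] + feature_nodes)

def to_sexpr(node):
    label, children = node
    if not children:
        return label
    return "(" + label + " " + " ".join(to_sexpr(c) for c in children) + ")"

# Schematic role-labelled sub-trees from the example sentence
t_left  = ("LeftCand",  [("zero-pronoun", [("predicate", [("predicate", [])])])])
t_inter = ("LeftCand",  [("RightCand", [("predicate", [])])])
t_right = ("RightCand", [("zero-pronoun", [("predicate", [("predicate", [])])])])

print(to_sexpr(encode_instance(t_left, t_inter, t_right,
                               ["cand_is_dative", "same_clause"])))
```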
Learning useful sub-trees • Kernel methods: • Tree kernel (Collins and Duffy, 01) • Hierarchical DAG kernel (Suzuki et al., 03) • Convolution tree kernel (Moschitti, 04) • Boosting-based algorithm: • BACT (Kudo and Matsumoto, 04): learns a list of weighted decision stumps with the Boosting algorithm
Learning useful sub-trees • Boosting-based algorithm: BACT
• Learning: from the labelled training instances, learn a list of weighted decision stumps, each consisting of a sub-tree, a weight (e.g. 0.4) and a label (e.g. positive)
• Classification: classify a given input tree by weighted voting over the stumps that match it (e.g. score +0.34 → positive)
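The classification step can be sketched as follows. For illustration only, a sub-tree is represented here as a set of parent-child label pairs, which is a much coarser matching criterion than the sub-tree containment BACT actually uses; the weights and stumps are made up.

```python
# Minimal sketch of weighted voting over (sub-tree, weight, label) stumps
# in the spirit of BACT (matching criterion simplified for illustration).

def edges(tree):
    """All (parent-label, child-label) pairs of a nested-tuple tree."""
    label, children = tree
    pairs = set()
    for child in children:
        pairs.add((label, child[0]))
        pairs |= edges(child)
    return pairs

def classify(tree, stumps):
    """stumps: list of (subtree_edges, weight, label) with label in {+1, -1}."""
    tree_edges = edges(tree)
    score = sum(weight * label
                for subtree_edges, weight, label in stumps
                if subtree_edges <= tree_edges)       # stump matches the tree
    return (+1 if score >= 0 else -1), score

# Example (weights and sub-trees made up for illustration)
stumps = [
    (frozenset({("zero-pronoun", "predicate")}), 0.4, +1),
    (frozenset({("RightCand", "predicate")}),    0.2, -1),
]
tree = ("root", [("LeftCand", [("zero-pronoun", [("predicate", [])])])])
print(classify(tree, stumps))   # -> (1, 0.4): classified as positive
```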
Overall process
Input: a zero-pronoun φ in the sentence S
• Intra-sentential model (uses syntactic pattern features):
  • if score_intra ≥ θ_intra, output the most likely candidate antecedent appearing in S
  • if score_intra < θ_intra, back off to the inter-sentential model
• Inter-sentential model:
  • if score_inter ≥ θ_inter, output the most likely candidate appearing outside of S
  • if score_inter < θ_inter, return “non-anaphoric”
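A minimal sketch of this cascade is given below; the two models, their score conventions and the parameter names are assumptions for illustration, with each model assumed to return its best candidate and a confidence score.

```python
# Sketch of the overall cascade: intra-sentential model first, then
# inter-sentential model, otherwise "non-anaphoric".

def resolve_zero_pronoun(phi, sentence, discourse,
                         intra_model, inter_model,
                         theta_intra, theta_inter):
    # 1. Intra-sentential model (with syntactic pattern features)
    candidate, score_intra = intra_model(phi, sentence)
    if score_intra >= theta_intra:
        return candidate            # most likely candidate antecedent in S
    # 2. Inter-sentential model
    candidate, score_inter = inter_model(phi, discourse)
    if score_inter >= theta_inter:
        return candidate            # most likely candidate outside of S
    # 3. Neither model is confident enough
    return "non-anaphoric"
```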
Table of contents • Zero-anaphora resolution • Selection-then-classification model (Iida et al., 05) • Proposed model • Parse encoding • Tree mining • Experiments • Conclusion and future work
Experiments
• Japanese newspaper article corpus annotated with zero-anaphoric relations: 197 texts (1,803 sentences)
  • 995 intra-sentential anaphoric zero-pronouns
  • 754 inter-sentential anaphoric zero-pronouns
  • 603 non-anaphoric zero-pronouns
• Recall = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns)
• Precision = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns the model detected)
Experimental settings • Conducted five-fold cross-validation • Comparison among four models: • BM: Ng and Cardie (02)’s model, which identifies an antecedent with candidate-wise classification and determines the anaphoricity of a given anaphor as a by-product of the search for its antecedent • BM_STR: BM + syntactic pattern features • SCM: selection-then-classification model (Iida et al., 05) • SCM_STR: SCM + syntactic pattern features
Results of intra-sentential ZAR • Antecedent identification (accuracy) • The performance of antecedent identification improved by using syntactic pattern features
Results of intra-sentential ZAR • antecedent identification + anaphoricity determination
Impact on overall ZAR • Evaluate the overall performance for both intra-sentential and inter-sentential ZAR • Baseline model: SCM, which resolves intra-sentential and inter-sentential zero-anaphora simultaneously with no syntactic pattern features
AUC curve • AUC (Area Under the recall-precision Curve), with the curve plotted by altering θ_intra • The curve is not peaky, so optimizing the parameter θ_intra is not difficult
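One way such an area under the recall-precision curve could be computed is sketched below: sweep θ_intra, collect (recall, precision) points, and integrate with the trapezoidal rule. The data layout and the integration choice are assumptions, not the authors' evaluation code.

```python
# Sketch: recall-precision points by sweeping a threshold, then the area
# under the resulting curve (trapezoidal rule).

def recall_precision_points(scored_outputs, n_gold_anaphoric, thresholds):
    """scored_outputs: list of (score, is_correct) for detected zero-pronouns."""
    points = []
    for theta in thresholds:
        kept = [is_correct for score, is_correct in scored_outputs if score >= theta]
        if not kept:
            continue
        recall = sum(kept) / n_gold_anaphoric
        precision = sum(kept) / len(kept)
        points.append((recall, precision))
    return sorted(points)

def area_under_curve(points):
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0   # trapezoidal rule
    return area
```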
Conclusion • We have addressed the issue of how to use syntactic patterns for zero-anaphora resolution: • How to encode syntactic pattern features • How to seek useful sub-trees • Incorporating syntactic pattern features into our selection-then-classification model improves the accuracy of intra-sentential zero-anaphora resolution, which consequently improves the overall performance of zero-anaphora resolution.
Future work • How to find zero-pronouns? • Designing a broader framework to interact with analysis of predicate argument structure • How to find a globally optimal solution to the set of zero-anaphora resolution problems in a given discourse? • Exploring methods as discussed by McCallum and Wellner (03)