Syntactic Contributions in the Entailment Task Lucy Vanderwende, Arul Menezes, Rion Snow (Stanford)
RTE-1 analysis • Recap of MSR’s manual analysis of RTE-1 test data; in principle, 74% is achievable using syntax and thesaurus
MENT algorithm Predicting negative entailment using syntactic features: • Obtain syntactic dependency graphs for the T and H sentences • Attempt to align each H node to a node in T • Check syntactic heuristics on the aligned nodes; if any heuristic matches, predict false • If no heuristic matches, use the lexical similarity model (with threshold)
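The control flow above can be sketched in Python. This is a minimal sketch under stated assumptions, not the MSR implementation: each dependency graph is reduced to a list of lemmas, alignment is exact lemma match, the lexical similarity is simply the fraction of H nodes that found a partner in T, and the 0.75 threshold is an arbitrary placeholder. All function names are hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical simplification: a "graph" is just a list of lemmas.
Alignment = Dict[int, Optional[int]]  # H node index -> T node index (or None)
Heuristic = Callable[[List[str], List[str], Alignment], bool]


def align(t_nodes: List[str], h_nodes: List[str]) -> Alignment:
    """Align each H node to the first T node with the same lemma (assumed matcher)."""
    return {
        i: (t_nodes.index(lemma) if lemma in t_nodes else None)
        for i, lemma in enumerate(h_nodes)
    }


def lexical_similarity(alignment: Alignment) -> float:
    """Stand-in similarity: fraction of H nodes that were aligned to some T node."""
    if not alignment:
        return 0.0
    return sum(j is not None for j in alignment.values()) / len(alignment)


def predict_entailment(
    t_nodes: List[str],
    h_nodes: List[str],
    heuristics: List[Heuristic],
    threshold: float = 0.75,  # placeholder value, not taken from the slides
) -> bool:
    alignment = align(t_nodes, h_nodes)
    # Any matching syntactic heuristic predicts negative entailment.
    if any(h(t_nodes, h_nodes, alignment) for h in heuristics):
        return False
    # Otherwise back off to the thresholded lexical similarity model.
    return lexical_similarity(alignment) >= threshold
```

Passing the heuristics as a list of callables keeps the "check heuristics first, back off to similarity second" ordering explicit and makes it easy to switch individual heuristics off.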
MENT: superlative heuristic Superlative heuristic (100% accurate, 5 test items): • If the superlatives align, and their heads are aligned, and the head in Text has any additional modifiers, and those modifiers are aligned to some modifier in H, say yes, else say no. (RTE2-test- #477) • T: Crater Lake is the deepest lake in the United States, the second deepest in the Western Hemisphere, and the seventh deepest in the world, dropping downward to 1,932 feet just southeast of Merriam Cone. • H: Crater Lake is the deepest lake in the world.
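A minimal sketch of the superlative rule, assuming that "aligned" can be reduced to string equality and that each superlative phrase has already been extracted into a small record. The `SuperlativePhrase` type and the function name are hypothetical. The slide does not say what happens when the superlatives or heads fail to align; returning `None` ("heuristic does not apply") in that case is an assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class SuperlativePhrase:
    superlative: str           # e.g. "deepest"
    head: str                  # noun the superlative modifies, e.g. "lake"
    modifiers: FrozenSet[str]  # other modifiers of the head, e.g. {"in the world"}


def superlative_heuristic(t: SuperlativePhrase, h: SuperlativePhrase) -> Optional[bool]:
    """Return True for yes, False for no, None if the heuristic does not apply."""
    # Assumption: alignment is approximated by string equality.
    if t.superlative != h.superlative or t.head != h.head:
        return None
    # Say yes only if T's head has extra modifiers and each one is matched in H.
    if t.modifiers and all(m in h.modifiers for m in t.modifiers):
        return True
    return False


# The Crater Lake pair: the modifier of "lake" in T has no partner in H.
t_phrase = SuperlativePhrase("deepest", "lake", frozenset({"in the United States"}))
h_phrase = SuperlativePhrase("deepest", "lake", frozenset({"in the world"}))
crater_lake_verdict = superlative_heuristic(t_phrase, h_phrase)  # False, i.e. "no"
```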
MENT: Counterfactual heuristic Counterfactual heuristic (80% accurate, 15 test items): • If there is a pair of aligned nodes, and a second pair of aligned nodes, and the path between them in the dependency graph contains a conditional or counterfactual, say no. (RTE2-test- #473) • T: Blondlot was trying to polarize X-rays when he claimed to have discovered this new form of radiation. • H: Blondlot discovered x-rays.
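A minimal sketch of the counterfactual check, assuming the T dependency graph is given as (head, dependent) lemma pairs and that the cue words are known in advance. The cue list below is hypothetical (the slides do not list the actual cues), and checking only the nodes strictly between the two aligned nodes is also an assumption.

```python
from collections import deque
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical cue lemmas, for illustration only.
COUNTERFACTUAL_CUES = {"if", "claim", "unless", "would"}


def undirected(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from (head, dependent) pairs."""
    edges: Dict[str, Set[str]] = {}
    for head, dep in pairs:
        edges.setdefault(head, set()).add(dep)
        edges.setdefault(dep, set()).add(head)
    return edges


def path(edges: Dict[str, Set[str]], start: str, goal: str) -> List[str]:
    """Breadth-first search; returns a shortest path, or [] if none exists."""
    queue, seen = deque([[start]]), {start}
    while queue:
        current = queue.popleft()
        if current[-1] == goal:
            return current
        for nxt in edges.get(current[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(current + [nxt])
    return []


def counterfactual_says_no(edges: Dict[str, Set[str]], aligned_t_nodes: List[str]) -> bool:
    """True if a cue lies strictly between any two aligned T nodes."""
    for a, b in combinations(aligned_t_nodes, 2):
        if any(node in COUNTERFACTUAL_CUES for node in path(edges, a, b)[1:-1]):
            return True
    return False


# Simplified fragment of the Blondlot text: "he claimed to have discovered ...".
blondlot = undirected([("claim", "Blondlot"), ("claim", "discover"), ("discover", "radiation")])
blondlot_verdict = counterfactual_says_no(blondlot, ["Blondlot", "discover"])  # True, i.e. "no"
```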
MENT: training feature weights • “run2”: treating a syntactic heuristic match as a yes/no vote, alignment threshold set using training data • “run1”: learning weights (using MaxEnt) for each syntactic and alignment heuristic, as well as for sub-components of these heuristics
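To make the "run1" idea concrete, here is a minimal sketch of learning one weight per heuristic and alignment feature with a binary MaxEnt model (logistic regression) trained by batch gradient ascent. It is written from scratch for illustration and is not the toolkit or feature set used for the submission. The feature rows are invented toy values, not RTE data, and the learning rate and epoch count are arbitrary.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def score(w: Sequence[float], x: Sequence[float]) -> float:
    """Probability of "no entailment" under weights w (last entry is the bias)."""
    feats = list(x) + [1.0]
    return sigmoid(sum(wj * fj for wj, fj in zip(w, feats)))


def train_maxent(X: List[Sequence[float]], y: List[int],
                 lr: float = 0.5, epochs: int = 500) -> List[float]:
    """Batch gradient ascent on the log-likelihood of a binary MaxEnt model."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - score(w, xi)
            for j, fj in enumerate(list(xi) + [1.0]):
                grad[j] += err * fj
        w = [wj + lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


# Invented toy rows, for illustration only (not RTE data):
# [superlative heuristic fired, counterfactual heuristic fired, alignment score]
TOY_X = [[1, 0, 0.9], [0, 1, 0.8], [0, 0, 0.9], [0, 0, 0.95], [0, 0, 0.2], [1, 0, 0.3]]
TOY_Y = [1, 1, 0, 0, 1, 1]  # 1 = "no entailment"
TOY_WEIGHTS = train_maxent(TOY_X, TOY_Y)
```

In contrast to the "run2" vote, a heuristic here contributes a learned amount of evidence, so a weak heuristic can be outweighed by a strong alignment score.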
MENT: results MENT Run1 says no 43.25% of the time
MENT variations – no thresholds • If heuristics apply, say no; else say yes: 56% accurate, system says no 35% of the time • Say no, unless everything is aligned and no heuristics apply: 59.25% accurate, system says no 74.5% of the time • Note: Run2 = if no heuristics apply, and the alignment score is above a threshold trained on the training set, then say yes, else no. Accuracy: 58.50%
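The three decision rules on this slide can be written out as small Python functions, together with a helper that reports the two numbers quoted for each variation (accuracy and how often the system says no). This is a sketch; the function names are hypothetical, and the heuristic and alignment outcomes are assumed to be computed elsewhere.

```python
from typing import List, Tuple


def say_no_if_heuristic(heuristic_fired: bool) -> bool:
    """First variation: no if any heuristic applies, else yes (True means yes)."""
    return not heuristic_fired


def say_no_unless_all_aligned(heuristic_fired: bool, all_aligned: bool) -> bool:
    """Second variation: yes only if everything is aligned and no heuristic applies."""
    return all_aligned and not heuristic_fired


def run2(heuristic_fired: bool, alignment_score: float, threshold: float) -> bool:
    """Run2: yes only if no heuristic applies and the score clears the trained threshold."""
    return (not heuristic_fired) and alignment_score > threshold


def evaluate(predictions: List[bool], gold: List[bool]) -> Tuple[float, float]:
    """Return (accuracy, fraction of 'no' predictions)."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    says_no = sum(not p for p in predictions)
    return correct / len(gold), says_no / len(predictions)
```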
MENT variations – with threshold • With learned alignment and syntactic heuristic weights, with alignment threshold from training, say no; else say yes: 60.25% accurate, system says no 43% of the time • Say no, unless the alignment score is above an Oracle threshold and no heuristics apply: 61.25% accurate, system says no 70% of the time
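A sketch of how an alignment threshold might be chosen: scan the observed scores and keep the cut-off with the best accuracy. Calling `best_threshold` on training data gives a trained threshold; calling it on the test data itself gives what is assumed here to be meant by an "Oracle" threshold. The function names and the tie-breaking rule (lowest candidate wins) are assumptions, not details from the slides.

```python
from typing import List


def accuracy_at(threshold: float, scores: List[float], gold: List[bool],
                heuristic_fired: List[bool]) -> float:
    """Accuracy of: say yes iff score > threshold and no heuristic fired."""
    preds = [(s > threshold) and not f for s, f in zip(scores, heuristic_fired)]
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)


def best_threshold(scores: List[float], gold: List[bool],
                   heuristic_fired: List[bool]) -> float:
    """Pick the candidate cut-off with the highest accuracy on the given data."""
    # Candidates: one value below every score, plus each distinct score.
    candidates = [min(scores) - 1.0] + sorted(set(scores))
    return max(candidates, key=lambda t: accuracy_at(t, scores, gold, heuristic_fired))
```

Because the chosen cut-off depends entirely on the score distribution of the data it is tuned on, a threshold fitted on one data set need not transfer to another.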
Lessons? • Use syntactic heuristics and sub-components as features and apply discriminative training • Thresholding for lexical similarity isn’t stable across data sets • Error Analysis …
How far do you take syntactic heuristics? • Location: for a pair of aligned verb nodes, if there is an argument in H, and that argument is aligned to a node in T, say no if that node is not also the same argument of the aligned verb (applied 7 times, 5 incorrect) • T: Brandenburg Gate is one of Berlin's best known landmarks and is now regarded as one of the greatest symbols of German unity. • H: Brandenburg Gate is in Berlin.
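The argument check can be sketched as follows, assuming each aligned verb comes with a role-to-filler map and that alignment is approximated by the same word appearing in T. The role labels ("subject", "location", "predicate") and the function name are hypothetical. The example shows why the rule fires on the Brandenburg Gate pair: "Berlin" occurs in T, but not as the location argument of the aligned verb.

```python
from typing import Dict, Set


def argument_mismatch_says_no(h_args: Dict[str, str], t_args: Dict[str, str],
                              t_words: Set[str]) -> bool:
    """True (say no) if an H argument is found in T but fills a different role there."""
    for role, word in h_args.items():
        aligned_in_t = word in t_words         # assumed alignment: same word appears in T
        same_role = t_args.get(role) == word   # does T's aligned verb have it in that role?
        if aligned_in_t and not same_role:
            return True
    return False


# Hypothetical role labels for the two aligned "be" verbs.
h_be = {"subject": "Brandenburg Gate", "location": "Berlin"}
t_be = {"subject": "Brandenburg Gate", "predicate": "landmarks"}
t_vocab = {"Brandenburg Gate", "Berlin", "landmarks"}
brandenburg_verdict = argument_mismatch_says_no(h_be, t_be, t_vocab)  # True, i.e. "no"
```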
A great heuristic … but • Unaligned Verb: if there is an aligned subject and an aligned object, then if their verb is not aligned, say no • This heuristic was not used because of its poor performance, for example: • T: Rodriguez told detectives he never touched the burning backpack, which was loaded with plastic pipes packed with gunpowder and BBs. • H: The burning backpack contained plastic pipes packed with gunpowder and BBs. • Need to learn paraphrase similarity for verbs – see NAACL-HLT paper forthcoming.
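A sketch of the unaligned-verb rule, and of how a verb paraphrase table would change its verdict. The `Clause` record, the string-equality alignment, and the paraphrase pair ("load", "contain") are all hypothetical illustrations; the slides only state that such paraphrase similarity needs to be learned.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Clause:
    subject: str
    verb: str
    obj: str


def unaligned_verb_says_no(t: Clause, h: Clause,
                           verb_paraphrases: FrozenSet[Tuple[str, str]] = frozenset()) -> bool:
    """True (say no) if subject and object align but the verbs do not."""
    args_aligned = t.subject == h.subject and t.obj == h.obj
    verb_aligned = t.verb == h.verb or (t.verb, h.verb) in verb_paraphrases
    return args_aligned and not verb_aligned


t_clause = Clause("backpack", "load", "pipes")     # "... backpack, which was loaded with plastic pipes"
h_clause = Clause("backpack", "contain", "pipes")  # "The burning backpack contained plastic pipes"

without_paraphrases = unaligned_verb_says_no(t_clause, h_clause)
with_paraphrases = unaligned_verb_says_no(
    t_clause, h_clause, frozenset({("load", "contain")})  # hypothetical paraphrase entry
)
```

Without the paraphrase entry the rule says no on this pair; with it, the verbs count as aligned and the rule stays silent.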
Directions and Plans • MSR submission available at http://research.microsoft.com/~lucyv/ • Might it be possible to have access to all sites’ submissions? • Need to learn paraphrase similarity for verbs • More feature engineering • Different graph-matching strategies to avoid brittleness of syntactic heuristics • Find more data for training to build more stable systems
A plug for Pyramids • Conservatives oppose any form of devolution. • The conservatives are opposed to devolution. • The UK’s Tory Prime Minister adamantly resisted calls for devolution of British rule. • Scotts want self-rule • … as buoyed as most Scotts by North Ireland’s prospective self-rule • Wales is following Scotland, and moving towards a call for an elected assembly with devolved powers … • A self-governing Wales would be part of the EU • … an independent Wales within the European community • … Wales could participate directly in forthcoming EC meetings … • … a fully self-governing Wales within the European Community. • Slide callouts: “SCU name, given by annotator”; “Candidate hypothesis?”; “Candidate Text?”