
Quasi-Synchronous Grammars



  1. Quasi-Synchronous Grammars • Based on key observations in MT: • translated sentences often share some isomorphic syntactic structure, but usually not in its entirety; • the strictness of the isomorphism may vary across words or syntactic rules. • Key idea: • Unlike stricter, more rigid synchronous grammars (e.g., SCFG), a QG defines a monolingual grammar for the target tree, “inspired” by the source tree.

  2. Quasi-Synchronous Grammars • In other words, we model the generation of the target tree as influenced by the source tree (and the alignment between them). • QA can be thought of as extremely free monolingual translation. • The linkage between question and answer trees in QA is looser than in MT, which gives QG a bigger edge.

  3. Model • Works on labeled dependency parse trees. • Learns the hidden structure (the alignment between the Q and A trees) by summing out ALL possible alignments. • One particular alignment tells us both the syntactic configurations and the word-to-word semantic correspondences. • An example follows: a question parse tree, an answer parse tree, and an alignment between them.

  4. [Figure: labeled dependency parse trees for Q: “Who is the leader of France?” and A: “Bush met with French president Jacques Chirac,” with POS tags, named-entity labels (person, location), and a qword marker on “who.”]

  5. [Figure: the same Q/A dependency trees, repeated to begin the step-by-step alignment animation.]

  6. [Figure: the answer tree alongside the start of the question tree, with only the root verb “is” generated so far.] Given its parent, a word is independent of all other words (including siblings). Our model makes local Markov assumptions to allow efficient computation via dynamic programming (details in the paper; a sketch follows).
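
A minimal sketch, assuming hypothetical scoring functions p_word (word-to-word correspondence) and p_config (syntactic configuration between aligned parent and child), of the inside-style dynamic program that this independence assumption licenses; it is illustrative, not the authors' code:

```python
class Node:
    """A dependency-tree node: a word plus its dependents."""
    def __init__(self, word, children=()):
        self.word = word
        self.children = list(children)

def sentence_score(q_root, a_nodes, p_word, p_config):
    """Sum the model score over ALL alignments of question words to
    answer words; memoization keeps this at O(|Q| * |A|^2) time."""
    memo = {}

    def inside(q_node, a_i):
        # Score of the question subtree rooted at q_node, given that
        # q_node is aligned to answer node a_i.
        key = (id(q_node), id(a_i))
        if key not in memo:
            score = p_word(q_node.word, a_i.word)
            for child in q_node.children:
                # Local Markov assumption: a child's alignment depends only
                # on its parent's alignment, so it can be summed out here.
                score *= sum(p_config(a_i, a_j) * inside(child, a_j)
                             for a_j in a_nodes)
            memo[key] = score
        return memo[key]

    # The root's own alignment is summed out as well.
    return sum(inside(q_root, a_j) for a_j in a_nodes)
```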

  7. [Figure: “who” is generated next in the question tree, conditioned on its parent “is.”]

  8. [Figure: “leader” is generated as the object of “is.”]

  9. [Figure: the determiner “the” is generated under “leader.”]

  10. [Figure: “of” and “France” are generated, completing the question tree.]

  11. 6 types of syntactic configurations • Parent-child

  12. [Figure: the full Q/A dependency trees, with an aligned pair illustrating the parent-child configuration.]

  13. Parent-child configuration

  14. 6 types of syntactic configurations • Parent-child • Same-word

  15. [Figure: the Q/A dependency trees, with an aligned pair illustrating the same-word configuration.]

  16. Same-word configuration

  17. 6 types of syntactic configurations • Parent-child • Same-word • Grandparent-child

  18. [Figure: the Q/A dependency trees, with an aligned pair illustrating the grandparent-child configuration.]

  19. Grandparent-child configuration

  20. 6 types of syntactic configurations • Parent-child • Same-word • Grandparent-child • Child-parent • Siblings • C-command (same set as [D. Smith & Eisner ’06]) • A sketch classifying these six cases follows.
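
A hedged sketch (not from the paper) of how the configuration of an aligned question parent-child pair could be read off the answer tree; it assumes each answer node carries a .parent pointer, and lumps every remaining relation into the catch-all last class:

```python
def configuration(a_p, a_c):
    """Classify the relation in the answer tree between a_p and a_c, the
    answer-side images of an aligned question parent-child pair."""
    if a_p is a_c:
        return "same-word"
    if a_c.parent is a_p:
        return "parent-child"
    if a_p.parent is a_c:
        return "child-parent"
    if a_c.parent is not None and a_c.parent.parent is a_p:
        return "grandparent-child"
    if a_p.parent is not None and a_p.parent is a_c.parent:
        return "siblings"
    # Any other tree relation falls into the catch-all class.
    return "c-command-or-other"
```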

  21. Modeling alignment • Base model

  22. [Figure: the Q/A dependency trees, illustrating the base alignment model.]

  23. [Figure: the Q/A dependency trees, a further base-model illustration.]

  24. Modeling alignment cont. • Base model • Log-linear model: lexical-semantic features from WordNet (identity, hypernym, synonym, entailment, etc.) • Mixture model of the two • A sketch of the combination follows.
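
A sketch of how the pieces could combine, with every name an illustrative assumption: p_base is the base multinomial, phi extracts the WordNet-style indicator features, w holds learned feature weights, and lam is the mixture coefficient:

```python
import math

def p_loglinear(q_word, a_word, a_candidates, phi, w):
    """Locally normalized log-linear word-correspondence model."""
    def unnorm(a):
        # exp of the weighted feature sum for the pair (q_word, a).
        return math.exp(sum(w.get(f, 0.0) * v
                            for f, v in phi(q_word, a).items()))
    return unnorm(a_word) / sum(unnorm(a) for a in a_candidates)

def p_mixture(q_word, a_word, a_candidates, p_base, phi, w, lam):
    """Mixture of the base model and the log-linear model."""
    return (lam * p_base(q_word, a_word)
            + (1.0 - lam) * p_loglinear(q_word, a_word, a_candidates, phi, w))
```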

  25. Parameter estimation • Things to be learned: • multinomial distributions in the base model • log-linear model feature weights • mixture coefficient • Training involves summing out hidden structures, so the objective is non-convex. • Solved using conditional Expectation-Maximization (sketched below).
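
A minimal sketch of the conditional EM loop, assuming hypothetical helpers: e_step computes expected counts of the hidden alignment decisions (via the summation sketched earlier) plus the data log-likelihood, and m_step re-estimates the multinomials, feature weights, and mixture coefficient from those counts:

```python
def conditional_em(train_pairs, params, max_iters=50, tol=1e-4):
    """Alternate E and M steps until the log-likelihood stops improving.
    Since hidden alignments are summed out, the objective is non-convex
    and this only reaches a local optimum."""
    prev_ll = float("-inf")
    for _ in range(max_iters):
        counts, ll = e_step(train_pairs, params)  # assumed: expectations over alignments
        params = m_step(counts)                   # assumed: maximize expected log-likelihood
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return params
```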

  26. Experiments • TREC 8-12 data set for training • TREC 13 questions for development and testing

  27. Candidate answer generation • For each question, we take all documents from the TREC document pool and extract sentences that contain at least one non-stopword keyword from the question. • For computational reasons (parsing speed, etc.), we only kept answer sentences of at most 40 words. • An illustrative filter appears below.
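
An illustrative version of this filter (not the authors' exact pipeline); the whitespace tokenization and the lowercased stopwords set are simplifying assumptions:

```python
def candidate_sentences(question, sentences, stopwords, max_len=40):
    """Yield sentences sharing at least one non-stopword keyword with
    the question and containing at most max_len tokens."""
    keywords = {w.lower() for w in question.split()} - stopwords
    for sent in sentences:
        tokens = sent.split()
        if len(tokens) <= max_len and keywords & {t.lower() for t in tokens}:
            yield sent
```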

  28. Dataset statistics • Manually labeled 100 questions for training • Total: 348 positive Q/A pairs • 84 questions for development • Total: 1415 Q/A pairs (on average 3.1 positive, 17.1 negative) • 100 questions for testing • Total: 1703 Q/A pairs (on average 3.6 positive, 20.0 negative) • Automatically labeled another 2193 questions to create a noisy training set, for evaluating model robustness

  29. Experiments cont. • Each question and answer sentence is tokenized, POS-tagged (MXPOST), parsed (MSTParser), and labeled with named-entity tags (Identifinder).

  30. Baseline systems (replications) • [Cui et al., SIGIR ’05] • The algorithm behind one of the best-performing systems in the TREC evaluations. • Uses a mutual-information-inspired score computed over dependency trees and a single fixed alignment between them. • [Punyakanok et al., NLE ’04] • Measures the similarity between Q and A by computing tree edit distance. • Both baselines are high-performing, syntax-based, and among the most straightforward to replicate. • We further enhanced both algorithms by augmenting them with WordNet. • A toy tree-edit-distance illustration follows.
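
A toy illustration of the tree-edit-distance idea using the third-party zss package (an implementation of the Zhang-Shasha algorithm); the replicated baseline's trees and edit costs may differ:

```python
from zss import Node, simple_distance  # pip install zss

# Toy dependency trees for the running example.
q_tree = (Node("is")
          .addkid(Node("who"))
          .addkid(Node("leader")
                  .addkid(Node("the"))
                  .addkid(Node("of").addkid(Node("France")))))

a_tree = (Node("met")
          .addkid(Node("Bush"))
          .addkid(Node("with")
                  .addkid(Node("Chirac")
                          .addkid(Node("Jacques"))
                          .addkid(Node("president").addkid(Node("French"))))))

# A smaller distance means more similar trees, hence a better candidate.
print(simple_distance(q_tree, a_tree))
```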

  31. Results • [Table: Mean Average Precision and Mean Reciprocal Rank of Top 1 for each system; the values shown include 28.2%, 41.2%, 30.3%, and 23.9%.] • The best result is statistically significantly better than the 2nd-best score in each column.

  32. Summing vs. Max

  33. Switching back • Tree-edit CRFs
