Intelligent Systems (AI-2) Computer Science CPSC 422, Lecture 29 Mar 26, 2014
Some general points • Intelligent Systems are complex; they often integrate many R&R (representation and reasoning) systems + Machine Learning • Discourse Parsing is a task we are very good at here at UBC ;-) • Show Demo
Questions
• How were the features selected in the segmentation model? What trade-offs, if any, were considered in the decision process? I didn't understand the analysis in this section of the paper.
• Mainly based on previous work.
• Why is "…discourse segmentation a primary source of inaccuracy for discourse parsing…" (p. 6)? Is this because discourse segmentation is typically done manually (i.e., humans involved)? If so, is developing an automated segmenter (as the paper has done) becoming a more preferable approach?
• No, it is because if you start from the wrong building blocks, you are more likely to make mistakes.
Relation and structure
• Why is it important to capture the joint relation between relation and structure?
• Some relations are more/less likely to connect certain structures, and vice versa.
• Why are similar good performances harder to obtain in graph structure representations?
• Why is it that chunks of sentences cannot be skipped over in rhetorical relations? E.g., in Fig. 4, why can't chunk 1 and chunk 4 form a relation, independently of chunks 2 and 3?
• RST theory: relations only hold between adjacent spans.
Questions: DTs and EDUs
• How are EDUs defined?
• Elementary discourse unit (EDU): a span of text, usually a clause, but in general ranging from minimally a (nominalization) NP to maximally a sentence. It denotes a single event or type of event, serving as a complete, distinct unit of information that the subsequent discourse may connect to. An EDU may be structurally embedded in another.
• How are DTs and EDUs generated?
• Human annotation.
• What are sequence and hierarchical dependencies in a DT?
• Some rhetorical relations are more/less likely to follow/dominate other ones.
• What does it mean that previous discourse analyses make strong independence assumptions on the structure and labels of the resulting DT?
• They determine structure and relation labels separately and ignore sequential dependencies (see the limitations slide below).
Questions: Parsing Model
• What exactly was dominance used for in the parsing model?
• Features for the CRF.
• What does the model parameter Θ represent?
• All the feature weights in the CRF.
Parsing Algo
• How do you prove that a parsing model is optimal?
• CYK has been proven to be optimal, I think by induction, but I am not sure.
Dictionary
• How does one construct a lexical N-gram dictionary from the training data? What is a lexical N-gram dictionary, and how does it work exactly?
• Frequencies of words, frequencies of pairs of consecutive words, etc. (see the sketch below).
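A lexical N-gram dictionary of this kind can be sketched in a few lines of Python. A minimal, hypothetical illustration (the helper name build_ngram_dict is invented, not the paper's code):

```python
from collections import Counter

def build_ngram_dict(sentences, n_max=2):
    """Count word n-grams (n = 1 .. n_max) over tokenized sentences."""
    counts = Counter()
    for tokens in sentences:
        for n in range(1, n_max + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts

ngrams = build_ngram_dict([["the", "bank", "was", "hamstrung"],
                           ["the", "bank", "said"]])
print(ngrams[("the", "bank")])  # bigram frequency -> 2
```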
Questions: Experiments
• It seems that the model is more accurate in finding the right tree structure when the sentences are shorter and contain fewer EDUs; what would happen if you were to test various sentence-length conditions on the model?
• How well does the model deal with numbers? (E.g., applying discourse analysis to the results section of a research paper.)
• Why are such specific genres of text used (news articles and how-to-do manuals)?
• Because they are the only ones annotated with DTs!
Questions: Experiments
• Is the parser specifically geared to parse sentences from such a body of work, or is it versatile enough to achieve similar results with different genres? Or rather, would this be a form of experimental control?
• This is an empirical question; the fact that it works well on two rather different genres is encouraging.
• What would happen if the corpus was grammatically incorrect (even in the slightest way)? Would this affect the experiment?
• It all depends on how robust the parser is; also, a possible strategy would be ……..
• Go over Table 2 and what each column represents?
• OK.
Questions: Confusion matrix
• I didn't really understand the discussion of the confusion matrix. What can we learn about the relation labelling from the confusion matrix? What can we look for? How can I get some intuition? (See the sketch below.)
• Concerning the confusion matrix, are we able to develop metrics based on the matrix entries (averages, medians, …) to differentiate particular text samples by difficulty?
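As a way to build intuition for the first question: per-relation precision and recall can be read directly off the matrix, and the off-diagonal mass in a row shows which labels a relation tends to be mistaken for. A minimal sketch (the three relation names and all counts are invented for illustration):

```python
import numpy as np

def per_class_metrics(conf):
    """Per-label precision/recall from a confusion matrix where
    conf[i, j] = number of spans with gold label i predicted as j."""
    tp = np.diag(conf).astype(float)
    precision = tp / np.maximum(conf.sum(axis=0), 1)  # column sums = predicted counts
    recall = tp / np.maximum(conf.sum(axis=1), 1)     # row sums = gold counts
    return precision, recall

# Toy matrix over three relations: Elaboration, Attribution, Contrast
conf = np.array([[50,  5,  3],
                 [ 4, 30,  1],
                 [ 9,  2,  6]])
p, r = per_class_metrics(conf)
print(p.round(2), r.round(2))  # Contrast has low recall: often mislabeled as Elaboration
```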
Our Discourse Parser
Discourse parsing: state-of-the-art limitations
• Structure and labels determined separately
• Sequential dependencies not considered
• Suboptimal algorithm to build the structure
Our discourse parser addresses these limitations:
• Layered CRFs + CKY-like parsing
Ongoing Work
• Detect controversiality in online asynchronous conversations
• Summarize evaluative text (e.g., customer reviews)
• "Using Discourse Structure Improves Machine Translation Evaluation", ACL 2014
• Also, we submitted one to COLING on semantic parsing (for spoken language) that uses discourse trees in SVM kernels.
Discovering Discourse Structure: Computational Tasks
"The bank was hamstrung in its efforts to face the challenges of a changing market by its links to the government, analysts say."
Discourse Segmentation:
[The bank was hamstrung in its efforts]1 [to face the challenges of a changing market by its links to the government,]2 [analysts say.]3
Discourse Parsing: link the resulting segments into a discourse tree.
Our Parsing Model
Model structure and label jointly.
At level i, the spans W1 … Wk form a chain in which each span Wj carries a structure node Sj ∈ {0, 1} and a relation node Rj ∈ {1 .. M}.
Dynamic Conditional Random Field (DCRF) [Sutton et al., 2007]: models sequential dependencies.
Discourse Parsing: Evaluation
Corpora/Datasets:
• RST-DT corpus (Carlson & Marcu, 2001): 385 news articles (Train: 347, 7673 sentences; Test: 38, 991 sentences)
• Instructional corpus (Subba & Di-Eugenio, 2009): 176 how-to-do manuals, 3430 sentences
Excellent results (beat the state-of-the-art by a wide margin): [EMNLP 2012, ACL 2013]
Discourse Parsing: Example
What is the main claim in a message, and how is it expanded/supported by the other claims?
"It would be great to hire Mary. Even though she may initially need some coaching, her background is perfect for the job."
• EVIDENCE links the main claim "It would be great to hire Mary." (nucleus) to the second sentence.
• Within the second sentence, CONCESSION links "Even though she may initially need some coaching" (satellite) to "her background is perfect for the job." (nucleus).
Discourse Parsing
Existing parsers are based on features derived from the lexicalized syntactic tree (Soricut & Marcu, 2003; Hernault et al., 2010):
• Strong independence assumptions
• Local/greedy relation selection
We are currently working on a new parser based on:
• Linear-chain Conditional Random Fields (CRFs)
• An A* parsing technique
A Novel Discriminative Framework for Sentence-Level Discourse Analysis
Shafiq Joty
In collaboration with Giuseppe Carenini and Raymond T. Ng
Discourse Analysis in RST
"The bank was hamstrung in its efforts to face the challenges of a changing market by its links to the government, analysts say."
The sentence is split into three EDUs (the leaves of the tree); rhetorical relations connect adjacent spans, each relation holding between a nucleus (the more salient span) and a satellite.
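For concreteness, such a tree can be represented with a small recursive data structure. A minimal Python sketch (the DTNode class is hypothetical and handles only mononuclear relations; the inner relation label is illustrative, while ATTRIBUTION for "analysts say" follows standard RST-DT practice):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DTNode:
    """A node in an RST discourse tree (sketch: mononuclear relations only)."""
    text: Optional[str] = None           # set only on leaf EDUs
    relation: Optional[str] = None       # set only on internal nodes
    nucleus: Optional["DTNode"] = None   # the more salient child span
    satellite: Optional["DTNode"] = None

tree = DTNode(
    relation="ATTRIBUTION",
    satellite=DTNode(text="analysts say."),
    nucleus=DTNode(
        relation="ELABORATION",          # illustrative label
        nucleus=DTNode(text="The bank was hamstrung in its efforts"),
        satellite=DTNode(text="to face the challenges of a changing "
                              "market by its links to the government,"),
    ),
)
```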
Motivation • Text summarization (Marcu, 2000) • Text generation (Prasad et al., 2005) • Sentence compression (Sporleder & Lapata, 2005) • Question Answering (Verberne et al., 2007)
Outline • Previous work • Discourse parser • Discourse segmenter • Corpora/datasets • Evaluation metrics • Experiments • Conclusion and future work
Previous Work (1)
SPADE (Soricut & Marcu, 2003): segmenter & parser, sentence level
• Generative approach ✓
• Lexico-syntactic features ✓
• Structure & label dependent ✗
• Sequential dependencies ✗
• Hierarchical dependencies ✗
HILDA (Hernault et al., 2010): segmenter & parser, document level
• SVMs ✓
• Large feature set ✓
• Optimal ✗
• Sequential dependencies ✗
• Hierarchical dependencies ✓
Both evaluated on newspaper articles.
Previous Work (2)
Subba & Di-Eugenio (2009): only parser; shift-reduce; sentence + document level
• ILP-based classifier ✓
• Compositional semantics ✓
• Optimal ✗
• Sequential dependencies ✗
• Hierarchical dependencies ✗
• Evaluated on instructional manuals
Fisher & Roark (2007): only segmenter; binary log-linear model
• State-of-the-art performance
• Parse-tree features are important
Discourse Parsing
Assume a sentence is already segmented into EDUs: e1, e2, e3.
Discourse parsing then determines:
• Structure: which adjacent spans merge, e.g. e1-3 built from e1 and e2-3, or from e1-2 and e3
• Label: relation + nuclearity for each merged span, e.g. r2-3 on e2-3 and r1-3 on e1-3
Our Discourse Parser
• Parsing model: assigns probabilities to DTs; R ranges over the set of relations (R1-2, R2-3, R1-3, … for the candidate spans)
• Parsing algorithm: finds the most probable DT
Requirements for Our Parsing Model • Discriminative • Joint model for Structure and Label • Sequential dependencies • Hierarchical dependencies • Should support an optimal parsing algorithm
Our Parsing Model
Model structure and label jointly.
At level i, the spans W1 … Wk form a chain in which each span Wj carries a structure node Sj ∈ {0, 1} and a relation node Rj ∈ {1 .. M}.
Dynamic Conditional Random Field (DCRF) [Sutton et al., 2007]: models sequential dependencies.
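To see what modeling structure and label jointly with sequential dependencies buys, here is a tiny brute-force illustration of the posterior marginals such a coupled chain produces. All potentials below are made up; a real DCRF learns them as weighted feature sums and computes marginals with dynamic programming rather than enumeration:

```python
import itertools, math

M, K = 3, 3   # M relation labels; chain of K units at this level

def log_potential(s, r):
    """Toy log-score of a joint assignment of structure s and relations r."""
    total = 0.0
    for j in range(K):
        total += 0.5 * s[j] + 0.1 * r[j] * s[j]       # node potential couples S and R
        if j > 0:
            total += 0.2 * (s[j] == s[j - 1])         # sequential coupling on S
            total += 0.1 * (r[j] == r[j - 1])         # sequential coupling on R
    return total

assignments = [(s, r)
               for s in itertools.product([0, 1], repeat=K)
               for r in itertools.product(range(1, M + 1), repeat=K)]
Z = sum(math.exp(log_potential(s, r)) for s, r in assignments)

# Posterior marginal P(S=1, R=1) for the middle unit -- the kind of
# relation-structure probability the parsing algorithm consumes:
p = sum(math.exp(log_potential(s, r))
        for s, r in assignments if s[1] == 1 and r[1] == 1) / Z
print(round(p, 4))
```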
Obtaining probabilities
Apply the DCRF recursively at different levels and compute posterior marginals of relation-structure pairs.
• Level 1: a chain over the EDUs e1, e2, e3, e4 with structure-relation pairs (S2, R2), (S3, R3), (S4, R4).
• Level 2: chains over two-EDU spans combined with the remaining units, e.g. (e1-2, e3, e4), (e1, e2-3, e4), (e1, e2, e3-4), with nodes such as (S2-3, R2-3) and (S3-4, R3-4).
Obtaining probabilities
Apply the DCRF recursively at different levels and compute posterior marginals of relation-structure pairs.
• Level 3: chains over span pairs covering the whole sentence, e.g. (e1, e2-4), (e1-2, e3-4), (e1-3, e4), with nodes (S2-4, R2-4), (S3-4, R3-4), (S4, R4).
Features Used in Parsing Model
• SPADE features
• Hierarchical dependencies
[feature table not recoverable from the original slide]
Parsing Algorithm
• Probabilistic CKY-like bottom-up algorithm
• Fills a dynamic programming table (DPT) D: 4×4 for 4 EDUs
• Finds the global optimum (see the sketch below)
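A minimal sketch of such a CKY-like search over EDU spans, assuming a scorer log_prob(i, k, j) that returns a log score for merging span (i..k) with span (k+1..j), maximized over relation labels (e.g. taken from the DCRF posteriors). This is an illustration, not the authors' implementation:

```python
def cky_parse(n, log_prob):
    """Find the highest-scoring binary tree over EDUs 1..n.
    Returns (log score, nested-tuple backpointer tree)."""
    best = {(i, i): (0.0, i) for i in range(1, n + 1)}      # leaves: single EDUs
    for length in range(2, n + 1):                          # grow spans bottom-up
        for i in range(1, n - length + 2):
            j = i + length - 1
            best[(i, j)] = max(
                ((best[(i, k)][0] + best[(k + 1, j)][0] + log_prob(i, k, j),
                  (best[(i, k)][1], best[(k + 1, j)][1]))
                 for k in range(i, j)),
                key=lambda cand: cand[0])
    return best[(1, n)]

# Toy scorer that mildly prefers right-branching trees:
score, tree = cky_parse(4, lambda i, k, j: -0.1 * (k - i))
print(score, tree)   # nested tuples encode the chosen structure
```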
Outline • Previous work • Discourse parser • Discourse segmenter • Corpora/datasets • Evaluation metrics • Experiments • Conclusion and future work
Discourse Segmentation
"The bank was hamstrung in its efforts to face the challenges of a changing market by its links to the government, analysts say."
Discourse segmentation splits the sentence into EDUs:
[The bank was hamstrung in its efforts] [to face the challenges of a changing market by its links to the government,] [analysts say.]
Segmentation is the primary source of inaccuracy (Soricut & Marcu, 2003).
Our Discourse Segmenter
• Binary classification: boundary or no-boundary
• Logistic Regression with L2 regularization
• Bagging to deal with sparse boundary tags (see the sketch below)
Features used:
• SPADE features
• Chunk and POS features
• Positional features
• Contextual features
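A rough scikit-learn sketch of this setup, with random toy features standing in for the real SPADE/chunk/POS/positional/contextual features (an assumption-laden illustration, not the paper's code):

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression

# Toy stand-in data: one row per inter-token position; y = 1 marks an
# EDU boundary. Boundaries are rare, which motivates the bagging.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (rng.random(1000) < 0.1).astype(int)

base = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
clf = BaggingClassifier(estimator=base, n_estimators=10)  # base_estimator= in older scikit-learn
clf.fit(X, y)
boundary_probs = clf.predict_proba(X)[:, 1]   # P(boundary) per position
print(boundary_probs[:5].round(3))
```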
Corpora/Datasets
RST-DT corpus (Carlson & Marcu, 2001):
• 385 news articles (Train: 347, 7673 sentences; Test: 38, 991 sentences)
• Relations: 18 relations; 39 with nucleus-satellite attached
Instructional corpus (Subba & Di-Eugenio, 2009):
• 176 how-to-do manuals; 3430 sentences
• Relations: 26 primary relations; reversals of non-commutative relations treated as separate relations; 70 with nucleus-satellite attached
Evaluation Metrics
Metrics for parsing accuracy (Marcu, 2000): precision, recall, F-measure
• Unlabeled (Span)
• Labeled (Nuclearity, Relation)
Metric for segmentation accuracy (Soricut & Marcu, 2003; Fisher & Roark, 2007): precision, recall, F-measure on intra-sentence EDU boundaries (see the sketch below)
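The segmentation metric can be made concrete in a few lines, treating gold and predicted boundaries as sets of intra-sentence token positions (a hypothetical helper, not the evaluation script used in the paper):

```python
def boundary_prf(gold, pred):
    """Precision/recall/F1 over intra-sentence EDU boundary positions."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Gold boundaries after tokens 7 and 21; the segmenter finds 7 but adds 19:
print(boundary_prf(gold={7, 21}, pred={7, 19}))  # (0.5, 0.5, 0.5)
```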
Experiments (1) Parsing based on manual segmentation Our model outperforms the state-of-the-art by a wide margin, especially on relation labeling
Experiments (2) Discourse segmentation Human agreement (F-measure): 98.3 • Our model outperforms SPADE and is comparable to F&R • We use fewer features than F&R
Experiments (3) Parsing based on automatic segmentation • Our model outperforms SPADE by a wide margin • Inaccuracies in segmentation affect parsing, particularly on the Instructional corpus
Error analysis (Relation labeling) • The most frequent relations are confused with the less frequent ones • It is hard to distinguish semantically similar relations
Conclusion • Discriminative framework for discourse analysis. • Our parsing model: • Discriminative • Structure and label jointly • Sequential and hierarchical dependencies • Supports an optimal parsing algorithm • Our approach outperforms the state-of-the-art by a wide margin.
Future Work • Extend to multi-sentential text. • Can segmentation and parsing be done jointly?