Some Observations on Hindi Dependency Parsing Samar Husain, Language Technologies Research Centre, IIIT-Hyderabad
Introduction • Parsing a free word-order language with (relatively) rich morphology is a challenging task • Methods, problems, causes • Experiments with Hindi
Hindi: Brief overview • malay ne sameer ko kitaab dii (Malay-ERG Sameer-DAT book gave) “Malay gave the book to Sameer” (S-IO-DO-V) • The other orders are also grammatical: S-DO-IO-V, IO-S-DO-V, IO-DO-S-V, DO-S-IO-V, DO-IO-S-V
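The free word-order point can be illustrated with a small sketch (transliterations approximate); it simply enumerates the argument orders listed on the slide, with the verb kept final:

```python
from itertools import permutations

# Constituents of the example sentence (glosses, not output of any parser):
S, IO, DO, V = "malay ne", "sameer ko", "kitaab", "dii"

# In Hindi the verb tends to stay final while the arguments permute freely,
# so all 3! = 6 orderings of {S, IO, DO} before V are grammatical.
orders = [" ".join(p) + " " + V for p in permutations([S, IO, DO])]
for o in orders:
    print(o)
```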
Hindi: Brief overview • Inflections • Gender, number, person • Tense, aspect and modality • Agreement • Noun-adjective • Noun-verb
Dependency Grammar • A formalism for linguistic analysis • Dependencies between words are central to the analysis • Different from phrase-structure analysis • Example: Abhay ate a mango
Dependency Tree • Root property • Spanning property • Connectedness property • Single-head property • Acyclicity property • Arc-size property Kübler et al. (2009)
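These properties can be checked mechanically. A minimal sketch, using an illustrative encoding of a tree as a map from word index to head index, with 0 as the artificial root:

```python
# Sketch: verify well-formedness of a dependency tree given as
# {dependent: head}, with 0 as the artificial root.
def is_well_formed(heads):
    # Single-head property: a dict already gives each word exactly one head.
    # Root property: exactly one word depends on the artificial root 0.
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False
    # Connectedness + acyclicity: every word must reach 0 without revisiting.
    for w in heads:
        seen = set()
        while w != 0:
            if w in seen or w not in heads:
                return False            # cycle, or a dangling head index
            seen.add(w)
            w = heads[w]
    return True

# "Abhay ate a mango": ate(2) depends on root(0); Abhay(1) and mango(4)
# depend on ate; a(3) depends on mango.
print(is_well_formed({1: 2, 2: 0, 3: 4, 4: 2}))   # True
print(is_well_formed({1: 2, 2: 1}))               # no root -> False
```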
Dependency Parsing • M = (Γ, λ, h): a dependency parsing model M comprises a set of constraints Γ that defines the space of permissible dependency structures, a set of parameters λ, and a parsing algorithm h • Γ maps an arbitrary sentence S and a dependency type set R to a set of well-formed dependency trees Gs • Γ = (Σ, R, C), where Σ is the set of terminal symbols (here, words), R is the label set, and C is the set of constraints; these constraints restrict, in well-defined ways, the dependencies between words and the possible heads of a word • G = h(Γ, λ, S): given the constraints Γ, the parameters λ, and a new sentence S, how does the system find the most appropriate dependency tree G for that sentence? Kübler et al. (2009)
Constraint based • Based on the notion of eliminative parsing: sentences are analyzed by successively eliminating representations that violate constraints until only valid representations remain Data-driven • The learning problem: learning a parsing model from a representative sample of sentence structures (training data) • The parsing problem (also called inference or decoding): applying the learned model to the analysis of a new sentence
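A toy illustration of the eliminative idea, using karaka-style labels from the Paninian scheme (k1 agent, k2 theme, k4 recipient). The candidate arcs and the two constraints are invented for illustration and are not the actual GH-CBP grammar:

```python
# Eliminative parsing sketch: begin with every candidate head-dependent-label
# arc and drop the ones that violate a constraint, until only valid arcs remain.
candidates = [
    ("dii", "malaya", "k1"),   # 'ne'-marked NP as agent (k1)
    ("dii", "kitaab", "k1"),   # violates the case constraint below
    ("dii", "kitaab", "k2"),   # unmarked NP as theme (k2)
    ("dii", "sameer", "k4"),   # 'ko'-marked NP as recipient (k4)
]
case = {"malaya": "ne", "sameer": "ko", "kitaab": None}

constraints = [
    # k1 (agent) of a perfective verb must carry the ergative marker 'ne'
    lambda h, d, l: not (l == "k1" and case[d] != "ne"),
    # k4 (recipient) must carry the dative marker 'ko'
    lambda h, d, l: not (l == "k4" and case[d] != "ko"),
]

valid = [a for a in candidates if all(c(*a) for c in constraints)]
print(valid)   # the three correct arcs survive
```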
Constraint based method • A Two-Stage Generalized Hybrid Constraint Based Parser (GH-CBP) • Incorporates some of the notions of CPG • Uses integer linear programming for constraint satisfaction • Also incorporates ideas from graph-based parsing and labeling for prioritization Bharati et al. (2009a, 2009b); Husain (2011)
Data driven approaches • Transition based systems • MaltParser • Graph based systems • MSTParser
MaltParser • Malt is a classifier-based shift/reduce parser • It provides the arc-eager, arc-standard, Covington projective and Covington non-projective parsing algorithms • History-based feature models are used for predicting the next parser action • Support vector machines are used for mapping histories to parser actions Nivre et al. (2006)
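A minimal unlabeled arc-eager sketch, with a simplified static oracle standing in for Malt's SVM classifier (illustrative only, not MaltParser's implementation):

```python
# Arc-eager transition parsing: a stack, a buffer, and four transitions
# (LEFT-ARC, RIGHT-ARC, REDUCE, SHIFT). gold maps dependent -> head index;
# 0 is the artificial root. Returns the set of (head, dependent) arcs.
def arc_eager(words, gold):
    stack, buf, arcs = [0], list(range(1, len(words) + 1)), set()
    has_head = set()
    while buf:
        s, b = stack[-1], buf[0]
        if s and gold[s] == b:                     # LEFT-ARC: s depends on b
            arcs.add((b, s)); has_head.add(s); stack.pop()
        elif gold[b] == s:                         # RIGHT-ARC: b depends on s
            arcs.add((s, b)); has_head.add(b); stack.append(buf.pop(0))
        elif s in has_head and any(
                gold[b] == t or (t and gold[t] == b) for t in stack[:-1]):
            stack.pop()                            # REDUCE: s is finished
        else:
            stack.append(buf.pop(0))               # SHIFT
    return arcs

# "Abhay ate a mango": ate(2) <- root; Abhay(1), mango(4) <- ate; a(3) <- mango
print(arc_eager(["Abhay", "ate", "a", "mango"], {1: 2, 2: 0, 3: 4, 4: 2}))
```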
MSTParser • MST uses the Chu-Liu-Edmonds maximum spanning tree algorithm for non-projective parsing and Eisner's algorithm for projective parsing • It uses online large-margin learning as the learning algorithm McDonald et al. (2005a, 2005b)
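A sketch of the graph-based view: score every candidate arc, then take the best-scoring incoming arc for each word. The scores below are made up for illustration; the real MSTParser learns them, and Chu-Liu-Edmonds additionally contracts any cycles this greedy first step may create:

```python
# Greedy best-head selection, the first step of Chu-Liu-Edmonds.
# scores maps (head, dependent) -> arc score; 0 is the artificial root.
scores = {
    (0, 2): 9, (2, 1): 7, (2, 4): 6, (4, 3): 5,
    (0, 1): 2, (1, 2): 3, (3, 4): 1, (2, 3): 2,
}

def best_heads(n, scores):
    heads = {}
    for d in range(1, n + 1):
        cands = {h: s for (h, dep), s in scores.items() if dep == d}
        heads[d] = max(cands, key=cands.get)   # best incoming arc for d
    return heads

# For "Abhay ate a mango" with these scores, the greedy choice is already
# a tree, so no cycle contraction is needed.
print(best_heads(4, scores))
```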
Hybrid • Constraint parser + MSTParser (Husain et al., 2011b)
Modularity • Chunk • Local word groups • Local dependencies
Modularity • Clause • Intra-clausal • Inter-clausal
Chunk based parsing (I) • Chunk as a hard constraint • Intra-chunk and inter-chunk dependencies are identified separately • But intra-chunk features are still used • Identifying intra-chunk relations is easy Ambati et al. (2010b)
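The two-stage idea can be sketched as follows (chunk boundaries and head indices are hand-specified here; the real pipeline derives them from a chunker and richer intra-chunk rules):

```python
# Chunk-as-hard-constraint sketch: resolve dependencies inside each chunk
# first, then let only the chunk heads take part in inter-chunk parsing.
sentence = ["malay", "ne", "sameer", "ko", "ek", "kitaab", "dii"]
# chunk = (start, end, head_index): [malay ne] [sameer ko] [ek kitaab] [dii]
chunks = [(0, 2, 0), (2, 4, 2), (4, 6, 5), (6, 7, 6)]

# Stage 1 (intra-chunk): every non-head word depends on its chunk head.
intra = [(h, i) for s, e, h in chunks for i in range(s, e) if i != h]

# Stage 2 (inter-chunk): only chunk heads are parsed against each other;
# here the verb 'dii' (index 6) heads the three nominal chunk heads.
chunk_heads = [h for _, _, h in chunks]
inter = [(6, h) for h in chunk_heads if h != 6]

print(intra)   # case markers / quantifier attach inside their chunks
print(inter)   # verb heads the chunk heads
```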
Chunk based parsing (II) • Chunk as soft constraint • Intra-chunk and inter-chunk dependencies identified together • Use local morphosyntactic features
Clause based parsing (I) Husain et al. (2009)
Clause based parsing (II) Husain et al. (2011a)
Clause based parsing (III) • Similar to parser stacking: ‘guide’ Malt with a 1st-stage parse by Malt • The additional features given to the 2nd-stage parser during 2-Soft parsing encode the 1st-stage parser's decisions about potential arcs and labels considered by the 2nd-stage parser, in particular arcs involving the word currently on top of the stack and the word currently at the head of the input buffer
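The stacking idea can be sketched as feature augmentation (the feature names and label set here are illustrative, not Malt's actual feature specification):

```python
# Sketch of 2-Soft stacking: features for the 2nd-stage parser encode what
# the 1st-stage parser predicted about the word on top of the stack (s0)
# and the word at the head of the buffer (b0).
def stacking_features(s0, b0, stage1_heads, stage1_labels):
    feats = {}
    if stage1_heads.get(s0) == b0:         # 1st stage proposed arc b0 -> s0
        feats["p1_arc"] = "left"
        feats["p1_label"] = stage1_labels[s0]
    elif stage1_heads.get(b0) == s0:       # 1st stage proposed arc s0 -> b0
        feats["p1_arc"] = "right"
        feats["p1_label"] = stage1_labels[b0]
    else:
        feats["p1_arc"] = "none"
    return feats

# A 1st-stage parse of "Abhay ate a mango" (labels are illustrative)
heads = {1: 2, 2: 0, 3: 4, 4: 2}
labels = {1: "k1", 2: "root", 3: "det", 4: "k2"}
print(stacking_features(1, 2, heads, labels))
```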
Experimental setup • Parsers: GH-CBP (version 1.6), MaltParser (version 1.3.1), MSTParser (version 0.4b) • Data: ICON10 tools contest • The training set had 3000 sentences, the development set 500 sentences, and the test set 300 sentences
Evaluation metric and accuracies • CoNLL dependency parsing shared task 2008 (Nivre et al., 2008) • UAS: unlabeled attachment score • LAS: labeled attachment score • LA: label accuracy • Performance • Constraint based (coarse-grained tagset; oracle): UAS = 88.50, LAS = 79.12 • Statistical (fine-grained): UAS = ~91, LAS = ~76
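The three metrics can be computed directly from per-word (head, label) pairs; a minimal sketch of a CoNLL-style scorer (the example labels are illustrative):

```python
# UAS: fraction of words with the correct head
# LAS: fraction of words with the correct head AND label
# LA:  fraction of words with the correct label
def score(gold, pred):
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    las = sum(g == p for g, p in zip(gold, pred)) / n
    la = sum(g[1] == p[1] for g, p in zip(gold, pred)) / n
    return uas, las, la

gold = [(2, "k1"), (0, "root"), (4, "det"), (2, "k2")]
pred = [(2, "k1"), (0, "root"), (2, "det"), (2, "k4")]
print(score(gold, pred))   # (0.75, 0.5, 0.75)
```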
Remarks: Malt • Crucial features • Deprels of the partially built tree • Conjoined features • Good for short-distance dependencies • The non-projective algorithm doesn't help • Settings: arc-eager, LIBSVM Bharati et al. (2008); Ambati et al. (2010a)
Remarks: MSTParser • Crucial features • Conjoined features • Modified MST: it is difficult to incorporate complex features for labeled parsing, so we use a MaxEnt labeler • Good for long-distance dependencies and for identifying the root • The non-projective algorithm performs better • Training: k=5, order=2 Bharati et al. (2008); Ambati et al. (2010a)
What helps • Morphological features • Local morphosyntactic features • Clausal features • Minimal semantics Bharati et al. (2008); Ambati et al. (2009); Ambati et al. (2010a); Ambati et al. (2010b); Gadde et al. (2010)
Relative comparison • The relative importance of these features over the baseline LAS of MSTParser
What doesn’t • Gender, number, person
Parsing MOR-FWO languages • Problems in parsing morphologically rich, free word-order (MOR-FWO) languages: • The non-configurational nature of these languages • Inherent limitations in the parsing/learning algorithms • Limited annotated data
Common errors • Simple sentences: the correct identification of the argument structure (labels)
Common errors • Reasons for label errors: • word order is not strict • absence of postpositions • ambiguous postpositions • ambiguous TAMs (tense, aspect, modality markers) • the parser's inability to exploit agreement features • the inability to always make simple linguistic generalizations
Embedded clauses • Relative clauses • Participles