Two-Stage Constraint Based Hindi Parser LTRC, IIIT Hyderabad
Brief Recap
• Broad coverage parser
• Dependency
• Paninian framework
  • vibhakti-karaka correspondence
  • karaka frames (basic + transformation)
  • Source groups, demand groups
• Constraints
  • Three basic constraints
  • Constraints as Integer programming equations
Parser
• Two-stage strategy
• Appropriate constraints formed
• Stage I (Intra-clausal relations)
  • Dependency relations marked
  • Relations such as k1, k2, k3, etc. for each verb
• Stage II (Inter-clausal relations & conjunct relations)
  • Conjuncts, relative clauses, kriya mula, etc.
• In certain cases the split separates syntax from semantics (e.g. kriya mula); in others it reduces the complexity.
Steps in Parsing
SENTENCE → Morph, POS tagging, Chunking → Identify Demand Groups → Load Frames & Transform → Find Candidates → Apply Constraints & Solve → Is Complex?
• YES → STAGE - II
• NO → Final Parse
Stage I: Types being handled
• Simple Sentences (finite verbs)
• Clausal arguments
• Non-finite verbs
  • wA_huA
  • wA_hI
  • nA
  • kara
  • 0_rahe, etc.
• Copula
• Genitive
Stage - II
• Handles:
  • Conjuncts
    • Subordinating & Coordinating
  • Relative clauses
  • Complex predicates
• Basic constraints similar to Stage-I
• Some additional constraints
• New demand groups
• New candidates
Steps (Stage II)
Output of STAGE - I → Identify New Demand Groups → Load Frames & Transform → Find Candidates → Repair → Apply Constraints & Solve → FINAL PARSE
Example – Relative Clause
• vaha puswaka jo rAma ne mohana ko xI hE prasixXa hE
  that book which Ram ERG. Mohana DAT. gave is famous is
  ‘The book which Ram gave to Mohana is famous’
Output after Stage - I
_ROOT_
  main: hE
    k1: puswaka
      vaha
    k1s: prasixXa
  main: xI
    k2: jo
    k1: rAma
    k4: mohana
Identify the demand group
• xiyA ‘give’
• Main verb of the relative clause
Identify the demand group, Load and Transform DF
• jo ‘which’ transformation (special)
• Transforms the demand frame of the main verb of the relative clause
----------------------------------------------------------------------
arc-label    necessity  vibhakti  lextype  src-pos  arc-dir  oprt
----------------------------------------------------------------------
nmod__relc   m          any       n        r|l      p        insert
----------------------------------------------------------------------
Karaka Frame – Main verb of relative clause
• vaha puswaka jo rAma ne mohana ko xI prasixXa hE |
  that book which Ram ERG. Mohana DAT. gave famous is
  ‘The book which Ram gave to Mohana is famous’
Transformed frame for xe after applying the jo transformation
----------------------------------------------------------------------
arc-label    necessity  vibhakti  lextype  src-pos  arc-dir  oprt
----------------------------------------------------------------------
nmod__relc   m          any       n        r|l      p        insert
----------------------------------------------------------------------
New row inserted after transformation
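The jo transformation can be pictured as a list operation on the demand frame. The following is a minimal sketch, not the parser's actual code: the `FrameRow` class and `apply_transformation` function are hypothetical names, the reading of `m` as "mandatory" is an assumption, and the three basic rows for xe are placeholders because the slides show only the inserted row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameRow:
    arc_label: str
    necessity: str   # 'm' on the slide; read here as "mandatory" (assumption)
    vibhakti: str
    lextype: str
    src_pos: str
    arc_dir: str
    oprt: str = ""   # operation; only set on transformation rows


# The row shown on the slide for the 'jo' transformation.
JO_ROW = FrameRow("nmod__relc", "m", "any", "n", "r|l", "p", "insert")

# HYPOTHETICAL basic frame for 'xe' (give): the slides do not show these rows,
# so every value below is a placeholder chosen only to make the sketch runnable.
BASIC_XE_FRAME = [
    FrameRow("k1", "m", "ne", "n", "l", "c"),
    FrameRow("k2", "m", "0", "n", "l", "c"),
    FrameRow("k4", "m", "ko", "n", "l", "c"),
]


def apply_transformation(frame, row):
    """Return a new frame with `row` applied; only 'insert' is sketched here."""
    if row.oprt != "insert":
        raise ValueError(f"unsupported operation: {row.oprt!r}")
    return frame + [row]


transformed = apply_transformation(BASIC_XE_FRAME, JO_ROW)
for r in transformed:
    print(r.arc_label, r.necessity, r.vibhakti, r.lextype, r.src_pos, r.arc_dir, r.oprt)
```

Returning a new list leaves the basic frame untouched, so the same verb can be loaded again without the relative-clause row.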
Possible candidates
• vaha puswaka jo rAma ne mohana ko xI hE prasixXa hE |
• Candidate arc label: nmod__relc
Output after Stage - II
_ROOT_
  main: hE
    k1: vaha puswaka
      nmod__relc: xiyA hE
        k1: rAma
        k2: jo
        k4: mohana
    k1s: prasixXa
Example II – Coordination
• rAma Ora siwA kala Aye |
  Ram and Sita yesterday came
  ‘Ram and Sita came yesterday’
Output of Stage - I
_ROOT_
  dummy: rAma
  main: Aye
    k1: siwA
    k7t: kala
  dummy: Ora
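For Stage – II (Constraint Graph)
[Figure: constraint graph over _ROOT_, rAma, Aye, Ora, siwA, kala; arc labels shown: main, k1, k7t, ccof, ccof]
Candidate Arcs
[Figure: candidate arcs over _ROOT_, rAma, Aye, Ora, siwA, kala; arc labels shown: main, k1, k1, k1, ccof, ccof]
Solution Graph
[Figure: solution graph over _ROOT_, rAma, Aye, Ora, siwA, kala; arc labels shown: main, k1, k7t, ccof, ccof]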
Parse tree (Output after Stage II)
_ROOT_
  main: Aye
    k7t: kala
    k1: Ora
      ccof: siwA
      ccof: rAma
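A parse like the one above is just a set of labelled head-to-dependent arcs. As a small illustration (the `render` helper and the triple layout are assumptions made for this sketch, not part of the parser), the Stage II output for this sentence can be stored as triples and printed as an indented tree:

```python
from collections import defaultdict

# Stage II parse of 'rAma Ora siwA kala Aye' as (head, label, dependent) triples.
ARCS = [
    ("_ROOT_", "main", "Aye"),
    ("Aye", "k1", "Ora"),
    ("Aye", "k7t", "kala"),
    ("Ora", "ccof", "rAma"),
    ("Ora", "ccof", "siwA"),
]


def render(arcs, root="_ROOT_"):
    """Return an indented text rendering of a dependency tree."""
    children = defaultdict(list)
    for head, label, dep in arcs:
        children[head].append((label, dep))

    lines = [root]

    def walk(node, depth):
        # Depth-first walk; each level is indented by two more spaces.
        for label, dep in children[node]:
            lines.append("  " * depth + f"{label}: {dep}")
            walk(dep, depth + 1)

    walk(root, 1)
    return "\n".join(lines)


print(render(ARCS))
```

Node names are used as identifiers here, which only works when every word form in the sentence is unique; a sentence with two hE nodes would need position-based ids.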
Finite Verb Coordination
• rAma Gara gayA Ora vaha so gayA |
  Ram home went and he sleep went
  ‘Ram went home and slept’
Output after Stage I
_ROOT_
  main: so
    k1: vaha
  main: gayA
    k1: rAma
    k2: Gara
  dummy: Ora
Karaka Frame - Ora Finite
[Figure: frame for Ora with two ccof children of type v_fin; in this example the ccof children are gayA and so]
Finite Verb Coordination (Parse Tree), Output after Stage II
_ROOT_
  main: Ora
    ccof: gayA
      k1: rAma
      k2: Gara
    ccof: so
      k1: vaha
Relative Clause Coordination
• rAma ne vaha puswaka KarIxI jo prasixXa hE Ora jo saswI hE
  ‘Ram purchased the book which is famous and which is cheap’
Output after Stage I
_ROOT_
  main: KarIxI
    k1: rAma
    k2: puswaka
  main: hE
    k1: jo
    k1s: prasixXa
  dummy: Ora
  main: hE
    k1: jo
    k1s: saswI
Karaka Frame - Ora Relative Clause
[Figure: frame for Ora with a parent of type n via nmod__relc and two ccof children of type v_rel; in this example the parent is puswaka and the ccof children are the two hE nodes]
Relative Clause Coordination (Parse Tree), Output after Stage II
_ROOT_
  main: KarIxI
    k1: rAma
    k2: puswaka
      nmod__relc: Ora
        ccof: hE
          k1: jo
          k1s: prasixXa
        ccof: hE
          k1: jo
          k1s: saswI
Non-Finite Verb Coordination
• rAma Kelakara Ora KAnA KAkara so gayA
  Ram having played and food having eaten sleep went
Output after Stage I
_ROOT_
  main: so
    vmod: Kelakara
    vmod: KAkara
      k2: KAnA
    k1: rAma
  dummy: Ora
Karaka Frame - Ora Non-Finite
[Figure: frame for Ora with a parent of type v_fin and two ccof children of type v_nfin; in this example the parent is so and the ccof children are Kelakara and KAkara]
Non-Finite Verb Coordination (Parse Tree), Output after Stage II
_ROOT_
  main: so
    vmod: Ora
      ccof: KAkara
        k2: KAnA
      ccof: Kelakara
    k1: rAma
Nominal Coordination
• rAma Ora siwA kala Aye |
  Ram and Sita yesterday came
  ‘Ram and Sita came yesterday’
Output after Stage I
_ROOT_
  dummy: rAma
  main: Aye
    k1: siwA
    k7t: kala
  dummy: Ora
Karaka Frame - Ora Nominal
[Figure: frame for Ora with two ccof children of type n; in this example the ccof children are siwA and rAma]
Nominal Coordination (Parse Tree), Output after Stage II
_ROOT_
  main: Aye
    k7t: kala
    k1: Ora
      ccof: siwA
      ccof: rAma
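In each of the four Ora frames shown on the slides (finite, relative clause, non-finite, nominal), both ccof children have the same lexical type. A rough sketch of frame selection on that basis follows; the dictionary and the function name are illustrative assumptions, and the parent-side requirements of the frames (n via nmod__relc, v_fin) are left out for brevity.

```python
# Lexical type required of both ccof children in each Ora frame on the slides.
ORA_FRAMES = {
    "finite": "v_fin",
    "relative clause": "v_rel",
    "non-finite": "v_nfin",
    "nominal": "n",
}


def matching_ora_frames(left_type, right_type):
    """Names of the Ora frames whose two ccof slots accept the given conjunct types."""
    return [
        name
        for name, required in ORA_FRAMES.items()
        if left_type == required and right_type == required
    ]


# rAma Ora siwA: both conjuncts are nouns, so only the nominal frame fits.
print(matching_ora_frames("n", "n"))
# A noun and a finite verb match no frame, so the pair is not a candidate.
print(matching_ora_frames("n", "v_fin"))
```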
Example
• rAma Ora siwA kala Aye |
_ROOT_
  dummy: rAma
  main: Aye
    k1: siwA
    k7t: kala
  dummy: Ora
Steps (Stage II)
Output of STAGE - I → Identify Nodes → Identify New Demand Groups → Load Frames & Transform → Find Candidates → Repair → Apply Constraints & Solve → FINAL PARSE
Constraint Graph Nodes (Stage II)
• Selected from the intermediate parse tree (Stage I)
• Set-I (demand nodes)
  1. Conjuncts
  2. Nearest verbal ancestor of ‘jo’ (usually just the parent)
  3. _ROOT_
  4. Children of _ROOT_ other than (1) and (2)
  5. Other nodes which are added due to nodes in Set-II
Constraint Graph Nodes (Stage II)
• Set-II (source nodes)
  • Possible children and parents of conjuncts
  • Possible heads of the relative clause
• Identification of nodes in Set-II will generally trigger the repair.
Identify the demand group
• Ora
• Aye
General Principles
• Repair/Revision
• For any node that becomes a potential child in Stage II, its arc to its existing parent is open to revision
• rAma Ora siwA kala Aye
  • Node 4 becomes a potential child (of node 1)
  • Its arc to its parent (node 2) is open to revision
General Principles
• Repair/Revision after parse of Stage I
• Any node which becomes a potential parent must be re-examined
• rAma Ora siwA kala Aye
  • Node 2 becomes a potential parent (of node 1)
  • Its child (node 4) is open to revision
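The first principle (a potential child's existing arc is open to revision) can be sketched as a simple partition of the Stage I arcs. This is an illustration only: the function name is hypothetical, the candidate arc list is an assumed reading of the coordination example, and the potential-parent principle is not modelled.

```python
# Stage I output for 'rAma Ora siwA kala Aye' as (head, label, dependent).
STAGE1_ARCS = [
    ("_ROOT_", "dummy", "rAma"),
    ("_ROOT_", "main", "Aye"),
    ("_ROOT_", "dummy", "Ora"),
    ("Aye", "k1", "siwA"),
    ("Aye", "k7t", "kala"),
]

# ASSUMED Stage II candidate arcs for this sentence (not listed on the slides).
STAGE2_CANDIDATES = [
    ("_ROOT_", "main", "Aye"),
    ("Aye", "k1", "Ora"),
    ("Ora", "ccof", "rAma"),
    ("Ora", "ccof", "siwA"),
]


def split_for_repair(stage1_arcs, candidates):
    """Split Stage I arcs into (open to revision, kept as-is).

    An arc is open when its dependent is a potential child of some
    Stage II candidate arc.
    """
    potential_children = {dep for _head, _label, dep in candidates}
    open_arcs = [a for a in stage1_arcs if a[2] in potential_children]
    kept_arcs = [a for a in stage1_arcs if a[2] not in potential_children]
    return open_arcs, kept_arcs


open_arcs, kept_arcs = split_for_repair(STAGE1_ARCS, STAGE2_CANDIDATES)
print("open:", open_arcs)
print("kept:", kept_arcs)
```

Under these assumptions only the k7t arc to kala is kept, and all outgoing arcs of _ROOT_ end up open.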
Algorithm
• Identify nodes of the constraint graph
  • From Set 1, and
  • From Set 2
• Remove all outgoing edges from _ROOT_.
• Find possible candidates for demand nodes present in Set 1 from Set 2
  • Parent candidate for finite verb
  • Parent and children for conjuncts
  • Children of _ROOT_
• Convert the formed constraint graph into an integer programming (IP) problem.
• Solve the IP equations to get the possible solution parse.
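The last two steps can be illustrated with a tiny 0/1 search in place of a real IP solver. This is a sketch under stated assumptions: the six candidate arcs are one reading of the labels on the Candidate Arcs slide (main, k1, k1, k1, ccof, ccof), the demand counts are assumed, and the two constraint families used here (each demanded label filled the required number of times, each node attached exactly once) are a simplified stand-in for the parser's three basic constraints, which the slides do not spell out.

```python
from itertools import product

# ASSUMED candidate arcs for 'rAma Ora siwA kala Aye' as (head, label, dependent).
CANDIDATES = [
    ("_ROOT_", "main", "Aye"),
    ("Aye", "k1", "rAma"),
    ("Aye", "k1", "siwA"),
    ("Aye", "k1", "Ora"),
    ("Ora", "ccof", "rAma"),
    ("Ora", "ccof", "siwA"),
]

# ASSUMED demands: how many arcs with this label each demand node must select.
DEMANDS = {("_ROOT_", "main"): 1, ("Aye", "k1"): 1, ("Ora", "ccof"): 2}


def solve(candidates, demands):
    """Enumerate all 0/1 assignments to the arc variables and keep the feasible ones."""
    nodes = {dep for _head, _label, dep in candidates}
    solutions = []
    for bits in product((0, 1), repeat=len(candidates)):
        chosen = [arc for arc, bit in zip(candidates, bits) if bit]
        # Each demanded (head, label) pair is filled exactly the required number of times.
        demands_ok = all(
            sum(1 for h, l, _d in chosen if (h, l) == key) == count
            for key, count in demands.items()
        )
        # Each node that can be a dependent has exactly one incoming arc.
        incoming_ok = all(
            sum(1 for _h, _l, d in chosen if d == node) == 1 for node in nodes
        )
        if demands_ok and incoming_ok:
            solutions.append(chosen)
    return solutions


for solution in solve(CANDIDATES, DEMANDS):
    print(solution)
```

With these inputs the search leaves a single assignment, in which Ora fills the k1 slot of Aye and takes rAma and siwA as ccof children. Exhaustive enumeration is exponential in the number of candidate arcs, so it is only suitable for toy graphs like this one.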
An example
raama aura sitaa kala aaye
‘Ram’ ‘and’ ‘Sita’ ‘yesterday’ ‘came’
Ram and Sita came yesterday
• Output after stage I
_ROOT_
  dummy: rAma
  main: Aye
    k1: siwA
    k7t: kala
  dummy: Ora
Identify Nodes
• Set 1 nodes
  [Figure: the Stage I tree over _ROOT_, rAma, Aye, Ora, siwA, kala (arc labels dummy, main, dummy, k1, k7t), marking the Set 1 nodes]
• Set 1 and Set 2
  [Figure: the same Stage I tree, marking the Set 1 and Set 2 nodes]
Constraint Graph
• New Constraint Graph
• Ora, Aye and _ROOT_ are the demand groups
• Note: ‘kala’ remains attached to its parent ‘aaye’ (does not show up in stage 2)
[Figure: constraint graph over _ROOT_, Aye, Ora, rAma, siwA; arc labels shown: main, k1, ccof, k1, ccof]
Example
• Final Parse
_ROOT_
  main: Aye
    k7t: kala
    k1: Ora
      ccof: siwA
      ccof: rAma
Types of complex sentences
• Relative clauses
  • Initial
  • Final
  • Medial
• Conjuncts (Coordination)
  • Simple clause
  • Relative clause
  • Non-finite
  • Nominal, adjectival, adverbial
Some other examples:
• rAma ne vaha puswaka KarIxI jo saswI hE Ora jo bAjZAra meM prasixXa hE |
• samIra Ora aBay ne vaha puswaka KarIxI jo saswI hE Ora jo bAjZAra meM prasixXa hE |
• rAma Ora mohana ke xoswa kI baccI Aye |
  • Only baccI came, or
  • Both rAma and baccI came
  • The ‘gnp’ (gender, number, person) of the main verb is used to choose between the two readings: Aye vs. AI