CSA3180: NLP Algorithms
Sentence Parsing Algorithms 2: Problems with DFTD Parser
CSA3180: Sentence Parsing
Problems with DFTD Parser
• Left Recursion
• Handling Ambiguity
• Inefficiency
Left Recursion
• A grammar is left recursive if it contains at least one non-terminal A for which A ⇒* αAβ and α ⇒* ε (n.b. ⇒* is the reflexive transitive closure of ⇒)
• Intuitive idea: the derivation of that category includes itself along its leftmost branch.
NP → NP PP
NP → NP and NP
NP → DetP Nominal
DetP → NP 's
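The definition above can be checked mechanically. The following is a minimal sketch, assuming a grammar stored as a dict from non-terminal to a list of right-hand-side tuples; the function names and the lexical rules for `the`, `cat` and `with` are illustrative additions, not part of the slides. It first finds the nullable non-terminals, then follows "can appear leftmost" links until a non-terminal reaches itself.

```python
def nullable_symbols(grammar):
    """Return the non-terminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, productions in grammar.items():
            if lhs in nullable:
                continue
            # An empty right-hand side () counts as all-nullable.
            if any(all(sym in nullable for sym in rhs) for rhs in productions):
                nullable.add(lhs)
                changed = True
    return nullable


def left_recursive_nonterminals(grammar):
    """Return every non-terminal A that can reappear leftmost in its own derivation."""
    nullable = nullable_symbols(grammar)
    # reach[A] = non-terminals that can appear leftmost after one rewrite of A.
    reach = {lhs: set() for lhs in grammar}
    for lhs, productions in grammar.items():
        for rhs in productions:
            for sym in rhs:
                if sym in grammar:
                    reach[lhs].add(sym)
                if sym not in nullable:
                    break  # symbols further right cannot be leftmost
    # Extend to any number of rewrites (transitive closure).
    changed = True
    while changed:
        changed = False
        for a in reach:
            extra = set()
            for b in reach[a]:
                extra |= reach[b]
            if not extra <= reach[a]:
                reach[a] |= extra
                changed = True
    return {a for a in reach if a in reach[a]}


# Rules from the slide, plus assumed lexical rules so the grammar is complete.
GRAMMAR = {
    "NP": [("NP", "PP"), ("NP", "and", "NP"), ("DetP", "Nominal")],
    "DetP": [("NP", "'s"), ("the",)],
    "Nominal": [("cat",)],
    "PP": [("with", "NP")],
}

print(sorted(left_recursive_nonterminals(GRAMMAR)))
```

Under these assumptions both NP (directly) and DetP (indirectly, through NP) are reported as left recursive.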
Left Recursion
Left recursion can lead to an infinite loop [nltk demo]
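The loop can be reproduced without NLTK. The following is a rough sketch of a depth-first, top-down recognizer in plain Python; the grammars, the depth cap of 50 and all names are assumptions made for illustration. The cap stands in for the infinite loop: a parser without it would keep expanding NP → NP PP at the same input position until Python raised a RecursionError.

```python
class DepthLimitHit(Exception):
    """Raised when the recognizer recurses deeper than the allowed limit."""


def recognize(grammar, symbol, tokens, pos, depth=0, max_depth=50):
    """Yield each input position reachable after parsing `symbol` from `pos`."""
    if depth > max_depth:
        raise DepthLimitHit(symbol)
    if symbol not in grammar:  # terminal: must match the next token
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    for rhs in grammar[symbol]:  # try rules in order, depth first
        yield from expand(grammar, rhs, tokens, pos, depth + 1, max_depth)


def expand(grammar, rhs, tokens, pos, depth, max_depth):
    """Yield end positions after parsing the symbols of `rhs` left to right."""
    if not rhs:
        yield pos
        return
    for mid in recognize(grammar, rhs[0], tokens, pos, depth, max_depth):
        yield from expand(grammar, rhs[1:], tokens, mid, depth, max_depth)


def try_parse(grammar, tokens, start="NP"):
    """Return True/False for acceptance, or a message if the depth cap is hit."""
    try:
        ends = list(recognize(grammar, start, tokens, 0))
    except DepthLimitHit:
        return "depth limit hit"
    return len(tokens) in ends


LEXICAL = {"PP": [("P", "NP")], "D": [("the",)], "N": [("cat",), ("mat",)], "P": [("on",)]}
LEFT_RECURSIVE = {"NP": [("NP", "PP"), ("D", "N")], **LEXICAL}
REWRITTEN = {"NP": [("D", "N", "NP1")], "NP1": [("PP", "NP1"), ()], **LEXICAL}

tokens = "the cat on the mat".split()
print(try_parse(LEFT_RECURSIVE, tokens))
print(try_parse(REWRITTEN, tokens))
```

With the left-recursive rule listed first the recognizer never consumes a token before hitting the cap, while the rewritten grammar accepts the same phrase.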
Dealing with Left Recursion
• Use a different parsing strategy
• Reformulate the grammar to eliminate LR
A → Aβ | α is rewritten as
A → αA'
A' → βA' | ε
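The rewrite can be expressed as a small function. This is a sketch that handles immediate left recursion only (rules of the form A → Aβ), assuming right-hand sides are tuples of symbols and the empty tuple stands for ε; the function name and the primed naming of the new non-terminal are illustrative choices.

```python
def eliminate_immediate_left_recursion(nonterminal, productions):
    """Rewrite A -> A beta | alpha as A -> alpha A', A' -> beta A' | epsilon."""
    new_nt = nonterminal + "'"
    # beta parts: what follows the leading A in each left-recursive rule
    betas = [rhs[1:] for rhs in productions if rhs and rhs[0] == nonterminal]
    # alpha parts: every rule that does not start with A
    alphas = [rhs for rhs in productions if not rhs or rhs[0] != nonterminal]
    if not betas:
        return {nonterminal: list(productions)}  # nothing to rewrite
    return {
        nonterminal: [alpha + (new_nt,) for alpha in alphas],
        new_nt: [beta + (new_nt,) for beta in betas] + [()],  # () is epsilon
    }


np_rules = [("NP", "and", "NP"), ("D", "N"), ("D", "N", "PP")]
for lhs, rules in eliminate_immediate_left_recursion("NP", np_rules).items():
    for rhs in rules:
        print(lhs, "->", " ".join(rhs) if rhs else "ε")
```

Indirect left recursion (for example through DetP → NP 's) is not handled by this sketch; it would first have to be turned into immediate left recursion by substituting rules.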
Rewriting the Grammar
NP → NP ‘and’ NP   (the part after the leading NP, ‘and’ NP, is β)
NP → D N | D N PP   (α)
New Grammar
NP → α NP1
NP1 → β NP1 | ε
α → D N | D N PP
β → ‘and’ NP
New Parse Tree
[NP [α [D the] [N cat]] [NP1 ε]]
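A hand-written recursive-descent parser for the rewritten grammar shows where the α and NP1 nodes come from. This is only a sketch: the word lists are assumed, the PP alternative of α is left out because the slides give no PP rule, and trees are represented as nested tuples.

```python
DETERMINERS = {"the", "a"}   # assumed lexicon
NOUNS = {"cat", "dog"}       # assumed lexicon


def parse_np(tokens, pos=0):
    """NP -> α NP1. Returns (tree, next position)."""
    alpha, pos = parse_alpha(tokens, pos)
    np1, pos = parse_np1(tokens, pos)
    return ("NP", alpha, np1), pos


def parse_alpha(tokens, pos):
    """α -> D N (the D N PP alternative is omitted in this sketch)."""
    if pos + 1 < len(tokens) and tokens[pos] in DETERMINERS and tokens[pos + 1] in NOUNS:
        return ("α", ("D", tokens[pos]), ("N", tokens[pos + 1])), pos + 2
    raise ValueError(f"expected D N at position {pos}")


def parse_np1(tokens, pos):
    """NP1 -> β NP1 | ε, with β -> 'and' NP."""
    if pos < len(tokens) and tokens[pos] == "and":
        inner_np, pos = parse_np(tokens, pos + 1)
        beta = ("β", "and", inner_np)
        rest, pos = parse_np1(tokens, pos)
        return ("NP1", beta, rest), pos
    return ("NP1", "ε"), pos


tree, end = parse_np("the cat".split())
print(tree)
```

For "the cat" the result has the same shape as the tree on the slide, including the NP1 node that expands to ε. No function calls itself before consuming a token, so the parser terminates.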
Rewriting the Grammar
• Different parse tree
• Unnatural parse tree?
Problems with DFTD Parser
• Left Recursion
• Handling Ambiguity
• Inefficiency
Handling Ambiguity
• Coordination Ambiguity: different scope of conjunction:
  Hot curry and ice taste nice with rice
  Hot curry and rice taste nice with ice
• Attachment Ambiguity: a constituent can be added to the parse tree in different places:
  I shot an elephant in my trousers
• VP → VP PP
  NP → NP PP
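The attachment ambiguity can be made concrete by counting parse trees. Below is a minimal sketch of a CKY-style chart that counts trees bottom-up, which sidesteps the left recursion in VP → VP PP and NP → NP PP. Only those two rules come from the slide; the remaining rules, the lexicon and the function name are assumptions chosen to cover the example sentence.

```python
from collections import defaultdict

# (left child, right child) -> parent categories
BINARY = {
    ("NP", "VP"): ["S"],
    ("VP", "PP"): ["VP"],   # from the slide: PP attaches to the verb phrase
    ("NP", "PP"): ["NP"],   # from the slide: PP attaches to the noun phrase
    ("V", "NP"): ["VP"],
    ("Det", "N"): ["NP"],
    ("P", "NP"): ["PP"],
}
LEXICON = {
    "I": ["NP"], "shot": ["V"], "an": ["Det"], "elephant": ["N"],
    "in": ["P"], "my": ["Det"], "trousers": ["N"],
}


def count_parses(tokens, root="S"):
    """Count the distinct parse trees of `tokens` rooted in `root`."""
    n = len(tokens)
    chart = defaultdict(int)  # (start, end, category) -> number of trees
    for i, tok in enumerate(tokens):
        for cat in LEXICON[tok]:
            chart[(i, i + 1, cat)] += 1
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for (left, right), parents in BINARY.items():
                    ways = chart[(start, mid, left)] * chart[(mid, end, right)]
                    if ways:
                        for parent in parents:
                            chart[(start, end, parent)] += ways
    return chart[(0, n, root)]


print(count_parses("I shot an elephant in my trousers".split()))
```

With this toy grammar the sentence gets two trees: one where "in my trousers" modifies "shot" and one where it modifies "an elephant".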
Real sentences are full of ambiguities
President Kennedy today pushed aside other White House business to devote all his time and attention to working on the Berlin crisis address he will deliver tomorrow night to the American people over nationwide television and radio.
Prepositional Phrase Ambiguity
• he will deliver
• to the American people
• over nationwide TV
• in New York
• during September
• for very good reasons
Growth of Number of Ambiguities
The nth Catalan number counts the ways of dissecting a polygon with n+2 sides into triangles by drawing nonintersecting diagonals.
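To give a feel for the growth, the sketch below tabulates Catalan numbers using the standard closed form C_n = (2n choose n) / (n + 1) and cross-checks it against the recurrence C_0 = 1, C_(m+1) = Σ C_i · C_(m-i). The function names are illustrative. As a hedged link back to parsing: under the two attachment rules VP → VP PP and NP → NP PP, a verb followed by an NP and one PP has 2 parses and with two PPs it has 5, which matches C_2 and C_3.

```python
from math import comb


def catalan(n):
    """Return the nth Catalan number via C_n = (2n choose n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def catalan_by_recurrence(n):
    """Same value via C_0 = 1 and C_(m+1) = sum over i of C_i * C_(m-i)."""
    values = [1]
    for m in range(n):
        values.append(sum(values[i] * values[m - i] for i in range(m + 1)))
    return values[n]


for n in range(1, 9):
    print(n, catalan(n))
```

The values for n = 1 to 8 are 1, 2, 5, 14, 42, 132, 429, 1430, so each extra attachment site multiplies the number of analyses by a factor that approaches 4.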
Handling Ambiguities
• Statistical disambiguation
  • which is the most probable interpretation?
• Semantic knowledge
  • which is the most sensible interpretation?
  • Subatomic particles such as positively charged protons and electrons
Problems with DFTD Parser
• Left Recursion
• Handling Ambiguity
• Inefficiency
Repeated Parsing of Subtrees
• Local versus global ambiguity.
• NP → Det Noun
• NP → NP PP
• Because of the top-down, depth-first, left-to-right policy, the parser builds trees that fail because they do not cover all of the input.
• Successive parses cover larger segments of the input, but these include structures that have already been built before.
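The repeated work can be measured with a call counter. The sketch below uses an assumed, non-left-recursive toy grammar (VP → V NP | V NP PP) so that a backtracking top-down parser terminates; the grammar, the sentence and all names are illustrative. The first VP rule parses "a flight" but stops short of the end of the input, so the second rule parses the same NP again. Caching results per (category, position) removes the repetition, which is the basic idea behind chart-based parsers.

```python
from collections import Counter

GRAMMAR = {
    "VP": [("V", "NP"), ("V", "NP", "PP")],
    "NP": [("Det", "Noun"), ("PropN",)],
    "PP": [("P", "NP")],
    "V": [("booked",)], "Det": [("a",)], "Noun": [("flight",)],
    "P": [("from",)], "PropN": [("Indianapolis",)],
}


def make_parser(tokens, memoize):
    """Return (parse, calls); `calls` counts real expansions per (symbol, position)."""
    calls = Counter()
    memo = {}

    def parse(symbol, pos):
        """Return the set of end positions for `symbol` starting at `pos`."""
        if symbol not in GRAMMAR:  # terminal
            return {pos + 1} if pos < len(tokens) and tokens[pos] == symbol else set()
        if memoize and (symbol, pos) in memo:
            return memo[(symbol, pos)]  # reuse the earlier result
        calls[(symbol, pos)] += 1
        ends = set()
        for rhs in GRAMMAR[symbol]:  # try each rule in order
            positions = {pos}
            for sym in rhs:
                positions = {e for p in positions for e in parse(sym, p)}
            ends |= positions
        if memoize:
            memo[(symbol, pos)] = ends
        return ends

    return parse, calls


tokens = "booked a flight from Indianapolis".split()
for memoize in (False, True):
    parse, calls = make_parser(tokens, memoize)
    accepted = len(tokens) in parse("VP", 0)
    print(memoize, accepted, calls[("NP", 1)])
```

Without the cache the NP starting at "a" is expanded twice; with it, once. On longer inputs with several PPs the saving compounds.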
Repeated Parsing of Subtrees
[NP [Det a] [Nom [Noun flight]]]
[NP [NP [Det a] [Nom [Noun flight]]] [PP [P from] [Noun Indianapolis]]]