Context-Free Grammar Parsing by Message Passing
Paper by Dekang Lin and Randy Goebel
Presented by Matt Watkins
Context-Free Grammars
A context-free grammar is represented by a 4-tuple:
• Vt → A set of terminals
• Vn → A set of non-terminals
• P → A set of production rules
• S → A member of Vn, representing the starting non-terminal
Context-free grammars are used to describe the syntax of both programming languages and natural languages, among other things.
Context-Free Grammars
Example:
<Rs:S> → <NP><VP>
<Rnp1:NP> → <n>
<Rnp2:NP> → <d><n>
<Rnp3:NP> → <NP><PP>
<Rvp1:VP> → <VP><PP>
<Rvp2:VP> → <v><NP>
<Rpp:PP> → <p><NP>
<n> → "I"
<n> → "saw"
<n> → "man"
<n> → "park"
<v> → "saw"
<d> → "a"
<d> → "the"
<p> → "in"
Context-Free Grammars
Parse tree for "I saw a man in the park":
[S [NP [n I]] [VP [VP [v saw] [NP [d a] [n man]]] [PP [p in] [NP [d the] [n park]]]]]
Parsing Context-Free Grammars
Given only the definition of a context-free grammar, determine if a particular expression is a valid output of the grammar, and if so, how it is generated.
• Earley's parser
• CYK parser
• Message passing parser
Grammar Representation
The message passing algorithm represents a CFG as a 6-tuple <N, O, T, s, P, L>:
• N → set of non-terminals
• O → set of pre-terminals
• T → set of terminals
• s → start symbol
• P → production rules
• L → a lexicon consisting of pairs (w, p), with w ∈ T and p ⊆ O
Grammar Representation
N = {S, NP, VP, PP, n, v, p, d}
O = {n, v, p, d}
T = {I, saw, a, man, in, the, park}
s = S
P =
<Rs:S> → <NP><VP>
<Rnp1:NP> → <n>
<Rnp2:NP> → <d><n>
<Rnp3:NP> → <NP><PP>
<Rvp1:VP> → <VP><PP>
<Rvp2:VP> → <v><NP>
<Rpp:PP> → <p><NP>
L = { (I, {n}), (saw, {v, n}), (a, {d}), (man, {n}), (in, {p}), (the, {d}), (park, {n}) }
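The 6-tuple above can be written down directly as a data structure. The following is a minimal sketch, not the paper's implementation: the `Grammar` class, its field names, and the `is_consistent` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Grammar:
    nonterminals: frozenset  # N
    preterminals: frozenset  # O
    terminals: frozenset     # T
    start: str               # s
    rules: dict              # P: rule name -> (left-hand side, right-hand side)
    lexicon: dict            # L: word -> set of pre-terminals

GRAMMAR = Grammar(
    nonterminals=frozenset({"S", "NP", "VP", "PP", "n", "v", "p", "d"}),
    preterminals=frozenset({"n", "v", "p", "d"}),
    terminals=frozenset({"I", "saw", "a", "man", "in", "the", "park"}),
    start="S",
    rules={
        "Rs": ("S", ("NP", "VP")),
        "Rnp1": ("NP", ("n",)),
        "Rnp2": ("NP", ("d", "n")),
        "Rnp3": ("NP", ("NP", "PP")),
        "Rvp1": ("VP", ("VP", "PP")),
        "Rvp2": ("VP", ("v", "NP")),
        "Rpp": ("PP", ("p", "NP")),
    },
    lexicon={
        "I": {"n"}, "saw": {"v", "n"}, "a": {"d"}, "man": {"n"},
        "in": {"p"}, "the": {"d"}, "park": {"n"},
    },
)

def is_consistent(g):
    """Check the membership constraints stated in the 6-tuple definition."""
    lexicon_ok = all(w in g.terminals and p <= g.preterminals
                     for w, p in g.lexicon.items())
    rules_ok = all(lhs in g.nonterminals and
                   all(sym in g.nonterminals for sym in rhs)
                   for lhs, rhs in g.rules.values())
    return (g.preterminals <= g.nonterminals
            and g.start in g.nonterminals
            and lexicon_ok and rules_ok)
```

Keeping the rule names (Rs, Rnp1, ...) as dictionary keys mirrors the slide notation, where each rule name later becomes a PSR node in the network.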
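Message Passing Network
[Figure: network connecting the NT nodes S, NP, VP, PP, n, v, d, p to the PSR nodes Rs, Rnp1, Rnp2, Rnp3, Rvp1, Rvp2, Rpp. Each link from an NT node into a PSR node is numbered 0 or 1 according to the position of that symbol on the rule's right-hand side.]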
Message Passing Rules
• Non-terminal nodes are called NT nodes
• Phrase structure rule nodes are called PSR nodes
• Messages that are passed are integer pairs representing an interval in the expression being parsed:
  0 I 1 saw 2 a 3 man 4 in 5 the 6 park 7
• NT nodes and PSR nodes have different rules for receiving and sending messages
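To make the interval notation concrete, here is a small sketch that maps an integer pair back to the words it covers. The function name `span_text` is hypothetical, and whitespace tokenization is an assumption made only for this example.

```python
def span_text(sentence, interval):
    """Return the words of `sentence` covered by the interval (start, end).

    Positions fall between words, so word number i (counting from 0)
    occupies the interval (i, i + 1).
    """
    start, end = interval
    tokens = sentence.split()  # assumption: whitespace tokenization
    return " ".join(tokens[start:end])

SENTENCE = "I saw a man in the park"
```

With this convention, `span_text(SENTENCE, (2, 4))` gives `"a man"` and `span_text(SENTENCE, (0, 7))` gives the whole sentence.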
Message Passing Rules
• NT nodes:
  • Never send the same message twice
  • Always send all unique messages to parents
• PSR nodes:
  • Have a memory bank of pairs (I, n), where I is an interval and n is a link number
  • Store pairs where n = 0 in the memory bank
  • Combine pairs where applicable if n ≠ 0
  • A pair (I, n) is combined with (I´, n´) iff:
    • ∃ i, j, k such that I = {i, j} and I´ = {j, k}
    • n´ = n + 1
  • If n´ is the last link, send the combined interval {i, k} as a message to parents
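The PSR node rules can be sketched as a small class. This is an illustration under stated assumptions rather than the paper's implementation: the names `PSRNode`, `prefixes`, `inbox`, and `receive` are hypothetical, and the sketch also remembers messages that arrive on later links, which the slide does not mention, so that the result does not depend on arrival order.

```python
class PSRNode:
    """One phrase structure rule whose right-hand side has `num_links` symbols."""

    def __init__(self, name, num_links):
        self.name = name
        self.last = num_links - 1
        self.prefixes = set()  # (i, j, n): links 0..n together span i..j
        self.inbox = set()     # (j, k, n): the child on link n spans j..k

    def receive(self, interval, link):
        """Handle a message on `link`; return intervals to send to the parent."""
        j, k = interval
        if (j, k, link) in self.inbox:
            return []
        self.inbox.add((j, k, link))
        if link == 0:
            new = [(j, k, 0)]
        else:
            # Combine ({i, j}, link - 1) with the incoming ({j, k}, link).
            new = [(i, k, link) for (i, end, n) in self.prefixes
                   if end == j and n == link - 1]
        out = []
        while new:
            i, end, n = new.pop()
            if (i, end, n) in self.prefixes:
                continue
            self.prefixes.add((i, end, n))
            if n == self.last:
                out.append((i, end))  # the rule is complete over i..end
            else:
                # Assumption: retry children that arrived earlier on link n + 1.
                new.extend((i, e, n + 1) for (b, e, m) in self.inbox
                           if b == end and m == n + 1)
        return out
```

For example, a node for Rvp2 (VP → v NP) that receives `(1, 2)` on link 0 and then `(2, 4)` on link 1 returns `[(1, 4)]`, the interval it would pass up to VP.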
Message Passing Rules
• Use T to identify the locations of terminals in the expression to be parsed
• Use the lexicon to determine each terminal's parts of speech
• Pass a message to each of those parts of speech indicating the starting and ending position of the terminal in the expression
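The initialization step above can be sketched as a lexicon lookup per word. The function name `initial_messages` is hypothetical; the lexicon is the one from the Grammar Representation slide, and words missing from it are simply skipped, which is an assumption of this sketch.

```python
LEXICON = {
    "I": {"n"}, "saw": {"v", "n"}, "a": {"d"}, "man": {"n"},
    "in": {"p"}, "the": {"d"}, "park": {"n"},
}

def initial_messages(tokens, lexicon=LEXICON):
    """Return (pre-terminal, interval) pairs that seed the network."""
    messages = []
    for pos, word in enumerate(tokens):
        # An ambiguous word such as "saw" sends one message per part of speech.
        for pre in sorted(lexicon.get(word, ())):
            messages.append((pre, (pos, pos + 1)))
    return messages
```

Calling `initial_messages(["I", "saw"])` produces one message for "I" and two for "saw", since the lexicon lists "saw" as both a verb and a noun.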
Message Passing Example: Parsing "I"
• Step 1: the lexicon maps "I" to n, so node n receives the message {0,1}
• Step 2: n passes {0,1} to its parents; Rnp1, where n is link 0, stores ({0,1}, 0)
• Step 3: link 0 is the only link of Rnp1, so Rnp1 sends {0,1} to NP
• Step 4: NP passes {0,1} to its parents; Rs and Rnp3, where NP is link 0, each store ({0,1}, 0)
Completion
• Each node will contain a set of intervals that represent where in the expression the non-terminals can be found.
• After message passing has completed, if the expression is generated by the grammar, then the network will contain a packed parse forest.
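Putting the pieces together, the following sketch runs message passing to completion on the example grammar and reports the intervals collected at each NT node. It is only a recognizer under stated assumptions: it does not build the packed parse forest, the function name `collect_intervals` and the queue-based scheduling are hypothetical choices, and messages arriving on later links are remembered so that arrival order does not matter.

```python
from collections import defaultdict, deque

RULES = {
    "Rs": ("S", ("NP", "VP")), "Rnp1": ("NP", ("n",)),
    "Rnp2": ("NP", ("d", "n")), "Rnp3": ("NP", ("NP", "PP")),
    "Rvp1": ("VP", ("VP", "PP")), "Rvp2": ("VP", ("v", "NP")),
    "Rpp": ("PP", ("p", "NP")),
}
LEXICON = {
    "I": {"n"}, "saw": {"v", "n"}, "a": {"d"}, "man": {"n"},
    "in": {"p"}, "the": {"d"}, "park": {"n"},
}

def collect_intervals(tokens, rules=RULES, lexicon=LEXICON):
    """Return {NT node: set of intervals} once no messages remain."""
    # parents[X] lists the (rule, link number) pairs that symbol X feeds.
    parents = defaultdict(list)
    for name, (_, rhs) in rules.items():
        for link, sym in enumerate(rhs):
            parents[sym].append((name, link))
    seen = defaultdict(set)      # intervals each NT node has already sent
    prefixes = defaultdict(set)  # per rule: (i, j, n), links 0..n span i..j
    inbox = defaultdict(set)     # per rule: (j, k, n), child on link n spans j..k
    queue = deque((pre, (pos, pos + 1))
                  for pos, word in enumerate(tokens)
                  for pre in lexicon.get(word, ()))
    while queue:
        sym, (j, k) = queue.popleft()
        if (j, k) in seen[sym]:
            continue  # NT nodes never send the same message twice
        seen[sym].add((j, k))
        for rule, link in parents[sym]:
            inbox[rule].add((j, k, link))
            if link == 0:
                new = [(j, k, 0)]
            else:
                new = [(i, k, link) for (i, end, n) in prefixes[rule]
                       if end == j and n == link - 1]
            lhs, rhs = rules[rule]
            while new:
                i, end, n = new.pop()
                if (i, end, n) in prefixes[rule]:
                    continue
                prefixes[rule].add((i, end, n))
                if n == len(rhs) - 1:
                    queue.append((lhs, (i, end)))  # last link: notify parent
                else:
                    new.extend((i, e, n + 1) for (b, e, m) in inbox[rule]
                               if b == end and m == n + 1)
    return dict(seen)

def accepts(tokens, start="S"):
    """True when the start node holds an interval covering the whole input."""
    return (0, len(tokens)) in collect_intervals(tokens).get(start, set())
```

For "I saw a man in the park", the S node ends up holding the interval {0,7}, so the sentence is accepted, while VP holds both {1,4} ("saw a man") and {1,7} ("saw a man in the park").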
Completion Tested on SPARCstation SLC
Analysis
• Strengths
  • O(|G|n³) time complexity
  • Easily parallelizable
  • Can handle empty rules
• Weaknesses
  • Must convert some grammars in Backus-Naur form