170 likes | 427 Views
NP. NP. PP. NP. girl. in. the. park. Context Free Grammar (CFG): Specification for Structures & Constituency. Parse Tree: graphical representation of structure root node (S): a sentencial level structure internal nodes: constituents of the sentence
E N D
NP NP PP NP girl in the park Context Free Grammar (CFG): Specification for Structures & Constituency • Parse Tree: graphical representation of structure • root node (S): a sentencial level structure • internal nodes: constituents of the sentence • arcs: relationship between parent nodes and their children (constituents) • terminal nodes: surface forms of the input symbols (e.g., words) • alternative representation: bracketed notation: • e.g., [I saw [the [girl [in [the park]]]]] • For example: Jing-Shin Chang
S NP VP NP NP NP PP NP pron v det n p det n I saw the girl in the park Parse Tree: “I saw the girl in the park” Jing-Shin Chang
CFG: Components • CFG: formal specification of parse trees • G = {, N, P, S} • : terminal symbols • N: non-terminal symbols • P: production rules • S: start symbol • : terminal symbols • the input symbols of the language • programming language: tokens (reserved words, variables, operators, …) • natural languages: words or parts of speech • pre-terminal: parts of speech (when words are regarded as terminals) • N: non-terminal symbols • groups of terminals and/or other non-terminals • S: start symbol: the largest constituent of a parse tree • P: production (re-writing) rules • form: α → β (α: non-terminal, β: string of terminals and non-terminals) • meaning: α re-writes to (“consists of”, “derived into”)β, or βreduced to α • start with “S-productions” (S → β) Jing-Shin Chang
CFG: Example Grammar • Grammar Rules • S → NP VP • NP → Pron | Proper-Noun | Det Norm • Norm → Noun Norm | Noun • VP → Verb | Verb NP | Verb NP PP | Verb PP • PP → Prep NP • S: sentence, NP: noun phrase, VP: verb phrase • Pron: pronoun • Det: determiner, Norm: Norminal • PP: prepositional phrase, Prep: preposition • Lexicon (in CFG form) • Noun → girl | park | desk • Verb → like | want | is | saw | walk • Prep → by | in | with | for • Det → the | a | this | these • Pron → I | you | he | she | him • Proper-Noun → IBM | Microsoft | Berkeley Jing-Shin Chang
CFG: Accepted Languages • CFG Operations • derivation: applying a production rule to re-write the LHS non-terminal into its constituents • rightmost derivation: a sequence of derivations in which the rightmost non-terminal is always re-write first • leftmost derivation: leftmost non-terminal first • Context-Free Language • Language accepted by a CFG • L(G) = {w | S =*=> w (strings of terminals that can be derived from start symbol)} Jing-Shin Chang
CFG: Expressive Power • CFG vs. Regular Expression (R.E.) • every R.E. can be recognized by a FSA • every FSA can be represented by a CFG with production rules of the form: A -> a B | ε • therefore, L(RE) < L(CFG) • Writing a CFG for a FSA (RE) • define a non-terminal Ni for a state with state number i • start symbol S = N0 (assuming that state 0 is the initial state) • for each transition δ(i,a)=j (from state i to stet j on input alphabet a), add a new production Ni -> a Nj to P • for each final state i, add a new production Ni -> εto P Jing-Shin Chang
a a b b 0 1 2 3 b CFG: Expressive Power (cont.) • Writing a CFG for a FSA (RE) • define a non-terminal Ni for a state with state number i • start symbol S = N0 (assuming that state 0 is the initial state) • for each transition δ(i,a)=j (from state i to stet j on input alphabet a), add a new production Ni -> a Nj to P • for each final state i, add a new production Ni -> εto P • For example: RE: (a|b)* a b b S -> a S | b S | a N1 N1 -> b N2 N2 -> b N3 N3 -> ε Jing-Shin Chang
CFG: Expressive Power (cont.) • Chomsky Hierarchy: • R.E.: regular set (FSA) • CFG: context-free (pushdown automata) • CSG: context-sensitive (linear bounded automata) • unrestricted: recursively enumerable (Tuning Machine) Jing-Shin Chang
CFG: Equivalence • Chomsky Normal Form (CNF) (Chmosky, 1963): • ε-free, and • Every production rule is in either of the following form: • A -> A1 A2 • A -> a (A1, A2: non-terminal, a: terminal) • two non-terminals or one terminal at the RHS • generate binary tree • good simplification for some algorithms (e.g., grammar training with the inside-outside algorithm (Baker 1979)) • Every CFG can be converted into a weakly equivalent CNF • equivalence: L(G1) = L(G2) • strong equivalent: assign the same phrase structure to each sentence (except for renaming non-terminals) • weak equivalent: do not assign the same phrase structure to each sentence • e.g., A -> B C D == {A -> B X, X -> CD} Jing-Shin Chang
CFG vs. Finite-State Machine • Inappropriateness of FAS • Constituents • Recursion • RTN (Recursive Transition Network) • FSA with augmentation of recursion • arc: terminal or non-terminal • if arc is non-terminal: call to a sub-transition network & return upon traversal Jing-Shin Chang
CFG for English • Sentence Level Constructions • Declarative (直述句): NP (Subject) VP • Imperative (命令句): VP • Yes-No Questions: Aux NP VP • WH-Questions: Wh-NP VP Jing-Shin Chang
CFG for English • Noun Phrase • Head Noun • Modifiers: • pre-nominal (pre-head) and post-nominal (post-head) • Pre-nominal Modifiers: • pre-determiner: “all” • determiner: “the” • post-determiner: (ordinal) (cardinal) (quantifier) (ADJP) • ordinal: “first”,”second”,”next” • cardinal: “two”, ”three” • quantifier: “many”, “several” • Post-nominal Modifiers: • PP: prepositional phrase • non-finite clauses: VP(+ing), VP(+ed), VP(to-V) forms • relative clauses: restrictive, non-restrictive • the man whose son lives in NY (restrictive) • the man, whose son lives in NY (non-restrictive) Jing-Shin Chang
CFG for English • Coordination (同位語, 對等連接詞,…) • conjunction (conj): and, or, but • X → X conj X • a big source of ambiguity: X can be almost anything • Comparison with Mathematic Operators • (left/right) association: ((a + b) + c), ( a ** (b ** c) ) • (high/low) precedence: a + b x c : (a + b) x c, a + (b * c) Jing-Shin Chang
CFG for English • Agreement • Subject-Verb (or Aux. Verb): person & number • I like her • He likes her • Gender Agreement (German or French): ADJ-Noun, Det-Noun Jing-Shin Chang
CFG for English • Verb Phrases & Subcategorization • not every verb is compatible with every verb phrases • + NP, +NP-NP, +to-V, +Ving… • e.g., transitive (Vt), intransitive (Vi) • subcat. frame for a verb: possible set of complements • CFG for SUBCAT Problems • Solution 1: • VP → v1 • VP → v2 NP • VP → v3 S • v1 → disappear | … • v2 → find | leave | repeat • v3 → think | believe | say • Solution 2: • VP → verb • VP → verb NP • VP → verb S • verb → disappear | … | find |… | think Jing-Shin Chang
CFG for English • Auxiliaries • modal: “can”, “may”, “must”, “will” +V(stem) • perfect: “have” +V(pp) • progressive: “be” +V(ing) • passive: “be” +V(past) • Multiple Auxiliaries • modal < perfect < progressive < passive • modal perfect: “could have been …” • modal passive: “will be married …” • perfect progressive: “have been feasting …” • modal perfect passive: “might have been prevented …” Jing-Shin Chang