1 / 16

Context Free Grammar (CFG): Specification for Structures & Constituency

NP. NP. PP. NP. girl. in. the. park. Context Free Grammar (CFG): Specification for Structures & Constituency. Parse Tree: graphical representation of structure root node (S): a sentencial level structure internal nodes: constituents of the sentence

haines
Download Presentation

Context Free Grammar (CFG): Specification for Structures & Constituency

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. NP NP PP NP girl in the park Context Free Grammar (CFG): Specification for Structures & Constituency • Parse Tree: graphical representation of structure • root node (S): a sentencial level structure • internal nodes: constituents of the sentence • arcs: relationship between parent nodes and their children (constituents) • terminal nodes: surface forms of the input symbols (e.g., words) • alternative representation: bracketed notation: • e.g., [I saw [the [girl [in [the park]]]]] • For example: Jing-Shin Chang

  2. S NP VP NP NP NP PP NP pron v det n p det n I saw the girl in the park Parse Tree: “I saw the girl in the park” Jing-Shin Chang

  3. CFG: Components • CFG: formal specification of parse trees • G = {, N, P, S} • : terminal symbols • N: non-terminal symbols • P: production rules • S: start symbol • : terminal symbols • the input symbols of the language • programming language: tokens (reserved words, variables, operators, …) • natural languages: words or parts of speech • pre-terminal: parts of speech (when words are regarded as terminals) • N: non-terminal symbols • groups of terminals and/or other non-terminals • S: start symbol: the largest constituent of a parse tree • P: production (re-writing) rules • form: α → β (α: non-terminal, β: string of terminals and non-terminals) • meaning: α re-writes to (“consists of”, “derived into”)β, or βreduced to α • start with “S-productions” (S → β) Jing-Shin Chang

  4. CFG: Example Grammar • Grammar Rules • S → NP VP • NP → Pron | Proper-Noun | Det Norm • Norm → Noun Norm | Noun • VP → Verb | Verb NP | Verb NP PP | Verb PP • PP → Prep NP • S: sentence, NP: noun phrase, VP: verb phrase • Pron: pronoun • Det: determiner, Norm: Norminal • PP: prepositional phrase, Prep: preposition • Lexicon (in CFG form) • Noun → girl | park | desk • Verb → like | want | is | saw | walk • Prep → by | in | with | for • Det → the | a | this | these • Pron → I | you | he | she | him • Proper-Noun → IBM | Microsoft | Berkeley Jing-Shin Chang

  5. CFG: Accepted Languages • CFG Operations • derivation: applying a production rule to re-write the LHS non-terminal into its constituents • rightmost derivation: a sequence of derivations in which the rightmost non-terminal is always re-write first • leftmost derivation: leftmost non-terminal first • Context-Free Language • Language accepted by a CFG • L(G) = {w | S =*=> w (strings of terminals that can be derived from start symbol)} Jing-Shin Chang

  6. CFG: Expressive Power • CFG vs. Regular Expression (R.E.) • every R.E. can be recognized by a FSA • every FSA can be represented by a CFG with production rules of the form: A -> a B | ε • therefore, L(RE) < L(CFG) • Writing a CFG for a FSA (RE) • define a non-terminal Ni for a state with state number i • start symbol S = N0 (assuming that state 0 is the initial state) • for each transition δ(i,a)=j (from state i to stet j on input alphabet a), add a new production Ni -> a Nj to P • for each final state i, add a new production Ni -> εto P Jing-Shin Chang

  7. a a b b 0 1 2 3 b CFG: Expressive Power (cont.) • Writing a CFG for a FSA (RE) • define a non-terminal Ni for a state with state number i • start symbol S = N0 (assuming that state 0 is the initial state) • for each transition δ(i,a)=j (from state i to stet j on input alphabet a), add a new production Ni -> a Nj to P • for each final state i, add a new production Ni -> εto P • For example: RE: (a|b)* a b b S -> a S | b S | a N1 N1 -> b N2 N2 -> b N3 N3 -> ε Jing-Shin Chang

  8. CFG: Expressive Power (cont.) • Chomsky Hierarchy: • R.E.: regular set (FSA) • CFG: context-free (pushdown automata) • CSG: context-sensitive (linear bounded automata) • unrestricted: recursively enumerable (Tuning Machine) Jing-Shin Chang

  9. CFG: Equivalence • Chomsky Normal Form (CNF) (Chmosky, 1963): • ε-free, and • Every production rule is in either of the following form: • A -> A1 A2 • A -> a (A1, A2: non-terminal, a: terminal) • two non-terminals or one terminal at the RHS • generate binary tree • good simplification for some algorithms (e.g., grammar training with the inside-outside algorithm (Baker 1979)) • Every CFG can be converted into a weakly equivalent CNF • equivalence: L(G1) = L(G2) • strong equivalent: assign the same phrase structure to each sentence (except for renaming non-terminals) • weak equivalent: do not assign the same phrase structure to each sentence • e.g., A -> B C D == {A -> B X, X -> CD} Jing-Shin Chang

  10. CFG vs. Finite-State Machine • Inappropriateness of FAS • Constituents • Recursion • RTN (Recursive Transition Network) • FSA with augmentation of recursion • arc: terminal or non-terminal • if arc is non-terminal: call to a sub-transition network & return upon traversal Jing-Shin Chang

  11. CFG for English • Sentence Level Constructions • Declarative (直述句): NP (Subject) VP • Imperative (命令句): VP • Yes-No Questions: Aux NP VP • WH-Questions: Wh-NP VP Jing-Shin Chang

  12. CFG for English • Noun Phrase • Head Noun • Modifiers: • pre-nominal (pre-head) and post-nominal (post-head) • Pre-nominal Modifiers: • pre-determiner: “all” • determiner: “the” • post-determiner: (ordinal) (cardinal) (quantifier) (ADJP) • ordinal: “first”,”second”,”next” • cardinal: “two”, ”three” • quantifier: “many”, “several” • Post-nominal Modifiers: • PP: prepositional phrase • non-finite clauses: VP(+ing), VP(+ed), VP(to-V) forms • relative clauses: restrictive, non-restrictive • the man whose son lives in NY (restrictive) • the man, whose son lives in NY (non-restrictive) Jing-Shin Chang

  13. CFG for English • Coordination (同位語, 對等連接詞,…) • conjunction (conj): and, or, but • X → X conj X • a big source of ambiguity: X can be almost anything • Comparison with Mathematic Operators • (left/right) association: ((a + b) + c), ( a ** (b ** c) ) • (high/low) precedence: a + b x c : (a + b) x c, a + (b * c) Jing-Shin Chang

  14. CFG for English • Agreement • Subject-Verb (or Aux. Verb): person & number • I like her • He likes her • Gender Agreement (German or French): ADJ-Noun, Det-Noun Jing-Shin Chang

  15. CFG for English • Verb Phrases & Subcategorization • not every verb is compatible with every verb phrases • + NP, +NP-NP, +to-V, +Ving… • e.g., transitive (Vt), intransitive (Vi) • subcat. frame for a verb: possible set of complements • CFG for SUBCAT Problems • Solution 1: • VP → v1 • VP → v2 NP • VP → v3 S • v1 → disappear | … • v2 → find | leave | repeat • v3 → think | believe | say • Solution 2: • VP → verb • VP → verb NP • VP → verb S • verb → disappear | … | find |… | think Jing-Shin Chang

  16. CFG for English • Auxiliaries • modal: “can”, “may”, “must”, “will” +V(stem) • perfect: “have” +V(pp) • progressive: “be” +V(ing) • passive: “be” +V(past) • Multiple Auxiliaries • modal < perfect < progressive < passive • modal perfect: “could have been …” • modal passive: “will be married …” • perfect progressive: “have been feasting …” • modal perfect passive: “might have been prevented …” Jing-Shin Chang

More Related