CFGs and PDAs
Sipser 2 (pages 99-115)
Context-free grammars
• A context-free grammar G is a quadruple (V, Σ, R, S), where
• V is a finite set called the variables
• Σ is a finite set, disjoint from V, called the terminals
• R is a finite subset of V × (V ∪ Σ)* called the rules
• S ∈ V is called the start symbol
• For any A ∈ V and u ∈ (V ∪ Σ)*, we write A →_G u whenever (A, u) ∈ R
Arithmetic expressions and parse trees
• Consider G = (V, Σ, R, S), where
• V = {<EXPR>, <TERM>, <FACTOR>}
• Σ = {a, +, ×, (, )}
• R = { <EXPR> →_G <EXPR>+<TERM> | <TERM>, <TERM> →_G <TERM>×<FACTOR> | <FACTOR>, <FACTOR> →_G (<EXPR>) | a }
• S = <EXPR>
• What about a × a + a?
Leftmost derivation
• A derivation of a string in a grammar is a leftmost derivation if, at every step, the leftmost remaining variable is the one replaced
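As an illustration, a leftmost derivation of a × a + a in the arithmetic-expression grammar could be sketched as below. The dictionary layout, the function names, and the convention of picking a rule by its position in the list are all assumptions made for this sketch; none of them come from the slides.

```python
# Rules of the arithmetic-expression grammar, keyed by variable.
# Each right-hand side is a tuple of symbols (variables or terminals).
RULES = {
    "<EXPR>": [("<EXPR>", "+", "<TERM>"), ("<TERM>",)],
    "<TERM>": [("<TERM>", "×", "<FACTOR>"), ("<FACTOR>",)],
    "<FACTOR>": [("(", "<EXPR>", ")"), ("a",)],
}


def leftmost_step(form, rule_index):
    """Rewrite the leftmost variable of `form` with its rule number `rule_index`."""
    for i, symbol in enumerate(form):
        if symbol in RULES:  # the first variable found is the leftmost one
            return form[:i] + RULES[symbol][rule_index] + form[i + 1:]
    raise ValueError("no variable left to replace")


def derive(start, choices):
    """Return every sentential form of the leftmost derivation given by `choices`."""
    forms = [(start,)]
    for rule_index in choices:
        forms.append(leftmost_step(forms[-1], rule_index))
    return forms


# Rule choices (hypothetical encoding) that derive a × a + a.
CHOICES = [0, 1, 0, 1, 1, 1, 1, 1]
DERIVATION = derive("<EXPR>", CHOICES)

for form in DERIVATION:
    print(" ".join(form))
```

Because `leftmost_step` always rewrites the first variable it finds, every derivation built this way is leftmost by construction.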
Needlessly complicated?
• How about just <EXPR> →_G <EXPR>+<EXPR> | <EXPR>×<EXPR> | (<EXPR>) | a
• A grammar G is ambiguous if some string w has two or more different leftmost derivations in G
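To see the ambiguity concretely, one can count parse trees (equivalently, leftmost derivations) of a string in the one-variable grammar. The following is a minimal sketch written only for that grammar; the function name and the span-based recursion are assumptions of this example, not part of the slides.

```python
from functools import lru_cache


def count_parse_trees(w):
    """Count parse trees of w under E -> E+E | E×E | (E) | a."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving the substring w[i:j] from E.
        if j - i < 1:
            return 0
        if j - i == 1:
            return 1 if w[i] == "a" else 0
        total = 0
        # Root rule E -> (E): the span must be wrapped in parentheses.
        if w[i] == "(" and w[j - 1] == ")":
            total += count(i + 1, j - 1)
        # Root rule E -> E+E or E -> E×E: try every operator as the split.
        for k in range(i + 1, j - 1):
            if w[k] in "+×":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(w))


print(count_parse_trees("a+a×a"))
print(count_parse_trees("(a+a)×a"))
```

A count above 1 for any string is enough to show that the grammar is ambiguous.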
Chomsky normal form
• A context-free grammar G is in Chomsky normal form if every rule is of the form
• A → BC
• A → a
• where A, B, C ∈ V, B ≠ S and C ≠ S, and a ∈ Σ
• We also permit the rule S → ε
Chomsky normal form
• Theorem 2.9: Any context-free language is generated by a context-free grammar in Chomsky normal form
• Proof:
• Make sure S appears only on the left
• Remove empty rules: A → ε
• Handle unit rules: A → B
• Fix all the rest…
• For instance:
• S →_G ASA | aA
• A →_G b | ε
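One way the four proof steps might play out on this example grammar is worked below. The names of the new variables (S_0, A_1, U) are assumptions chosen for this sketch; any fresh names would do.

```latex
\begin{align*}
\text{Original:}\quad
  & S \to ASA \mid aA, \qquad A \to b \mid \varepsilon \\
\text{1. New start variable:}\quad
  & S_0 \to S, \qquad S \to ASA \mid aA, \qquad A \to b \mid \varepsilon \\
\text{2. Remove } A \to \varepsilon\text{:}\quad
  & S_0 \to S, \qquad S \to ASA \mid SA \mid AS \mid S \mid aA \mid a, \qquad A \to b \\
\text{3. Remove unit rules:}\quad
  & S_0 \to ASA \mid SA \mid AS \mid aA \mid a, \\
  & S \to ASA \mid SA \mid AS \mid aA \mid a, \qquad A \to b \\
\text{4. Fix the rest:}\quad
  & S_0 \to AA_1 \mid SA \mid AS \mid UA \mid a, \\
  & S \to AA_1 \mid SA \mid AS \mid UA \mid a, \\
  & A \to b, \qquad A_1 \to SA, \qquad U \to a
\end{align*}
```

In step 3 the unit rule S → S is simply dropped, and S_0 → S is replaced by copies of the rules for S. No rule S_0 → ε is needed, since the original grammar does not generate the empty string.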
Balanced Brackets
• The grammar G = (V, Σ, R, S), where V = {S}, Σ = {[, ]}, and R = { S →_G ε, S →_G SS, S →_G [S] }, generates all strings of balanced brackets
• Is the language L(G) regular?
• Why or why not?
Recognizing Context-Free Languages
• Grammars are language generators. It is not immediately clear how they might be used as language recognizers.
• The language L(G) of balanced brackets is not regular. It cannot be recognized by a finite state automaton.
• However, it is very similar to the BEGIN…END blocks of many procedural languages and, therefore, must be recognizable by some compiler or interpreter!
Auxiliary storage
• We could recognize the language L(G) of balanced brackets by reading left to right, if we could remember left brackets along the way.
• In [[][[]]], each right bracket must match some left bracket read earlier.
Pushdown Automata
• Each right bracket matches the most recently read left bracket that is still unmatched. This suggests a stack storage mechanism.
[Figure: a finite control with a reading head on the input [ [ ] [ [ ] ] ] and a stack (pushdown store) holding [ [ [ above the bottom marker $]
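The stack idea can be sketched directly in code. This is a plain stack-based recognizer rather than a formal pushdown automaton, and the function name and the use of "$" as a bottom marker in a Python list are assumptions of this example.

```python
def is_balanced(s):
    """Return True if s is a balanced string over the alphabet {[, ]}."""
    stack = ["$"]  # "$" marks the bottom of the stack
    for ch in s:
        if ch == "[":
            stack.append("[")  # remember the left bracket
        elif ch == "]":
            if stack[-1] != "[":
                return False  # no unmatched left bracket to pair with
            stack.pop()  # match the most recent left bracket
        else:
            raise ValueError(f"symbol not in the alphabet: {ch!r}")
    # Accept only if every left bracket was matched.
    return stack == ["$"]


print(is_balanced("[[][[]]]"))
print(is_balanced("[[]"))
```

After reading the first five symbols of [[][[]]] the stack holds three left brackets above the marker.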
Formally…
• A pushdown automaton is a sextuple M = (Q, Σ, Γ, δ, q0, F), where
• Q is a finite set of states
• Σ is a finite alphabet (the input symbols)
• Γ is a finite alphabet (the stack symbols)
• δ: Q × Σ_ε × Γ_ε → P(Q × Γ_ε) is the transition function, where Σ_ε = Σ ∪ {ε} and Γ_ε = Γ ∪ {ε}
• q0 ∈ Q is the initial state, and
• F ⊆ Q is the set of accept states
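The definition can be exercised with a small nondeterministic simulator. The sketch below is only one possible encoding: the transition function is a Python dictionary from (state, input symbol or ε, popped symbol or ε) to a set of (state, pushed symbol or ε), the empty string stands for ε, and the state names q0, q1, q2 of the balanced-brackets machine are hypothetical. The search can fail to terminate on machines whose ε-moves grow the stack forever; the example machine has no such moves.

```python
from collections import deque

EPS = ""  # stands for ε


def accepts(delta, start, accept_states, w):
    """Breadth-first search over configurations (state, input position, stack)."""
    start_config = (start, 0, ())
    seen = {start_config}
    queue = deque([start_config])
    while queue:
        state, pos, stack = queue.popleft()
        if pos == len(w) and state in accept_states:
            return True
        # Either read nothing (ε) or read the next input symbol.
        reads = [EPS] + ([w[pos]] if pos < len(w) else [])
        # Either pop nothing (ε) or pop the top stack symbol.
        pops = [EPS] + ([stack[-1]] if stack else [])
        for a in reads:
            for x in pops:
                for new_state, push in delta.get((state, a, x), ()):
                    new_stack = stack[:-1] if x != EPS else stack
                    if push != EPS:
                        new_stack = new_stack + (push,)
                    new_pos = pos + (1 if a != EPS else 0)
                    config = (new_state, new_pos, new_stack)
                    if config not in seen:
                        seen.add(config)
                        queue.append(config)
    return False


# A PDA for balanced brackets (state names are assumptions of this sketch).
BRACKETS = {
    ("q0", EPS, EPS): {("q1", "$")},  # push the bottom marker
    ("q1", "[", EPS): {("q1", "[")},  # push each left bracket
    ("q1", "]", "["): {("q1", EPS)},  # pop one on each right bracket
    ("q1", EPS, "$"): {("q2", EPS)},  # marker on top: go to the accept state
}

print(accepts(BRACKETS, "q0", {"q2"}, "[[][[]]]"))
print(accepts(BRACKETS, "q0", {"q2"}, "[[]"))
```

Tracking a set of visited configurations keeps the search from revisiting the same configuration, which is what makes the nondeterministic choices manageable here.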