Chapter 5: Grammars and Parsers
Prof. Chung

Outline
• The LL(1) Predict Function
• The LL(1) Parse Table
• Building Recursive Descent Parsers from LL(1) Tables
• An LL(1) Parser Driver
• LL(1) Action Symbols
• Making Grammars LL(1)
• The If-Then-Else Problem in LL(1) Parsing
The LL(1) Predict Function
• Given the productions
  • A → α1
  • A → α2
  • …
  • A → αn
• During a (leftmost) derivation
  • … A … ⇒ … α1 … or
  • … α2 … or
  • … αn …
• Deciding which production to match
  • Using lookahead symbols
The LL(1) Predict Function (1)
• The limitation of LL(1)
  • LL(1) contains exactly those grammars that have disjoint predict sets for productions that share a common left-hand side
• Single-symbol lookahead

Predict(A → X1 … Xm) =
    if λ ∈ First(X1 … Xm)
        (First(X1 … Xm) − {λ}) ∪ Follow(A)
    else
        First(X1 … Xm)
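The Predict definition above can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own code: the grammar encoding (a dict from nonterminal to a list of right-hand-side tuples, with `()` standing for λ), the `$` end marker, and the small expression grammar are all assumptions made for the example.

```python
EPS = "λ"  # marker for the empty string (assumed encoding)

def first_of(symbols, first):
    """First set of a symbol string, given First sets of single symbols."""
    result = set()
    for sym in symbols:
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)  # every symbol can derive λ, or the string is empty
    return result

def compute_first(grammar, terminals):
    first = {t: {t} for t in terminals}
    first.update({nt: set() for nt in grammar})
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for lhs, rhss in grammar.items():
            for rhs in rhss:
                new = first_of(rhs, first)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

def compute_follow(grammar, first, start, end="$"):
    follow = {nt: set() for nt in grammar}
    follow[start].add(end)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for lhs, rhss in grammar.items():
            for rhs in rhss:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:  # only nonterminals have Follow sets
                        continue
                    tail = first_of(rhs[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

def predict(lhs, rhs, first, follow):
    """Predict(A → X1 … Xm) exactly as defined on the slide."""
    f = first_of(rhs, first)
    if EPS in f:
        return (f - {EPS}) | follow[lhs]
    return f

# Hypothetical example grammar: E → T E',  E' → + T E' | λ,  T → id
grammar = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("id",)],
}
first = compute_first(grammar, {"+", "id"})
follow = compute_follow(grammar, first, start="E")
predict_sets = {(lhs, rhs): predict(lhs, rhs, first, follow)
                for lhs, rhss in grammar.items() for rhs in rhss}
```

In this example the two E' productions get the predict sets {+} and {$}, which are disjoint, so the grammar passes the LL(1) test for that nonterminal.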
The LL(1) Predict Function (2)
• An LL(1) parse table T : Vn × Vt → P ∪ {Error}
  • Vn is the set of nonterminals, Vt the set of terminals, and P the set of productions
• The definition of T
  • T[A][t] = A → X1 … Xm if t ∈ Predict(A → X1 … Xm);
  • T[A][t] = Error otherwise
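The table definition translates almost directly into code. The sketch below is illustrative only: it assumes the predict sets are already available as a dict keyed by (left-hand side, right-hand side), represents Error by `None`, and uses two small hypothetical grammars.

```python
ERROR = None  # stands for the Error entry of T (assumed representation)

def build_parse_table(predict_sets):
    """Fill T[A][t] from predict sets; report cells claimed by two productions."""
    table = {}
    conflicts = set()
    for (lhs, rhs), lookaheads in predict_sets.items():
        for t in lookaheads:
            if (lhs, t) in table and table[(lhs, t)] != rhs:
                conflicts.add((lhs, t))  # predict sets are not disjoint
            else:
                table[(lhs, t)] = rhs
    return table, conflicts

def lookup(table, nonterminal, terminal):
    """T[A][t]: a right-hand side, or ERROR when no production is predicted."""
    return table.get((nonterminal, terminal), ERROR)

# Hypothetical predict sets for E → T E',  E' → + T E' | λ,  T → id
expr_predict = {
    ("E", ("T", "E'")): {"id"},
    ("E'", ("+", "T", "E'")): {"+"},
    ("E'", ()): {"$"},
    ("T", ("id",)): {"id"},
}
expr_table, expr_conflicts = build_parse_table(expr_predict)

# Two productions with a common prefix are both predicted by "if"
if_predict = {
    ("stmt", ("if", "exp", "then", "stmt")): {"if"},
    ("stmt", ("if", "exp", "then", "stmt", "else", "stmt")): {"if"},
}
if_table, if_conflicts = build_parse_table(if_predict)
```

A grammar is LL(1) exactly when the conflict set comes back empty, which restates the disjoint-predict-set condition.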
The LL(1) Parse Table (1)
The LL(1) Parse Table (2)
The LL(1) Parse Table (3)
[Slide figures not recovered.]
[Slides 9 to 30: each slide repeats the Predict definition above and marks the parse table entries for one production, numbered 1 through 22; the table figures are not recovered.]
Building Recursive Descent Parsers from LL(1) Tables (1)
• Similar to the implementation of a scanner, there are two approaches
  • Built-in
    • The parsing decisions recorded in LL(1) tables can be hardwired into the parsing procedures used by recursive descent parsers
  • Table-driven
Building Recursive Descent Parsers from LL(1) Tables (2)
• The form of a parsing procedure
  • non_term()
    • non_term is the name of the nonterminal the parsing procedure handles
    • case TERMINAL_LIST
    • ……
    • default
  • statement()
    • case ID
    • case READ
    • case WRITE
    • default
• An algorithm can automatically create such parsing procedures
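A hardwired parsing procedure of the shape outlined above might look like the following sketch. The token names (`ASSIGN`, `LPAREN`, `SEMICOLON`, and so on), the simplified right-hand sides, and the `Parser` class are assumptions for illustration; the slide only fixes the case labels ID, READ, WRITE, and default.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOF"]  # token kinds only, for brevity
        self.pos = 0

    def next_token(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.next_token() != expected:
            raise ParseError(f"expected {expected}, found {self.next_token()}")
        self.pos += 1

    def statement(self):
        """One branch per production, selected by the lookahead token."""
        tok = self.next_token()
        if tok == "ID":        # statement → ID ASSIGN ID SEMICOLON (simplified)
            for t in ("ID", "ASSIGN", "ID", "SEMICOLON"):
                self.match(t)
        elif tok == "READ":    # statement → READ LPAREN ID RPAREN SEMICOLON
            for t in ("READ", "LPAREN", "ID", "RPAREN", "SEMICOLON"):
                self.match(t)
        elif tok == "WRITE":   # statement → WRITE LPAREN ID RPAREN SEMICOLON
            for t in ("WRITE", "LPAREN", "ID", "RPAREN", "SEMICOLON"):
                self.match(t)
        else:                  # default: no production is predicted
            raise ParseError(f"unexpected token {tok}")

parser = Parser(["READ", "LPAREN", "ID", "RPAREN", "SEMICOLON"])
parser.statement()
```

Each branch condition is one row of the LL(1) table written out as code, which is what "hardwired" means here.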
Building Recursive Descent Parsers from LL(1) Tables (3)
• Describing grammars
  • Data structure
    • Sets up the data structure that describes the grammar
• gen_actions()
  • Takes the grammar symbols and generates the actions necessary to match them in a recursive descent parse
• make_parsing_proc()
  • Generates a recursive descent parsing procedure for A
  • for each production where A is the LHS
  • for each terminal in the grammar
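The generator described above can be sketched as follows. The function names `gen_actions` and `make_parsing_proc` come from the slide, but their parameters, the grammar encoding, and the C-like text they emit (`match`, `next_token`, `syntax_error`) are assumptions for illustration, not the deck's actual code.

```python
def gen_actions(rhs, nonterminals):
    """One action per grammar symbol: call a procedure or match a token."""
    return [f"{sym}();" if sym in nonterminals else f"match({sym});"
            for sym in rhs]

def make_parsing_proc(lhs, grammar, predict_sets, terminals):
    """Emit the text of a parsing procedure for nonterminal lhs."""
    lines = [f"void {lhs}(void)", "{", "    switch (next_token()) {"]
    for rhs in grammar[lhs]:                      # each production with LHS lhs
        labels = [t for t in terminals            # each terminal in the grammar
                  if t in predict_sets[(lhs, rhs)]]
        if not labels:
            continue
        lines += [f"    case {t}:" for t in labels]
        lines += [f"        {action}" for action in gen_actions(rhs, grammar)]
        lines.append("        break;")
    lines += ["    default:", "        syntax_error();", "    }", "}"]
    return "\n".join(lines)

# Hypothetical grammar fragment; productions of the other nonterminals omitted
grammar = {
    "statement": [
        ("ID", "ASSIGN", "expression", "SEMICOLON"),
        ("READ", "LPAREN", "id_list", "RPAREN", "SEMICOLON"),
        ("WRITE", "LPAREN", "expr_list", "RPAREN", "SEMICOLON"),
    ],
    "expression": [], "id_list": [], "expr_list": [],
}
terminals = ["ID", "READ", "WRITE", "ASSIGN", "LPAREN", "RPAREN", "SEMICOLON"]
predict_sets = {
    ("statement", grammar["statement"][0]): {"ID"},
    ("statement", grammar["statement"][1]): {"READ"},
    ("statement", grammar["statement"][2]): {"WRITE"},
}
text = make_parsing_proc("statement", grammar, predict_sets, terminals)
```

Printing `text` gives a switch with the case labels ID, READ, WRITE, and default, the same outline the previous slide shows for statement().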
An LL(1) Parser Driver (1)
• Rather than using the LL(1) table to build parsing procedures, it is possible to use the table in conjunction with a driver program to form an LL(1) parser
• Smaller and faster than a corresponding recursive descent parser
• Changing a grammar and building a new parser is easy
  • New tables are computed and substituted for the old tables
• LL(1) driver
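A table-driven driver is short enough to show in full. The following is a minimal sketch under assumed conventions: the table maps (nonterminal, lookahead) pairs to right-hand-side tuples, a missing entry means Error, and `$` marks end of input. The expression table is a hypothetical example.

```python
def ll1_parse(tokens, table, nonterminals, start, end="$"):
    """Return True if tokens can be derived from start using the LL(1) table."""
    tokens = list(tokens) + [end]
    stack = [end, start]           # top of the parse stack is the list's end
    pos = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in nonterminals:
            rhs = table.get((top, lookahead))
            if rhs is None:        # Error entry: no production is predicted
                return False
            stack.extend(reversed(rhs))  # leftmost RHS symbol ends up on top
        elif top == lookahead:
            pos += 1               # terminal on the stack matches the input
        else:
            return False
    return pos == len(tokens)

# Hypothetical table for E → T E',  E' → + T E' | λ,  T → id
table = {
    ("E", "id"): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"),
    ("E'", "$"): (),
    ("T", "id"): ("id",),
}
nonterminals = {"E", "E'", "T"}
```

Only `table` and `nonterminals` depend on the grammar, so swapping in new tables changes the language without touching the driver.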
LL(1) Action Symbols (1)
• During parsing, the appearance of an action symbol in a production initiates the corresponding semantic action: a call to the corresponding semantic routine.
• The semantic routine calls pass no explicit parameters
  • Necessary parameters are transmitted through a semantic stack
  • Semantic stack ≠ parse stack
  • The semantic stack is a stack of semantic records
• Action symbols are pushed onto the parse stack
• LL(1) driver with action symbols
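To make the two stacks concrete, here is a small sketch of a driver that handles action symbols. Everything specific in it is an assumption: action symbols are written with a leading `#`, semantic records are plain integers, and the `Semantics` class with its `push_num` and `add` routines is invented for the example.

```python
class Semantics:
    """Semantic routines take no explicit parameters; they share a stack."""
    def __init__(self):
        self.semantic_stack = []
        self.last_value = None      # value of the most recently matched token

    def push_num(self):
        self.semantic_stack.append(self.last_value)

    def add(self):
        right = self.semantic_stack.pop()
        left = self.semantic_stack.pop()
        self.semantic_stack.append(left + right)

def parse_with_actions(tokens, table, nonterminals, sem, start, end="$"):
    tokens = list(tokens) + [(end, None)]   # tokens are (kind, value) pairs
    parse_stack = [end, start]
    pos = 0
    while parse_stack:
        top = parse_stack.pop()
        kind, value = tokens[pos]
        if top.startswith("#"):             # action symbol: call its routine
            getattr(sem, top[1:])()
        elif top in nonterminals:
            rhs = table.get((top, kind))
            if rhs is None:
                raise ValueError(f"syntax error at token {pos}")
            parse_stack.extend(reversed(rhs))
        elif top == kind:
            sem.last_value = value
            pos += 1
        else:
            raise ValueError(f"syntax error at token {pos}")
    return sem.semantic_stack

# Hypothetical grammar: E → T E',  E' → + T #add E' | λ,  T → num #push_num
table = {
    ("E", "num"): ("T", "E'"),
    ("E'", "+"): ("+", "T", "#add", "E'"),
    ("E'", "$"): (),
    ("T", "num"): ("num", "#push_num"),
}
tokens = [("num", 1), ("+", None), ("num", 2), ("+", None), ("num", 3)]
result = parse_with_actions(tokens, table, {"E", "E'", "T"}, Semantics(), "E")
```

The parse stack holds grammar and action symbols, while the semantic stack holds values; the two grow and shrink independently, which is the point of the comparison on the next slide.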
LL(1) Action Symbols (2)
• Compare
  • Parse stack
    • In recursive descent parsers, the parse stack is “hidden” in the procedure call stack of the running compiler.
  • Semantic stack
    • “Parse stack” extended.
    • The semantic stack can be hidden that way as well, by having each parsing routine return a semantic record.
Making Grammars LL(1)
• Not all grammars are LL(1).
  • LR
  • LR(1)
  • SLR(1)
• Some non-LL(1) grammars can be made LL(1) by simple modifications.
Making Grammars LL(1)
• When is a grammar not LL(1)?
  • When a parse table entry holds more than one production, for example T[<stmt>][ID] = 2, 5
• This is called a conflict: we do not know which production to use when <stmt> is on the stack top and ID is the next input token.
Making Grammars LL(1)
• Major LL(1) prediction conflicts
  • Common prefixes
  • Left recursion
• Common prefixes
  • <stmt> → if <exp> then <stmt>
  • <stmt> → if <exp> then <stmt> else <stmt>
• Solution: factoring transform (see next slide)
Making Grammars LL(1)

void factor (grammar *G)
{
    while (G has 2 or more productions with the same LHS and a common prefix) {
        Let S = { A → αβ, …, A → αξ } be the set of productions with the same
        left-hand side, A, and a common prefix, α;
        Create a new nonterminal, N;
        Replace S with the production set SET_OF (A → αN, N → β, …, N → ξ);
    }
}

Example:
• <stmt> → if <exp> then <stmt list> <if suffix>
• <if suffix> → end if;
• <if suffix> → else <stmt list> end if;
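The factoring pseudocode can be turned into a runnable sketch. The version below is one possible reading, with assumed details: productions are tuples of symbol names, productions are grouped by their first symbol, the longest prefix common to the whole group is factored out, and new nonterminals are named by appending `_suffix` (a made-up convention).

```python
def common_prefix(rhss):
    """Longest prefix shared by all right-hand sides in rhss."""
    prefix = []
    for column in zip(*rhss):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return tuple(prefix)

def factor_once(grammar):
    """Factor one group of productions sharing a prefix; True if one was found."""
    for lhs, rhss in list(grammar.items()):
        groups = {}
        for rhs in rhss:
            if rhs:  # λ productions have no prefix
                groups.setdefault(rhs[0], []).append(rhs)
        for group in groups.values():
            if len(group) >= 2:
                prefix = common_prefix(group)
                new_nt = f"{lhs}_suffix"       # hypothetical naming scheme
                while new_nt in grammar:
                    new_nt += "'"
                kept = [r for r in rhss if r not in group]
                grammar[lhs] = kept + [prefix + (new_nt,)]           # A → αN
                grammar[new_nt] = [r[len(prefix):] for r in group]   # N → β, …
                return True
    return False

def factor(grammar):
    result = {lhs: list(rhss) for lhs, rhss in grammar.items()}
    while factor_once(result):
        pass
    return result

# The slide's example before factoring, with tokens spelled as plain names
before = {
    "stmt": [
        ("if", "exp", "then", "stmt_list", "end", "if", ";"),
        ("if", "exp", "then", "stmt_list", "else", "stmt_list", "end", "if", ";"),
    ],
}
after = factor(before)
```

On this input the result has the same shape as the slide's factored grammar, with `stmt_suffix` playing the role of <if suffix>.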
Making Grammars LL(1)
• Grammars with a left-recursive production can never be LL(1)
  • A → A α
• Why?
  • Predicting A → A α consumes no input and leaves A as the top stack symbol again, and hence the same production would be predicted forever
Making Grammars LL(1)
• Solution:

void remove_left_recursion (grammar *G)
{
    while (G contains a left-recursive nonterminal) {
        Let S = { A → Aα, A → β, …, A → ζ } be the set of productions with the
        same left-hand side, A, where A is left-recursive;
        Create two new nonterminals, T and N;
        Replace S with the production set
        SET_OF (A → N T, N → β, …, N → ζ, T → α T, T → λ);
    }
}

[Figure: derivation trees for A before and after the transformation; not recovered.]
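As a rough illustration of the transformation, the sketch below handles immediate left recursion only (productions of the form A → A α); indirect left recursion is outside its scope. The grammar encoding and the `_head` and `_tail` names for the new nonterminals N and T are assumptions.

```python
def remove_left_recursion(grammar):
    """Rewrite A → A α | β as A → N T, N → β, T → α T | λ."""
    result = {}
    for a, rhss in grammar.items():
        alphas = [rhs[1:] for rhs in rhss if rhs and rhs[0] == a]   # the α parts
        betas = [rhs for rhs in rhss if not rhs or rhs[0] != a]     # β, …, ζ
        if not alphas:
            result[a] = list(rhss)   # A is not immediately left-recursive
            continue
        n, t = f"{a}_head", f"{a}_tail"   # hypothetical names for N and T
        result[a] = [(n, t)]
        result[n] = betas
        result[t] = [alpha + (t,) for alpha in alphas] + [()]  # () stands for λ
    return result

# Hypothetical example: E → E + T | T,  T → id
before = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",)],
}
after = remove_left_recursion(before)
```

For the example, `E` becomes `E_head E_tail`, with `E_head → T` and `E_tail → + T E_tail | λ`, so no production starts with its own left-hand side any more.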
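Making Grammars LL(1)
• Other transformations may be needed
  • A grammar with no common prefixes and no left recursion can still have a conflict
  • <stmt> → <label> <unlabeled stmt>
  • <label> → ID :
  • <label> → λ
  • <unlabeled stmt> → ID := <exp> ;
• Conflict on ID: productions 2 and 3 are both predicted, because ID begins <label> → ID : and ID can also follow <label> (it begins <unlabeled stmt>)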
Making Grammars LL(1)
• Example grammar after transformation:
  • <stmt> → ID <suffix>
  • <suffix> → : <unlabeled stmt>
  • <suffix> → := <exp> ;
  • <unlabeled stmt> → ID := <exp> ;
Making Grammars LL(1)
• In Ada, we may declare arrays as
  • A: array(I .. J, BOOLEAN)
• A straightforward grammar for array bounds
  • <array bound> → <expr> .. <expr>
  • <array bound> → ID
  • Not LL(1): an <expr> can itself begin with ID (as in I .. J), so ID predicts both productions
• Solution
  • <array bound> → <expr> <bound tail>
  • <bound tail> → .. <expr>
  • <bound tail> → λ
Making Grammars LL(1)
• Greibach Normal Form (GNF)
  • Every production is of the form A → aα
    • a is a terminal and α is a string (possibly empty) of symbols
  • Every context-free language L without λ can be generated by a grammar in Greibach Normal Form
  • Factoring of common prefixes is easy
• Given a grammar G, we can transform
  • G ⇒ GNF ⇒ no common prefixes, no left recursion (but still may not be LL(1))
The If-Then-Else Problem in LL(1) Parsing
• “Dangling else” problem in Algol 60, Pascal, and C
  • else clause is optional
• BL = { [^i ]^j | i ≥ j ≥ 0 }
  • [ ≡ if <expr> then <stmt>
  • ] ≡ else <stmt>
• BL is not LL(1) and in fact not LL(k) for any k
The If-Then-Else Problem in LL(1) Parsing
• First try
  • G1: S → [ S CL
        S → λ
        CL → ]
        CL → λ
• G1 is ambiguous: e.g., [[] has two parse trees, since the single ] can close either [
[Figure: the two parse trees for [[]; not recovered.]
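The ambiguity of G1 can be checked mechanically by counting parse trees. The sketch below is an illustration written for this purpose, not part of the deck: it counts, for a string over `[` and `]`, how many distinct parse trees G1 admits, using one function per nonterminal.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_cl(s):
    """Parse trees for s rooted at CL: CL → ] or CL → λ."""
    return 1 if s in ("", "]") else 0

@lru_cache(maxsize=None)
def count_s(s):
    """Parse trees for s rooted at S: S → [ S CL or S → λ."""
    total = 1 if s == "" else 0           # S → λ
    if s.startswith("["):                 # S → [ S CL
        rest = s[1:]
        for i in range(len(rest) + 1):    # split rest between S and CL
            total += count_s(rest[:i]) * count_cl(rest[i:])
    return total

trees_for_dangling = count_s("[[]")   # the slide's example
trees_for_balanced = count_s("[]")
```

A count above one for any string means the grammar is ambiguous; the two trees for `[[]` correspond to attaching the `]` to the inner or to the outer `[`.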
The If-Then-Else Problem in LL(1) Parsing
• Second try
  • G2: S → [ S
        S → S1
        S1 → [ S1 ]
        S1 → λ
• G2 is not ambiguous: e.g., [[] has exactly one parse tree
• The problem is
  • [ ∈ First([ S) and [ ∈ First(S1)
  • [[ ∈ First2([ S) and [[ ∈ First2(S1)
• G2 is not LL(1), nor is it LL(k) for any k.
[Figure: the parse tree for [[] under G2; not recovered.]
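Both claims about G2 can be illustrated with a short sketch. The functions below are written for this example only: they count parse trees under G2 and report which S production a derivation of a given string must start with. For each k tried, two strings share their first k symbols yet need different first productions, which is why k symbols of lookahead are not enough.

```python
def g2_count_s1(s):
    """Parse trees for s rooted at S1: S1 → [ S1 ] or S1 → λ."""
    if s == "":
        return 1
    if len(s) >= 2 and s[0] == "[" and s[-1] == "]":
        return g2_count_s1(s[1:-1])
    return 0

def g2_count_s(s):
    """Parse trees for s rooted at S: S → [ S or S → S1."""
    via_open = g2_count_s(s[1:]) if s.startswith("[") else 0
    return via_open + g2_count_s1(s)

def first_production(s):
    """The S production that starts the derivation of s, or None."""
    if s.startswith("[") and g2_count_s(s[1:]) > 0:
        return "S → [ S"
    if g2_count_s1(s) > 0:
        return "S → S1"
    return None

# For each k, two strings agree on their first k symbols
# but must start with different productions.
lookahead_clashes = []
for k in range(1, 6):
    unmatched = "[" * k                 # k opens, no closes
    matched = "[" * k + "]" * k         # k opens, k closes
    lookahead_clashes.append(
        (unmatched[:k] == matched[:k],
         first_production(unmatched),
         first_production(matched))
    )
```

The loop only samples k from 1 to 5; the same pair of strings works for every k, which is the informal argument behind "not LL(k) for any k".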