COP 4620 / 5625 Programming Language Translation / Compiler Writing Fall 2003 Lecture 5, 09/25/2003. Prof. Roy Levow. LL(1) Parser Generation. Roadmap LL(1) parsing LL(1) conflicts as an asset LL(1) conflicts as a liability The LL(1) push-down automaton LL(1) error handling LL gen.
COP 4620 / 5625 Programming Language Translation / Compiler Writing • Fall 2003 • Lecture 5, 09/25/2003 • Prof. Roy Levow
LL(1) Parser Generation • Roadmap • LL(1) parsing • LL(1) conflicts as an asset • LL(1) conflicts as a liability • The LL(1) push-down automaton • LL(1) error handling • LLgen
LL(1) Parsing • To choose which production to apply, we use FIRST and FOLLOW sets • FIRST set • Contains the tokens that can occur at the start of a string derived from an alternative • FOLLOW set • Contains the tokens that can immediately follow a non-terminal; it is needed to decide when to choose a nullable alternative
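As an illustration of how these sets can be obtained, here is a minimal Python sketch that computes FIRST and FOLLOW by iterating to a fixed point. The toy grammar, the names `GRAMMAR`, `EPS`, `END`, and the function names are assumptions made for this example; they do not come from the lecture.

```python
EPS = ""   # stands for the empty string (epsilon)
END = "$"  # end-of-input marker

# Hypothetical toy grammar: non-terminal -> list of alternatives (tuples of symbols).
GRAMMAR = {
    "expr": [("term", "rest")],
    "rest": [("+", "term", "rest"), ()],
    "term": [("IDENT",), ("(", "expr", ")")],
}

def first_of_sequence(symbols, first):
    """FIRST set of a symbol sequence; includes EPS if the whole sequence is nullable."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # a terminal starts itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def compute_first(grammar):
    """Grow the FIRST sets of all non-terminals until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_sequence(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(grammar, first, start):
    """Grow the FOLLOW sets of all non-terminals until nothing changes."""
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue  # FOLLOW is only defined for non-terminals
                    tail = first_of_sequence(alt[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:  # everything after sym can vanish
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, "expr")
```

For this toy grammar, `rest` is nullable, so the parser needs `FOLLOW["rest"]` to decide when to pick its empty alternative.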
LL(1) Conflicts • First/First conflict • Two alternatives of the same non-terminal have FIRST sets that share a token • First/Follow conflict • A non-null alternative has a FIRST token that is also in the FOLLOW set of a non-terminal with a nullable alternative • Two nullable alternatives • More than one alternative of the same non-terminal can derive ε • We do not consider Follow/Follow conflicts separately, since they also show up as one of the other conflict cases
Conditions to be LL(1) • No First/First conflicts • No First/Follow conflicts • No multiple nullable alternatives • A grammar that is not LL(1) can sometimes be turned into an LL(1) grammar by removing all of its conflicts
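The three conditions can be checked mechanically for each non-terminal. The sketch below assumes the FIRST set of every alternative and the FOLLOW set of the non-terminal have already been computed; `ll1_conflicts` and the tuple layout of its result are hypothetical names chosen for this example.

```python
from itertools import combinations

EPS = ""  # marks a nullable alternative inside a FIRST set

def ll1_conflicts(alt_firsts, follow):
    """Report LL(1) conflicts for one non-terminal.

    alt_firsts: list of FIRST sets, one per alternative (EPS in a set means nullable).
    follow:     FOLLOW set of the non-terminal.
    Returns a list of (kind, alternative index, other index, shared tokens).
    """
    conflicts = []
    # Pairwise checks: shared FIRST tokens and multiple nullable alternatives.
    for (i, a), (j, b) in combinations(enumerate(alt_firsts), 2):
        shared = (a & b) - {EPS}
        if shared:
            conflicts.append(("first/first", i, j, shared))
        if EPS in a and EPS in b:
            conflicts.append(("two nullable alternatives", i, j, set()))
    # A nullable alternative i is chosen on FOLLOW tokens, so no other
    # alternative j may start with one of them.
    for i, a in enumerate(alt_firsts):
        if EPS not in a:
            continue
        for j, b in enumerate(alt_firsts):
            if j != i:
                shared = (b - {EPS}) & follow
                if shared:
                    conflicts.append(("first/follow", j, i, shared))
    return conflicts

# Dangling else: else_tail -> 'else' stmt | ε, where 'else' can also follow else_tail.
dangling_else = ll1_conflicts([{"else"}, {EPS}], {"else", ";", "$"})

# term -> IDENT | IDENT '[' expr ']' : both alternatives start with IDENT.
indexed_term = ll1_conflicts([{"IDENT"}, {"IDENT"}], {"+", "$"})
```

An empty result for every non-terminal means the grammar meets the conditions listed above.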
Making a Grammar LL(1) • Left-factoring • Factor out the common prefix into a separate production • Example: term -> IDENT | IDENT '[' expr ']' | … becomes term -> IDENT ('[' expr ']')? | …
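A minimal sketch of one left-factoring pass, assuming alternatives are represented as tuples of symbols. Instead of the optional-group notation `( … )?` used on the slide, it introduces a helper rule with an empty alternative, which is equivalent; the helper naming scheme (`term_tail1`) and the `NUMBER` alternative are inventions of this example.

```python
def common_prefix(seqs):
    """Longest prefix shared by all symbol sequences in seqs."""
    prefix = []
    for column in zip(*seqs):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return tuple(prefix)

def left_factor(name, alternatives):
    """One pass of left-factoring for the rule `name`.

    Alternatives that start with the same symbol are merged: their common
    prefix stays in `name`, and the differing tails move to a new helper rule.
    Returns a dict holding the rewritten rule plus any helper rules.
    """
    groups = {}
    for alt in alternatives:
        key = alt[0] if alt else None  # None collects empty alternatives
        groups.setdefault(key, []).append(alt)

    rules = {name: []}
    for key, group in groups.items():
        if key is None or len(group) == 1:
            rules[name].extend(group)  # nothing to factor
            continue
        prefix = common_prefix(group)
        helper = f"{name}_tail{len(rules)}"  # hypothetical naming scheme
        rules[name].append(prefix + (helper,))
        rules[helper] = [alt[len(prefix):] for alt in group]
    return rules

factored = left_factor("term", [
    ("IDENT",),
    ("IDENT", "[", "expr", "]"),
    ("NUMBER",),
])
# factored["term"]       is [("IDENT", "term_tail1"), ("NUMBER",)]
# factored["term_tail1"] is [(), ("[", "expr", "]")]
```

A single pass may leave conflicts inside the helper rules, so in practice the function would be applied repeatedly until nothing changes.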
Making a Grammar LL(1).2 • Substitution • Replace a non-terminal on the RHS with its definition • Example (indirect First/First conflict): term -> IDENT | indexed | … indexed -> IDENT '[' expr ']' • Substitute indexed into term and then left-factor
Making a Grammar LL(1).3 • Left Recursion Removal • Direct: a -> a b • Indirect: a -> b c; b -> a d • Hidden: a -> c a; c -> ε • In many cases, left recursion can be replaced by right recursion either by using a standard transformation or by studying the construct and developing an alternative
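The standard transformation for direct left recursion rewrites a -> a α | β as a -> β a_tail and a_tail -> α a_tail | ε. A minimal sketch follows, with alternatives as tuples of symbols; the function name and the `_tail` suffix are assumptions of this example, and indirect or hidden left recursion is not handled.

```python
def remove_direct_left_recursion(name, alternatives):
    """Replace direct left recursion in rule `name` by right recursion.

    a -> a α | β   becomes   a -> β a_tail ;  a_tail -> α a_tail | ε
    Returns a dict with the rewritten rule and, if needed, the new tail rule.
    """
    # The α parts: what follows the leading recursive occurrence of `name`.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    # The β parts: alternatives that do not start with `name`.
    others = [alt for alt in alternatives if not alt or alt[0] != name]
    if not recursive:
        return {name: list(alternatives)}  # nothing to do
    tail = name + "_tail"  # hypothetical name for the new non-terminal
    return {
        name: [alt + (tail,) for alt in others],
        tail: [alt + (tail,) for alt in recursive] + [()],
    }

rewritten = remove_direct_left_recursion("expr", [
    ("expr", "+", "term"),
    ("term",),
])
# rewritten["expr"]      is [("term", "expr_tail")]
# rewritten["expr_tail"] is [("+", "term", "expr_tail"), ()]
```

The rewritten grammar accepts the same strings but groups operators to the right, which is one way restructuring can change the implied semantics.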
Correcting Semantic Effects • Restructuring a grammar may change the implied semantics, interfering with translation • Can sometimes resolve the problem by adding “marker rules” that are transformed along with the original grammar • See p.131
Automatic Conflict Resolution • Always choose first alternative • Use LL(2), look ahead an extra token • Dynamic resolution • Add decision rule to first alternative • Example: else-tail-option -> %if (1) 'else' stmt | ε
Table Driven LL(1) Parsing • Uses a push-down automaton • Finite State Automaton with a stack • Start with start symbol on stack • When processing a non-terminal • Pop non-terminal • Push selected RHS in reverse order • When processing a terminal • Match with input
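The push-down automaton described above can be sketched in a few lines of Python. The parse table below was written by hand for a toy grammar (expr -> term rest; rest -> '+' term rest | ε; term -> IDENT | '(' expr ')'); the grammar, `TABLE`, `NONTERMINALS`, and `parse` are assumptions of this example, not part of the lecture.

```python
END = "$"  # end-of-input marker, also used as the bottom-of-stack symbol

# Hypothetical LL(1) parse table: (non-terminal, look-ahead token) -> chosen RHS.
TABLE = {
    ("expr", "IDENT"): ("term", "rest"),
    ("expr", "("): ("term", "rest"),
    ("rest", "+"): ("+", "term", "rest"),
    ("rest", ")"): (),   # rest -> ε, selected on tokens in FOLLOW(rest)
    ("rest", END): (),
    ("term", "IDENT"): ("IDENT",),
    ("term", "("): ("(", "expr", ")"),
}
NONTERMINALS = {"expr", "rest", "term"}

def parse(tokens, start="expr"):
    """Return True if tokens form a sentence; raise SyntaxError otherwise."""
    tokens = list(tokens) + [END]
    pos = 0
    stack = [END, start]  # start with the start symbol on the stack
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"unexpected {look!r} while expanding {top}")
            # Push the RHS in reverse so its first symbol ends up on top.
            stack.extend(reversed(rhs))
        elif top == look:
            pos += 1  # terminal on the stack matches the input token
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r}")
    return True

accepted = parse(["IDENT", "+", "(", "IDENT", ")"])
```

This sketch only recognizes input; a real parser would also build a syntax tree or run actions while expanding each production.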
LL(1) Error Handling • Concerns • Avoid corrupt syntax tree • Avoid infinite loops • Possible approaches • Insert needed token • May loop • Skip input until match • Pop items from stack until match • May produce corrupt syntax tree
LL(1) Error Handling.2 • Acceptable set method is a combination of skipping input and popping items from stack • Discard input until acceptable continuation token is found • Resynchronize parser based on token found • Acceptable sets are restart points for each production • Set used is union for pending productions
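A heavily simplified sketch of the idea, assuming a hand-written table for a toy grammar (expr -> term rest; rest -> '+' term rest | ε; term -> IDENT | '(' expr ')'). On an error it takes the union of the tokens that can start any symbol still pending on the stack as the acceptable set, skips input until such a token appears, and then pops the stack until the top symbol can continue with that token. All names here are hypothetical, and because this version simply pops pending symbols it is a recognizer only; it makes no attempt to keep a syntax tree intact the way a full acceptable-set implementation would.

```python
END = "$"

# Hypothetical parse table: (non-terminal, look-ahead) -> chosen RHS.
TABLE = {
    ("expr", "IDENT"): ("term", "rest"),
    ("expr", "("): ("term", "rest"),
    ("rest", "+"): ("+", "term", "rest"),
    ("rest", ")"): (),
    ("rest", END): (),
    ("term", "IDENT"): ("IDENT",),
    ("term", "("): ("(", "expr", ")"),
}
# FIRST sets without ε; their keys double as the set of non-terminals.
FIRST = {"expr": {"IDENT", "("}, "rest": {"+"}, "term": {"IDENT", "("}}

def starts(symbol):
    """Tokens with which a pending stack symbol can continue."""
    return FIRST.get(symbol, {symbol})  # a terminal continues with itself

def parse_with_recovery(tokens, start="expr"):
    """Parse tokens and return a list of error messages (empty if none)."""
    tokens = list(tokens) + [END]
    pos, stack, errors = 0, [END, start], []
    while stack:
        top, look = stack[-1], tokens[pos]
        if top in FIRST and (top, look) in TABLE:
            stack.pop()
            stack.extend(reversed(TABLE[(top, look)]))
        elif top == look:
            stack.pop()
            pos += 1
        else:
            errors.append(f"unexpected {look!r} at position {pos}")
            # Acceptable set: union over everything still pending on the stack.
            # It always contains END, so the skip loop below terminates.
            acceptable = set().union(*(starts(s) for s in stack))
            while tokens[pos] not in acceptable:
                pos += 1   # discard input
            while tokens[pos] not in starts(stack[-1]):
                stack.pop()  # resynchronize on the token found
    return errors

clean = parse_with_recovery(["IDENT", "+", "IDENT"])
one_error = parse_with_recovery(["(", "IDENT", "+", ")"])
```

Each error either discards at least one token or pops at least one stack symbol, so the loop cannot run forever.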
LLgen • Parser generator from • Amsterdam Compiler Kit • Tanenbaum et al., 1983