Contents • Introduction • A Simple Compiler • Scanning – Theory and Practice • Grammars and Parsing • LL(1) Parsing • Lex and yacc • LR Parsing • Semantic Processing • Symbol Tables • Run-time Storage Organization • Code Generation and Local Code Optimization • Global Optimization
Outline • Shift-Reduce Parsers • LR Parsers: LR(0) • LR(1) Parsing • SLR(1) Parsing • LALR(1) Parsing
Shift-Reduce Parsers • The fundamental concern of a bottom-up parser is deciding when what looks like the RHS of a production can be replaced by its LHS.
Shift-Reduce Parsers • A shift-reduce parser works as follows: • A parse stack, initially empty, contains symbols already parsed. • The parse stack concatenated with the remaining input always represents a right sentential form. • Tokens are shifted onto the stack until the top of the stack contains the handle of the sentential form. • Recall that a handle is a sequence of symbols that matches some production's RHS and that may be correctly replaced with that production's LHS. • The handle is reduced by replacing it on the parse stack with the nonterminal that is its parent in the parse tree.
Shift-Reduce Parsers • The grammar G0 • A shift-reduce parser
Shift-Reduce Parsers • The driver utilizes a parse stack that contains parse states, usually coded as integers. • The driver uses two tables, action and go_to. • The action table tells the parser whether to shift, reduce, terminate successfully, or signal a syntax error. • The go_to table defines successor states after a token or LHS is matched and shifted.
Shift-Reduce Parsers • The action
Shift-Reduce Parsers • The go_to
LR Parsers • LR parsers are characterized by the number of lookahead symbols that are examined to determine parsing actions. • They are written LR(k), where k is the lookahead size.
LR Parsers: LR(0) • LR(0) Parsing
LR Parsers: LR(0) • We begin parsing with a configuration set. (Figure labels: start symbol; the end marker.)
Canonical LR Parsing • Only FIRST sets need to be computed; FOLLOW sets are not required. • In canonical LR parsing, historical (left-context) information must also be built into each state Ii in order to resolve shift/reduce conflicts. • The following example illustrates the construction of a canonical LR parsing table.

G': (0) S' ::= S   (1) S ::= CC   (2) C ::= cC   (3) C ::= d

LR(1) item sets:

I0: S' ::= .S   , $
    S  ::= .CC  , $      (lookahead from FIRST($))
    C  ::= .cC  , c/d    (lookahead from FIRST(C$))
    C  ::= .d   , c/d
I1: S' ::= S.   , $
I2: S  ::= C.C  , $
    C  ::= .cC  , $
    C  ::= .d   , $
I3: C  ::= c.C  , c/d
    C  ::= .cC  , c/d
    C  ::= .d   , c/d
I4: C  ::= d.   , c/d
I5: S  ::= CC.  , $
I6: C  ::= c.C  , $
    C  ::= .cC  , $
    C  ::= .d   , $
I7: C  ::= d.   , $
I8: C  ::= cC.  , c/d
I9: C  ::= cC.  , $
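The item sets for G' can be reproduced mechanically. The following is a minimal Python sketch, not a general-purpose generator: the function names and the item representation `(lhs, rhs, dot, lookahead)` are illustrative choices, and `first_of` assumes a grammar with no empty productions and no indirect left recursion, which holds for G'.

```python
# Sketch: canonical LR(1) item sets for G'. All names are illustrative assumptions.
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]
NONTERMINALS = {"S'", "S", "C"}


def first_of(symbols):
    """FIRST of a symbol string. Valid here because G' has no empty productions."""
    head = symbols[0]
    if head not in NONTERMINALS:
        return {head}
    result = set()
    for lhs, rhs in GRAMMAR:
        if lhs == head and rhs[0] != head:  # skip direct left recursion
            result |= first_of(rhs)
    return result


def closure(items):
    """Add [B ::= .gamma, b] for every [A ::= alpha.B beta, a] and b in FIRST(beta a)."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, lookahead = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for b in first_of(rhs[dot + 1:] + (lookahead,)):
                for lhs2, rhs2 in GRAMMAR:
                    item = (lhs2, rhs2, 0, b)
                    if lhs2 == rhs[dot] and item not in items:
                        items.add(item)
                        work.append(item)
    return frozenset(items)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close the result."""
    moved = {(l, r, d + 1, la) for (l, r, d, la) in items
             if d < len(r) and r[d] == symbol}
    return closure(moved)


def canonical_collection():
    states = [closure({("S'", ("S",), 0, "$")})]
    for state in states:  # the list grows while we iterate over it
        for symbol in ("S", "C", "c", "d"):
            nxt = goto(state, symbol)
            if nxt and nxt not in states:
                states.append(nxt)
    return states
```

With the symbol order S, C, c, d used in the sketch, the states come out in the same order as I0 to I9 on the slide. A combined lookahead such as `c/d` is stored as two separate items, one per lookahead symbol.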
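Canonical parsing table for grammar G'

State |      action       |  goto
      |   c     d     $   |  S    C
------+-------------------+---------
  0   |  s3    s4         |  1    2
  1   |              acc  |
  2   |  s6    s7         |       5
  3   |  s3    s4         |       8
  4   |  r3    r3         |
  5   |              r1   |
  6   |  s6    s7         |       9
  7   |              r3   |
  8   |  r2    r2         |
  9   |              r2   |

(si = shift and go to state i; r1 = reduce by S ::= CC, r2 = reduce by C ::= cC, r3 = reduce by C ::= d)

Every SLR(1) grammar is an LR(1) grammar, but for an SLR(1) grammar the canonical LR parser may have more states than the SLR parser for the same grammar. The grammar of the previous example is SLR and has an SLR parser with seven states, compared with the ten shown above.

In terms of the number of states, the canonical LR parser is simply too large and is therefore often impractical to implement, while the SLR parser is not powerful enough. The LALR parser was introduced to fill this gap: it has exactly the same number of states as the SLR parser, yet shift/reduce conflicts arise less often.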
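To see how the driver uses these two tables, the canonical table for G' can be transcribed into dictionaries and fed to a small loop. This is a sketch under assumptions: the names `ACTION`, `GOTO`, `PRODUCTIONS`, and `parse` are illustrative, tokens are single characters, and the function returns the sequence of production numbers it reduced by rather than a parse tree.

```python
# Sketch: table-driven shift-reduce driver for G', using the canonical table above.
PRODUCTIONS = {1: ("S", 2), 2: ("C", 2), 3: ("C", 1)}  # number -> (LHS, RHS length)

ACTION = {
    (0, "c"): "s3", (0, "d"): "s4",
    (1, "$"): "acc",
    (2, "c"): "s6", (2, "d"): "s7",
    (3, "c"): "s3", (3, "d"): "s4",
    (4, "c"): "r3", (4, "d"): "r3",
    (5, "$"): "r1",
    (6, "c"): "s6", (6, "d"): "s7",
    (7, "$"): "r3",
    (8, "c"): "r2", (8, "d"): "r2",
    (9, "$"): "r2",
}
GOTO = {(0, "S"): 1, (0, "C"): 2, (2, "C"): 5, (3, "C"): 8, (6, "C"): 9}


def parse(tokens):
    """Return the production numbers used, in reduction order, or raise SyntaxError."""
    stack = [0]                      # parse stack of states, start state on the bottom
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:              # blank table entry: syntax error
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if act == "acc":
            return reductions
        number = int(act[1:])
        if act[0] == "s":            # shift: push the successor state, consume the token
            stack.append(number)
            pos += 1
        else:                        # reduce: pop |RHS| states, then consult go_to
            lhs, length = PRODUCTIONS[number]
            del stack[-length:]
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(number)


print(parse("cdd"))  # [3, 2, 3, 1]
```

The printed list is a rightmost derivation in reverse: C ::= d, C ::= cC, C ::= d, S ::= CC. An input such as `"d"` hits a blank entry (state 4 on `$`) and raises `SyntaxError`.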
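LR Parsers: SLR(1) • Procedure for constructing the SLR parsing table: • (1) Add the production (0) S' → S, where S is the start symbol of the grammar, to grammar G; call the new grammar G'. • (2) Compute the closure, closure(I), of an item set I. • (3) For each grammar symbol X, compute I' = goto(I, X) and take its closure, closure(I'). • (4) Repeat the previous step until no new goto sets are produced; the result is the collection of item sets. • (5) Compute the FIRST and FOLLOW sets of all nonterminals. • (6) Build the parsing table from the item sets and the FOLLOW sets.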
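The six steps can be sketched in Python for the grammar G' (S' ::= S, S ::= CC, C ::= cC, C ::= d). This is an illustrative sketch, not a full generator: all function names are assumptions, LR(0) items are stored as `(production index, dot position)`, and `first` relies on the grammar having no empty productions.

```python
# Sketch: SLR(1) table construction for G', following steps (1) to (6).
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]  # step 1
NONTERMINALS = {"S'", "S", "C"}
SYMBOLS = ("S", "C", "c", "d")


def closure(items):                                   # step 2
    items, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        rhs = GRAMMAR[p][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for q, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (q, 0) not in items:
                    items.add((q, 0))
                    work.append((q, 0))
    return frozenset(items)


def goto(items, symbol):                              # step 3
    return closure({(p, d + 1) for p, d in items
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol})


def item_sets():                                      # step 4
    states = [closure({(0, 0)})]
    for state in states:  # the list grows while we iterate over it
        for x in SYMBOLS:
            nxt = goto(state, x)
            if nxt and nxt not in states:
                states.append(nxt)
    return states


def first(symbol):                                    # step 5 (no empty productions)
    if symbol not in NONTERMINALS:
        return {symbol}
    return set().union(*(first(rhs[0]) for lhs, rhs in GRAMMAR
                         if lhs == symbol and rhs[0] != symbol))


def follow_sets():                                    # step 5
    follow = {n: set() for n in NONTERMINALS}
    follow["S'"].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym in NONTERMINALS:
                    new = first(rhs[i + 1]) if i + 1 < len(rhs) else follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def build_table():                                    # step 6
    states, follow = item_sets(), follow_sets()
    action, goto_table = {}, {}

    def put(key, entry):
        if action.setdefault(key, entry) != entry:    # two different entries: not SLR(1)
            raise ValueError(f"conflict at {key}")

    for i, state in enumerate(states):
        for x in SYMBOLS:
            nxt = goto(state, x)
            if nxt and x in NONTERMINALS:
                goto_table[i, x] = states.index(nxt)
            elif nxt:
                put((i, x), f"s{states.index(nxt)}")
        for p, dot in state:
            lhs, rhs = GRAMMAR[p]
            if dot == len(rhs):                       # completed item: reduce on FOLLOW(lhs)
                for a in follow[lhs]:
                    put((i, a), "acc" if p == 0 else f"r{p}")
    return states, action, goto_table
```

For G' this yields seven LR(0) states. Note that the reduce entries for C ::= d appear under c, d, and $ alike, because SLR(1) uses FOLLOW(C) = {c, d, $} wherever the item is complete.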
LR Parsers: SLR(1) • LR(1) is very powerful. • The goto and action tables of LR(1) are often too big because the machine has too many states. • Two alternatives to LR(1) • SLR(1): LR(0) machine + lookahead • LALR(1): merge states of LR(1) machine • The lookahead in an LR(1) machine is computed from the context, whereas the lookahead in SLR(1) is taken from the Follow sets. • LR(1) lookaheads are more precise.
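The LALR(1) merging step can be illustrated on the ten LR(1) states of the example grammar G' (S' ::= S, S ::= CC, C ::= cC, C ::= d). The sketch below is an assumption-laden illustration: the states are transcribed by hand as `(item, lookaheads)` pairs, `merge_by_core` is a hypothetical helper name, and "merge" is taken in the usual sense of uniting states whose items are identical once lookaheads are ignored.

```python
# Sketch: merging LR(1) states with the same core to obtain LALR(1) states.
from collections import defaultdict

LR1_STATES = {
    0: {("S' ::= .S", "$"), ("S ::= .CC", "$"), ("C ::= .cC", "c/d"), ("C ::= .d", "c/d")},
    1: {("S' ::= S.", "$")},
    2: {("S ::= C.C", "$"), ("C ::= .cC", "$"), ("C ::= .d", "$")},
    3: {("C ::= c.C", "c/d"), ("C ::= .cC", "c/d"), ("C ::= .d", "c/d")},
    4: {("C ::= d.", "c/d")},
    5: {("S ::= CC.", "$")},
    6: {("C ::= c.C", "$"), ("C ::= .cC", "$"), ("C ::= .d", "$")},
    7: {("C ::= d.", "$")},
    8: {("C ::= cC.", "c/d")},
    9: {("C ::= cC.", "$")},
}


def merge_by_core(states):
    """Group states by core (items without lookaheads) and union their lookaheads."""
    groups = defaultdict(list)
    for number, items in states.items():
        core = frozenset(item for item, _ in items)
        groups[core].append(number)

    merged = {}
    for numbers in groups.values():
        lookaheads = defaultdict(set)
        for n in numbers:
            for item, la in states[n]:
                lookaheads[item] |= set(la.split("/"))
        merged[tuple(sorted(numbers))] = dict(lookaheads)
    return merged


merged = merge_by_core(LR1_STATES)
print(sorted(merged))
# [(0,), (1,), (2,), (3, 6), (4, 7), (5,), (8, 9)]
```

The pairs (3, 6), (4, 7), and (8, 9) collapse, leaving seven states. After merging, C ::= d. carries the lookaheads c, d, and $, which for this particular grammar coincides with FOLLOW(C); in general the merged LR(1) lookaheads can be a strict subset of the Follow set.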