Semantic Analysis Mooly Sagiv html://www.cs.tau.ac.il/~msagiv/courses/wcc03.html
Outline
• What is Semantic Analysis?
• Why is it needed?
• Syntax directed translations/attribute grammars (Chapter 3)
Semantic Analysis
• The “meaning of the program”
• Requirements related to the “context” in which a construct occurs
• Context-sensitive requirements
  • cannot be specified using a context free grammar, or
  • require complicated and unnatural context free grammars
• Guides subsequent phases
Basic Compiler Phases
Source program (string)
  Front-End: lexical analysis → Tokens → syntax analysis → Abstract syntax tree → semantic analysis
  Back-End
Fin. Assembly
Example Semantic Condition • In C • break statements can only occur inside switch or loop statements
Partial Grammar for C
Stm → Exp ;
Stm → if (Exp) Stm
Stm → if (Exp) Stm else Stm
Stm → while (Exp) do Stm
Stm → break ;
Stm → { StList }
StList → StList Stm
StList → ε
Refined Grammar for C
Stm → Exp ;
Stm → if (Exp) Stm
Stm → if (Exp) Stm else Stm
Stm → while (Exp) do LStm
Stm → { StList }
StList → StList Stm
StList → ε
LStm → Exp ;
LStm → if (Exp) LStm
LStm → if (Exp) LStm else LStm
LStm → while (Exp) do LStm
LStm → { LStList }
LStm → break ;
LStList → LStList LStm
LStList → ε
A Possible Abstract Syntax for C
typedef struct A_St_ *A_St;
struct A_St_ {
    enum {A_if, A_while, A_break, A_block, ...} kind;
    A_pos pos;
    union {
        struct { A_Exp e; A_St st1; A_St st2; } if_st;
        struct { A_Exp e; A_St st; } while_st;
        struct { A_St st1; A_St st2; } block_st;
        ...
    } u;
};
A_St A_IfStm(A_Exp, A_St, A_St);
A_St A_WhileStm(A_Exp, A_St);
A_St A_BreakStm(void);
A_St A_BlockStm(A_St, A_St);
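The constructor prototypes above could be implemented roughly as follows. This is only a sketch under assumptions that are not on the slide: `A_Exp` is treated as an opaque pointer, `A_pos` is assumed to be a line number, the elided `...` members are left out, and `new_st` is a hypothetical helper. The slide's constructors take no position argument, so `pos` is simply left at 0 here.

```c
#include <stdlib.h>

typedef struct A_Exp_ *A_Exp;   /* assumption: opaque, defined elsewhere */
typedef int A_pos;              /* assumption: a position is a line number */

typedef struct A_St_ *A_St;
struct A_St_ {
    enum { A_if, A_while, A_break, A_block } kind;
    A_pos pos;
    union {
        struct { A_Exp e; A_St st1; A_St st2; } if_st;
        struct { A_Exp e; A_St st; } while_st;
        struct { A_St st1; A_St st2; } block_st;
    } u;
};

/* Hypothetical helper: allocate a zeroed node of the given kind.
   Aborts when memory is exhausted. */
static A_St new_st(int kind)
{
    A_St s = calloc(1, sizeof *s);
    if (s == NULL)
        abort();
    s->kind = kind;
    return s;
}

/* st2 is NULL when the if statement has no else branch. */
A_St A_IfStm(A_Exp e, A_St st1, A_St st2)
{
    A_St s = new_st(A_if);
    s->u.if_st.e = e;
    s->u.if_st.st1 = st1;
    s->u.if_st.st2 = st2;
    return s;
}

A_St A_WhileStm(A_Exp e, A_St st)
{
    A_St s = new_st(A_while);
    s->u.while_st.e = e;
    s->u.while_st.st = st;
    return s;
}

A_St A_BreakStm(void)
{
    return new_st(A_break);
}

/* A statement list is a chain of block nodes; NULL is the empty list. */
A_St A_BlockStm(A_St st1, A_St st2)
{
    A_St s = new_st(A_block);
    s->u.block_st.st1 = st1;
    s->u.block_st.st2 = st2;
    return s;
}
```

For instance, `A_IfStm(e, A_WhileStm(e, A_BreakStm()), NULL)` would build the tree for an if statement whose then-branch is a loop containing a break.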
Partial Bison Specification
stm : IF '(' exp ')' stm            { $$ = A_IfStm($3, $5, NULL); }
    | IF '(' exp ')' stm ELSE stm   { $$ = A_IfStm($3, $5, $7); }
    | WHILE '(' exp ')' stm         { $$ = A_WhileStm($3, $5); }
    | '{' stmList '}'               { $$ = $2; }
    | BREAK ';'                     { $$ = A_BreakStm(); }
    ;
stmList : stmList stm               { $$ = A_BlockStm($1, $2); }
        | /* empty */               { $$ = NULL; }
        ;
A Semantic Check (on the abstract syntax tree)
void check_break(A_St st)
{
    if (st == NULL) return;   /* missing else branch or empty statement list */
    switch (st->kind) {
    case A_if:
        check_break(st->u.if_st.st1);
        check_break(st->u.if_st.st2);
        break;
    case A_while:             /* any break below this node is inside a loop */
        break;
    case A_break:
        error("Break must be enclosed within a loop", st->pos);
        break;
    case A_block:
        check_break(st->u.block_st.st1);
        check_break(st->u.block_st.st2);
        break;
    }
}
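A self-contained variant of this traversal may make the idea easier to try out. The sketch below is not the slide's code: it uses a simplified, hypothetical `St` node type, counts misplaced breaks instead of calling `error`, and passes an explicit `in_loop` flag down the tree rather than stopping at loop nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified AST: only the statement kinds the check needs. */
typedef enum { ST_EXP, ST_IF, ST_WHILE, ST_BREAK, ST_BLOCK } StKind;

typedef struct St {
    StKind kind;
    const struct St *a;   /* if: then-branch, while: body, block: first part */
    const struct St *b;   /* if: else-branch (may be NULL), block: second part */
} St;

/* Count break statements that are not enclosed in any loop.
   in_loop is inherited from the parent: 1 once a while body was entered. */
int count_bad_breaks(const St *st, int in_loop)
{
    assert(in_loop == 0 || in_loop == 1);
    if (st == NULL)
        return 0;         /* missing else branch or empty block */
    switch (st->kind) {
    case ST_IF:
    case ST_BLOCK:
        return count_bad_breaks(st->a, in_loop)
             + count_bad_breaks(st->b, in_loop);
    case ST_WHILE:
        return count_bad_breaks(st->a, 1);
    case ST_BREAK:
        return in_loop ? 0 : 1;
    case ST_EXP:
        return 0;
    }
    return 0;             /* not reached for well-formed trees */
}
```

Called as `count_bad_breaks(root, 0)`, the function returns 0 for a program whose breaks all sit inside loops. Counting rather than reporting is a design choice made here only so that the result is easy to check.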
Syntax Directed Solution
%{
static int loop_count = 0;
%}
%%
stm : exp ';'
    | IF '(' exp ')' stm
    | IF '(' exp ')' stm ELSE stm
    | WHILE '(' exp ')' m stm   { loop_count--; }
    | '{' stmList '}'
    | BREAK ';'                 { if (!loop_count) error("Break must be enclosed within a loop", line_count); }
    ;
stmList : stmList stm
        | /* empty */
        ;
m : /* empty */                 { loop_count++; }
  ;
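The same counter technique can be tried without Bison. The following is a hand-written recursive-descent sketch, not the slide's parser: it assumes a made-up token encoding in which every token is one character (`w` = while, `i` = if, `l` = else, `b` = break, `e` = any expression), and `check_program` is a hypothetical entry point. The increment before the loop body plays the role of the marker `m`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-character tokens: w=while i=if l=else b=break e=expression */
static const char *cur;   /* next unread token */
static int loop_count;    /* number of enclosing while loops */
static int bad_breaks;    /* breaks seen while loop_count == 0 */

static int eat(char tok)
{
    if (*cur != tok)
        return 0;
    cur++;
    return 1;
}

static int stm(void);

/* stmList : stmList stm | empty */
static int stm_list(void)
{
    while (*cur != '}' && *cur != '\0')
        if (!stm())
            return 0;
    return 1;
}

/* Parses one statement; returns 0 on a syntax error. */
static int stm(void)
{
    if (eat('b')) {
        if (loop_count == 0)
            bad_breaks++;
        return eat(';');
    }
    if (eat('w')) {
        int ok;
        if (!(eat('(') && eat('e') && eat(')')))
            return 0;
        loop_count++;     /* action of the marker m */
        ok = stm();
        loop_count--;     /* action at the end of the while rule */
        return ok;
    }
    if (eat('i')) {
        if (!(eat('(') && eat('e') && eat(')') && stm()))
            return 0;
        return eat('l') ? stm() : 1;
    }
    if (eat('{'))
        return stm_list() && eat('}');
    return eat('e') && eat(';');
}

/* Returns the number of misplaced breaks, or -1 on a syntax error. */
int check_program(const char *tokens)
{
    assert(tokens != NULL);
    cur = tokens;
    loop_count = 0;
    bad_breaks = 0;
    if (!stm() || *cur != '\0')
        return -1;
    return bad_breaks;
}
```

With this encoding, `check_program("w(e){b;}")` accepts the break, while `check_program("i(e)b;le;")` reports one misplaced break. No syntax tree is built: the check happens while parsing, which is what ties it to the shape of the grammar.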
Problems with Syntax Directed Translations
• Grammar specification may be tedious (e.g., to achieve LALR(1))
• May need to rewrite the grammar to incorporate different semantics
• Modularity is impossible to achieve
• Some programming languages allow forward declarations (Algol, ML and Java)
Example Semantic Condition: Scope Rules • Variables must be defined within scope • Dynamic vs. Static Scope rules • Cannot be coded using a context free grammar
Dynamic vs. Static Scope Rules
procedure p;
  var x: integer;
  procedure q;
  begin { q }
    … x …
  end { q };
  procedure r;
    var x: integer;
  begin { r }
    q;
  end; { r }
begin { p }
  q;
  r;
end { p }
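One way to see the difference is to model activation records with two links each and resolve `x` along one chain or the other. The sketch below is an illustration only: `Frame`, `ScopeRule` and `lookup_var` are hypothetical names, each procedure is assumed to have at most one local variable, and the slide assigns no values to either `x`, so any values used with it are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical activation record with at most one local variable. */
typedef struct Frame {
    const char *var;                    /* name of the local, or NULL */
    int value;
    const struct Frame *static_link;    /* textually enclosing procedure */
    const struct Frame *dynamic_link;   /* caller */
} Frame;

typedef enum { STATIC_SCOPE, DYNAMIC_SCOPE } ScopeRule;

/* Find name starting in frame f, following the chain selected by rule.
   Returns a pointer to the value, or NULL when the name is undefined. */
const int *lookup_var(const Frame *f, const char *name, ScopeRule rule)
{
    assert(name != NULL);
    while (f != NULL) {
        if (f->var != NULL && strcmp(f->var, name) == 0)
            return &f->value;
        f = (rule == STATIC_SCOPE) ? f->static_link : f->dynamic_link;
    }
    return NULL;
}
```

For the program above, a frame for `q` called from `r` has `p` as its static link and `r` as its dynamic link. Under the static rule the `x` used in `q` is the one declared in `p`; under the dynamic rule it is the one declared in `r`.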
Example Semantic Condition
• In Pascal
• Types in an assignment must be “compatible”
Partial Grammar for Pascal
Stm → id Assign Exp
Exp → IntConst
Exp → RealConst
Exp → Exp + Exp
Exp → Exp - Exp
Exp → ( Exp )
Refined Grammar for Pascal
Stm → RealId Assign RealExp
Stm → IntId Assign IntExp
Stm → RealId Assign IntExp
RealExp → RealConst
RealExp → RealId
RealExp → RealExp + RealExp
RealExp → RealExp + IntExp
RealExp → IntExp + RealExp
RealExp → RealExp - RealExp
RealExp → RealExp - IntExp
RealExp → IntExp - RealExp
RealExp → ( RealExp )
IntExp → IntConst
IntExp → IntId
IntExp → IntExp + IntExp
IntExp → IntExp - IntExp
IntExp → ( IntExp )
Syntax Directed Solution
%%
...
stm : id Assign exp   { compat_ass(lookup($1), $3); }
    ;
exp : exp PLUS exp    { compat_op(PLUS, $1, $3); $$ = op_type(PLUS, $1, $3); }
    | exp MINUS exp   { compat_op(MINUS, $1, $3); $$ = op_type(MINUS, $1, $3); }
    | ID              { $$ = lookup($1); }
    | INTCONST        { $$ = ty_int; }
    | REALCONST       { $$ = ty_real; }
    | '(' exp ')'     { $$ = $2; }
    ;
Attribute Grammars [Knuth 68] • Generalize syntax directed translations • Every grammar symbol can have several attributes • Every production is associated with evaluation rules • Context rules • The order of evaluation is automatically determined • declarative • Multiple visits of the abstract syntax tree
Attribute Grammar for Types
stm → id Assign exp   { compat_ass(id.type, exp.type) }
exp → exp PLUS exp    { compat_op(PLUS, exp[1].type, exp[2].type)
                        exp[0].type = op_type(PLUS, exp[1].type, exp[2].type) }
exp → exp MINUS exp   { compat_op(MINUS, exp[1].type, exp[2].type)
                        exp[0].type = op_type(MINUS, exp[1].type, exp[2].type) }
exp → ID              { exp.type = lookup(id.repr) }
exp → INTCONST        { exp.type = ty_int }
exp → REALCONST       { exp.type = ty_real }
exp → '(' exp ')'     { exp[0].type = exp[1].type }
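The `type` attribute in these rules is synthesized, so it can be computed by a single bottom-up pass over an expression tree. The sketch below assumes a hypothetical `Exp` node type in which an identifier carries its declared type directly (standing in for `lookup`). The bodies of `op_type` and `compat_ass` are guesses based on the refined Pascal grammar: the result is an integer only when both operands are integers, and an integer may be assigned to a real but not the reverse.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TY_INT, TY_REAL } Ty;
typedef enum { E_INTCONST, E_REALCONST, E_ID, E_PLUS, E_MINUS } ExpKind;

/* Hypothetical expression node. */
typedef struct Exp {
    ExpKind kind;
    Ty id_type;                        /* E_ID only: stands in for lookup() */
    const struct Exp *left, *right;    /* E_PLUS and E_MINUS only */
} Exp;

/* Result type of + and -: int only when both operands are int. */
Ty op_type(Ty a, Ty b)
{
    return (a == TY_INT && b == TY_INT) ? TY_INT : TY_REAL;
}

/* Synthesized attribute exp.type, computed bottom-up. */
Ty exp_type(const Exp *e)
{
    assert(e != NULL);
    switch (e->kind) {
    case E_INTCONST:
        return TY_INT;
    case E_REALCONST:
        return TY_REAL;
    case E_ID:
        return e->id_type;
    case E_PLUS:
    case E_MINUS:
        return op_type(exp_type(e->left), exp_type(e->right));
    }
    return TY_REAL;                    /* not reached for well-formed trees */
}

/* 1 if "target := source" is allowed: equal types, or int widened to real. */
int compat_ass(Ty target, Ty source)
{
    return target == source || (target == TY_REAL && source == TY_INT);
}
```

Under these assumptions, an integer constant minus a real constant has type real, so assigning that expression to an integer variable is rejected by `compat_ass`.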
Example Binary Numbers
Z → L
Z → L . L
L → L B
L → B
B → 0
B → 1
Compute the numeric value of Z
Z → L      { Z.v = L.v }
Z → L . L  { Z.v = L[1].v + L[2].v }
L → L B    { L[0].v = L[1].v + B.v }
L → B      { L.v = B.v }
B → 0      { B.v = 0 }
B → 1      { B.v = ? }

Z → L      { Z.v = L.v }
Z → L . L  { Z.v = L[1].v + L[2].v }
L → L B    { L[0].v = L[1].v + B.v }
L → B      { L.v = B.v }
B → 0      { B.v = 0 }
B → 1      { B.v = 2^B.s }

Z → L      { Z.v = L.v }
Z → L . L  { Z.v = L[1].v + L[2].v }
L → L B    { L[0].v = L[1].v + B.v
             B.s = L[0].s
             L[1].s = L[0].s + 1 }
L → B      { L.v = B.v
             B.s = L.s }
B → 0      { B.v = 0 }
B → 1      { B.v = 2^B.s }

Z → L      { Z.v = L.v
             L.s = 0 }
Z → L . L  { Z.v = L[1].v + L[2].v
             L[1].s = 0
             L[2].s = ? }
L → L B    { L[0].v = L[1].v + B.v
             B.s = L[0].s
             L[1].s = L[0].s + 1 }
L → B      { L.v = B.v
             B.s = L.s }
B → 0      { B.v = 0 }
B → 1      { B.v = 2^B.s }

Z → L      { Z.v = L.v
             L.s = 0 }
Z → L . L  { Z.v = L[1].v + L[2].v
             L[1].s = 0
             L[2].s = -L[2].l }
L → L B    { L[0].v = L[1].v + B.v
             B.s = L[0].s
             L[1].s = L[0].s + 1
             L[0].l = L[1].l + 1 }
L → B      { L.v = B.v
             B.s = L.s
             L.l = 1 }
B → 0      { B.v = 0 }
B → 1      { B.v = 2^B.s }
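The final set of rules can be evaluated by hand-written code in which the scale `s` is passed down as a parameter (inherited) and the value `v` is returned (synthesized). This is a sketch under assumptions rather than generated evaluator code: the numeral is assumed to be a well-formed string of `0`, `1` and at most one `.`, the length `l` is obtained with `strlen` instead of the `L.l` rules, and `two_to_the`, `list_value` and `binary_value` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 2^s for a small, possibly negative, integer s. */
static double two_to_the(int s)
{
    double r = 1.0;
    for (; s > 0; s--)
        r *= 2.0;
    for (; s < 0; s++)
        r /= 2.0;
    return r;
}

/* L -> L B | B over bits[0..len-1].
   s is the inherited scale of the last bit of this list (L.s);
   the return value is the synthesized L.v. */
static double list_value(const char *bits, size_t len, int s)
{
    double b_v = (bits[len - 1] == '1') ? two_to_the(s) : 0.0;  /* B.s = L[0].s */
    if (len == 1)
        return b_v;                                   /* L -> B */
    return list_value(bits, len - 1, s + 1) + b_v;    /* L[1].s = L[0].s + 1 */
}

/* Z -> L | L . L; returns Z.v for a well-formed numeral. */
double binary_value(const char *z)
{
    const char *dot;
    size_t int_len, frac_len;

    assert(z != NULL && *z != '\0');
    dot = strchr(z, '.');
    if (dot == NULL)
        return list_value(z, strlen(z), 0);           /* L.s = 0 */
    int_len = (size_t)(dot - z);
    frac_len = strlen(dot + 1);                       /* plays the role of L[2].l */
    assert(int_len > 0 && frac_len > 0);
    return list_value(z, int_len, 0)                         /* L[1].s = 0 */
         + list_value(dot + 1, frac_len, -(int)frac_len);    /* L[2].s = -L[2].l */
}
```

For the numeral 1.101 the fraction part has length 3, so its last bit gets scale -3 and contributes 0.125, the middle bit contributes 0, and the first bit contributes 0.5. Added to the integer part this gives 1.625.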
Decorated parse tree for 1.101
Z: v = 1.625
  L (left of the point): s = 0, l = 1, v = 1
    B → 1: s = 0, v = 1
  L (right of the point): s = -3, l = 3, v = 0.625
    L: s = -2, l = 2, v = 0.5
      L: s = -1, l = 1, v = 0.5
        B → 1: s = -1, v = 0.5
      B → 0: s = -2, v = 0
    B → 1: s = -3, v = 0.125
Summary
• Several ways to enforce semantic correctness conditions
  • syntax
    • Regular expressions
    • Context free grammars
  • syntax directed
  • traversals on the abstract syntax tree
  • later compiler phases?
  • Runtime?
• There are tools that automatically generate a semantic analyzer from a specification (based on attribute grammars)