Ştefan Stăncescu. PART I: SYSTEM UTILITIES. Lecture 6: Compilers.
COMPILERS
• "High-level language" (HLL): complex grammar rules, closer to human language
• The HLL is the means of linking human and computer
• human language ↔ binary language
• HLL → binary language
• COMPILER: automatic translation machine
COMPILERS
• Source code => written in the HLL
• Object code => written in binary language (machine code)
• COMPILATION follows the grammar rules of the HLL:
• lexical rules: type and structure of the language elements
• syntactic rules: composition rules for the language elements
• "semantic" rules (translation programs): the object-code counterpart of each syntactic rule, i.e. "semantic programs" for the machine
COMPILERS
• Compiling = reviewing + translating the HLL source text
• lexical rules: scanner
• syntactic rules: parser
• "semantic" rules: object code generator
• (for a virtual machine, the generator emits intermediate code, "bytecode")
COMPILERS
• The SCANNER identifies tokens
• tokens are language elements: one or more adjacent characters, delimited by separator characters (SP, LF, FF, etc.)
• words: START, STOP, LABEL01
• operators: + * / -
• special signs: ( ) { } // . ,
COMPILERS
• SCANNER step 1
• scan the HLL source text
• determine the token list by finding token boundaries
• identify the tokens defined by the HLL
• identify the tokens invented by the programmer
• create a look-up table with numerical symbols for the tokens
COMPILERS
• SCANNER step 2
• create an intermediate source file in which each token is replaced by its numerical symbol from the look-up table created in step 1
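The two scanner steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword set, the token pattern, and the numbering scheme (codes handed out in order of first appearance) are hypothetical choices, not part of the lecture.

```python
import re

# Hypothetical keyword set for the illustration (tokens defined by the HLL).
KEYWORDS = {"PROGRAM", "VAR", "BEGIN", "END", "FOR", "TO", "DO",
            "READ", "WRITE", "DIV", "INTEGER", "REAL"}

# Assumed token shapes: ":=", words, integers, single-character signs.
TOKEN_RE = re.compile(r"\s*(:=|[A-Za-z][A-Za-z0-9]*|\d+|[+\-*/();,.:=])")


def classify(text):
    """Tell HLL tokens apart from programmer-invented ones."""
    if text.upper() in KEYWORDS:
        return "keyword"
    if text[0].isalpha():
        return "identifier"
    if text.isdigit():
        return "integer"
    return "operator"


def scan(source):
    """Step 1: find token boundaries and build the look-up table.
    Step 2: return the source rewritten as a list of numerical symbols."""
    table = {}   # token text -> (numerical symbol, kind)
    coded = []   # intermediate form: one numerical symbol per token
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        text = match.group(1)
        pos = match.end()
        if text not in table:
            table[text] = (len(table) + 1, classify(text))
        coded.append(table[text][0])
    return table, coded


table, coded = scan("S = A + B")
print(table)
print(coded)
```

A token that occurs twice receives the same numerical symbol both times, which is what lets the later phases work on numbers instead of text.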
COMPILERS
• BNF: Backus-Naur Form
• a REPRESENTATION of syntactic rules
• A rule in BNF format describes a valid construction in the HLL
• it is a formatted template of a rule applied to one line of the source file (or of a rule applied to a list of lines)
COMPILERS
• A syntactic rule describes a valid construction in the HLL
• The template carries the name of the newly built and checked element
• that element can be part of other constructions (including one with the same pattern, i.e. recursively)
• The name of the new construction is a "nonterminal" symbol
• BNF rule form:
• <nonterminal symbol> ::= construction template
COMPILERS
• Parsing: discovering, in the HLL source file, successive valid BNF rules (templates) until no undiscovered rules remain (no more "nonterminal" symbols)
• Parsing ends only on tokens ("terminal" symbols)
• Chaining BNF rules (templates) => syntax tree
• The purpose of parsing => discovering the syntax tree of the source file
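To make the idea of a template that contains its own nonterminal concrete, here is a minimal Python sketch. The rule `<sum> ::= id | <sum> + id` and the function name `parse_sum` are hypothetical, chosen only for this illustration; the nested tuples stand in for the syntax tree.

```python
def parse_sum(tokens):
    """Recognise the hypothetical rule  <sum> ::= id | <sum> + id
    and return the syntax tree as nested tuples."""
    if not tokens or tokens[0] != "id":
        raise SyntaxError("a <sum> must start with id")
    tree = ("<sum>", "id")                  # alternative 1: <sum> ::= id
    pos = 1
    while pos < len(tokens):
        if tokens[pos] != "+" or pos + 1 >= len(tokens) or tokens[pos + 1] != "id":
            raise SyntaxError(f"expected '+ id' at token {pos}")
        # alternative 2: the tree built so far becomes the inner <sum>
        tree = ("<sum>", tree, "+", "id")
        pos += 2
    return tree


print(parse_sum(["id", "+", "id", "+", "id"]))
```

The leaves of the resulting tree are only terminal symbols (`id`, `+`), and every inner node is labelled with the nonterminal whose template was matched.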
COMPILERS
• Line in the source file: S = A + B
• (A, B, S are integer variables; each one is a token)
• The code generator must explain to the machine the templates that were found
• The scanner identifies the tokens "S" "=" "A" "+" "B"
• tokens "A", "B", "S" are variables
• token "+" is an operator, token "=" is the assignment
COMPILERS
• The parser also verifies the type coherence of the variables, i.e. whether they all have the same type
• (if A, B, S are all integers: OK)
• if one differs, the templates for "+" and "=" need a conversion to a coherent type
• Example: S is real, A and B are integers
• "+" rule OK, the result is an integer
• "=" (assignment rule) adds a format conversion integer => real (float)
COMPILERS
• 1st parser operation: structure (type) consistency, with conversion if needed
• 2nd parser operation: A+B (result in temporary memory)
• 3rd parser operation: assigning the result to S (S=A+B)
• Applicable BNF rules: conversion, addition, assignment, in that order
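The type check for `S = A + B` can be sketched as a function that returns the operations in the order the parser would schedule them. Everything here is an assumption made for illustration: the name `plan_assignment`, the tuple encoding of operations, the temporary named `temp`, and the choice to reject a real-to-integer assignment.

```python
def plan_assignment(target, left, right, types):
    """Return the ordered operations for  target = left + right.
    `types` maps each variable name to "integer" or "real"."""
    ops = []
    # "+" yields a real as soon as one operand is real, otherwise an integer.
    result_type = "real" if "real" in (types[left], types[right]) else "integer"
    for name in (left, right):
        if types[name] != result_type:
            ops.append(("convert", name, result_type))
    ops.append(("add", left, right, "temp"))      # result in temporary memory
    if result_type != types[target]:
        if types[target] == "integer":
            # Assumption of this sketch: no implicit real => integer conversion.
            raise TypeError("cannot assign a real result to an integer variable")
        ops.append(("convert", "temp", types[target]))
    ops.append(("assign", "temp", target))
    return ops


# The case from the slides: S is real, A and B are integers.
print(plan_assignment("S", "A", "B", {"S": "real", "A": "integer", "B": "integer"}))
```

With S real and A, B integer, the addition stays an integer addition and the conversion is attached to the assignment, as in the example above; with mixed operands the conversion comes first.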
COMPILERS
• EXAMPLE II (bottom-up parsing)
• S=A+B*C-D
• scan the line and discover the operations that must be performed first
• each result becomes a "nonterminal" symbol <N>
• => operator precedence: ( + <. * ) | ( * .> - )
• assuming the rules of algebraic expressions
• Syntactic rule for multiplication: <product> ::= <agent> * <agent>
• Syntactic rule for addition: <sum> ::= (<agent> + <agent>) | (<agent> - <agent>)
COMPILERS
• EXAMPLE II (bottom-up parsing)
• <N1> ::= B*C
• <N2> ::= A+<N1>
• <N3> ::= <N2>-D
• Syntax tree of the expression A+B*C-D
COMPILERS
• EXAMPLE II (bottom-up parsing)
• S=A+(B*C-D)
• S=ATTRIB(N3)
• N3=SUM(A,N2)
• N2=SCAD(N1,D)
• N1=PROD(B,C)
• (ATTRIB = assignment, SCAD = subtraction)
• Syntax tree of the expression A+(B*C-D)
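The order in which the nonterminals N1, N2, N3 are discovered can be reproduced with a small stack-based reducer. This is a shunting-yard-style sketch that uses numeric precedence levels instead of a precedence matrix; the names `PRECEDENCE` and `reduce_expression` and the level values are assumptions for the illustration.

```python
# Assumed precedence levels: "*" and "/" bind tighter than "+" and "-".
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def reduce_expression(tokens):
    """Return the reductions (name, left, operator, right) in discovery order."""
    rules = []
    operands, operators = [], []

    def reduce_top():
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        name = f"N{len(rules) + 1}"       # new nonterminal symbol
        rules.append((name, left, op, right))
        operands.append(name)

    for tok in tokens:
        if tok == "(":
            operators.append(tok)
        elif tok == ")":
            while operators and operators[-1] != "(":
                reduce_top()
            if not operators:
                raise SyntaxError("unbalanced ')'")
            operators.pop()               # discard the matching "("
        elif tok in PRECEDENCE:
            # Reduce while the operator on the stack takes precedence ( .> ).
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]):
                reduce_top()
            operators.append(tok)         # otherwise shift ( <. )
        else:
            operands.append(tok)
    while operators:
        if operators[-1] == "(":
            raise SyntaxError("unbalanced '('")
        reduce_top()
    return rules


for rule in reduce_expression(list("A+B*C-D")):
    print(rule)
for rule in reduce_expression(list("A+(B*C-D)")):
    print(rule)
```

For A+B*C-D the reducer finds B*C, then A+N1, then N2-D; with the parentheses of A+(B*C-D) it finds B*C, then N1-D, then A+N2, matching the two variants of the example.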
COMPILERS
• SAMPLE PROGRAM IN A SIMPLIFIED PASCAL LANGUAGE
   1  PROGRAM MEDIA_ANALYSIS
   2  VAR
   3     NRCRT, I: INTEGER;
         SARITM, SARMON, DIF: REAL
   4  BEGIN
   5     SARITM := 0;
   6     SARMON := 0;
   7     FOR I := 1 TO 100 DO
   8        BEGIN
   9           READ (NRCRT);
  10           SARITM := SARITM + NRCRT;
  11           SARMON := SARMON + 1 DIV NRCRT
  12        END;
  13     DIF := SARITM DIV 100 - 100 DIV SARMON;
  14     WRITE(DIF)
  15  END.
COMPILERS
• GRAMMAR (BNF) OF THE SIMPLIFIED PASCAL LANGUAGE
• 1. <prog> ::= PROGRAM <prog_name> VAR <dec_list> BEGIN <stmt_list> END.
• 2. <prog_name> ::= id
• 3. <dec_list> ::= <dec> | <dec_list> ; <dec>
• 4. <dec> ::= <id_list> : <type>
• 5. <type> ::= INTEGER | REAL
• 6. <id_list> ::= id | <id_list> , id
• 7. <stmt_list> ::= <stmt> | <stmt_list> ; <stmt>
• 8. <stmt> ::= <assign> | <read> | <write> | <for>
• 9. <assign> ::= id := <exp>
• 10. <exp> ::= <term> | <exp> + <term> | <exp> - <term>
• 11. <term> ::= <factor> | <term> * <factor> | <term> DIV <factor>
• 12. <factor> ::= id | int | ( <exp> )
• 13. <read> ::= READ ( <id_list> )
• 14. <write> ::= WRITE ( <id_list> )
• 15. <for> ::= FOR <index_exp> DO <body>
• 16. <index_exp> ::= id := <exp> TO <exp>
• 17. <body> ::= <stmt> | BEGIN <stmt_list> END
COMPILERS
• SAMPLE PROGRAM, line 9:
• 9. READ (NRCRT);
• BNF rules applied:
• 13. <read> ::= READ ( <id_list> )
• 6. <id_list> ::= id | <id_list> , id
COMPILERS
• SAMPLE PROGRAM, line 13:
• 13. DIF := SARITM DIV 100 - 100 DIV SARMON;
• BNF rules applied:
• 9. <assign> ::= id := <exp>
• 10. <exp> ::= <term> | <exp> - <term>
• 11. <term> ::= <factor> | <term> DIV <factor>
• 12. <factor> ::= id | int | ( <exp> )
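How rules 9 to 12 fit the DIF statement can be checked with a short recognizer. Note the swap of technique: this sketch is a top-down recursive-descent parser, not the bottom-up method of the lecture, and it replaces the left-recursive alternatives (`<exp> - <term>`, `<term> DIV <factor>`) with loops. The function name `parse_assign` and the tuple shape of the tree are assumptions; keywords are not distinguished from identifiers here.

```python
def parse_assign(tokens):
    """Parse  id := <exp>  (rules 9 to 12) into a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        pos += 1
        return tok

    def factor():                       # rule 12: id | int | ( <exp> )
        if peek() == "(":
            take("(")
            node = exp()
            take(")")
            return node
        tok = take()
        if not (tok.isidentifier() or tok.isdigit()):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    def term():                         # rule 11: factors joined by * or DIV
        node = factor()
        while peek() in ("*", "DIV"):
            op = take()
            node = (op, node, factor())
        return node

    def exp():                          # rule 10: terms joined by + or -
        node = term()
        while peek() in ("+", "-"):
            op = take()
            node = (op, node, term())
        return node

    target = take()                     # rule 9: id := <exp>
    if not target.isidentifier():
        raise SyntaxError(f"{target!r} is not an identifier")
    take(":=")
    tree = (":=", target, exp())
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree


print(parse_assign("DIF := SARITM DIV 100 - 100 DIV SARMON".split()))
```

Because `term()` is called from inside `exp()`, both DIV operations are grouped before the subtraction, which is the precedence the grammar encodes.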
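COMPILERS
• Precedence relations between adjacent tokens:
• PROGRAM .=. VAR
• BEGIN <. FOR
• ; .> END
• Empty pairs (no relation defined) signal grammatical errors
• At most one precedence relation per token pair (required for a consistent grammar)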
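A precedence matrix can be held as a table keyed by token pairs, so that an empty pair shows up as a failed lookup. The sketch below fills in only the three pairs quoted above; the dictionary layout and the function name `relation` are assumptions for the illustration.

```python
# Hypothetical fragment of a precedence matrix: only the three pairs
# quoted on the slide are present; a real matrix covers many more pairs.
PRECEDENCE = {
    ("PROGRAM", "VAR"): "=",   # .=.  both tokens belong to the same template
    ("BEGIN", "FOR"): "<",     # <.   FOR opens a new, inner template
    (";", "END"): ">",         # .>   the template before END is complete
}


def relation(left, right):
    """Look up the single relation for an adjacent token pair.
    An empty pair means the two tokens may not be adjacent."""
    try:
        return PRECEDENCE[(left, right)]
    except KeyError:
        raise SyntaxError(
            f"no precedence relation between {left!r} and {right!r}") from None


print(relation("BEGIN", "FOR"))
```

Using a dictionary also enforces the consistency condition automatically: a key can hold only one relation.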
COMPILERS
• Generating semantic programs
• DIF := SARITM DIV 100 - 100 DIV SARMON
• id1 := id2 DIV int - int DIV id3
• id1 := exp1 - exp2
• id1 := exp3
• Generated intermediate code (operation, operand 1, operand 2, result):
   DIV   SARITM   #100     i1
   DIV   #100     SARMON   i2
   -     i1       i2       i3
   :=    i3       ,        DIF
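The step from syntax tree to intermediate code can be sketched as a post-order walk that emits one line per operator. The tree encoding (nested tuples), the function name `generate`, and the use of `None` for the unused operand field are assumptions of this sketch; the `#` prefix for constants and the temporaries `i1`, `i2`, ... follow the notation above.

```python
def generate(tree):
    """Emit (operation, operand1, operand2, result) tuples for an assignment
    tree of the form (":=", target, expression)."""
    code = []

    def operand(node):
        if isinstance(node, tuple):
            op, left, right = node
            a = operand(left)              # code for the operands comes first
            b = operand(right)
            temp = f"i{len(code) + 1}"     # next free temporary
            code.append((op, a, b, temp))
            return temp
        # Leaves: integer constants get the "#" prefix, identifiers stay as-is.
        return f"#{node}" if node.isdigit() else node

    _, target, expression = tree
    code.append((":=", operand(expression), None, target))
    return code


tree = (":=", "DIF", ("-", ("DIV", "SARITM", "100"), ("DIV", "100", "SARMON")))
for line in generate(tree):
    print(line)
```

Each inner node of the tree yields exactly one temporary, so the subtraction can only be emitted after both divisions have produced `i1` and `i2`.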
COMPILERS
• Intermediate code for the sample program
   (1)   :=     #0       ,        SARITM   {SARITM := 0}
   (2)   :=     #0       ,        SARMON   {SARMON := 0}
   (3)   :=     #1       ,        I        {FOR I := 1 TO 100}
   (4)   JGT    I        #100     (15)
   (5)   CALL   X READ                     {READ(NRCRT)}
   (6)   PARAM  NRCRT
   (7)   +      SARITM   NRCRT    i1       {SARITM := SARITM + NRCRT}
   (8)   :=     i1       ,        SARITM
   (9)   DIV    #1       NRCRT    i2       {SARMON := SARMON + 1 DIV NRCRT}
   (10)  +      SARMON   i2       i3
   (11)  :=     i3       ,        SARMON
   (12)  +      I        #1       i4       {end of FOR}
   (13)  :=     i4       ,        I
   (14)  J      (4)
   (15)  DIV    SARITM   #100     i6       {DIF := SARITM DIV 100 - 100 DIV SARMON}
   (16)  DIV    #100     SARMON   i7
   (17)  -      i6       i7       i8
   (18)  :=     i8       ,        DIF
   (19)  CALL   X WRITE                    {WRITE(DIF)}
   (20)  PARAM  DIF
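To see that the twenty lines above really describe the sample program, they can be run by a tiny interpreter. This is a sketch under several assumptions: the tuple encoding, the routine names "READ" and "WRITE", integer semantics for every variable (DIV is taken as integer division), and a CALL that consumes the PARAM lines following it.

```python
# The listing above as (operation, operand1, operand2, result-or-target).
PROGRAM = [
    (":=", "#0", None, "SARITM"),        # (1)
    (":=", "#0", None, "SARMON"),        # (2)
    (":=", "#1", None, "I"),             # (3)
    ("JGT", "I", "#100", 15),            # (4)  leave the loop when I > 100
    ("CALL", "READ", None, None),        # (5)
    ("PARAM", "NRCRT", None, None),      # (6)
    ("+", "SARITM", "NRCRT", "i1"),      # (7)
    (":=", "i1", None, "SARITM"),        # (8)
    ("DIV", "#1", "NRCRT", "i2"),        # (9)
    ("+", "SARMON", "i2", "i3"),         # (10)
    (":=", "i3", None, "SARMON"),        # (11)
    ("+", "I", "#1", "i4"),              # (12)
    (":=", "i4", None, "I"),             # (13)
    ("J", None, None, 4),                # (14)
    ("DIV", "SARITM", "#100", "i6"),     # (15)
    ("DIV", "#100", "SARMON", "i7"),     # (16)
    ("-", "i6", "i7", "i8"),             # (17)
    (":=", "i8", None, "DIF"),           # (18)
    ("CALL", "WRITE", None, None),       # (19)
    ("PARAM", "DIF", None, None),        # (20)
]


def run(program, inputs):
    """Execute the intermediate code and return the list of written values."""
    memory, outputs = {}, []
    inputs = iter(inputs)

    def value(operand):
        return int(operand[1:]) if operand.startswith("#") else memory[operand]

    pc = 1                                # line numbers start at (1)
    while pc <= len(program):
        op, a, b, dst = program[pc - 1]
        pc += 1
        if op == ":=":
            memory[dst] = value(a)
        elif op == "+":
            memory[dst] = value(a) + value(b)
        elif op == "-":
            memory[dst] = value(a) - value(b)
        elif op == "DIV":
            memory[dst] = value(a) // value(b)
        elif op == "JGT":
            if value(a) > value(b):
                pc = dst
        elif op == "J":
            pc = dst
        elif op == "CALL":
            # The PARAM lines that follow name the arguments of the call.
            while pc <= len(program) and program[pc - 1][0] == "PARAM":
                name = program[pc - 1][1]
                if a == "READ":
                    memory[name] = next(inputs)
                else:
                    outputs.append(memory[name])
                pc += 1
        else:
            raise ValueError(f"unknown operation {op!r}")
    return outputs


print(run(PROGRAM, [1] * 100))
```

With one hundred inputs equal to 1, both sums reach 100 and the program writes 0. Under integer division any larger input makes 1 DIV NRCRT equal to 0, so an input list of larger values alone would leave SARMON at 0 and make line (16) divide by zero.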