Translating High Level Languages Note error in assignment 1: #4 - refer to Example grammar 3.4, p. 126 D Goforth COSC 3127
Stages of translation
Before execution:
• Lexical analysis - the lexer or scanner
• Syntactic analysis - the parser
• Code generation
• Linking
Then:
• Execution
Lexical analysis
• Translates a stream of characters into lexemes
• Lexemes belong to categories called tokens
• The token category of each lexeme is used in the next stage, syntactic analysis
From characters to lexemes
Character stream: yVal = x + 450 - min ( 100, 4xVal ));
Lexemes: [yVal] [=] [x] [+] [450] [-] [min] [(] [100] [,] [4xVal] [)] [)] [;]
Examples: tokens and lexemes
• Some token categories contain only one lexeme: semicolon ;
• Some tokens categorize many lexemes: identifier count, maxCost, … based on a rule for legal identifier strings
Tokens and Lexemes
yVal=x +450 - min(100, 4xVal ));
Labelled examples: yVal is an identifier, = is an equal_sign, ( is a left_paren, 4xVal is an illegal lexeme
Lexical analysis
• identifies lexemes and their token type
• recognizes illegal lexemes (4xVal)
• does NOT identify the syntax error: ) )
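A minimal sketch of this stage in Python, assuming each token category is described by a regular expression. Only identifier, equal_sign and left_paren are names from the slide; the other category names and all of the patterns are illustrative assumptions.

```python
import re

# (token category, pattern) pairs; order matters because the first match wins.
# Names and patterns are assumptions made for this sketch.
TOKEN_SPEC = [
    ("illegal",     r"\d+[A-Za-z_]\w*"),  # digits running into letters, e.g. 4xVal
    ("int_literal", r"\d+"),
    ("identifier",  r"[A-Za-z_]\w*"),
    ("equal_sign",  r"="),
    ("plus_op",     r"\+"),
    ("minus_op",    r"-"),
    ("left_paren",  r"\("),
    ("right_paren", r"\)"),
    ("comma",       r","),
    ("semicolon",   r";"),
    ("skip",        r"\s+"),              # whitespace separates lexemes
    ("unknown",     r"."),                # any other single character
]

MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token category, lexeme) pairs for one line of source."""
    tokens = []
    for match in MASTER.finditer(source):
        category = match.lastgroup
        if category != "skip":
            tokens.append((category, match.group()))
    return tokens


for category, lexeme in tokenize("yVal = x + 450 - min ( 100, 4xVal ));"):
    print(f"{lexeme:6} {category}")
```

On the example line, the sketch labels 4xVal as illegal but reports both right parentheses as ordinary right_paren tokens; noticing that they do not balance is left to the parser.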
Syntax or Grammar of Language
Rules for
• generating (used by programmer) or
• recognizing (used by parser)
a valid sequence of lexemes
Grammars
• 4 categories of grammars (Chomsky)
• Two categories are important in computing:
  • Regular grammars, equivalent to regular expressions (pattern matching)
  • Context-free grammars (programming languages)
Context-free grammar
• Meta-language for describing languages
• States rules or productions for what lexeme sequences are correct in the language
• Written in
  • Backus-Naur Form (BNF) or EBNF
  • Syntax graphs
Example of BNF rule
PROBLEM: how to recognize all these as correct?
y = x
f = rVec.length + 1
button[4].label = "Exit"
RULE for defining assignment statement:
<assign> -> <variable> = <expression>
Assumes other rules for <variable>, <expression>
BNF rules
Non-terminal and terminal symbols:
• Non-terminals are defined by at least one rule
<assignment> -> <var> = <expression>
• Terminals are tokens (or lexemes)
Simple sample grammar (p. 123)
<assign> -> <id> = <expr>
<id> -> A | B | C // lexical
<expr> -> <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
Notation: terminals are written plainly, <nonterminals> in angle brackets
Simple sample production
Apply one rule at each step, to the leftmost non-terminal:
<assign> => <id> = <expr>
=> B = <expr>
=> B = <id> * <expr>
=> B = A * <expr>
=> B = A * ( <expr> )
=> B = A * ( <id> + <expr> )
=> B = A * ( C + <expr> )
=> B = A * ( C + <id> )
=> B = A * ( C + C )
Grammar used:
<assign> -> <id> = <expr>
<id> -> A | B | C
<expr> -> <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
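The derivation can be replayed mechanically. The sketch below assumes the grammar is stored as a Python dictionary and that the rule chosen at each step is given as an index into the list of alternatives; both the representation and the name leftmost_derivation are choices made for this illustration.

```python
# The simple sample grammar: each non-terminal maps to its alternatives.
GRAMMAR = {
    "<assign>": [["<id>", "=", "<expr>"]],
    "<id>": [["A"], ["B"], ["C"]],
    "<expr>": [
        ["<id>", "+", "<expr>"],
        ["<id>", "*", "<expr>"],
        ["(", "<expr>", ")"],
        ["<id>"],
    ],
}


def leftmost_derivation(start, choices):
    """Apply one rule per step to the leftmost non-terminal.

    choices[i] is the index of the alternative used at step i.
    Returns every sentential form, starting with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Position of the leftmost symbol that still has rules in the grammar.
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps


# Rule choices that reproduce the derivation of B = A * ( C + C ).
for step in leftmost_derivation("<assign>", [0, 1, 1, 0, 2, 0, 2, 3, 2]):
    print(step)
```

A different list of choices generates a different sentence of the same language, which is the "generating" use of a grammar.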
Sample parse tree
<assign>
├─ <id> ── B
├─ =
└─ <expr>
   ├─ <id> ── A
   ├─ *
   └─ <expr>
      ├─ (
      ├─ <expr>
      │  ├─ <id> ── C
      │  ├─ +
      │  └─ <expr> ── <id> ── C
      └─ )
Each interior node with its children is one rule application
Leaves represent the sentence of lexemes
Grammar used:
<assign> -> <id> = <expr>
<id> -> A | B | C
<expr> -> <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
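One way a parser could build such a tree is recursive descent, with one function per non-terminal. That technique is not named on the slide; the sketch below uses it only as an illustration, and the tuple representation of the tree and the function names are assumptions.

```python
IDENTIFIERS = ("A", "B", "C")


def parse_assign(tokens):
    """Return a parse tree for <assign> as nested tuples, or raise SyntaxError."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        token = peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected!r}, found {token!r}")
        pos += 1
        return token

    def parse_id():
        token = take()
        if token not in IDENTIFIERS:
            raise SyntaxError(f"expected an identifier, found {token!r}")
        return ("<id>", token)

    def parse_expr():
        # <expr> -> ( <expr> )
        if peek() == "(":
            return ("<expr>", take("("), parse_expr(), take(")"))
        left = parse_id()
        # <expr> -> <id> + <expr> | <id> * <expr>
        if peek() in ("+", "*"):
            return ("<expr>", left, take(), parse_expr())
        # <expr> -> <id>
        return ("<expr>", left)

    tree = ("<assign>", parse_id(), take("="), parse_expr())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected {peek()!r} after the statement")
    return tree


def leaves(tree):
    """Collect the leaves left to right: the sentence of lexemes."""
    if isinstance(tree, str):
        return [tree]
    sentence = []
    for child in tree[1:]:  # tree[0] is the non-terminal label
        sentence.extend(leaves(child))
    return sentence


tree = parse_assign("B = A * ( C + C )".split())
print(" ".join(leaves(tree)))
```

Reading the leaves left to right gives back the original sentence, and a sentence with an extra right parenthesis is rejected with a SyntaxError.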
Extended sample grammar
<stmt> -> <assign> | <ifstmt>
<ifstmt> -> if ( <cond> ) then <stmt> | if ( <cond> ) then <stmt> else <stmt>
<cond> -> <expr> <compareop> <expr>
<compareop> -> < | > | <= | >= | == | ~=
How would a compound condition be added?
Ambiguous grammar
• Different parse trees for same sentence
• Different translations for same sentence
• Different machine code for same source code!
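To see why this matters, assume a grammar such as <expr> -> <expr> + <expr> | <expr> * <expr> | <id>, which is an illustrative example and not one taken from the slides. It allows two parse trees for A + B * C, and the sketch below evaluates both. The tuple representation, the function name and the values bound to A, B and C are assumptions.

```python
def evaluate_tree(tree, values):
    """Evaluate a parse tree given as (left, operator, right) or an identifier."""
    if isinstance(tree, str):
        return values[tree]
    left, operator, right = tree
    left_value = evaluate_tree(left, values)
    right_value = evaluate_tree(right, values)
    return left_value + right_value if operator == "+" else left_value * right_value


values = {"A": 2, "B": 3, "C": 4}

# Two parse trees the ambiguous grammar allows for the same sentence A + B * C.
tree_times_first = ("A", "+", ("B", "*", "C"))
tree_plus_first = (("A", "+", "B"), "*", "C")

print(evaluate_tree(tree_times_first, values))  # 2 + (3 * 4)
print(evaluate_tree(tree_plus_first, values))   # (2 + 3) * 4
```

The same source text yields 14 under one tree and 20 under the other, so a translator needs a grammar that permits only one of them.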
Grammars for ‘human’ conventions without ambiguity
Putting features of languages into grammars:
• expressions of any length: lists, p. 121
• precedence - an extra non-terminal: p. 125
• associativity - order in recursive rules: p. 128
• nested if statements - “dangling else” problem: p. 130
Forms for grammars
• Backus-Naur form (BNF)
• Extended Backus-Naur form (EBNF) - shortens the set of rules
• Syntax graphs - easier to read when learning a language
EBNF
• optional: zero or one occurrence [..]
<expr> -> [<expr> +] <term>
• repetition: zero or more occurrences {..}
<expr> -> <term> { + <term> }
• ‘or’: choice of alternative symbols |
<term> -> [<term> (*|/)] <factor>
Syntax Graph - basic structures
[Figure: syntax graphs for expr (terms joined by + or -) and term (factors joined by * or /)]
BNF (p. 121)
<expr> -> <expr> + <term> | <expr> - <term> | <term>
<term> -> <term> * <factor> | <term> / <factor> | <factor>
EBNF, using [..]
<expr> -> [<expr> (+|-)] <term>
<term> -> [<term> (*|/)] <factor>
EBNF, using {..}
<expr> -> <term> {(+|-) <term>}
<term> -> <factor> {(*|/) <factor>}
Syntax Graph
[Figure: syntax graphs for expr (terms joined by + or -) and term (factors joined by * or /)]
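The {..} form maps directly onto a loop. The sketch below evaluates expressions with one function per rule and a while loop for each repetition. The rule for <factor> is not given on the slide, so the sketch assumes <factor> -> unsigned integer | ( <expr> ); the function names are also assumptions.

```python
def evaluate_expression(tokens):
    """Evaluate a token list using the {..} form of the EBNF rules."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        # Assumed rule: <factor> -> unsigned integer | ( <expr> )
        token = peek()
        if token == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if token is None or not token.isdigit():
            raise SyntaxError(f"expected a number or '(', found {token!r}")
        return int(advance())

    def term():
        # <term> -> <factor> {(*|/) <factor>}
        value = factor()
        while peek() in ("*", "/"):
            operator = advance()
            operand = factor()
            value = value * operand if operator == "*" else value / operand
        return value

    def expr():
        # <expr> -> <term> {(+|-) <term>}
        value = term()
        while peek() in ("+", "-"):
            operator = advance()
            operand = term()
            value = value + operand if operator == "+" else value - operand
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected {peek()!r}")
    return result


print(evaluate_expression("2 + 3 * 4".split()))
print(evaluate_expression("10 - 4 - 3".split()))
```

Because <term> sits below <expr>, multiplication binds tighter than addition, and because each loop folds operands in from the left, 10 - 4 - 3 is grouped as (10 - 4) - 3.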
Attribute grammars
• Problem: context-free grammars cannot describe some features needed in programming - “static semantics”
• e.g.: rules for using data types
  • Can’t assign real to integer (clumsy in BNF)
  • Can’t access variable before assigning (impossible in BNF)
Attributes
• Symbols in the grammar can have attributes (properties)
• Productions can carry functions that compute the attributes of some of their symbols from the attributes of others
• Predicates (boolean functions) inspect the attributes of non-terminals to see if they are legitimate
Using attributes
• Apply productions to create parse tree (symbols have some intrinsic attributes)
• Apply functions to determine remaining attributes
• Apply predicates to test correctness of parse tree
Sebesta’s example
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C
Add attributes for type checking:
• expected_type
• actual_type
Sebesta’s example
[Figure: the example grammar with expected_type and actual_type labels attached to its symbols]
Sebesta’s example: actual_type of <var>
• Determined from the string (A, B or C), which has been declared
Sebesta’s example: actual_type of <expr>
• Determined from the actual types of its <var> symbols
Sebesta’s example: expected_type of <expr>
• Determined from the actual type of the <var> on the left of the assignment
Sebesta’s type rules p.138
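A small sketch of how these attribute rules could be applied in Python. It assumes the usual reading of the example: the sum of two int variables is int, any other sum is real, and the predicate requires actual_type to equal expected_type. The symbol table contents and the function names are hypothetical.

```python
# Assumed declarations: the slides do not say how A, B and C are declared.
SYMBOL_TABLE = {"A": "int", "B": "real", "C": "int"}


def lookup(name):
    """<var>.actual_type: determined from the declared type of the string."""
    return SYMBOL_TABLE[name]


def check_assignment(target, operands):
    """Apply the attribute functions and the predicate to target = operands.

    operands holds one name (<expr> -> <var>) or two (<expr> -> <var> + <var>).
    Returns True when <expr>.actual_type equals <expr>.expected_type.
    """
    if len(operands) not in (1, 2):
        raise ValueError("the grammar allows one or two variables in <expr>")

    # <expr>.expected_type comes from the <var> on the left-hand side.
    expected_type = lookup(target)

    # <expr>.actual_type comes from the actual types of its <var> symbols.
    operand_types = [lookup(name) for name in operands]
    if len(operand_types) == 1:
        actual_type = operand_types[0]
    else:
        actual_type = "int" if all(t == "int" for t in operand_types) else "real"

    # Predicate: the two attributes must agree.
    return actual_type == expected_type


print(check_assignment("A", ["A", "C"]))  # int = int + int
print(check_assignment("A", ["A", "B"]))  # int = int + real
```

With these declarations, A = A + C passes the predicate and A = A + B fails it, which is the "can't assign real to integer" rule expressed through attributes instead of extra BNF rules.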
Axiomatic semantics
• Assertions about statements
  • Preconditions
  • Postconditions
  • like JUnit testing
• Purpose
  • Define meaning of statement
  • Test for validity of computation (does it do what it is supposed to do?)
Example for assignment
• What the statement should do is expressed as a postcondition
• Based on the syntax of the assignment, a precondition is inferred
• When the statement is executed, the conditions can be verified before and after
Example assignment statement
y = 25 + x * 2
postcondition: y > 40
Working backwards from the postcondition:
y > 40
25 + x * 2 > 40
x * 2 > 15
x > 7.5 (precondition)
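The inferred precondition can be checked by running the statement on sample values, in the spirit of the JUnit comparison. This is a sketch only; the function names are assumptions, and testing a few values illustrates the conditions without proving them.

```python
def precondition(x):
    return x > 7.5


def statement(x):
    """The assignment y = 25 + x * 2, returning the new value of y."""
    return 25 + x * 2


def postcondition(y):
    return y > 40


def conditions_for(x):
    """Return (precondition holds before, postcondition holds after)."""
    before = precondition(x)
    y = statement(x)
    after = postcondition(y)
    return before, after


for x in (7, 7.5, 8, 100):
    print(x, conditions_for(x))
```

For every value tried, the postcondition holds exactly when the precondition does. The boundary value x = 7.5 gives y = 40, which fails both conditions.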