The College of Saint Rose, CIS 433 – Programming Languages, David Goldschmidt, Ph.D. Syntax and semantics {week 03}, from Concepts of Programming Languages, 9th edition, by Robert W. Sebesta, Addison-Wesley, 2010, ISBN 0-13-607347-6.
Syntax
• Syntax is the expected form or structure of the expressions, statements, and program units of a programming language
• Syntax of a Java while statement:
  • while ( <boolean_expr> ) <statement>
• Partial syntax of an if statement:
  • if ( <boolean_expr> ) <statement>
Semantics
• Semantics is the meaning of the expressions, statements, and program units of a given programming language
• Semantics of a Java while statement:
  • while ( <boolean_expr> ) <statement>
  • Execute <statement> zero or more times as long as <boolean_expr> evaluates to true
Syntax and semantics
• Together, syntax and semantics define a programming language
• Syntax errors are detected and reported by a compiler
• Errors related to semantics are defects in program logic that cause incorrect results or program crashes
Defining syntax
• Terminology to describe syntax:
  • A sentence is a string of characters over an alphabet of symbols
  • A language is a set of sentences
  • A lexeme is the lowest-level syntactic unit of a language (e.g. +, *, sum, while)
    • One step above individual characters
  • A token is a category of lexemes
    • e.g. identifier, equal_sign, integer_literal, etc.
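To make the lexeme/token distinction concrete, here is a minimal sketch of a tokenizer that pairs each lexeme with its token. The token names beyond the slide's own examples (plus_op, mult_op, semicolon) and the function name tokenize are assumptions chosen for illustration, and keywords such as while are not separated from identifiers here.

```python
import re

# Hypothetical token categories for illustration. The first three names
# follow the slide's examples; the others are assumed.
TOKEN_SPEC = [
    ("integer_literal", r"\d+"),
    ("identifier",      r"[A-Za-z_]\w*"),
    ("equal_sign",      r"="),
    ("plus_op",         r"\+"),
    ("mult_op",         r"\*"),
    ("semicolon",       r";"),
    ("skip",            r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token, lexeme) pairs for the given source string."""
    pairs = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "skip":       # whitespace is not a lexeme
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs

for token, lexeme in tokenize("sum = sum + 17;"):
    print(f"{token:16} {lexeme}")
```

Both occurrences of sum are distinct lexemes in the input, yet they fall into the same token, identifier.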
Language recognizers
• A language recognizer reads an input string and determines whether it belongs to the given language
• This is the syntax analysis part of a compiler or interpreter
• Diagram: input strings (source code) → language recognizer → accept or reject each input string
Language generators
• A language generator produces syntactically acceptable strings of a given language
• Not practical to generate all valid strings
• Instead, inspect generator rules (the grammar) to determine if a sentence is acceptable for a given language
• Diagram: language generator → valid strings of the language
Noam Chomsky
• In the mid-1950s, linguist Noam Chomsky (born 1928) developed four classes of generative grammars
• Context-free grammars (CFGs) are useful for describing programming language syntax
• Regular grammars are useful for describing valid tokens of a programming language
Backus-Naur Form (BNF)
• Around 1959-1960, John Backus and Peter Naur developed a formal notation for specifying programming language syntax (Backus introduced it in 1959; Naur refined it for the ALGOL 60 report in 1960)
• Backus-Naur Form (BNF) is nearly identical to Chomsky’s context-free grammars
• Syntax of an assignment statement in BNF:
  • <assign> → <var> = <expression> ;
BNF structure
• Syntax of an assignment statement in BNF:
  • <assign> → <var> = <expression> ;
• This BNF rule, or production, defines <assign>:
  • Left-hand side, <assign>: the abstraction being defined
  • Right-hand side, <var> = <expression> ;: the definition of <assign>
• The definition consists of other abstractions, as well as lexemes and tokens
Example language
<program> → begin <stmts> end
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d | e
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | literal-integer-value
• A vertical bar indicates an OR
• literal-integer-value is a token, which is simply a grouping of lexemes
• Write a sentence that conforms to this grammar
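One way to explore what this grammar allows is to treat it as a language generator. The sketch below encodes the six rules as a dictionary and expands abstractions at random. The names GRAMMAR, generate, and random_sentence are hypothetical, and replacing the literal-integer-value token with a random integer between 0 and 999 is an assumption made only so the output looks like a real sentence.

```python
import random

# The example grammar: each abstraction maps to its list of alternatives,
# and each alternative is a list of symbols.
GRAMMAR = {
    "<program>": [["begin", "<stmts>", "end"]],
    "<stmts>":   [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>":    [["<var>", "=", "<expr>"]],
    "<var>":     [["a"], ["b"], ["c"], ["d"], ["e"]],
    "<expr>":    [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>":    [["<var>"], ["literal-integer-value"]],
}

def generate(symbol, rng):
    """Expand one symbol into a list of lexemes, choosing rules at random."""
    if symbol == "literal-integer-value":
        # Assumption: stand in a concrete lexeme for the token.
        return [str(rng.randint(0, 999))]
    if symbol not in GRAMMAR:
        return [symbol]                     # a lexeme such as begin, =, +
    words = []
    for part in rng.choice(GRAMMAR[symbol]):
        words.extend(generate(part, rng))
    return words

def random_sentence(seed=None):
    """Return one randomly generated sentence of the example language."""
    rng = random.Random(seed)
    return " ".join(generate("<program>", rng))

print(random_sentence())
```

Every string this prints conforms to the grammar, but generating strings at random is not a practical way to decide whether a particular sentence belongs to the language.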
Derivations
<program> → begin <stmts> end
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d | e
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | literal-integer-value
• A derivation is a repeated application of rules
• Start with the start symbol and end with a sentence:
<program> => begin <stmts> end
          => begin <stmt> end
          => begin <var> = <expr> end
          => begin b = <expr> end
          => begin b = <term> + <term> end
          => begin b = <var> + <term> end
          => begin b = c + <term> end
          => begin b = c + 123 end
• Many (often infinitely many) derivations are possible
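The derivation above can be replayed mechanically. This is a minimal sketch under the assumption that each step is described by the 0-based index of the alternative to apply. The function name leftmost_derivation is hypothetical. The final step leaves the token literal-integer-value in place rather than substituting a lexeme such as 123.

```python
# The example grammar: each abstraction maps to its list of alternatives.
GRAMMAR = {
    "<program>": [["begin", "<stmts>", "end"]],
    "<stmts>":   [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>":    [["<var>", "=", "<expr>"]],
    "<var>":     [["a"], ["b"], ["c"], ["d"], ["e"]],
    "<expr>":    [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>":    [["<var>"], ["literal-integer-value"]],
}

def leftmost_derivation(choices, start="<program>"):
    """Apply one rule per step, always to the leftmost abstraction.

    choices holds one 0-based alternative index per step.
    Returns every sentential form, beginning with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost abstraction still present in the form.
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps

# The rule choices that reproduce the derivation of "begin b = c + 123 end".
for line in leftmost_derivation([0, 0, 0, 1, 0, 0, 2, 1]):
    print("=>", line)
```

Changing the list of choices yields a different derivation, which is one way to see why the number of possible derivations is unbounded.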
Leftmost and rightmost derivations
• A leftmost derivation is one in which the leftmost abstraction is always the next one expanded
• Write both a leftmost and rightmost derivation to obtain this sentence:
  begin d = 10 - a end
• Why is the leftmost derivation important?
<program> → begin <stmts> end
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d | e
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | literal-integer-value
Working with grammars
• Given this simple grammar:
<S> → <A> <B> <C>
<A> → a <A> | a
<B> → b <B> | b
<C> → c <C> | c
• Which of the following sentences are generated by this grammar?
  • baaabbccc
  • abc
  • bbaabbaabbaabbaac
  • aabbbbccccccccccccccccccccc
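A language recognizer for this grammar can be sketched in a few lines. Each of <A>, <B>, and <C> generates one or more copies of a single letter, so the recognizer below consumes a run of a, then b, then c, and accepts only if the whole input is used up. The function name recognize and the helper run are hypothetical. Try working the exercise by hand before checking answers with it.

```python
def recognize(sentence):
    """Return True if sentence is generated by <S> -> <A> <B> <C>."""
    pos = 0

    def run(letter):
        # Mirrors <X> -> letter <X> | letter: one or more copies of letter.
        nonlocal pos
        start = pos
        while pos < len(sentence) and sentence[pos] == letter:
            pos += 1
        return pos > start          # at least one copy was consumed

    # <S> -> <A> <B> <C>, and nothing may remain afterwards.
    return run("a") and run("b") and run("c") and pos == len(sentence)

print(recognize("aabcc"))   # True
print(recognize("abca"))    # False: trailing a after the run of c
```

This hand-written approach works because the grammar is simple enough to decide with a single left-to-right scan; it is not a general parsing technique.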
What next? (i)
• Write BNF for the following constructs from your favorite programming language:
  • Assignment statement
    • Include operators +, -, *, /, %, ++, --
  • Complete while and if statements
  • Class header for Java/C++/C#
  • etc.
What next? (ii)
• Given this grammar:
<assign> → <var> = <expr>
<var> → A | B | C | D
<expr> → <var> + <expr> | <var> * <expr> | ( <expr> ) | <var>
• Show both leftmost and rightmost derivations for the following sentences:
  • A = A * ( B + ( C * A ) )
  • B = B * ( (D) + C )
  • C = A + B + C * D + A
What next? (iii)
• Use BNF to write a grammar for reverse Polish notation
• Use <expr> as your start symbol
• Valid sentences include:
  • 5 8 19 + *
  • 2 3 + 5 7 * /
  • 2 3 + 5 7 * / 3 4 + * 1 -
  • 9 9 * 8 7 * * 5 5 * * 4 -
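When testing a candidate grammar, it can help to have an independent check of which strings are well-formed reverse Polish notation. The sketch below does not use a grammar at all. It tracks how many operands would be on an evaluation stack. It assumes operands are non-negative integer literals and that the only operators are the binary +, -, *, and /. The name is_valid_rpn is hypothetical.

```python
OPERATORS = {"+", "-", "*", "/"}    # assumed: binary operators only

def is_valid_rpn(sentence):
    """Return True if the space-separated sentence is well-formed RPN."""
    depth = 0                       # operands currently on the stack
    for word in sentence.split():
        if word.isdigit():
            depth += 1              # an operand is pushed
        elif word in OPERATORS:
            if depth < 2:
                return False        # an operator needs two operands
            depth -= 1              # two operands popped, one result pushed
        else:
            return False            # unknown symbol
    return depth == 1               # exactly one result must remain

print(is_valid_rpn("5 8 19 + *"))   # True
print(is_valid_rpn("5 +"))          # False
```

A correct BNF grammar should generate exactly the strings this check accepts, under the same assumptions about operands and operators.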
What next? (iv)
• Read and study Chapter 3
• Do the exercises at the end of Chapter 3