650 likes | 792 Views
Lecture 3 Concepts of Programming Languages. Arne Kutzner Hanyang University / Seoul Korea. Topics. Formal Methods of Describing Syntax / BNF and Context free grammars Attribute Grammars Describing the Meanings of Programs / Semantics. Introduction.
E N D
Lecture 3Concepts of Programming Languages Arne Kutzner Hanyang University / Seoul Korea
Topics • Formal Methods of Describing Syntax / BNF and Context free grammars • Attribute Grammars • Describing the Meanings of Programs / Semantics Concepts of Programming Languages
Introduction • Syntax: form and structure of the program code • Semantics: meaning of the program code (meaning of expressions, statements, control structures, …) • Syntax and semantics provide a language’s definition • Users of a language definition • Implementers • Programmers (the users of the language) Concepts of Programming Languages
General Problem of Describing Syntax: Terminology • A sentenceis a string of characters over some alphabet • A language is a set of sentences • Alexemeis the lowest level syntactic unit of a language (e.g., *, sum, begin) • A tokenis a category of lexemes (e.g., identifier) Concepts of Programming Languages
Formal Definition of Languages • Generators (Grammars) • A device that generates sentences of a language • One can determine if the syntax of a particular sentence is syntactically correct by comparing it to the structure of the generator • Recognizers (Parser) • A recognition device reads input strings over the alphabet of the language and decides whether the input strings belong to the language • Example: syntax analysis part of a compiler • We will first look into the world of generators and later we will study the recognizer Concepts of Programming Languages
BNF / Context-Free Grammars • Context-Free Grammars (CFG) • Developed by Noam Chomsky in the mid-1950s • Language generators, meant to describe the syntax of natural languages • Define a class of languages called context-free languages • Backus-Naur Form (1959) • Invented by John Backus to describe Algol 58 • BNF and context-free grammars represent equal concepts Concepts of Programming Languages
BNF (CFG) Fundamentals • In BNF, abstractions are used to represent classes of syntactic structures--they act like syntactic variables (also called nonterminal symbols, or just nonterminals) • Terminals are lexemes or tokens • A rule has a left-hand side (LHS), which is a nonterminal, and a right-hand side (RHS), which is a string of terminals and/or nonterminals Concepts of Programming Languages
BNF / GCF grammar Rules • An abstraction (or nonterminal symbol) can have more than one RHS <stmt> <single_stmt> | begin <stmt_list> end Concepts of Programming Languages
BNF (CFG) Fundamentals • Nonterminals are often enclosed in angle brackets • Examples of BNF rules: <ident_list> → identifier | identifier , <ident_list> <if_stmt> → if <logic_expr> then <stmt> • Grammar: a finite non-empty set of rules • A start symbol is a special element of the nonterminals of a grammar Concepts of Programming Languages
CFG - Formal Definition • A context-free grammar G is defined by the 4-tuple G = (V, Σ, R, S) where • V is a finite set; each element v in V is called a non-terminal character or a variable. Each variable represents a different type of phrase or clause in the sentence. • Σ is a finite set of terminals, disjoint from V, which make up the actual content of the sentence. The set of terminals is the alphabet of the language defined by the grammar G . • R is a finite relation from V to (V + Σ)*, where the asterisk represents the Kleene star operation. The members of R, are called the (rewrite) rules or productions of the grammar. (also commonly symbolized by a P) • S is the start variable (or start symbol), used to represent the whole sentence (or program). It must be an element of V. Concepts of Programming Languages
Example: Describing Lists • Syntactic lists are described using recursion <ident_list> ident | ident ',' <ident_list> Concepts of Programming Languages
Derivation • A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols) Concepts of Programming Languages
An Example Grammar <program> <stmts> <stmts> <stmt> | <stmt> ';' <stmts> <stmt> <var> '=' <expr> <var> 'a' | 'b' | 'c' | 'd' <expr> <term> '+' <term> | <term> '-' <term> <term> <var> | 'const' Concepts of Programming Languages
An Example Derivation <program> => <stmts> => <stmt> => <var> = <expr> => a = <expr> => a = <term> + <term> => a = <var> + <term> => a = b + <term> => a = b + const Concepts of Programming Languages
Derivations – Some Notions • Every string of symbols in a derivation is a sentential form • A sentence is a sentential form that has only terminal symbols • A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded. Accordingly a rightmost derivation for the rightmost nonterminal. • A derivation may be neither leftmost nor rightmost Concepts of Programming Languages
Parse Tree • A parse tree represents a hierarchical representation of a derivation by means of a tree.Example: <program> <stmts> <stmt> <var> = <expr> a <term> + <term> <var> const b Concepts of Programming Languages
Ambiguity in Grammars • A grammar is ambiguous if and only if it generates a sentential form (the language generated by the grammar comprises at least one word) that has two or more distinct parse trees Concepts of Programming Languages
<expr> <expr> <expr> <op> <expr> <expr> <op> <op> <expr> <expr> <op> <expr> <expr> <op> <expr> const '-' const ‘+' const const '-' const ‘+' const Example for Ambiguous Grammar <expr> <expr> <op> <expr> | const <op> ‘+' | '-' word: const – const + const Concepts of Programming Languages
<expr> <expr> - <term> <term> <term> + const const const Unambiguous Grammar • Grammar on the slide before in unambiguous form: • We get some form of operator precedence <expr> <expr> - <term> | <term> <term> <term> + const | const Concepts of Programming Languages
<expr> <expr> <expr> + const <expr> + const const Associativity of Operators • Operator associativity can also be indicated by a grammar. Example: <expr> -> <expr> + <expr> | const (ambiguous) <expr> -> <expr> + const | const (unambiguous) Concepts of Programming Languages
The empty word ε • Grammars can comprise rules of the form: <A> ε • ε is the empty word; the above rule indicates that the non-terminal <A> can be deleted without any replacement • We will use the empty world later in the context of parsers. Concepts of Programming Languages
Extended BNF • Optional parts are placed in brackets [ ] <proc_call> -> ident [(<expr_list>)] • Alternative parts of RHSs are placed inside parentheses and separated via vertical bars <term> -> <term>(+|-) const • Repetitions (0 or more) are placed inside braces { } <ident> -> letter {letter|digit} Concepts of Programming Languages
BNF and EBNF - Examples • BNF <expr> <expr> + <term> | <expr> - <term> | <term> <term> <term> * <factor> | <term> / <factor> | <factor> • EBNF <expr> <term> {(+ | -) <term>} <term> <factor> {(* | /) <factor>} Concepts of Programming Languages
Variations in EBNF • Use of a colon instead of -> • Use of opt for optional parts • Use of oneof for choices Concepts of Programming Languages
Static Semantics • Nothing to do with meaning • Context-free grammars (CFGs) cannot describe all of the syntax of programming languages • Categories of constructs that are “trouble-maker”: • Still Context-free, but cumbersome (e.g., types of operands in expressions) • Non-context-free (e.g., variables must be declared before they are used) Concepts of Programming Languages
Attribute Grammars • Attribute grammars (AGs) have additions to CFGs to carry some semantic info on parse tree nodes • Primary value of AGs: • Static semantics specification for static semantics checks Concepts of Programming Languages
Attribute Grammars : Definition • Def: An attribute grammar is a context-free grammar with the following additions: • For each grammar symbol x there is a set A(x) of attribute values • Each rule has a set of functions that define certain attributes of the nonterminals in the rule • Each rule has a (possibly empty) set of predicates to check for attribute consistency Concepts of Programming Languages
Attribute Grammars: Definition • Let X0 X1 ... Xn be a rule • Functions of the form S(X0) = f(A(X1), ... ,A(Xn)) define synthesized attributes • Functions of the form I(Xj) = f(A(X0), ... , A(Xn)), for 1 <= j <= n, define inherited attributes • Initially, there are intrinsic attributes on the leaves Concepts of Programming Languages
Attribute Grammars: An Example • Syntax <assign> -> <var> = <expr> <expr> -> <var> + <var> | <var> <var> -> id • Attributes • actual_type: synthesized with the non-terminals <var>and<expr> • expected_type: inherited with the non-terminal <expr> Concepts of Programming Languages
Attribute Grammar (Example) • Syntax rule: <assign> <var> = <expr>Semantic rules: <expr>.expected_type <var>.actual_type • Syntax rule: <var> idSemantic rule:<var>.actual_type lookup (id.type) • Syntax rule: <expr> <var>Semantic rules: <expr>.actual_type <var>.actual_typePredicate: <expr>.expected_type == <expr>.actual_type • Syntax rule: <expr> <var>[1] + <var>[2]Semantic rules: <expr>.actual_type <var>[1].actual_typePredicate:<var>[1].actual_type == <var>[2].actual_type<expr>.expected_type == <expr>.actual_type Concepts of Programming Languages
Attribute Grammars (continued) • How are attribute values computed? • If all attributes were synthesized, the tree could be decorated in bottom-up order. • If all attributes were inherited, the tree could be decorated in top-down order. • In many cases, both kinds of attributes are used, and it is some combination of top-down and bottom-up that must be used. Concepts of Programming Languages
Semantics • There is no single generally, widely accepted notation or formalism for describing program semantics • We have several general approaches: • Operational Semantics(Behavior oriented) • Axiomatic Semantics(Formal Logic, Predicate calculus oriented) • Denotational Semantics(Domain oriented) Concepts of Programming Languages
Operational Semantics • Approach 1: Describe the meaning of the language constructs by using clean and unambiguous natural language in mix with formal elementsExample:Current ISO C++ standard (ISO/IEC 14882:2012(E) ) Concepts of Programming Languages
Operational Semantics (continued) • Approach 2: Use some form of formal method/machinery as description tool.Examples: • Vienna Definition Language (VDL) http://en.wikipedia.org/wiki/Vienna_Development_Method • Formal (state) machine approach. You get the meaning of some code by executing it on a formal machine • May require a translation scheme for getting formal machine code out of language code. • Use a mathematical formal system like a calculus as replacement for the formal machine in the above approach Concepts of Programming Languages
Operational Semantics (continued) • Uses of operational semantics: • Language definition itself for compiler constructors as well as language users. • Proofing the correctness of automatic program transformations/optimizations. • Later we will meet some form of operational semantics ... Concepts of Programming Languages
Axiomatic Semantics • Based on formal logic (predicate calculus) • Original purpose: Formal program verification • Axioms or inference rules are defined for each statement type in the language (to allow transformations of logic expressions into more formal logic expressions) • The logic expressions are called assertions Concepts of Programming Languages
Logic Implications • P implies Q (symbolized as P⇒ Q)Meaning: • If condition P is true, then condition Q must be true • If condition P is false, the implication is true independent of the value of Q As logic table: Concepts of Programming Languages
Logic Implications • Example: b>10 ⇒ b>5Proof: (Range check for b)b in [11..infinite] b>10 is true and b>5 is trueb in [6..10] b>10 is false and b>5 is trueb in [-infinite..5] b>10 is false and b>5 is false • Intuition: With respect to the range of all integers b>10 is “more often false”, then b>5. So the condition b>10 it is less general and more restrictive; We say it is a stronger condition. Concepts of Programming Languages
Logic Implications • Weaker versus stronger conditions:P ⇒ Q stronger condition weaker condition • P is a stronger condition then Q andQ is a weaker condition then P Concepts of Programming Languages
Axiomatic Semantics Form {P} statement {Q} Precondition Postcondition • Preconditions and postconditions are called assertions and will be evaluated to either true or false • The pair of a prepcondition and postcondition is called a specification. • A statement is correct with respect to its specification, if pre-condition and post-condition become true Concepts of Programming Languages
Axiomatic Semantics Form An example • a = b + 1 {a > 1} • One possible precondition: {b > 10} • Weakest precondition: {b > 0} • A weakest precondition is the least restrictive precondition that will guarantee the postcondition Concepts of Programming Languages
Inference rules • H1, H2, ..., Hn H • This notation can be interpreted as:If H1, H2, ..., Hn have all been verified, we may conclude that H is valid. Concepts of Programming Languages
Rule of consequence Meaning: We can strengthen the precondition and we can weaken the postcondition in order to get a fresh “valid” specification Example: ,b>30 ⇒ b>22, a>10 ⇒a>5 {a > 10} {b > 22} a = b / 2 - 1 {a > 5} {b > 30} a = b / 2 - 1 Concepts of Programming Languages
Axiomatic Semantics: Axioms • An axiom for assignment statements (x = E): {Qx->E} x = E {Q} • Delivers the weakest precondition for Q E Q x {b > 22} {a > 10} a = b / 2 - 1 b / 2 – 1 > 10 => b > 22 Concepts of Programming Languages
Sequences • {P1} S1 {Q}, {Q} S2 {Q2} {P1} S1; S2 {Q2} b < 2} 3 * b + 1 < 7 => b < 2 y = 3 * b + 1; a = y + 3; {y < 7} b < 2} {y < 7} {a < 10} y + 3 < 10 => y < 7 a < 10} Concepts of Programming Languages
Selection Statements • {B and P} S1 {Q}, {(not B) and P} S2 {Q} {P} if B then S1 else S2 {Q} {y > 1} apply the rule of consequence if x > 0 then a = y - 1 else a = y + 1 and x > 0 {y > 1} a = y – 1 {a > 0} {y > -1} a = y + 1 {a > 0} and not (x > 0) {a > 0} Concepts of Programming Languages
Axiomatic Semantics: while • Inference rule for logical pretest loops where I is the loop invariant (the inductive hypothesis) • Axiomatic description for while loops: {P} while B do S end {Q} Concepts of Programming Languages
Loop Invariant Requirements • The loop invariant must satisfy the following requirements in order to be useful: • P => I (precondition implies invariant) • {I and B} S {I} (axiomatic semantics during loop iteration / the condition B is true) • {I and (not B)} => Q (invariant at termination implies postcondition) • loop terminates Concepts of Programming Languages
Finding a loop invariant … • Example:while y != x do y = y + 1 end {y = x} • One possible technique: unroll the loop..Zero iterations: {y = x}One iteration: {y = x - 1}y = y + 1 {y = x}Two iterations:{y = x – 2}y = y + 1 {y = x - 1}Three iterations:{y = x – 3}y = y + 1 {y = x - 2} y <= x assumed loop invariant … Concepts of Programming Languages
Check whether the loop invariant is a (weakest) precondition • We select P as {y <= x} and I as {y <= x} : • P = I directly implies P=>I • {I and B} S {I}We have:{y <= x and y != x} y = y + 1 {y <= x}Verification: We get {y + 1 <= x} , which is equivalent to {y < x}, equivalent to {y <= x and y != x} • {I and (not B)} => QWe have:{(y < = x) and not (y !=x)} => {y = x}this is trivially true … • Loop terminates obviously • So, the loop invariant can be used as precondition.Further it represents a weakest precondition Concepts of Programming Languages