Programming Language Theory: Formal Semantics
Leif Grönqvist
The national Graduate School of Language Technology (GSLT)
MSI
Contents
Leif's three parts of the course:
• Functional programming
• Logic programming
  (similar ways of thinking to each other, but different from the imperative way)
• Formal semantics
Formal Semantics?
• You have seen informal semantics (meaning) earlier in the course
• Most languages don't have a complete formal specification
• Why define a formal specification?
  • Programs can be proved correct
  • The compiler can be validated
  • It becomes easier to build a compiler!
• There is no agreed standard for describing semantics
Principal methods
• Operational semantics
  • Defines the language by describing its actions as operations of a machine
  • The machine has to be precisely defined
• Denotational semantics
  • Uses mathematical functions on programs
  • Programs are translated into functions
  • Standard mathematical theory of functions is used
• Axiomatic semantics
  • Uses mathematical logic
  • Pre- and postconditions
  • Aimed specifically at correctness proofs
Principal methods, cont.
• All three methods are based on the BNF rules (Backus-Naur Form)
• The syntax is described in BNF; the semantics is the rest, i.e.
  • Static types (could be counted as syntax)
  • Semantic rules
• Important properties of the semantics:
  • Completeness: every correct program is given a semantics
  • Consistency: one program -> one unique description
  • Independence: the set of rules should be minimal (no rule derivable from the others)
A small example language
    expr → expr + term | expr - term | term
    term → term * factor | factor
    factor → ( expr ) | number
    number → number digit | digit
    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Example: 4 * (50 - 3)
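The grammar above can be tried out with a small recursive-descent evaluator. This is only a sketch under stated assumptions: the function names (`tokenize`, `parse_expr`, `parse_term`, `parse_factor`, `evaluate`) are made up for illustration, the left-recursive rules are turned into loops, and values are computed directly instead of building a parse tree. Characters that are not part of the grammar are silently skipped by the tokenizer.

```python
import re

def tokenize(text):
    """Split the input into numbers, operators and parentheses."""
    return re.findall(r"\d+|[-+*()]", text)

def parse_expr(tokens, pos):
    # expr -> expr + term | expr - term | term  (left recursion as a loop)
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        rhs, next_pos = parse_term(tokens, pos + 1)
        value = value + rhs if tokens[pos] == "+" else value - rhs
        pos = next_pos
    return value, pos

def parse_term(tokens, pos):
    # term -> term * factor | factor
    value, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        rhs, pos = parse_factor(tokens, pos + 1)
        value = value * rhs
    return value, pos

def parse_factor(tokens, pos):
    # factor -> ( expr ) | number
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "(":
        value, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    if not tokens[pos].isdigit():
        raise SyntaxError("expected a number, got " + tokens[pos])
    return int(tokens[pos]), pos + 1

def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected token " + tokens[pos])
    return value

print(evaluate("4 * (50 - 3)"))  # 188
```

Because the loops consume operators from left to right, subtraction comes out left-associative, as the left-recursive grammar prescribes.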
Expand the example
• Let's add variables, statements, and assignments:
    factor → ( expr ) | number | identifier
    program → stmtList
    stmtList → stmt ; stmtList | stmt
    stmt → identifier := expr
    identifier → identifier letter | letter
    letter → a | b | c | … | z
Example of a program
    a := 2 + 3 ; b := a * 4 ; a := b - 5 ;
• Will result in: b=20 and a=15
• {b=20, a=15} represents the semantics of this program
• It is a function from the set of identifiers to integers
• We will call this function an environment:
    Env (I) = 15     if I = a
              20     if I = b
              undef  otherwise
Operations on environments
    Env : Identifier → Integer ∪ {undef}
• Lookup: Env (I)
• Adding: Env & {I = n}
• The empty environment: Env0(I) = undef for all I
• More complex environments include pointers, aliases, and scope information
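The three operations can be sketched with an ordinary dictionary. The names `ENV0`, `lookup` and `extend` are assumptions made for this illustration, and `None` stands in for undef.

```python
from typing import Dict, Optional

Env = Dict[str, int]      # identifiers missing from the dict count as undef

ENV0: Env = {}            # the empty environment Env0

def lookup(env: Env, ident: str) -> Optional[int]:
    """Env(I): the value bound to ident, or None standing in for undef."""
    return env.get(ident)

def extend(env: Env, ident: str, value: int) -> Env:
    """Env & {I = n}: a new environment; the old one is left unchanged."""
    return {**env, ident: value}

# The three assignments of the example program, applied one after another.
env = extend(extend(extend(ENV0, "a", 5), "b", 20), "a", 15)
print(lookup(env, "a"), lookup(env, "b"), lookup(env, "c"))  # 15 20 None
```

Returning a fresh dictionary from `extend` keeps each intermediate environment available, which is convenient when writing out reduction steps by hand.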
Expand a little bit more
• Let's add if and while statements:
    stmt → assignStmt | ifStmt | whileStmt
    assignStmt → identifier := expr
    ifStmt → if expr then stmtList else stmtList fi
    whileStmt → while expr do stmtList od
• An expression counts as true if its value is > 0
An example
    n := 0 - 5 ;
    if n then i := n else i := 0 - n fi ;
    f := 1 ;
    while i do f := f * i ; i := i - 1 od
• n is -5, which is not > 0, so the else branch sets i to 5
• The semantics is {n=-5, i=0, f=120}
An abstract syntax
• A simplified version of the syntax
• Useful, since the parsing is already done
• We can assume the program is syntactically correct
• To make it more compact we use:
    P: Program
    L: Statement list
    S: Statement
    E: Expression
    N: Number
    D: Digit
    I: Identifier
    A: Letter
The abstract syntax
    P → L
    L → L1 ; L2 | S
    S → I := E | if E then L1 else L2 fi | while E do L od
    E → E1 + E2 | E1 - E2 | E1 * E2 | ( E1 ) | N
    N → N1 D | D
    D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    I → I1 A | A
    A → a | b | c | … | z
Abstract syntax, cont.
• Semantic rules are defined for each right-hand side in terms of the semantics of its parts
• Note that the indexes on the letters are important
• It is important to distinguish between the symbol '+' in the syntax and the ordinary arithmetic +
• A rule will later tell us to replace the symbol '+' by the operation +
Operational semantics
• Describes how a program is to be executed on a known machine
• If the machine is a computer, the operational semantics is a translator (compiler) from the language to machine code for that computer
• Fortran and C have been defined this way
• The machine could also be an abstract machine, simple enough to be understood and simulated by hand
A reduction machine
• Our example language may be translated to semantic values using reduction rules:
    ( 3 + 4 ) * 5
      ⇒ (7) * 5    (add the numbers)
      ⇒ 7 * 5      (drop parentheses)
      ⇒ 35         (multiply the numbers)
• The rules are similar to logical inference rules
Logical inference rules
• The rules are written in the form:

      premise
    ------------
     conclusion

• For example:

    a + b = c        a → b, b → c
    ---------        ------------
    b + a = c           a → c
Logical inference rules, cont.
• If we don't have a premise, the rule is called an axiom:

    ---------
    a + 0 = a

• Often written simply as: a + 0 = a
Reduction rules: arithmetic
• We have the following abstract syntax:
    E → E1 + E2 | E1 - E2 | E1 * E2 | ( E1 ) | N
    N → N1 D | D
    D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  E: Expression, N: Number, D: Digit
• This may look complicated, but we have to give the semantics somewhere
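Reduction rules, cont.
(Quoted symbols are syntax; unquoted operators and numbers are ordinary arithmetic. V denotes a value.)
• First all the digits (axioms, rule 1):
    '0' ⇒ 0, '1' ⇒ 1, …, '9' ⇒ 9
• Numbers (rule 2):
    V '0' ⇒ 10 * V, V '1' ⇒ 10 * V + 1, …, V '9' ⇒ 10 * V + 9
• And the operations (rules 3-6):
    V1 '+' V2 ⇒ V1 + V2
    V1 '-' V2 ⇒ V1 - V2
    V1 '*' V2 ⇒ V1 * V2
    '(' V ')' ⇒ V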
Reduction rules, expressions
• An expression that consists of two expressions with an operator between them may be reduced in steps, starting with the left operand (rules 7-9):

        E ⇒ E1                E ⇒ E1                E ⇒ E1
    ----------------      ----------------      ----------------
    E + E2 ⇒ E1 + E2      E - E2 ⇒ E1 - E2      E * E2 ⇒ E1 * E2
Reduction rules, cont.
• If the left side of an operator has been evaluated to a value, then evaluate the right side (rules 10-12):

        E ⇒ E1              E ⇒ E1              E ⇒ E1
    --------------      --------------      --------------
    V + E ⇒ V + E1      V - E ⇒ V - E1      V * E ⇒ V * E1
Reduction rules, cont.
• Reduce the expression inside parentheses (rule 13):

        E ⇒ E1
    ----------------
    ( E ) ⇒ ( E1 )

• Transitivity: expressions may be evaluated in several steps (rule 14):

    E ⇒ E1, E1 ⇒ E2
    ----------------
         E ⇒ E2
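Example of a reduction
• We start with the expression 2 * (3 + 4) - 5; each line names the rules used to reach it. Steps that look unchanged replace digits by their values.
    2 * (3 + 4) - 5
      ⇒ 2 * (3 + 4) - 5    (rules 1, 7)
      ⇒ 2 * (3 + 4) - 5    (rules 1, 10)
      ⇒ 2 * (7) - 5        (rule 3)
      ⇒ 2 * 7 - 5          (rules 1, 12)
      ⇒ 14 - 5             (rule 5)
      ⇒ 14 - 5             (rules 11, 1)
      ⇒ 9                  (rule 4)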
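The reduction rules can be simulated with a small one-step function. This is a sketch under assumptions of its own: expressions are nested tuples such as `("+", e1, e2)` and `("par", e)`, numerals are strings, values are Python integers, and rules 1 and 2 are collapsed into a single numeral-to-value step. The names `step` and `reduce_all` are made up for the illustration.

```python
def step(e):
    """Perform exactly one reduction step on a non-value expression."""
    if isinstance(e, str):                      # rules 1-2: numeral => value
        return int(e)
    if e[0] == "par":
        inner = e[1]
        if isinstance(inner, int):              # rule 6: ( V ) => V
            return inner
        return ("par", step(inner))             # rule 13: reduce inside ( )
    op, left, right = e
    if not isinstance(left, int):               # rules 7-9: left operand first
        return (op, step(left), right)
    if not isinstance(right, int):              # rules 10-12: then the right
        return (op, left, step(right))
    # rules 3-5: both operands are values, apply the arithmetic operation
    return {"+": left + right, "-": left - right, "*": left * right}[op]

def reduce_all(e):
    """Rule 14 (transitivity): keep stepping until only a value remains."""
    steps = 0
    while not isinstance(e, int):
        e = step(e)
        steps += 1
    return e, steps

# 2 * (3 + 4) - 5
example = ("-", ("*", "2", ("par", ("+", "3", "4"))), "5")
print(reduce_all(example))  # (9, 8)
```

Because every digit conversion counts as a step of its own here, the step count can differ from a hand-written trace that merges several conversions into one line.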
Adding environments
• Recall the rest of the abstract syntax:
    P → L
    L → L1 ; L2 | S
    S → I := E | if E then L1 else L2 fi | while E do L od
    E → E1 + E2 | E1 - E2 | E1 * E2 | ( E1 ) | N
  P: Program, L: Statement list, S: Statement, E: Expression
• We have to add an environment:
    Env : Identifier → Integer ∪ {undef}
• It has to be updated in the reduction rules
Environments, cont.
• <E | Env> indicates that E is evaluated in the environment Env
• Most rules do not change Env, for example:

          <E | Env> ⇒ <E1 | Env>
    ----------------------------------
    <E - E2 | Env> ⇒ <E1 - E2 | Env>

• If Env changes, we have a side effect
Environments, cont.
• If I has the value V (rule 15):

          Env (I) = V
    ------------------------
    <I | Env> ⇒ <V | Env>

• A rule for assignment (rule 16):
    <I := V | Env> ⇒ < | Env & {I = V}>
• Expressions in assignments (rule 17):

          <E | Env> ⇒ <E1 | Env>
    ----------------------------------
    <I := E | Env> ⇒ <I := E1 | Env>
Environments, cont.
• A statement sequence (rule 18):

        <S | Env> ⇒ < | Env1>
    ------------------------------
    <S ; L | Env> ⇒ <L | Env1>

• A program needs an empty environment (rule 19):
    L ⇒ <L | Env0>
A small example
• The program:
    a := 2 + 3 ; b := a * 4 ; a := b - 5 ;
• Rule 19 gives:
    a := 2 + 3 ; b := a * 4 ; a := b - 5
      ⇒ <a := 2 + 3 ; b := a * 4 ; a := b - 5 | Env0>
Example, cont.
• Rule 3 gives:
    <a := 2 + 3 | Env0> ⇒ <a := 5 | Env0>
• And rules 16, 17:
    <a := 5 | Env0> ⇒ < | Env0 & {a=5}> = < | {a=5}>
• Rule 18:
    <a := 2 + 3 ; b := a * 4 ; a := b - 5 | Env0>
      ⇒ <b := a * 4 ; a := b - 5 | {a=5}>
Example, cont.
• Rules 15, 9, 5, 16, 17:
    <b := a * 4 | {a=5}>
      ⇒ <b := 5 * 4 | {a=5}>
      ⇒ <b := 20 | {a=5}>
      ⇒ < | {a=5} & {b=20}> = < | {a=5, b=20}>
• Rule 18:
    <b := a * 4 ; a := b - 5 | {a=5}> ⇒ <a := b - 5 | {a=5, b=20}>
• And then:
    <a := b - 5 | {a=5, b=20}>
      ⇒ <a := 15 | {a=5, b=20}>
      ⇒ {a=5, b=20} & {a=15} = {a=15, b=20}
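The worked example can be replayed mechanically. The sketch below makes some simplifying assumptions: a program is a list of `(identifier, expression)` pairs, expressions are nested tuples, numbers are already integer values (so the digit rules are skipped), and an undefined identifier raises `KeyError` instead of yielding undef. The names `step_expr` and `run` are hypothetical.

```python
def step_expr(e, env):
    """One reduction step on an expression that is not yet a value."""
    if isinstance(e, str):
        return env[e]                           # rule 15: Env(I) = V
    op, left, right = e
    if not isinstance(left, int):               # reduce the left operand first
        return (op, step_expr(left, env), right)
    if not isinstance(right, int):              # then the right operand
        return (op, left, step_expr(right, env))
    return {"+": left + right, "-": left - right, "*": left * right}[op]

def run(program):
    env = {}                                    # rule 19: start in Env0
    for ident, expr in program:                 # rule 18: one statement at a time
        while not isinstance(expr, int):        # rule 17: reduce the right-hand side
            expr = step_expr(expr, env)
        env = {**env, ident: expr}              # rule 16: Env & {I = V}
    return env

# a := 2 + 3 ; b := a * 4 ; a := b - 5
program = [("a", ("+", 2, 3)), ("b", ("*", "a", 4)), ("a", ("-", "b", 5))]
print(run(program))  # {'a': 15, 'b': 20}
```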
if and while statements
• Recall the abstract syntax:
    S → I := E | if E then L1 else L2 fi | while E do L od
  I: Identifier, L: Statement list, S: Statement, E: Expression
• We need three rules for the if statement:
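The if statement

                        <E | Env> ⇒ <E1 | Env>
    ------------------------------------------------------------------
    <if E then L1 else L2 fi | Env> ⇒ <if E1 then L1 else L2 fi | Env>

                      V > 0
    ---------------------------------------------
    <if V then L1 else L2 fi | Env> ⇒ <L1 | Env>

                      V ≤ 0
    ---------------------------------------------
    <if V then L1 else L2 fi | Env> ⇒ <L2 | Env>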
The while statement

    <E | Env> ⇒ <V | Env>,  V ≤ 0
    ------------------------------
    <while E do L od | Env> ⇒ Env

                <E | Env> ⇒ <V | Env>,  V > 0
    ------------------------------------------------------
    <while E do L od | Env> ⇒ <L ; while E do L od | Env>

• Note that the second rule is recursive!
A while example
    i := 3 ; f := 1 ; while i do f := f * i ; i := i - 1 od

    <f := f * i | {i=3, f=1}>
      ⇒ <f := 1 * 3 | {i=3, f=1}>
      ⇒ <f := 3 | {i=3, f=1}>
      ⇒ {i=3, f=3}
  and
    <i := i - 1 | {i=3, f=3}> ⇒ {i=2, f=3}
A while example, cont.
    <while i do f := f * i ; i := i - 1 od | {i=3, f=1}>
      ⇒ <f := f * i ; i := i - 1 ; while i do f := f * i ; i := i - 1 od | {i=3, f=1}>
      ⇒ <i := i - 1 ; while i do f := f * i ; i := i - 1 od | {i=3, f=3}>
      ⇒ <while i do f := f * i ; i := i - 1 od | {i=2, f=3}>
      ⇒ <f := f * i ; i := i - 1 ; while i do f := f * i ; i := i - 1 od | {i=2, f=3}>
      ⇒ <i := i - 1 ; while i do f := f * i ; i := i - 1 od | {i=2, f=6}>
      ⇒ <while i do f := f * i ; i := i - 1 od | {i=1, f=6}>
      ⇒ <f := f * i ; i := i - 1 ; while i do f := f * i ; i := i - 1 od | {i=1, f=6}>
      ⇒ <while i do f := f * i ; i := i - 1 od | {i=0, f=6}>
      ⇒ {i=0, f=6}
• And we are finally done!
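The unfolding of the while loop can be replayed with a one-step function on configurations. This sketch rests on assumptions not found in the slides: a statement list is a Python list, statements are tuples tagged `"assign"`, `"if"` or `"while"`, and expressions are evaluated in one go by a hypothetical helper `eval_expr` rather than step by step.

```python
def eval_expr(e, env):
    """Evaluate an expression to a value in one go (a shortcut, see above)."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return env[e]
    op, left, right = e
    a, b = eval_expr(left, env), eval_expr(right, env)
    return a + b if op == "+" else a - b if op == "-" else a * b

def step(stmts, env):
    """One step of <S ; L | Env>: returns the next configuration."""
    first, rest = stmts[0], stmts[1:]
    if first[0] == "assign":
        _, ident, expr = first
        return rest, {**env, ident: eval_expr(expr, env)}
    if first[0] == "if":
        _, cond, then_branch, else_branch = first
        chosen = then_branch if eval_expr(cond, env) > 0 else else_branch
        return chosen + rest, env
    _, cond, body = first                        # a while statement
    if eval_expr(cond, env) > 0:
        return body + [first] + rest, env        # unfold: L ; while E do L od
    return rest, env                             # condition <= 0: loop is done

def run(stmts):
    env = {}
    while stmts:
        stmts, env = step(stmts, env)
    return env

# i := 3 ; f := 1 ; while i do f := f * i ; i := i - 1 od
loop = ("while", "i", [("assign", "f", ("*", "f", "i")),
                       ("assign", "i", ("-", "i", 1))])
program = [("assign", "i", 3), ("assign", "f", 1), loop]
print(run(program))  # {'i': 0, 'f': 6}
```

The recursive nature of the second while rule shows up as the loop statement being put back on the statement list after its body.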
Implementing operational semantics
• An "executable specification"
• Gives us an interpreter
• Now we can test the language before we implement a real compiler
• Easily done in Prolog!
• 3 * (4 + 5) is represented as the term:
    times(3, plus(4, 5))
• a := 2 + 3 ; b := a * 4 ; a := b - 5 becomes:
    seq(assign(a, plus(2, 3)),
        seq(assign(b, times(a, 4)),
            assign(a, sub(b, 5))))
• This is actually a tree
Implementing operational semantics, cont.
• Reduction rule 3 could look like:
    reduce(plus(V1, V2), R) :-
        integer(V1), integer(V2), !, R is V1 + V2.
• Rule 7 becomes:
    reduce(plus(E, E2), plus(E1, E2)) :- reduce(E, E1).
• Lookup in an environment (rule 15):
    reduce(config(I, Env), config(V, Env)) :-
        atom(I), !, lookup(Env, I, V).
• Update an environment (rule 16):
    reduce(config(assign(I, V), Env), Env1) :-
        integer(V), !, update(Env, value(I, V), Env1).
Denotational semantics
• Now we will use functions to describe the semantics:
    Val: Expression → Integer
• So Val (2 + 3 * 4) should be 14
• Val maps a syntactic domain (the set of correct arithmetic expressions) to a semantic domain (the set of integers)
• We need more functions to cover the example language:
    P: Program → (Input → Output)
• Syntactic domain: the set of correct programs
• Semantic domain: the set of functions that give us the correct answer for each possible input
Denotational semantics, cont.
• We need three parts:
  • Definitions of the syntactic domains
  • Definitions of the semantic domains
  • Definitions of the semantic functions
• The syntactic domain looks almost like the abstract syntax:
    N → N D | D
    D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  N: Number, D: Digit
The semantic domain
• A set that contains all answers to possible inputs
• For the numbers, it's just {0, 1, …}
• The integers with their operations could look like:
    Domain v: Integer = {…, -2, -1, 0, 1, 2, …}
    Operations:
      + : Integer × Integer → Integer
      - : Integer × Integer → Integer
      * : Integer × Integer → Integer
The semantic function
• For each syntactic domain we need a semantic function, for example:
    D: Digit → Integer   (defined by:)
      D[[0]] = 0, D[[1]] = 1, …, D[[9]] = 9
    N: Number → Integer
      N[[N D]] = 10 * N[[N]] + N[[D]]
      N[[D]] = D[[D]]
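The two semantic functions translate almost directly into code. In this sketch numerals are Python strings, and the helper table `DIGIT_VALUES` is an assumption made for the illustration.

```python
DIGIT_VALUES = {str(d): d for d in range(10)}   # D[[0]] = 0, ..., D[[9]] = 9

def D(digit: str) -> int:
    """D: Digit -> Integer."""
    return DIGIT_VALUES[digit]

def N(numeral: str) -> int:
    """N: Number -> Integer, defined by recursion on the numeral's structure."""
    if len(numeral) == 1:
        return D(numeral)                           # N[[D]] = D[[D]]
    return 10 * N(numeral[:-1]) + N(numeral[-1])    # N[[N D]] = 10 * N[[N]] + N[[D]]

print(N("4711"))  # 4711
```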
Some more parts from the example
• The semantic functions for arithmetic:
    E: Expression → Integer
      E[[E1 + E2]] = E[[E1]] + E[[E2]]
      E[[E1 * E2]] = E[[E1]] * E[[E2]]
      E[[( E )]] = E[[E]]
      E[[N]] = N[[N]]
      etc.
  (inside [[ ]] the operators are syntax; outside they are the integer operations)
• The complete example is in sections 13.3.4-13.3.5
Environments
• The environments form a new semantic domain:
    Domain Env: Environment = Identifier → Integer ∪ {undef}
• And expressions:
    E: Expression → (Environment → Integer⊥)
  (Integer⊥ is Integer extended with the undefined value)
• The identifier rule:
    E[[I]] (Env) = Env (I)
Statements
• Recall the syntactic domain:
    S → I := E | if E then L1 else L2 fi | while E do L od
  L: Statement list, S: Statement, E: Expression
• And the semantic function has the type:
    S: Statement → Environment → Environment
• The semantic function for the if statement:
    S[[if E then L1 else L2 fi]] (Env) =
      if E[[E]] (Env) > 0 then L[[L1]] (Env) else L[[L2]] (Env)
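The idea that a statement denotes a function from environments to environments can be sketched with closures. Several things here are assumptions rather than part of the slides: the tuple encoding of the syntax, the clauses for assignment and for statement lists (only the if clause is given above), and the use of `KeyError` in place of the undefined value. The while statement is left out.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def E(expr):
    """E[[expr]]: returns a function Environment -> Integer."""
    if isinstance(expr, int):
        return lambda env: expr
    if isinstance(expr, str):
        return lambda env: env[expr]             # E[[I]](Env) = Env(I)
    op, e1, e2 = expr
    f1, f2 = E(e1), E(e2)
    return lambda env: OPS[op](f1(env), f2(env))

def S(stmt):
    """S[[stmt]]: returns a function Environment -> Environment."""
    if stmt[0] == "assign":                      # assumed clause for I := E
        _, ident, expr = stmt
        value = E(expr)
        return lambda env: {**env, ident: value(env)}
    _, cond, l1, l2 = stmt                       # ("if", E, L1, L2)
    test, then_f, else_f = E(cond), L(l1), L(l2)
    return lambda env: then_f(env) if test(env) > 0 else else_f(env)

def L(stmts):
    """L[[stmts]]: the composition of the statement functions, left to right."""
    funcs = [S(s) for s in stmts]
    def meaning(env):
        for f in funcs:
            env = f(env)
        return env
    return meaning

# n := 0 - 5 ; if n then i := n else i := 0 - n fi
program = [("assign", "n", ("-", 0, 5)),
           ("if", "n", [("assign", "i", "n")], [("assign", "i", ("-", 0, "n"))])]
print(L(program)({}))  # {'n': -5, 'i': 5}
```

Note that `L(program)` is built without any environment at hand; the program's meaning is the resulting function, which is only later applied to the empty environment.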
Implementation
• Denotational semantics fits very well into a functional programming language
• The abstract syntax may look like:
    data Expr = Val Int | Ident String | Plus Expr Expr
              | Minus Expr Expr | Times Expr Expr
• If we have implemented the environment with lookup and insert, the evaluation function may look like:
    exprE :: Expr -> Environment -> Int
    exprE (Plus e1 e2) env = (exprE e1 env) + (exprE e2 env)
    exprE (Minus e1 e2) env = (exprE e1 env) - (exprE e2 env)
    exprE (Times e1 e2) env = (exprE e1 env) * (exprE e2 env)
    exprE (Val n) env = n
    exprE (Ident a) env = lookup env a
Axiomatic semantics
• The basis for mathematical proofs of programs
• We state what must be true before and after the program is executed (these statements are called assertions):
    x := x + 1
  • Precondition: {x = A}
  • Postcondition: {x = A+1}
• This example works fine for all values A
Axiomatic semantics, cont.
• For a division we have to make sure that A ≠ 0:
    {A ≠ 0, y = A} x := 1 / y {x = 1 / y}
• A sort example:
    {n ≥ 1, if 1 ≤ i ≤ n then a[i] = A[i]}
    sort-program
    {sorted (a), permutation (a, A)}
Assertions
• Some languages have support for assertions, in C for example:
    #include <assert.h>
    …
    assert (y != 0);
    x = 1/y;
    …
• If y is 0 the program halts without executing the illegal division:
    Assertion failed at test.c line 27: y != 0
    Exiting due to signal SIGABRT
• Throwing exceptions in Java also works
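The same pattern can be written with Python's `assert` statement, here guarding both the precondition and the postcondition of the division. The function name `reciprocal` and the rounding tolerance are assumptions made for this sketch.

```python
def reciprocal(y):
    # Precondition {y != 0}: fails with AssertionError before any division happens.
    assert y != 0, "precondition violated: y != 0"
    x = 1 / y
    # Postcondition {x = 1 / y}, checked up to floating-point rounding.
    assert abs(x * y - 1) < 1e-9, "postcondition violated: x = 1 / y"
    return x

print(reciprocal(4))  # 0.25
```

Python drops `assert` statements when run with the `-O` option, so checks like these document and test a specification rather than enforce it in production.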
The weakest precondition
• If we think of a program and its pre- and postconditions as:
    {P} C {Q}
• Then the weakest precondition (wp) is the one that is:
  • Strong enough as a precondition
  • Not stronger than any of the other possible preconditions
• Example: for C = x := 1/y, either y > 0 or y < 0 is enough, but y ≠ 0 is weaker and still strong enough
• One way to describe {P} C {Q} is:
    {P} C {Q} iff P → wp (C, Q)
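On a small finite set of states the weakest precondition can be computed by brute force, which makes the example concrete. Everything in this sketch is an assumption made for illustration: the sample range `STATES`, the helper names, and the choice of Q as x = 1/y for the command x := 1/y. Exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction

STATES = range(-5, 6)   # assumed finite sample of initial values for y

def run_command(y):
    """C: x := 1 / y. Returns the final x; raises ZeroDivisionError if y = 0."""
    return Fraction(1, y)

def postcondition(x, y):
    """Q: x = 1 / y, written without division."""
    return x * y == 1

def wp_states():
    """All sampled initial states from which C runs safely and establishes Q."""
    good = set()
    for y in STATES:
        try:
            if postcondition(run_command(y), y):
                good.add(y)
        except ZeroDivisionError:
            pass            # y = 0 does not belong to the weakest precondition
    return good

def is_valid_precondition(p):
    """{P} C {Q} holds on the sample iff every state satisfying P lies in wp(C, Q)."""
    return all(y in wp_states() for y in STATES if p(y))

print(sorted(wp_states()))                       # [-5, -4, -3, -2, -1, 1, 2, 3, 4, 5]
print(is_valid_precondition(lambda y: y > 0))    # True
print(is_valid_precondition(lambda y: y >= 0))   # False
```

The computed set is exactly the sampled states with y ≠ 0, and y > 0 passes because it describes a subset of that set, i.e. a stronger condition.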