CSE 6341 (755) Programming Languages

Neelam Soundarajan
Computer Sc. & Eng.
Dreese Labs 579
e-mail: neelam@cse

Outline
• Main topic: ways to formally define the syntax and semantics of PLs
• Plus … a bit about programming methodologies
• Tentative schedule:
  • Attribute grammars: 3 weeks
  • Operational semantics (including Lisp & its interpreter): 3.5 weeks
  • Axiomatic semantics: 3.5 weeks
  • Denotational semantics: 1 week
  • Other topics: ??
  • Exams etc.: 0.5 week

References
Unfortunately, there are no good books for the course. We will mainly depend on the slides, class discussions, and your notes. DO NOT miss classes.

Some useful references:
• Formal specification of programming languages, F. Pagan
• Formal syntax and semantics of programming languages, Kurtz and Slonneger
• Lisp 1.5 programmer's manual, McCarthy and others

Copies of all are on reserve in the Sc./Eng. Library.

Attribute Grammars
(Ref.: Pagan (Ch. 2.3); Kurtz (Ch. 3))

Fact: Context-free conditions are specified using BNF.
Question: How do we specify context-sensitive (CS) conditions?
Answer: Using Attribute Grammars (AGs).

An AG is:
  a BNF grammar
  + attributes
  + rules for evaluating attributes
  + conditions (to capture CS requirements)

Example
L = { a^n b^n c^n | n >= 1 }

<as> ::= a                Na(<as>) ← 1
       | a<as>1           Na(<as>) ← Na(<as>1) + 1
<bs> ::= b                Nb(<bs>) ← 1
       | b<bs>1           Nb(<bs>) ← Nb(<bs>1) + 1
<cs> ::= c                Nc(<cs>) ← 1
       | c<cs>1           Nc(<cs>) ← Nc(<cs>1) + 1
<ls> ::= <as><bs><cs>     Cond: Na(<as>) = Nb(<bs>) = Nc(<cs>)

Na: Synthesized attribute of <as>
Nb: Synthesized attribute of <bs>
Nc: Synthesized attribute of <cs>
No inherited attributes in this grammar.
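A minimal Python sketch of how the synthesized attributes above could be evaluated. The function names (`count_run`, `accepts`) and the use of a regular expression for the context-free part are assumptions made for illustration; they are not part of the course material.

```python
import re


def count_run(run: str) -> int:
    """Synthesized attribute Na/Nb/Nc for a run of identical letters.

    Base alternative (a single letter) yields 1; the recursive
    alternative yields the child's value plus 1.
    """
    if len(run) == 1:
        return 1
    return count_run(run[1:]) + 1


def accepts(s: str) -> bool:
    """Check <ls> ::= <as><bs><cs> together with its condition."""
    # Context-free part (assumption: a regex stands in for the BNF parse).
    match = re.fullmatch(r"(a+)(b+)(c+)", s)
    if match is None:
        return False
    na, nb, nc = (count_run(group) for group in match.groups())
    # Cond: Na(<as>) = Nb(<bs>) = Nc(<cs>)
    return na == nb == nc
```

With this sketch, `accepts("aabbcc")` holds while `accepts("aabcc")` does not: the second string still parses under the BNF, but the condition at <ls> fails.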

Some Comments
• Consider how the grammar works with a parse tree, allowing, say, "aabbcc", and disallowing "aabcc".
• Attributes are NOT program variables; can't have: Na(<as>) ← Na(<as>) + 1
• In rules/conditions, can only refer to attributes of the non-terminal on the left and the non-terminals in the current alternative. Can't look at "grandchildren" etc.
• Could have used N (instead of Na, Nb, Nc) as the name of all three attributes.

Example (revisited)
L = { a^n b^n c^n | n >= 1 }

<ls> ::= <as><bs><cs>     ExpNb(<bs>) ← Na(<as>)
                          ExpNc(<cs>) ← Na(<as>)
<as> ::= a                Na(<as>) ← 1
       | a<as>1           Na(<as>) ← Na(<as>1) + 1
<bs> ::= b                Cond: ExpNb(<bs>) = 1
       | b<bs>1           ExpNb(<bs>1) ← ExpNb(<bs>) − 1
<cs> ::= c                Cond: ExpNc(<cs>) = 1
       | c<cs>1           ExpNc(<cs>1) ← ExpNc(<cs>) − 1

Na: Synthesized attribute of <as>
ExpNb: Inherited attribute of <bs>
ExpNc: Inherited attribute of <cs>

Consider all strings over {a, b, c} (i.e., a's, b's, c's may be in any order) and require: no. of a's = no. of b's = no. of c's (using synth. attributes only, and using synth. + inh. attributes).
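The inherited-attribute version can be sketched the same way: the expected count is passed down the tree and decremented at each step, and the condition is checked only at the last letter. The helper names and the regex used for the context-free part are illustrative assumptions.

```python
import re


def check_run(run: str, expected: int) -> bool:
    """Inherited attribute ExpNb/ExpNc threaded down a run of letters.

    Single letter:       Cond: Exp = 1
    Letter + rest:       Exp(rest) <- Exp - 1
    """
    if len(run) == 1:
        return expected == 1
    return check_run(run[1:], expected - 1)


def accepts_inherited(s: str) -> bool:
    """<ls> ::= <as><bs><cs> with ExpNb, ExpNc <- Na(<as>)."""
    # Assumption: a regex stands in for the BNF parse.
    match = re.fullmatch(r"(a+)(b+)(c+)", s)
    if match is None:
        return False
    a_run, b_run, c_run = match.groups()
    na = len(a_run)  # Na(<as>), the only synthesized attribute
    return check_run(b_run, na) and check_run(c_run, na)
```

Note that a run that is too long drives the expected count to 0 or below, so the condition at the final letter fails just as it does for a run that is too short.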

Some Comments
• An AG is a BNF grammar plus a set of attributes, some synth., others inh. Each attribute is associated with a specific non-terminal.
• If S is a synth. attrib. of <N>, then for each alternative in <N>'s production, must have an eval. rule that will be used to compute S(<N>) whenever that alternative is used.
• If I is an inh. attrib. of <N>, then for each occurrence of <N> on the right side of any production, must have an eval. rule that will be used to compute I(<N>) if that alternative is used.
• Conditions are associated with individual alternatives of individual productions.
• If any condition in a tree evaluates to false, the tree collapses (the string is rejected).

Context-free conditions using AGs
CF conditions can also be expressed using AGs.

L = { a^n b^n | n >= 1 }
<ls> ::= <as><bs>          Cond: Na(<as>) = Nb(<bs>)

AGs can be used to capture precedence, i.e., specify how a string is to be parsed. Consider all strings over {a, b}. Given "abab", want to ensure it is parsed as a(b(a(b))), not as (ab)(ab) etc.:

<str> ::= a                N(<str>) ← 1
        | b                N(<str>) ← 1
        | <str>1<str>2     N(<str>) ← N(<str>1) + N(<str>2)
                           Cond: N(<str>1) = 1

Context-sensitive conditions in PLs
Main condition: Ids that are used must have been declared.

<prog>  ::= <block>
<block> ::= begin <decl seq> <stmt seq> end;

How to ensure that in <stmt seq> we only use objects declared in <decl seq>?

Using synthesized attributes:
<block> ::= begin <decl seq> <stmt seq> end;
    Cond: UsedIds(<stmt seq>) ⊆ DeclIds(<decl seq>)

Using inherited attributes:
<block> ::= begin <decl seq> <stmt seq> end;
    AllowedIds(<stmt seq>) ← DeclIds(<decl seq>)
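A small sketch contrasting the two approaches. Here a statement is modeled simply as the list of identifier names it uses; that representation and the function names are assumptions for illustration. The per-statement check in the inherited version is also an assumption, since the slide only shows how AllowedIds is passed down.

```python
def used_ids(stmts):
    """Synthesized: UsedIds(<stmt seq>) is the union over its statements."""
    result = set()
    for stmt in stmts:
        result |= set(stmt)
    return result


def check_synthesized(decl_ids, stmts):
    """Cond at <block>: UsedIds(<stmt seq>) is a subset of DeclIds(<decl seq>)."""
    return used_ids(stmts) <= set(decl_ids)


def check_inherited(decl_ids, stmts):
    """AllowedIds flows down; each statement checks its own identifiers."""
    allowed = set(decl_ids)  # AllowedIds(<stmt seq>) <- DeclIds(<decl seq>)
    return all(set(stmt) <= allowed for stmt in stmts)
```

Both checks accept and reject the same blocks; they differ in where the condition sits (once at <block>, or at each statement), which affects how precisely an error can be located.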

CS conditions in PLs (contd.)
Problem: Nested blocks (what are they?)
Solution: Use a sequence of sets, each containing the Ids declared in a (surrounding) block.

<block> ::= begin <decl seq> <stmt seq> end;
    Nest(<stmt seq>) ← append(Nest(<block>), Decs(<decl seq>))
<stmt seq> ::= <stmt>
    Nest(<stmt>) ← Nest(<stmt seq>)
  | <stmt><stmt seq>1
    Nest(<stmt>), Nest(<stmt seq>1) ← Nest(<stmt seq>)
<program> ::= <block>
    Nest(<block>) ← <>

CS conditions in PLs (contd.)
Problem: Different types of Ids
Solution: An element of Decs() is of the form: ("xy", int), or ("ab", bool), or ("PQ", proc)
(For procedures, also need info about the no./types of pars.)

Where do we check (that the CS conds. are satisfied)?

<assign> ::= <id> := <int exp>;
    Cond: lastType(Name(<id>), Nest(<assign>)) = int
  | <id> := <bool exp>;
    Cond: lastType(Name(<id>), Nest(<assign>)) = bool
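A sketch of how Nest and lastType might be realized. It assumes that each Decs value is a dictionary from name to type, that Nest is a list with the innermost block last, and that lastType returns the type from the innermost block declaring the name. These are assumptions; the slides leave the exact definition of lastType open.

```python
def last_type(name, nest):
    """Type of `name` in the innermost block of `nest` that declares it.

    Returns None when no surrounding block declares the name.
    """
    for decs in reversed(nest):  # innermost block is last in the sequence
        if name in decs:
            return decs[name]
    return None


def assign_ok(target, exp_type, nest):
    """Cond for <assign>: lastType(Name(<id>), Nest(<assign>)) = exp_type."""
    return last_type(target, nest) == exp_type


# Nest(<block>) <- <> at the program level, then one append per nested block.
outer = {"xy": "int", "ab": "bool"}
inner = {"ab": "int", "PQ": "proc"}
nest_outer = [] + [outer]
nest_inner = nest_outer + [inner]
```

In the inner block, `ab` refers to the inner `int` declaration, so `assign_ok("ab", "int", nest_inner)` holds, while in the outer block the same name is a `bool`.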

CS conditions in PLs (contd.)
<proc call> ::= call <id>();
    Cond: lastType(Name(<id>), Nest(<proc call>)) = proc
          parTypes(Name(<id>), Nest(<proc call>)) = <>
  | call <id>(<arg list>);
    Cond: lastType(Name(<id>), Nest(<proc call>)) = proc
          parTypes(Name(<id>), Nest(<proc call>)) = ...

Question: What about double declarations?

<ds> ::= <decl>
    Decs(<ds>) ← Decs(<decl>); Nest(<decl>) ← Nest(<ds>)
  | <decl> <ds>1
    Decs(<ds>) ← Decs(<decl>) ∪ Decs(<ds>1)
    Nest(<decl>), Nest(<ds>1) ← Nest(<ds>)
    Cond: Decs(<decl>) ∩ Decs(<ds>1) = ∅   // Not quite?
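One possible reading of the "Not quite?" remark, offered as an assumption rather than the intended answer: because elements of Decs() are (name, type) pairs, an empty intersection of the pair sets does not rule out the same name being declared with two different types. The sketch below compares names only; `names` and `combine` are hypothetical helpers.

```python
def names(decs):
    """Project a set of (name, type, ...) entries onto the names alone."""
    return {entry[0] for entry in decs}


def combine(decs_decl, decs_rest):
    """Decs(<ds>) <- Decs(<decl>) U Decs(<ds>1), rejecting repeated names."""
    clash = names(decs_decl) & names(decs_rest)
    if clash:
        raise ValueError(f"double declaration of: {sorted(clash)}")
    return decs_decl | decs_rest


first = {("x", "int")}
second = {("x", "bool")}
# The pair-level intersection is empty even though "x" is declared twice.
pairs_disjoint = (first & second) == set()
```

Here `pairs_disjoint` is true, so the condition as written on the slide would accept the two declarations, while `combine(first, second)` raises an error.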

**Note: Maybe better to look at e.g. in pp. 15/16 before grammar rules below**

Questions: How do elements get into Decs()? How do procs access the surr. block/call other procs?
(Ans: Need to pass Nest also to the <decl>s.)

<block> ::= begin <decl seq> <stmt seq> end;
    Nest(<stmt seq>) ← append(Nest(<block>), Decs(<decl seq>))
    Nest(<decl seq>) ← append(Nest(<block>), Decs(<decl seq>))
<decl> ::= int <id>;
    Decs(<decl>) ← {(Name(<id>), int)}
  | bool <id>;
    Decs(<decl>) ← {(Name(<id>), bool)}
  | proc <id>() <block>
    Decs(<decl>) ← {(Name(<id>), proc, <>)}
    Nest(<block>) ← Nest(<decl>)   // need to change?
  | proc <id>(<par list>) <block>
    Decs(<decl>) ← {(Name(<id>), proc, Partypes(<par list>))}
    Nest(<block>) ← append(Nest(<decl>), Decs(<par list>))

[Figure: parse tree of an example program. The outer <block> declares int X, Y; proc P(int U, bool Y) <block>; and proc Q() <block>. A nested <ds> contains two declarations of proc X().]

[Figure: the same parse tree, with nodes annotated by their Decs (D1 … D18) and Nest (N1 … N18) attribute values.]

Translational Semantics
AGs are also used to specify translations (code generation) (e.g.: YACC)
(Ref.: Pagan (Ch. 3.2); Kurtz (Ch. 7))

Basic idea:
<stmt> ::= <stmt>1; <stmt>2
    Code(<stmt>) ← append(Code(<stmt>1), Code(<stmt>2))
but the details are more complex ...
E.g.: How to ensure the same label is not used in Code(<stmt>1) and Code(<stmt>2)?

A simple imperative language (we will call it IMP; taken from Pagan):
<prog>   ::= <stmt>
<stmt>   ::= skip; | <assign> | <stmt>1; <stmt>2
           | if <be> then <stmt>1 else <stmt>2
           | while <be> do <stmt>1
<assign> ::= <id> := <ae>;
<ae>     ::= <id> | <int> | <ae>1 + <ae>2 | <ae>1 − <ae>2 | <ae>1 * <ae>2
<be>     ::= true | false | <ae>1 = <ae>2 | <ae>1 < <ae>2
           | ¬<be> | <be>1 ∧ <be>2 | <be>1 ∨ <be>2

Translational Semantics (contd.)
Key attributes:
• Code: a synth. attribute of <stmt>, <ae>, <be>: the seq. of assembly instructions corresponding to a particular <stmt>, <ae>, <be>
• Labin (inh.), Labout (synth.): keep track of the next available label
• Temp (inh.): keeps track of the next available memory loc. for temporary use

<prog> ::= <stmt>
    Code(<prog>) ← Code(<stmt>)
    Labin(<stmt>) ← 1
<stmt> ::= <stmt>1; <stmt>2
    Code(<stmt>) ← append(Code(<stmt>1), Code(<stmt>2))
    Labin(<stmt>1) ← Labin(<stmt>)
    Labin(<stmt>2) ← Labout(<stmt>1)
    Labout(<stmt>) ← Labout(<stmt>2)

Translational Semantics (contd.)
<stmt> ::= <assign>
    Code(<stmt>) ← Code(<assign>)
    Labout(<stmt>) ← Labin(<stmt>)   // why?
  | if <be> then <stmt>1 else <stmt>2
    Labin(<stmt>1) ← Labin(<stmt>) + 2   // why?
    Labin(<stmt>2) ← Labout(<stmt>1)
    Labout(<stmt>) ← Labout(<stmt>2)
    Code(<stmt>) ← append( Code(<be>),
                           ("BZ", Labin(<stmt>)),   // slight problem
                           Code(<stmt>1),
                           ("BR", label(Labin(<stmt>)+1)),
                           (label(Labin(<stmt>)) "No-Op"),
                           Code(<stmt>2),
                           (label(Labin(<stmt>)+1) "No-Op") )
  | while <be> do <stmt>1
    Labin(<stmt>1) ← ...
    Labout(<stmt>) ← ...
    Code(<stmt>) ← ...

BZ: "branch on zero"; BR: "unconditional branch"; "No-Op": "continue".
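A rough sketch of how Labin and Labout thread the "next available label" through the sequence, assignment, and if rules. Statements are modeled as tuples, assignments and boolean expressions are left opaque, and the instruction tuples ("ASSIGN", "EVAL", "LABEL") are placeholders invented for this sketch. The while case is omitted because its rules are left open on the slide.

```python
def gen_stmt(stmt, labin):
    """Return (Code(<stmt>), Labout(<stmt>)) given Labin(<stmt>)."""
    kind = stmt[0]
    if kind == "assign":
        # An assignment uses no labels, so Labout = Labin.
        return [("ASSIGN", stmt[1])], labin
    if kind == "seq":
        code1, labout1 = gen_stmt(stmt[1], labin)
        code2, labout2 = gen_stmt(stmt[2], labout1)  # Labin(s2) <- Labout(s1)
        return code1 + code2, labout2
    if kind == "if":
        _, be, s1, s2 = stmt
        # The if itself reserves two labels: labin (else part), labin + 1 (end).
        else_label, end_label = f"L{labin}", f"L{labin + 1}"
        code1, labout1 = gen_stmt(s1, labin + 2)
        code2, labout2 = gen_stmt(s2, labout1)
        code = (
            [("EVAL", be), ("BZ", else_label)]
            + code1
            + [("BR", end_label), ("LABEL", else_label, "No-Op")]
            + code2
            + [("LABEL", end_label, "No-Op")]
        )
        return code, labout2
    raise ValueError(f"unsupported statement kind: {kind}")


def defined_labels(code):
    """Labels that appear as definitions in the generated code."""
    return [instr[1] for instr in code if instr[0] == "LABEL"]


a = ("assign", "x := 1")
program = ("seq", ("if", "b1", ("if", "b2", a, a), a), ("if", "b3", a, a))
code, labout = gen_stmt(program, 1)  # Labin(<stmt>) <- 1 at <prog>
```

Because every if claims exactly two labels and hands the rest on, nested and sequenced ifs never share a label; that is also why Labin(<stmt>1) starts at Labin(<stmt>) + 2.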

Translational Semantics (contd.)
<assign> ::= <id> := <ae>;
    Code(<assign>) ← append(Code(<ae>), ("STO", Name(<id>)))
    Temp(<ae>) ← 1
<ae> ::= <int>
    Code(<ae>) ← <("LOAD" Value(<int>))>
  | <id>
    Code(<ae>) ← <("LOAD" Name(<id>))>
  | <ae>1 + <ae>2
    Code(<ae>) ← append( Code(<ae>1),
                         ("STO" temp(Temp(<ae>))),
                         Code(<ae>2),
                         ("ADD" temp(Temp(<ae>))) )
    Temp(<ae>1) ← Temp(<ae>)
    Temp(<ae>2) ← Temp(<ae>) + 1   // why?
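A sketch of the Temp attribute at work for addition. Expressions are modeled as an int, an identifier string, or a ("+", left, right) tuple. The tiny `run` interpreter assumes a single-accumulator machine (LOAD sets the accumulator, STO stores it, ADD adds a memory cell to it); that machine model and the temporary names "T1", "T2", ... are assumptions made only to exercise the generated code.

```python
def gen_ae(ae, temp):
    """Code(<ae>) given the inherited attribute Temp(<ae>)."""
    if isinstance(ae, (int, str)):
        return [("LOAD", ae)]
    _, left, right = ae  # only "+" is handled in this sketch
    return (
        gen_ae(left, temp)            # Temp(<ae>1) <- Temp(<ae>)
        + [("STO", f"T{temp}")]
        + gen_ae(right, temp + 1)     # Temp(<ae>2) <- Temp(<ae>) + 1
        + [("ADD", f"T{temp}")]
    )


def gen_assign(name, ae):
    """Code(<assign>) with Temp(<ae>) <- 1."""
    return gen_ae(ae, 1) + [("STO", name)]


def run(code, memory):
    """Execute the code on an assumed one-accumulator machine."""
    acc = 0
    for op, arg in code:
        if op == "LOAD":
            acc = arg if isinstance(arg, int) else memory[arg]
        elif op == "STO":
            memory[arg] = acc
        elif op == "ADD":
            acc += memory[arg]
    return memory


code = gen_assign("x", ("+", "a", ("+", "b", 2)))
result = run(code, {"a": 1, "b": 2})
```

The right operand gets Temp + 1 because T1 still holds the left operand's value while the right operand is being computed; reusing T1 there would overwrite it.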

Static Scope (Algol, Pascal, C, C++, ...)
Entities accessible in a procedure:
  Entities declared in that procedure
  + Entities declared in the "surrounding" procedure (less those with name conflicts)
  + Entities declared in the procedure surrounding the surrounding procedure
  + ...

Visualize: Each procedure is a box whose sides are one-way mirrors: you can look out of the box, but you can't look into a box.

Some languages are not quite static scope but are close.

Example
program
  A, B, C: integer;

  Q: procedure  // no parameters
  begin
    B := B+2; C := C+2;
    print A, B, C;
  end (Q);

  R: procedure
    A: integer;
  begin
    A := 3; C := 2; call Q;
    B := A+C; print A, B, C;
  end (R);

  S: procedure
    A, C: integer;
    Q: procedure  // nested in S
      C: integer;
    begin
      A := A+1; C := B+1;
      print A, B, C;
    end (S.Q);
  begin  // body of S
    B := 3; C := 1; A := 4;
    print A, B, C; call R;
    print A, B, C;
  end (S);

begin  // main body
  A := 1; B := 1; C := 1;
  call R; print A, B, C;
  call S; print A, B, C;
end (main);

[Figure: nested-box diagram of the scopes in this program]
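Python's nested functions are statically (lexically) scoped, so the example can be transliterated to see what static scoping prints. This is only a sketch: `nonlocal` stands in for access to a surrounding procedure's variable, each `print A, B, C` is recorded as a tuple instead of being printed, and the name `run_program` is made up for this illustration.

```python
def run_program():
    """Transliteration of the example program; returns the printed triples."""
    out = []

    def Q():
        nonlocal B, C          # main's B and C
        B = B + 2
        C = C + 2
        out.append((A, B, C))  # A is main's A

    def R():
        nonlocal B, C          # main's B and C
        A = 3                  # R's own A
        C = 2
        Q()                    # the Q visible from R is main's Q
        B = A + C
        out.append((A, B, C))

    def S():
        nonlocal B             # main's B; A and C below are S's own

        def Q():               # nested in S; never called in this program
            nonlocal A         # S's A
            A = A + 1
            C = B + 1          # S.Q's own C
            out.append((A, B, C))

        B = 3
        C = 1
        A = 4
        out.append((A, B, C))
        R()
        out.append((A, B, C))

    # main body
    A = 1
    B = 1
    C = 1
    R()
    out.append((A, B, C))
    S()
    out.append((A, B, C))
    return out
```

The point to notice is the call to R from inside S: R still sees main's Q, A, and C rather than S's, because what a procedure can see is fixed by where it is declared, not by who calls it.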
