Syntax and Semantics: Overview and Description of Programming Languages

CS 331, Principles of Programming Languages Chapter 2

Overview • What’s the difference between syntax and semantics? • Expressions • Grammars • Tree representations • abstract syntax trees • parse trees

How to describe a PL • Tutorials - SNOBOL is still the best example • Reference manuals - ADA • Formal definitions - to describe both syntax and semantics, which is hard • Pascal, ADA, PL/I

Syntax vs. Semantics • Syntax - what is a legal program? • Semantics - what does a (legal) program mean? Three major approaches: • axiomatic, i.e. a set of proof rules • denotational, i.e. mathematical description • operational, i.e. operations on a real or abstract machine

How to describe syntax? • By example? • Possibly ambiguous or incomplete • Used to describe shells, e.g. man pages • By use of a meta-language • Also possibly ambiguous or incomplete • But probably more precise • Possible to give some semantics in the same notation

Context-free Grammars • There are lots of varieties of grammars • regular, context-free, context-sensitive, and unrestricted • CFGs are constrained so that exactly one non-terminal can appear on the left-side of a production • but a non-terminal may appear on the left-side of more than one production

CFG Notation • CFG productions have exactly one non-terminal on the left side, and zero or more non-terminals or terminals on the right side • Usually, non-terminals are enclosed in <angle brackets> • Terminals (aka tokens) may be quoted for clarity

Backus-Naur Form (BNF) • BNF is a popular notation for CFGs • Example, from a simple subset of Pascal:
    <program> ::= <block> .
    <block> ::= <statement>
    <block> ::= begin <statements> end
    <statements> ::= <statement>
    <statements> ::= <statement> ; <statements>
    <statement> ::= <if> | <while> | <repeat> | ...
    <if> ::= if <expr> then <block>
    <if> ::= if <expr> then <block> else <block>

BNF Operators • Sequence <A> ::= <B> c • Alternation <A> ::= <B> | <C> • Optional <A> ::= <B> [<C>] • Zero or more <A> ::= <B>* • One or more <A> ::= <B>+ • note that <B>* is a shorthand for [<B>+]

Formal Grammars • Set of terminal symbols (or tokens) • Set of non-terminal symbols • A designated start symbol • A set of productions (or rules) that specify how symbols are to be combined to form legal strings • G=<T, N, S, R>
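The four components G=<T, N, S, R> map directly onto a small data structure. The sketch below is only an illustration: the `Grammar` type, the toy `<list>` grammar, and the `derives` helper are hypothetical names that do not come from the slides. The search relies on the assumption that no production has an empty right side, so a sentential form never shrinks.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    terminals: frozenset     # T
    nonterminals: frozenset  # N
    start: str               # S
    rules: dict              # R: non-terminal -> list of right-hand sides

# Hypothetical toy grammar: <list> ::= x | x , <list>
G = Grammar(
    terminals=frozenset({"x", ","}),
    nonterminals=frozenset({"<list>"}),
    start="<list>",
    rules={"<list>": [["x"], ["x", ",", "<list>"]]},
)

def derives(grammar, tokens, max_steps=10_000):
    """Return True if the token list can be derived from the start symbol.

    Breadth-first search over leftmost derivations. Forms longer than the
    input are pruned, which is only sound because no rule here has an
    empty right-hand side.
    """
    target = list(tokens)
    frontier = [[grammar.start]]
    seen = set()
    steps = 0
    while frontier and steps < max_steps:
        form = frontier.pop(0)
        steps += 1
        if form == target:
            return True
        for i, sym in enumerate(form):
            if sym in grammar.nonterminals:
                # Expand only the leftmost non-terminal.
                for rhs in grammar.rules[sym]:
                    new_form = form[:i] + rhs + form[i + 1:]
                    key = tuple(new_form)
                    if len(new_form) <= len(target) and key not in seen:
                        seen.add(key)
                        frontier.append(new_form)
                break
    return False

print(derives(G, ["x", ",", "x"]))  # True
print(derives(G, ["x", ","]))       # False
```

A string is legal exactly when some sequence of rule applications turns the start symbol into it, which is what the search checks by brute force.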

Expressions • Prefix, postfix, or infix • Issues related to operators • arity (unary, binary, ternary, or whatever) • associativity • exponentiation is right-associative, usually • other ops are usually left-associative • precedence • follows rules from arithmetic

Abstract Syntax Trees • Useful for indicating how an expression is evaluated • The expression 2-0-1 is represented by the tree in the figure (figure: a tree with nodes -, 1, -, 2, 0) • Or is it?
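The "Or is it?" question matters because the two possible trees for 2-0-1 evaluate differently. A minimal sketch, using hypothetical `Num` and `Sub` node classes that are not part of the slides:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Num:
    value: int

@dataclass
class Sub:
    left: "Expr"
    right: "Expr"

Expr = Union[Num, Sub]

def evaluate(node):
    """Evaluate a tree bottom-up: leaves are numbers, inner nodes subtract."""
    if isinstance(node, Num):
        return node.value
    return evaluate(node.left) - evaluate(node.right)

left_assoc = Sub(Sub(Num(2), Num(0)), Num(1))   # (2 - 0) - 1
right_assoc = Sub(Num(2), Sub(Num(0), Num(1)))  # 2 - (0 - 1)

print(evaluate(left_assoc))   # 1
print(evaluate(right_assoc))  # 3
```

Subtraction is conventionally left-associative, so the first tree is the usual reading.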

Examples of Prefix and Postfix • Prefix • LISP operators use prefix • Postfix • Postscript operators use postfix • The simple expression 8-(7*3) is represented as: 8 7 3 mul sub • Old HP calculators did, too - no parens keys
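Postfix needs no parentheses because a stack is enough to evaluate it: operands are pushed, and each operator pops its two arguments. The sketch below assumes integer operands and only the operator names `add`, `sub`, and `mul`, in the style of the Postscript example; `eval_postfix` is a hypothetical helper, not a Postscript interpreter.

```python
import operator

# Operator names follow the Postscript example above; only three are modeled.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def eval_postfix(source):
    """Evaluate a whitespace-separated postfix expression over integers."""
    stack = []
    for token in source.split():
        if token in OPS:
            right = stack.pop()  # the most recent operand is the right one
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]

print(eval_postfix("8 7 3 mul sub"))  # -13, the same as 8-(7*3)
```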

To run LISP in emacs • Invoke emacs • M-x lisp-interaction-mode • type control-j at the end of each line • Or using an inferior emacs lisp process, • M-x ielm

(+ 2 2)
4
(sqrt 9)
3.0
(setq b 6)
6
(setq a 2)
2
(setq c 3)
3
a
2
(- b)
-6
(+ (- b) (sqrt (- (* b b) (* (* 4 a) c))))
-2.5358983848622456

Prefix, Infix, Postfix • Given an abstract syntax tree, an expression can be represented in any of the three ways • Consider for example a+b*c/d • What does the abstract syntax tree look like? • What are the prefix and postfix expressions equivalent to the infix form given above?
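One way to explore these questions is to build the tree by hand and print it with three different traversals. The sketch assumes the usual precedence (`*` and `/` bind tighter than `+`) and left associativity for `*` and `/`; the `Var` and `BinOp` classes are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Var:
    name: str

@dataclass
class BinOp:
    op: str
    left: object
    right: object

def prefix(node):
    """Operator first, then both operands."""
    if isinstance(node, Var):
        return node.name
    return f"{node.op} {prefix(node.left)} {prefix(node.right)}"

def postfix(node):
    """Both operands first, then the operator."""
    if isinstance(node, Var):
        return node.name
    return f"{postfix(node.left)} {postfix(node.right)} {node.op}"

def infix(node):
    """Fully parenthesized infix, so the tree shape stays visible."""
    if isinstance(node, Var):
        return node.name
    return f"({infix(node.left)} {node.op} {infix(node.right)})"

# a+b*c/d read as a + ((b * c) / d)
tree = BinOp("+", Var("a"), BinOp("/", BinOp("*", Var("b"), Var("c")), Var("d")))

print(prefix(tree))   # + a / * b c d
print(postfix(tree))  # a b c * d / +
print(infix(tree))    # (a + ((b * c) / d))
```

Only the infix form needs parentheses (or precedence rules) to recover the tree; the prefix and postfix strings determine it uniquely once each operator's arity is known.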

Ambiguity • There may be many (equivalent) grammars for a language. • There may be more than one way to parse a string with respect to a grammar • A grammar is ambiguous if at least one string in the language can be parsed in more than one way.

Dangling-Else • Suppose a grammar has the production
    <stmt> ::= if <expr> then <stmt> | if <expr> then <stmt> else <stmt>
• How should we parse this statement?
    if E1 then if E2 then S1 else S2
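The ambiguity can be made concrete by enumerating every parse the production allows. The sketch below writes the two conditions as E1 and E2 and assumes that an expression is a single token and that any other token (such as S1) is a complete statement; `parse_stmt` and `all_parses` are hypothetical helpers.

```python
KEYWORDS = {"if", "then", "else"}

def parse_stmt(tokens, i):
    """Yield (tree, next_index) for every way to parse a <stmt> starting at i."""
    if i >= len(tokens):
        return
    if tokens[i] == "if":
        if i + 2 >= len(tokens) or tokens[i + 2] != "then":
            return
        cond = tokens[i + 1]
        for then_tree, j in parse_stmt(tokens, i + 3):
            # Alternative 1: if <expr> then <stmt>
            yield ("if", cond, then_tree), j
            # Alternative 2: if <expr> then <stmt> else <stmt>
            if j < len(tokens) and tokens[j] == "else":
                for else_tree, k in parse_stmt(tokens, j + 1):
                    yield ("if", cond, then_tree, else_tree), k
    elif tokens[i] not in KEYWORDS:
        # Assumption: any non-keyword token is a simple statement.
        yield tokens[i], i + 1

def all_parses(tokens):
    """Keep only the parses that consume the whole input."""
    return [tree for tree, end in parse_stmt(tokens, 0) if end == len(tokens)]

tokens = "if E1 then if E2 then S1 else S2".split()
for tree in all_parses(tokens):
    print(tree)
# ('if', 'E1', ('if', 'E2', 'S1'), 'S2')   else belongs to the outer if
# ('if', 'E1', ('if', 'E2', 'S1', 'S2'))   else belongs to the inner if
```

Most languages resolve the choice by attaching the else to the nearest unmatched if, which corresponds to the second tree.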

A different ambiguity
    header ::= <header> title (link? | script?) </header>
    title ::= <title> text </title>
    link ::= <link> text </link>
    script ::= <script> text </script>
• This grammar allows the title to be followed by an optional <link> or an optional <script>. • The grammar above is then ambiguous!

    header ::= <header> title (link? | script?) </header>
    title ::= <title> text </title>
    link ::= <link> text </link>
    script ::= <script> text </script>
• How do we parse this string?
    <header> <title> Some Title </title> </header>
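One way to see the problem is to ask which alternative of (link? | script?) accounts for what appears between the title and </header>. The sketch below is a deliberately tiny model of just that part of the grammar; `ALTERNATIVES` and `derivations` are hypothetical names.

```python
# Each alternative of (link? | script?) with the element sequences it can produce.
ALTERNATIVES = {
    "link?": [[], ["link"]],
    "script?": [[], ["script"]],
}

def derivations(tail):
    """Return the alternatives that can produce the elements after the title."""
    return [alt for alt, expansions in ALTERNATIVES.items() if tail in expansions]

print(derivations([]))        # ['link?', 'script?']
print(derivations(["link"]))  # ['link?']
```

With nothing after the title, both alternatives match, so the string above has two distinct parses.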
