
Syntax and Semantics


kateb

Presentation Transcript


  1. Syntax and Semantics • The Purpose of Syntax • Problem of Describing Syntax • Formal Methods of Describing Syntax • Derivations and Parse Trees • Sebesta Chapter 3

  2. What are Syntax and Semantics? • Syntax and semantics define a programming language (PL) • Syntax: the form or structure of program units (expressions, statements, declarations, etc.) • Semantics: the meaning of program units (expressions, statements, declarations, etc.) • Why do we need language definitions? • to design a language • to implement a compiler/interpreter • to write a program (use the language)

  3. Syntax Elements • A sentence is a string of characters over some alphabet • A language is a set of sentences • A lexeme is the lowest-level syntactic unit of a language, e.g., *, public, totalCount • A token is a category of lexemes, e.g., identifier
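The lexeme/token distinction can be illustrated with a small scanner. This is only a sketch: the token categories other than `identifier`, the keyword list, and the name `tokenize` are assumptions made for the example, not part of the slides.

```python
import re

# Hypothetical token categories; the slide itself names only "identifier".
TOKEN_SPEC = [
    ("keyword", r"\b(?:public|while|begin|end)\b"),
    ("identifier", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("int_literal", r"\d+"),
    ("operator", r"[*+\-/=]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (token, lexeme) pairs for the input text, skipping whitespace."""
    pairs = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        # lastgroup is the name of the category whose pattern matched.
        pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs
```

Calling `tokenize("public totalCount * 2")` pairs each lexeme with its category, so `totalCount` is reported as an `identifier` and `*` as an `operator`.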

  4. Describing Syntax • Recognizers • read an input string in the alphabet of the language (a sentence) and decide whether it belongs to the language • used in compilers • see Chapter 4 for details • Generators • produce sentences in a language • a sentence is syntactically correct if it can be generated by the generator

  5. Backus-Naur Form (BNF) • BNF is a meta-language • i.e., a language used to describe another language • invented by John Backus to describe ALGOL 58 • used by Peter Naur to describe ALGOL 60 • BNF is equivalent to context-free grammars • a BNF grammar is defined by • a set of terminal symbols • a set of nonterminal symbols • a set of rules • a start symbol (one of the nonterminal symbols)

  6. BNF Elements • terminal symbols • are the lexemes of the target PL • e.g., while, (, ) • nonterminal symbols • represent classes of syntactic structures • they act like syntactic variables • e.g., <statement> • rules • define how a nonterminal symbol can be developed into a sequence of nonterminal and terminal symbols • e.g., <while_stmt> → while ( <logic_expr> ) <stmt>

  7. BNF Rules • A rule has a left-hand side (LHS), then →, then a right-hand side (RHS) • There can be several rules for one LHS
       <stmt> → <assignment>
       <stmt> → begin <stmt_list> end
     • Syntactic lists are described using recursion
       <ident_list> → ident
       <ident_list> → ident , <ident_list>
     • A grammar is a finite nonempty set of rules

  8. EBNF • Extended BNF (EBNF) • is most often used • avoids having numerous rules for the same LHS • Extra meta-symbols (in addition to →)
     • [ … ]: enclosed symbols are optional (0 or 1 times)
       e.g., <if_stmt> → if ( <exp> ) <stmt> [ else <stmt> ]
     • { … }: enclosed symbols can be repeated (0 to n times)
       e.g., <ident_list> → ident { , ident }
     • … | …: choice of one of the symbol sequences separated by |
       e.g., <stmt> → <assignment> | begin <stmt_list> end
     • ( … ): groups enclosed symbols

  9. BNF vs. EBNF
     BNF
       <expr> → <expr> + <term>
       <expr> → <expr> - <term>
       <expr> → <term>
       <term> → <term> * <factor>
       <term> → <term> / <factor>
       <term> → <factor>
       <factor> → <exp> ** <factor>
       <factor> → <exp>
       <exp> → ( <expr> )
       <exp> → id
     EBNF
       <expr> → <term> { (+ | -) <term> }
       <term> → <factor> { (* | /) <factor> }
       <factor> → <exp> [ ** <factor> ]
       <exp> → ( <expr> ) | id
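The EBNF version maps quite directly onto a recursive-descent parser: each `{ … }` becomes a loop and the `[ … ]` becomes an `if`. The sketch below assumes the input is already split into tokens and that any Python identifier counts as `id`. The function name `parse` and the nested-tuple tree shape are choices made for this example.

```python
def parse(tokens):
    """Parse a token list with the EBNF expression grammar; return a nested tuple."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():
        # <expr> -> <term> { (+ | -) <term> }   (the loop groups to the left)
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():
        # <term> -> <factor> { (* | /) <factor> }
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():
        # <factor> -> <exp> [ ** <factor> ]   (the recursion groups to the right)
        node = exp()
        if peek() == "**":
            advance()
            node = ("**", node, factor())
        return node

    def exp():
        # <exp> -> ( <expr> ) | id
        if peek() == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        token = peek()
        if token is None or not token.isidentifier():
            raise SyntaxError(f"expected id, got {token!r}")
        return advance()

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

For example, `parse("a - b - c".split())` groups the subtraction to the left, while `parse("a ** b ** c".split())` groups the exponentiation to the right, matching the associativity expressed by the left-recursive and right-recursive BNF rules.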

  10. Augmented EBNF • another meta-symbol: = (equals) instead of → • meta-symbols for repetition: + means one or more times, * means zero or more times
        <ident> = <letter>+ ( <letter> | <digit> )*
      • rules can use iteration instead of recursion, e.g.,
        <stmt_list> → <stmt> | <stmt> ; <stmt_list>
      • can be formulated as
        <stmt_list> = <stmt> ( ; <stmt> )*
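The `+` and `*` meta-symbols have the same meaning in regular expressions, so the `<ident>` rule can be checked with one. This is a sketch that assumes `<letter>` means an ASCII letter and `<digit>` an ASCII digit. The name `is_ident` is hypothetical.

```python
import re

# <ident> = <letter>+ ( <letter> | <digit> )*
IDENT = re.compile(r"[A-Za-z]+[A-Za-z0-9]*")


def is_ident(text):
    """Return True if the whole string matches the <ident> rule."""
    return IDENT.fullmatch(text) is not None
```

With this definition, `is_ident("totalCount")` and `is_ident("x1")` hold, while `is_ident("1x")` does not because the rule requires at least one leading letter.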

  11. Context-Free Grammar • Context-Free Grammars (CFG) • defined by Noam Chomsky • meant to describe the syntax of natural languages • Context-Free Grammar G = (S, T, N, P) • S = start symbol • T = set of terminal symbols – lexemes and tokens • N = set of non-terminal symbols - abstractions • P = production rules – definition of a LHS abstraction using RHS • A sentence • a sequence of terminal symbols

  12. A Small Language in EBNF
        <program> → begin <stmt_list> end
        <stmt_list> → <stmt> | <stmt> ; <stmt_list>
        <stmt> → <var> = <expr>
        <expr> → <term> + <term> | <term> - <term>
        <term> → <var> | const
        <var> → a | b | c
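A recognizer for this small language can be sketched with one function per nonterminal. The code assumes that tokens are separated by whitespace and that `const` appears literally in the input. The names `recognize` and `VARS` are made up for the example.

```python
VARS = {"a", "b", "c"}


def recognize(sentence):
    """Return True if the whitespace-separated sentence belongs to the small language."""
    tokens = sentence.split()
    pos = 0

    def accept(symbol):
        # Consume the next token if it equals the given terminal symbol.
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == symbol:
            pos += 1
            return True
        return False

    def var():
        # <var> -> a | b | c
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in VARS:
            pos += 1
            return True
        return False

    def term():
        # <term> -> <var> | const
        return var() or accept("const")

    def expr():
        # <expr> -> <term> + <term> | <term> - <term>
        return term() and (accept("+") or accept("-")) and term()

    def stmt():
        # <stmt> -> <var> = <expr>
        return var() and accept("=") and expr()

    def stmt_list():
        # <stmt_list> -> <stmt> | <stmt> ; <stmt_list>
        if not stmt():
            return False
        while accept(";"):
            if not stmt():
                return False
        return True

    # <program> -> begin <stmt_list> end, with no tokens left over.
    return accept("begin") and stmt_list() and accept("end") and pos == len(tokens)
```

`recognize("begin a = b + const end")` accepts, whereas `recognize("begin a = b end")` rejects because `<expr>` in this grammar always needs an operator and two terms.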

  13. Derivation • A derivation is • a repeated application of rules • starting with the start symbol • each step substitutes a nonterminal (the LHS of a rule) with that rule's RHS • ending with a sentence (all terminal symbols) • Every string of symbols in the derivation is • a sentential form • A sentence is • a sentential form with only terminal symbols

  14. Derivation Types • A leftmost derivation • the leftmost nonterminal in each sentential form is expanded first • A rightmost derivation • the rightmost nonterminal is expanded first • A mixed derivation • an arbitrary nonterminal is expanded
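The difference between the derivation types can be made concrete by replaying rule choices mechanically. The sketch below encodes the small language from slide 12 as a dictionary. The names `GRAMMAR` and `derive`, and the idea of passing rule indices as a list, are assumptions of this example.

```python
# The small language from slide 12: one list of alternatives per nonterminal.
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
    "<var>": [["a"], ["b"], ["c"]],
}


def derive(start, choices, leftmost=True):
    """Apply one rule per entry in `choices` and return all sentential forms.

    Each entry is the index of the alternative used for the nonterminal that
    is expanded in that step (the leftmost or the rightmost one).
    """
    form = [start]
    forms = [" ".join(form)]
    for choice in choices:
        positions = [i for i, symbol in enumerate(form) if symbol in GRAMMAR]
        index = positions[0] if leftmost else positions[-1]
        form = form[:index] + GRAMMAR[form[index]][choice] + form[index + 1:]
        forms.append(" ".join(form))
    return forms


left = derive("<program>", [0, 0, 0, 0, 0, 0, 1, 1], leftmost=True)
right = derive("<program>", [0, 0, 0, 0, 1, 0, 1, 0], leftmost=False)
```

Both lists end with the same sentence, `begin a = b + const end`, but the intermediate sentential forms differ: the leftmost derivation replaces `<var>` by `a` before touching `<expr>`, while the rightmost derivation expands `<expr>` first.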

  15. Derivation Example
      Grammar:
        <program> → begin <stmt_list> end
        <stmt_list> → <stmt> | <stmt> ; <stmt_list>
        <stmt> → <var> = <expr>
        <expr> → <term> + <term> | <term> - <term>
        <term> → <var> | const
        <var> → a | b | c
      Derivation:
        <program> => begin <stmt_list> end
                  => begin <stmt> end
                  => begin <var> = <expr> end
                  => begin a = <expr> end
                  => begin a = <term> + <term> end
                  => begin a = <var> + <term> end
                  => begin a = b + <term> end
                  => begin a = b + const end

  16. Questions In the preceding slide: • Is the derivation a leftmost or a rightmost derivation? • State the "opposite" derivation. • I.e. if it is a leftmost derivation give rightmost one • or vice versa • What are the terminal symbols of the language, what are the nonterminal symbols and what is the start symbol? • Change a rule so that begin a = - b + const end is a legal sentence

  17. Parse Tree • A parse tree is a hierarchical representation of a derivation
        <program>
          begin
          <stmt_list>
            <stmt>
              <var>
                a
              =
              <expr>
                <term>
                  <var>
                    b
                +
                <term>
                  const
          end

  18. Simple Assignment Language
      EBNF Grammar
        <assign> → <id> = <expr>
        <expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
        <id> → a | b | c
      Parse tree of the sentence: a = b * (a + c)
        <assign>
          <id>
            a
          =
          <expr>
            <id>
              b
            *
            <expr>
              (
              <expr>
                <id>
                  a
                +
                <expr>
                  <id>
                    c
              )

  19. Ambiguous Grammars • A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees • e.g.,
        <assign> → <id> = <expr>
        <expr> → <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>
        <id> → a | b | c

  20. Two Distinct Parse Trees for a = b + c * d
      Add-first parse tree
        <assign>
          <id>
            a
          =
          <expr>
            <expr>
              <expr>
                <id>
                  b
              +
              <expr>
                <id>
                  c
            *
            <expr>
              <id>
                d
      Multiply-first parse tree
        <assign>
          <id>
            a
          =
          <expr>
            <expr>
              <id>
                b
            +
            <expr>
              <expr>
                <id>
                  c
              *
              <expr>
                <id>
                  d
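The ambiguity can also be checked mechanically by enumerating every parse tree that the rule `<expr> → <expr> + <expr> | <expr> * <expr> | <id>` allows for a token sequence. The sketch below leaves out the parenthesized alternative for brevity and treats any non-operator token as an `<id>`. The name `parse_trees` and the tuple representation are assumptions of the example.

```python
OPERATORS = ("+", "*")


def parse_trees(tokens):
    """Return all parse trees (as nested tuples) for an operator/operand token list."""
    if len(tokens) == 1:
        # A single token is a tree only if it is an operand.
        return [] if tokens[0] in OPERATORS else [tokens[0]]
    trees = []
    for i, token in enumerate(tokens):
        if token in OPERATORS:
            # Try this operator as the root, then combine every left subtree
            # with every right subtree.
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((token, left, right))
    return trees


trees = parse_trees("b + c * d".split())
```

For `b + c * d` the function finds exactly two trees, one with `+` at the root (multiply-first) and one with `*` at the root (add-first), which is what makes the grammar ambiguous.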

  21. An Unambiguous Expression Grammar • The same language can be defined with an unambiguous grammar!
        <assign> → <id> = <expr>
        <expr> → <expr> + <term> | <term>
        <term> → <term> * <factor> | <factor>
        <factor> → ( <expr> ) | <id>
        <id> → a | b | c

  22. Precedence Through Grammar • A grammar can enforce the precedence of operators • The parse tree shows how (lower levels are evaluated first) • e.g.,
        <expr> → <expr> + <term> | <term>
        <term> → <term> * const | const
      Parse tree of const + const * const:
        <expr>
          <expr>
            <term>
              const
          +
          <term>
            <term>
              const
            *
            const
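An evaluator that follows these two rules shows the effect of the grammar on precedence. This is a sketch with several assumptions: `const` is taken to be an integer literal, the input is well-formed and already split into tokens, and the left-recursive rules are implemented as loops, which preserves the left-to-right grouping. The name `evaluate` is hypothetical.

```python
def evaluate(tokens):
    """Evaluate a token list such as ['2', '+', '3', '*', '4'] using the grammar."""
    pos = 0

    def term():
        # <term> -> <term> * const | const
        nonlocal pos
        value = int(tokens[pos])
        pos += 1
        while pos < len(tokens) and tokens[pos] == "*":
            value *= int(tokens[pos + 1])
            pos += 2
        return value

    def expr():
        # <expr> -> <expr> + <term> | <term>
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            value += term()
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result
```

Because `*` is handled inside `term`, which sits below `expr` in the tree, `evaluate("2 + 3 * 4".split())` multiplies before it adds and yields 14 rather than 20.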
