530 likes | 702 Views
Einführung in die Programmierung Introduction to Programming Prof. Dr. Bertrand Meyer Michela Pedroni. Lecture 11: Describing the Syntax. Goals of today’s lecture. Learn about languages that describes other languages Read and understand the syntax description for Eiffel
E N D
Einführung in die ProgrammierungIntroduction to ProgrammingProf. Dr. Bertrand MeyerMichela Pedroni Lecture 11: Describing the Syntax
Goals of today’s lecture Learn about languages that describes other languages Read and understand the syntax description for Eiffel Write simple syntax descriptions
Syntax: Conditional A conditionalinstructionconsists, in order, of: An “Ifpart”, of the form ifcondition. A “Thenpart” of the form thencompound. Zero ormore “Else ifparts”, each of the formelseifcondition thencompound. Zero orone “Else part” of the form elsecompound Thekeywordend. Hereeachconditionis a booleanexpression, and eachcompoundis a compound instruction.
Why describe the syntax formally? We know syntax descriptions for human languages: • e.g. grammar for German, French, … • Expressed in natural language • Good enough for human use • Ambiguous, like human languages themselves
The power of human mind I cdnoultblveleetaht I cluodaulacityuesdnatnrdwaht I was rdgnieg. The PaomnnehalPweor of the HmuanMnidAoccdrnig to a rscheearch at CmabrigdeUinervtisy, is deosn'tmttaer in wahtoredr the ltteers in a wrod are, the olnyiprmoatnttihng is taht the frist and lsatltteer be in the rghitpclae. The rset can be a taotlmses and you can sitllraed it wouthit any porbelm. Tihs is bcuseae the huamnmniddeos not raederveylteter by istlef, but the wrod as a wlohe. PtretyAmzanig Huh?
Why describe the syntax formally? Programming languages need better descriptions: • More precise: must tell us unambiguously whether given program text is legal or not • Use formalism similar to mathematics • Can be fed into compilersfor automatic processing of programs
Compilers need strict formal definition of programming language Why describe the syntax formally? Compilersuse algorithms to Determine if input is correct program text (parser) Analyze program text to extract specimens for abstract syntax tree Translate program text to machine instructions
Formal Description of Syntax Use formal language to describe programming languages. Languages used to describe other languages are called Meta-Languages Meta-Language used to describe Eiffel: BNF-E(Variant of the Backus-Naur-Form, BNF)
History 1954 FORTRAN: First widely recognized programming language (developed by John Backus et Al.) 1958 ALGOL 58: Joint work of European and American groups 1960 ALGOL 60: Preparation showed a need for a formal description John Backus (member of ALGOL team) proposed Backus-Normal-Form (BNF) 1964: Donald Knuth suggested acknowledging Peter Naur for his contribution Backus-Naur-Form Many variants since then, e.g. graphical variant by Niklaus Wirth
Formal description of a language BNF lets us describe syntacticalproperties of a language Remember: Description of a programming language also includes lexical andsemanticproperties other tools
Formal Description of Syntax A language is a set of phrases A phrase is a finite sequence of tokens from a certain “vocabulary” Not every possible sequence is a phrase of the language A grammar specifies which sequences are phrases and which are not BNF is used to define agrammarfor a programming language
Example of phrases class age: INTEGER -- Age end PERSON feature class PERSON feature age: INTEGER -- Age end
Grammar • Definition • A Grammar for a language is a finite set of rules for • producing phrases, such that: • Any sequence obtained by a finite number of applications of rules from the grammar is a phrase of the language. • Any phrase of the language can be obtained by a finite number of applications of rules from the grammar.
Elements of a grammar: Terminals TerminalsTokens of the language that are not defined by a production of the grammar. E.g. keywords from Eiffel such as if,then,endor symbols such as the semicolon “;” or the assignment “:=”
Elements of a grammar: Nonterminals Nonterminals Names of syntactical structures or substructures used to build phrases.
Elements of a grammar: Productions ProductionsRules that define nonterminals of the grammar using a combination of terminals and (other) nonterminals
Conditional: else end An example production Instruction Condition Instruction then if Terminal Nonterminal Production
A B BNF Elements: Concatenation Graphical representation: BNF: A B Meaning:Afollowed by B
A BNF Elements: Optional Graphical representation: BNF: [A ] Meaning:Aor nothing
A B BNF Elements: Choice Graphical representation: BNF: A|B Meaning: either Aor B
A BNF Elements: Repetition Graphical representation: BNF: { A }* Meaning: sequence of zero or more A
A BNF Elements: Repetition, once or more Graphical representation: BNF: { A}+ Meaning: sequence of one or more A
A A A B B A A BNF elements: Overview Concatenation: AB Optional: [ A ] Choice: A | B Repetition (zero or more): { A }* Repetition (at least once): { A }+
float_number: digit: A simple example 0 - digit 1 digit . 2 3 Examplephrases: .76 -.76 1.56 12.845 -1.34 13.0 Translate it to written form! 4 5 6 7 8 9
digit A simple example written in BNF: | | | | | | | | | 0 1 2 3 4 5 6 7 8 9 [ ] { }* { }+ - = = float_number digit digit .
Conditional: else end Conditional ] [ condition instruction instruction then end else if BNF Elements Combined instruction = condition instruction then if written in BNF:
[ ] if Then_part_list Else_part end { }* Then_part elseif Then_part else Compound BNF: Conditional with elseif Conditional = = = = Then_part_list then Then_part Boolean_expression Compound Else_part
If_part Then_part Else_list end Boolean_expression if Compound then { }* [ ] Elseif_part Compound else Boolean_expression Then_part elseif Different Grammar for Conditional Conditional If_part = = = = = Then_part Else_list Elseif_part
Simple BNF example • SentenceI[don’t ] Verb Names Quant Names Name {andName}* Name tomatoes|shoes|books|football Verb like|hate Quant a lot| alittle Which of the following phrases are correct? I like tomatoes and football I don’t like tomatoes a little I hate football a lot I like shoes and tomatoes a little I don’t hate tomatoes, football and books a lot Rewrite the BNF to include the incorrect phrases = = = = =
Simple BNF example (Solution) Which of the following phrases are correct? -I like tomatoes and football I don’t like tomatoes a little I hate football a lot I like shoes and tomatoes a little -I don’t hate tomatoes, football and books a lot Rewrite the BNF to include the incorrect phrases SentenceI[ don’t ] Verb Names [Quant] Names Name [{, Name}* and Name] Nametomatoes|shoes| books|football Verblike|hate Quanta lot|alittle = = = = =
BNF-E Used in official description of Eiffel. Every Production is one of ConcatenationABC[D] ChoiceAB|C|D RepetitionA{Bterminal... }* (also with +) = = = = A[ B { terminalB }* ]
BNF-E Rules • Every nonterminal must appear on the left-hand side of exactlyoneproduction, called its defining production • Every production must be of onekind: Concatenation, Choice or Repetition
Conditional with elseif (BNF) [ ] Conditional if Then_part_list Else_part end { }* = = = = Then_part_list Then_part elseif Then_part Then_part Boolean_expression then Compound Else_part else Compound
BNF-E: Conditional [ ] Conditional if Then_part_list Else_part end { ... }+ = = = = Then_part_list Then_part elseif Then_part Boolean_expression then Compound Else_part else Compound
Recursive grammars Constructs may be nested Express this in BNF with recursive grammars Recursion: circular dependency of productions
Recursive grammars Conditionals can be nested within conditionals: Else_part else Compound = = = { …}* ; Compound Instruction | | | Instruction Conditional Loop Call ...
Recursive grammars Production name can be used in its own definition Definition of Then_part_listwith repetition: Recursive definition of Then_part_list: = = { …}* Then_part_list Then_part elseif [ ] Then_part_list Then_part elseif Then_part_list
if a = bthen a := a - 1 b := b + 1 elseif a > bthen a := a + 1 else b := b + 1 end Conditional = = = = Then_part_list Else_part Conditional end if [ ] { Then_part Then_part_list elseif }+ ... Then_part Boolean_expression Compound then Compound Else_part else
BNF for simple arithmetic expressions Assume Number is defined as a positive integer number, and Variable is a one-letter alphabetical word. Is this a recursive grammar? How would the same grammar in BNF-E look like? Which of the following phrases are correct? a a + b -a + b a * 7 + b 7 / (3 * 12) – 7 (3 * 7) (5 + a ( 7 * b)) Expr Term { Add_op Term }* Term Factor { Mult_op Factor}* Factor Number | Variable | Nested Nested ( Expr ) Add_op + | – Mult_op * | / = = = = = =
BNF for simple arithmetic expressions (Solution) Expr {Term Add_op …}+ Term { Factor Mult_op …}+ Factor Number | Variable | Nested Nested ( Expr ) Add_op + | – Mult_op * | / Is this a recursive grammar? Yes (see Nested) How would the same grammar in BNF-E look like? (see yellow box below) Which of the following phrases are correct? a a + b -a + b- a * 7 + b 7 / (3 * 12) – 7 (3 * 7) (5 + a ( 7 * b)) - = = = = = =
Guidelines for Grammars Keep productions short. easier to read better assessment of language size Conditional ifBoolean_expressionthenCompound { elseifBoolean_expressionthenCompound }* [ elseCompound ] end =
Guidelines for Grammars Treatlexical constructs like terminals Identifiers Constant values IdentifierLetter {Letter| Digit | "_”}* Integer_constant [-]{Digit}+ Floating_point [-]{Digit}*“." {Digit}+ Letter "A" | "B" | ... | "Z" | "a" | ... | "z" Digit "0" | "1" | ... | "9“ = = = = =
Guidelines for Grammars Use unambiguous productions. Applicable production can be found by looking at one lexical element at a time ConditionalifThen_part_list[ Else_part ] end Compound { Instruction }* InstructionConditional | Loop | Call | ... = = =
One more exercise • Define a recursive grammar in BNF-E for boolean expressions with the following description: • Simple expressions limited to the variable identifiers x, y, or z, that contain operations of unary not, and binary and, or, and impliestogether with parentheses. • Valid phrases would include • not x and not y • (x or y implies z) • y or (z) • (x)
Solution binary expressions • B_expr ::= With_par | Expr • With_par ::= “(” Expr “)” • Expr ::= Not_term | Bin_term | Variable • Bin_term ::= B_exprBin_opB_expr • Bin_op ::= “implies”| “or” | “and” • Not_term ::= “not” B_expr • Variable ::= “x” | “y” | “z” • Note: Here we are using • ::= instead of • “x” to denote terminals =
Writing a Parser One feature per Production Concatenation:Sequence of feature calls for Nonterminals, checks for Terminals Choice:Conditional with Compound per alternative Repetition:Loop
Writing a Parser: EiffelParse Automatic generation of abstract syntax tree for phrase Based on BNF-E One class per production Classes inherit from predefined classes AGGREGATE, CHOICE, REPETITION, TERMINAL Featureproductiondefines Production
Writing a Parser: Tools Yooc: Translates BNF-E to EiffelParse classes Yacc / Bison: Translates BNF to C parser
BNF alike syntax descriptions DTD: Description of XML documents • <!ELEMENT collection (recipe*)> • <!ELEMENT recipe (title, ingredient*,preparation)> • <!ELEMENT title (#PCDATA)> • <!ELEMENT ingredient (ingredient*,preparation)?> • <!ATTLIST ingredient name CDATA #REQUIRED • amount CDATA #IMPLIED • unit CDATA #IMPLIED> • <!ELEMENT preparation (step*)> • <!ELEMENT step (#PCDATA)>
BNF alike syntax descriptions Unix/Linux: Synopsis of commands SYNOPSIS man [-acdfFhkKtwW] [--path] [-msystem] [-pstring] [-Cconfig_file] [-Mpathlist] [-Ppager] [-Ssection_list] [section] name ...