
Compilation: Backus-Naur Form (BNF) and Context Free Grammars (CFGs)


Presentation Transcript


  1. Compilation: Backus-Naur Form (BNF) and Context Free Grammars (CFGs)

  2. We care about • Completeness of specification • Determination of legal expressions • Resolution of ambiguities • Avoiding small mistakes that cause major errors

  3. BNF/CFG • Backus-Naur Form • Context Free Grammars • Similar mechanisms for specifying legal syntax • Both focused on production rules


  6. Phases of compilation • Lexical analysis (tokenization): grouping characters into tokens (int, {, x, etc.) with a linear scan • Syntax analysis (parsing): grouping tokens into expressions or statements (int x = 10;) using recursion and a CFG • Semantic analysis (syntax-directed translation): type checking, implicit typecasting, checking array indices, checking that variables are declared before use, etc. This phase is necessary because most programming languages can't be completely captured by a CFG • Code generation (next): actually generating assembler code • Code optimization: looking for ways to make the assembler code faster
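The first phase, lexical analysis, can be sketched as a single left-to-right scan driven by regular expressions. This is only an illustration: the token classes (KEYWORD, NUMBER, IDENT, SYMBOL) and the `tokenize` function are assumptions made for this example, not something defined in the slides.

```python
import re

# Hypothetical token classes for a tiny C-like fragment (not from the slides).
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:int|return)\b"),
    ("NUMBER",  r"\d+"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("SYMBOL",  r"[{}()=;+\-*/]"),
    ("SKIP",    r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Linear scan: group characters into (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":          # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("int x = 10;"))
```

The parser then works on this token list rather than on raw characters, which is why the lexer can be a simple linear scan while the parser needs recursion.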

  7. Code generation: using the parse tree to generate assembler code • To prune the leaves of a parse tree means to eliminate all the leaves of a node, replacing the leaves (and their parent) with their intended “meaning” (a value or a chunk of code) • After code generation, the code optimizer runs
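A minimal sketch of pruning, assuming a simplified tree of nested tuples and a made-up stack-machine instruction set (PUSH, ADD, MUL) in place of real assembler. Each call to the hypothetical `generate` function replaces a node and its children with the chunk of code that is their meaning; `run` is a toy executor included only to check the generated code.

```python
# Tree nodes: ("num", value) for leaves, (operator, left, right) for interior nodes.
# PUSH/ADD/MUL are a hypothetical stack-machine instruction set, not real assembler.
OPCODES = {"+": "ADD", "*": "MUL"}

def generate(node):
    """Prune bottom-up: replace a subtree with the instructions that compute it."""
    if node[0] == "num":
        return [f"PUSH {node[1]}"]
    op, left, right = node
    # Code for both operands first, then the instruction that combines them.
    return generate(left) + generate(right) + [OPCODES[op]]

def run(code):
    """Toy executor for the instruction list, used to check the generated code."""
    stack = []
    for instruction in code:
        if instruction.startswith("PUSH"):
            stack.append(int(instruction.split()[1]))
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instruction == "ADD" else left * right)
    return stack.pop()

tree = ("+", ("num", 2), ("*", ("num", 3), ("num", 4)))   # 2 + 3 * 4
code = generate(tree)
print(code)        # ['PUSH 2', 'PUSH 3', 'PUSH 4', 'MUL', 'ADD']
print(run(code))   # 14
```

The multiplication is emitted before the addition because it sits deeper in the tree, so the shape of the parse tree alone decides the order of the generated instructions.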

  8. Parse trees are central to compiler theory • They allow us to identify the production rule corresponding to the chunk of code, and from there replace that chunk of code with a chunk of assembly.

  9. Ambiguity • A grammar that produces more than one parse tree (equivalently, more than one leftmost derivation) for the same sentence is ambiguous • E.g.: E -> E+E | E*E | (E) | -E | id • There are two leftmost derivations for id+id*id; find them both.
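One way to check the claim for id+id*id is to enumerate every parse tree by brute force. The sketch below is hard-wired to the grammar E -> E+E | E*E | (E) | -E | id; the function name `parse_trees` and the tuple encoding of trees are assumptions made for this example.

```python
from functools import lru_cache

def parse_trees(tokens):
    """Return every parse tree for tokens under E -> E+E | E*E | (E) | -E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(i, j):
        """All trees that derive exactly tokens[i:j]."""
        found = []
        if j - i == 1 and tokens[i] == "id":                         # E -> id
            found.append("id")
        if j - i >= 2 and tokens[i] == "-":                          # E -> -E
            found += [("-", sub) for sub in trees(i + 1, j)]
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":  # E -> (E)
            found += [("()", sub) for sub in trees(i + 1, j - 1)]
        for k in range(i + 1, j - 1):                                # E -> E+E | E*E
            if tokens[k] in ("+", "*"):
                found += [(tokens[k], left, right)
                          for left in trees(i, k)
                          for right in trees(k + 1, j)]
        return tuple(found)

    return trees(0, len(tokens))

for tree in parse_trees("id + id * id".split()):
    print(tree)
# ('+', 'id', ('*', 'id', 'id'))
# ('*', ('+', 'id', 'id'), 'id')
```

The first tree groups the multiplication first, the second groups the addition first. Both are legal under this grammar, which is exactly what makes it ambiguous.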

  10. Ambiguity continued • There are some context-free languages for which there is no unambiguous grammar (inherently ambiguous languages) • However, there are some rules of thumb we can use to help us deal with many ambiguous grammars • Ambiguity often arises if the right side of a production rule contains two or more occurrences of the same non-terminal • Ambiguity can cause problems if the grammar needs to observe rules of precedence or associativity

  11. E -> E+E | E*E | (E) | -E | id

  12. E -> E+T | T • T -> T*F | F • F -> -P | P • P -> (E) | id

  13. Derivations • A leftmost derivation is one in which only the leftmost non-terminal in a sentential form is replaced at each step • A rightmost derivation is one in which only the rightmost non-terminal in a sentential form is replaced at each step • Remember ambiguity? In an unambiguous grammar, every sentence has exactly one leftmost derivation and exactly one rightmost derivation • Generally pick leftmost or rightmost and stick with it • If they generate different parse trees, the grammar is ambiguous (if, not iff) • In general, deciding whether a grammar is ambiguous is undecidable

  14. Leftmost/Rightmost derivation example

  15. 2*3+(1+2)*4 • E -> E+T | T • T -> T*F | F • F -> -P | P • P -> (E) | id
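The derivations for 2*3+(1+2)*4 can be generated mechanically from a parse tree. The sketch below makes several assumptions that are not in the slides: each number is treated as an id token, the left-recursive rules E -> E+T and T -> T*F are parsed with loops (a common recursive-descent workaround that still builds the left-leaning tree), and the names `parse`, `derivation`, and `render` are invented for this example.

```python
import re

def parse(text):
    """Build a parse tree of (nonterminal, children) tuples; terminals are strings."""
    tokens = re.findall(r"\d+|[-+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def parse_E():                               # E -> E+T | T
        node = ("E", [parse_T()])
        while peek() == "+":
            node = ("E", [node, take(), parse_T()])
        return node

    def parse_T():                               # T -> T*F | F
        node = ("T", [parse_F()])
        while peek() == "*":
            node = ("T", [node, take(), parse_F()])
        return node

    def parse_F():                               # F -> -P | P
        if peek() == "-":
            return ("F", [take(), parse_P()])
        return ("F", [parse_P()])

    def parse_P():                               # P -> (E) | id
        if peek() == "(":
            children = [take(), parse_E()]
            if peek() != ")":
                raise SyntaxError("expected ')'")
            return ("P", children + [take()])
        if peek() is None or not peek().isdigit():
            raise SyntaxError(f"unexpected token {peek()!r}")
        return ("P", [take()])                   # a number stands in for id

    tree = parse_E()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

def render(form):
    """Show a sentential form: nonterminal names for subtrees, text for terminals."""
    return "".join(item[0] if isinstance(item, tuple) else item for item in form)

def derivation(tree, leftmost=True):
    """Expand the leftmost (or rightmost) nonterminal until only terminals remain."""
    form = [tree]
    steps = [render(form)]
    while any(isinstance(item, tuple) for item in form):
        positions = [i for i, item in enumerate(form) if isinstance(item, tuple)]
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = form[i][1]               # replace the node with its children
        steps.append(render(form))
    return steps

tree = parse("2*3+(1+2)*4")
print(" => ".join(derivation(tree, leftmost=True)))
print(" => ".join(derivation(tree, leftmost=False)))
```

Both derivations start from E and end at 2*3+(1+2)*4 and come from the same parse tree; they differ only in the order in which the non-terminals are expanded.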

  16. Compilation vs interpretation • An interpreted language is a programming language for which most of its implementations execute instructions directly, without previously compiling a program into machine-language instructions • The interpreter executes the program directly, translating each statement into a sequence of one or more subroutines already compiled into machine code
