Description of programming languages

anja


  1. Description of programming languages: using regular expressions and context free grammars

  2. Introduction
     • Programming languages must be described in an exact language
       • There should be no room for discussion about whether a language element is legal or not
     • I will introduce 2 description languages
       • Regular expressions
         • Used to describe the “small” parts of a programming language
         • Identifiers, numbers, etc.
       • Context free grammars
         • Used to describe the “bigger” parts of a programming language
         • Expressions, statements, classes, etc.

  3. Regular expressions defined
     • We need an alphabet called Σ
       • Example alphabets: ASCII, Unicode
     • Regular expressions denote sets of strings
       • Ø (the empty set) is a regular expression
       • { ε } is a regular expression
         • ε means the empty string
       • All sets {a} where a is in the alphabet Σ are regular expressions
     • From two regular expressions R and S we can generate more regular expressions
       • R | S: the union R ∪ S
       • RS: concatenations of a string from R followed by a string from S
       • R*: zero or more concatenated strings from R; if R is {a} then R* is {ε, a, aa, aaa, …}
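The three operations above (union, concatenation, star) map directly onto the regex syntax of most libraries. A minimal sketch using Java's `Pattern.matches`; the class name `RegexOperations` and the helper `inLanguage` are made up for this illustration.

```java
import java.util.regex.Pattern;

public class RegexOperations {
    // True if the whole string s belongs to the set described by regex.
    static boolean inLanguage(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        System.out.println(inLanguage("a|b", "b"));   // union: true
        System.out.println(inLanguage("ab", "ab"));   // concatenation: true
        System.out.println(inLanguage("a*", ""));     // star includes the empty string: true
        System.out.println(inLanguage("a*", "aaa"));  // true
        System.out.println(inLanguage("a*", "ab"));   // false, b is not in {a}*
    }
}
```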

  4. Regular expressions examples
     • Set of non-negative integers (leading zeros allowed)
       • (0|1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)*
     • Set of words in English
       • (a|b|…|z)(a|b|…|z)*
       • Not exactly English …
       • bbz is in the set, but is not an English word

  5. Regular expressions, shorthand notation
     • R+ means RR*
       • 1 or more occurrences
     • R? means ε | R
       • 0 or 1 occurrence
     • [a-z] means a|b|c|…|z
     • [a-zA-Z] means [a-z] | [A-Z]
     • Examples
       • Integer: -?[0-9]+
       • Identifier: [a-zA-Z][a-zA-Z0-9]*
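The two example patterns from this slide can be tried out with `java.util.regex`. This is only a sketch; the class name `ShorthandDemo` and its helper methods are invented for the illustration.

```java
import java.util.regex.Pattern;

public class ShorthandDemo {
    // The two example patterns from the slide, compiled once.
    static final Pattern INTEGER = Pattern.compile("-?[0-9]+");
    static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z][a-zA-Z0-9]*");

    // matches() requires the whole input to match, not just a part of it.
    static boolean isInteger(String s) {
        return INTEGER.matcher(s).matches();
    }

    static boolean isIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isInteger("-42"));    // true
        System.out.println(isInteger("4-2"));    // false
        System.out.println(isIdentifier("x1"));  // true
        System.out.println(isIdentifier("1x"));  // false
    }
}
```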

  6. Regular expressions in Java
     • Java API which uses regular expressions
     • Class String
       • String[] split(String regex)
       • "Java is my favorite language".split(" ")
       • produces an array {Java, is, my, favorite, language}
       • " " is a very simple regular expression
     • Package java.util.regex
       • Class Pattern
       • Class Matcher
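A short sketch of the API named on this slide: `String.split` for the sentence example, and `Pattern` together with `Matcher` to find every run of digits in a text. The class name `RegexApiDemo`, the method names, and the digit example are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexApiDemo {
    // Split a sentence at single spaces, as in the slide's example.
    static String[] words(String sentence) {
        return sentence.split(" ");
    }

    // Collect every run of digits in the text using Pattern and Matcher.
    static List<String> numbers(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile("[0-9]+").matcher(text);
        while (m.find()) {          // find() moves on to the next match, if any
            found.add(m.group());   // group() is the text of the current match
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [Java, is, my, favorite, language]
        System.out.println(Arrays.toString(words("Java is my favorite language")));
        // prints [12, 345]
        System.out.println(numbers("x = 12 + 345"));
    }
}
```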

  7. What regular expressions can’t do
     • Regular expressions can describe only simple languages
     • Regular expressions have no “memory”
     • Cannot describe nested parenthesis structures
       • (((a + b) + c) + d)
       • if (…) { if (…) … else …} else …
     • We need something stronger!
       • Context free grammars
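The missing “memory” can be made concrete: checking that parentheses are balanced needs a counter that may grow without bound, which a regular expression in the sense defined earlier does not have. A minimal sketch of such a counter; the class `ParenCheck` is made up for this illustration.

```java
public class ParenCheck {
    // True if every '(' has a matching ')' and no ')' comes too early.
    static boolean balanced(String s) {
        int depth = 0;                 // the unbounded "memory"
        for (char c : s.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false;      // a ')' without an open '('
                }
            }
        }
        return depth == 0;             // everything opened was closed
    }

    public static void main(String[] args) {
        System.out.println(balanced("(((a + b) + c) + d)"));  // true
        System.out.println(balanced("((a + b)"));             // false
    }
}
```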

  8. Context free grammars defined
     • A context free grammar consists of 4 parts
       • V is an alphabet
       • Σ is a set of terminals, Σ ⊂ V
         • The elements of the set V − Σ are called non-terminals
       • R is a set of production rules, R ⊆ (V − Σ) × V*
       • S is the start symbol, S ∈ V − Σ

  9. Context free grammars examples
     • Example with terminals a and b
       • Alphabet {a, b, A}
       • Terminals {a, b}
       • Non-terminals {A}
       • Productions
         • {A → Aa, A → Ab, A → a, A → b}
     • Some derivations
       • A → Aa → Aaa → Abaa → abaa
       • A → Ab → ab
       • A → Ab → bb
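A derivation is just a sequence of rewriting steps, each replacing the non-terminal A by the right-hand side of one production. The sketch below replays the derivations from this slide; the class `DerivationDemo` and its methods are hypothetical names chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DerivationDemo {
    // Right-hand sides of the four productions A -> Aa, A -> Ab, A -> a, A -> b.
    private static final List<String> RIGHT_HAND_SIDES = Arrays.asList("Aa", "Ab", "a", "b");

    // Apply one production by replacing the first 'A' in the current string.
    static String step(String form, String rhs) {
        if (!RIGHT_HAND_SIDES.contains(rhs)) {
            throw new IllegalArgumentException("no production A -> " + rhs);
        }
        int i = form.indexOf('A');
        if (i < 0) {
            throw new IllegalArgumentException("no non-terminal left in " + form);
        }
        return form.substring(0, i) + rhs + form.substring(i + 1);
    }

    // Start from A and apply the given right-hand sides in order,
    // recording every intermediate string.
    static List<String> derive(String... rhsSequence) {
        List<String> forms = new ArrayList<>();
        String form = "A";
        forms.add(form);
        for (String rhs : rhsSequence) {
            form = step(form, rhs);
            forms.add(form);
        }
        return forms;
    }

    public static void main(String[] args) {
        // prints A -> Aa -> Aaa -> Abaa -> abaa
        System.out.println(String.join(" -> ", derive("Aa", "Aa", "Ab", "a")));
    }
}
```

Every string this grammar can produce is a non-empty sequence of a's and b's, so this particular language could also be described by the regular expression (a|b)+.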

  10. Example: Boolean expressions
     • We only state the productions explicitly
       • Terminals and non-terminals can be inferred by looking at the productions
     • Convention
       • Capital letters: non-terminals
       • Non-capital letters: terminals
     • Boolean expressions
       • E → true
       • E → false
       • E → E && E
       • E → E || E
       • E → (E)
       • E → !E
     • Derivations
       • E → E && E → E && (E) → E && (E || E) →* true && (false || true)
     • Sometimes pictured as a (parse) tree
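A grammar like this can be turned into a small parser. The grammar as written is ambiguous and left-recursive, so the sketch below does not implement it directly: it assumes the usual precedence (! binds tighter than &&, which binds tighter than ||) and uses an equivalent rewritten grammar, shown in the comments. The class `BoolParser` and all method names are made up for this illustration.

```java
public class BoolParser {
    private final String src;
    private int pos = 0;

    private BoolParser(String src) {
        // Whitespace carries no meaning in this sketch, so it is removed up front.
        this.src = src.replaceAll("\\s+", "");
    }

    // Parse the whole text and return the value of the expression.
    static boolean evaluate(String text) {
        BoolParser p = new BoolParser(text);
        boolean value = p.or();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + p.pos);
        }
        return value;
    }

    // Or -> And ('||' And)*
    private boolean or() {
        boolean value = and();
        while (accept("||")) {
            boolean right = and();   // always parse the operand, then combine
            value = value || right;
        }
        return value;
    }

    // And -> Not ('&&' Not)*
    private boolean and() {
        boolean value = not();
        while (accept("&&")) {
            boolean right = not();
            value = value && right;
        }
        return value;
    }

    // Not -> '!' Not | Primary
    private boolean not() {
        if (accept("!")) {
            return !not();
        }
        return primary();
    }

    // Primary -> 'true' | 'false' | '(' Or ')'
    private boolean primary() {
        if (accept("true")) return true;
        if (accept("false")) return false;
        if (accept("(")) {
            boolean value = or();
            if (!accept(")")) {
                throw new IllegalArgumentException("expected ')' at position " + pos);
            }
            return value;
        }
        throw new IllegalArgumentException("unexpected input at position " + pos);
    }

    // Consume the token if it comes next in the input.
    private boolean accept(String token) {
        if (src.startsWith(token, pos)) {
            pos += token.length();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("true && (false || true)"));  // true
    }
}
```

Each method corresponds to one non-terminal of the rewritten grammar, and the chain of method calls made while parsing mirrors the parse tree.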

  11. What context free grammars can’t do
     • Context free grammars cannot be used to check that a variable is declared before it is used
     • Still less can they check a variable’s type
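Checks like declared-before-use are therefore done after parsing, typically with a table of the names seen so far. A minimal sketch, assuming a made-up mini format in which each statement is either "declare <name>" or "use <name>"; the class `DeclarationCheck` is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DeclarationCheck {
    // Returns the names that are used before (or without) being declared.
    static List<String> undeclaredUses(List<String> statements) {
        Set<String> declared = new HashSet<>();   // a very small "symbol table"
        List<String> errors = new ArrayList<>();
        for (String statement : statements) {
            String[] parts = statement.trim().split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("malformed statement: " + statement);
            }
            String keyword = parts[0];
            String name = parts[1];
            if (keyword.equals("declare")) {
                declared.add(name);
            } else if (keyword.equals("use")) {
                if (!declared.contains(name)) {
                    errors.add(name);
                }
            } else {
                throw new IllegalArgumentException("unknown keyword: " + keyword);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> program = new ArrayList<>();
        program.add("declare x");
        program.add("use x");
        program.add("use y");      // y is not declared yet
        program.add("declare y");
        System.out.println(undeclaredUses(program));  // prints [y]
    }
}
```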

  12. The phases of a compiler
     • Lexical analysis (scanning)
       • Using regular expressions
     • Syntax analysis (parsing)
       • Using context free grammars
     • Semantic analysis
       • Using a symbol table
     • Code generation
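The first phase can be sketched with the regular expressions from the earlier slides: a scanner that cuts the input into tokens. This is only an illustration; the class `TinyScanner`, the token classes INT, ID and OP, and the output format are assumptions, and the minus sign is treated here as an operator rather than as part of an integer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TinyScanner {
    // One named group per token class; whitespace is matched but not reported.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<INT>[0-9]+)|(?<ID>[a-zA-Z][a-zA-Z0-9]*)|(?<OP>[-+*/=()])|(?<WS>\\s+)");

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            // lookingAt() only accepts a match that starts exactly at pos.
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("illegal character at position " + pos);
            }
            if (m.group("INT") != null) {
                tokens.add("INT(" + m.group() + ")");
            } else if (m.group("ID") != null) {
                tokens.add("ID(" + m.group() + ")");
            } else if (m.group("OP") != null) {
                tokens.add("OP(" + m.group() + ")");
            }
            pos = m.end();   // continue right after the token just read
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [ID(x1), OP(=), INT(12), OP(+), ID(y)]
        System.out.println(scan("x1 = 12 + y"));
    }
}
```

The token list would then be handed to the parser, which checks it against the context free grammar.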

  13. References
     • Wikipedia
       • Regular expression, http://en.wikipedia.org/wiki/Regular_expression
       • Context-free grammar, http://en.wikipedia.org/wiki/Context-free_grammar
     • Friedl, Mastering Regular Expressions, 2nd edition, O’Reilly 2002
       • An entire book (460 pages) devoted to regular expressions
     • J2SE 5.0 API specification
       • package java.util.regex
     • Scott A. Hommel, Regular Expressions, The Java Tutorial
       • http://java.sun.com/docs/books/tutorial/extra/regex/index.html
     • Lewis & Papadimitriou, Elements of the Theory of Computation, Pearson 1997
       • Introduction to regular expressions and context free grammars (and a lot more)
     • Aho, Sethi & Ullman, Compilers: Principles, Techniques and Tools, Addison Wesley 1986
       • A famous book on compilers
       • Referred to as “The Dragon Book”
