Description of programming languages
Using regular expressions and context free grammars

Introduction
• Programming languages must be described in an exact language
  • There must be no room for discussion about whether a language element is legal or not
• I will introduce 2 description languages
  • Regular expressions
    • Used to describe the "small" parts of a programming language
    • Identifiers, numbers, etc.
  • Context free grammars
    • Used to describe the "bigger" parts of a programming language
    • Expressions, statements, classes, etc.

Regular expressions defined
• We need an alphabet called Σ
  • Example alphabets: ASCII, UNICODE
• Regular expressions describe sets of strings
  • Ø (the empty set) is a regular expression
  • { ε } is a regular expression
    • ε means the empty string
  • All sets {a} where a is in the alphabet Σ are regular expressions
• From two regular expressions R and S we can generate more regular expressions
  • R | S: the union R ∪ S
  • RS: concatenations of a string from R followed by a string from S
  • R*: zero or more concatenated strings from R; if R is {a} then R* is {ε, a, aa, aaa, … }
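
The three operators can be tried out directly with Java's `java.util.regex` package. This is only a small sketch: the class name `BasicOperators` and the helper `inLanguage` are made up for illustration, and `Pattern.matches` is used because it tests the whole string, which corresponds to asking whether the string is in the set.

```java
import java.util.regex.Pattern;

public class BasicOperators {
    // Hypothetical helper: true when the whole string s belongs to the
    // set of strings described by the regular expression regex.
    static boolean inLanguage(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        System.out.println(inLanguage("a|b", "b"));   // union: true
        System.out.println(inLanguage("ab", "ab"));   // concatenation: true
        System.out.println(inLanguage("a*", ""));     // star includes the empty string: true
        System.out.println(inLanguage("a*", "aaa"));  // true
        System.out.println(inLanguage("a*", "ab"));   // b is not in {a}*: false
    }
}
```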

Regular expressions examples
• Set of non-negative integers (leading zeros allowed)
  • (0|1|2|3|4|5|6|7|8|9) (0|1|2|3|4|5|6|7|8|9)*
• Set of words in English
  • (a|b|…|z)(a|b|…|z)*
  • Not exactly English …
    • bbz is in the set, but is not an English word

Regular expressions, short hand notation
• R+ means R R*
  • 1 or more occurrences
• R? means ε | R
  • 0 or 1 occurrence
• [a-z] means a|b|c|…|z
• [a-zA-Z] means [a-z] | [A-Z]
• Examples
  • Integer: -?[0-9]+
  • Identifier: [a-zA-Z][a-zA-Z0-9]*
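
The two example patterns can be checked in Java more or less as written. The sketch below assumes a made-up class `ShorthandExamples` with two helper methods; `matches()` is used so that the whole input has to fit the pattern, not just a part of it.

```java
import java.util.regex.Pattern;

public class ShorthandExamples {
    // The two patterns from the examples, compiled once.
    static final Pattern INTEGER = Pattern.compile("-?[0-9]+");
    static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z][a-zA-Z0-9]*");

    // True when the whole string is an integer such as 7, -42 or 007.
    static boolean isInteger(String s) {
        return INTEGER.matcher(s).matches();
    }

    // True when the whole string is a letter followed by letters or digits.
    static boolean isIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isInteger("-42"));    // true
        System.out.println(isInteger("4a"));     // false
        System.out.println(isIdentifier("x1"));  // true
        System.out.println(isIdentifier("1x"));  // false
    }
}
```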

Regular expressions in Java
• Java API which uses regular expressions
  • Class String
    • String[] split(String regex)
    • "Java is my favorite language".split(" ")
      • produces an array {"Java", "is", "my", "favorite", "language"}
      • " " is a very simple regular expression
  • Package java.util.regex
    • Class Pattern
    • Class Matcher
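
A short sketch of how these classes might be used together. The class name `JavaRegexDemo` and the helper methods are invented for illustration; `String.split`, `Pattern.compile`, `Matcher.find` and `Matcher.group` are the standard API calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaRegexDemo {
    // Splits a sentence at every single space, as in the split example.
    static String[] words(String sentence) {
        return sentence.split(" ");
    }

    // Collects every substring of text that matches the integer pattern.
    static List<String> findIntegers(String text) {
        Pattern pattern = Pattern.compile("-?[0-9]+");
        Matcher matcher = pattern.matcher(text);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {          // find() moves on to the next match
            found.add(matcher.group());   // group() is the text just matched
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("Java is my favorite language")));
        // prints [Java, is, my, favorite, language]
        System.out.println(findIntegers("x = 12, y = -7"));
        // prints [12, -7]
    }
}
```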

What regular expressions can't do
• Regular expressions can describe only simple languages
• Regular expressions have no "memory"
  • Cannot describe nested parenthesis structures
    • (((a + b) + c) + d)
    • if (…) { if (…) … else …} else …
• We need something stronger!
  • Context free grammars
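
To see what kind of "memory" is missing, here is a minimal sketch of a parenthesis checker. It needs a counter that can grow without bound, which is exactly what a regular expression cannot keep track of. The class and method names are made up for illustration, and everything other than `(` and `)` is simply ignored.

```java
public class BalancedParens {
    // True when every ')' closes an earlier '(' and nothing is left open.
    static boolean isBalanced(String s) {
        int depth = 0;                      // the "memory": number of open parentheses
        for (char c : s.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false;           // a ')' without a matching '('
                }
            }
        }
        return depth == 0;
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("(((a + b) + c) + d)"));  // true
        System.out.println(isBalanced("((a + b)"));             // false
        System.out.println(isBalanced(")("));                   // false
    }
}
```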

Context free grammars defined
• A context free grammar consists of 4 parts
  • V is an alphabet
  • Σ is a set of terminals, Σ ⊂ V
    • The elements of the set V − Σ are called non-terminals
  • R is a set of production rules, R ⊆ (V − Σ) × V*
  • S is the start symbol, S ∈ V − Σ

Context free grammars examples
• Example: non-empty strings of a's and b's
  • Alphabet { a, b, A }
  • Terminals { a, b }
  • Non-terminals { A }
  • Productions
    • { A → Aa, A → Ab, A → a, A → b }
• Some derivations
  • A → Aa → Aaa → Abaa → abaa
  • A → Ab → ab
  • A → Ab → bb
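
The derivations follow a simple pattern: the productions A → Aa and A → Ab add one letter at a time from the right, and A → a or A → b finishes the string. The sketch below rebuilds such a derivation for any non-empty string of a's and b's. The class `DerivationDemo` and the method `derive` are invented for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

public class DerivationDemo {
    // Returns the steps of a derivation of w in the grammar
    // A → Aa, A → Ab, A → a, A → b.
    static List<String> derive(String w) {
        if (!w.matches("[ab]+")) {
            throw new IllegalArgumentException("Not derivable from A: " + w);
        }
        List<String> steps = new ArrayList<>();
        steps.add("A");
        // A → Aa or A → Ab: produce the letters from the last one towards the second.
        for (int i = w.length() - 1; i >= 1; i--) {
            steps.add("A" + w.substring(i));
        }
        // A → a or A → b: replace the remaining A with the first letter.
        steps.add(w);
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" → ", derive("abaa")));
        // prints A → Aa → Aaa → Abaa → abaa
    }
}
```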

Example: Boolean expressions
• We only state the productions explicitly
  • Terminals and non-terminals can be inferred by looking at the productions
• Convention
  • Capital letters: Non-terminals
  • Non-capital letters: Terminals
• Boolean expressions
  • E → true
  • E → false
  • E → E && E
  • E → E || E
  • E → (E)
  • E → !E
• Derivations
  • E → E && E → E && (E) → E && (E || E) →* true && (false || true)
  • Sometimes pictured as a (parse) tree.
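
A grammar like this can be turned into a small parser. The sketch below is one possible way to do it, not the only one. The grammar as stated is ambiguous and left-recursive, so the code uses a rewritten version that accepts the same expressions, under the assumption (borrowed from Java) that ! binds tighter than &&, which binds tighter than ||. The class name `BoolParser` is made up, and spaces are assumed to carry no meaning.

```java
public class BoolParser {
    private final String src;
    private int pos = 0;

    private BoolParser(String src) {
        this.src = src.replace(" ", "");   // assumption: spaces are insignificant
    }

    // Parses and evaluates a Boolean expression; rejects anything else.
    static boolean evaluate(String text) {
        BoolParser p = new BoolParser(text);
        boolean value = p.parseOr();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return value;
    }

    // Consumes token if it is next in the input.
    private boolean accept(String token) {
        if (src.startsWith(token, pos)) {
            pos += token.length();
            return true;
        }
        return false;
    }

    // Or → And ( "||" And )*
    private boolean parseOr() {
        boolean value = parseAnd();
        while (accept("||")) {
            boolean right = parseAnd();    // always parse the right operand
            value = value || right;
        }
        return value;
    }

    // And → Not ( "&&" Not )*
    private boolean parseAnd() {
        boolean value = parseNot();
        while (accept("&&")) {
            boolean right = parseNot();
            value = value && right;
        }
        return value;
    }

    // Not → "!" Not | "(" Or ")" | "true" | "false"
    private boolean parseNot() {
        if (accept("!")) return !parseNot();
        if (accept("(")) {
            boolean value = parseOr();
            if (!accept(")")) throw new IllegalArgumentException("Missing ')'");
            return value;
        }
        if (accept("true")) return true;
        if (accept("false")) return false;
        throw new IllegalArgumentException("Unexpected input at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("true && (false || true)"));  // true
        System.out.println(evaluate("!true || false"));           // false
    }
}
```

Each parse method corresponds to one rule of the rewritten grammar, and the nesting of the calls mirrors the parse tree.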

What context free grammars can't do
• Context free grammars cannot be used to check that a variable is declared before it is used
  • And certainly not to check the variable's type
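
Checks like this are typically done after parsing, with a table of the names seen so far. The sketch below is deliberately simplified: it assumes the program has already been reduced to a list of hypothetical events of the form "decl x" and "use x", which is an invented format for illustration and not something a real compiler would use as is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DeclaredBeforeUse {
    // Returns the names that are used before they have been declared.
    // Each event is a hypothetical pre-parsed step: "decl x" or "use x".
    static List<String> undeclaredUses(List<String> events) {
        Set<String> symbolTable = new HashSet<>();   // names declared so far
        List<String> errors = new ArrayList<>();
        for (String event : events) {
            String[] parts = event.split(" ");
            String kind = parts[0];
            String name = parts[1];
            if (kind.equals("decl")) {
                symbolTable.add(name);
            } else if (kind.equals("use") && !symbolTable.contains(name)) {
                errors.add(name);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("decl x", "use x", "use y", "decl y");
        System.out.println(undeclaredUses(events));  // prints [y]
    }
}
```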

The phases of a compiler
• Lexical analysis (scanning)
  • Using regular expressions
• Syntax analysis (parsing)
  • Using context free grammars
• Semantic analysis
  • Using a symbol table
• Code generation

References
• Wikipedia
  • Regular expression, http://en.wikipedia.org/wiki/Regular_expression
  • Context-free grammar, http://en.wikipedia.org/wiki/Context-free_grammar
• Friedl: Mastering Regular Expressions, 2nd edition, O'Reilly 2002
  • An entire book (460 pages) devoted to regular expressions
• J2SE 5.0 API specification
  • package java.util.regex
• Scott A. Hommel: Regular Expressions, The Java Tutorial
  • http://java.sun.com/docs/books/tutorial/extra/regex/index.html
• Lewis & Papadimitriou: Elements of the Theory of Computation, Pearson 1997
  • Introduction to regular expressions and context free grammars (and a lot more)
• Aho, Sethi & Ullman: Compilers: Principles, Techniques and Tools, Addison Wesley 1986
  • A famous book on compilers
  • Referred to as "The Dragon Book"