
Lesson 1

Lesson 1. CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg. Outline. Introduction to compilers Regular languages Regular expressions Finite automata Grammars. Introduction to compilers. What is a compiler?. Why study compiler theory?. Easily create language processors


Presentation Transcript


  1. Lesson 1 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg

  2. Outline • Introduction to compilers • Regular languages • Regular expressions • Finite automata • Grammars

  3. Introduction to compilers

  4. What is a compiler?

  5. Why study compiler theory? • Easily create language processors • Parser for configuration files (e.g. XML) • Translator from one language to another • Command line interpreter • Etc...

  6. Why study compiler theory? • Deeper understanding of compilers • How to write efficient code • Understand design decisions in a language

  7. Why study compiler theory? • Improve your programming skills • Top–down design • Thinking “recursively” • Processing of data structures • Easier to learn new languages

  8. History • 1st generation: programming in binary • 2nd generation: assembly code • 3rd generation: structured languages • Fortran, Ada, C, … • Increased productivity • Reduced logical errors

  9. History • 3rd generation required compilers • First Fortran compiler took 18 man years to complete • Today: a student can develop a compiler in a 10-week course!

  10. Lexical analysis • Characters → tokens • Lexemes
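
The characters → tokens step can be illustrated with a minimal sketch. The token classes below (NUMBER, IDENT, OP) and the toy language they describe are assumptions made for illustration; they do not come from the slides. Each token is reported together with its lexeme, the exact substring that matched.

```python
import re

# Hypothetical token classes for a toy language (not from the slides).
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),  # whitespace separates tokens but is not a token itself
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn a character string into a list of (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x = 42 + y"))
```

Real lexers are usually generated from regular definitions or hand-written as automata; using Python's `re` module here is only a convenient stand-in.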

  11. Syntactical analysis (parsing) • Tokens → syntax tree
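
The tokens → syntax tree step might look like the following sketch. It assumes a hypothetical two-rule grammar for sums of numbers with parentheses (expr → term ("+" term)*, term → NUMBER | "(" expr ")"), which is not taken from the slides, and represents the tree as nested tuples.

```python
def parse(tokens):
    """Parse a list of token strings into a nested-tuple syntax tree."""
    tree, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def parse_expr(tokens, pos):
    # expr -> term ("+" term)*
    left, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        left = ("+", left, right)
    return left, pos


def parse_term(tokens, pos):
    # term -> NUMBER | "(" expr ")"
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == "(":
        tree, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return tree, pos + 1
    if token.isdigit():
        return ("num", int(token)), pos + 1
    raise SyntaxError(f"unexpected token {token!r}")


print(parse(["1", "+", "(", "2", "+", "3", ")"]))
```

Each grammar rule becomes one function, which is the "thinking recursively" style that hand-written parsers rely on.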

  12. Semantical analysis

  13. Intermediate code generation

  14. Code optimization

  15. Code generation

  16. Regular languages

  17. Definition: alphabet • Finite set of symbols • Examples: • Latin alphabet: { a, …, z, A, …, Z } • Decimal digits: { 0, …, 9 } • Binary digits: { 0, 1 } • Often denoted Σ

  18. Definition: string • Sequence of symbols from an alphabet • Examples: • ”Hello” over { a, …, z, A, …, Z } • ”16332” over { 0, …, 9 } • Length of a string: |Hello| = 5 • The empty string: ε, |ε| = 0

  19. Definition: language • (In)finite set of strings • Examples: • { January, …, December } • Alphabet: { a, …, z, A, …, Z } • { 0, …, 9, 10, …, 19, 20, … } • Alphabet: { 0, …, 9 } • The empty language: Ø • Does not even contain ε

  20. Operations on languages Let L1 = { ab, cd } and L2 = { ij, kl } • Concatenation: • L1L2 = { abij, abkl, cdij, cdkl } • L2L1 = { ijab, ijcd, klab, klcd } • Union: • L1 ∪ L2 = { ab, cd, ij, kl } • Kleene closure: • L1* = { ε, ab, abab, abcd, abcdab, cdcdab, abcdcdcdab, ... }
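
These three operations can be tried out on finite languages with Python sets. This is a sketch under one stated assumption: because the Kleene closure is infinite here, the helper `kleene` (a name chosen for this example) only enumerates strings built from a bounded number of factors.

```python
from itertools import product


def concat(a, b):
    """Concatenation: every string of a followed by every string of b."""
    return {x + y for x, y in product(a, b)}


def kleene(lang, max_factors):
    """Strings made by concatenating at most max_factors strings from lang.

    This is only a finite slice of the (generally infinite) Kleene closure.
    """
    result = {""}  # zero factors gives the empty string ε
    layer = {""}
    for _ in range(max_factors):
        layer = concat(layer, lang)
        result |= layer
    return result


L1 = {"ab", "cd"}
L2 = {"ij", "kl"}

print(sorted(concat(L1, L2)))   # concatenation L1L2
print(sorted(L1 | L2))          # union is plain set union
print(sorted(kleene(L1, 2)))    # all strings of L1* with at most two factors
```

Note that concatenation is not commutative: `concat(L1, L2)` and `concat(L2, L1)` are different sets.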

  21. Examples of regular languages • Keywords • if, while, public, … • Identifiers • x, tmp1, tmp2, my_func, main, … • Numeric literals • 142, 0x23A0F, 23.8, … • Operators • +, -, +=, (, ), …

  22. Regular expressions • Specify regular languages • Used in e.g. sed, grep, Visual Studio • Mixes symbols and operators: • ab* • (bla)+ • K(ä|je|a|ae)llberg

  23. Operators in regular expressions • Concatenation • Kleene star: * • Union: | • Syntactic sugar • Character classes, +, ?, etc. • Operator precedence: • * and + • Concatenation • Union
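
The precedence order can be checked experimentally. The sketch below uses Python's `re` module, whose basic operators follow the same precedence as listed on the slide; the helper name `full` is just a convenience for this example.

```python
import re


def full(pattern, text):
    """True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# The star binds tighter than concatenation: ab* means a(b*), not (ab)*.
print(full(r"ab*", "abbb"), full(r"ab*", "abab"))
print(full(r"(ab)*", "abab"))

# Concatenation binds tighter than union: a|bc means a|(bc), not (a|b)c.
print(full(r"a|bc", "a"), full(r"a|bc", "bc"), full(r"a|bc", "ac"))
print(full(r"(a|b)c", "ac"))

# Parentheses override precedence, as in the surname example.
print(full(r"K(ä|je|a|ae)llberg", "Källberg"))
```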

  24. Examples of regular expressions • Date strings, e.g. “2011-04-04” • Regular definition: D → 0 | ... | 9 • Regular expression: DDDD-DD-DD
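
As a sketch, the expression DDDD-DD-DD can be written in Python's `re` syntax, with the character class `[0-9]` playing the role of the regular definition D. The function name `looks_like_date` is chosen for this example.

```python
import re

D = r"[0-9]"  # the regular definition D -> 0 | ... | 9
DATE = re.compile(f"{D}{D}{D}{D}-{D}{D}-{D}{D}")


def looks_like_date(text):
    """True if text has the shape DDDD-DD-DD."""
    return DATE.fullmatch(text) is not None


print(looks_like_date("2011-04-04"))
print(looks_like_date("2011-4-4"))
print(looks_like_date("2011-13-45"))  # right shape, but not a real calendar date
```

The expression only checks the shape of the string; ruling out impossible months and days would need a longer expression or a separate check.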

  25. Examples of regular expressions • E-mail addresses: s1@s2, where s1 and s2 are strings of letters, digits, and periods, where a period may not appear at the beginning or the end, and two periods may not appear in succession...

  26. Examples of regular expressions • Regular definition: S→ a | … | z | A | … | Z | 0 | … | 9 • Regular expression: S+(.S+)*@S+(.S+)*
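
The expression above can be transcribed into Python's `re` syntax as a sketch. Two assumptions are worth stating: the period is meant literally, so it has to be escaped as `\.` in tools where a bare `.` matches any character, and the pattern implements only the simplified definition of an address from the slides, not real-world e-mail syntax. The address `a.b@example.com` is a made-up test value.

```python
import re

S = r"[A-Za-z0-9]"  # the regular definition S: one letter or digit
NAME = rf"{S}+(\.{S}+)*"  # S+(.S+)* with a literal period
EMAIL = re.compile(rf"{NAME}@{NAME}")


def is_simple_email(text):
    """True if text matches S+(.S+)*@S+(.S+)*."""
    return EMAIL.fullmatch(text) is not None


for candidate in ["a.b@example.com", ".a@example.com", "a..b@example.com", "a.@example.com"]:
    print(candidate, is_simple_email(candidate))
```

Requiring at least one S on both sides of every period is what rules out leading, trailing, and doubled periods.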

  27. Exercise (1) • Write regular expressions for • valid identifier names in e.g. C, C#, or Java – a letter or an underscore (“_”) followed by zero or more letters, digits, or underscores. • strings over the alphabet { a, b } that begin and end with the same letter. • numbers evenly divisible by 2. Write expressions for both the decimal alphabet and the binary alphabet.

  28. Regular expressions on the web http://www.regular-expressions.info/

  29. DFA • Deterministic Finite Automata • States and state transitions • Consumes strings • Initial and final states
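
A DFA can be simulated with a transition table and a single current state. The sketch below assumes a dictionary representation and uses, as its example, an automaton for the language of the regular expression ab*; the state names `q0` and `q1` are made up. A missing table entry is treated as a move to an implicit dead state.

```python
def run_dfa(transitions, start, accepting, text):
    """Consume text symbol by symbol; accept if the run ends in a final state."""
    state = start
    for symbol in text:
        key = (state, symbol)
        if key not in transitions:
            return False  # no transition defined: implicit dead state
        state = transitions[key]
    return state in accepting


# DFA for ab*: exactly one a, followed by zero or more b's.
AB_STAR = {
    ("q0", "a"): "q1",
    ("q1", "b"): "q1",
}

for word in ["a", "abbb", "", "ba", "aba"]:
    print(repr(word), run_dfa(AB_STAR, "q0", {"q1"}, word))
```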

  30. NFA • Nondeterministic Finite Automata • More than one transition per (state, input symbol) • Several initial states • ε transitions
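
One way to simulate an NFA is to track the set of all states it could be in. The sketch below assumes a dictionary mapping (state, symbol) to a set of target states and uses the empty string as the label for ε transitions; both are representation choices for this example. The two example automata are made up: one has two transitions on the same (state, symbol) pair, the other uses an ε transition.

```python
EPSILON = ""  # label used here for ε transitions (a representation choice)


def epsilon_closure(transitions, states):
    """All states reachable from `states` using only ε transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in transitions.get((state, EPSILON), ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


def run_nfa(transitions, initial, accepting, text):
    """Track the set of states the NFA could be in after each symbol."""
    current = epsilon_closure(transitions, initial)
    for symbol in text:
        moved = set()
        for state in current:
            moved |= set(transitions.get((state, symbol), ()))
        current = epsilon_closure(transitions, moved)
    return bool(current & set(accepting))


# Strings over {a, b} ending in "ab": state 0 has two transitions on a.
ENDS_IN_AB = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}

# a*b*: an ε transition moves from the a-loop to the b-loop.
A_STAR_B_STAR = {(0, "a"): {0}, (0, EPSILON): {1}, (1, "b"): {1}}

print(run_nfa(ENDS_IN_AB, {0}, {2}, "bab"))
print(run_nfa(A_STAR_B_STAR, {0}, {1}, "aabb"))
```

Because the set of possible states is always a subset of a finite state set, this simulation is itself deterministic, which is the idea behind converting an NFA to a DFA.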

  31. Exercise (2) • Create DFAs that accept • the languages Ø, { ε }, { a }, S+, and S* • e-mail addresses. • Recall: S+(.S+)*@S+(.S+)* • strings over { a, b } that start and end with the same letter.

  32. Limitations of regular languages • Example: The language of well-formed parenthesis expressions: { ε, (), (())(), ()(()()), (()()((()))()), ... } • Try to create a finite automaton...
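
A short sketch shows why a finite automaton struggles here. Recognizing well-formed parentheses is easy with a counter, but the counter can grow without bound, while a finite automaton only has a fixed number of states to remember the nesting depth with. The function name `is_well_formed` is chosen for this example.

```python
def is_well_formed(text):
    """Check a string over the alphabet { (, ) } for balanced nesting."""
    depth = 0  # number of currently open parentheses; has no upper bound
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # a ")" with no matching "("
        else:
            return False  # symbol outside the alphabet
    return depth == 0


for word in ["", "()", "(())()", "(()", ")("]:
    print(repr(word), is_well_formed(word))
```

An automaton with n states can distinguish at most n nesting depths, so a sufficiently deep input will always be misjudged.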

  33. Grammars • More powerful • Example: • Language: { a, ab, abb, abbb, ... } • Grammar:

  34. Grammars S → a B; B → ε; B → b B • a and b = terminals • S and B = nonterminals • S = starting symbol

  35. Grammars S → a B; B → ε; B → b B • Derivation of ”abb”: S ⇒ a B ⇒ a b B ⇒ a b b B ⇒ a b b
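
The derivation can be reproduced mechanically. The sketch below hard-codes the three productions of this grammar (it is not a general grammar engine) and returns the list of sentential forms, or `None` when the string is not in the language; the function name `derive` is chosen for this example.

```python
def derive(text):
    """Return the derivation of text as a list of sentential forms, or None."""
    if not text or text[0] != "a":
        return None  # S -> a B requires a leading a
    steps = ["S", "a B"]  # apply S -> a B
    prefix = ["a"]
    for ch in text[1:]:
        if ch != "b":
            return None  # B can only produce b's
        prefix.append("b")  # apply B -> b B
        steps.append(" ".join(prefix) + " B")
    steps.append(" ".join(prefix))  # apply B -> ε to finish
    return steps


print(" ⇒ ".join(derive("abb")))
```

Every step replaces the single nonterminal in the current sentential form, which is why the derivation for this grammar is unique.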

  40. Conclusion • Parts of a compiler • Regular languages • Regular expressions • Finite automata • Grammars

  41. Next time • Context-free languages • More grammars • Parse trees • Pushdown automata
