1 / 18

Formal Languages and Compiler Design: Overview and Operation

Dive into the intricacies of formal languages and compiler design with this comprehensive course. Explore the history, structure, and algorithms behind compilers, from scanning to symbol table management. Understand key concepts like token classification, error handling, and intermediary code optimization.

lorrij
Download Presentation

Formal Languages and Compiler Design: Overview and Operation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Formal languages and Compiler Design Simona Motogna S. Motogna - LFTC

  2. Why? S. Motogna - LFTC

  3. Organization Issues • Course – 2 h/ week • Seminar – 2h/week • Laboratory - 2 h/week PRESENCE IS MANDATORY 10 presences – seminar 12 presences - lab S. Motogna - LFTC

  4. Organization Issues • Final grade = 70% written exam + 20% lab + 10% seminar Lab: - all laboratory assignments are mandatory - delays NO more than 2 weeks Seminar: - solved problems, answers (blackboard), homeworks S. Motogna - LFTC

  5. References • See fișa disciplinei S. Motogna - LFTC

  6. What is a compiler? Interpreter? Object code / program Source code / program Compiler Assembler? S. Motogna - LFTC

  7. A little bit of history … • Java • 1995 • J. Gosling • C • 1969 - 1973 • D. Ritchie • Pascal • 1968 - 1970 • N. Wirth • Lisp • 1962 • McCarthy • Fortran • 1954-1957 • Backus S. Motogna - LFTC

  8. Structure of a compiler Error handling analysis Source code/ program tokens Syntax tree synthesis Adnotated syntax tree Symbol Table management Intermediary code Object code / program Optimized intermediary code S. Motogna - LFTC

  9. Chapter 1. Scanning Definition = treats the source program as a sequence of characters, detect lexical tokens, classify and codify them INPUT: source program OUTPUT: PIF + ST Algorithm Scanning v1 While (not(eof)) do detect(token); classify(token); codify(token); End_while S. Motogna - LFTC

  10. Detect I am a student. • Separators => Remark 1) + 2) • Look-ahead => Remark 3) if (x==y) {x=y+2} S. Motogna - LFTC

  11. Classify • Classes of tokens: • Identifiers • Constants • Reserved words (keywords) • Separators • Operators • If a token can NOT be classified => LEXICAL ERROR S. Motogna - LFTC

  12. Codify • Codification table • Identifier, constant => Symbol Table (ST) • PIF = Program Internal Form = array of pairs • Token – replaced by pair (code, position in ST) identifier, constant S. Motogna - LFTC

  13. Algorithm Scanning v2 While (not(eof)) do detect(token); iftoken is reserved word OR operator OR separator thengenFIP(code, 0) else if token is identifier OR constant then index = pos(token, ST); genFIP(code, index) else message “Lexical error” endif endif endwhile S. Motogna - LFTC

  14. Remarks: • genFIP = adds a pair (code, position) to PIF • Pos(token,ST) – searches token in symbol table ST; if found then return position; if not found insert in SR and return position • Order of classification (reserved word, then identifier) • If-then-else imbricate => detect error if a token cannot be classified S. Motogna - LFTC

  15. Symbol Table Definition = contains all information collected during compiling regarding the symbolic names from the source program identifiers, constants, etc. Variants: - Unique symbol table – contains all symbolic names - distinct symbol tables: IT (identifiers table) + CT (constants table) S. Motogna - LFTC

  16. ST organization Remark: search and insert • Unsorted table – in order of detection in source code O(n) • Sorted table: alphabetic (numeric) O(lg n) • Binary search tree (balanced) O(lg n) • Hash table O(1) S. Motogna - LFTC

  17. Hash table • K = set of keys (symbolic names) • A = set of positions (|A| = m; m –prime number) h : K → A h(k) = (val(k) mod m) + 1 • Conflicts: k1 ≠ k2 , h(k1) = h(k2) S. Motogna - LFTC

  18. Visibility domain (scope) • Each scope – separate ST • Structure -> inclusion tree S. Motogna - LFTC

More Related