CSC 313 – Advanced Programming Topics
Lecture 6: Building a Compiler – Lessons in Good Design
Today’s Goal
• Make you forget the reading that was assigned
• I went back & reviewed the others; none are like this one
• Highlight the process used by modern compilers
• Consider the steps taken & how the design works
• Where optimizations occur will be discussed
• Review a kick-ass design honed over the years
• Understand the decisions to see good design in action
• See the design process that results in design patterns
Translations
• Translating docs from English to another language
• But all of this material was selected from a college library
• Undecipherable & meaningless documents included
• Workers are highly trained, so saving time (& $$) is critical
• Reject untranslatable documents & explain why
• Need to give workers a chance to correct mistakes
• Money is important, so limit time spent on these documents
How can we do this?
What Errors Can Occur?
• Document read in is unreadable or not English
  αυηθιμςτϕϖκλσροπ χψτνγδζεεθμςτυ
• Has “words”, but grammatically incorrect
  4 u 2 c I sent U pix. ty
• Meaningless jumble means cannot translate
  Mary flow over the swimming pizza
How To Translate? BUT WAIT
Money, money, money!
• This project is useful for many languages
• Different alphabet or punctuation may be used
• Will need to adjust to different grammar rules
• Change in scoping rules & variable declaration types (Oops)
• Could translate to many different languages
• (Unless written by Microsoft or in Alabama)
How To Translate? (Redux)
• Split translation process into multiple layers
• Lexical analyzer checks for legal words & punctuation
• Grammar errors found by parser (syntactic analysis)
• Semantic analysis checks that the document is meaningful
• Optimizations happen here, since program is known to be valid
• Code generator “lowers” results into new language
All About the Benjamins
• Most layers’ actions are language-dependent
• Reusing details is hard, but the algorithms are very similar
• Between layers, pass language-independent data
• Define protocols specifying how layers interact
• Each layer is independent & can change easily
• Limit rewriting; instead reuse whenever possible
• Must follow protocols, but these are set out clearly
• Popular approach: gcc, Xcode, Google Translate
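The layering idea can be sketched in a few lines. This is only a toy illustration under stated assumptions: the `Pipeline` class, the four stage names, and every toy layer below are hypothetical and not from the lecture. The point it shows is that each layer sees only the previous layer's output, so one layer can be swapped without touching the rest.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Pipeline:
    """Chains layers; the 'protocol' is just the type passed between them."""
    lexer: Callable[[str], List[str]]      # text   -> tokens
    parser: Callable[[List[str]], list]    # tokens -> syntax tree
    checker: Callable[[list], list]        # tree   -> checked tree
    generator: Callable[[list], str]       # tree   -> output text

    def translate(self, text: str) -> str:
        tokens = self.lexer(text)
        tree = self.parser(tokens)
        checked = self.checker(tree)
        return self.generator(checked)


# Toy layers (all hypothetical, chosen only to keep the sketch runnable).
def whitespace_lexer(text):
    return text.split()


def flat_parser(tokens):
    if not tokens:
        raise ValueError("syntax error: empty input")
    return ["sentence", *tokens]


def no_op_checker(tree):
    return tree


def upper_generator(tree):
    return " ".join(word.upper() for word in tree[1:])


def reversed_generator(tree):
    return " ".join(reversed(tree[1:]))


# Same front end, two different back ends: only the last layer changes.
shout = Pipeline(whitespace_lexer, flat_parser, no_op_checker, upper_generator)
backwards = Pipeline(whitespace_lexer, flat_parser, no_op_checker, reversed_generator)
```

Here `shout` and `backwards` reuse the first three layers unchanged, which is the reuse the slide is after: a new target language means a new generator, not a new translator.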
Planning Data & Interactions
• Data to be passed depends on the work layers do
• Individual words & punctuation output by lexer
• Parser goes through this data & finds sentences
• Build paragraphs by checking sentences are meaningful
• Interactions based on input size needed
• Item-by-item iteration needed to build sentences
• Semantic analysis works by using whole module
• Entire program needed for optimizations & output
Resulting Design
• Lexer is an Iterator for words & punctuation
• Illegal characters found & reported as errors
• Parser builds syntax trees for each module
• Function, class, or file may be the language’s module
• Must be conservative: outputs result for entire file
• Good syntax tree output by semantic analysis
• When a use is found, variable linked to its declaration & type
• Calls matched with functions & all checked for legality
• Result called Higher-order Intermediate Representation
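To make "lexer is an Iterator" concrete, here is a minimal sketch using the document-translation analogy. The class `WordLexer`, the function `sentences`, and the set of legal characters are assumptions made for illustration. The lexer hands out one word or punctuation mark at a time and reports illegal characters; the parser-like consumer pulls items one by one to build sentences.

```python
import string


class WordLexer:
    """Iterator over words and punctuation; raises on illegal characters."""
    # Assumed alphabet for the sketch: ASCII letters, digits, apostrophe.
    LEGAL = set(string.ascii_letters + string.digits + "'")
    PUNCT = set(".,!?")

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        text = self.text
        # Skip whitespace between items.
        while self.pos < len(text) and text[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(text):
            raise StopIteration
        if text[self.pos] in self.PUNCT:
            self.pos += 1
            return text[self.pos - 1]
        # Otherwise read a word, checking every character is legal.
        start = self.pos
        while (self.pos < len(text)
               and not text[self.pos].isspace()
               and text[self.pos] not in self.PUNCT):
            if text[self.pos] not in self.LEGAL:
                raise ValueError(
                    f"illegal character {text[self.pos]!r} at offset {self.pos}")
            self.pos += 1
        return text[start:self.pos]


def sentences(tokens):
    """Group tokens, item by item, into sentences ending in . ! or ?"""
    current = []
    for tok in tokens:
        current.append(tok)
        if tok in {".", "!", "?"}:
            yield current
            current = []
    if current:
        raise ValueError("unterminated sentence: " + " ".join(current))
```

Because `sentences` accepts any iterable, it never needs to know how the lexer works internally, only that it yields one item per request.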
Lexical Analysis In Detail
• Creates “tokens” from file being compiled
• While it does this, checks characters are legal
• Keywords identified (for, int, while…) in input
• Code scanned to identify literals (strings, numbers)
• Does not understand code; only splits it up
• Only layer looking at the raw source text, so it skips comments
How Does Lexer Work?
• Regular expressions defined for items to find
• They’re back!! Turned into one big finite state machine
• Defines enum or hierarchy for returned data
    struct Symbol { Token type; String value; }
    enum Token { NUMBER, STRING, WORD, GT, ... }
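A regular-expression lexer along these lines might look like the sketch below. `Symbol` and the `Token` members NUMBER, STRING, WORD and GT come from the slide; the patterns themselves, the `//` comment syntax, and the `tokenize` function name are assumptions. Python's `re` module does the job of the "big finite state machine" here by combining all patterns into one alternation.

```python
import re
from dataclasses import dataclass
from enum import Enum, auto


class Token(Enum):
    NUMBER = auto()
    STRING = auto()
    WORD = auto()
    GT = auto()


@dataclass(frozen=True)
class Symbol:
    type: Token
    value: str


# Order matters: earlier alternatives win. All patterns are illustrative.
_SPEC = [
    ("COMMENT", r"//[^\n]*"),       # assumed comment syntax; skipped
    ("SKIP", r"\s+"),               # whitespace; skipped
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("STRING", r'"[^"\n]*"'),
    ("WORD", r"[A-Za-z_]\w*"),
    ("GT", r">"),
    ("ILLEGAL", r"."),              # anything else is an error
]
_MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in _SPEC))


def tokenize(source):
    """Yield Symbols one at a time; comments and whitespace never leave the lexer."""
    for match in _MASTER.finditer(source):
        kind = match.lastgroup
        if kind in ("COMMENT", "SKIP"):
            continue
        if kind == "ILLEGAL":
            raise ValueError(
                f"illegal character {match.group()!r} at offset {match.start()}")
        yield Symbol(Token[kind], match.group())
```

Adapting this to another language means editing `_SPEC` only; the loop in `tokenize` stays the same.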
Why Use State Machines?
• Limits knowledge needed for coding a lexer
• Machine has set of states connected by edges
• State determines next state, given an input
• Need to know initial & accepting states only
• States define language; processing unchanged
• Obviously, this is part of the State design pattern (covered later in term, in case you are wondering)
• Maximizes reuse of code within lexical analysis
• Limits rework to developing regular expressions
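"States define language; processing unchanged" can be shown with a small table-driven machine. This is a sketch under assumptions: the `Machine` class, `char_class`, `accepts`, and the two example machines are hypothetical, and it uses a plain transition table rather than the class structure of the State design pattern. The driver loop only needs the initial state, the accepting states, and the edges.

```python
from dataclasses import dataclass


def char_class(ch):
    """Collapse characters into the few input classes the edges care about."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch == ".":
        return "dot"
    return "other"


@dataclass
class Machine:
    start: str
    accepting: frozenset
    edges: dict  # (state, input class) -> next state


def accepts(machine, text):
    """Generic driver: identical for every machine."""
    state = machine.start
    for ch in text:
        state = machine.edges.get((state, char_class(ch)))
        if state is None:  # no edge for this input: reject
            return False
    return state in machine.accepting


# Two "languages" expressed purely as data.
NUMBER = Machine(
    start="start",
    accepting=frozenset({"int", "frac"}),
    edges={
        ("start", "digit"): "int",
        ("int", "digit"): "int",
        ("int", "dot"): "point",
        ("point", "digit"): "frac",
        ("frac", "digit"): "frac",
    },
)

WORD = Machine(
    start="start",
    accepting=frozenset({"word"}),
    edges={
        ("start", "letter"): "word",
        ("word", "letter"): "word",
        ("word", "digit"): "word",
    },
)
```

Supporting a new kind of token means writing a new table; `accepts` is never edited, which is the reuse the slide describes.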
For Next Lecture
• Read pages 37 – 55 in book
• Get back into easier code & design patterns
• Why is 1-to-many communication a problem?
• Why is this critical to all event-based coding?
• Is Lindsay Lohan's secret to success related to this?