Compiler Construction Overview
Today’s Goals • Summary of the subjects we’ve covered • Perspectives and final remarks
High-level View Definitions • Compiler consumes source code & produces target code • usually translates high-level language programs into machine code • Interpreter consumes executable code & produces results • acts as a virtual machine for the input code
Why Study Compilers? • Compilers are important • Enabling technology for languages, software development • Allow programmers to focus on problem solving, hiding the hardware complexity • Responsible for good system performance • Compilers are useful • Language processing is broadly applicable • Compilers are fun • Combine theory and practice • Overlap with other CS subjects • Hard problems • Engineering and trade-offs • Got a taste in the labs!
Lexical Analysis • Scanner • Maps character stream into tokens • Automate scanner construction • Define tokens using Regular Expressions • Construct NFA (Nondeterministic Finite Automata) to recognize REs • Transform NFA to DFA • Convert NFA to DFA through subset construction • DFA minimization (set split) • Building scanners from DFA • Tools • ANTLR, lex
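The pipeline above (REs → NFA → DFA → scanner) is what tools like lex automate. A minimal sketch of the end result in Python, using the `re` module's compiled automata in place of a hand-built DFA (the token names and patterns here are illustrative, not from the labs):

```python
import re

# Token definitions as (name, regex) pairs; order matters, since the
# master pattern tries alternatives left to right.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),
]
MASTER_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def scan(source):
    """Map a character stream into a list of (token-name, lexeme) pairs."""
    tokens = []
    for m in MASTER_RE.finditer(source):
        if m.lastgroup != "SKIP":          # discard whitespace
            tokens.append((m.lastgroup, m.group()))
    return tokens
```

For example, `scan("x = 42")` yields `[("IDENT", "x"), ("OP", "="), ("NUMBER", "42")]`.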
Syntax Analysis • Parsing language using CFG (context-free grammar) • CFG grammar theory • Derivation • Parse tree • Grammar ambiguity • Parsing • Top-down parsing • recursive descent • table-driven LL(1) • Bottom-up parsing • LR(1) shift reduce parsing • Operator precedence parsing
Top-down Predictive Parsing • Basic idea: build the parse tree from the root. Given A → α | β, use the look-ahead symbol to choose between α and β • Recursive descent • Table-driven LL(1) • Left recursion elimination
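A minimal recursive-descent sketch for an illustrative expression grammar (`expr → term { ('+'|'-') term }`, `term → NUM { '*' NUM }`), with left recursion already eliminated; the look-ahead symbol (`peek`) chooses which production to apply:

```python
class Parser:
    """Recursive descent over a token list; each nonterminal is a method."""
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):            # expr -> term { ('+'|'-') term }
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.next() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):            # term -> factor { '*' factor }
        value = self.factor()
        while self.peek() == "*":
            self.next()
            value *= self.factor()
        return value

    def factor(self):          # factor -> NUM
        return int(self.next())
```

Note the grammar is written with iteration (`{ ... }`) rather than left recursion, which is exactly the transformation a top-down parser requires: `expr → expr '+' term` would recurse forever.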
Bottom-up Shift-Reduce Parsing • Build reverse rightmost derivation • The key is to find handle (rhs of production) • All active handles include top of stack (TOS) • Shift inputs until TOS is right end of a handle • Language of handles is regular (finite) • Build a handle-recognizing DFA • ACTION & GOTO tables encode the DFA
Semantic Analysis • Analyze context and semantics • types and other semantic checks • Attribute grammar • associate evaluation rules with grammar production • Ad-hoc • build symbol table
Intermediate Representation • Front-end translates program into IR format for further analysis and optimization • IR encodes the compiler’s knowledge of the program • Largely machine-independent • Move closer to standard machine model • AST: high-level tree IR • Linear IR: low-level • ILOC 3-address code • Assembly-level operations • Expose control flow, memory addressing • unlimited virtual registers
Procedure Abstraction • Procedure is key language construct for building large systems • Name Space • Caller-callee interface: linkage convention • Control transfer • Context protection • Parameter passing and return value • Run-time support for nested scopes • Activation record, access link, display • Inheritance and dynamic dispatch for OO • multiple inheritance • virtual method table
The Back-end • Instruction selection • Mapping IR into assembly code • Assumes a fixed storage mapping & code shape • Combining operations, using address modes • Instruction scheduling • Reordering operations to hide latencies • Assumes a fixed program (set of operations) • Changes demand for registers • Register allocation • Deciding which values will reside in registers • Changes the storage mapping, may add false dependences through register reuse • Concerns placement of data & memory operations
Code Generation • Expressions • Recursive tree walk on AST • Direct integration with parser • Assignment • Array reference • Boolean & Relational Values • If-then-else • Case • Loop • Procedure call
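The "recursive tree walk on AST" for expressions can be sketched as follows; the tuple-based AST encoding and the ILOC-style mnemonics are simplified for illustration, and virtual registers are unlimited as in the IR slide above:

```python
import itertools

def gen(node, code, regs=None):
    """Walk the AST bottom-up, emit ILOC-style three-address code,
    and return the virtual register holding the node's result.
    AST nodes are tuples: ("num", 5) or ("add"/"mult", left, right)."""
    regs = regs if regs is not None else itertools.count(1)
    kind = node[0]
    if kind == "num":                       # leaf: load a constant
        r = f"r{next(regs)}"
        code.append(f"loadI {node[1]} => {r}")
        return r
    # interior node: generate both operands, then combine their results
    left = gen(node[1], code, regs)
    right = gen(node[2], code, regs)
    r = f"r{next(regs)}"
    code.append(f"{kind} {left}, {right} => {r}")
    return r
```

For `("add", ("num", 1), ("num", 2))` this emits `loadI 1 => r1`, `loadI 2 => r2`, `add r1, r2 => r3`.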
Instruction Selection • Hand-coded tree-walk code generator • Automatic instruction selection • Pattern matching • Peephole Matching • Tree-pattern matching through tiling
Instruction Scheduling • The problem: given a code fragment for some target machine and the latencies for each individual operation, reorder the operations to minimize execution time • Build precedence graph • List scheduling • NP-complete problem • Heuristics work well for basic blocks • forward list scheduling • backward list scheduling • Scheduling for larger regions • EBB and cloning • Trace scheduling
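Forward list scheduling over a precedence graph can be sketched as follows. This sketch assumes a single-issue machine (one operation per cycle) and uses a simple longest-latency heuristic; real schedulers use richer priorities such as latency-weighted path length:

```python
def list_schedule(ops, deps):
    """ops: {name: latency}; deps: {name: set of ops it must follow}.
    Returns a list of (cycle, op) pairs."""
    done = {}                 # op -> cycle at which its result is ready
    schedule = []
    cycle = 0
    remaining = set(ops)
    while remaining:
        # an op is ready when all its predecessors' results are available
        ready = [o for o in remaining
                 if all(p in done and done[p] <= cycle
                        for p in deps.get(o, ()))]
        if ready:
            # heuristic: prefer the op with the longest latency
            op = max(ready, key=lambda o: ops[o])
            schedule.append((cycle, op))
            done[op] = cycle + ops[op]
            remaining.remove(op)
        cycle += 1
    return schedule
```

With two 3-cycle loads feeding a 1-cycle add, the loads issue in cycles 0 and 1 and the add waits until cycle 4, hiding most of the load latency.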
Register Allocation • Local register allocation • top-down • bottom-up • Global register allocation • Find live-range • Build an interference graph GI • Construct a k-coloring of interference graph • Map colors onto physical registers
Web-based Live Ranges • Connect common defs and uses • Solve the Reaching data-flow problem!
Interference Graph The interference graph GI • Nodes in GI represent live ranges • Edges in GI represent individual interferences • For x, y ∈ GI, the edge <x,y> is in GI iff x and y interfere • A k-coloring of GI can be mapped onto an allocation to k registers
Key Observation on Coloring • Any vertex n that has fewer than k neighbors in the interference graph (degree n° < k) can always be colored! • Remove nodes with n° < k to form GI′; any k-coloring of GI′ extends to a k-coloring of GI
Chaitin’s Algorithm • While ∃ vertices with < k neighbors in GI • Pick any vertex n such that n°< k and put it on the stack • Remove that vertex and all edges incident to it from GI • This will lower the degree of n’s neighbors • If GI is non-empty (all vertices have k or more neighbors) then: • Pick a vertex n (using some heuristic) and spill the live range associated with n • Remove vertex n from GI , along with all edges incident to it and put it on the stack • If this causes some vertex in GI to have fewer than k neighbors, then go to step 1; otherwise, repeat step 2 • If no spill, successively pop vertices off the stack and color them in the lowest color not used by some neighbor; otherwise, insert spill code, recompute GI and start from step 1
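Chaitin's simplify/select phases can be sketched as follows. Spilling is omitted for brevity: where a real allocator would pick a live range to spill, insert spill code, and rebuild GI, this sketch simply reports failure:

```python
def color(graph, k):
    """graph: {node: set of neighbors}. Returns {node: color}
    using colors 0..k-1, or None when simplify gets stuck
    (i.e. a real allocator would spill here)."""
    g = {n: set(nbrs) for n, nbrs in graph.items()}   # work on a copy
    stack = []
    # Simplify: repeatedly remove a vertex with degree < k,
    # saving its neighbor set at removal time.
    while g:
        n = next((v for v in g if len(g[v]) < k), None)
        if n is None:
            return None           # all vertices have >= k neighbors
        stack.append((n, g.pop(n)))
        for nbrs in g.values():   # lowers the degree of n's neighbors
            nbrs.discard(n)
    # Select: pop vertices and give each the lowest color
    # not used by an already-colored neighbor.
    colors = {}
    while stack:
        n, nbrs = stack.pop()
        used = {colors[m] for m in nbrs if m in colors}
        colors[n] = min(c for c in range(k) if c not in used)
    return colors
```

A triangle of three mutually interfering live ranges needs three registers: with k = 3 every vertex has degree 2 < 3 and coloring succeeds, while with k = 2 simplify gets stuck immediately.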
Briggs’s Improvement Nodes can still be colored even with k or more neighbors, if some neighbors end up with the same color • While ∃ vertices with < k neighbors in GI • Pick any vertex n such that n° < k and put it on the stack • Remove that vertex and all edges incident to it from GI • This may create vertices with fewer than k neighbors • If GI is non-empty (all vertices have k or more neighbors) then: • Pick a vertex n (using some heuristic condition), push n on the stack and remove n from GI, along with all edges incident to it • If this causes some vertex in GI to have fewer than k neighbors, then go to step 1; otherwise, repeat step 2 • Successively pop vertices off the stack and color them in the lowest color not used by some neighbor • If some vertex cannot be colored, then pick an uncolored vertex to spill, spill it, and restart at step 1
Principles of Compiler Optimization • safety • Does applying the transformation change the results of executing the code? • profitability • Is there a reasonable expectation that applying the transformation will improve the code? • opportunity • Can we efficiently and frequently find places to apply the transformation? • Optimizing compiler • Program Analysis • Program Transformation
Program Analysis • Control-flow analysis • Data-flow analysis
Control Flow Analysis • Basic blocks • Control flow graph • Dominator tree • Natural loops • Dominance frontier • the join points for SSA • insert φ-nodes
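Dominators can be computed with a straightforward iterative fixed-point algorithm (faster algorithms exist, e.g. Lengauer–Tarjan; the CFG encoding here is illustrative):

```python
def dominators(cfg, entry):
    """cfg: {block: list of successor blocks}. Returns {block: set of
    dominators}, where every block dominates itself and the entry
    dominates all blocks."""
    preds = {b: [] for b in cfg}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].append(b)
    dom = {b: set(cfg) for b in cfg}       # start from "all blocks"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in cfg:
            if b == entry or not preds[b]:
                continue
            # a block's dominators: itself plus whatever dominates
            # every one of its predecessors
            new = {b} | set.intersection(*(dom[p] for p in preds[b]))
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom
```

For the diamond CFG A → {B, C}, B → D, C → D, the join point D is dominated only by A and itself, which is exactly why D is in the dominance frontier of both B and C.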
Data Flow Analysis • “compile-time reasoning about the runtime flow of values” • represent effects of each basic block • propagate facts around control flow graph
DFA: The Big Picture • Set up a set of equations that relate program properties at different program points in terms of the properties at "nearby" program points • Transfer function • Forward analysis: compute OUT(B) in terms of IN(B) • Available expressions • Reaching definitions • Backward analysis: compute IN(B) in terms of OUT(B) • Variable liveness • Very busy expressions • Meet function for join points • Forward analysis: combine OUT(p) of predecessors to form IN(B) • Backward analysis: combine IN(s) of successors to form OUT(B)
Available Expressions Basic block b • IN(b): expressions available at b’s entry • OUT(b): expressions available at b’s exit • Local sets • def(b): expressions defined in b and available on exit • killed(b): expressions killed in b • An expression is killed in b if its operands are assigned in b • Transfer function • OUT(b) = def(b) ∪ (IN(b) – killed(b)) • Meet function • IN(b) = ∩p ∈ preds(b) OUT(p)
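The available-expressions equations can be solved with a simple iterative fixed-point loop: initialize OUT sets optimistically to the universe of expressions, then apply the meet and transfer functions until nothing changes. Block names and sets in this sketch are illustrative:

```python
def available_expressions(blocks, preds, DEF, KILLED, universe):
    """Iterative forward data-flow solver for AVAIL.
    blocks: iteration order; preds: {b: list of predecessors};
    DEF/KILLED: {b: set of expressions}; universe: all expressions.
    Returns (IN, OUT) dicts of sets."""
    IN = {b: set() for b in blocks}
    OUT = {b: set(universe) for b in blocks}   # optimistic initialization
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # meet: intersection over predecessors (entry gets the empty set)
            IN[b] = (set.intersection(*(OUT[p] for p in preds[b]))
                     if preds[b] else set())
            # transfer: OUT(b) = DEF(b) U (IN(b) - KILLED(b))
            new_out = DEF[b] | (IN[b] - KILLED[b])
            if new_out != OUT[b]:
                OUT[b] = new_out
                changed = True
    return IN, OUT
```

If block b1 computes `x+y` and its successor b2 assigns to `x`, then `x+y` is available on entry to b2 but killed inside it, so it is not available at b2's exit.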
More Data Flow Problems • AVAIL Equations • More data flow problems • Reaching Definition • Liveness
Compiler Optimization • Local optimization • DAG CSE • Value numbering • Global optimization enabled by DFA • Global CSE (AVAIL) • Constant propagation (Def-Use) • Dead code elimination (Use-Def) • Advanced topic: SSA
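Local value numbering over a single basic block can be sketched as follows. The sketch assumes each name is defined at most once (SSA-like) and ignores commutativity; instructions are illustrative (dst, op, src1, src2) tuples:

```python
def value_number(block):
    """Assign value numbers to names and expressions; replace a
    recomputation of an already-numbered expression with a copy."""
    vn = {}          # name -> value number
    expr = {}        # (op, vn1, vn2) -> name already holding that value
    out = []
    counter = [0]
    def number(name):
        if name not in vn:
            counter[0] += 1
            vn[name] = counter[0]
        return vn[name]
    for dst, op, a, b in block:
        key = (op, number(a), number(b))
        if key in expr:                    # redundant: reuse earlier result
            out.append((dst, "copy", expr[key], None))
            vn[dst] = vn[expr[key]]
        else:
            counter[0] += 1
            vn[dst] = counter[0]
            expr[key] = dst
            out.append((dst, op, a, b))
    return out
```

Here the second `add x, y` is detected as redundant because its (op, value-number, value-number) key matches the first, regardless of what names are involved; that key-based matching is what distinguishes value numbering from textual CSE.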
Perspective • Front end: essentially a solved problem • Middle end: domain-specific languages • Back end: new architectures • Verifying compiler, reliability, security
Interesting Stuff We Skipped • Interprocedural analysis • Alias (pointer) analysis • Garbage collection • Check the literature references in EaC
How will you use the knowledge? • As informed programmer • As informed small language designer • As informed hardware engineer • As compiler writer
Informed Programmer • “Knowledge is power” • Compiler is no longer a black box • Know how the compiler works • Implications • Use of language features • Avoid those that can cause problems • Give the compiler hints • Code optimization • Don’t optimize prematurely • Don’t write complicated code • Debugging • Understand the compiled code
Solving Problems the Compiler Way • Solve problems from a language/compiler perspective • Implement a simple language • Extend an existing language
Informed Hardware Engineer • Compiler support for programmable hardware • pervasive computing • new back-ends for new processors • Design new architectures • what the compiler can and cannot do • how to expose and use the compiler to manage hardware resources
Compiler Writer • Make a living by writing compilers! • Theory • Algorithms • Engineering • We have built: • scanner • parser • AST builder, type checker • register allocator • instruction scheduler • Used compiler generation tools • ANTLR, lex, yacc, etc. • On track to jump into compiler development!
Final Remarks • Compiler construction • Theory • Implementation • How to use what you learned in this lecture? • As informed programmer • As informed small language designer • As informed hardware engineer • As compiler writer … and live happily ever after