Course Overview
PART I: overview material
1 Introduction
2 Language processors (tombstone diagrams, bootstrapping)
3 Architecture of a compiler
PART II: inside a compiler
4 Syntax analysis
5 Contextual analysis
6 Runtime organization
7 Code generation
PART III: conclusion
8 Interpretation
9 Review
Supplementary material: Code optimization
What This Topic is About
The code generated by our compiler is not efficient:
• It computes some values at runtime that could be known at compile time
• It computes some values more times than necessary
We can do better!
• Constant folding
• Common sub-expression elimination
• Code motion
• Dead code elimination
Constant folding
• Consider:
static double pi = 3.14159;
double volume = 4 * pi / 3 * r * r * r;
• The compiler could compute 4 * pi / 3 as 4.18879 before the program runs. How many instructions would this save at run time?
• Why shouldn’t the programmer just write 4.18879 * r * r * r?
Constant folding II
• Consider:
struct { int y, m, d; } holidays[6];
holidays[2].m = 12;
holidays[2].d = 25;
• If the address of holidays is x, what is the address of holidays[2].m?
• Could the programmer evaluate this at compile time? Should the programmer do this?
Constant folding III
• An expression that the compiler should be able to compute the value of is called “manifest”.
• How can the compiler know if the value of an expression is manifest?
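One possible answer, sketched in C: an expression is manifest if every leaf of its expression tree is a literal, and a manifest expression can then be folded by recursive evaluation. The Expr representation below is invented for illustration and is not the one used in the course.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical expression-tree representation (invented for this sketch). */
typedef enum { LIT, ADD, MUL } Kind;

typedef struct Expr {
    Kind kind;
    double value;            /* meaningful only when kind == LIT */
    struct Expr *lhs, *rhs;  /* meaningful only for ADD / MUL */
} Expr;

/* An expression is manifest when all of its leaves are literals. */
static bool is_manifest(const Expr *e) {
    if (e->kind == LIT) return true;
    return is_manifest(e->lhs) && is_manifest(e->rhs);
}

/* Fold a manifest expression to a single constant at "compile time". */
static double fold(const Expr *e) {
    switch (e->kind) {
    case LIT: return e->value;
    case ADD: return fold(e->lhs) + fold(e->rhs);
    default:  return fold(e->lhs) * fold(e->rhs);  /* MUL */
    }
}
```

A real compiler applies the same idea bottom-up during (or after) contextual analysis, replacing manifest subtrees such as 4 * pi / 3 with a literal node.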
Common sub-expression elimination
• Consider:
t = (x - y) * (x - y + z);
• Computing x - y takes three instructions; could we save some of them?
Common sub-expression elimination II
t = (x - y) * (x - y + z);
Naïve code:
load x
load y
sub
load x
load y
sub
load z
add
mult
store t
Better code:
load x
load y
sub
dup
load z
add
mult
store t
Common sub-expression elimination III
• Consider:
struct { int y, m, d; } holidays[6];
holidays[i].m = 12;
holidays[i].d = 25;
• The address of holidays[i] is a common sub-expression.
Common sub-expression elimination IV
• But, be careful! Consider:
t = (x - y++) * (x - y++ + z);
• Is x - y++ still a common sub-expression?
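At the source level, the transformation amounts to introducing a compiler-generated temporary. A sketch (the function names and the temporary tmp are invented for illustration):

```c
#include <assert.h>

/* Original: the sub-expression x - y is evaluated twice. */
static int original(int x, int y, int z) {
    return (x - y) * (x - y + z);
}

/* After common sub-expression elimination: x - y is computed once
   into a compiler-generated temporary. */
static int after_cse(int x, int y, int z) {
    int tmp = x - y;
    return tmp * (tmp + z);
}
```

With x - y++ the two occurrences read different values of y, so this rewrite would change the program's meaning: the compiler must prove that a sub-expression has no side effects (and that its operands are not modified in between) before eliminating it.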
Code motion
• Consider:
char name[3][10];
for (int i = 0; i < 3; i++) {
  for (int j = 0; j < 10; j++) {
    name[i][j] = 'a';
  }
}
• Computing the address of name[i][j] is address[name] + (i * 10) + j
• Most of that computation, address[name] + (i * 10), is constant throughout the inner loop
Code motion II
• You can think of this as rewriting the original code:
char name[3][10];
for (int i = 0; i < 3; i++) {
  for (int j = 0; j < 10; j++) {
    name[i][j] = 'a';
  }
}
as:
char name[3][10];
for (int i = 0; i < 3; i++) {
  char *x = &(name[i][0]);
  for (int j = 0; j < 10; j++) {
    x[j] = 'a';
  }
}
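The two versions above can be sketched as compilable C functions (the function names are invented); both fill the array identically, which is what makes the motion safe here:

```c
#include <assert.h>
#include <string.h>

/* Inner-loop body recomputes the full address name[i][j] each time. */
static void fill_naive(char name[3][10]) {
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 10; j++)
            name[i][j] = 'a';
}

/* The loop-invariant part of the address (the row address) is
   hoisted out of the inner loop. */
static void fill_hoisted(char name[3][10]) {
    for (int i = 0; i < 3; i++) {
        char *x = &(name[i][0]);   /* row address computed once per row */
        for (int j = 0; j < 10; j++)
            x[j] = 'a';
    }
}
```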
Code motion III
• However, this might be a bad idea in some cases. Why? Consider very small values of the variable k:
char name[3][10];
for (int i = 0; i < 3; i++) {
  for (int j = 0; j < k; j++) {
    name[i][j] = 'a';
  }
}
versus:
char name[3][10];
for (int i = 0; i < 3; i++) {
  char *x = &(name[i][0]);
  for (int j = 0; j < k; j++) {
    x[j] = 'a';
  }
}
Dead code elimination
• Consider:
int f(int x, int y, int z) {
  int t = (x - y) * (x - y + z);
  return 6;
}
• Computing t takes many instructions, but the value of t is never used.
• We call the value of t “dead” (or the variable t dead) because it can never affect the final value of the computation. Computing dead values and assigning to dead variables is wasteful.
Dead code elimination II
• But consider:
int f(int x, int y, int z) {
  int t = x * y;
  int r = t * z;
  t = (x - y) * (x - y + z);
  return r;
}
• Now t is only dead for part of its existence. So it requires a careful algorithm to identify which code is dead, and therefore which code can be safely removed.
Optimization implementation
• What do we need to know in order to apply an optimization (constant folding, common sub-expression elimination, code motion, dead code elimination, or any of many other kinds of optimization)?
• Is the optimization correct or safe?
• Is the optimization really an improvement?
• What sort of analyses do we need to perform to get the required information?
Basic blocks
• A basic block is a sequence of instructions that is entered only at the beginning and exited only at the end.
• A flow graph is a collection of basic blocks connected by edges indicating the flow of control.
Finding basic blocks (Example: JVM code)
iconst_1
istore 2
iconst_2
istore 3
Label_1:
iload 3
iload 1
if_icmplt Label_4
iconst_0
goto Label_5
Label_4:
iconst_1
Label_5:
ifeq Label_2
iload 2
iload 3
imul
dup
istore 2
pop
Label_3:
iload 3
dup
iconst_1
iadd
istore 3
pop
goto Label_1
Label_2:
iload 2
ireturn
Mark as leaders: the first instruction, labelled instructions, and the instructions following jumps.
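The marking rule above can be sketched in C under a deliberately simplified instruction model (the Insn fields are invented; real bytecode would additionally need the jump targets resolved to mark labelled instructions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified instruction model: we record only whether an instruction
   carries a label (i.e. is a jump target) and whether it is a jump,
   conditional branch, or return. */
typedef struct {
    bool has_label;
    bool is_jump;
} Insn;

/* A leader is: the first instruction, any labelled instruction, and
   any instruction immediately following a jump. */
static void find_leaders(const Insn *code, size_t n, bool *leader) {
    for (size_t i = 0; i < n; i++)
        leader[i] = (i == 0)
                 || code[i].has_label
                 || code[i - 1].is_jump;
}
```

Each basic block then runs from one leader up to, but not including, the next leader.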
Finding basic blocks II
(blank lines separate the basic blocks, in program order)

iconst_1
istore 2
iconst_2
istore 3

Label_1:
iload 3
iload 1
if_icmplt Label_4

iconst_0
goto Label_5

Label_4:
iconst_1

Label_5:
ifeq Label_2

iload 2
iload 3
imul
dup
istore 2
pop

Label_3:
iload 3
dup
iconst_1
iadd
istore 3
pop
goto Label_1

Label_2:
iload 2
ireturn
Flow graphs
0: iconst_1  istore 2  iconst_2  istore 3
1: iload 3  iload 1  if_icmplt 3
2: iconst_0  goto 4
3: iconst_1
4: ifeq 7
5: iload 2  iload 3  imul  dup  istore 2  pop
6: iload 3  dup  iconst_1  iadd  istore 3  pop  goto 1
7: iload 2  ireturn
Edges: 0→1; 1→2, 1→3; 2→4; 3→4; 4→5, 4→7; 5→6; 6→1
Local optimizations (within a basic block)
• Everything you need to know is easy to determine
• For example: live variable analysis
• Start at the end of the block and work backwards
• Assume everything is live at the end of the basic block
• Copy live/dead info for the instruction
• If you see an assignment to x, then mark x “dead”
• If you see a reference to y, then mark y “live”
Block 5, with liveness between the instructions:
live: 1, 2, 3
iload 2
live: 1, 3
iload 3
live: 1, 3
imul
live: 1, 3
dup
live: 1, 3
istore 2
live: 1, 2, 3
pop
live: 1, 2, 3
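The backward scan can be sketched as follows, assuming a simplified model (invented for this sketch) in which each instruction defines at most one variable and uses at most one; real instructions such as imul consume several values at once:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NVARS 4

/* Simplified instruction: def/use are variable indices, -1 = none. */
typedef struct { int def; int use; } Instr;

/* Backward scan: live[] holds liveness at the END of the block on
   entry, and liveness at the START of the block on return. */
static void local_liveness(const Instr *code, size_t n, bool live[NVARS]) {
    for (size_t i = n; i-- > 0; ) {
        if (code[i].def >= 0) live[code[i].def] = false; /* assignment kills */
        if (code[i].use >= 0) live[code[i].use] = true;  /* reference generates */
    }
}
```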
Global optimizations
• Global means “across all basic blocks”
• We must know what happens across block boundaries
• For example: live variable analysis
• The liveness of a value depends on its later uses, perhaps in other blocks
• What values does this block define and use?
Block 5:
iload 2
iload 3
imul
dup
istore 2
pop
Define: 2
Use: 2, 3
Global live variable analysis
• We define four sets for each basic block B
• def[B] = variables defined in B before they are used in B
• use[B] = variables used in B before they are defined in B
• in[B] = variables live at the beginning of B
• out[B] = variables live at the end of B
• These sets are related by the following equations:
• in[B] = use[B] ∪ (out[B] − def[B])
• out[B] = ∪ in[S], where S ranges over the successors of B
Solving data flow equations
• We want a fixed-point solution for this system of equations (there are two equations per basic block).
• Start with conservative initial values for each in[B] and out[B], and apply the formulas to update the values of each in[B] and out[B]. Repeat until no further changes occur.
• The best conservative initial value is {}, because no variables are live at the end of the program.
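The iteration can be sketched with bit-mask sets: the solver below applies the two equations to every block until nothing changes. The three-block control-flow graph used in the test is invented for illustration, not the one from the slides.

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS 3

/* Variable sets as bit masks: bit i set means variable i is in the set. */
static unsigned use_[NBLOCKS], def_[NBLOCKS];
static unsigned in_[NBLOCKS], out_[NBLOCKS];
static bool succ[NBLOCKS][NBLOCKS];   /* succ[b][s]: s is a successor of b */

/* Iterate  in[B] = use[B] U (out[B] - def[B])  and
   out[B] = union of in[S] over successors S, starting from empty
   sets, until a fixed point is reached. Visiting blocks in reverse
   order just speeds convergence for this backward problem. */
static void solve_liveness(void) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int b = NBLOCKS - 1; b >= 0; b--) {
            unsigned out = 0;
            for (int s = 0; s < NBLOCKS; s++)
                if (succ[b][s]) out |= in_[s];
            unsigned in = use_[b] | (out & ~def_[b]);
            if (in != in_[b] || out != out_[b]) {
                in_[b] = in;
                out_[b] = out;
                changed = true;
            }
        }
    }
}
```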
Dead code elimination
• Suppose we have now computed all global live variable information
• We can redo the local live variable analysis using correct liveness information at the end of each block: out[B]
• Whenever we see an assignment to a variable that is marked dead, we can safely eliminate it
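Combining the two steps, a backward pass seeded with out[B] can mark dead stores for removal. This sketch reuses the simplified one-def/one-use instruction model (invented for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NVARS 8

/* Simplified instruction: def/use are variable indices, -1 = none;
   removed is set by the pass. */
typedef struct { int def; int use; bool removed; } Op;

/* Backward scan starting from out[B]: an assignment whose target is
   dead immediately afterwards computes a value that is never used,
   so it can be eliminated. A removed instruction contributes no
   kill/gen, which is what lets the elimination cascade backwards. */
static void eliminate_dead_stores(Op *code, size_t n, bool live[NVARS]) {
    for (size_t i = n; i-- > 0; ) {
        if (code[i].def >= 0 && !live[code[i].def]) {
            code[i].removed = true;
            continue;
        }
        if (code[i].def >= 0) live[code[i].def] = false;
        if (code[i].use >= 0) live[code[i].use] = true;
    }
}
```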
Dead code examples
The function from the previous slide, compiled with x, y, z, t, r in variables 1–5; liveness is shown between the instructions:
live: 1, 2, 3
iload 1
live: 1, 2, 3
iload 2
live: 1, 2, 3
imul
live: 1, 2, 3
istore 4
live: 1, 2, 3, 4
iload 4
live: 1, 2, 3
iload 3
live: 1, 2, 3
imul
live: 1, 2, 3
istore 5
live: 1, 2, 3, 5
iload 1
live: 1, 2, 3, 5
iload 2
live: 1, 2, 3, 5
isub
live: 1, 2, 3, 5
iload 1
live: 2, 3, 5
iload 2
live: 3, 5
isub
live: 3, 5
iload 3
live: 5
iadd
live: 5
imul
live: 5
dup
live: 5
istore 4
live: 5
pop
live: 5
iload 5
live:
ireturn
live:
The second istore 4 assigns to variable 4 while it is dead, so that store, and the whole computation feeding it, can be eliminated.
Code optimization?
• Code optimization should be called “code improvement”
• It is not practical to generate absolutely optimal code (the problem is NP-hard, so it would be far too expensive at compile time)
• There is a trade-off between compiler speed and execution speed
• Many compilers have options that permit the programmer to choose between generating either optimized or non-optimized code
• Non-optimized => debugging; optimized => release
• Some compilers even allow the programmer to select which kinds of optimizations to perform