Chapter 9 Conclusion

Chapter 9Conclusion

Course Overview PART I: overview material 1 Introduction (today) 2 Language Processors (basic terminology, tombstone diagrams, bootstrapping) 3 The architecture of a Compiler PART II: inside a compiler 4 Syntax Analysis 5 Contextual analysis 6 Runtime organization 7 Code generation PART III: conclusion 9 Conclusion

What This Lecture is About The code generated by our compiler is not efficient: • It computes values at runtime that could be known at compile time • It computes values more times than necessary We can do better! • Constant folding • Common sub-expression elimination • Code motion • Dead code elimination

Constant folding • Consider: • The compiler could compute 4 / 3 * pi as 4.1888 before the program runs. This saves how many instructions? • What is wrong with the programmer writing 4.1888 * r * r * r? static double pi = 3.1416; double volume = 4/3 * pi * r * r * r;

Constant folding II • Consider: • If the address of holidays is x, what is the address of holidays[2].m? x+7 • Could the programmer evaluate this at compile time? Safely? struct { int y, m, d; } holidays[6]; holidays[2].m = 12; holidays[2].d = 25;

Constant folding III • An expression that the compiler should be able to compute the value of is called “manifest”. • How can the compiler know if the value of an expression is manifest?

Common sub-expression elimination • Consider: • Computing x – y takes three instructions, could we save some of them? int t = (x – y) * (x – y + z);

Common sub-expression elimination II int t = (x – y) * (x – y + z); Naïve code: iload x iload y isub iload x iload y isub iload z iadd Imult istore t Better code: iload x iload y isub dup iload z iadd Imult istore t

Common sub-expression elimination III • Consider: • The address of holidays[i] is a common subexpression. struct { int y, m, d; } holidays[6]; holidays[i].m = 12; holidays[i].d = 25;

Common sub-expression elimination IV • But, be careful! • Is x – y++ still a common sub-expression? int t = (x – y++) * (x – y++ + z);

Code motion • Consider: • Computing the address of name[i][j] is address[name] + (i * 10) + j • Most of that computation is constant throughout the inner loop char name[3][10]; for (int i = 0; i < 3; i++) { for (int j = 0; j < 10; j++) { name[i][j] = ‘a’; address[name] + (i * 10)

Code motion II • You can think of this as rewriting the original code: as char name[3][10]; for (int i = 0; i < 3; i++) { for (int j = 0; j < 10; j++) { name[i][j] = ‘a’; char name[3][10]; for (int i = 0; i < 3; i++) { char *x = &(name[i][0]); for (int j = 0; j < 10; j++) { x[j] = ‘a’;

Code motion III • However, this might be a bad idea in some cases. Why? char name[3][10]; for (int i = 0; i < 3; i++) { for (int j = 0; j < k; j++) { name[i][j] = ‘a’; char name[3][10]; for (int i = 0; i < 3; i++) { char *x = &(name[i][0]); for (int j = 0; j < k; j++) { x[j] = ‘a’;

Dead code elimination • Consider: • Computing t takes many instructions, but the value of t is never used. • We call the value of t “dead” (or the variable t dead) because it can never affect the final value of the computation. Computing dead values and assigning to dead variables is wasteful. int f(int x, int y, int z) { int t = (x – y) * (x – y + z); return 6; }

Dead code elimination II • But consider: • Now t is only dead for part of its existence. Hmm… int f(int x, int y, int z) { int t = x * y; int r = t * z; t = (x – y) * (x – y + z); return r; }

Chapter 9 Conclusion