This course provides an introduction to compiler research, focusing on loop transformations. Students will learn about compiler advancements, compiler vs assembler, compiler goals, and current uses of compilers. The course aims to develop an understanding of compiler research and its applications.
Research in Compilers and Introduction to Loop Transformations, Part I: Compiler Research. Tomofumi Yuki, EJCP 2016, June 29, Lille
Background • Defended Ph.D. in C.S. in October 2012 • Colorado State University • Advisor: Dr. Sanjay Rajopadhye • Currently Inria Chargé de Recherche • Rhône-Alpes, LIP @ ENS Lyon • Optimizing compilers + programming languages • static analysis (polyhedral model) • parallel programming models • High-Level Synthesis EJCP 2016, June 29, Lille
What is this Course About? • Research in compilers • a bit about compilers themselves • Understand compiler research • what are the problems? • what are the techniques? • what are the applications? • maybe do research in compilers later on! Be able to (partially) understand work by “compiler people” at conferences. EJCP 2016, June 29, Lille
What is a Compiler? • What does compiler mean to you? EJCP 2016, June 29, Lille
Compiler Advances • Old compiler vs recent compiler • modern architecture • gcc -O3 vs gcc -O0 • How much speedup by the compiler alone after 45 years of research? EJCP 2016, June 29, Lille
Proebsting’s Law • Compiler Advances Double Computing Power Every 18 Years • http://proebsting.cs.arizona.edu/law.html • Someone actually tried it: • On Proebsting’s Law, Kevin Scott, 2001 • SPEC95, compared against -O0 • 3.3x for int • 8.1x for float HW gives 60%/year EJCP 2016, June 29, Lille
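For scale, a quick back-of-the-envelope check (my own arithmetic, using the figures on this slide): doubling every 18 years is a very small annual rate next to the hardware figure.

\[ 2^{1/18} \approx 1.04 \quad\Rightarrow\quad \text{compilers: roughly } 4\% \text{ per year} \]
\[ 1.6^{18} \approx 4700 \quad\Rightarrow\quad \text{hardware at } 60\%/\text{year: roughly } 4700\times \text{ over the same 18 years} \]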
Compiler Advances • Old compiler vs recent compiler • modern architecture • gcc -O3 vs gcc -O0 • 3~8x difference after 45 years • Not so much? EJCP 2016, June 29, Lille
Compiler Advances • Old compiler vs recent compiler • modern architecture • gcc -O3 vs gcc -O0 • 3~8x difference after 45 years • Not so much? “The most remarkable accomplishment by far of the compiler field is the widespread use of high-level languages.” by Mary Hall, David Padua, and Keshav Pingali [Compiler Research: The Next 50 Years, 2009] EJCP 2016, June 29, Lille
Earlier Accomplishments • Getting efficient assembly • register allocation • instruction scheduling • ... • High-level language features • object-orientation • dynamic types • automated memory management • ... EJCP 2016, June 29, Lille
What is Left? • Parallelism • multi-cores, GPUs, ... • language features for parallelism • Security/Reliability • verification • certified compilers • Power/Energy • data movement • voltage scaling EJCP 2016, June 29, Lille
Agenda for today • Part I: What is Compiler Research? • Part II: Compiler Optimizations • Lab: Introduction to Loop Transformations EJCP 2016, June 29, Lille
What is a Compiler? • Bridge between “source” and “target” [diagram: source → (compile) → target] EJCP 2016, June 29, Lille
Compiler vs Assembler • What are the differences? [diagram: source → (compile) → assembly → (assemble) → object/target] EJCP 2016, June 29, Lille
Compiler vs Assembler • Compiler • Many possible targets (semi-portable) • Many decisions are taken • Assembler • Specialized output (non-portable) • Usually a “translation” EJCP 2016, June 29, Lille
Goals of the Compiler • Higher abstraction • No more writing assembly! • enables language features • loops, functions, classes, aspects, ... • Performance • while increasing productivity • speed, space, energy, ... • compiler optimizations EJCP 2016, June 29, Lille
Productivity vs Performance • Higher Abstraction ≈ Less Performance [figure: languages plotted on abstraction (vertical) vs performance (horizontal) axes: Python, Java, C, Fortran, Assembly] EJCP 2016, June 29, Lille
Productivity vs Performance • How much can you regain? [figure: the same abstraction vs performance chart, with each of Python, Java, C, and Fortran shown a second time at higher performance] EJCP 2016, June 29, Lille
Productivity vs Performance • How sloppy can you write code? [figure: the same chart, with each of Python, Java, C, and Fortran shown a second time at a higher level of abstraction] EJCP 2016, June 29, Lille
Compiler Research • Branch of Programming Languages • Program Analysis, Transformations • Formal Semantics • Type Theory • Runtime Systems • Compilers • ... EJCP 2016, June 29, Lille
Current Uses of Compilers • Optimization • important for vendors • many things are better left to the compiler • parallelism, energy, resiliency, ... • Code Analysis • IDEs • static vs dynamic analysis • New Architectures • IBM Cell, GPU, Xeon Phi, ... EJCP 2016, June 29, Lille
Examples • Two classical compiler optimizations • register allocation • instruction scheduling EJCP 2016, June 29, Lille
Case 1: Register Allocation • Classical optimization problem. Source: C = A + B; D = B + C;
Naïve translation (3 registers, 8 instructions):
load %r1, A
load %r2, B
add %r3, %r1, %r2
store %r3, C
load %r1, B
load %r2, C
add %r3, %r1, %r2
store %r3, D
Smart compilation (2 registers, 6 instructions):
load %r1, A
load %r2, B
add %r1, %r1, %r2
store %r1, C
add %r1, %r2, %r1
store %r1, D
EJCP 2016, June 29, Lille
Register Allocation in 5min. • Often viewed as graph coloring • Live Range: when a value is “in use” • Interference: both values are “in use” • e.g., two operands of an instruction • Coloring: conflicting nodes to different reg. [figures: live range analysis and interference graph for “c = a + b; d = b + c;”, compiled to add %r1, %r1, %r2; add %r1, %r2, %r1] EJCP 2016, June 29, Lille
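To make the coloring step concrete, here is a minimal sketch (my own illustration, not code from the slides) that greedily colors the interference graph of “c = a + b; d = b + c;”. The edge set assumes the usual convention that a definition may reuse the register of an operand whose live range ends at that instruction, so the only interferences are a-b and b-c.

#include <stdio.h>

#define N 4   /* the variables a, b, c, d */

int main(void) {
    const char *name[N] = {"a", "b", "c", "d"};
    /* adjacency matrix of the (assumed) interference graph */
    int interf[N][N] = {
        /* a  b  c  d */
        {  0, 1, 0, 0 },   /* a interferes with b       */
        {  1, 0, 1, 0 },   /* b interferes with a and c */
        {  0, 1, 0, 0 },   /* c interferes with b       */
        {  0, 0, 0, 0 },   /* d interferes with nothing */
    };

    int color[N];   /* register assigned to each variable */
    for (int v = 0; v < N; v++) {
        /* pick the smallest register not taken by an already-colored neighbor */
        int used[N] = {0};
        for (int u = 0; u < v; u++)
            if (interf[v][u]) used[color[u]] = 1;
        int r = 0;
        while (used[r]) r++;
        color[v] = r;
        printf("%s -> %%r%d\n", name[v], r + 1);
    }
    return 0;
}

Running it assigns a, c, and d to %r1 and b to %r2, matching the two-register “smart compilation” above; real Chaitin-style allocators add simplification, spilling, and coalescing on top of this basic idea.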
Register Allocation in 5min. • Registers are limited. Source: c = a + b; d = b + c; x = c + d; y = a + x;
Live Range Splitting: a is reloaded from memory (as z) right before its last use, so its live range no longer spans the whole sequence:
a = load A;
c = a + b;
d = b + c;
x = c + d;
z = load A;
y = z + x;
[figure: live ranges of a, b, c, d, x, y before and after the split]
EJCP 2016, June 29, Lille
Research in Register Allocation • How to do a good allocation • which variables to split • which values to spill • How to do it fast? • Graph-coloring is expensive • Just-in-Time compilation “Solved” EJCP 2016, June 29, Lille
Case 2: Instruction Scheduling • Another classical problem. Source: X = A * B * C; Y = D * E * F;
Naïve translation (pipeline stall if mult. takes 2 cycles):
R = A * B;
X = R * C;
S = D * E;
Y = S * F;
Smart compilation:
R = A * B;
S = D * E;
X = R * C;
Y = S * F;
Also done in hardware (out-of-order). EJCP 2016, June 29, Lille
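To see where the stalls come from, here is a small sketch (my own, under simplified assumptions: an in-order pipeline issuing one instruction per cycle, with a 2-cycle multiply latency) that computes the issue cycle of the last instruction for both schedules.

#include <stdio.h>

#define LATENCY 2   /* cycles before a multiply result can be consumed */

/* one instruction: destination and two source operands, named by letters */
typedef struct { char dst, src1, src2; } Instr;

/* return the cycle at which the last instruction issues */
static int simulate(const Instr *code, int n) {
    int ready[128] = {0};   /* cycle at which each value becomes available */
    int cycle = 0;
    for (int i = 0; i < n; i++) {
        int issue = cycle + 1;
        /* stall until both operands are ready */
        if (ready[(int)code[i].src1] > issue) issue = ready[(int)code[i].src1];
        if (ready[(int)code[i].src2] > issue) issue = ready[(int)code[i].src2];
        cycle = issue;
        ready[(int)code[i].dst] = cycle + LATENCY;
    }
    return cycle;
}

int main(void) {
    Instr naive[] = { {'R','A','B'}, {'X','R','C'}, {'S','D','E'}, {'Y','S','F'} };
    Instr smart[] = { {'R','A','B'}, {'S','D','E'}, {'X','R','C'}, {'Y','S','F'} };
    printf("naive schedule: last issue at cycle %d\n", simulate(naive, 4));
    printf("smart schedule: last issue at cycle %d\n", simulate(smart, 4));
    return 0;
}

Under this model the naïve order issues its last instruction at cycle 6 (two stall cycles) while the reordered version issues it at cycle 4 with no stalls.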
Research in Instruction Scheduling • Not much anymore for speed/parallelism • beaten to death • hardware does it for you • Remains interesting in specific contexts • faster methods for JIT • energy optimization • “predictable” execution • in-order cores, VLIW, etc. EJCP 2016, June 29, Lille
Case 1+2: Phase Ordering • Yet another classical problem • practically no solution • Given optimization A and B • A after B vs A before B • which order is better? • can you solve the problem globally? • Parallelism requires more memory • trade-off: register pressure vs parallelism EJCP 2016, June 29, Lille
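As a concrete (and deliberately tiny) illustration of that trade-off, here is my own sketch reusing the Case 2 example: the latency-hiding schedule keeps two temporaries live at once, so better scheduling demands an extra register.

#include <stdio.h>

/* sequential order: at most one temporary (r or s) is live at a time */
static void low_pressure(double A, double B, double C, double D, double E,
                         double F, double *X, double *Y) {
    double r = A * B;
    *X = r * C;         /* r dies here, before s is created */
    double s = D * E;
    *Y = s * F;
}

/* interleaved order hides the multiply latency, but r and s are live
 * simultaneously, i.e., it needs one more register */
static void high_pressure(double A, double B, double C, double D, double E,
                          double F, double *X, double *Y) {
    double r = A * B;
    double s = D * E;
    *X = r * C;
    *Y = s * F;
}

int main(void) {
    double X, Y;
    low_pressure(1, 2, 3, 4, 5, 6, &X, &Y);
    printf("sequential:  X=%g Y=%g\n", X, Y);
    high_pressure(1, 2, 3, 4, 5, 6, &X, &Y);
    printf("interleaved: X=%g Y=%g\n", X, Y);
    return 0;
}

Whether to schedule first (and pressure the allocator) or allocate first (and constrain the scheduler) is exactly the phase-ordering question.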
Job Market • Where do compiler people work? • IBM • MathWorks • Amazon • Xilinx • start-ups • Many opportunities in France • MathWorks @ Grenoble • Many start-ups EJCP 2016, June 29, Lille