This course provides an introduction to compiler research, focusing on loop transformations. Students will learn about compiler advancements, compiler vs assembler, compiler goals, and current uses of compilers. The course aims to develop an understanding of compiler research and its applications.
Research in Compilers and Introduction to Loop Transformations, Part I: Compiler Research. Tomofumi Yuki, EJCP 2016, June 29, Lille
Background • Defended Ph.D. in C.S. in October 2012 • Colorado State University • Advisor: Dr. Sanjay Rajopadhye • Currently Inria Chargé de Recherche • Rhône-Alpes, LIP @ ENS Lyon • Optimizing compilers + programming languages • static analysis (polyhedral model) • parallel programming models • High-Level Synthesis EJCP 2016, June 29, Lille
What is this Course About? • Research in compilers • a bit about compilers themselves • Understand compiler research • what are the problems? • what are the techniques? • what are the applications? • maybe do research in compilers later on! Be able to (partially) understand work by “compiler people” at conferences. EJCP 2016, June 29, Lille
What is a Compiler? • What does compiler mean to you? EJCP 2016, June 29, Lille
Compiler Advances • Old compiler vs recent compiler • modern architecture • gcc -O3 vs gcc -O0 • How much speedup by the compiler alone after 45 years of research? EJCP 2016, June 29, Lille
Proebsting’s Law • Compiler Advances Double Computing Power Every 18 Years • http://proebsting.cs.arizona.edu/law.html • Someone actually tried it: • On Proebsting’s Law, Kevin Scott, 2001 • SPEC95, compared against -O0 • 3.3x for int • 8.1x for float HW gives 60%/year EJCP 2016, June 29, Lille
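For scale, a quick back-of-the-envelope check (my own arithmetic, using the figures on this slide): doubling every 18 years is a very small annual rate next to the hardware figure.

\[ 2^{1/18} \approx 1.04 \quad\Rightarrow\quad \text{compilers: roughly } 4\% \text{ per year} \]
\[ 1.6^{18} \approx 4700 \quad\Rightarrow\quad \text{hardware at } 60\%/\text{year: roughly } 4700\times \text{ over the same 18 years} \]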
Compiler Advances • Old compiler vs recent compiler • modern architecture • gcc -O3 vs gcc -O0 • 3~8x difference after 45 years • Not so much? EJCP 2016, June 29, Lille
Compiler Advances • Old compiler vs recent compiler • modern architecture • gcc -O3 vs gcc -O0 • 3~8x difference after 45 years • Not so much? “The most remarkable accomplishment by far of the compiler field is the widespread use of high-level languages.” by Mary Hall, David Padua, and Keshav Pingali [Compiler Research: The Next 50 Years, 2009] EJCP 2016, June 29, Lille
Earlier Accomplishments • Getting efficient assembly • register allocation • instruction scheduling • ... • High-level language features • object-orientation • dynamic types • automated memory management • ... EJCP 2016, June 29, Lille
What is Left? • Parallelism • multi-cores, GPUs, ... • language features for parallelism • Security/Reliability • verification • certified compilers • Power/Energy • data movement • voltage scaling EJCP 2016, June 29, Lille
Agenda for today • Part I: What is Compiler Research? • Part II: Compiler Optimizations • Lab: Introduction to Loop Transformations EJCP 2016, June 29, Lille
What is a Compiler? • Bridge between “source” and “target” [diagram: source → (compile) → target] EJCP 2016, June 29, Lille
Compiler vs Assembler • What are the differences? [diagram: source → (compile) → assembly → (assemble) → object/target] EJCP 2016, June 29, Lille
Compiler vs Assembler • Compiler • Many possible targets (semi-portable) • Many decisions are taken • Assembler • Specialized output (non-portable) • Usually a “translation” EJCP 2016, June 29, Lille
Goals of the Compiler • Higher abstraction • No more writing assembly! • enables language features • loops, functions, classes, aspects, ... • Performance • while increasing productivity • speed, space, energy, ... • compiler optimizations EJCP 2016, June 29, Lille
Productivity vs Performance • Higher Abstraction ≈ Less Performance [figure: languages plotted on abstraction (vertical) vs performance (horizontal) axes: Python, Java, C, Fortran, Assembly] EJCP 2016, June 29, Lille
Productivity vs Performance • How much can you regain? [figure: the same abstraction vs performance chart, with each of Python, Java, C, and Fortran shown a second time at higher performance] EJCP 2016, June 29, Lille
Productivity vs Performance • How sloppy can you write code? [figure: the same chart, with each of Python, Java, C, and Fortran shown a second time at a higher level of abstraction] EJCP 2016, June 29, Lille
Compiler Research • Branch of Programming Languages • Program Analysis, Transformations • Formal Semantics • Type Theory • Runtime Systems • Compilers • ... EJCP 2016, June 29, Lille
Current Uses of Compilers • Optimization • important for vendors • many things are better left to the compiler • parallelism, energy, resiliency, ... • Code Analysis • IDEs • static vs dynamic analysis • New Architectures • IBM Cell, GPU, Xeon Phi, ... EJCP 2016, June 29, Lille
Examples • Two classical compiler optimizations • register allocation • instruction scheduling EJCP 2016, June 29, Lille
Case 1: Register Allocation • Classical optimization problem. Source: C = A + B; D = B + C;
Naïve translation (3 registers, 8 instructions):
load %r1, A
load %r2, B
add %r3, %r1, %r2
store %r3, C
load %r1, B
load %r2, C
add %r3, %r1, %r2
store %r3, D
Smart compilation (2 registers, 6 instructions):
load %r1, A
load %r2, B
add %r1, %r1, %r2
store %r1, C
add %r1, %r2, %r1
store %r1, D
EJCP 2016, June 29, Lille
Register Allocation in 5min. • Often viewed as graph coloring • Live Range: when a value is “in use” • Interference: both values are “in use” • e.g., two operands of an instruction • Coloring: conflicting nodes to different reg. [figures: live range analysis and interference graph for “c = a + b; d = b + c;”, compiled to add %r1, %r1, %r2; add %r1, %r2, %r1] EJCP 2016, June 29, Lille
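To make the coloring step concrete, here is a minimal sketch (my own illustration, not code from the slides) that greedily colors the interference graph of “c = a + b; d = b + c;”. The edge set assumes the usual convention that a definition may reuse the register of an operand whose live range ends at that instruction, so the only interferences are a-b and b-c.

#include <stdio.h>

#define N 4   /* the variables a, b, c, d */

int main(void) {
    const char *name[N] = {"a", "b", "c", "d"};
    /* adjacency matrix of the (assumed) interference graph */
    int interf[N][N] = {
        /* a  b  c  d */
        {  0, 1, 0, 0 },   /* a interferes with b       */
        {  1, 0, 1, 0 },   /* b interferes with a and c */
        {  0, 1, 0, 0 },   /* c interferes with b       */
        {  0, 0, 0, 0 },   /* d interferes with nothing */
    };

    int color[N];   /* register assigned to each variable */
    for (int v = 0; v < N; v++) {
        /* pick the smallest register not taken by an already-colored neighbor */
        int used[N] = {0};
        for (int u = 0; u < v; u++)
            if (interf[v][u]) used[color[u]] = 1;
        int r = 0;
        while (used[r]) r++;
        color[v] = r;
        printf("%s -> %%r%d\n", name[v], r + 1);
    }
    return 0;
}

Running it assigns a, c, and d to %r1 and b to %r2, matching the two-register “smart compilation” above; real Chaitin-style allocators add simplification, spilling, and coalescing on top of this basic idea.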
Register Allocation in 5min. • Registers are limited. Source: c = a + b; d = b + c; x = c + d; y = a + x;
Live Range Splitting: a is reloaded from memory (as z) right before its last use, so its live range no longer spans the whole sequence:
a = load A;
c = a + b;
d = b + c;
x = c + d;
z = load A;
y = z + x;
[figure: live ranges of a, b, c, d, x, y before and after the split]
EJCP 2016, June 29, Lille
Research in Register Allocation • How to do a good allocation • which variables to split • which values to spill • How to do it fast? • Graph-coloring is expensive • Just-in-Time compilation “Solved” EJCP 2016, June 29, Lille
Case 2: Instruction Scheduling • Another classical problem. Source: X = A * B * C; Y = D * E * F;
Naïve translation (pipeline stall if mult. takes 2 cycles):
R = A * B;
X = R * C;
S = D * E;
Y = S * F;
Smart compilation:
R = A * B;
S = D * E;
X = R * C;
Y = S * F;
Also done in hardware (out-of-order). EJCP 2016, June 29, Lille
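To see where the stalls come from, here is a small sketch (my own, under simplified assumptions: an in-order pipeline issuing one instruction per cycle, with a 2-cycle multiply latency) that computes the issue cycle of the last instruction for both schedules.

#include <stdio.h>

#define LATENCY 2   /* cycles before a multiply result can be consumed */

/* one instruction: destination and two source operands, named by letters */
typedef struct { char dst, src1, src2; } Instr;

/* return the cycle at which the last instruction issues */
static int simulate(const Instr *code, int n) {
    int ready[128] = {0};   /* cycle at which each value becomes available */
    int cycle = 0;
    for (int i = 0; i < n; i++) {
        int issue = cycle + 1;
        /* stall until both operands are ready */
        if (ready[(int)code[i].src1] > issue) issue = ready[(int)code[i].src1];
        if (ready[(int)code[i].src2] > issue) issue = ready[(int)code[i].src2];
        cycle = issue;
        ready[(int)code[i].dst] = cycle + LATENCY;
    }
    return cycle;
}

int main(void) {
    Instr naive[] = { {'R','A','B'}, {'X','R','C'}, {'S','D','E'}, {'Y','S','F'} };
    Instr smart[] = { {'R','A','B'}, {'S','D','E'}, {'X','R','C'}, {'Y','S','F'} };
    printf("naive schedule: last issue at cycle %d\n", simulate(naive, 4));
    printf("smart schedule: last issue at cycle %d\n", simulate(smart, 4));
    return 0;
}

Under this model the naïve order issues its last instruction at cycle 6 (two stall cycles) while the reordered version issues it at cycle 4 with no stalls.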
Research in Instruction Scheduling • Not much anymore for speed/parallelism • beaten to death • hardware does it for you • Remains interesting in specific contexts • faster methods for JIT • energy optimization • “predictable” execution • in-order cores, VLIW, etc. EJCP 2016, June 29, Lille
Case 1+2: Phase Ordering • Yet another classical problem • practically no solution • Given optimization A and B • A after B vs A before B • which order is better? • can you solve the problem globally? • Parallelism requires more memory • trade-off: register pressure vs parallelism EJCP 2016, June 29, Lille
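As a concrete (and deliberately tiny) illustration of that trade-off, here is my own sketch reusing the Case 2 example: the latency-hiding schedule keeps two temporaries live at once, so better scheduling demands an extra register.

#include <stdio.h>

/* sequential order: at most one temporary (r or s) is live at a time */
static void low_pressure(double A, double B, double C, double D, double E,
                         double F, double *X, double *Y) {
    double r = A * B;
    *X = r * C;         /* r dies here, before s is created */
    double s = D * E;
    *Y = s * F;
}

/* interleaved order hides the multiply latency, but r and s are live
 * simultaneously, i.e., it needs one more register */
static void high_pressure(double A, double B, double C, double D, double E,
                          double F, double *X, double *Y) {
    double r = A * B;
    double s = D * E;
    *X = r * C;
    *Y = s * F;
}

int main(void) {
    double X, Y;
    low_pressure(1, 2, 3, 4, 5, 6, &X, &Y);
    printf("sequential:  X=%g Y=%g\n", X, Y);
    high_pressure(1, 2, 3, 4, 5, 6, &X, &Y);
    printf("interleaved: X=%g Y=%g\n", X, Y);
    return 0;
}

Whether to schedule first (and pressure the allocator) or allocate first (and constrain the scheduler) is exactly the phase-ordering question.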
Job Market • Where do compiler people work? • IBM • MathWorks • Amazon • Xilinx • start-ups • Many opportunities in France • MathWorks @ Grenoble • Many start-ups EJCP 2016, June 29, Lille