
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation

LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. Chris Lattner lattner@cs.uiuc.edu. Vikram Adve vadve@cs.uiuc.edu. http://llvm.cs.uiuc.edu/ 2004 International Symposium on Code Generation and Optimization (CGO’04) March 22, 2004.


Presentation Transcript


  1. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
     Chris Lattner (lattner@cs.uiuc.edu), Vikram Adve (vadve@cs.uiuc.edu)
     http://llvm.cs.uiuc.edu/
     2004 International Symposium on Code Generation and Optimization (CGO’04), March 22, 2004

  2. Life-Long Program Optimization:
     • Multiple stages of analysis & transformation:
       • compile-time, link-time, install-time, run-time, idle-time
     • Use aggressive interprocedural optimizations
     • Gather and exploit end-user profile information
     • Tune the application to the user’s hardware
     • But what constraints do we have to meet?
       • Can’t interfere with the build process!
       • Must support multiple source languages!
       • Must integrate with legacy systems and components!

  3. Five key capabilities are needed:
     • A persistent, rich code representation
       • Enables analysis & optimization throughout lifetime
     • Offline native code generation
       • Must be able to generate high-quality code statically
     • Profiling & optimization in the field
       • Adapt to the end-user’s usage patterns
     • Language independence
       • No runtime, object model, or exception semantics
     • Uniform whole-program optimization
       • Allow optimization across languages and runtime

  4. What about previous approaches?

  5. Our approach: The LLVM System
     • Use a low-level, but typed, representation:
       • Type information allows important high-level analysis
       • Code representation is truly language neutral
       • Allow off-line and runtime native code generation
     • Our specific contributions:
       • Novel features for language independence:
         • Typed pointer arithmetic, exception mechanisms
       • Novel capabilities:
         • First to support all 5 capabilities for lifelong optimization

  6. Why not a HLL VM like CLI/JVM?
     • Differing goals ⇒ differing representations:
       • HLL VMs: classes, inheritance, mem. mgmt, runtime…
       • LLVM: calls, load/stores, arithmetic, addressing, etc…
     • Implications:
       • Managed CLI is not truly language neutral:
         • Managed C++: No multiple inheritance, no copy ctors
       • Cannot optimize VM code into the application code
       • HLL VMs require specific runtime environments
     • LLVM complements high-level VMs:
       • A HLL VM could be implemented in terms of LLVM!

  7. Outline
     • Introduction, problem statement
     • LLVM Virtual Instruction Set
     • LLVM Compiler Architecture
     • Evaluation
     • Summary

  8. LLVM Instruction Set Overview #1
     • Low-level and target-independent semantics
       • RISC-like three address code
       • Infinite virtual register set in SSA form
       • Simple, low-level control flow constructs
       • Load/store instructions with typed pointers

     C source:
         for (i = 0; i < N; ++i)
           Sum(&A[i], &P);

     LLVM:
         loop:
           %i.1 = phi int [ 0, %bb0 ], [ %i.2, %loop ]
           %AiAddr = getelementptr float* %A, int %i.1
           call void %Sum(float* %AiAddr, %pair* %P)
           %i.2 = add int %i.1, 1
           %tmp.4 = setlt int %i.2, %N
           br bool %tmp.4, label %loop, label %outloop

  9. LLVM Instruction Set Overview #2
     • High-level information exposed in the code
       • Explicit dataflow through SSA form
       • Explicit control-flow graph (even for exceptions)
       • Explicit language-independent type information
       • Explicit typed pointer arithmetic

     C source:
         for (i = 0; i < N; ++i)
           Sum(&A[i], &P);

     LLVM:
         loop:
           %i.1 = phi int [ 0, %bb0 ], [ %i.2, %loop ]
           %AiAddr = getelementptr float* %A, int %i.1
           call void %Sum(float* %AiAddr, %pair* %P)
           %i.2 = add int %i.1, 1
           %tmp.4 = setlt int %i.2, %N
           br bool %tmp.4, label %loop, label %outloop
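
To make the SSA loop concrete, here is a minimal C++ sketch that mirrors its basic-block structure one instruction at a time. `Pair`, the body of `Sum`, `sum_rotated`, and `sum_for` are hypothetical stand-ins (the slide shows only the call, not what `Sum` does). The sketch assumes the loop is entered only when N >= 1, and it tests the incremented counter against the bound so that the trip count agrees with the `for` loop.

```cpp
#include <cassert>

// Hypothetical stand-in for the slide's %pair type.
struct Pair { float sum; int count; };

// Hypothetical body for Sum: accumulate one element into the pair.
void Sum(float* elem, Pair* p) {
    p->sum += *elem;
    p->count += 1;
}

// Mirrors the LLVM loop block. Precondition (assumed): n >= 1,
// because the rotated loop always executes its body once.
Pair sum_rotated(float* A, int n) {
    Pair P{0.0f, 0};
    int i1 = 0;                    // phi: the value 0 arrives from %bb0
    bool tmp4;
    do {
        float* AiAddr = A + i1;    // %AiAddr = getelementptr float* %A, int %i.1
        Sum(AiAddr, &P);           // call void %Sum(float* %AiAddr, %pair* %P)
        int i2 = i1 + 1;           // %i.2 = add int %i.1, 1
        tmp4 = i2 < n;             // compare the incremented counter with N
        i1 = i2;                   // phi: the value %i.2 arrives from %loop
    } while (tmp4);                // br bool %tmp.4, label %loop, label %outloop
    return P;
}

// The source-level loop, for comparison.
Pair sum_for(float* A, int n) {
    Pair P{0.0f, 0};
    for (int i = 0; i < n; ++i) Sum(&A[i], &P);
    return P;
}
```

The only place a variable is reassigned is the back edge, which is the role the `phi` instruction plays: it selects a value according to the block that control came from.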

  10. LLVM Type System Details:
     • The entire type system consists of:
       • Primitives: void, bool, float, ushort, opaque, …
       • Derived: pointer, array, structure, function
     • No high-level types: type system is language neutral!
     • Source language types are lowered:
       • e.g. T& → T*
       • e.g. class T : S { int X; } → { S, int }
     • Type system allows arbitrary casts:
       • Allows expressing non-type-safe languages, like C
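
The two lowering rules can be imitated by hand in C++. The sketch below is only an illustration, not LLVM's front end: `LoweredT`, `read_by_ref`, and `read_by_ptr` are hypothetical names, and the member `A` in `S` is invented so that the base class has some content.

```cpp
#include <cassert>
#include <cstddef>

struct S { int A; };            // base class (member A is an assumption)
struct T : S { int X; };        // source-level class from the slide

// Hand-written lowering of T: the base subobject becomes the first
// field of a plain structure, i.e. { S, int }.
struct LoweredT {
    S base;
    int X;
};

// Source-level code takes a reference ...
int read_by_ref(const T& t) { return t.X; }

// ... which is lowered to a pointer (T& becomes T*).
int read_by_ptr(const LoweredT* t) { return t->X; }
```

After lowering, nothing in `LoweredT` or `read_by_ptr` refers to inheritance or references; only structures and pointers remain, which is what keeps the type system language neutral.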

  11. Pointer Arithmetic: getelementptr
     • Given a pointer, return element address:
       • “getelementptr {int, int}* A, uint 1” → &A->field1
       • “getelementptr [10 x int]* B, long i” → &B[i]
     • A key feature for several high-level analyses:
       • Field information for field-sensitive alias analysis
       • Array subscript info for dependence analysis
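
As a rough C++ analogy for the two address computations, the sketch below compares the typed form with the equivalent raw byte arithmetic. `IntPair` and the four helper functions are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct IntPair { int field0; int field1; };   // plays the role of {int, int}

// Typed form: the field being addressed is visible in the expression.
int* field1_addr(IntPair* A) { return &A->field1; }

// Untyped form: the same address as base + byte offset.
int* field1_addr_raw(IntPair* A) {
    char* base = reinterpret_cast<char*>(A);
    return reinterpret_cast<int*>(base + offsetof(IntPair, field1));
}

// Typed form: the array subscript is visible in the expression.
int* elem_addr(int (*B)[10], long i) { return &(*B)[i]; }

// Untyped form: base + i * sizeof(element).
int* elem_addr_raw(int (*B)[10], long i) {
    char* base = reinterpret_cast<char*>(B);
    return reinterpret_cast<int*>(base + i * static_cast<long>(sizeof(int)));
}
```

Both forms yield the same addresses, but only the typed form still says which field or which subscript is involved. That is the information the slide credits with enabling field-sensitive alias analysis and dependence analysis.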

  12. LLVM Exception Handling Support
     • Provide mechanisms to implement exceptions
       • Do not specify exception semantics (C vs C++ vs Java)
       • Critical for language independence
     • LLVM provides two simple instructions:
       • unwind: Unwind stack frames until reaching an invoke
       • invoke: Call to a function needing an exception handler
     • Supports general stack unwinding:
       • setjmp/longjmp implemented as “exceptions”
       • Full C++ exception handling model is implemented
     Sufficient for: C, C++, Java, C#, OCaml, etc.

  13. A simple C++ example:

     C++:
         {
           Class Object;   // Has a dtor
           func();         // Might throw
           ...
         }

     LLVM:
           ; Allocate stack space
           %Object = alloca %Class
           ; Construct object
           call %Class::Class(%Object)
           ; Call function
           invoke func() to B1, B2

         B1:               ; Normal edge
           ...

         B2:               ; Unwind edge
           ; Destroy object
           call %Class::~Class(%Object)
           ; Continue propagation
           unwind

     Exception edges are visible to language- and target-independent optimizers!
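
The behaviour that the unwind edge encodes can be observed from plain C++. The sketch below is a minimal illustration rather than a model of LLVM itself: it records the order of events when `func` throws and when it returns normally. The global `trace`, the `should_throw` parameter, and the helper `run` are assumptions added for the demonstration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> trace;   // records the order of events

struct Class {
    Class()  { trace.push_back("ctor"); }
    ~Class() { trace.push_back("dtor"); }   // runs on both edges
};

void func(bool should_throw) {
    trace.push_back("func");
    if (should_throw) throw std::runtime_error("boom");
}

// The scope from the slide: an object with a destructor, then a call
// that might throw.
void scope_with_object(bool should_throw) {
    Class Object;
    func(should_throw);
    trace.push_back("normal");    // reached only via the normal edge (B1)
}

// Runs the scope and reports whether an exception propagated out of it.
bool run(bool should_throw) {
    trace.clear();
    try {
        scope_with_object(should_throw);
        return false;
    } catch (const std::runtime_error&) {
        trace.push_back("caught");
        return true;
    }
}
```

When `func` throws, the destructor runs before the handler is entered, which corresponds to block B2: destroy the object, then continue propagation with `unwind`.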

  14. Outline
     • Introduction, problem statement
     • LLVM Virtual Instruction Set
     • LLVM Compiler Architecture
     • Evaluation
     • Summary

  15. LLVM Compiler Architecture
     [Figure: architecture diagram. Labels recoverable from the slide: Developer site; User site; Compiler 1 … Compiler N; source inputs C, C++, Fortran, OCaml, JVM, MSIL; LLVM; Linker + IP Optimizer; Static Code Gen; Native or LLVM; LLVM Shared Libraries; JIT; Runtime Optimizer; Offline Optimizer; Profile info.]

  16. LLVM provides all five capabilities:
     • A persistent, rich code representation:
       • LLVM to LLVM optimizations can happen at any time
     • Offline native code generation:
       • Generate high-quality machine code, retaining LLVM
     • Profiling & optimization in the field:
       • Runtime and offline profile-driven optimizers
     • Language independence:
       • Low-level inst set & types with transparent runtime
     • Uniform whole-program optimization:
       • Optimize across source-language boundaries

  17. Outline
     • Introduction, problem statement
     • LLVM Virtual Instruction Set
     • LLVM Compiler Architecture
     • Evaluation
     • Summary

  18. Evaluation Overview (see paper)
     • Does LLVM enable high-level analysis/optimization?
       • Yes!
     • How big are programs in LLVM?
       • Comparable to native machine code binaries
     • How reliable is the type information?
       • Is it useful for anything?
     • How fast is the optimizer?
       • Is it suitable for run-time & interprocedural optimization?

  19. How reliable is the type info?
     • Use static analysis to prove type information:
       • How many loads/stores are typed correctly?
     • Type info enables optimization:
       • e.g. structure reorganization
     • Even for C codes:
       • Most programs have extensive type information available!
       • Extensive use of custom allocators is the biggest hurdle

  20. How fast is the LLVM optimizer?
     • IPO takes trivial time compared to GCC, even though GCC has no intermodule optimization:
       • Due to LLVM’s low-level and efficient IR!
     • Optimizations trigger many times:
       • vortex/DGE: 331 funcs, 557 globals
       • gcc/DAE: 103 args, 96 ret vals
       • gcc/Inline: 1368 call sites
       • …
     DGE = Dead Global (var/func) Elimination
     DAE = Dead Argument/Retval Elimination
     [Chart: Optimization Time (s)]

  21. LLVM Contributions:
     • Novel features:
       • As language independent as machine code, yet supports high-level optimizations
       • New abstraction for exceptions
       • Type-safe pointer arithmetic for high-level analysis/optimization
     • Novel capabilities:
       • First to provide all five capabilities
     • Practical: LLVM is open source, has real users, and really works: try it out!
     http://llvm.cs.uiuc.edu/
