Open LLVM Project

Open LLVM Project - What is this? - LLVM Subproject: Clang and VMKit - Improving the Current System 1.Implementing Code Cleanup bugs 2.Compile Programs with the LLVM Compiler 3.Add programs to the llvm-test suite 4. Benchmark the LLVM compiler 5. Miscellaneous Improvements - Adding new capabilities to LLVM 1. Extend the LLVM intermediate representation 2. Pointer and Alias Analysis 3. Profile-Guided Optimization 4. Code Compaction 5. New Transformations and Analyses 6. Code Generator Improvements 7. Miscellaneous Additions

What is this? • This document is meant to be a sort of ”big TODO list” for LLVM. • Each project in this document is something that would be useful for LLVM to have, and would also be a great way to get familiar with the system. • specific project are filed as unassigned enhancements in the LLVM bug tracker • See the list of currently outstanding issues if you wish to help improve LLVM.

LLVM Subproject: Clang and VMKit(1) • The Clang Open Project This list is provided to generate ideas, it is not intended to be comprehensive. • Undefined behavior Checking CodeGen could insert runtime checks for all sorts of different undefined behaviors, form reading uninitialized variables, buffer overflows • Improve target support The current tart interfaces are heavily stubbed out and need to be implemented fully. The actual target implementations also need to be completed. • Implement and tool to generate code documentation Clang’s library-based design allows it to be used by a variety of tools that reason about source code. Clang would be to build an auto-documentation system like doxygen • Use clang libraries to implement better versions of existing tools Clang is built as a set of libraries, which means that it is possible to implement capabilities similar to other source language tools, improving them in various ways. Three examples are distcc, the delta testcase reduction tool, and the “indent” source reformatting tool.

LLVM Subproject: Clang and VMKit(2) • Use clang libraries to extend Ragel with a JIT Regel is a state machine compiler that lets tou embed C code into state machines and generate C code • Continue work on C++’11 support There are several neat ways to improve the quality of clang by self-testimg. Some examples: Improve the reliability of AST printing and serialization by ensuring that the AST produced by clang on an input doesn’t change when it is reparsed or unserialized. Improve parser reliability and error generation by automatically or randomly changing the input checking that clang doesn’t crash and that it doesn’t generate excessive errors for small input changes • Continue work on C++’11 support C++’98 is feature complete, but there is still a lot of c++’11 features to implement

LLVM Subproject: Clang and VMKit(3) • If you hit a bug with clang, it is very useful for us if you reduce the code that demonstrates the problem down to something small. There are many ways to do this; ask on cfe-dev for advice. - StringRef’ize APIS - Universal Driver - XML Representation of ASTs

LLVM Subproject: Clang and VMKit(4) • VMKit open Project List This list is provided to generate ideas, it is not intended to be comprehensive. • Port VMKit’s JVM to Harmony or OpenJDK VMKit currently uses GNU Classpath for the standard Java classes. Interfacing with another library such as Apache Harmony (http://harmony.apache.org) or Sun's OpenJDK (http://openjdk.java.net) may help improving support for latest benchmarks (http://www.spec.org/jvm2008). On the LLVM side, here are a few interesting projects that would help VMKit: • Adaptive Optimization System Being able to adaptively optimize JIT-compiled code would dramastically help the startup time of VMKit. All the non-adaptive bits are already there in LLVM: baseline compiler (with the -fast command line and the simple register allocator), optimized compiler (with the linear scan register allocator), and a full set of optimizations changeable at runtime • Type-based alias-analysis Safe languages such as ones supported by VMKit (Java, C#) benefit a lot from a type based alias analysis. LLVM currently lacks full support of this feature for safe languages • Misceallenous Java-related optimizations Removal of array bounds checks, null pointer checks, devirtualization, inlining, etc.

Improving the Current System(1) Improvements to the current infrastructure are always very welcome and tend to be fairly straight-forward to implement.Here are Some of the key area that can use improvement….. 1. Implementing Code Cleanup bugs The LLVM Bug Tracker occasionally has “code-cleanup” bugs filed in it. Taking one of these and fixing it is a good way to get your feet wet in the LLVM code and discover how some of its components work Som specific ones that would be great to have: - Fix the design of GlobalAlias to not require dest type to match source type - Redesign ConstantExpr’s - SimplifyLibCalls should be merged into instcombine - Static constructors should be purged from LLVM 2. Add programs to the llvm-testestsuit The llvem-test testsuite is a large collection of programs we use for nightly testing of generated code performance, compile times, correctness, etc. Having a large testsuite gives us a lot of coverage of programs and enables us to spot and improve any problem areas in the compiler. 3. Compile programs with the LLVM Compiler We are always looking for new testcases and benchmarks for use with LLVM.Ifit doesn't compile, try to figure out why or report it to the llvm-bugs list. If you get the program to compile, it would be extremely useful to convert the build system to be compatible with the LLVM Programs testsuite so that we can check it into SVN and the automated tester can use it to track progress of the compiler.

Improving the Current System(2) 4. Benchmark The LLVM compiler Find benchmarks either using our test result or on your own, where LLVM code generators do not produce optimal code or simply where another compiler produces better code. Try to minimize the test case that demonstrates the issue. Then, either submit a bud with your testcase and the code that LLVM produces vs. the code that itshould produce, or even better, see if you can improve the code generator and submit a patch 5. Miscellaneous Improvements - Completely rewrite bugpoint. In addition to being a mess, bugpoint suffers from a number of problems where it will "lose" a bug when reducing. It should be rewritten from scratch to solve these and other problems - Add support for transactions to the PassManager for improved bugpoint. - Improve bugpoint to support running test in parallel on MP machines. - Add MC assembler/disassembler and JIT support to the SPARC port. -Move more optimizations out of the -instcombine pass and into InstructionSimplify. The optimizations that should be moved are those that do not create new instructions, for example turning sub i32 %x, 0into %x. Many passes use InstructionSimplify to clean up code as they go, so making it smarter can result in improvements all over the place.

Adding new capabilities to LLVM(1) Sometimes creating new things is more fun than improving existing things. These projects tend to be more involved and perhaps require more work, but can also be very rewarding. • Extend the LLVM intermediate representation Many proposed extensions and improvements to LLVM core are awiting design and implementation • Improvements for debug information Generation • EG support for non-call exceptions • Many ideas for feature requests are stored in LLVM bugzilla. Just search for bugs with a “new-feafure” keyword. 2. Pointer and Alias Analysis • We have a strong base for development of both pointer analysis based optimizations as well as pointer analyses themselves. It seems natural to want to take advantage of this: 1)The globals mod/ref pass basically does really simple and cheap bottom-up context sensitive alias analysis. It being simple and cheap are really important, but there are simple things that we could do to better capture the effects of functions that access pointer arguments 2)The alias analysis API supports the getModRefBehavior method, which allows the implementation to give details analysis of the functions. 3) We need some way to reason about error 4) There are lots of ways to optimize out and improve handling of memcpy/memset

Adding new capabilities to LLVM(2) 3. Profile-Guided Optimization We now have a unified infrastructure for writing profile-guided transformations, which will work either at offline-compile-time or in the JIT Ideas for profile-guided transformations: • Superblock formation (with many optimizations) • Loop unrolling/peeling • Profile directed inlining • Code layout • ... Improvements to the existing support: • The current block and edge profiling code that gets inserted is very simple and inefficient. Through the use of control-dependence information, many fewer counters could be inserted into the codeYoucould implement one of the "static profiling" algorithms which analyze a piece of code an make educated guesses about the relative execution frequencies of various parts of the code. • You could add path profiling support, or adapt the existing LLVM path profiling code to work with the generic profiling interfaces. 4. Code Compaction LLVM aggressively optimizes for performance, but does not yet optimize for code size. With a new ARM backend, there is increasing interest in using LLVM for embedded systems where code size is more of an issue.

Adding new capabilities to LLVM(3) 5. New Transformations and Analyses • Implement a Loop Dependence Analysis InfrastructureDesign some way to represent and query dep analysis • Value range propagation pass • More fun with loops: Predictive Commoning • Type inference (aka. devirtualization) • Value assertions (also PR810). 6. Code Generator Improvements • Merge the delay slot filling logic that is duplicated into (at least) the Sparc and Mipsbackends into a single target independent pass • Implement 'stack slot coloring' to allocate two frame indexes to the same stack offset if their live ranges don't overlap. This can reuse a bunch of analysis machinery from LiveIntervals. • Implement 'shrink wrapping', which is the intelligent placement of callee saved register save/restores.A • Finish adapting existing targets to use the calling convention description mechanism • Implement interprocedural register allocation. The CallGraphSCCPass can be used to implement a bottom-up analysis that will determine the *actual* registers clobbered by a function. • Implement a verifier for codegen level instructions. To help track down malformed machineinstrs sooner and make debugging problems easier.

Adding new capabilities to LLVM(4) 7. Miscellaneous Additions • Port the BiglooSchemecompiler, from Manuel Serrano at INRIA Sophia-Antipolis, to output LLVM bytecode. It seems that it can already output .NET bytecode, JVM bytecode, and C, so LLVM would ostensibly be another good candidate. • Write a new frontend for some other language (Java? OCaml? Forth?) • Random test vector generator: Use a C grammar to generate random C code, e.g., quest; run it through llvm-gcc, then run a random set of passes on it using opt • Add sandbox features to the Interpreter: catch invalid memory accesses, potentially unsafe operations (access via arbitrary memory pointer) etc. • Port Valgrind to use LLVM codegeneration and optimization passes instead of its own. • Create an LLVM pass that adds memory safety checks to code, like Valgrind does for binaries, or like mudflapdoes for gcc compiled code. • Write LLVM IR level debugger (extend Interpreter?) • Write an LLVM Superoptimizer. It would be interesting to take ideas from this superoptimizer for x86

Open LLVM Project

Open LLVM Project

Presentation Transcript

Project Invoicing using ] project-open [

Open Pseudonymiser Project

Project/Open Vision

Open Relevance Project

LLVM IR

Open Pseudonymisation Project

on LLVM

LLVM

Compiling Linux with LLVM

Open Door Project

LLVM Primer

LLVM

Open Borders Project

Open Directory Project

TCG to LLVM

Open Road Project

Project Invoicing using ] project-open [

Open Borders Project

LLVM

LLVM Primer