Specialization

Presentation Transcript


  1. Specialization Run-time Code Generation in the context of Dynamic Compilation presented by Mujtaba Ali

  2. Earlier in the Semester • “VCODE: A Retargetable, Extensible, Very Fast Dynamic Code Generation System” • Generate “assembly” at run-time • Run-time Code Generation, but not in the realm of Dynamic Compilation • “Fast, Effective Dynamic Compilation” • Our only foray into Run-time Specialization • Same authors as the first paper we will discuss • Precursor to DyC – we’ll call it the “DyC paper”

  3. Example Compiler Optimizations

  4. Order Matters
     foo (b) { a = 3; print(a); a = b; print(a); }
       => (Constant Propagation)
     foo (b) { a = 3; print(3); a = b; print(a); }
       => (Copy Propagation)
     foo (b) { a = 3; print(3); a = b; print(b); }
       => (DA Elimination)
     foo (b) { print(3); print(b); }

  5. Partial Evaluation vs. the Complete Set of Compiler Optimizations • Partial Evaluation typically covers: Constant Propagation, Constant Folding, Loop Unrolling, Inlining • Partial Evaluation is only a subset of the complete set of compiler optimizations
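
  As a rough illustration (not from the slides; the class and method names below are assumptions), partially evaluating a small function with respect to known argument values combines several of these optimizations at once:

      class PartialEvalExample {
          // Generic: scales and offsets a value.
          static int mulAdd(int x, int scale, int offset) {
              return x * scale + offset;
          }

          // Partially evaluated for scale == 1, offset == 0: constant
          // propagation substitutes the known arguments and constant
          // folding removes the now-trivial arithmetic (x * 1 + 0 => x).
          static int mulAddIdentity(int x) {
              return x;
          }
      }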

  6. DyC Paper Refresher

  7. Towards Automatic Construction of Staged Compilers Matthai Philipose, Craig Chambers, and Susan J. Eggers

  8. Contributions • Eases the burden on the compiler writer • Compiler writer simply writes optimizations as usual • Supporting contributions: • Enhancements to Partial Evaluation • Enhancements to Dead-assignment Elimination

  9. Staged Compilation Framework (SCF) • SCF-ML: the language in which compiler optimizations are written • A first-order, side-effect-free subset of ML • Used by the compiler writer • Stager • Specializes optimizations written in SCF-ML to separate compile-time and run-time optimizations • Requires “approximate” information about program input, supplied by the compiler user

  10. High-level Overview of SCF

  11. Concrete Example • Function to be optimized: • Say this function is the only contents of “mul_add.c”

  12. Concrete Example (con’t) • At compile-time, the compiler user feeds each optimization, written in SCF-ML (O1: const prop, O2: copy prop, O3: dae), through the stager, together with the AST of mul_add.c (I1, with intermediate results I2 and I3) and the information that a is const at run-time • The stager emits staged versions of the optimizations (O1’: staged const prop, O2’: staged copy prop, O3’: staged dae, all in SCF-ML) plus stub code

  13. Concrete Example (con’t) • In ML:

  14. Concrete Example (con’t) • What does the stub function do? • It runs the staged optimizations O1’ (staged const prop), O2’ (staged copy prop), and O3’ (staged dae) over the AST of mul_add.c, using the run-time const value of a, producing intermediate programs P1, P2, P3 and finally the optimized program Popt • All this is happening at run-time

  15. Concrete Example (con’t) • In ML:

  16. Stager • The stager specializes the optimizations • The Partial Evaluator optimizes the writer’s optimization and generates the run-time/staged part of the optimization • Performance is crucial for the run-time part of the optimizations, hence the Dead-assignment Elimination • It appears as if there are two levels of specialization • One of the “alleged” levels is just the automatic separation between compile-time and run-time

  17. Alternate View of Specialization • M’ is specialized to I • Note that O1’ will execute faster than O1 • Standard compiler: P => O1 => M • Specializing compiler: P => O1’ => O1’/2 => M’ (run it with I to get M/2)

  18. Compare and Contrast with DyC • Compiler writer’s job is much easier • Stager automatically does all the work • Not specific to C • Not limited to optimizations that come with the staged compiler • Compiler user still has the same API: • Input (run-time constraints) • User program

  19. Benchmarking • Definitely easier than DyC • But is automatic staging as effective wrt: • Unstaged (standard) compiler • Hand-staged optimizations (as in DyC) • How fast are the combined static-time and run-time stages of the staged compiler? • Also, what about the size of the staged compiler?

  20. Benchmarking (con’t) • vs. Unstaged (standard) compiler • Noticeably faster (it better be) • vs. Hand-staged optimizations • Hand-staged optimizations perform much better • Don’t know why, though • Unstaged compiler vs. staged compiler (both optimized) • Who cares, since the unstaged compiler does everything at static-time • Size of the automatically staged compiler • In practice, size grows linearly in the size of the input program

  21. Drawbacks • Compiler user must understand specialization • Doesn’t seem to be a way around this • Not clear if code generation during stub function execution is language-specific • Implemented in ML • Language interoperability (ML ↔ C) • ML is garbage collected and bounds checked • Not robust enough for public consumption

  22. Supporting Contributions • Partial Evaluators have been developed for many domains • However, partially evaluating optimization programs presents its own challenges • SCF’s Partial Evaluator: • Knows about map and set types • Uses identity tags for better equality analysis

  23. Supporting Contributions (con’t) • Also, SCF employs a smarter representation of abstract values • More amenable to specialization of code implementing optimizations • Abstract values are used to loosely interpret code during partial evaluation

  24. Towards Automatic Specialization of Java Programs Ulrik Pagh Schultz, Julia L. Lawall, Charles Consel, and Gilles Muller

  25. Contributions • Specialization is useful in an OO setting • Supporting contributions: • OO languages lend themselves to specialization • “Quick and dirty” proof of concept • Performance benchmarks

  26. General Idea • The more generic you get, the more your performance suffers • OO is generic by nature • Specialization is our savior • Specialization will convert a generic program to a specific one • Spectrum: Genericity (sluggish) vs. Specificity (efficient)

  27. Automatic Specialization? • What is meant by “automatic”? • Specialized code is not generated by hand • What is different about the “specialization” here? • Let’s try and mimic run-time specialization at compile-time • Specialize for each member of the most likely inputs • If actual input at run-time is not in this set, fall back on the generic, non-specialized code
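
  A rough sketch (not from the slides; the class and method names are hypothetical) of what “specialize for the most likely inputs, fall back otherwise” can look like at the source level:

      class Dot {
          // Generic: works for any vector length.
          static int dotGeneric(int[] a, int[] b) {
              int sum = 0;
              for (int i = 0; i < a.length; i++) {
                  sum += a[i] * b[i];
              }
              return sum;
          }

          // Variant specialized at compile-time for the anticipated case
          // of length 3 (loop fully unrolled).
          static int dot3(int[] a, int[] b) {
              return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
          }

          // Guarded entry point: use the specialized variant when the
          // actual run-time input matches the anticipated case, otherwise
          // fall back on the generic code.
          static int dot(int[] a, int[] b) {
              if (a.length == 3 && b.length == 3) {
                  return dot3(a, b);
              }
              return dotGeneric(a, b);
          }
      }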

  28. Concrete Example • Consider this Java class for the power function:
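
  The slide’s code is not included in the transcript; a minimal sketch of the kind of Java class being described might look like this (the field and constructor are assumptions; only calculate() is named on the following slides):

      class Power {
          private int exponent;

          Power(int exponent) {
              this.exponent = exponent;
          }

          // Generic implementation: raises base to this.exponent with a loop.
          int calculate(int base) {
              int result = 1;
              for (int i = 0; i < exponent; i++) {
                  result *= base;
              }
              return result;
          }
      }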

  29. Concrete Example (con’t) • Assume the exponent is always 3 • We can specialize the calculate() method • JSCC would allow us to automatically specialize calculate()
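
  A hedged sketch of what the specialized code could look like once the exponent is fixed at 3 (the class name below is invented for illustration): the loop is unrolled and the exponent field is never read.

      class PowerCubed {
          // Specialized calculate() for exponent == 3.
          int calculate(int base) {
              return base * base * base;
          }
      }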

  30. Concrete Example (con’t) • With JSCC, we would define a specialization class as such: • JSCC will generate the specialized code and…

  31. Concrete Example (con’t) • … JSCC adds a method for switching between specialized implementations:
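
  The generated method itself is not reproduced in the transcript; purely as a hypothetical illustration of the idea (the names and signature below are invented, not JSCC’s actual output), switching between the generic and the specialized implementation could look like this:

      class SwitchablePower {
          private int exponent;
          private boolean useSpecialized = false;

          SwitchablePower(int exponent) {
              this.exponent = exponent;
          }

          // Invented name: enable the exponent == 3 specialization when it applies.
          void enableSpecialization() {
              useSpecialized = (exponent == 3);
          }

          int calculate(int base) {
              if (useSpecialized) {
                  return base * base * base;   // specialized body
              }
              int result = 1;                  // generic body
              for (int i = 0; i < exponent; i++) {
                  result *= base;
              }
              return result;
          }
      }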

  32. Specialization in an OO Context • How does OO naturally lend itself to specialization? • Data Encapsulation • Virtual Method Calls • In the case of Java, specializing the “VM”

  33. Specializing Data Encapsulation • Traditional program specialization, but… • … save on sequence of pointer dereferences (due to OO)
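
  An illustrative sketch (types and names are assumptions, not from the slides) of how specialization pays off for encapsulated data: once the object layout and receivers are known, a chain of accessor calls and dereferences collapses into direct field accesses.

      class Pixel {
          int value;
          int getValue() { return value; }
      }

      class Row {
          Pixel[] pixels;
          Pixel getPixel(int x) { return pixels[x]; }
      }

      class Image {
          Row[] rows;
          Row getRow(int y) { return rows[y]; }

          // Generic code: a chain of method calls and dereferences per access.
          int brightness(int x, int y) {
              return getRow(y).getPixel(x).getValue();
          }

          // Shape a specializer could produce once the representation is
          // known: the accessor chain collapses to direct field accesses.
          int brightnessSpecialized(int x, int y) {
              return rows[y].pixels[x].value;
          }
      }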

  34. Specializing Object Types • Information fed to specializer can help eliminate dynamic dispatch overhead
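
  A small assumed example (not from the slides) of how type information lets the specializer remove dynamic dispatch: when the receiver is known to be a Square, the virtual call can be replaced by the concrete body and inlined.

      abstract class Shape {
          abstract int area();
      }

      class Square extends Shape {
          int side;
          Square(int side) { this.side = side; }
          int area() { return side * side; }
      }

      class Areas {
          // Generic: area() is dispatched dynamically on the actual subtype.
          static int totalArea(Shape[] shapes) {
              int sum = 0;
              for (Shape s : shapes) {
                  sum += s.area();
              }
              return sum;
          }

          // Specialized under the information "all elements are Squares":
          // the dynamic dispatch is gone and the body is inlined.
          static int totalAreaOfSquares(Square[] squares) {
              int sum = 0;
              for (Square s : squares) {
                  sum += s.side * s.side;
              }
              return sum;
          }
      }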

  35. Specializing the Virtual Machine • Such low-level optimizations are possible because Java is eventually translated to C in this scheme

  36. Two Levels of Specialization (Again) • Specialization of original Java program • JSCC • Example earlier • Specialization of translated C program • Tempo • Only the compile-time specialization capabilities of Tempo are exploited

  37. “Quick and Dirty” Prototype

  38. Benchmarks • Image Processing Application

  39. Benchmarks (con’t) • What are “Prospective” Benchmarks? • Benchmarks on manual back-translation to Java • Results reported for both Sparc and Pentium

  40. Future Work • Run-time specialization • Tempo already supports run-time specialization • Automatic back translation to Java • Why? Portability. • Apply to situations involving extreme genericity • Software components • For example: JavaBeans

  41. Drawbacks • Lack of run-time specialization • Forces tedious compile-time specialization • How will exploiting Tempo’s run-time specialization affect backporting to Java? • Looks and feels like an ugly hack

  42. The Present • JSPEC is the new incarnation • Run-time specialization • Java-to-Java translation • JSPEC paper is unpublished but available

  43. Compare and Contrast with SCF • Unlike SCF, adding additional optimizations to the specialization is non-trivial • You’re stuck with the provided partial evaluator • But partial evaluation covers a lot of ground • No run-time specialization • Most significant drawback

  44. That’s All Folks! • Use Cyclone because it’s good for you! • Cyclone is the bestest language.
