Extended Whole Program Paths Sriraman Tallam Rajiv Gupta Xiangyu Zhang University of Arizona
Control Flow and Dependence Traces • Control Flow Traces • Sequence of basic blocks. • Identification of hot paths. • Path Sensitive Instruction Scheduling and Optimization. • Path Prediction and Instruction Fetching. • Dependence Traces • Capture data dependences. • Flow from a definition to a use. • Data Speculative Optimizations for Itanium. • Computation of Dynamic Slices.
Control Flow and Dependence Traces • Control Flow Traces are smaller than Dependence Traces and compress well. • Average size for Spec 2K benchmarks is 179 MB. • Compression Factor • Sequitur – 681 • VPC – 442 • Dependence Traces are large and do not compress as well as Control Flow Traces. • Average size for Spec 2K benchmarks is 565 MB. • Compression Factor • Sequitur – 1.31 • VPC – 5.8 • Is there an alternative trace representation?
Our Approach • Extended Control Flow Trace – Unified Trace Representation. • Capture both control flow and dependence information. • The data dependences are embedded as control flow. • The unified trace is smaller than control flow + dependence traces. • Our compressed unified trace is also smaller than the compressed control flow + compressed dependence traces.
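Goals in Designing the eCF • Without extra information, the dependence cannot be recovered from the control flow trace because of possible aliasing. • Additional control flow can capture the dependence. • The dependence can then be recovered from the control flow. [Figure: control flow graph in which the definitions X = _ and *p = _ reach the use _ = X; an added check "if p == &X" introduces the extra control flow.]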
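The idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the block names (`def_X`, `store_p`, `alias_X`, `use_X`) and both functions are hypothetical. The disambiguation check `p == &X` appends an extra block to the trace only when the store through `p` overwrote `X`, so the reaching definition can later be read off the control flow alone.

```python
def run(p_points_to_x: bool) -> list[str]:
    """Return a toy extended control flow trace for one execution."""
    trace = ["def_X"]            # X = _
    trace.append("store_p")      # *p = _  (may or may not write X)
    if p_points_to_x:            # disambiguation check: p == &X
        trace.append("alias_X")  # extra block, recorded only when *p wrote X
    trace.append("use_X")        # _ = X
    return trace


def reaching_definition(trace: list[str]) -> str:
    """Walk backwards from the last use of X to the statement that defined it."""
    use = len(trace) - 1 - trace[::-1].index("use_X")
    for block in reversed(trace[:use]):
        if block == "alias_X":   # the check fired, so the store through p defined X
            return "store_p"
        if block == "def_X":
            return "def_X"
    raise ValueError("no definition of X before the use")
```

Without the `alias_X` block, both executions would produce the same trace and the two possible dependences could not be told apart.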
Cost of Capturing Dependences • No-cost capture • For these dependences, no disambiguation checks are needed. • Fixed cost capture • The number of disambiguation checks needed is a constant. • Variable cost capture. • The number of disambiguation checks varies.
No Cost Capture • All instances of the dependence can be recovered from the control flow trace.
Fixed Cost Capture • A single disambiguation check is sufficient to capture this dependence. [Figure label: Single Check]
Variable Cost Capture • The instances of the dependence can be caused by any instance of the definition statement. [Figure label: Multiple Checks]
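The difference between fixed and variable cost can be illustrated with a rough sketch. The function names and the integer addresses are assumptions made for the example, and the backward linear scan is just the simplest way to show why the number of checks varies; it is not claimed to be the scheme used in the talk.

```python
def fixed_cost_check(p_addr: int, x_addr: int) -> tuple[bool, int]:
    """Fixed cost: one comparison at the store *p = _ decides the dependence."""
    return p_addr == x_addr, 1


def variable_cost_check(written_addrs: list[int], use_addr: int):
    """Variable cost: any earlier instance of the definition may be the source.

    written_addrs[k] is the address written by instance k+1 of the definition.
    Returns (matching instance or None, number of address comparisons made).
    """
    checks = 0
    # Scan from the most recent instance backwards; the first match is the
    # last write to the address, hence the source of the dependence.
    for instance in range(len(written_addrs), 0, -1):
        checks += 1
        if written_addrs[instance - 1] == use_addr:
            return instance, checks
    return None, checks
```

In the fixed-cost case the count is always one; in the variable-cost case it depends on how far back the matching instance lies.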
Cost of Instrumentation and Trace Compressibility • Reducing the number of checks • Reducing the size of the generated trace. • Reduction in run-time overhead. • Improving the Compressibility • Similar Control Flow Signatures.
Two Phased Approach • Conservative nature of Static Pointer Analysis. • Too many potential dependences per use. • Two phased Approach • Filtering Phase • Find all dependences exercised. • Profiling Phase • Add disambiguation checks only for those dependences exercised.
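A minimal sketch of the two phases, assuming dependences are represented as hypothetical `(definition, use)` statement-id pairs: the filtering run records which statically possible dependences were actually exercised, and the profiling run plans disambiguation checks only for those.

```python
def filtering_phase(potential_deps, observed_pairs):
    """Keep only the statically possible dependences seen in the filtering run."""
    return set(potential_deps) & set(observed_pairs)


def profiling_plan(exercised):
    """Group the exercised dependences by use: the checks to add per use."""
    plan = {}
    for definition, use in sorted(exercised):
        plan.setdefault(use, []).append(definition)
    return plan


# Static pointer analysis is conservative: three candidate definitions for s9.
potential = {("s1", "s9"), ("s2", "s9"), ("s3", "s9"), ("s4", "s7")}
# Pairs observed while running the filtering phase (repeats are harmless).
observed = [("s2", "s9"), ("s2", "s9"), ("s4", "s7")]

exercised = filtering_phase(potential, observed)
plan = profiling_plan(exercised)
```

Here the use `s9` ends up with one check instead of three, which is the reduction the filtering phase is meant to deliver.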
Binary Search vs. Linear Search • Track the last definition and instance of every write to a memory address. • Search the address array using binary search instead of linear search.
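One way to picture the address array is a sorted list with a parallel list of last writers, searched with Python's standard `bisect` module. The class and its method names are hypothetical, and the statement ids are made up for the example; the sketch shows only the lookup strategy, not the instrumentation itself.

```python
import bisect


class LastWriteTable:
    """Sorted address array; each address maps to its last (statement, instance)."""

    def __init__(self):
        self.addrs = []    # memory addresses, kept sorted
        self.writers = []  # parallel list: (statement id, instance number)

    def record_write(self, addr, stmt, instance):
        i = bisect.bisect_left(self.addrs, addr)  # binary search for the slot
        if i < len(self.addrs) and self.addrs[i] == addr:
            self.writers[i] = (stmt, instance)    # overwrite the previous writer
        else:
            self.addrs.insert(i, addr)            # new address: keep order
            self.writers.insert(i, (stmt, instance))

    def last_writer(self, addr):
        i = bisect.bisect_left(self.addrs, addr)
        if i < len(self.addrs) and self.addrs[i] == addr:
            return self.writers[i]
        return None                               # address never written
```

Each lookup needs a logarithmic number of address comparisons rather than a linear one. Inserting into a Python list still shifts elements, so this sketch models the comparison count, not the full run-time cost.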
Experimental Results • Implementation on the Microsoft Phoenix RDK. • Spec 2K benchmark binaries were rewritten to obtain instrumented versions. • Easy to implement using Phoenix. • Intermediate representation was low-level x86 instruction set. • Split dependences into register and memory. • Register dependences are always recoverable from control flow trace. • Memory dependences were recovered using our approach.
Register and Memory Dependences • A significant fraction (76 %) of dependences, namely the register dependences, can be recovered from the control flow trace.
Uncompressed Trace Sizes [Table columns: Control Flow + Dependence, Unified, Ratio] • The unified trace is 62 % of the size of the Control Flow + Dependence Trace.
Sequitur Compressed [Table columns: Control Flow + Dependence, Unified, Ratio] • The compressed unified trace is 4 % of the size of the compressed Control Flow + Dependence Trace.
VPC Compressed [Table columns: Control Flow + Dependence, Unified, Ratio] • The compressed unified trace is 21 % of the size of the compressed Control Flow + Dependence Trace.
Memory Dependence Types • 30 % of dependences can be recovered at no cost.
Address Comparisons • Binary Search reduces the address comparisons by 4 orders of magnitude.
Run-time Overhead • There is a 20 % increase in run-time overhead in collecting the unified trace.
Conclusions • We have designed an extended control flow trace that captures both control flow and data dependence history. • The key to the unified trace is the ability to convert memory data dependences into control flow. • The resulting unified trace is smaller than the combined control flow + dependence trace. • The run-time overhead increases by 20 %. • Our thanks to Hoi Vo of Microsoft Corporation and the Phoenix Compiler Infrastructure Group.