Evaluate the impact of thread escape analysis on memory consistency optimizations. Discuss memory models, Pensieve system, escape analyses, and their impact on delay set analysis and synchronization analysis. Explore experimental results and conclusions.
Evaluating the Impact of Thread Escape Analysis on Memory Consistency Optimizations • Chi-Leung Wong, Zehra Sura, Xing Fang, Kyungwoo Lee, Samuel P. Midkiff, Jaejin Lee and David Padua • University of Illinois at Urbana-Champaign, IBM T.J. Watson Research Center, Purdue University, Seoul National University
Outline • Memory Models • The Pensieve System • Escape Analyses • Qualitative Impact of Escape Analyses on Delay Set Analysis and Synchronization Analysis • Experimental Results • Conclusion
Memory Models • Consider the following code segments (data is initially 0): • Thread 1 : data = 100; data_ready = true; • Thread 2 : while (!data_ready); t = data; • Can t == 0? • Yes, if reordering happens, e.g. • Thread 1 : data_ready = true; data = 100; • Such reordering can be done by the compiler or the hardware • The memory model tells us the answer • Sequential consistency says no
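The two-thread example above can be written directly in Java. The following is a minimal sketch, not part of the original slides: the class name DataReadyExample and the method runOnce are hypothetical, and it uses Java's own volatile keyword on the flag (rather than the Pensieve approach) to forbid the reordering, so the reader thread is guaranteed to see data == 100.

```java
final class DataReadyExample {
    static int data = 0;
    // volatile forbids moving the store to data after the store to dataReady,
    // and forbids moving the load of data before the load of dataReady.
    static volatile boolean dataReady = false;

    /** Runs the two threads once and returns the value of t seen by Thread 2. */
    static int runOnce() {
        data = 0;
        dataReady = false;
        int[] t = new int[1];
        Thread thread2 = new Thread(() -> {
            while (!dataReady) {
                Thread.onSpinWait(); // busy-wait until Thread 1 publishes
            }
            t[0] = data;
        });
        Thread thread1 = new Thread(() -> {
            data = 100;
            dataReady = true;
        });
        thread2.start();
        thread1.start();
        try {
            thread1.join();
            thread2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return t[0];
    }

    public static void main(String[] args) {
        System.out.println("t = " + runOnce());
    }
}
```

If dataReady were a plain field, the Java memory model would allow the outcome t == 0 (or a reader that never terminates), which is the behavior a sequentially consistent system rules out.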
Objective of the Pensieve Project • Sequential consistency (SC) on top of Intel x86 memory models • Implementation based on Jikes RVM • All analyses done at JIT time • Need to minimize both analysis time and application execution time
Enforcing SC • Done by enforcing memory access orders • Not all orderings need to be enforced • Only enforce the orders that are really needed • Delay Set Analysis (DSA) [SS88] computes such orders • Our approach: an approximation of DSA • Orders are enforced by inserting fences in the generated code
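To illustrate what enforcing an order with a fence looks like at the source level, here is a minimal hand-written sketch. It is an assumption-laden illustration, not the Pensieve implementation (which inserts fences in JIT-generated machine code): the class FenceSketch is hypothetical, and it uses the standard VarHandle.fullFence() method (Java 9+) as a stand-in for a hardware fence.

```java
import java.lang.invoke.VarHandle;

final class FenceSketch {
    // Plain (non-volatile) fields: ordering comes only from the explicit fences.
    static int data = 0;
    static boolean dataReady = false;

    static void writer() {
        data = 100;
        VarHandle.fullFence(); // keep the store to data ahead of the store to dataReady
        dataReady = true;
    }

    static int reader() {
        while (!dataReady) {
            VarHandle.fullFence(); // the next load of dataReady may not be merged with this one
        }
        VarHandle.fullFence(); // keep the load of dataReady ahead of the load of data
        return data;
    }

    /** Runs one writer and one reader thread; returns the value the reader saw. */
    static int runOnce() {
        data = 0;
        dataReady = false;
        int[] result = new int[1];
        Thread r = new Thread(() -> result[0] = reader());
        Thread w = new Thread(FenceSketch::writer);
        r.start();
        w.start();
        try {
            r.join();
            w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw " + runOnce());
    }
}
```

Each fence corresponds to one enforced program-order pair; the point of delay set analysis is to insert only the fences whose pairs actually need enforcing.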
Original DSA • Program edge • x executes before y in the same thread • Conflict edge • x and x’ are conflicting accesses • The order of the accesses affects the program outcome • In this paper, two accesses conflict if they are: • to the same memory location • at least one of them is a write
Original DSA (Cont’d) • Critical cycle • Minimal • Cannot form a smaller cycle using a subset of its nodes • Mixed • Contains both program edges and conflict edges • Enforce the program edges on a critical cycle • [Figure: example cycles over accesses x, y, z, x’, y’, labeled “Not mixed”, “Not minimal”, and “Mixed, Minimal”]
Approximate DSA • Approximation of a critical cycle • x precedes y • Conflicting accesses: • x and x’ • y and y’ • y’ precedes x’ • Enforce the program edges on an approximate critical cycle
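The approximate critical-cycle test above can be sketched as a brute-force check over all accesses. This is an illustrative sketch only: the Access representation, the class name ApproxDelaySet, and the quadratic search are assumptions made for the example, not the data structures or algorithm used in Pensieve.

```java
import java.util.ArrayList;
import java.util.List;

final class ApproxDelaySet {
    /** One shared-memory access (hypothetical representation). */
    static final class Access {
        final int thread;      // owning thread
        final int pos;         // position in that thread's program order
        final String loc;      // memory location accessed
        final boolean isWrite; // true for a write, false for a read

        Access(int thread, int pos, String loc, boolean isWrite) {
            this.thread = thread;
            this.pos = pos;
            this.loc = loc;
            this.isWrite = isWrite;
        }
    }

    /** Program edge: a executes before b in the same thread. */
    static boolean precedes(Access a, Access b) {
        return a.thread == b.thread && a.pos < b.pos;
    }

    /** Conflict edge: different threads, same location, at least one write. */
    static boolean conflicts(Access a, Access b) {
        return a.thread != b.thread && a.loc.equals(b.loc) && (a.isWrite || b.isWrite);
    }

    /** Returns every pair {x, y} whose program order must be enforced. */
    static List<Access[]> delays(List<Access> accesses) {
        List<Access[]> out = new ArrayList<>();
        for (Access x : accesses) {
            for (Access y : accesses) {
                if (precedes(x, y) && closesCycle(x, y, accesses)) {
                    out.add(new Access[] {x, y});
                }
            }
        }
        return out;
    }

    /** True if some y' conflicting with y precedes some x' conflicting with x. */
    private static boolean closesCycle(Access x, Access y, List<Access> accesses) {
        for (Access yPrime : accesses) {
            if (!conflicts(y, yPrime)) continue;
            for (Access xPrime : accesses) {
                if (conflicts(x, xPrime) && precedes(yPrime, xPrime)) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Access> example = List.of(
            new Access(1, 0, "data", true),
            new Access(1, 1, "data_ready", true),
            new Access(2, 0, "data_ready", false),
            new Access(2, 1, "data", false));
        System.out.println("delays: " + delays(example).size());
    }
}
```

For the data / data_ready example, both the writer's pair (data, data_ready) and the reader's pair (data_ready, data) are reported, so an order is enforced in each thread.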
The Pensieve System • Source Program • Program Analyses: Thread Escape Analysis, Synchronization Analysis, Delay Set Analysis • Orders to Enforce • Code Optimizations and Fence Insertion & Optimization • Target Program
Escape Analyses • Identify objects which may be accessed by two or more threads • Output: set of variables • {v | v points to an object that may be accessed by >= 2 threads}
Impact on Delay Set Analysis • x, y, y’, x’ must all be escaping accesses • Cannot form a cycle if one of them is not an escaping access • Fewer escaping accesses implies fewer possible pairs (x, y) • Fewer checks to be done • Fewer delays
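The effect of escape information on the number of candidate pairs can be made concrete with a small count. This is a sketch under simplifying assumptions: the class EscapeFilter and its method are hypothetical, each access is identified only by the variable it uses, and all accesses belong to one straight-line thread, so k escaping accesses yield k(k-1)/2 ordered pairs (x, y).

```java
import java.util.List;
import java.util.Set;

final class EscapeFilter {
    /**
     * Counts the pairs (x, y), with x before y, that delay set analysis would
     * have to check for one thread, keeping only accesses to escaping variables.
     */
    static int candidatePairs(List<String> accessedVars, Set<String> escaping) {
        int k = 0;
        for (String v : accessedVars) {
            if (escaping.contains(v)) {
                k++; // only escaping accesses can lie on a cycle
            }
        }
        return k * (k - 1) / 2;
    }

    public static void main(String[] args) {
        List<String> accesses = List.of("a", "b", "c", "d");
        System.out.println(candidatePairs(accesses, Set.of("a", "b", "c", "d")));
        System.out.println(candidatePairs(accesses, Set.of("a", "b")));
    }
}
```

With four accesses, treating everything as escaping gives six pairs to check, while a more precise analysis that marks only two of them as escaping leaves a single pair.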
Impact on Synchronization Analysis • Synchronization analysis reduces the number of conflict edges considered by DSA • It considers • synchronized constructs • Calls to start() and join() • Our system only considers t1.join() if • it can be matched with some t2.start() call • t1 and t2 are not escaping • More precise escape info • more join() calls matched • more precise DSA result
Escape Analyses Comparison • In this study, we compare 4 algorithms: • Connectivity Analysis (Pensieve) • Field Base Analysis (Pensieve) • For comparison purposes • Bogda’s Analysis • Removing Unnecessary Synchronization in Java. (OOPSLA 1999) • Ruf’s Analysis • Effective Synchronization Removal for Java. (PLDI 2000)
Connectivity Escape Analysis • An object is escaping if both • It is reachable by more than one thread, in one of two possible ways: • Reachable from a static field • Passed to a thread constructor • It is accessed by more than one thread • Does not assume that the this reference escapes in run() by default • Field insensitive for most memory accesses • I.e. does not distinguish x.f from x.g • Except for accesses to Runnable objects
Field Base Escape Analysis • An object is escaping if it is • Reachable from a static field • Passed to a thread constructor • Does not assume that the this reference escapes in run() by default • Similar to connectivity analysis, but field sensitive • Suppose O1, O2 are of the same type • O1.f is different from O1.g • O1.f is the same as O2.f
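One way to picture the difference between the two abstractions is as the key under which a field access is recorded. The sketch below is purely illustrative: the class AbstractLocation, its method names, and the string keys are assumptions for the example and do not come from the Pensieve source.

```java
final class AbstractLocation {
    /** Field-insensitive key: all fields of one object collapse, so O1.f and O1.g coincide. */
    static String fieldInsensitiveKey(String objectId, String typeName, String field) {
        return objectId;
    }

    /** Field-based key: the object is ignored, so O1.f and O2.f coincide but O1.f and O1.g differ. */
    static String fieldBasedKey(String objectId, String typeName, String field) {
        return typeName + "." + field;
    }

    public static void main(String[] args) {
        // O1 and O2 are two objects of the same hypothetical type Point.
        System.out.println(fieldBasedKey("O1", "Point", "f").equals(fieldBasedKey("O2", "Point", "f")));
        System.out.println(fieldBasedKey("O1", "Point", "f").equals(fieldBasedKey("O1", "Point", "g")));
    }
}
```

The trade-off is visible in the keys: the field-based form keeps unrelated fields of one object apart, at the cost of merging the same field across all objects of a type.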
Bogda’s Escape Analysis • An object is escaping if it is reachable: • By a static field • By a Runnable object • Via more than 1 field reference
Ruf’s Escape Analysis • An object is escaping if both • Reachable from either • A static field or • A Runnable object • Synchronized by more than one thread • Adapted for our own use • “synchronized” replaced by “accessed”
Experimental Settings (Machine) • Intel (Dell PowerEdge 6600 SMP) • 4 Intel hyperthreaded 1.5 GHz Xeon processors • with 1 MB cache each • 6 GB system memory
Experimental Settings (Software) • Original • default Jikes RVM implementation • base case for performance comparison • Enforcing SC • Empty • Arg Escaping • Connectivity analysis • Field-base analysis • Bogda’s analysis (bogda) • Ruf’s analysis
Measurements • Escape Analysis Time • Impact on Delay Set Analysis Time • Impact on Synchronization Analysis Time • Slowdown due to fence insertion • Delay Set Analysis only • Delay Set Analysis with Synchronization Analysis
[Chart: Escape + DSA + Synchronization Analysis Time / Compilation Time]
Conclusions • Evaluated the interaction between escape analysis and synchronization/delay set analysis • Montecarlo and jbb motivate enabling field sensitivity for connectivity analysis