ILPc: A novel approach for scalable timing analysis of synchronous programs • Hugh Wang • Partha S. Roop • Sidharta Andalam
Outline • Timing analysis of concurrent programs • Problem statement • Our approach - ILPc • Results • Conclusions
Acronyms • ILP – Integer Linear Program • ILPs – ILP sequential • ILPc – ILP concurrent • EOT – End of Tick, also known as “pause” • TCCFG – Timed concurrent control flow graph (intermediate format)
Synchronous languages
abort
  [ loop foo1(); pause; foo2(); pause; end loop ]
  ||
  [ loop foo3(); pause; foo4(); pause; end loop ]
when s
Synchronous languages (the same program shown in successive builds, highlighting in turn) • abort start and abort end • abort condition, with its False and True branches • fork
Synchronous languages • In each tick: read inputs, computation, emit outputs [Figure: the three phases laid out along a time axis]
Synchronous languages [Figure: time axis divided into Tick 1, Tick 2, Tick 3 and Tick 4]
Synchronous languages • Worst Case Reaction Time (WCRT) analysis • Synchrony hypothesis: the reactive system operates infinitely fast compared to the environment. • Validation: Min(inter-arrival time of events) > Max(reaction time), where the maximum reaction time is the longest tick.
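A minimal sketch of the validation check above, assuming the inter-arrival times of events and the per-tick reaction times are already known and expressed in the same unit. The function name and the sample numbers are illustrative and do not come from the slides.

```python
def synchrony_holds(inter_arrival_times, reaction_times):
    """True if the closest pair of events is still further apart than the longest tick."""
    wcrt = max(reaction_times)  # worst case reaction time = longest tick
    return min(inter_arrival_times) > wcrt


# Hypothetical numbers, in cycles.
print(synchrony_holds([50, 80, 65], [25, 28, 35]))  # True
print(synchrony_holds([30, 80, 65], [25, 28, 35]))  # False
```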
Motivating example • Three concurrent threads: T1 || T2 || T3 • T1 has ticks A1, A2, A3 • T2 has ticks B1, B2 • T3 has ticks C1, C2
Motivating example • Tick costs (cycles): A1 = 10, A2 = 15, A3 = 20; B1 = 10, B2 = 3; C1 = 5, C2 = 10 • Execution time of successive ticks: A1+B1+C1 = 25, A2+B2+C2 = 28, A3+B1+C1 = 35
Conventional approaches • Max-plus: sum the largest execution time of each thread to obtain the WCRT. • M. Boldt, C. Traulsen, and R. Hanxleden. Worst Case Reaction Time Analysis of Concurrent Reactive Programs. ENTCS, 203(4), 2008. • L. Ju, B. K. Huynh, A. Roychoudhury, and S. Chakraborty. Performance debugging of Esterel specifications. CODES-ISSS, 2008.
Conventional approaches • State exploration: find all feasible states, then take the largest execution time among them as the WCRT. • ILPs: L. Ju, B. K. Huynh, S. Chakraborty, and A. Roychoudhury. Context-sensitive timing analysis of Esterel programs. DAC, 2009. • Model checking: S. Andalam, P. S. Roop, and A. Girault. Pruning infeasible paths for tight WCRT analysis of synchronous programs. DATE, 2011. • Reachability: M. Kuo, R. Sinha, and P. S. Roop. Efficient WCRT analysis of synchronous programs using reachability. DAC, 2011.
Motivating example: WCRT analysis • Max-plus: WCRT = Max(T1) + Max(T2) + Max(T3) = A3 + B1 + C2 = 20 + 10 + 10 = 40 cycles • State exploration: A1+B1+C1 = 25, A2+B2+C2 = 28, A3+B1+C1 = 35, A1+B2+C2 = 23, A2+B1+C1 = 30, A3+B2+C2 = 33, so WCRT = Max(25, 28, 35, 23, 30, 33) = 35 cycles • Slide annotations: tick alignment, scalability [Figure: T1, T2, T3 with costs A1 = 10, A2 = 15, A3 = 20, B1 = 10, B2 = 3, C1 = 5, C2 = 10]
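The two calculations above can be reproduced with a short sketch. It assumes each thread cycles through its ticks in lockstep with the others (T1 with period 3, T2 and T3 with period 2), which is what the six listed combinations suggest. The function names and the dictionary layout are illustrative only.

```python
from math import lcm

# Per-tick costs in cycles, taken from the motivating example.
costs = {"T1": [10, 15, 20], "T2": [10, 3], "T3": [5, 10]}


def max_plus(threads):
    """Sum of each thread's most expensive tick, ignoring tick alignment."""
    return sum(max(c) for c in threads.values())


def state_exploration(threads):
    """Largest total over the tick combinations that can actually co-occur."""
    period = lcm(*(len(c) for c in threads.values()))
    return max(
        sum(c[t % len(c)] for c in threads.values())
        for t in range(period)
    )


print(max_plus(costs))           # 40
print(state_exploration(costs))  # 35
```

Max-plus is cheap but pairs B1 with C2, a combination that never occurs together; the exhaustive version avoids this at the cost of visiting every state.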
Tradeoff [Figure: precision plotted against analysis time, with state exploration and max-plus marked]
Motivating example (modified) • B2 changes from 3 to 15 cycles • Other costs unchanged: A1 = 10, A2 = 15, A3 = 20; B1 = 10; C1 = 5, C2 = 10
Motivating example (modified): WCRT analysis • Max-plus: WCRT = Max(T1) + Max(T2) + Max(T3) = A3 + B2 + C2 = 20 + 15 + 10 = 45 cycles • State exploration: A1+B1+C1 = 25, A2+B2+C2 = 40, A3+B1+C1 = 35, A1+B2+C2 = 35, A2+B1+C1 = 30, A3+B2+C2 = 45, so WCRT = Max(25, 40, 35, 35, 30, 45) = 45 cycles [Figure: T1, T2, T3 with costs A1 = 10, A2 = 15, A3 = 20, B1 = 10, B2 = 15, C1 = 5, C2 = 10]
The problem statement [Figure: precision plotted against analysis time, with state exploration and max-plus marked]
Our approach • Our approach (ILPc) • Inspired by counterexample-guided model checking • Also has some ideas similar to local model checking • Flow: find the largest execution time using the max-plus approach, then verify tick alignment. Success: report the WCRT. Fail: repeat the search.
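The loop described above can be illustrated on the motivating example. This is only a sketch of the idea, not the authors' implementation: brute-force enumeration stands in for the ILP solver and the refinement step, and the lockstep cycling of threads is an assumption. All names are hypothetical.

```python
from itertools import product
from math import lcm


def feasible_ticks(threads):
    """Index tuples that can co-occur, assuming every thread cycles in lockstep."""
    lens = [len(c) for c in threads]
    return {tuple(t % n for n in lens) for t in range(lcm(*lens))}


def iterative_wcrt(threads):
    """Try candidates from the most expensive down; stop at the first aligned one."""
    feasible = feasible_ticks(threads)

    def cost(idx):
        return sum(c[i] for c, i in zip(threads, idx))

    # The first candidate is exactly the max-plus estimate.
    candidates = sorted(product(*(range(len(c)) for c in threads)),
                        key=cost, reverse=True)
    for iteration, idx in enumerate(candidates, start=1):
        if idx in feasible:          # tick alignment verified
            return cost(idx), iteration
        # Otherwise the candidate is rejected and the search continues.
    raise ValueError("no feasible tick combination")


original = [[10, 15, 20], [10, 3], [5, 10]]
modified = [[10, 15, 20], [10, 15], [5, 10]]   # B2 raised from 3 to 15

print(iterative_wcrt(original)[0])  # 35, after rejecting the 40-cycle estimate
print(iterative_wcrt(modified))     # (45, 1): the max-plus estimate is already aligned
```

In the modified example the first candidate passes the alignment check, so a single iteration suffices; in the original example at least one infeasible candidate has to be discarded first.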
Overview of ILPc [Flow diagram: TCCFG, ILP model]
ILP model • Objective function: • Conventional nodes: • Features: • Solver sets to 1 whenever possible. • Definitions: • EOT arrows [Figure: TCCFG with edges labelled E1 to E17]
ILP model • EOT: • Fork: 2 [Figure: TCCFG with edges labelled E1 to E17]
ILP model • Abort start: • Abort end: • Preemption: [Figure: TCCFG with edges labelled E1 to E17]
ILP • Compared with the conventional ILP: • Directly captures features of synchronous languages. • Gives more precise WCRT estimates with minimal overhead.
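For context, the conventional sequential ILP formulation widely used in WCET analysis (often called implicit path enumeration) looks roughly as follows. This is the textbook form, not the ILPc model from these slides, whose objective and constraints are not preserved in this text. The symbols are assumptions: $c_i$ is the cost of node $i$, $x_i$ its execution count, and $f_e$ the number of times edge $e$ is taken.

```latex
\begin{aligned}
\text{maximize}\quad   & \sum_{i \in N} c_i \, x_i \\
\text{subject to}\quad & x_i = \sum_{e \in \mathrm{in}(i)} f_e
                         && \forall i \in N \setminus \{\text{entry}\}, \\
                       & x_i = \sum_{e \in \mathrm{out}(i)} f_e
                         && \forall i \in N \setminus \{\text{exit}\}, \\
                       & x_{\text{entry}} = x_{\text{exit}} = 1, \\
                       & x_i,\, f_e \in \mathbb{Z}_{\ge 0}.
\end{aligned}
```

Loop bounds and infeasible-path constraints, which a practical analysis also needs, are omitted here for brevity.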
Overview of ILPc [Flow diagram: TCCFG, ILP model, tick expressions]
Tick Expressions 0 (1,3,5…)
Overview of ILPc [Flow diagram: TCCFG, ILP model, ILP solver, WCRT & execution path, tick expressions, check "Ticks can be aligned?"]
Verifying tick alignment 0 ILP model No integer solution
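A small sketch of what an alignment check could look like, assuming each tick expression is either a single tick (such as 0) or a periodic sequence (such as 1, 3, 5, …). Encoding an expression as an (offset, period) pair is an assumption made for illustration, and a bounded search replaces the integer program used on the slide.

```python
from math import lcm


def active(t, offset, period):
    """Is the expression active at tick t? period == 0 means only at `offset`."""
    if period == 0:
        return t == offset
    return t >= offset and (t - offset) % period == 0


def ticks_align(exprs):
    """Return the first tick at which every expression is active, or None."""
    periods = [p for _, p in exprs if p > 0]
    # Past the largest offset the joint pattern repeats every lcm(periods) ticks.
    horizon = max(o for o, _ in exprs) + (lcm(*periods) if periods else 0)
    for t in range(horizon + 1):
        if all(active(t, o, p) for o, p in exprs):
            return t
    return None


print(ticks_align([(0, 0), (1, 2)]))  # None: tick 0 never meets ticks 1, 3, 5, ...
print(ticks_align([(2, 3), (1, 2)]))  # 5
```

A result of None corresponds to the "no integer solution" outcome: the candidate path combines ticks that can never coincide.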
Overview of ILPc [Flow diagram: TCCFG, ILP model, ILP solver, WCRT & execution path, tick expressions, check "Ticks can be aligned?". Fail: refinement. Success: WCRT.]
Benchmarking • Compared with 3 existing approaches [1, 2, 3] • Conducted in 2 phases • Phase 1: Theoretical performance • Phase 2: Real-world applications • Benchmark computer: Windows based, quad-core 1.6 GHz CPU, 8 GB memory • [1] L. Ju, B. K. Huynh, S. Chakraborty, and A. Roychoudhury. Context-sensitive timing analysis of Esterel programs. DAC, 2009. • [2] S. Andalam, P. S. Roop, and A. Girault. Pruning infeasible paths for tight WCRT analysis of synchronous programs. DATE, 2011. • [3] M. Kuo, R. Sinha, and P. S. Roop. Efficient WCRT analysis of synchronous programs using reachability. DAC, 2011.
Scalability • Analysis time vs. program states. • Analysis time of ILPc • Time taken for each iteration. • Number of program states. • Number of iterations. • Structure and cost distribution.
Benchmarking: Phase 1 • Two sets of benchmarks • Set A: maximum number of iterations required to find the WCRT. • Set B: minimum number of iterations required to find the WCRT (iterations = 1).
Benchmarking: Phase 1 [Figure: structure of the Set A and Set B benchmarks, with node costs of 0, 5 and 10]
Benchmarking: Phase 1 Set A
Benchmarking: Phase 1 Set B
Benchmarking: Phase 1 • Same precision. • Analysis time of ILPc depends heavily on the number of iterations rather than the number of program states. • On average, the analysis time of ILPc should fall between the worst-case and best-case scenarios.
Benchmarking: Phase 2 • Benchmarks range from small to large. • L. H. Yoong and G. D. Shaw. Auckland function block benchmark. University of Auckland, 2010. www.ece.auckland.ac.nz/~pretzel/Auckland_FB_Benchmark.zip
Benchmarking: Phase 2 • Benchmarks 1-4 (fewer than 20 threads)
Benchmarking: Phase 2 • Benchmarks 5-7 (more than 20 threads) • Chart annotations: ">1 hr", "Out of Memory" (twice)