Accurate Timing Analysis by Modeling Caches, Speculation and their Interaction • Xianfeng Li, Tulika Mitra, Abhik Roychoudhury • National University of Singapore
Why Timing Analysis? • Timing guarantees for real-time embedded systems • Real-time scheduling: • Worst-case bound on execution time • Tasks are guaranteed to be schedulable irrespective of inputs • Tight bound to avoid idle processor cycles • Extremely important for safety-critical systems
Worst Case Execution Time (WCET) • Maximum execution time of a program on a micro-architecture over all possible inputs • Measurement • Execute program for all inputs: impractical • Execute program for selected inputs to get a lower bound on WCET (Observed WCET) • Analysis • Employ static analysis to compute an upper bound on WCET (Estimated WCET) • Observed WCET ≤ Actual WCET ≤ Estimated WCET
WCET Analysis • Program path analysis [Shaw’89, Healy’98, ...] • Not all paths in the program are feasible • Micro-architectural modeling • Dynamically variable instruction execution time • Cache, Pipeline [Li’99, Theiling’00, Schneider’99, ...] • Speculative execution (branch prediction) [Mitra’02] • Combined modeling of cache + speculative execution
Speculative Execution • No speculative execution • Misprediction • Correct prediction (Figure labels: blocks B, N, T, S; misprediction penalty)
Cache + Speculation: Destructive Effect • N and T map to the same cache block • Cache Miss 1: Loading into cache from speculated path • Cache Miss 2: Loading into cache from correct path (Figure labels: Cache, Execution; blocks B, N, T, S)
Destructive Effect: Extra Cache Misses • Cache miss penalty (CMP) along speculative path • Fully masked by branch misprediction penalty (BMP) • Partially masked by BMP: wait for cache miss to be serviced before executing correct path • Cache miss penalty along correct path due to fetch along speculative path (Figure labels: BMP, CMP)
Cache + Speculation: Constructive Effect • B and S map to the same cache block • Cache Miss 1: Loading into cache from speculated path • Cache Hit: Correct block already loaded into cache (Figure labels: Cache, Execution; blocks B, N, S)
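Both effects can be reproduced with a toy direct-mapped cache. The sketch below is only an illustration: the cache size, the memory-block numbers and the helper names (`DirectMappedCache`, `correct_path_misses`) are assumptions, not taken from the talk. It counts misses on the correct path with and without a preceding speculative fetch.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: one memory block per cache line."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = {}  # line index -> memory block currently held

    def fetch(self, mem_block):
        """Access a memory block; return True on a hit, False on a miss."""
        idx = mem_block % self.num_lines
        hit = self.lines.get(idx) == mem_block
        self.lines[idx] = mem_block  # on a miss the old block is evicted
        return hit


def correct_path_misses(correct, speculative=(), warm=(), num_lines=4):
    """Count misses on the correct path after optional speculative fetches."""
    cache = DirectMappedCache(num_lines)
    for block in warm:          # blocks already cached before the branch
        cache.fetch(block)
    for block in speculative:   # fetched along the mispredicted path
        cache.fetch(block)
    return sum(not cache.fetch(block) for block in correct)


# Hypothetical memory blocks: N = 1 and T = 5 share cache line 1 (4 lines).
N, T = 1, 5
# Destructive: speculative fetch of T evicts N, so N misses on the correct path.
destructive_without = correct_path_misses([N], warm=[N])
destructive_with = correct_path_misses([N], speculative=[T], warm=[N])

# Hypothetical blocks for the constructive case: T2 = 6, S = 3 (no conflicts).
T2, S = 6, 3
# Constructive: the speculative path already loaded S, so S hits later.
constructive_without = correct_path_misses([N, S], warm=[N])
constructive_with = correct_path_misses([N, S], speculative=[T2, S], warm=[N])
```

In this setup speculation adds one miss in the first scenario and removes one in the second, which is the asymmetry the analysis has to capture.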
Technique: Integer Linear Programming • Integrate program analysis and micro-architectural modeling in an ILP framework [Li and Malik 1995] • Input: • Control Flow Graph (CFG) of the program • User provided loop bounds, recursion depth etc. • Specification of micro-architecture • Objective function: Execution time (maximized) • Constraints • Flow constraints from Control Flow Graph • Constraints from micro-architectural modeling • ILP formulation of instruction cache + speculative exec.
Objective Function
$$\mathrm{WCET} = \sum_{B} \left( \mathit{cost}_B \times \mathit{count}_B + \mathit{BMP} \times \mathit{misprediction}_B + \mathit{CMP} \times \mathit{miss}_B + \mathit{mp\_delay}_B \right)$$
• $\mathit{cost}_B \times \mathit{count}_B$: Execution time of basic block B without cache misses and branch mispredictions
• $\mathit{BMP} \times \mathit{misprediction}_B$: Penalty due to mispredictions
• $\mathit{CMP} \times \mathit{miss}_B$: Penalty due to cache misses • Includes constructive and destructive effects of speculation along the correct path
• $\mathit{mp\_delay}_B$: Penalty due to partially masked cache misses along the speculative path (variable CMP)
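As a worked instance of the objective, the sketch below evaluates the sum for two basic blocks. BMP = 5 and CMP = 10 are the penalties quoted in the experimental setup of this talk; the per-block costs, counts, mispredictions, misses and delays are invented for illustration, and in the real analysis they are ILP variables rather than fixed numbers.

```python
from dataclasses import dataclass

BMP = 5   # branch misprediction penalty in cycles (from the experimental setup)
CMP = 10  # cache miss penalty in cycles (from the experimental setup)


@dataclass
class BlockTerms:
    cost: int           # cycles per execution, no misses or mispredictions
    count: int          # execution count of the block
    misprediction: int  # number of mispredictions of the block's branch
    miss: int           # instruction cache misses on the correct path
    mp_delay: int       # cycles lost to partially masked speculative misses


def wcet_objective(blocks):
    """Evaluate the objective: sum of the four penalty terms over all blocks."""
    return sum(
        b.cost * b.count + BMP * b.misprediction + CMP * b.miss + b.mp_delay
        for b in blocks
    )


# Hypothetical values for two blocks.
blocks = [
    BlockTerms(cost=5, count=11, misprediction=2, miss=1, mp_delay=0),   # 75
    BlockTerms(cost=8, count=111, misprediction=3, miss=2, mp_delay=7),  # 930
]
total = wcet_objective(blocks)  # 75 + 930 = 1005 cycles
```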
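Flow Constraints: Easy!!
Inflow = basic block execution count = outflow (bounds $\mathit{count}_B$):
• $e_{S,1} + e_{3,1} = \mathit{count}_1 = e_{1,2} + e_{1,4}$
• $e_{1,2} + e_{2,2} = \mathit{count}_2 = e_{2,3} + e_{2,2}$
• $e_{2,3} + e_{4,3} = \mathit{count}_3 = e_{3,1} + e_{3,E}$
• $e_{1,4} = \mathit{count}_4 = e_{4,3}$
Loop bounds (bound on maximum loop iterations):
• $e_{2,2} \le 100$
• $e_{3,1} \le 10$
(Figure: control flow graph with basic blocks B1, B2, B3, B4)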
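The flow constraints above are small enough to solve by exhaustive search, which makes the structure of the ILP easy to see. The sketch below is not the authors' analyzer (they feed the constraints to CPLEX). It assumes the program is entered once, reads the two loop bounds as upper limits, uses made-up per-block costs, and leaves out the cache and misprediction terms.

```python
from itertools import product

# Hypothetical per-block costs in cycles (not from the slides).
COST = {1: 5, 2: 8, 3: 4, 4: 3}
E_S1 = 1       # assumption: the entry edge S -> B1 is taken exactly once
MAX_E22 = 100  # loop bound on the self-loop B2 -> B2
MAX_E31 = 10   # loop bound on the back edge B3 -> B1


def solve():
    """Maximize sum(cost_B * count_B) subject to the flow constraints."""
    best = None
    for e31, e22 in product(range(MAX_E31 + 1), range(MAX_E22 + 1)):
        count1 = E_S1 + e31              # inflow of B1
        for e12 in range(count1 + 1):
            e14 = count1 - e12           # outflow of B1: e12 + e14 = count1
            count2 = e12 + e22           # inflow of B2
            e23 = count2 - e22           # outflow of B2: e23 + e22 = count2
            count4 = e14                 # inflow of B4
            e43 = count4                 # outflow of B4
            count3 = e23 + e43           # inflow of B3
            e3E = count3 - e31           # outflow of B3: e31 + e3E = count3
            if e3E < 0:
                continue                 # edge counts must be non-negative
            counts = {1: count1, 2: count2, 3: count3, 4: count4}
            time = sum(COST[b] * counts[b] for b in COST)
            if best is None or time > best[0]:
                best = (time, counts)
    return best


best_time, best_counts = solve()
```

With these costs the search settles on the path through B2 in every outer iteration, with the inner loop at its bound. A real ILP solver reaches the same optimum without enumerating the assignments.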
Other Constraints • Branch misprediction constraints • Bound misprediction_B • Details appeared in an earlier paper • Timing Analysis of Embedded Software for Speculative Processors. T. Mitra, A. Roychoudhury and X. Li. In ACM Intl. Symposium on System Synthesis (ISSS) 2002 • Instruction cache miss constraints • Bound miss_B [Li, Malik and Wolfe 1999]
Modeling Cache-Speculation Interaction • Modify instruction cache miss constraints to model the constructive/destructive effect of speculation along the correct path • Add constraints on mp_delay_B: Penalty due to partially masked cache misses along the speculative path
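Modeling Instruction Cache • Cache Conflict Graph: flow among blocks mapping to the same cache line • $p_{S,1} + p_{3,1} = \mathit{count}_1 = p_{1,3}$ • $\mathit{miss}_1 = p_{S,1} + p_{3,1}$ (Figure: control flow graph with S, B1, B2, B3, B4, E; cache conflict graph over S, B1, B3, E with edges $p_{S,1}$, $p_{1,3}$, $p_{3,1}$, $p_{3,E}$)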
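To make the conflict-graph flows concrete, the sketch below derives them from one execution trace. The trace, the block-to-line mapping and the helper name `conflict_flows` are hypothetical, and each basic block is assumed to occupy a single memory block. The static analysis never sees a trace; it treats the p variables as unknowns constrained by the equations above.

```python
from collections import Counter


def conflict_flows(trace, line_of, line):
    """Count transitions between consecutive blocks that use one cache line.

    'S' and 'E' stand for the start and end nodes of the conflict graph.
    """
    seq = ["S"] + [b for b in trace if line_of[b] == line] + ["E"]
    return Counter(zip(seq, seq[1:]))


# Hypothetical trace and mapping: B1 and B3 share cache line 0.
trace = ["B1", "B2", "B3", "B1", "B4", "B3"]
line_of = {"B1": 0, "B2": 1, "B3": 0, "B4": 2}

p = conflict_flows(trace, line_of, line=0)
count1 = trace.count("B1")

# Inflow and outflow of B1 in the conflict graph both equal count1.
inflow1 = p[("S", "B1")] + p[("B3", "B1")]
outflow1 = p[("B1", "B3")]

# B1 misses whenever the line was empty (S) or last held B3.
miss1 = p[("S", "B1")] + p[("B3", "B1")]
```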
Constructive Effect of Speculation • miss_3 will decrease by the amount of flow between B3 (2,T) and B3 (Figure: control flow graph of B1, B2, B3, B4 with T/N branch edges; cache conflict graph with extra node B3 (2,T) on the speculative path, whose miss has a partially masked CMP, followed by a hit on B3 along the correct path)
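Destructive Effect of Speculation • miss_2 will increase by the amount of flow between B4 (1,N) and B2 (Figure: control flow graph of B1, B2, B3, B4 with T/N branch edges; cache conflict graph over B2 and B4 with extra node B4 (1,N) on the speculative path, whose miss has a partially masked CMP; labels Hit, Miss, Speculative Path, Correct Path)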
General Flow Involving Extra Nodes (Figure: Cases 1 to 4 of flow in the cache conflict graph among blocks b, b1, n, n1 and extra nodes m (b,X), m1 (b,X), m2 (b1,Y), for branch outcomes X and Y)
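Additional Constraints
(Figure: speculative path of branch b with outcome X through blocks B1, B2, ..., Bn; timeline labels BMP and CMP > BMP)
$$\mathit{count}(m_i(b,X)) = \mathit{misprediction}(b,X) - \sum_{k=1}^{i-1} \mathit{miss}(m_k(b,X))$$
$$\mathit{mp\_delay}(b,X) = \sum_{k=1}^{n} \mathit{miss}(m_k(b,X)) \times \mathit{delay}(m_k(b,X))$$
$$\mathit{delay}(m_i(b,X)) = \mathit{CMP} - \left( \mathit{BMP} - \sum_{k=1}^{i-1} \mathit{cost}(m_k(b,X)) \right)$$
And some others ...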
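A small numeric instance may help in reading these three constraints. The sketch below uses BMP = 5 and CMP = 10 from the experimental setup; the block costs, misprediction count and per-block misses are hypothetical. One plausible reading, since CMP > BMP, is that a miss in a speculative block outlasts the misprediction window, so later speculative blocks are not fetched in that instance. That reading is an interpretation, not a statement from the slides.

```python
CMP = 10  # cache miss penalty in cycles (from the experimental setup)
BMP = 5   # branch misprediction penalty in cycles (from the experimental setup)


def delays(costs):
    """delay(m_i) = CMP - (BMP - sum of costs of m_1 .. m_{i-1})."""
    out, elapsed = [], 0
    for cost in costs:
        out.append(CMP - (BMP - elapsed))
        elapsed += cost
    return out


def counts(mispredictions, misses):
    """count(m_i) = misprediction - sum of misses in m_1 .. m_{i-1}."""
    out, earlier_misses = [], 0
    for miss in misses:
        out.append(mispredictions - earlier_misses)
        earlier_misses += miss
    return out


def mp_delay(misses, block_delays):
    """mp_delay = sum over k of miss(m_k) * delay(m_k)."""
    return sum(m * d for m, d in zip(misses, block_delays))


# Hypothetical speculative path m_1, m_2, m_3 after branch b with outcome X.
costs = [2, 2, 3]   # cycles per block
misses = [1, 2, 0]  # speculative misses attributed to each block
d = delays(costs)           # [5, 7, 9]
c = counts(4, misses)       # [4, 3, 1] with 4 mispredictions of (b, X)
total = mp_delay(misses, d) # 1*5 + 2*7 + 0*9 = 19 cycles
```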
Experimental Methodology • Observed WCET: simulation • SimpleScalar cycle-accurate architectural simulator • In-order execution, no pipeline, no data cache misses • Branch misprediction penalty = 5 cycles • Cache miss penalty = 10 cycles • Estimated WCET: Prototype analyzer • Input: benchmark in assembly code, micro-architectural parameters, loop bounds • Output: ILP constraints • Feed the constraints to CPLEX: a commercial ILP solver
Summary • Micro-architectural modeling is crucial for tight estimation of Worst Case Execution Time (WCET) • Existing methods typically focus on a single micro-architectural feature • Cache • Pipeline • Speculation • A step towards combining micro-architectural features that affect each other • Cache misses/hits due to speculation