
A Unified WCET Analysis Framework for Multi-core Platforms

Sudipta Chattopadhyay, Chong Lee Kee, Abhik Roychoudhury (National University of Singapore); Timon Kelter, Peter Marwedel (TU Dortmund, Germany); Heiko Falk (Ulm University, Germany).


Presentation Transcript


  1. A Unified WCET Analysis Framework for Multi-core Platforms • Sudipta Chattopadhyay, Chong Lee Kee, Abhik Roychoudhury (National University of Singapore) • Timon Kelter, Peter Marwedel (TU Dortmund, Germany) • Heiko Falk (Ulm University, Germany) • RTAS 2012, Beijing

  2. Timing Analysis • Hard real-time systems require absolute timing guarantees • System-level analysis • Single-task analysis • Worst-case execution time (WCET) analysis • An upper bound on execution time for all possible inputs • A sound over-approximation is obtained by static analysis

  3. WCET Analysis (diagram) • Program • Micro-architectural modeling • WCET of basic blocks • Path analysis by IPET, using infeasible path constraints, loop bound constraints and control flow graph constraints • IPET = Implicit Path Enumeration Technique
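For reference, the path-analysis step named on this slide is usually written as an integer linear program. What follows is the textbook IPET formulation, not something taken from the slides. The symbols $c_B$ (WCET of basic block $B$), $x_B$ (execution count of $B$), $d_e$ (execution count of CFG edge $e$) and $n_\ell$ (bound of loop $\ell$) are notation introduced here, and the entry and exit blocks are assumed to carry one virtual incoming or outgoing edge each.

```latex
\begin{align*}
\text{WCET} \;=\; \max \; & \sum_{B} c_B \, x_B \\
\text{subject to} \quad
  & x_B \;=\; \sum_{e \in \mathrm{in}(B)} d_e \;=\; \sum_{e \in \mathrm{out}(B)} d_e
    && \text{for every basic block } B, \\
  & d_{\mathrm{entry}} \;=\; 1
    && \text{for the virtual entry edge}, \\
  & x_{\ell} \;\le\; n_{\ell} \cdot \sum_{e \in \mathrm{entry}(\ell)} d_e
    && \text{for every loop header } \ell .
\end{align*}
```

Infeasible path constraints enter the same program as additional linear inequalities over the $x_B$ and $d_e$ variables.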

  4. Architecture (diagram) • Core 1 … Core n, each with its own L1 cache • Shared bus • Shared L2 cache • Memory

  5. Micro-architectural Modeling (diagram) • Single core: branch predictor, cache, pipeline and their interactions • Multi-core: shared cache, shared bus • Prior work cited on the slide: Li et al., RTSS'09; Chattopadhyay et al., SCOPES'10; Kelter et al., ECRTS'11; Rosen et al., RTSS'07 • This work: unified multi-core timing analysis

  6. Timing Anomaly (Shared Cache) (diagram of execution paths annotated with cache hit and miss outcomes) • May not be the worst-case path

  7. Timing Anomaly (Shared Bus) (diagram of execution paths annotated with bus delays delay_max and delay_min) • May not be the worst-case path

  8. Background • Each pipeline stage is represented as a timing interval (diagram labels: start, finish, latency; intervals [1,3], [3,7], [4,10]) • Example instructions, each passing through the stages IF, ID, EX, WB, CM: R1 := R2 + 5; R5 := R1 * R7; R3 := R5 * 5 • Diagram edges mark structural dependency and contention • A fixed-point analysis derives the timing of each stage as an interval
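The interval representation on this slide can be illustrated with a small sketch. It assumes that a stage starts once all stages it depends on have finished and that its finish interval is the start interval plus a latency interval. Under that reading, the three intervals on the slide are consistent with start [1,3] plus latency [3,7] giving finish [4,10], although the pairing of labels and values is itself an assumption. The function names and the three-stage example are hypothetical, and contention is ignored, so one pass in dependency order suffices here where the analysis on the slide iterates to a fixed point.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (earliest, latest) time in cycles


def add(a: Interval, b: Interval) -> Interval:
    """Interval addition: add lower bounds and upper bounds separately."""
    return (a[0] + b[0], a[1] + b[1])


def latest_of(a: Interval, b: Interval) -> Interval:
    """Interval of max(x, y) for x in a and y in b."""
    return (max(a[0], b[0]), max(a[1], b[1]))


def finish_times(latency: Dict[str, Interval],
                 deps: Dict[str, List[str]],
                 order: List[str]) -> Dict[str, Interval]:
    """Finish interval of every stage, visiting stages in dependency order."""
    finish: Dict[str, Interval] = {}
    for node in order:
        start: Interval = (0, 0)
        for pred in deps.get(node, []):
            # A stage cannot start before each predecessor has finished.
            start = latest_of(start, finish[pred])
        finish[node] = add(start, latency[node])
    return finish


# Hypothetical latencies for three stages of one instruction.
latency = {"IF": (1, 2), "ID": (1, 1), "EX": (1, 4)}
deps = {"ID": ["IF"], "EX": ["ID"]}
result = finish_times(latency, deps, ["IF", "ID", "EX"])
print(result)
```

Keeping the lower and upper bounds separate is what lets the analysis carry uncertainty, for example from an unclear cache access, through later stages without enumerating each possible outcome.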

  9. Shared Cache + Pipeline • Abstract interpretation classifies each access as hit, miss or unclear • The timing interval T is updated by classification • L1 hit: T := T + [1, 1] • L1 miss, L2 (shared) hit: T := T + [miss1 + 1, miss1 + 1] • L1 miss, L2 (shared) unclear: T := T + [miss1 + 1, miss1 + miss2 + 1] • L1 unclear: T := T + [1, miss1 + miss2 + 1] • hit latency = 1 cycle; miss1 = L1 cache miss penalty; miss2 = L2 cache miss penalty
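The interval updates on this slide can be sketched as a lookup from cache classifications to a latency interval. The cases for an L1 hit, an L1 miss with an L2 hit, an L1 miss with an unclear L2, and an unclear L1 follow the slide. The L2 miss case and the combination of an unclear L1 with a definite L2 outcome are not shown there and are filled in here as assumptions, as are the function names and the penalties used in the example.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (best case, worst case) in cycles

HIT_LATENCY = 1  # cycles, as stated on the slide
CLASSES = {"hit", "miss", "unclear"}


def access_latency(l1: str, l2: str, miss1: int, miss2: int) -> Interval:
    """Latency interval of one memory access from its L1/L2 classification.

    miss1 and miss2 are the L1 and L2 miss penalties in cycles.
    """
    if l1 not in CLASSES or l2 not in CLASSES:
        raise ValueError("classification must be hit, miss or unclear")
    if l1 == "hit":
        return (HIT_LATENCY, HIT_LATENCY)
    # L1 is a miss or unclear, so the L2 classification matters.
    if l2 == "hit":
        lo = hi = miss1 + HIT_LATENCY
    elif l2 == "miss":
        # Assumption: this case is not shown on the slide.
        lo = hi = miss1 + miss2 + HIT_LATENCY
    else:
        lo = miss1 + HIT_LATENCY
        hi = miss1 + miss2 + HIT_LATENCY
    if l1 == "unclear":
        # The access may still hit in L1, so the best case is a plain hit.
        lo = HIT_LATENCY
    return (lo, hi)


def advance(t: Interval, latency: Interval) -> Interval:
    """T := T + latency, applied to both bounds."""
    return (t[0] + latency[0], t[1] + latency[1])


# Hypothetical penalties: 6 cycles for an L1 miss, 30 for an L2 miss.
t = advance((0, 0), access_latency("miss", "unclear", 6, 30))
t = advance(t, access_latency("hit", "unclear", 6, 30))
print(t)
```

An unclear classification only widens the interval. This is how the abstraction avoids splitting the analysis state into a hit branch and a miss branch at every uncertain access.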

  10. Shared Bus Analysis • Time Division Multiple Access (TDMA) • Offset abstraction (diagram: TDMA rounds with alternating Core 0 and Core 1 slots; labels: offset, round, delay, delay = 0, T' (core 0), T (core 1))
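A minimal sketch of the offset abstraction, assuming a round-robin TDMA schedule in which every core owns one slot of equal length per round and a request issued inside the core's own slot is granted at once (the "delay = 0" case in the diagram). The transfer time itself is ignored, and the function names, slot length and core numbering are hypothetical.

```python
from typing import Iterable, Tuple


def tdma_delay(offset: int, core: int, n_cores: int, slot_len: int) -> int:
    """Cycles a request waits for the bus when issued at `offset`.

    `offset` is the issue time relative to the start of a TDMA round;
    core i is assumed to own cycles [i * slot_len, (i + 1) * slot_len).
    """
    round_len = n_cores * slot_len
    offset %= round_len
    slot_start = core * slot_len
    if slot_start <= offset < slot_start + slot_len:
        return 0  # issued inside the core's own slot
    # Otherwise wait for the next start of the core's slot.
    return (slot_start - offset) % round_len


def delay_bounds(offsets: Iterable[int], core: int,
                 n_cores: int, slot_len: int) -> Tuple[int, int]:
    """Best and worst bus delay over a set of possible offsets."""
    delays = [tdma_delay(o, core, n_cores, slot_len) for o in offsets]
    return (min(delays), max(delays))


# Two cores with 10-cycle slots, so one round lasts 20 cycles.
print(tdma_delay(3, 0, 2, 10))   # core 0 is inside its own slot
print(tdma_delay(3, 1, 2, 10))   # core 1 waits for cycle 10
print(delay_bounds({3, 12}, 1, 2, 10))
```

Because the delay depends only on the offset within the round and not on absolute time, the analysis can track a set of possible offsets in place of exact start times.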

  11. Shared Bus + Pipeline (diagram: stages IF1, ID1, IF2, ID2, IF3, ID3; outgoing offsets O1 and O2; incoming offset Oin) • Timing is approximated by static analysis, so either IF2 finishes after ID1 or ID1 finishes after IF2 • If the order is known, Oin = O1 or Oin = O2 accordingly • If the order is unknown, Oin = O1 ∪ O2 • Property: offset content monotonically decreases over different iterations

  12. Loop Construct • Ci = bus context of the loop body at the i-th iteration • Bus contexts: C1, C2, C3, …, C100 • Unrolling loop iterations is EXPENSIVE

  13. Loop Construct • Bus context flow graph (diagram nodes: C1, C2, C3, C4, C5, C5, C3) • How do we define bus context? • Property: If Ci Cj, then Ci+k  Cj+k for any k > 0

  14. Loop Construct • Bus context flow graph (diagram nodes: C1, C2, C3, C4) • How do we define bus context? • Bus offsets of all pipeline stages of all instructions? • There could be thousands of nodes

  15. Loop Construct (diagram: pipeline stages IF, ID, EX, WB, CM of the previous iteration and the current iteration) • How do we define bus context? • Property: If the bus offsets of the cross-iteration edges do not change, the WCET of the loop iteration cannot change

  16. Loop Construct • Bus context flow graph (diagram nodes: C1, C2, C3, C4) • Compute WCET for each bus context • Generate ILP flow constraints: E(C1) + E(C2) + E(C3) + E(C4) ≤ loop bound; E(C1) ≥ E(C2) • E(C1) = number of times context C1 is executed
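To make the flow constraints on this slide concrete, the sketch below maximises the total loop time over the execution counts E(Ci) by brute-force enumeration, which is only practical for tiny instances. It stands in for a real ILP solver and is not the solver used by the authors. The per-context WCET values and the loop bound are made up for illustration, and the function name is hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def worst_case_loop_time(wcet: Sequence[int], loop_bound: int,
                         at_least: List[Tuple[int, int]]
                         ) -> Tuple[int, Tuple[int, ...]]:
    """Maximise sum(wcet[i] * E[i]) over integer counts E.

    Constraints, following the slide:
      sum(E) <= loop_bound
      E[a] >= E[b] for every pair (a, b) in `at_least`
    """
    best_cost, best_counts = -1, ()
    for counts in product(range(loop_bound + 1), repeat=len(wcet)):
        if sum(counts) > loop_bound:
            continue
        if any(counts[a] < counts[b] for a, b in at_least):
            continue
        cost = sum(w * e for w, e in zip(wcet, counts))
        if cost > best_cost:
            best_cost, best_counts = cost, counts
    return best_cost, best_counts


# Hypothetical WCETs (cycles) for contexts C1..C4 and a loop bound of 10.
# The pair (0, 1) encodes E(C1) >= E(C2).
cost, counts = worst_case_loop_time([120, 150, 110, 100], 10, [(0, 1)])
print(cost, counts)
```

In this made-up instance C2 has the largest per-iteration WCET, but the constraint E(C1) ≥ E(C2) stops the solver from charging all ten iterations to C2. The optimum splits them evenly between C1 and C2.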

  17. Branch Prediction + Cache (diagram labels: memory blocks m and m', cache conflict, cache hit, cache miss, m evicted from cache, branch correctly predicted, branch incorrectly predicted)

  18. Branch Prediction + Cache (diagram labels: cache content, branch location, JOIN, maximum number of speculated instructions, memory blocks m and m', unclear cache access)

  19. Overall Picture (diagram) • Single core: branch predictor, cache, pipeline • Multi-core: shared cache, shared bus • WCET of basic blocks • Path analysis by IPET, using infeasible path constraints, loop bound constraints and bus context constraints

  20. Experimental Setup (Chronos Toolkit) (diagram) • C source → GCC (SimpleScalar) → binary code → CFG • Micro-architectural modeling: private cache, pipeline, branch prediction, shared cache, shared bus • Flow constraints and micro-architectural constraints → ILP → WCET

  21. Cache Sharing vs Cache Partitioning (diagram for Core 1 and Core 2, with size labels 4 and 8) • Cache shared between 2 cores • Horizontal partitioning • Vertical partitioning

  22. Evaluation (Cache + Pipeline) (charts; benchmarks labelled include jfdctint and statemate) • Imprecision of shared cache analysis

  23. Evaluation (Cache + Pipeline + Speculation) (charts) • Imprecision of modeling speculation

  24. Evaluation (Bus + Pipeline) (charts) • Imprecision of shared bus analysis • Imprecision of path analysis

  25. Evaluation (Bus + Pipeline + Speculation) (charts) • Imprecision of path analysis • Imprecision of shared bus analysis

  26. Conclusion • A unified WCET analysis framework • Handles the interaction of shared cache and shared bus with pipeline and branch prediction • Timing anomalies are possible; state explosion is handled by the timing interval abstraction • Detailed information on the tool and extensive results are available at: http://www.comp.nus.edu.sg/~rpembed/chronos-multi-core.html

  27. Questions • Thank you
