
Precision Timed Embedded Systems Using TickPAD Memory

Matthew M Y Kuo*, Partha S Roop*, Sidharta Andalam†, Nitish Patel*. *University of Auckland, New Zealand; †TUM CREATE, Singapore.


Presentation Transcript


  1. Precision Timed Embedded Systems Using TickPAD Memory Matthew M Y Kuo* Partha S Roop* Sidharta Andalam† Nitish Patel* *University of Auckland, New Zealand †TUM CREATE, Singapore

  2. Introduction • Hard real-time systems • Need to meet real-time deadlines • Catastrophic events may occur when deadlines are missed • Synchronous execution approach • Good for hard real-time systems • Deterministic • Reactive • Aids static timing analysis • Well-bounded programs • No unbounded loops or recursion

  3. Synchronous Languages • Execute in logical time • Ticks • Sample input → computation → emit output • Synchronous hypothesis • Ticks are instantaneous • Assumes the system executes infinitely fast • System is faster than the environment's response • Worst case reaction time (WCRT) • Time between two logical ticks • Languages • Esterel • Scade • PRET-C • Extension to C
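The tick cycle listed above (sample input, compute, emit output) can be sketched in plain C. This is only an illustration of the execution model, not PRET-C itself: the names `run_ticks`, `reaction_fn` and `accumulate` are hypothetical and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reaction function: maps one sampled input, plus the state
 * carried over from earlier ticks, to one output. */
typedef int (*reaction_fn)(int input, int *state);

/* Run one logical tick per input sample: sample -> compute -> emit.
 * Returns the number of ticks executed. */
size_t run_ticks(const int *inputs, size_t n, int *outputs,
                 reaction_fn react, int *state)
{
    size_t tick;
    assert(inputs != NULL && outputs != NULL && react != NULL);
    for (tick = 0; tick < n; tick++) {
        int sampled = inputs[tick];          /* sample input */
        int result = react(sampled, state);  /* computation  */
        outputs[tick] = result;              /* emit output  */
    }
    return tick;
}

/* Example reaction (an assumption for illustration): a running sum. */
int accumulate(int input, int *state)
{
    *state += input;
    return *state;
}
```

Calling `run_ticks` with `accumulate` over the inputs 1, 2, 3 and an initial state of 0 produces one output per tick. Under the synchronous hypothesis each loop iteration is treated as instantaneous; the time one iteration really takes is what the worst case reaction time bounds.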


  5. PRET-C • Light-weight multithreading in C • Provides thread-safe memory access • C extension implemented as C macros

  6. Introduction • Practical systems require larger memory • Not all applications fit in on-chip memory • Require a memory hierarchy • Processor-memory gap [1] Hennessy, John L., and David A. Patterson. Computer Architecture: A Quantitative Approach. San Francisco, CA: Morgan Kaufmann, 2011.

  7. Introduction • Traditional approaches • Caches • Scratchpads • However, there is scant research on memory architectures tailored for synchronous execution and concurrency.

  8. Caches [Diagram: CPU, Main Memory]

  9. Caches • Traditional caches • Small, fast piece of memory • Exploit temporal locality • Exploit spatial locality • Hardware controlled • Replacement policy [Diagram: CPU, Cache, Main Memory]

  10. Caches • Hard real-time systems • Need to model the architecture • Compute the WCRT • Cache models • Trade-off between analysis computation time and tightness • Very tight worst-case estimates are not scalable [Diagram: CPU, Cache, Main Memory]

  11. Scratchpad • Scratchpad Memory (SPM) • Software controlled • Statically allocated • Statically or dynamically loaded • Requires an allocation algorithm • e.g. ILP, greedy [Diagram: CPU, SPM, Main Memory]

  12. Scratchpad • Hard real-time systems • Easy to compute a tight WCRT • Reduces the worst case performance • Balance between the number of reload points and overheads • May perform worse than a cache in the worst case [Diagram: CPU, SPM, Main Memory]

  13. TickPAD • Cache: good overall performance; hardware controlled • SPM: good worst-case performance; easy for fast and tight static analysis [Diagram labels: CPU, Main Memory, Cache, SPM]

  14. TickPAD • Cache: good overall performance; hardware controlled • SPM: good worst-case performance; easy for fast and tight static analysis [Diagram labels: CPU, Main Memory, TPM, Cache, SPM]

  15. TickPAD • TickPAD Memory (TPM) • TickPAD: Tick Precise Allocation Device • Memory controller • Hybrid between caches and scratchpads • Hardware-controlled features • Static software allocation • Tailored for synchronous languages • Instruction memory [Diagram: CPU, TPM, Main Memory]

  16. TickPAD Design flow

  17. PRET-C [Diagram: main and threads t1, t2, t3]

      int main() {
          init();
          PAR(t1, t2, t3);
          ...
      }

      void thread t1() {
          compute;
          EOT;
          compute;
          EOT;
      }

  18. PRET-C • Computation

  19. PRET-C • Spawn children threads

  20. PRET-C • End of tick: synchronization boundaries

  21. PRET-C • Child thread terminates

  22. PRET-C • Main thread resumes

  23. PRET-C Execution • Sample inputs [Timeline diagram: threads main, t1, t2, t3 against time]

  24. PRET-C Execution [Timeline: main]

  25. PRET-C Execution [Timeline: main, t1]

  26. PRET-C Execution [Timeline: main, t1, t2]

  27. PRET-C Execution [Timeline: main, t1, t2, t2]

  28. PRET-C Execution • Emit outputs [Timeline: main, t1, t2, t2]

  29. PRET-C Execution • 1 tick (reaction time) [Timeline: main, t1, t2, t2]

  30. PRET-C Execution • Local tick [Timeline: main, t1, t2, t2]

  31. Assumptions • 4 instructions per cache line • 1 cache line takes 1 burst transfer from main memory • A cache miss takes 38 clock cycles [2] • Buffers are 1 cache line in size • Each instruction takes 2 cycles to execute [2] J. Whitham and N. Audsley. The Scratchpad Memory Management Unit for Microblaze: Implementation, Testing, and Case Study. Technical Report YCS-2009-439, University of York, 2009.

  32. TickPAD - Overview

  33. TickPAD - Overview • Spatial memory pipeline • To accelerate linear code

  34. TickPAD - Overview • Associative loop memory • For predictable temporal locality • Statically allocated and Dynamically loaded

  35. TickPAD - Overview • Tick address queue • Stores the resumption addresses of active threads
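A queue of resumption addresses can be pictured as a small ring buffer. The sketch below is a software illustration only: the slides do not describe the hardware structure, so the FIFO ordering, the capacity `TAQ_CAPACITY`, the 32-bit address width and all `taq_*` names are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TAQ_CAPACITY 8u  /* hypothetical capacity, not from the slides */

/* Ring buffer of thread resumption addresses (assumed FIFO order). */
typedef struct {
    uint32_t addr[TAQ_CAPACITY];
    unsigned head;   /* index of the oldest entry */
    unsigned count;  /* number of stored entries  */
} tick_addr_queue;

void taq_init(tick_addr_queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Record where a thread should resume; returns false if the queue is full. */
bool taq_push(tick_addr_queue *q, uint32_t resume_addr)
{
    assert(q != NULL);
    if (q->count == TAQ_CAPACITY)
        return false;
    q->addr[(q->head + q->count) % TAQ_CAPACITY] = resume_addr;
    q->count++;
    return true;
}

/* Fetch the resumption address of the next thread; false if none is queued. */
bool taq_pop(tick_addr_queue *q, uint32_t *resume_addr)
{
    assert(q != NULL && resume_addr != NULL);
    if (q->count == 0)
        return false;
    *resume_addr = q->addr[q->head];
    q->head = (q->head + 1) % TAQ_CAPACITY;
    q->count--;
    return true;
}
```

In this sketch a thread reaching a tick boundary would push its resumption address, and the controller would pop the oldest address to decide which instructions to fetch next.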

  36. TickPAD - Overview • Tick instruction buffer • Stores the instructions at the resumption of the next active thread • To reduce context switching overhead at state/tick boundaries

  37. TickPAD - Overview • Command table • Stores a set of commands to be executed by the TickPAD controller.

  38. TickPAD - Overview • Command buffer • A buffer to store operands fetched from main memory • For commands requiring 2+ operands

  39. Spatial Memory Pipeline • Cache, on a miss • Fetches the line from main memory into the cache • First instruction misses, subsequent instructions on that line hit • Timing analysis requires the cache history • Scratchpad, unallocated code • Executes from main memory • Miss cost for all instructions • Simple timing analysis
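Using the figures from the Assumptions slide (4 instructions per line, 38 cycles per miss, 2 cycles per instruction), the two cases above can be compared for straight-line code. This is a rough sketch, assuming that the miss penalty is added on top of the execution time, that the cache starts cold, and that the code begins on a line boundary; the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Timing figures taken from the Assumptions slide. */
enum {
    INSTR_PER_LINE = 4,   /* 4 instructions per cache line       */
    MISS_CYCLES    = 38,  /* one burst transfer from main memory */
    EXEC_CYCLES    = 2    /* execution time of one instruction   */
};

/* Cold cache, straight-line code: one miss per line touched,
 * plus the execution time of every instruction. */
unsigned cache_linear_cycles(unsigned n_instr)
{
    unsigned lines;
    assert(n_instr <= UINT_MAX / (MISS_CYCLES + EXEC_CYCLES)); /* no overflow */
    lines = (n_instr + INSTR_PER_LINE - 1) / INSTR_PER_LINE;
    return lines * MISS_CYCLES + n_instr * EXEC_CYCLES;
}

/* Code not allocated to the scratchpad executes from main memory,
 * so every instruction pays the miss cost. */
unsigned spm_unallocated_cycles(unsigned n_instr)
{
    assert(n_instr <= UINT_MAX / (MISS_CYCLES + EXEC_CYCLES)); /* no overflow */
    return n_instr * (MISS_CYCLES + EXEC_CYCLES);
}
```

For 8 instructions this model gives 2 × 38 + 8 × 2 = 92 cycles with the cache and 8 × 40 = 320 cycles when executing unallocated code from main memory.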

  40. Spatial Memory Pipeline • Memory controller • Single line buffer • Simple analysis • Analyse previous instruction • First instruction misses, subsequent instructions on that line hit [Diagram: Main Memory, CPU]
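With a single line buffer, whether a fetch hits depends only on the previously executed instruction, which is what keeps the analysis simple. A minimal sketch of that rule follows, using the timing figures from the Assumptions slide. Instructions are identified by index rather than byte address to avoid assuming an instruction width, the buffer is assumed to start empty, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    INSTR_PER_LINE = 4,   /* 4 instructions per line             */
    MISS_CYCLES    = 38,  /* one burst transfer from main memory */
    EXEC_CYCLES    = 2    /* execution time of one instruction   */
};

/* Line that holds the instruction with the given index. */
uint32_t line_of(uint32_t instr_index)
{
    return instr_index / INSTR_PER_LINE;
}

/* Cycles to fetch and execute `cur` when `prev` was executed just before it.
 * The buffer holds the line of `prev`, so only a line change causes a miss. */
unsigned step_cycles(uint32_t prev, uint32_t cur)
{
    unsigned fetch = (line_of(prev) == line_of(cur)) ? 0u : (unsigned)MISS_CYCLES;
    return fetch + EXEC_CYCLES;
}

/* Total cycles for a trace of instruction indices, starting with an
 * empty buffer (the first instruction always misses). */
unsigned trace_cycles(const uint32_t *trace, size_t n)
{
    unsigned total = 0;
    size_t i;
    assert(trace != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (i == 0)
            total += MISS_CYCLES + EXEC_CYCLES;
        else
            total += step_cycles(trace[i - 1], trace[i]);
    }
    return total;
}
```

For the trace 0, 1, 2, 3, 4 the first and last instructions miss and the rest hit, giving 40 + 2 + 2 + 2 + 40 = 86 cycles.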

  41. Spatial Memory Pipeline • Computation requires many lines of instructions • Exploit spatial locality • Predictably prefetch the next line of instructions • Add another buffer

  42. Spatial Memory Pipeline • To preserve determinism • Prefetch is only active if there is no branch
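The benefit of the second buffer on branch-free code can be estimated with a small timing model built on the Assumptions slide. The overlap behaviour here is an assumption, since the slides do not spell it out: the prefetch of the next line is taken to start when the current line begins executing, and main memory serves one burst at a time. The function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

enum {
    INSTR_PER_LINE   = 4,   /* 4 instructions per line             */
    MISS_CYCLES      = 38,  /* one burst transfer from main memory */
    EXEC_CYCLES      = 2,   /* execution time of one instruction   */
    LINE_EXEC_CYCLES = INSTR_PER_LINE * EXEC_CYCLES
};

/* Single buffer: every line is fetched, then executed, with no overlap. */
unsigned single_buffer_linear_cycles(unsigned n_lines)
{
    assert(n_lines <= UINT_MAX / (MISS_CYCLES + LINE_EXEC_CYCLES));
    return n_lines * (MISS_CYCLES + LINE_EXEC_CYCLES);
}

/* Two buffers, branch-free code: while one line executes, the next line
 * is prefetched (assumed to start when the current line starts executing). */
unsigned prefetch_linear_cycles(unsigned n_lines)
{
    unsigned now = 0;              /* current time in cycles              */
    unsigned ready = MISS_CYCLES;  /* time the next line to run is loaded */
    unsigned k;
    assert(n_lines <= UINT_MAX / (MISS_CYCLES + LINE_EXEC_CYCLES));
    for (k = 0; k < n_lines; k++) {
        unsigned start = (now > ready) ? now : ready;  /* wait for the line */
        ready = start + MISS_CYCLES;       /* prefetch of the following line */
        now = start + LINE_EXEC_CYCLES;    /* execute the 4 instructions     */
    }
    return now;
}
```

Under these assumptions three consecutive lines take 3 × 46 = 138 cycles with one buffer and 3 × 38 + 8 = 122 cycles with prefetching. Because a fetch (38 cycles) is longer than executing a line (8 cycles), the saving per additional line is the 8 cycles of execution hidden behind the transfer.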

  43. Spatial Memory Pipeline

