Tornado: Maximizing Locality and Concurrency in a SMMP OS

niesha

Presentation Transcript


  1. Tornado: Maximizing Locality and Concurrency in a SMMP OS

  2. Contents • Types of Locality • Locality: A closer look • Requirements for locality • Design Basics of Tornado • Test Results • Conclusion

  3. Types of Locality* • Temporal locality “The concept that a resource that is referenced at one point in time will be referenced again sometime in the near future.” • Spatial locality “The concept that the likelihood of referencing a resource is higher if a resource near it has been referenced.” • Sequential locality “The concept that memory is accessed sequentially.” *Source: Wikipedia
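A small sketch of spatial and sequential locality, assuming a matrix stored row-major in one contiguous array. The function names are illustrative and not part of the presentation. Both functions return the same sum; the first walks consecutive addresses, while the second jumps `cols` elements between accesses and so touches a new region of memory on almost every step. The accumulator `total` is reused on every iteration, which is an instance of temporal locality.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential walk: element i+1 is read right after element i.
long sum_row_order(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Strided walk: consecutive reads are cols elements apart.
long sum_column_order(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

How much the strided version costs in practice depends on the matrix size and the cache geometry of the machine.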

  4. Locality: A closer look, Read only case
     Code: bool x = true; while (x) { /* Do some work, reading but not writing x… */ }
     [Diagram: Processor #1 and Processor #2, each with its own cache; x is in memory.]

  5. Locality: A closer look, Read only case
     Code: bool x = true; while (x) { /* Do some work, reading but not writing x… */ }
     [Diagram: Processor #1 and Processor #2; each cache holds a copy of x; x is in memory.]

  6. Locality: A closer look, Read only case
     Code: bool x = true; while (x) { /* Do some work, reading but not writing x… */ }
     [Diagram: Processor #1 and Processor #2; each cache holds a copy of x; x is in memory.]

  7. Locality: A closer look, Read only case
     Code: bool x = true; while (x) { /* Do some work, reading but not writing x… */ }
     • Notes:
     • No accesses on the bus, because all accesses are reads that are satisfied in the local caches and no invalidations are sent
     [Diagram: Processor #1 and Processor #2; each cache holds a copy of x; x is in memory.]

  8. Locality: A closer look, Read/Write case
     Code (run on both processors): bool x = true; while (x) { x = false; /* Do other work… */ }
     [Diagram: Processor #1 and Processor #2 with caches; x is in memory.]

  9. Locality: A closer look, Read/Write case
     Code (run on both processors): bool x = true; while (x) { x = false; /* Do other work… */ }
     [Diagram: Processor #1 and Processor #2, each holding a copy of x; x is in memory.]

  10. Locality: A closer look, Read/Write case
     Code (run on both processors): bool x = true; while (x) { x = false; /* Do other work… */ }
     [Diagram: Processor #1 and Processor #2, each holding a copy of x; x is in memory. Bus message: Invalidate block containing x.]

  11. Locality: A closer look, Read/Write case
     Code (run on both processors): bool x = true; while (x) { x = false; /* Do other work… */ }
     [Diagram: Processor #1 and Processor #2, each holding a copy of x; x is in memory. Steps: 1. Cache miss, 2. Read request.]

  12. Locality: A closer look, Read/Write case
     Code (run on both processors): bool x = true; while (x) { x = false; /* Do other work… */ }
     [Diagram: Processor #1 and Processor #2, each holding a copy of x; x is in memory. Steps: 1. Cache miss, 2. Read request, 3. Data.]

  13. Locality: A closer look, Read/Write case
     Code (run on both processors): bool x = true; while (x) { x = false; /* Do other work… */ }
     • Notes:
     • x becomes a bottleneck: the valid copy keeps jumping from one cache to the other
     • Every write access causes an invalidation
     • Almost every read causes a read miss and a bus read
     [Diagram: Processor #1 and Processor #2, each holding a copy of x; x is in memory. Steps: 1. Cache miss, 2. Read request, 3. Data, 4. Write, 5. Invalidate block containing x.]

  14. Locality: A closer look, Effect of Cache Line Length
     Code (one loop per processor): bool x = true; while (x) { x = false; /* Do other work… */ } and bool y = true; while (y) { y = false; /* Do other work… */ }
     • Notes:
     • x & y have different addresses but fall into the same cache line (block)!
     [Diagram: Processor #1 and Processor #2; one cache holds the line x,y; memory holds x at 0x0 and y at 0x4.]

  15. Locality: A closer look, Effect of Cache Line Length
     Code (one loop per processor): bool x = true; while (x) { x = false; /* Do other work… */ } and bool y = true; while (y) { y = false; /* Do other work… */ }
     • Notes:
     • Reads don't cause any problems
     [Diagram: Processor #1 and Processor #2; both caches hold the line x,y; memory holds x at 0x0 and y at 0x4.]

  16. Locality: A closer look, Effect of Cache Line Length
     Code (one loop per processor): bool x = true; while (x) { x = false; /* Do other work… */ } and bool y = true; while (y) { y = false; /* Do other work… */ }
     • Notes:
     • Remember: invalidations are per cache line (block), not per word!
     • So we get much the same behavior as in the read/write case on a single variable
     [Diagram: Processor #1 and Processor #2; both caches hold the line x,y; memory holds x at 0x0 and y at 0x4. Bus message: Invalidate block containing x & y.]
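The false-sharing situation on the last three slides can be sketched as a layout question. The example below assumes a 64-byte cache line, which is common but hardware-specific, and uses `int` fields so that x and y sit 4 bytes apart as in the slide's memory picture. The names `Adjacent`, `Padded`, and `same_cache_line` are illustrative and do not come from the presentation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Check the target hardware before relying on this.
constexpr std::size_t kCacheLine = 64;

// x and y are neighbours (offsets 0x0 and 0x4), so they normally share one line.
struct Adjacent {
    int x;
    int y;
};

// Each field is aligned to its own cache line, so a write to x
// cannot invalidate the line that holds y.
struct Padded {
    alignas(kCacheLine) int x;
    alignas(kCacheLine) int y;
};

// True when both addresses fall into the same (assumed) cache line.
inline bool same_cache_line(const void* a, const void* b) {
    auto line = [](const void* p) {
        return reinterpret_cast<std::uintptr_t>(p) / kCacheLine;
    };
    return line(a) == line(b);
}
```

Padding trades memory for isolation: `Padded` occupies two full lines to store two integers, which is usually worthwhile only for data that different processors write frequently.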

  17. Requirements for Locality • Spatial and temporal locality • Minimize read/write and write sharing • Minimize false sharing • Minimize the distance between the accessing processor and the target memory module

  18. Design Basics of Tornado • Individual resources are individual objects • Clustered objects • Protected procedure calls (PPC) • Semi-automatic garbage collection

  19. Clustered Objects • Appears as a single object from the outside but is internally split into reps • Each rep handles requests from one or more processors • Lots of advantages to this design
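As a rough illustration of the idea on this slide (not Tornado's actual code), the sketch below shows a counter that looks like one object to callers but keeps one rep per processor. The class name, the explicit `cpu` argument, and the 64-byte line size are assumptions made for the example; a real kernel would find the current processor itself.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kCacheLine = 64;  // assumed cache-line size

// One logical counter, internally split into per-processor reps.
class ClusteredCounter {
public:
    explicit ClusteredCounter(std::size_t processors) : reps_(processors) {
        assert(processors > 0);
    }

    // Common path: touches only the rep that belongs to the calling processor.
    void increment(std::size_t cpu) {
        reps_[cpu % reps_.size()].value.fetch_add(1, std::memory_order_relaxed);
    }

    // Rare path: visits every rep to build the global value.
    long read() const {
        long total = 0;
        for (const Rep& r : reps_) total += r.value.load(std::memory_order_relaxed);
        return total;
    }

    std::size_t rep_count() const { return reps_.size(); }

private:
    // Each rep gets its own cache line so increments on different
    // processors do not invalidate each other's lines.
    struct alignas(kCacheLine) Rep {
        std::atomic<long> value{0};
    };
    std::vector<Rep> reps_;
};
```

The trade-off is visible in the interface: `increment` stays local, while `read` pays for visiting every rep, so the split pays off when updates are far more frequent than global reads.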

  20. Clustered Objects (cont.) • Per-processor translation tables • Partitioned global translation table • Default “miss” handlers
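One way to picture the translation tables and miss handlers is a per-processor slot that starts empty and is filled the first time that processor uses the object. The sketch below is single-threaded and heavily simplified: `LazyReps`, `Rep`, and `miss_handler` are hypothetical names, and the partitioned global translation table is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical rep: remembers which processor it serves and counts its uses.
struct Rep {
    explicit Rep(std::size_t cpu) : home_cpu(cpu) {}
    std::size_t home_cpu;
    long hits = 0;
};

// One slot per processor; an empty slot stands for a table entry
// that still points at the miss handler.
class LazyReps {
public:
    explicit LazyReps(std::size_t processors) : table_(processors) {}

    // Returns the rep for this processor, creating it on first use.
    Rep& lookup(std::size_t cpu) {
        assert(cpu < table_.size());
        if (!table_[cpu]) {
            table_[cpu] = miss_handler(cpu);  // first access: install a rep
            ++misses_;
        }
        return *table_[cpu];
    }

    std::size_t misses() const { return misses_; }

private:
    // Default policy assumed here: one fresh rep per processor.
    static std::unique_ptr<Rep> miss_handler(std::size_t cpu) {
        return std::unique_ptr<Rep>(new Rep(cpu));
    }

    std::vector<std::unique_ptr<Rep>> table_;
    std::size_t misses_ = 0;
};
```

Creating reps lazily means an object that is only ever used on one processor pays for only one rep.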

  21. Protected Procedure Calls • Microkernel: relies on servers to carry out part of the OS's work • As many server threads as there are clients • A request is handled on the same processor where it was issued *Image source: Wikipedia

  22. Garbage Collection • Semi-automatic • Distinguishes between temporary and persistent references to objects • Eliminates the need for two locks to guarantee existence, and eliminates locking altogether for read-only data

  23. Test Results: Effect of rep Count (1)

  24. Test Results: Effect of rep Count (2)

  25. Test Results: Effect of Cache Associativity

  26. Test Results: Tornado vs. Commercial OSes

  27. Conclusion • Tornado performs much better than many commercial OSes • The concept of clustered objects gives it many advantages • High locality of data • Diminished need for locking • Higher degree of sharing, concurrency and modularity
