Operating System Issues in Multi-Processor Systems

John Sung, Hardware Engineer, Compaq Computer Corporation


Presentation Transcript


  1. Operating System Issues in Multi-Processor Systems • John Sung • Hardware Engineer • Compaq Computer Corporation

  2. Outline • Multi-Processor Hardware Issues • Snoopy Bus System Architecture • AMD Athlon’s Snoopy Protocol • ccNUMA System Architecture • AMD Athlon’s LDT System Bus • SGI Origin’s ccNUMA System Architecture • Alpha 21364 System Architecture • ccNUMA and CPU Scheduling • Conclusion

  3. Multi-Processor Hardware Issues • Bandwidth/Latency • Processor to Processor • Processor to Memory • Processor to I/O • Scalability • Increase performance as you increase CPU/Memory • Coherency/Synchronization • Give software coherent view of memory • Provide synchronization primitives

  4. Snoopy Bus System Architecture

  5. Snoopy Bus System Architecture • A bus connects processors, memory, and I/O • Scales up to ~16 processors • Limited by bus bandwidth • Cache Coherency Protocol • Snoops the bus for memory traffic • Each processor has to “listen” for addresses in its cache • Does the “right thing” to give software a coherent view of memory
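
The snooping idea on this slide can be sketched in a few lines. This is a minimal illustration, not the Athlon's actual protocol: it assumes a simplified write-invalidate scheme with write-through to memory, and the names `Bus`, `SnoopingCache`, and `snoop_write` are hypothetical.

```python
class Bus:
    """Shared bus: every attached cache sees every write that is broadcast."""

    def __init__(self):
        self.caches = []
        self.memory = {}  # address -> value

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_write(self, sender, addr):
        # Every cache except the writer snoops the address.
        for cache in self.caches:
            if cache is not sender:
                cache.snoop_write(addr)


class SnoopingCache:
    """Per-processor cache that listens on the bus (assumed write-invalidate)."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}  # address -> value, valid lines only
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:  # miss: fetch the line from memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.bus.broadcast_write(self, addr)  # other caches drop their copy
        self.lines[addr] = value
        self.bus.memory[addr] = value  # write-through, for simplicity

    def snoop_write(self, addr):
        # "Listen" for an address held in this cache and invalidate it.
        self.lines.pop(addr, None)
```

With two caches on one bus, a write by the first invalidates the second's stale copy, so the second's next read misses and picks up the new value. Because every write must be seen by every cache, the bus becomes the bottleneck as processors are added.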

  6. Snoopy Bus System Architecture • [Diagram: three CPU cores, each with its own cache, attached to a shared bus together with memory and I/O]

  7. ccNUMA System Architecture

  8. ccNUMA System Architecture • Cache-Coherent Non-Uniform Memory Access • Memory is distributed and attached to processors • Some network connects the processor/memory sets • Each processor owns part of the memory space • Cache coherency protocol • Gives software a coherent view of memory • Protocol primitives for synchronization • Directory to keep track of who has a copy of memory
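
The directory and the "each processor owns part of the memory space" points above can be sketched as follows. This is an assumption-laden illustration, not any vendor's protocol: it assumes the address space is split into equal contiguous ranges per node, and `home_node`, `Directory`, `record_read`, and `record_write` are hypothetical names.

```python
def home_node(addr, bytes_per_node):
    """Node that owns this address, assuming equal contiguous ranges per node."""
    return addr // bytes_per_node


class Directory:
    """Tracks which nodes hold a cached copy of each memory block."""

    def __init__(self):
        self.sharers = {}  # block address -> set of node ids with a copy

    def record_read(self, node, addr):
        # A reader becomes one more sharer of the block.
        self.sharers.setdefault(addr, set()).add(node)

    def record_write(self, node, addr):
        # The writer becomes the only holder; return who must be invalidated.
        to_invalidate = self.sharers.get(addr, set()) - {node}
        self.sharers[addr] = {node}
        return to_invalidate
```

Unlike the snoopy bus, a write only needs to contact the nodes the directory lists, which is why this approach does not depend on a single shared broadcast medium.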

  9. ccNUMA System Architecture • [Diagram: two processor sets, each with a CPU core, cache, memory, directory, I/O, and network router, connected through a network fabric]

  10. SGI Origin System Architecture

  11. SGI CrayLink™ • Node = 2 CPUs and their caches • Module = Memory + Directory + HUB • 2 Modules per Router • System = Modules + Routers + CrayLink™ Network

  12. SGI CrayLink™

  13. Processor System Network

  14. Bisectional Bandwidth

  15. ccNUMA and CPU Scheduling Issues

  16. OS’s Questions • Single CPU System • What to schedule next? • ccNUMA System • What to schedule next? • Which CPU to schedule it on? • Where should the process information be located? • 1 or many instances of the OS?

  17. OS’s Choices for a Process • Single CPU System • Process has 1 choice • Process information has 1 choice • ccNUMA System with N CPUs and M Memories • Process has N choices • Process information has M choices per virtual page • “Distance” between process and its information
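
One way to make the “distance” between a process and its information concrete is to sum, over the process's pages, the distance from a candidate CPU's node to the node holding each page. The sketch below assumes distance is a hop count given as a matrix; `placement_cost`, `best_cpu`, and the example matrix are illustrative assumptions, not taken from the presentation.

```python
def placement_cost(cpu_node, page_nodes, distance):
    """Total distance from a CPU's node to the nodes holding each page."""
    return sum(distance[cpu_node][node] for node in page_nodes)


def best_cpu(cpu_nodes, page_nodes, distance):
    """Of the N candidate nodes, pick the one closest to the process's pages."""
    return min(cpu_nodes, key=lambda c: placement_cost(c, page_nodes, distance))


# Hypothetical 3-node system: distance[i][j] is the hop count from node i to j.
distance = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
]
pages = [2, 2, 1]  # node holding each of the process's virtual pages
chosen = best_cpu(range(3), pages, distance)
```

Here most pages live on node 2, so `chosen` is node 2. A real scheduler would also weigh load and how often each page is touched.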

  18. Context Switch Penalty • Single CPU System • Saving/Restoring process state (PCB) • Scheduling routine • ccNUMA System • Saving/Restoring process state (PCB) • Scheduling routine • Moving process’s information
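
The extra term in the ccNUMA case can be written as a simple additive cost model. This is only a sketch: the units are arbitrary, the linear per-page cost is an assumption, and `SwitchCost` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class SwitchCost:
    save_restore: float          # saving/restoring process state (PCB)
    scheduling: float            # running the scheduling routine
    per_page_move: float = 0.0   # assumed cost to move one page of process data

    def total(self, pages_moved=0):
        # Single-CPU case: pages_moved is 0, so only the first two terms count.
        return self.save_restore + self.scheduling + self.per_page_move * pages_moved
```

For example, `SwitchCost(5.0, 2.0, 0.5).total()` models a switch with no data movement, while `.total(10)` adds the cost of moving ten pages to another node.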

  19. Some Common Sense • Replicate parts of the OS across processors • System calls will happen often • Minimize process movement • Cost of moving a process to another CPU is high • Less than swapping to disk, most of the time • Higher than simple context switching • But if you have to move a process • Minimize the amount of information to move • Opportunity for a cache?
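
The “minimize process movement” rule can be sketched as an affinity check: keep a process on the CPU it last ran on unless another CPU's run queue is shorter by more than the cost of moving. The function name `pick_cpu`, the use of queue length as the load measure, and the `migration_penalty` threshold are all assumptions made for illustration.

```python
def pick_cpu(last_cpu, queue_lengths, migration_penalty):
    """Stay on last_cpu unless a shorter queue elsewhere outweighs the move."""
    least_loaded = min(range(len(queue_lengths)), key=lambda c: queue_lengths[c])
    gain = queue_lengths[last_cpu] - queue_lengths[least_loaded]
    if gain > migration_penalty:
        return least_loaded  # imbalance is large enough to justify moving
    return last_cpu          # otherwise keep the process where its data is
```

With `migration_penalty=2`, a process last run on CPU 0 stays there when the queues are `[3, 2, 3]`, but moves to CPU 1 when they are `[6, 1, 3]`.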

  20. Conclusion • Hardware • Bandwidth and Latency for performance • Cache Coherency for correctness • Operating System • ccNUMA adds complexity in CPU scheduling • Better HW performance => lower context switch penalty => more flexibility in scheduling choices for a process

  21. References • Alpha • http://www.digital.com/alphaoem/present/ev7forum98.ppt • http://www.compaq.com/InnovateForum99/presentation/session31/ • http://www.digital.com/alphaoem/ • AMD • http://www.amd.com/products/cpg/mpf/speech/slides99.ppt • SGI • http://www-europe.sgi.com/origin/numa_tech.html • BenchMarks • http://www.spec.org/ • http://www.tpc.org/

  22. Abbreviation Index • AMD - Advanced Micro Devices • SGI - Silicon Graphics Inc. • ECC - Error Correction Code • SECDED - Single Error Correct Double Error Detect • API - Alpha Processor Inc. • AGP - Accelerated Graphics Port • DDR DRAM - Double Data Rate Dynamic RAM • LDT - Lightning Data Transport • PCI - Peripheral Component Interconnect • CMOS - Complementary Metal Oxide Semiconductor • CAS - Column Address Strobe • TPC-C - Transaction Processing Performance Council Benchmark C • ccNUMA - Cache-Coherent Non-Uniform Memory Access • SMP - Symmetric Multi-Processing
