
Multithreading and Parallelism: Maximizing Processor Usage and Resource Efficiency

Explore the concepts of multithreading, parallelism, and cache coherence to maximize processor usage and improve resource efficiency in modern complex systems. Learn about different types of threading, their benefits and challenges, and the impact on performance and memory access.


Presentation Transcript


  1. Parallelism 2

  2. Multithreading

  3. Process vs Thread • Thread: instruction sequence • Own registers/stack • Shares memory with other threads in the process (program)

  4. Threaded Code • Demo…
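  The transcript doesn't include the demo itself; as a stand-in, a minimal sketch of threaded code using POSIX threads in C (the worker function and thread count are illustrative, not the slide's demo) might look like this:

    #include <pthread.h>
    #include <stdio.h>

    /* Illustrative worker: each thread gets its own registers and stack,
       but all threads share the process's memory. */
    void *worker(void *arg) {
        int id = *(int *)arg;
        printf("thread %d running\n", id);
        return NULL;
    }

    int main(void) {
        pthread_t threads[3];
        int ids[3] = {0, 1, 2};

        for (int i = 0; i < 3; i++)
            pthread_create(&threads[i], NULL, worker, &ids[i]);

        for (int i = 0; i < 3; i++)
            pthread_join(threads[i], NULL);

        return 0;
    }

  Compile with the pthread flag (e.g. gcc demo.c -pthread).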

  5. Multithreading • Multithreading • Alternate or combine threads to maximize use of processor • Hardware required • Multiple register sets • Track "owner" of pipeline instructions

  6. Resource Usage • Code running in a superscalar pipeline • Can't always fill all 4 issue slots • Bubbles from memory access, page faults, etc. (diagram: issue slots)

  7. Threading Examples • Assumptions: • Three threads of work • In-order execution • Must obey stalls (i.e. A3 is 3+ cycles after A2)

  8. Threading Examples • Assumptions: • Three threads of work • In-order execution • Must obey stalls (i.e. A3 is 3+ cycles after A2) • Two-wide pipeline (two instructions per cycle)

  9. Multithreading • Coarse Grained Multithreading • Threads run until stall • Cache miss, page fault • Other long event • On stall, drain pipeline, and start next thread

  10. Multithreading • Coarse Grained Multithreading • Threads run until stall • Cache miss, page fault • Other long event • On stall, drain pipeline, and start next thread

  11. Coarse Example • Coarse Multithreading • Avoids waiting for long periods • Wastes time on context switches • 16/30 possible units of work

  12. Multithreading • Coarse Grained • Assumption: 1 cycle to retire after stall (diagram: threads to run, single pipeline, over time)

  13. Multithreading • Coarse Grained • Assumption: 1 cycle to retire after stall (diagram: threads to run, dual pipeline, over time)

  14. Multithreading • Coarse Grained Multithreading • Avoids wasting time on large stalls • Context switches waste time • Ex: Does work in 16/30 possible slots

  15. Multithreading • Fine Grained Multithreading • Every cycle, switch threads

  16. Multithreading • Fine Grained • Switch each cycle to next ready thread (diagram: threads to run, single pipeline, over time)

  17. Multithreading • Fine Grained • Switch each cycle to next ready thread (diagram: threads to run, dual pipeline, over time) • A6 can't run until 4 cycles after A5; gets skipped at time 10

  18. Multithreading • Fine Grained Multithreading • More responsive for each thread • Significant hardware required • Multiple register sets • Track "owner" of pipeline instructions • Ex: Finishes in 15 steps; 24 out of 30 possible units of work

  19. Latency vs Throughput • Multithreading favors throughput over latency • Longer to do any one task • Shorter overall to do all

  20. Multithreading • SMT : Simultaneous Multithreading • AKA Hyperthreading • Can issue instructions from multiple threads in one cycle

  21. SMT • SMT : Simultaneous Multithreading • AKA Hyperthreading • Execution units can each work on different threads

  22. Multithreading SMT • Switch like fine grained • Do work from multiple threads if needed to fill pipelines (diagram: threads to run, over time) • B4 not ready, but C3 is

  23. Multithreading SMT • Switch like fine grained • Do work from multiple threads if needed to fill pipelines (diagram: threads to run, over time) • C5 not ready, but A5 is

  24. Multithreading SMT • Switch like fine grained • Do work from multiple threads if needed to fill pipelines (diagram: threads to run, over time) • B4, C5, A6 all waiting

  25. Multithreading • Simultaneous Multithreading • Better potential to use all hardware execution units • Depends on complementary workloads • More bookkeeping required

  26. SMT Challenges • Resources must be duplicated or split • Split too thin hurts performance… • Duplicate everything and you aren't maximizing use of hardware…

  27. Intel vs AMD • Variations on SMT

  28. Intel vs AMD • AMD Zen architecture

  29. Multicore / Multiprocessors

  30. Shared Memory Architectures

  31. Development • Single Core

  32. Development • Single Core with Multithreading • 2002 Pentium 4 / Xeon

  33. Development • Multi Processor • Multiple processors coexisting in system • PC space in ~1995

  34. Development • Multi Core • Multiple CPUs on one chip • PC space in ~2005

  35. Development • Modern Complexity • Many cores • Private / Shared cache levels

  36. Development • Massively Parallel Systems

  37. Parallelism & Memory

  38. UMA • Uniform Memory Access • Every processor sees all of memory using the same addresses • Same access time from any CPU to any memory word

  39. NUMA • Non-Uniform Memory Access • Single memory address space visible to all CPUs • Some memory is local • Fast • Some memory is remote • Accessed the same way, but slower

  40. Sunway Architecture • One chip: 256 cores, ~1.5 GHz • Computer: 40,000+ chips

  41. Multiprocessing & Memory • Memory demo…

  42. Memory Access • Race conditions: unpredictable effects of sharing memory • May add 10, 1, or 11 to x
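  A minimal sketch of that race in C, assuming two threads add 10 and 1 to a shared variable x with no synchronization:

    #include <pthread.h>
    #include <stdio.h>

    int x = 0;                      /* shared by all threads */

    void *add10(void *arg) { x = x + 10; return NULL; }
    void *add1(void *arg)  { x = x + 1;  return NULL; }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, add10, NULL);
        pthread_create(&b, NULL, add1, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        /* Each update is a read-modify-write; if the reads interleave,
           one write is lost, so x may end up 10, 1, or 11. */
        printf("x = %d\n", x);
        return 0;
    }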

  43. Memory Access • Synchronization: using locks to prevent other threads from accessing shared memory
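  One way to express the locking idea, sketched with a POSIX mutex (the lock and the two update functions are illustrative, not the slide's demo):

    #include <pthread.h>
    #include <stdio.h>

    int x = 0;
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *add10(void *arg) {
        pthread_mutex_lock(&lock);     /* only one thread may update x at a time */
        x = x + 10;
        pthread_mutex_unlock(&lock);   /* release so other threads can proceed */
        return NULL;
    }

    void *add1(void *arg) {
        pthread_mutex_lock(&lock);
        x = x + 1;
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, add10, NULL);
        pthread_create(&b, NULL, add1, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("x = %d\n", x);        /* always 11 now */
        return 0;
    }

  With the lock held around every access to x the race disappears, but the updates are serialized, which is exactly the "no longer parallel" issue on the next slide.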

  44. Memory Access • Synchronization issues: • No longer parallel • Deadlock
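  A sketch of how deadlock can arise, assuming two hypothetical locks acquired in opposite orders by two threads:

    #include <pthread.h>

    pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

    void *thread1(void *arg) {
        pthread_mutex_lock(&lock_a);
        pthread_mutex_lock(&lock_b);   /* blocks if thread2 already holds lock_b */
        /* ... critical section ... */
        pthread_mutex_unlock(&lock_b);
        pthread_mutex_unlock(&lock_a);
        return NULL;
    }

    void *thread2(void *arg) {
        pthread_mutex_lock(&lock_b);
        pthread_mutex_lock(&lock_a);   /* blocks if thread1 already holds lock_a */
        /* Each thread holds one lock and waits for the other: deadlock. */
        pthread_mutex_unlock(&lock_a);
        pthread_mutex_unlock(&lock_b);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, thread1, NULL);
        pthread_create(&t2, NULL, thread2, NULL);
        pthread_join(t1, NULL);        /* may never return if the threads deadlock */
        pthread_join(t2, NULL);
        return 0;
    }

  Acquiring locks in a consistent global order is the usual way to avoid this.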

  45. Cache Coherence • Cache Coherence: making sure cached copies of memory stay synchronized

  46. Cache Coherence • Cache Coherence : • Need ability to snoop on activity and/or broadcast changes

  47. Cache Coherence • Cache Coherence: • Need ability to snoop on activity and/or broadcast changes • A broadcasts a write on X; B knows it no longer has a valid value

  48. Cache Coherence • Cache Coherence: • Need ability to snoop on activity and/or broadcast changes • A snoops on B asking for X, provides the new value, and updates memory
