Parallel Processors • Todd Charlton, Eric Uriostique
Current Technology • Hard to find a single-core processor anymore. • Cell phones, laptops, etc. • Large systems can contain 512 or more processors
The Motivation • Divide and Conquer – Higher Throughput • Lower Power Consumption • P = CV²f
The Motivation • We need more performance on the same power budget. How? • Remember: P = CV²f • Scale voltage and frequency to 80% • P = C * (0.8 V)² * (0.8 f) ≈ 0.51 CV²f • This drops power by about 50% • Add an additional core • Result = 1.6x speedup at roughly the same power
The Motivation • How about reducing power consumption but keeping the same performance? • Remember: P = CV²f • Scale voltage and frequency to 50% • P = C * (0.5 V)² * (0.5 f) = 0.125 CV²f • This drops power to 12.5% • Add an additional core • Result = 25% of original power consumption with the same performance
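The arithmetic behind the two scaling scenarios above can be checked with a short sketch. The function names are hypothetical, and the throughput figure assumes a perfectly parallel workload whose speed scales linearly with clock frequency, which real programs rarely achieve.

```python
def relative_power(v_scale, f_scale, cores=1):
    """Dynamic power relative to one core at nominal V and f, using P = C * V^2 * f."""
    return cores * (v_scale ** 2) * f_scale


def relative_throughput(f_scale, cores=1):
    """Ideal throughput relative to one nominal core (assumes fully parallel work)."""
    return cores * f_scale


# Scenario 1: scale V and f to 80%, then add a second core.
print("80% scaling, 1 core, power:", round(relative_power(0.8, 0.8), 3))
print("80% scaling, 2 cores, power:", round(relative_power(0.8, 0.8, cores=2), 3))
print("80% scaling, 2 cores, throughput:", round(relative_throughput(0.8, cores=2), 3))

# Scenario 2: scale V and f to 50%, then add a second core.
print("50% scaling, 1 core, power:", round(relative_power(0.5, 0.5), 3))
print("50% scaling, 2 cores, power:", round(relative_power(0.5, 0.5, cores=2), 3))
print("50% scaling, 2 cores, throughput:", round(relative_throughput(0.5, cores=2), 3))
```

Two cores at 80% land slightly above the original power budget, which is why the "same power" result is approximate.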
Amdahl’s Law • “Speed-up is limited by amount of work that can be done in parallel” Credit: watermint.org
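Amdahl's Law is usually written as speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of processors. A minimal sketch follows; the function name and the 90% parallel fraction are illustrative assumptions, not figures from the slides.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


# Assumed example: 90% of the program is parallelizable.
for n in (2, 4, 8, 512):
    print(n, "processors:", round(amdahl_speedup(0.9, n), 2))
```

With 10% of the work stuck in serial code, the speedup stays below 10x no matter how many processors are added.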
Ways To Parallelize • Multi-Threading: • Multi-thread your application on one chip • More elegant • Multi-Processing: • Flash serial code to separate chips • No worrying about scheduling!
Let’s Multi-Thread • One Application: Counting maize pixels • [Figure panels: 2 Processors, 4 Processors]
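The pixel-counting example can be sketched as a divide-and-conquer job: split the image into horizontal strips, count matches in each strip on its own worker, then add the partial counts. This is not the presenters' code. The RGB value used for "maize", the function names, and the test image are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

MAIZE = (255, 203, 5)  # assumed RGB value for "maize"; not given in the slides


def count_in_rows(rows, target=MAIZE):
    """Serial count of pixels equal to target in a list of rows."""
    return sum(1 for row in rows for pixel in row if pixel == target)


def count_pixels_parallel(image, workers=2, target=MAIZE):
    """Split the image into row strips and count each strip on its own worker."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    chunk = max(1, -(-len(image) // workers))  # ceiling division
    strips = [image[i:i + chunk] for i in range(0, len(image), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(lambda strip: count_in_rows(strip, target), strips)
        return sum(partial_counts)


# Hypothetical 6x4 test image with a mix of maize and non-maize pixels.
image = [[MAIZE if (x + y) % 3 == 0 else (0, 39, 76) for x in range(6)]
         for y in range(4)]
print(count_pixels_parallel(image, workers=2))
print(count_pixels_parallel(image, workers=4))
```

In CPython the global interpreter lock keeps these threads from running the counting loop truly in parallel, so the sketch shows the partitioning rather than a real speedup.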
Multi-Threading in µProcessors • Parallax Propeller processor (programmed in Spin) • Multi-thread on 8 cores • One application runs on 8 cores • Uses its own high-level language and a form of Assembly • Used in the CMUcam4
Problems with Multi-Threading • Steep learning curve • Learning the language • Parallel slowdown • Setting up a new thread takes time. If that thread does not have much work to do, it is not worth the overhead
Multi-Threading Libraries • Serially written code does not automatically take advantage of parallel processing • Intel’s Threading Building Blocks (TBB) • OpenMP • Boost and pthreads • All of these are available for C/C++
Multi-Processing: BeagleBone • Processor • 720 MHz ARM Cortex-A8 • 3D graphics accelerator • ARM Cortex-M3 for power management • 2x Programmable Real-time Unit (PRU) RISC CPUs • PRUs share memory space with the A8
Multi-Processing: Custom with Message Passing • Designate a processor for each frequent task • Send messages to the "Boss" as necessary • Since every processor's workload is minimal, slower, lower-power chips can be used • Overall = Same system performance
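The boss-and-workers arrangement can be sketched in software, with threads standing in for the separate chips and a queue standing in for the message link. All names here are hypothetical, as are the example tasks. The boss waits with a timeout because a message may not arrive right away.

```python
import queue
import threading


def worker(name, task, boss_inbox):
    """Stand-in for a dedicated low-power processor that owns one task."""
    result = task()
    boss_inbox.put((name, result))  # send a message to the boss


def run_boss(tasks, timeout=1.0):
    """Start one worker per task and collect their messages as they arrive."""
    inbox = queue.Queue()
    threads = [threading.Thread(target=worker, args=(name, task, inbox), daemon=True)
               for name, task in tasks.items()]
    for t in threads:
        t.start()
    results = {}
    while len(results) < len(tasks):
        try:
            # Block until a message arrives, but give up after the timeout.
            name, value = inbox.get(timeout=timeout)
        except queue.Empty:
            break  # some worker has not reported in time; stop waiting
        results[name] = value
    for t in threads:
        t.join(timeout=timeout)
    return results


# Hypothetical tasks, one per "processor".
tasks = {"temperature": lambda: 21, "button": lambda: False}
print(run_boss(tasks))
```

On real hardware the queue would be replaced by a serial link or bus, and the order in which messages reach the boss would not be guaranteed.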
Problems with Multi-Processing • Shared Memory Space • Boards like this are hard to find and configure • Message Passing • Can’t assume messages are received immediately
Recap • Go parallel if you want: • Higher Throughput • Lower Power • Two Ways: • Multi-Threading – Spin • Speed up one application • Multi-Processing – BeagleBone • Do more tasks at the same time • Don’t forget Amdahl’s Law!