This article discusses the trend of multicore processors in various devices and the challenges they present in terms of power consumption, parallel scaling, and programming for parallelism. It also explores potential solutions and research areas related to multicore processors.
Multicore: Panic or Panacea?
Mikko H. Lipasti
Associate Professor
Electrical and Computer Engineering
University of Wisconsin – Madison
http://www.ece.wisc.edu/~pharm
Multicore Mania
• First, servers: IBM Power4, 2001
• Then desktops: AMD Athlon X2, 2005
• Then laptops: Intel Core Duo, 2006
• Soon, your cellphone: ARM MPCore, prototypes for a while now
Mikko Lipasti-University of Wisconsin
What is behind this trend?
• Moore's Law
• Chip power consumption
• Single-thread performance trend [source: Intel]
Dynamic Power
• Static CMOS: current flows only when the circuit switches
  • Combinational logic evaluates new inputs
  • Flip-flop or latch captures the new value on the clock edge
• Dynamic power: P ≈ A · C · V² · f
  • C: capacitance of the circuit (wire length, number and size of transistors)
  • V: supply voltage
  • A: activity factor (fraction of capacitance switched per cycle)
  • f: clock frequency
• Future: fundamentally power-constrained
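The terms on this slide combine into the classic CMOS dynamic-power equation, P = A · C · V² · f. A minimal sketch with made-up illustrative numbers (not values from the talk) shows why voltage scaling is so attractive:

```python
def dynamic_power(activity, c_farads, v_volts, f_hertz):
    """Classic CMOS dynamic power: P = A * C * V^2 * f."""
    return activity * c_farads * v_volts ** 2 * f_hertz

# Illustrative, made-up values: 10% activity factor, 1 nF switched
# capacitance, 1.0 V supply, 3 GHz clock.
base = dynamic_power(0.1, 1e-9, 1.0, 3e9)    # 0.3 W in this toy model

# Lowering the supply voltage 20% cuts dynamic power ~36%,
# because power scales with the square of V.
scaled = dynamic_power(0.1, 1e-9, 0.8, 3e9)
```

The quadratic dependence on V is what makes "many slower, lower-voltage cores" look better than "one fast, high-voltage core" on paper.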
Easy answer: Multicore
[Diagram: one chip containing many replicated cores]
Amdahl's Law
• f – fraction of the work that can run in parallel
• 1 − f – fraction that must run serially
• Speedup on n CPUs: S(n) = 1 / ((1 − f) + f / n)
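Amdahl's Law can be sketched directly from the definitions above; the serial fraction (1 − f) bounds the speedup no matter how many cores are added:

```python
def amdahl_speedup(f, n):
    """Speedup on n cores when fraction f of the work parallelizes.

    The serial part (1 - f) runs at original speed; the parallel
    part f is divided across n cores.
    """
    return 1.0 / ((1.0 - f) + f / n)

# Even with f = 0.9, infinite cores cannot exceed 10x:
# the 10% serial portion always takes 10% of the original time.
```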
Fixed Chip Power Budget
• Amdahl's Law ignores the (power) cost of n cores
• Revised Amdahl's Law: more cores ⇒ each core is slower
  • Parallel speedup < n
  • Serial portion (1 − f) takes longer
• Also, interconnect and scaling overhead
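One way to model "more cores ⇒ each core is slower" is to split a fixed power budget evenly and assume per-core performance grows roughly as the square root of its power share (a Pollack's-rule-style assumption for illustration, not the talk's exact model):

```python
def fixed_power_speedup(f, n, alpha=0.5):
    """Speedup on n cores under a fixed chip power budget.

    Assumes each core gets 1/n of the budget and per-core
    performance scales as (power share)**alpha. alpha=0.5 is a
    rough rule-of-thumb assumption, not a measured value.
    """
    core_speed = (1.0 / n) ** alpha          # each core slower than one big core
    serial_time = (1.0 - f) / core_speed     # serial part runs on a single slow core
    parallel_time = f / (n * core_speed)     # parallel part spread over n slow cores
    return 1.0 / (serial_time + parallel_time)
```

Under this model, speedup for f = 0.9 peaks at a modest core count and then falls, because the slowed-down serial portion quickly dominates, which is the slide's point.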
Fixed Power Scaling
• A fixed power budget forces slower cores
• Serial code quickly dominates execution time
Predictions and Challenges
• Parallel scaling limits many-core
  • >4 cores pay off only for well-behaved programs
  • Optimistic about new applications
  • Interconnect overhead
• Single-thread performance
  • Will degrade unless we innovate
• Parallel programming
  • Express/extract parallelism in new ways
  • Retrain the programming workforce
Research Agenda
• Programming for parallelism
  • Sources of parallelism
  • New applications, tools, and approaches
• Single-thread performance and power
  • Most attractive to programmer/user
• Chip multiprocessor overheads
  • Interconnect, caches, coherence, fairness
Finding Parallelism
• Functional parallelism
  • Car: {engine, brakes, entertainment, nav, …}
  • Game: {physics, logic, UI, render, …}
  • Automatic extraction [UW Multiscalar]: decompose serial programs
• Data parallelism
  • Vectors, matrices, database tables, pixels, …
• Request parallelism
  • Web, shared database, telephony, …
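Data parallelism, the easiest of the three sources to exploit, applies one operation to many independent elements, such as the pixels mentioned above. A toy sketch of the pattern (note: CPython threads illustrate the structure but will not give CPU speedup under the GIL; real scaling needs processes or a compiled language):

```python
from concurrent.futures import ThreadPoolExecutor

def brighten(pixel):
    """Per-pixel work: each element is independent, so the map
    distributes cleanly across workers with no coordination."""
    return min(pixel + 40, 255)

pixels = [0, 100, 200, 250]

# The same function applied to every element -- the essence of
# data parallelism.
with ThreadPoolExecutor(max_workers=4) as pool:
    result = list(pool.map(brighten, pixels))
# result == [40, 140, 240, 255]
```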
Balancing Work
• Amdahl's parallel phase f assumes all cores stay busy
• If work is not perfectly balanced
  • The (1 − f) term grows (f is not fully parallel)
  • Performance scaling suffers
• Manageable for data- and request-parallel apps
• Very difficult for the other two sources:
  • Functional parallelism
  • Automatically extracted parallelism
  • Scale power to the mismatch [Multiscalar]
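The cost of imbalance can be seen with a toy model: if the cores synchronize at a barrier, the slowest core sets the pace, so the same total work takes longer when unevenly split:

```python
def phase_time(work_per_core):
    """Time for a parallel phase ending in a barrier: every core
    waits for the slowest one, so the maximum load dominates."""
    return max(work_per_core)

# 100 units of work on 4 cores:
balanced   = phase_time([25, 25, 25, 25])   # 25 time units
imbalanced = phase_time([40, 20, 20, 20])   # 40: one loaded core drags the rest
```

The extra 15 units the other three cores spend idle behave exactly like added serial time in Amdahl's model.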
Coordinating Work
• Synchronization
  • Some data somewhere is shared
  • Coordinate/order updates and reads, otherwise chaos
• Traditionally: locks and mutual exclusion
  • Hard to get right, even harder to tune for performance
• Research: Transactional Memory [UW Multifacet]
  • Programmer: declare a potential conflict
  • Hardware and/or software: speculate and check
  • Commit, or roll back and retry
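A minimal lock sketch shows the traditional approach: mutual exclusion serializes every update to the shared value, which is correct but easy to get wrong (forget the lock, or hold it too long) and a scalability bottleneck, which is what motivates transactional memory:

```python
import threading

counter = 0
lock = threading.Lock()

def deposit(times):
    global counter
    for _ in range(times):
        with lock:              # mutual exclusion around the shared update
            counter += 1        # read-modify-write is unsafe without it

threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter == 40_000; without the lock, concurrent read-modify-write
# sequences could interleave and lose updates
```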
Single-thread Performance
• Still the most attractive source of performance
  • Speeds up both parallel and serial phases
  • Can be used to buy back power
• Must focus on power consumption
  • Performance benefit ≥ power cost
Single-thread Performance
• Hardware accelerators and circuits
  • Domain-specific [UW MESA]
  • Reconfigurable [UW Compton]
  • VLSI and design automation [UW WISCAD, Kursun]
• Increasing frequency
  • Seems prohibitive: clock power
  • Clever clocking schemes can help [UW Pharm]
• Increasing instruction-level parallelism [UW Multiscalar, UW Pharm, UW Smith]
  • Without blowing the power budget
  • Alternatively, reduce power at the same performance
Chip Multiprocessor Overheads
• Core interconnect [UW Pharm]
  • 80% of chip power [Borkar, ISLPED '07 panel]
  • Needs a fundamentally different approach
  • Revisit circuit switching
• Cache coherence [UW Multifacet, Pharm]
  • Match workload behavior
  • Optimize for on-chip communication
Chip Multiprocessor Overheads
• Shared caches [UW Multifacet, Multiscalar, Smith]
  • On-chip memory can be shared
  • Optimize replacement and replication
• Fairness [UW Smith]
  • Maintain performance isolation
  • Share resources fairly (memory, caches)
Research Groups @ UW
Conclusion
• Forecast
  • Limited multicore (≤4 cores) is here to stay
  • Manycore (>4 cores) will find its place
• Hardware challenges
  • Single-thread performance and power
  • Multicore overhead
• Software challenges
  • Finding application parallelism
  • Creating correct parallel programs
  • Creating scalable parallel programs
Questions?
http://www.ece.wisc.edu/~pharm