21st Century Computer Architecture
A community white paper
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf
Technion, Haifa, Israel, June 2013
• Information & Communication Tech’s Impact
• Semiconductor Technology’s Challenges
• Computer Architecture’s Future
• Example: Bypassing Paged Virtual Memory
White Paper Participants
“*” contributed prose; “**” effort coordinator
Thanks to the CCC, Erwin Gianchandani & Ed Lazowska for guidance, and to Jim Larus & Jeannette Wing for feedback
20th Century ICT Set Up
• Information & Communication Technology (ICT) has changed our world
  • <long list omitted>
• Required innovations in algorithms, applications, programming languages, …, & system software
• Key (invisible) enablers of (cost-)performance gains
  • Semiconductor technology (“Moore’s Law”)
  • Computer architecture (~80x per Danowitz et al.)
Enablers: Technology + Architecture
[Figure: processor performance gains attributed to semiconductor technology vs. computer architecture; Danowitz et al., CACM 04/2012, Figure 1]
21st Century Promise
• ICT promises much more
  • Data-centric personalized health care
  • Computation-driven scientific discovery
  • Human network analysis
  • Much more: known & unknown
• Characterized by
  • Big Data
  • Always online
  • Secure/private
  • …
Whither enablers of future (cost-)performance gains?
(Finding 2) Classic CMOS Dennard Scaling: the Science behind Moore’s Law
Source: Future of Computing Performance: Game Over or Next Level?, National Academy Press, 2011
• Scaling: voltage V/α; oxide thickness t_OX/α
• Results: power/circuit 1/α²; power density ~constant
National Research Council (NRC) – Computer Science and Telecommunications Board (CSTB.org)
Post-classic CMOS Dennard Scaling

Scaling rule        Dennard CMOS   Post-Dennard CMOS
Voltage             V/α            V (no longer scales)
Oxide thickness     t_OX/α         t_OX/α
Power/circuit       1/α²           1
Power density       ~constant      α²

Possible responses: chips w/ higher power (no), smaller chips (?), dark silicon (?), or other (?)
National Research Council (NRC) – Computer Science and Telecommunications Board (CSTB.org)
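To make the two columns concrete, here is the standard dynamic-power algebra behind them (a worked sketch consistent with the NRC table above; α > 1 is the linear scaling factor per generation, and classic Dennard scaling also raises frequency by α):

```latex
% Dynamic switching power per circuit: P = C V^2 f, with circuit area A.
% Classic Dennard scaling: C -> C/a, V -> V/a, f -> a*f, A -> A/a^2.
\[
P_{\text{Dennard}} = \frac{C}{\alpha}\left(\frac{V}{\alpha}\right)^{2}(\alpha f)
                   = \frac{C V^{2} f}{\alpha^{2}}
\quad\Rightarrow\quad
\text{density} = \frac{P/\alpha^{2}}{A/\alpha^{2}} = \frac{P}{A}\ (\text{constant}).
\]
% Post-Dennard: leakage/threshold limits pin V, so only C and f scale:
\[
P_{\text{post}} = \frac{C}{\alpha}\,V^{2}\,(\alpha f) = C V^{2} f
\quad\Rightarrow\quad
\text{density} = \frac{C V^{2} f}{A/\alpha^{2}} = \alpha^{2}\,\frac{P}{A}.
\]
```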
Technology’s Challenges 2/2
How should architects step up as technology falters?
What Research Exactly?
• Research areas in white paper (& backup slides)
  • Architecture as Infrastructure: Spanning Sensors to Clouds
  • Energy First
  • Technology Impacts on Architecture
  • Cross-Cutting Issues & Interfaces
• Much more research developed by future PIs!
• E.g.: Efficient Virtual Memory for Big Memory Servers
  • Basu, Gandhi, Chang, Hill, & Swift [ISCA 2013]
  • Big-memory workloads: graph500, memcached, databases
  • Self-manage most memory (e.g., buffer pool)
Execution Time Overhead: TLB Misses
[Figure: measured execution-time overhead of TLB misses for big-memory workloads; lower is better. The overhead is significant waste, and grows with larger memories and byte-addressable NVM.]
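As a rough sketch of what such overhead numbers mean (my framing, not necessarily the paper's exact methodology), the page-walk share of execution can be estimated from hardware performance counters:

```latex
% Fraction of execution time lost to TLB misses (page walks):
\[
\text{overhead} \;\approx\; \frac{N_{\text{walk}} \cdot \overline{C}_{\text{walk}}}{C_{\text{total}}}
\]
% N_walk: number of hardware page walks; C_walk: average cycles per walk;
% C_total: total execution cycles.
```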
Hardware: Direct Segment
[Diagram: a virtual address VA is translated either by (1) conventional paging or (2) a direct segment defined by BASE, LIMIT, and OFFSET registers: if BASE ≤ VA < LIMIT, then PA = VA + OFFSET.]
• Why a direct segment?
  • Matches big-memory workload needs
  • No paging ⇒ no TLB misses
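A minimal sketch of the translation check the diagram implies (illustrative C; the BASE/LIMIT/OFFSET registers come from the slide and the ISCA 2013 paper, but the struct and function names here are hypothetical):

```c
#include <stdint.h>

/* Hypothetical per-process direct-segment registers (names illustrative). */
typedef struct {
    uint64_t base;   /* first VA covered by the direct segment (inclusive) */
    uint64_t limit;  /* first VA past the direct segment (exclusive)       */
    uint64_t offset; /* PA = VA + offset within the segment                */
} direct_segment_t;

/* Stub standing in for the conventional TLB + page-table-walk path. */
static uint64_t translate_via_paging(uint64_t va) { return va; /* stub */ }

/* Translate a virtual address: VAs inside [base, limit) bypass paging
 * entirely (no TLB lookup, hence no TLB miss); all others take the
 * conventional paged path. */
static uint64_t translate(const direct_segment_t *ds, uint64_t va)
{
    if (va >= ds->base && va < ds->limit)
        return va + ds->offset;       /* direct segment: one add, no walk */
    return translate_via_paging(va);  /* conventional paging */
}

int main(void)
{
    direct_segment_t ds = { .base = 1ull << 30, .limit = 1ull << 40,
                            .offset = 0x1000 };
    /* An address inside the segment translates with a single addition. */
    return translate(&ds, 1ull << 31) == (1ull << 31) + 0x1000 ? 0 : 1;
}
```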
Execution Time Overhead: TLB Misses
• 92–100% of former TLB “misses” fall in the direct segment
• Requires only small SW + small HW changes
21st Century Computer Architecture
A community white paper
http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf
Technion, Haifa, Israel, June 2013
• Information & Communication Tech’s Impact
• Semiconductor Technology’s Challenges
• Computer Architecture’s Future
• Example: Bypassing Paged Virtual Memory
Back Up Slides
• Detailed research areas in white paper
  • Architecture as Infrastructure: Spanning Sensors to Clouds
  • Energy First
  • Technology Impacts on Architecture
  • Cross-Cutting Issues & Interfaces
  http://cra.org/ccc/docs/init/21stcenturyarchitecturewhitepaper.pdf
• Findings of the National Academy “Game Over” study
• Glimpse at the DARPA/ISAT workshop “Advancing Computer Systems without Technology Progress”
1. Architecture as Infrastructure: Spanning Sensors to Clouds
• Beyond a chip in a generic computer
• To a pillar of 21st-century societal infrastructure
  • Computation in context (sensor, mobile, …, data center)
  • Systems often large & distributed
  • Communication issues can dominate computation
  • Goals beyond performance (battery life, form factor)
• Opportunities (not exhaustive)
  • Reliable sensors harvesting (intermittent) energy
  • Smart phones to Star Trek’s medical “tricorder”
  • Cloud infrastructure suitable for both “Big Data” streams & low-latency quality-of-service with stragglers
  • Analysis & design tools that scale
2. Energy First
• Beyond single-core performance
• To (cost-)performance per watt/joule
• Energy across the layers
  • Circuit/technology (near-threshold CMOS, 3D stacking)
  • Architecture (reducing unnecessary data movement; see the sketch after this list)
  • Software (communication-reducing algorithms)
• Parallelism to save energy
  • Vast (fine-grained) homogeneous & heterogeneous
  • Improved SW stack
  • Applications focus (beyond graphics processing units)
• Specialization for performance & energy efficiency
  • Abstractions for specialization (reducing one-time cost)
  • Energy-efficient memory hierarchies
  • Reconfigurable logic structures
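To illustrate the "reducing unnecessary data movement" point, here is a minimal sketch (my illustration, not from the talk) of cache blocking: the tiled loop performs exactly the same arithmetic as the naive one, but reuses each cached block many times before evicting it, cutting off-chip traffic and hence energy.

```c
#include <stddef.h>

#define T 64  /* tile edge, chosen so a T x T block fits in cache (assumption) */

/* Naive n x n matrix multiply: streams the B matrix through cache n times. */
void matmul_naive(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                C[i*n + j] += A[i*n + k] * B[k*n + j];
}

/* Tiled version: identical arithmetic, different traversal order. Each
 * T x T block of A and B is reused from cache ~T times, so far fewer
 * bytes move across the (energy-dominant) off-chip interface. */
void matmul_tiled(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += T)
        for (size_t kk = 0; kk < n; kk += T)
            for (size_t jj = 0; jj < n; jj += T)
                for (size_t i = ii; i < ii + T && i < n; i++)
                    for (size_t k = kk; k < kk + T && k < n; k++)
                        for (size_t j = jj; j < jj + T && j < n; j++)
                            C[i*n + j] += A[i*n + k] * B[k*n + j];
}
```

Only the loop structure changes; the energy win comes entirely from data movement avoided, which is the architecture/software co-design point the list makes.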
3. Technology Impacts on Architecture
• Beyond the CMOS, DRAM, & disks of the last 3+ decades to
• Using replacement circuit technologies
  • Sub/near-threshold CMOS, QWFETs, TFETs, and QCAs
• Non-volatile storage
  • Beyond flash memory to STT-RAM, PCRAM, & memristors
• 3D die stacking & interposers
  • Logic, cache, small main memory
• Photonic interconnects
  • Inter- & even intra-chip
• Design automation
  • From circuit design w/ new technologies to
  • Pre-RTL functional, performance, power, & area modeling of heterogeneous chips & systems
4. Cross-Cutting Issues & Interfaces
• Beyond performance w/ stable interfaces to
• New design goals (for a pillar of societal infrastructure)
  • Verifiability (bugs kill)
  • Reliability (a “dependable” computing base?)
  • Security/privacy (w/ non-volatile memory?)
  • Programmability (time to a correct, performant solution)
• Better interfaces
  • High-level information (quality of service, provenance)
  • Parallelism ((in)dependence, (lack of) side effects)
  • Orchestrating communication ((recursive) locality)
  • Security/reliability (fine-grain protection)
Executive Summary (added to National Academy slides)
Source: Future of Computing Performance: Game Over or Next Level?, National Academy Press, 2011
Mark Hill talk (http://www.cs.wisc.edu/~markhill/NRCgameover_wisconsin_2011_05.pptx)
• Highlights of National Academy findings
  (F1) Computer hardware has transitioned to multicore
  (F2) Dennard scaling of CMOS has broken down
  (F3) Parallelism and locality must be exploited by software
  (F4) Chip power will soon limit multicore scaling
• Eight recommendations, from algorithms to education
• We know all of this at some level, BUT:
  • Are we all acting on this knowledge or hoping for business as usual?
  • Thinking beyond the next paper to where future value will be created?
• Questions asked but not answered, embedded in the NA talk
• Briefly close with the Kübler-Ross stages of grief: denial … acceptance
The Graph
[Figure: system capability (log scale) vs. time, 1980s–2050s. The CMOS curve flattens, leaving a “fallow period” before a new technology’s curve rises; our focus is that period.]
Source: Advancing Computer Systems without Technology Progress, ISAT Outbrief (http://www.cs.wisc.edu/~markhill/papers/isat2012_ACSWTP.pdf), Mark D. Hill and Christos Kozyrakis, DARPA/ISAT Workshop, March 26–27, 2012. Approved for Public Release, Distribution Unlimited. The views expressed are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government.
Surprise 1 of 2
• Can harvest in the “fallow” period!
• 2 decades of Moore’s-Law-like perf./energy gains
• Wring out the inefficiencies accepted while harvesting Moore’s Law
  • HW/SW specialization/co-design (3–100x)
  • Reduce SW bloat (2–1000x)
  • Approximate computing (2–500x)
  ---------------------------------------------------
  ~1000x = 2 decades of Moore’s Law!
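A quick sanity check on the final line (my arithmetic, assuming performance doubles roughly every 2 years):

```latex
% Two decades of Moore's-Law-like gains at one doubling per ~2 years:
\[
2^{\,20/2} \;=\; 2^{10} \;=\; 1024 \;\approx\; 1000\times
\]
% so even modest factors from the three sources above, multiplied
% together (e.g., 10 x 10 x 10), reach the same order of magnitude.
```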
“Surprise” 2 of 2
• Systems must exploit LOCALITY-AWARE parallelism
  • Parallelism necessary, but not sufficient
  • Because communication’s energy costs dominate
• Shouldn’t be a surprise, but many are in denial
• Both surprises are hard, requiring a “vertical cut” through SW & HW
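To ground “communication’s energy costs dominate”: ballpark ~45 nm figures commonly cited in the architecture literature (e.g., Horowitz, ISSCC 2014; these numbers are my addition, not from this talk) put an off-chip access at roughly two orders of magnitude more energy than an arithmetic operation:

```latex
% Commonly cited ~45nm estimates: 32-bit FP multiply ~ 4 pJ,
% 32-bit read from off-chip DRAM ~ 640 pJ.
\[
\frac{E_{\text{DRAM, 32b}}}{E_{\text{FP mul, 32b}}}
\;\approx\; \frac{640\ \text{pJ}}{4\ \text{pJ}}
\;\approx\; 160\times
\]
% Hence locality (avoiding data movement), not parallelism alone,
% governs the energy budget.
```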