Research Directions for 21st Century Computer Systems
ASPLOS 2013 Panel
Impact? $15M NSF XPS (Exploiting Parallelism & Scalability) cites 1 & 4.
0. Mark Hill: Introduction
1. Kathryn McKinley on NAS Report: The Future of Computing Performance: Game Over or Next Level?
2. Josep Torrellas on CCC Workshops: Advancing Computer Architecture Research (ACAR)
3. Mark Hill on ISAT Workshop: Advancing Computer Systems without Technology Progress
4. Sarita Adve on CCC White Paper: 21st Century Computer Architecture
5. Emmett Witchel: unbounded
Q: What to do to facilitate, transcend, or refute these partially overlapping visions?
The Future of Computing Performance: Game Over or Next Level?
Samuel H. Fuller, Chair
March 22, 2011
Computer Science and Telecommunications Board (CSTB)
National Research Council (NRC)
Thanks to Sam Fuller & Mark Hill
Committee on Sustaining Growth in Computing Performance
Experts Addressed the Problem
• SAMUEL H. FULLER, Analog Devices Inc., Chair
• LUIZ ANDRÉ BARROSO, Google, Inc.
• ROBERT P. COLWELL, Independent Consultant
• WILLIAM J. DALLY, NVIDIA Corporation and Stanford University
• DAN DOBBERPUHL, PA Semi/Apple
• PRADEEP DUBEY, Intel Corporation
• MARK D. HILL, University of Wisconsin–Madison
• MARK HOROWITZ, Stanford University
• DAVID KIRK, NVIDIA Corporation
• MONICA LAM, Stanford University
• KATHRYN S. McKINLEY, University of Texas at Austin
• CHARLES MOORE, Advanced Micro Devices
• KATHERINE YELICK, University of California, Berkeley
Staff
• LYNETTE I. MILLETT, Study Director
• SHENAE BRADLEY, Senior Program Assistant
Executive Summary
• Computer hardware has transitioned to multicore
• Dennard scaling of CMOS has broken down
• Parallelism and locality must be exploited by software
• Chip power will soon limit multicore scaling
Virtuous Cycle
[Diagram labels: doubling of transistors; devices 2x more capable, efficient, cheaper, smaller, …; software innovation; hardware complexity; software complexity; sequential interface]
Breaks in Virtuous Cycle
[Diagram labels: doubling of transistors; end of Dennard scaling; devices 2x more capable, efficient, cheaper, smaller, …; software innovation; hardware complexity; software complexity; sequential interface]
Next Steps
Innovate within and across layers
• Algorithms
• Programming “systems”
• Architecture
• Technology
• Education
Community No news here? But… Are we all acting on this knowledge or are we acting business as usual? Are we thinking beyond next paper to where to create future value? Denial … Acceptance Act?
2. Advancing Computer Architecture Research (ACAR)
• Two workshops sponsored by CCC
• 25 + 19 attendees
• Organizers: J. Torrellas (U Illinois) & M. Oskin (U Wash.)
• Issued a community-wide call for white papers
• Selection committee picked most relevant papers
• Included industry folks
• Also invited DARPA, DOE, NSF program managers
http://www.cra.org/ccc/docs/ACAR_Report_Popular-Parallel-Programming.pdf
http://www.cra.org/ccc/docs/ACAR2-Report.pdf
What We Found
Data centers and extreme scale computing
Architectures for programmability
Specialized architectures and heterogeneity
Energy and power consumption are the key limiters
Performance scaling:
• Past: no SW changes
• Now: extensive SW+HW changes
Ultimate goal: fully automated generation of app-specific HW for programs
What We Found
End of road for conventional ISA
Secure, reliable and predictable from the HW up
Exploiting emerging technologies
Foundation of computing is breaking apart; malicious parties are exploiting it
Architecture research enables new technologies to enter the market quickly
Modern systems are skyscrapers built on the ISA of a bungalow
Discussion Points
• Many directions of research are relevant:
• Computer systems research is broadening
• Focus on increasing funding pie, not re-distributing it
• Need to create coalitions with other communities:
• Big data
• New computing materials and devices
• Healthcare
• …
• Need to move away from incrementalism
Advancing Computer Systems without Technology Progress
[Figure: System Capability (log) by decade, 80s through 50s; labels: CMOS, Fallow Period, New Technology, Our Focus]
Seek ~1000x = two decades of Moore's Law via four thrusts
The views expressed are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government.
Approved for Public Release, Distribution Unlimited
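The "~1000x = two decades" equivalence can be checked with one line of arithmetic. The sketch below assumes capability doubles roughly every two years; that doubling period is a common reading of Moore's Law, not a figure stated on the slide.

```latex
% Assumption: one doubling every 2 years (not stated on the slide).
\[
  \text{doublings in 20 years} = \frac{20}{2} = 10,
  \qquad
  2^{10} = 1024 \approx 1000\times
\]
```

With a shorter assumed period (say 18 months) the same two decades would give a larger factor, so ~1000x is best read as an order-of-magnitude target.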
A. Spectrum of Hardware Specialization
B. Reduce Software Bloat (e.g., matrix multiply)
Can we achieve PHP productivity at BLAS efficiency?
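The slide's matrix-multiply figure did not survive extraction. As a rough illustration of the underlying point (the same computation can cost very different amounts depending on how it is expressed), here is a minimal Python sketch. It is not from the slides: the function names are hypothetical, and it compares two pure-Python formulations rather than PHP against BLAS.

```python
import time


def matmul_naive(a, b):
    """Textbook triple loop over nested lists, with explicit indexing throughout."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_builtin(a, b):
    """Same product; B is transposed once and each dot product is driven by sum() and zip()."""
    b_cols = list(zip(*b))  # columns of B as tuples
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def seconds(fn, a, b):
    """Wall-clock time of one call, in seconds (machine dependent)."""
    start = time.perf_counter()
    fn(a, b)
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 60  # small size chosen only so the demo finishes quickly
    a = [[(i + j) % 7 for j in range(n)] for i in range(n)]
    b = [[(i * j) % 5 for j in range(n)] for i in range(n)]
    assert matmul_naive(a, b) == matmul_builtin(a, b)
    print("naive  :", seconds(matmul_naive, a, b))
    print("builtin:", seconds(matmul_builtin, a, b))
```

Integer inputs are used so the two results compare exactly. Any timing difference here is machine and interpreter dependent, and is only a stand-in for the much larger gap between a scripting language and a tuned BLAS library that the slide refers to.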
D. Locality-aware Parallelism
• Now: Seek (vast) parallelism
• e.g., simple, energy efficient cores
• But remote communication >100x cost of compute
[Figure label: = 1200 pJ (24x)]
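To see why a 100x communication cost makes locality matter, a back-of-the-envelope model helps. The sketch below is an assumption-laden illustration, not the slide's data: energy is in arbitrary units, one operation costs 1 unit, one remote access costs 100 units (the slide says only ">100x"), and the function names are hypothetical.

```python
def energy_breakdown(num_ops, remote_fraction, compute_cost=1.0, remote_ratio=100.0):
    """Return (compute_energy, communication_energy) in arbitrary units.

    remote_fraction: share of operations that need one remote access (0.0 to 1.0).
    remote_ratio: assumed cost of a remote access relative to one operation.
    """
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    compute = num_ops * compute_cost
    communication = num_ops * remote_fraction * compute_cost * remote_ratio
    return compute, communication


def break_even_fraction(remote_ratio=100.0):
    """Remote fraction at which communication energy equals compute energy."""
    return 1.0 / remote_ratio


if __name__ == "__main__":
    for fraction in (0.001, 0.01, 0.1):
        compute, comm = energy_breakdown(1_000_000, fraction)
        print(f"remote fraction {fraction}: compute={compute:.0f}, communication={comm:.0f}")
```

Under these assumptions, communication already matches compute energy when only 1 in 100 operations goes remote, which is why adding parallelism without attention to data placement can be self-defeating.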
C. Approximate Computing Example SECOND ORDER DIFFERENTIAL EQUATION ON ANALOG ACCELERATOR WITH DIGITAL ACCELERATOR.
Workshop Takeaway
• Can Harvest in the “Fallow” Period!
A. HW/SW Specialization/Co-design
B. Reduce SW Bloat
C. Approximate Computing
---------------------------------------------------
~1000x = 2 decades of Moore’s Law!
• D. Systems must exploit LOCALITY-AWARE parallelism
• HILL’s TWO CENTS: Move beyond General-Purpose
• Systems that do new things, e.g., Kinect
• Optimizations that help some, e.g., big memory workloads
21st Century Computer Architecture
A Community White Paper, April-May 2012
+ Jim Larus & Jeannette Wing gave feedback
+ CCC, Erwin Gianchandani, Ed Lazowska guided process
Technology’s Challenges How should architects step up as technology falters?
Some Thoughts
[Diagram labels: Architecture, PL, OS, ASPLOS, ASPLOS 2014, ???, ???]
Need to step up for agency positions
NSF CCF Division Director Search
The 90s SUCKED
Jerry Garcia Dead 1995
The Verve The Verve PIPE
Architecture Was Boring
microarchitecture provides performance
Architecture
Microarchitecture or Clock rate
1. Buy machine
2. Wait 18 months
3. Buy next one
Life is better now
architecture changes provide value
1. Consider app
2. Buy machine
3. Goto 1
VT-x (11/05)
Extended Page Tables (11/08)
VT-d (11/08)
VPID (11/08) (tagged TLB!)
Hardware + Software Cooperation necessary
• Security
• Mobile
• Data centers
• Concurrency
• GPU/Accelerator
The ’10s belong to ASPLOS
Kathryn S. McKinley Kathryn S. McKinley is a Principal Researcher at Microsoft and an Endowed Professor of Computer Science at The University of Texas at Austin. She and her collaborators have produced widely used tools: the DaCapo Java Benchmarks, TRIPS Compiler, Hoard memory manager, MMTk garbage collector toolkit, and Immix garbage collector. Her awards include: NSF Career, ASPLOS 2009 Best Paper, 2012 IEEE Top Picks, CACM Research Highlights (2006, 2012), Most Influential OOPSLA Paper from 2002 (awarded 2012), the 2011 ACM SIGPLAN Distinguished Service Award, and the 2012 ACM SIGPLAN Programming Languages Software Award. She has graduated 17 PhD students. She is an IEEE Fellow and ACM Fellow.
Josep Torrellas Josep Torrellas is a Professor of Computer Science at the University of Illinois at Urbana-Champaign. He is the Director of the Center for Programmable Extreme Scale Computing, and the Director of the Illinois-Intel Parallelism Center (I2PC). He has also been a Willett Faculty Scholar and led the OpenSPARC Center of Excellence. He is the past Chair of the IEEE Technical Committee on Computer Architecture, and currently serves as a Council Member of CRA's Computing Community Consortium. He is a Fellow of IEEE and ACM. He has made many technical contributions in the areas of shared-memory parallel computer architecture, low-power design, hardware reliability, and software dependability. He has graduated 30 Ph.D. students, who are now leaders in academia and industry. He is currently working on the Bulk Multicore Architecture, and on the DARPA-funded Runnemede Extreme Scale Architecture, both in collaboration with Intel.
Mark Hill Mark D. Hill (www.cs.wisc.edu/~markhill) is a professor in both the computer sciences department and the electrical and computer engineering department at the University of Wisconsin–Madison, where he also co-leads the Wisconsin Multifacet (www.cs.wisc.edu/multifacet/) project with David Wood. His research interests include parallel computer system design, memory system design, computer simulation, deterministic replay, and transactional memory. He earned a PhD from the University of California, Berkeley. He is an ACM Fellow and a Fellow of the IEEE.
Sarita Adve Sarita Adve is Professor of Computer Science at the University of Illinois at Urbana-Champaign. Her research interests are in computer architecture and systems, parallel computing, and power and reliability-aware systems. Her honors include the Anita Borg Institute Women of Vision award in innovation, the ACM SIGARCH Maurice Wilkes award, the University Scholar recognition by the University of Illinois, and an Alfred P. Sloan Research Fellowship. She is a fellow of the ACM and the IEEE. She serves on the boards of the Computing Research Association and ACM SIGARCH. She received her Ph.D. in Computer Science from the University of Wisconsin-Madison in 1993.
Emmett Witchel Emmett Witchel is an associate professor in computer science at The University of Texas at Austin. He and his group are interested in operating systems, security, and architecture. Most of his current research is about secure systems, GPU systems, and concurrent systems. He received his doctorate from MIT in 2004.