Dive into Dave Patterson's groundbreaking ideas from 2001 on shifting focus from performance to software quality, tackling bugs through lightweight transactions, and harnessing existing hardware and software mechanisms to enhance reliability. Discover a fresh perspective on the evolving landscape of software development and hardware design.
Patterson Consulting
Radical Proposal: Let’s help real problems
Dave Patterson
University of California at Berkeley
Patterson@cs.berkeley.edu
April 2001

What have designers been doing?
• Performance, Performance, Performance; 2X/18 months
• Superscalar, 3 levels of cache, branch prediction, out-of-order execution, …
• If performance is the right goal, then > 1GHz => sales jump
• ~ Year 2000 v. 1999 car sales
• What Happened?
• US PC market shrank 4-8% in 1Q01; 1st shrink in 7 yrs!
• Performance no longer king?
[Figure: Pentium III die photo; labeled blocks include Execution, Bus Intf, D cache, TLB, Out-Of-Order, branch, SS, Icache]

Time to help on other problems?
• Software quality?
• Fry’s Law: 2X programming productivity (speed of reliable SW functionality) every 18 years
• Last architectural assist was virtual memory protection, ~1970?
• SW Engineering perspective on SW bugs:
  • Bugs reproducible from inputs will be repaired
  • Transient errors very hard to fix
• Jim Gray hypothesis:
  • Most production software bugs are soft - Heisenbugs
  • Bohrbugs, like the Bohr atom, are solid, easily detected by standard techniques, and hence boring
• Then can repair most SW bugs by restart
“Will Software Ever Work?” H. Lieberman and C. Fry, Comm. ACM, Mar 01, p 122-124

What should HW designers do?
• Already have heavyweight transactions in databases, operating systems
  • Atomic event that can be completely undone if it fails midstream
  • Expensive, so done only on some disk operations
• Support lightweight transactions in CPU
  1) Help with restart of routines to fix Heisenbugs
  2) Make software error recovery more reliable
• Start transaction
• SW detects error
• Back out all evidence of work to original place

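The three steps above (start transaction, SW detects error, back out all work) can be sketched purely in software. This is a minimal illustration, not the hardware mechanism the slide proposes: the names `run_in_transaction`, `TransientError`, and `make_flaky_routine` are hypothetical, and a deep copy of a dictionary stands in for whatever state a CPU would checkpoint.

```python
import copy


class TransientError(Exception):
    """Hypothetical stand-in for a Heisenbug: a fault that may not recur on restart."""


def run_in_transaction(state, routine, max_restarts=3):
    """Run routine(state); if it fails, restore state and restart the routine.

    Returns (result, attempts). Raises RuntimeError if every attempt fails,
    which suggests a reproducible bug rather than a transient one.
    """
    for attempt in range(1, max_restarts + 1):
        snapshot = copy.deepcopy(state)      # start transaction
        try:
            result = routine(state)
            return result, attempt           # commit: keep the changes
        except TransientError:               # SW detects error
            state.clear()                    # back out all evidence of work
            state.update(snapshot)
    raise RuntimeError("routine failed on every attempt")


def make_flaky_routine(failures):
    """Build a routine that does partial work, then fails `failures` times."""
    remaining = [failures]

    def routine(state):
        state["balance"] -= 10               # partial work before the fault
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransientError("simulated transient fault")
        state["log"].append("debit 10")
        return state["balance"]

    return routine


account = {"balance": 100, "log": []}
result, attempts = run_in_transaction(account, make_flaky_routine(1))
print(result, attempts, account)
```

The snapshot here costs a full copy on every attempt, which is exactly the kind of overhead that makes software transactions "heavyweight"; the slide's argument is that hardware could make the checkpoint and undo cheap.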
What will it take?
• Mechanisms in modern CPUs for performance speculation lay foundation
• Speculative execution via branch prediction, out-of-order execution, in-order completion allows “transactions” per branch, preserving interrupts: Reorder Buffer, Memory Buffer, Commit Table, …
• Transmeta Crusoe provides
  • software control of Write Buffer, allowing SW to discard results of speculative SW execution;
  • shadowed registers so can go to old values
• Expand these mechanisms to support transactions to help with SW bugs
• Shadow registers/Reorder Buffer, Much bigger write buffer (1MB?) under SW control

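To make the write-buffer and shadow-register idea concrete, here is a toy software model. It is only a sketch of the concept under simplifying assumptions: the class `SpeculativeMachine` and its methods are hypothetical names, not the Crusoe interface or any real instruction set, and Python dictionaries stand in for memory and the register file.

```python
class SpeculativeMachine:
    """Toy model: stores are held in a write buffer until commit or rollback."""

    def __init__(self, memory, registers):
        self.memory = dict(memory)
        self.registers = dict(registers)
        self._write_buffer = {}
        self._shadow_registers = None

    def begin(self):
        """Start a transaction: shadow the registers, empty the write buffer."""
        self._shadow_registers = dict(self.registers)
        self._write_buffer = {}

    def store(self, addr, value):
        """Buffer the write; memory is untouched until commit."""
        self._write_buffer[addr] = value

    def load(self, addr):
        """Loads see buffered writes first, then committed memory."""
        if addr in self._write_buffer:
            return self._write_buffer[addr]
        return self.memory[addr]

    def commit(self):
        """Make buffered writes permanent and drop the shadow copy."""
        self.memory.update(self._write_buffer)
        self._write_buffer = {}
        self._shadow_registers = None

    def rollback(self):
        """Discard buffered writes and restore the old register values."""
        if self._shadow_registers is None:
            raise RuntimeError("no transaction in progress")
        self._write_buffer = {}
        self.registers = self._shadow_registers
        self._shadow_registers = None


m = SpeculativeMachine(memory={0x10: 1}, registers={"r1": 5})

m.begin()
m.store(0x10, 99)
m.registers["r1"] = 7
seen_inside = m.load(0x10)   # the transaction sees its own write
m.rollback()                 # software detected an error: undo everything
after_rollback = (m.memory[0x10], m.registers["r1"])

m.begin()
m.store(0x10, 42)
m.commit()
after_commit = m.memory[0x10]
print(seen_inside, after_rollback, after_commit)
```

In this model the size of `_write_buffer` bounds how much work one transaction can cover, which is why the slide asks for a much bigger buffer under software control.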
Summary
• Performance no longer the only problem
• Last 15 years: Processors 1000X faster, memories 1000X bigger allow SW 1000X bigger
• No help to SW in last 15 years
• 1 bug / 1000 lines of code, millions of lines of code
• Real problem is SW
• Simultaneous Multithreading uses existing OOO HW to improve throughput of threads
• Transaction support uses existing OOO HW and SW mechanisms to provide undo for SW bugs, SW recovery
• Solve an important problem, such as SW reliability; performance is not the problem!
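
The figures in the summary can be checked with quick arithmetic, using only the rates quoted in the talk (2X every 18 months for performance, 2X every 18 years for Fry's Law, 1 bug per 1000 lines). The program size of $10^6$ lines is an assumed example of "millions of lines", not a number from the slides.

```latex
% Hardware: doubling every 1.5 years, sustained for 15 years
\[ 2^{15/1.5} = 2^{10} = 1024 \approx 1000\times \]

% Programming productivity under Fry's Law: doubling every 18 years, over the same 15 years
\[ 2^{15/18} \approx 1.8\times \]

% Expected bug count for an assumed program of 10^6 lines
\[ 10^{6}\ \text{lines} \times \frac{1\ \text{bug}}{10^{3}\ \text{lines}} = 10^{3}\ \text{bugs} \]
```

On these numbers, hardware improved by about three orders of magnitude while productivity did not even double, and each million lines of code carries roughly a thousand bugs.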