Harnessing Moore’s Law (with Selected Implications). Mark D. Hill, Computer Sciences Department, University of Wisconsin-Madison, http://www.cs.wisc.edu/~markhill. This talk is based, in part, on an essay I wrote as part of a National Academy of Sciences study panel.
Motivation • What do the following intervals have in common? • Prehistory-2003 • 2004-2005 • Answer: Equal progress in absolute computer speed • Furthermore, more doublings in 2006-07, 2008-09, … • Questions • Why do computers get better and cheaper? • How do computer architects contribute (my bias)? • How can we learn to project future trends and implications?
Outline • Computer Primer • Software • Hardware • Technology Primer • Harnessing Moore’s Law • Future Trends
Computer Primer: Software
Application programmers write software:

#include <stdio.h>

int main(int argc, char *argv[])
{
    int i;
    int sum = 0;
    for (i = 0; i <= 100; i++)
        sum = sum + i * i;
    printf("The sum from 0 .. 100 is %d\n", sum);
    return 0;
}

[Example due to Jim Larus]
Computer Primer: Software, cont.
System software translates for hardware:

.main:
        ...
loop:   lw    $14, 28($sp)
        mul   $15, $14, $14     <--- multiply i * i
        lw    $24, 24($sp)
        addu  $25, $24, $15     <--- add to sum
        sw    $25, 24($sp)
        addu  $8, $14, 1
        sw    $8, 28($sp)
        ble   $8, 100, loop
        la    $4, str
        lw    $5, 24($sp)
        jal   printf
        move  $2, $0
        lw    $31, 20($sp)
        addu  $sp, 32
        j     $31
Computer Primer: Software, cont.
What the hardware really sees:

…
10001111101011100000000000011100
10001111101110000000000000011000
00000001110011100000000000011001   <--- multiply i * i
00100101110010000000000000000001
00101001000000010000000001100101
10101111101010000000000000011100
00000000000000000111100000010010
00000011000011111100100000100001   <--- add to sum
00010100001000001111111111110111
10101111101110010000000000011000
00111100000001000001000000000000
10001111101001010000000000011000
00001100000100000000000011101100
00100100100001000000010000110000
10001111101111110000000000010100
00100111101111010000000000100000
00000011111000000000000000001000
00000000000000000001000000100001
Computer Primer: Hardware Components • Processor • Rapidly executes instructions • Commonly: Processor implemented as a microprocessor chip (Intel Pentium 4) • Larger computers have multiple processors • Memory • Stores vast quantities of instructions and data • Commonly: DRAM chips backed by magnetic disks • Input/Output • Connects computer to outside world • E.g., keyboards, displays, & network interfaces
Apple Mac 7200 (from Hennessy & Patterson) (C) Copyright 1998 Morgan Kaufmann Publishers. Reproduced with permission from Computer Organization and Design: The Hardware/Software Interface, 2E.
Computer Primer: Hardware Operation
E.g., do mul temp,i,i & go on to next instruction

Fetch-Execute Loop {
  S1: read “current” instruction from memory
  S2: decode instruction to see what is to be done
  S3: read instruction input(s)
  S4: perform instruction operation
  S5: write instruction output(s)
  Also determine “next” instruction and make it “current”
} Repeat
Computer Big Picture • Separate Software & Hardware (divide & conquer) • Software • Worry about applications only (hardware can already exist) • Translate from one form to another (instructions & data interchangeable!) • Hardware • Expose set of instructions (most functionally equivalent) • Execute instructions rapidly (without regard for software)
Outline • Computer Primer • Technology Primer • Exponential Growth • Technology Background • Moore’s Law • Harnessing Moore’s Law • Future Trends
Exponential Growth • Occurs when growth is proportional to current size • Mathematically: dy / dt = k * y • Solution: y = e^(k*t) • E.g., a bond with $100 principal yielding 10% interest • 1 year: $110 = $100 * (1 + 0.10) • 2 years: $121 = $100 * (1 + 0.10) * (1 + 0.10) • … • 8 years: $214 = $100 * (1 + 0.10)^8 • Other examples • Unconstrained population growth • Moore’s Law
Absurd Exponential Example • Parameters • $16 base • 59% growth/year • 36 years • 1st year’s $16 buys a book • 3rd year’s $64 buys a computer game • 15th year’s $16,000 buys a car • 24th year’s $100,000 buys a house • 36th year’s $300,000,000 buys a lot
Technology Background • Computer logic implemented with switches • Like light switches, except that a switch can control others • Yields a network (called circuit) of switches • Want circuits to be fast, reliable, & cheap • Logic Technologies • Mechanical switch & vacuum tube • Transistor (1947) • Integrated circuit (chip): circuit of manytransistors made at once (1958) • (Also memory & communication technologies)
(Technologist’s) Moore’s Law • Parameters • 16 transistors/chip circa 1964 • 59% growth/year • 36 years (2000) and counting • 1st year’s 16 ??? • 3rd year’s 64 ??? • 15th year’s 16,000 ??? • 24th year’s 100,000 ??? • 36th year’s 300,000,000 ??? • Was useful & then got more than 1,000,000 times better!
Other “Moore’s Laws” • Other technologies improving rapidly • Magnetic disk capacity • DRAM capacity • Fiber-optic network bandwidth • Other aspects improving slowly • Delay to memory • Delay to disk • Delay across networks • Computer Implementor’s Challenge • Design with dissimilarly expanding resources • Double computer performance every two years • A.k.a., (Popular) Moore’s Law
Outline • Computer Primer • Technology Primer • Harnessing Moore’s Law • Microprocessor • Bit-Level Parallelism • Instruction-Level Parallelism • Caching & Memory Hierarchies • Cost & Implications • Future Trends
Microprocessor • Computers of the 1960s were expensive, using 100s if not 1000s of chips • First microprocessor in 1971 • Processor on one chip • Intel 4004 • 2300 transistors • Barely a processor • Could access 300 bytes of memory (0.0003 megabytes) • Use more and faster transistors in parallel
Transistor Parallelism • To use more transistors quickly, use them side-by-side (in parallel) • Approach depends on scale • Consider organizing people • 10 people • 1000 people • 1,000,000 people • Transistors • Bit-level parallelism • Instruction-level parallelism • (Thread-level parallelism)
Bit-Level Parallelism
• Less (e.g., 8 * 15 = 120):

       00001000
     * 00001111
    -----------
       00001000
      00001000
     00001000
    00001000
    -----------
    00001111000

• More: 010101010101010101010101 * 000011110000111100001111 = 1010000010100000100111110101111101011111011
• More bits manipulated faster!
Instruction-Level Parallelism
• Limits to bit-level parallelism • Numbers are big enough • Operations are fast
• Seek parallelism by executing many instructions at once
• Recall Fetch-Execute Loop {
  S1: read “current” instruction from memory
  S2: decode instruction to see what is to be done
  S3: read instruction input(s)
  S4: perform instruction operation
  S5: write instruction output(s)
  Also determine “next” instruction and make it “current”
}
Instruction-Level Parallelism, cont.
• One-at-a-time: instructions per cycle = 1/5

  Time  01 02 03 04 05 06 07 08 09 10
  ADD   S1 S2 S3 S4 S5
  SUB   .. .. .. .. .. S1 S2 S3 S4 S5

• Pipelining: instructions per cycle = 1 (or less)

  Time  01 02 03 04 05 06 07 08 09 10
  ADD   S1 S2 S3 S4 S5
  SUB   .. S1 S2 S3 S4 S5
  ORI   .. .. S1 S2 S3 S4 S5
  AND   .. .. .. S1 S2 S3 S4 S5
  MUL   .. .. .. .. S1 S2 S3 S4 S5
Instruction-Level Parallelism, cont.
• 4-way Superscalar: instructions per cycle = 4 (or less)

  Time  01 02 03 04 05 06 07 08 09 10
  ADD   S1 S2 S3 S4 S5
  SUB   S1 S2 S3 S4 S5
  ORI   S1 S2 S3 S4 S5
  AND   S1 S2 S3 S4 S5
  MUL   .. S1 S2 S3 S4 S5
  SRL   .. S1 S2 S3 S4 S5
  XOR   .. S1 S2 S3 S4 S5
  LDW   .. S1 S2 S3 S4 S5
  STW   .. .. S1 S2 S3 S4 S5
  DIV   .. .. S1 S2 S3 S4 S5
Instruction-Level Parallelism, cont. • Current processors have dozens of instructions executing • Must predict which instructions are next • Limits to control prediction? • Look elsewhere? (thread-level parallelism later) • Memory a serious problem • 1980: memory access time = one instruction time • 2000: memory access time = 100 instruction times
Caching & Memory Hierarchies • Memory can be • Fast • Vast • But not both • Use two memories • Cache: small, fast (e.g., 64,000 bytes in 1 ns) • Memory: vast, slow (e.g., 64,000,000 bytes in 100 ns) • Use prediction to fill cache • Likely to re-reference information • Likely to reference nearby information • E.g., address book cache of phone directory
Caching & Memory Hierarchies, cont. • Cache + Memory makes memory look fast & vast • If cache has information on 99% of accesses • 1 ns + 1% * 100 ns = 2 ns • E.g. P3 (w/o L2 cache) • Caching Applied Recursively • Registers • Level-one cache • Level-two cache • Memory • Disk • (File Server) • (Proxy Cache)
Cost Side of Moore’s Law • About every two years: same computing at half cost • Long-term effect: • 1940s Prototypes for calculating ballistic trajectories • 1950s Early mainframes for large banks • 1960s Mainframes flourish in many large businesses • 1970s Minicomputers for business, science, & engineering • Early 1980s PCs for word processing & spreadsheets • Late 1980s PCs for desktop publishing • 1990s PCs for games, multimedia, e-mail, & web • Jim Gray: In ten years you can buy a computer for the cost of its sales tax today (assuming 3% or more)
Outline • Computer Primer • Technology Primer • Harnessing Moore’s Law • Future Trends • Moore’s Law • Harnessing Moore’s Law • Computer uses • Some Non-Technical Implications
Revolutions • Industrial Revolution enabled by machines • Interchangeable parts • Mass production • Lower costs expanded application • Information Revolution enabled by machines • Interchangeable purpose (software) • Mass production (chips = integrated circuits) • Lower costs expanded application
Future of Moore’s Law • Short-Term (1-5 years) • Will operate (due to prototypes in lab) • Fabrication cost will go up rapidly • Medium-Term (5-15 years) • Exponential growth rate will likely slow • Trillion-dollar industry is motivated • Long-Term (>15 years) • May need new technology (chemical or quantum) • We can do better (e.g., human brain) • I would not close the patent office
Future of Harnessing Moore’s Law • Thread-Level Parallelism • Multiple processors cooperating (exists today) • More common in future with multiple processors per chip • Parallelism in Internet? The Grid. • System on a Chip • Processor, memory, and I/O on one chip • Cost-performance leap like microprocessor? • (e.g., accelerometer at right) • Communication • World-wide web & wireless cell phone fuse! • Other properties: robust & easy to design & use
Future Computer Uses • Computer cost-effectiveness determines application viability • Spreadsheets on a US$2M mainframe do not make sense • A 10x cost-performance change enables new possibilities [Joy] • Most computers will NOT be computers • How many electric motors do you have in your home? • How many did you buy as electric motors? • I control several computers, but most computers I control are embedded in cars, remote controls, refrigerators, etc. • Two Stories • Danny Hillis’s doorknobs • William Wulf’s “powerful” computer
Future Computer Uses, cont. • Technologists have always been poor predictors of future use • Edison invented the motion picture machine • Hollywood invented movies • To predict: • What would you want if it were 10 times cheaper? • What can be 10 times cheaper if you make more? • Better yet, ask a ten-year-old! • What do you think?
Some Non-Technical Thoughts • We make over a billion transistors/second • One transistor per man/woman/child in < 10 seconds (humankind has made many more transistors than bricks!) • But those transistors are not being distributed equally • Computers can be incredibly effective tools • Knowledge workers in medicine, law, & engineering • But not unskilled laborers! • Computer use will exacerbate the social gradient • As citizens, we should ask • Can/should we ameliorate this trend? • If so, how?
Summary • Computers are machines for purposes“to be determined” • Vast cost reductions have enabled new uses • Software flexibility • Moore’s Law and its harnessing • Technology should be our tool, not our master • Many benefits • Some costs