CIS 570: Advanced Computer Systems
University of Massachusetts Dartmouth
Instructor: Dr. Michael Geiger
Fall 2008
Lecture 1: Fundamentals of Computer Design
Outline
• Syllabus & course policies
• Changes in computer architecture
• What is computer architecture?
• Design principles
M. Geiger CIS 570 Lec. 1
Syllabus notes
• Course web site (still under construction): http://www.cis.umassd.edu/~mgeiger/cis570/f08.htm
• TA: To be determined
• My info:
  • Office: Science & Engineering, 221C
  • Office hours: M 1:30-2:30, T 2-3:30, Th 2:30-4
  • E-mail: mgeiger@umassd.edu
• Course text: Hennessy & Patterson’s Computer Architecture: A Quantitative Approach, 4th ed.
Course objectives
• To understand the operation of modern microprocessors at an architectural level.
• To understand the operation of memory and I/O subsystems and their relation to overall system performance.
• To understand the benefits of multiprocessor systems and the difficulties in designing and utilizing them.
• To gain familiarity with the simulation techniques used in computer architecture research.
Course policies
• Prereqs: CIS 273 & 370 or equivalent
• Academic honesty
  • All work is individual unless explicitly stated otherwise (e.g., final projects)
  • You may discuss concepts (e.g., how Tomasulo’s algorithm works) but not solutions
  • Plagiarism is also considered cheating
  • Any assignment or portion of an assignment violating this policy will receive a grade of 0
  • More severe or repeat infractions may incur additional penalties, up to and including a failing grade in the class
Grading policies
• Assignment breakdown:
  • Problem sets: 20%
  • Simulation exercises: 10%
  • Research project (including report & presentation): 20%
  • Midterm exam: 15%
  • Final exam: 25%
  • Quizzes & participation: 10%
• Late assignments: 10% penalty per day
Topic schedule
• Computer design fundamentals
• Basic ISA review
• Architectural simulation
• Uniprocessor systems
  • Advanced pipelining: exploiting ILP & TLP
  • Memory hierarchy design
  • Storage & I/O
• Multiprocessor systems
  • Memory in multiprocessors
  • Synchronization
  • Interconnection networks
Changes in computer architecture
• Old conventional wisdom (CW): Power is free, transistors are expensive
• New CW: “Power wall”: power is expensive, transistors are free (we can put more on a chip than we can afford to turn on)
• Old CW: Sufficiently increase performance through instruction-level parallelism via compilers and hardware innovation (out-of-order execution, speculation, VLIW, …)
• New CW: “ILP wall”: law of diminishing returns on more hardware for ILP
• Old CW: Multiplies are slow, memory access is fast
• New CW: “Memory wall”: memory is slow, multiplies are fast (~200 clock cycles to DRAM, 4 clocks for a multiply)
• Old CW: Uniprocessor performance doubles every 1.5 years
• New CW: Power wall + ILP wall + memory wall = brick wall
  • Uniprocessor performance now doubles every 5(?) years
• Sea change in chip design: multiple “cores” (2x processors per chip every ~2 years)
  • Many simpler processors are more power efficient
Uniprocessor performance
(Figure: uniprocessor performance growth over time, from Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, October 2006)
• VAX: 25%/year, 1978 to 1986
• RISC + x86: 52%/year, 1986 to 2002
• RISC + x86: ??%/year, 2002 to present
M. Geiger CIS 570 Lec. 1
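The annual growth rates above compound dramatically. A small sketch (the helper function is my own, not from the slides) shows what sustained exponential improvement delivers:

```python
# Compounded performance growth: r%/year sustained for n years
# multiplies overall performance by (1 + r/100) ** n.

def compound_growth(rate_percent: float, years: int) -> float:
    """Overall performance multiplier after `years` of `rate_percent` annual growth."""
    return (1 + rate_percent / 100) ** years

vax_era = compound_growth(25, 8)    # 1978-1986: roughly a 6x improvement
risc_era = compound_growth(52, 16)  # 1986-2002: roughly an 800x improvement
```

The 52%/year RISC era alone yields nearly three orders of magnitude, which is why the post-2002 slowdown is such a sea change.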
Chip design changes
• Intel 4004 (1971): 4-bit processor, 2312 transistors, 0.4 MHz, 10-micron PMOS, 11 mm² chip
• RISC II (1983): 32-bit, 5-stage pipeline, 40,760 transistors, 3 MHz, 3-micron NMOS, 60 mm² chip
• A 125 mm² chip in 0.065-micron CMOS could hold 2312 copies of RISC II + FPU + Icache + Dcache
From ILP to TLP & DLP
• (Almost) all microprocessor companies are moving to multiprocessor systems
  • The embedded domain is the lone holdout
• Single processors gain performance by exploiting instruction-level parallelism (ILP)
• Multiprocessors exploit either:
  • Thread-level parallelism (TLP), or
  • Data-level parallelism (DLP)
• What’s the problem?
From ILP to TLP & DLP (cont.)
• We’ve built tons of infrastructure for single-processor systems
  • Algorithms, languages, compilers, operating systems, architectures, etc.
  • These don’t exactly scale well
• Multiprocessor design: not as simple as creating a chip with 1000 CPUs
  • Task scheduling/division
  • Communication
  • Memory issues
• Even moving a program from 1 to 2 CPUs is extremely difficult
  • Not strictly computer architecture, but it can’t happen without architects
CIS 570 Approach
• How are we going to address this change?
• Start by going through single-processor systems
  • Study ILP and ways to exploit it
  • Delve into memory hierarchies for single processors
  • Talk about storage and I/O systems
  • We may touch on embedded systems at this point
• Then, we’ll look at multiprocessor systems
  • Discuss TLP and DLP
  • Talk about how multiprocessors affect memory design
  • Cover interconnection networks
What is computer architecture?
• Classical view: instruction set architecture (ISA)
  • Boundary between hardware and software
  • Provides abstraction at both high level and low level
(Figure: the instruction set sits between software and hardware)
ISA vs. Computer Architecture
• Modern issues aren’t in instruction set design
  • “Architecture is dead” … or is it?
• Computer architecture now encompasses a larger range of technical issues
• Modern view: ISA + design of computer organization & hardware to meet goals and functional requirements
  • Organization: high-level view of the system
  • Hardware: specifics of a given system
• The function of the complete system is now the issue
The roles of computer architecture
• … as David Patterson sees it, anyway
• Other fields borrow ideas from architecture
• Anticipate and exploit advances in technology
• Develop well-defined, thoroughly tested interfaces
• Quantitative comparisons to determine when goals are reached
• Quantitative principles of design
Goals and requirements
• What goals might we want to meet?
  • Performance
  • Power
  • Price
  • Dependability
• We’ll talk about how to quantify these as needed throughout the semester
• We'll primarily focus on performance (both uniprocessor & multiprocessor systems) and dependability (mostly storage systems)
Design principles
1. Take advantage of parallelism
2. Principle of locality
3. Focus on the common case
4. Amdahl’s Law
5. Generalized processor performance
1. Take advantage of parallelism
• Increase the throughput of a server via multiple processors or multiple disks
• Detailed hardware design
  • Carry-lookahead adders use parallelism to speed up computing sums from linear to logarithmic in the number of bits per operand
  • Multiple memory banks searched in parallel in set-associative caches
• Pipelining: overlap instruction execution to reduce the total time to complete an instruction sequence
  • Not every instruction depends on its immediate predecessor, so instructions can execute completely or partially in parallel
  • Classic 5-stage pipeline: 1) instruction fetch (Ifetch), 2) register read (Reg), 3) execute (ALU), 4) data memory access (Dmem), 5) register write (Reg)
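The payoff of the 5-stage pipeline can be sketched with a simple cycle count (an illustrative model, assuming one cycle per stage and no stalls; the function names are mine, not from the slides):

```python
# Cycles to execute n instructions: strictly sequential execution takes
# all 5 stages per instruction, while an ideal pipeline overlaps them so
# one instruction completes per cycle after the pipeline fills.

STAGES = 5  # Ifetch, Reg, ALU, Dmem, Reg write

def sequential_cycles(n: int) -> int:
    return n * STAGES

def pipelined_cycles(n: int) -> int:
    # The first instruction takes STAGES cycles to flow through;
    # each later instruction finishes one cycle after its predecessor.
    return STAGES + (n - 1)

n = 1000
speedup = sequential_cycles(n) / pipelined_cycles(n)  # approaches 5 for large n
```

With real hazards and stalls the speedup is lower, which is exactly what the ILP discussion later in the course addresses.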
2. Principle of locality
• Programs access a relatively small portion of the address space at any instant of time
• Two different types of locality:
  • Temporal locality (locality in time): if an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
  • Spatial locality (locality in space): if an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)
• For the last 30 years, hardware has relied on locality for memory performance
  • Guiding principle behind caches
  • To some degree, guides instruction execution, too (90/10 rule)
3. Focus on the common case
• In making a design trade-off, favor the frequent case over the infrequent case
  • E.g., the instruction fetch and decode unit is used more frequently than the multiplier, so optimize it first
  • E.g., if a database server has 50 disks per processor, storage dependability dominates system dependability, so optimize it first
• The frequent case is often simpler and can be made faster than the infrequent case
  • E.g., overflow is rare when adding two numbers, so improve performance by optimizing the common case of no overflow
  • This may slow down overflow, but overall performance improves by optimizing for the normal case
• Quantifying what the frequent case is and how much performance improves by making that case faster leads to Amdahl’s Law
4. Amdahl’s Law
• Speedup_overall = Execution time_old / Execution time_new
  = 1 / ((1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)
• Best you could ever hope to do (as Speedup_enhanced grows without bound):
  Speedup_maximum = 1 / (1 − Fraction_enhanced)
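A worked example helps (the numbers here are illustrative, not from the slides): suppose an enhancement speeds up the portion of a program accounting for 40% of execution time by a factor of 10.

```python
# Amdahl's Law: overall speedup from enhancing a fraction of execution time.

def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    return 1 / ((1 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

overall = amdahl_speedup(0.4, 10)    # 1.5625: far less than the 10x local gain
ceiling = amdahl_speedup(0.4, 1e12)  # approaches 1 / (1 - 0.4), about 1.67
```

No matter how fast the enhanced fraction gets, the unenhanced 60% caps the overall speedup below 1.67x, which is why the common case deserves the optimization effort.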
5. Processor performance

CPU time = Seconds / Program
         = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle)

Which factors affect each term:

               Inst Count   CPI   Clock Rate
Program            X
Compiler           X        (X)
Inst. Set          X         X
Organization                 X         X
Technology                             X
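The performance equation above translates directly into a calculation (the example numbers are illustrative, not from the slides):

```python
# CPU time = instruction count x cycles per instruction x clock cycle time.

def cpu_time(inst_count: int, cpi: float, clock_rate_hz: float) -> float:
    """Execution time in seconds, from the processor performance equation."""
    return inst_count * cpi * (1 / clock_rate_hz)

# E.g., 1 billion instructions at CPI 1.5 on a 2 GHz clock:
t = cpu_time(1_000_000_000, 1.5, 2e9)  # 0.75 seconds
```

The equation also shows why no single factor tells the whole story: a compiler change that lowers instruction count may raise CPI, and an organization change that lowers CPI may lower the clock rate.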
Next week
• Review of ISAs (Appendix B)
• Review of pipelining basics (Appendix A)
• Discussion of architectural simulation
Acknowledgements
• This lecture borrows heavily from David Patterson’s lecture slides for EECS 252: Graduate Computer Architecture, at the University of California, Berkeley
• Many figures and other information are taken from Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th ed., unless otherwise noted