CIS 570: Advanced Computer Systems
University of Massachusetts Dartmouth
Instructor: Dr. Michael Geiger
Fall 2008
Lecture 1: Fundamentals of Computer Design
Outline
• Syllabus & course policies
• Changes in computer architecture
• What is computer architecture?
• Design principles
M. Geiger CIS 570 Lec. 1
Syllabus notes
• Course web site (still under construction): http://www.cis.umassd.edu/~mgeiger/cis570/f08.htm
• TA: To be determined
• My info:
  • Office: Science & Engineering, 221C
  • Office hours: M 1:30-2:30, T 2-3:30, Th 2:30-4
  • E-mail: mgeiger@umassd.edu
• Course text: Hennessy & Patterson’s Computer Architecture: A Quantitative Approach, 4th ed.
Course objectives
• To understand the operation of modern microprocessors at an architectural level.
• To understand the operation of memory and I/O subsystems and their relation to overall system performance.
• To understand the benefits of multiprocessor systems and the difficulties in designing and utilizing them.
• To gain familiarity with the simulation techniques used in computer architecture research.
Course policies
• Prereqs: CIS 273 & 370 or equivalent
• Academic honesty
  • All work is individual unless explicitly stated otherwise (e.g., final projects)
  • You may discuss concepts (e.g., how Tomasulo’s algorithm works) but not solutions
  • Plagiarism is also considered cheating
  • Any assignment or portion of an assignment violating this policy will receive a grade of 0
  • More severe or repeat infractions may incur additional penalties, up to and including a failing grade in the class
Grading policies
• Assignment breakdown:
  • Problem sets: 20%
  • Simulation exercises: 10%
  • Research project (including report & presentation): 20%
  • Midterm exam: 15%
  • Final exam: 25%
  • Quizzes & participation: 10%
• Late assignments: 10% penalty per day
Topic schedule
• Computer design fundamentals
• Basic ISA review
• Architectural simulation
• Uniprocessor systems
  • Advanced pipelining: exploiting ILP & TLP
  • Memory hierarchy design
  • Storage & I/O
• Multiprocessor systems
  • Memory in multiprocessors
  • Synchronization
  • Interconnection networks
Changes in computer architecture
• Old conventional wisdom (CW): Power is free, transistors are expensive
• New CW: “Power wall”: power is expensive, transistors are free (we can put more on a chip than we can afford to turn on)
• Old CW: Sufficiently increase performance through instruction-level parallelism via compilers and hardware innovation (out-of-order execution, speculation, VLIW, …)
• New CW: “ILP wall”: law of diminishing returns on more hardware for ILP
• Old CW: Multiplies are slow, memory access is fast
• New CW: “Memory wall”: memory is slow, multiplies are fast (~200 clock cycles to DRAM, 4 clocks for a multiply)
• Old CW: Uniprocessor performance doubles every 1.5 years
• New CW: Power wall + ILP wall + memory wall = brick wall
  • Uniprocessor performance now doubles every 5(?) years
• Sea change in chip design: multiple “cores” (2x processors per chip every ~2 years)
  • Many simpler processors are more power efficient
Uniprocessor performance
(Figure: uniprocessor performance growth over time, from Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, October 2006)
• VAX: 25%/year, 1978 to 1986
• RISC + x86: 52%/year, 1986 to 2002
• RISC + x86: ??%/year, 2002 to present
M. Geiger CIS 570 Lec. 1
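The annual growth rates above compound dramatically. A small sketch (the helper function is my own, not from the slides) shows what sustained exponential improvement delivers:

```python
# Compounded performance growth: r%/year sustained for n years
# multiplies overall performance by (1 + r/100) ** n.

def compound_growth(rate_percent: float, years: int) -> float:
    """Overall performance multiplier after `years` of `rate_percent` annual growth."""
    return (1 + rate_percent / 100) ** years

vax_era = compound_growth(25, 8)    # 1978-1986: roughly a 6x improvement
risc_era = compound_growth(52, 16)  # 1986-2002: roughly an 800x improvement
```

The 52%/year RISC era alone yields nearly three orders of magnitude, which is why the post-2002 slowdown is such a sea change.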
Chip design changes
• Intel 4004 (1971): 4-bit processor, 2312 transistors, 0.4 MHz, 10-micron PMOS, 11 mm² chip
• RISC II (1983): 32-bit, 5-stage pipeline, 40,760 transistors, 3 MHz, 3-micron NMOS, 60 mm² chip
• A 125 mm² chip in 0.065-micron CMOS could hold 2312 copies of RISC II + FPU + Icache + Dcache
From ILP to TLP & DLP
• (Almost) all microprocessor companies are moving to multiprocessor systems
  • The embedded domain is the lone holdout
• Single processors gain performance by exploiting instruction-level parallelism (ILP)
• Multiprocessors exploit either:
  • Thread-level parallelism (TLP), or
  • Data-level parallelism (DLP)
• What’s the problem?
From ILP to TLP & DLP (cont.)
• We’ve built tons of infrastructure for single-processor systems
  • Algorithms, languages, compilers, operating systems, architectures, etc.
  • These don’t exactly scale well
• Multiprocessor design: not as simple as creating a chip with 1000 CPUs
  • Task scheduling/division
  • Communication
  • Memory issues
• Even moving a program from 1 to 2 CPUs is extremely difficult
  • Not strictly computer architecture, but it can’t happen without architects
CIS 570 Approach
• How are we going to address this change?
• Start by going through single-processor systems
  • Study ILP and ways to exploit it
  • Delve into memory hierarchies for single processors
  • Talk about storage and I/O systems
  • We may touch on embedded systems at this point
• Then, we’ll look at multiprocessor systems
  • Discuss TLP and DLP
  • Talk about how multiprocessors affect memory design
  • Cover interconnection networks
What is computer architecture?
• Classical view: instruction set architecture (ISA)
  • Boundary between hardware and software
  • Provides abstraction at both high level and low level
(Figure: the instruction set sits between software and hardware)
ISA vs. Computer Architecture
• Modern issues aren’t in instruction set design
  • “Architecture is dead” … or is it?
• Computer architecture now encompasses a larger range of technical issues
• Modern view: ISA + design of computer organization & hardware to meet goals and functional requirements
  • Organization: high-level view of the system
  • Hardware: specifics of a given system
• The function of the complete system is now the issue
The roles of computer architecture
• … as David Patterson sees it, anyway
• Other fields borrow ideas from architecture
• Anticipate and exploit advances in technology
• Develop well-defined, thoroughly tested interfaces
• Quantitative comparisons to determine when goals are reached
• Quantitative principles of design
Goals and requirements
• What goals might we want to meet?
  • Performance
  • Power
  • Price
  • Dependability
• We’ll talk about how to quantify these as needed throughout the semester
• We'll primarily focus on performance (both uniprocessor & multiprocessor systems) and dependability (mostly storage systems)
Design principles
1. Take advantage of parallelism
2. Principle of locality
3. Focus on the common case
4. Amdahl’s Law
5. Generalized processor performance
1. Take advantage of parallelism
• Increase the throughput of a server via multiple processors or multiple disks
• Detailed hardware design
  • Carry-lookahead adders use parallelism to speed up computing sums from linear to logarithmic in the number of bits per operand
  • Multiple memory banks searched in parallel in set-associative caches
• Pipelining: overlap instruction execution to reduce the total time to complete an instruction sequence
  • Not every instruction depends on its immediate predecessor, so instructions can execute completely or partially in parallel
  • Classic 5-stage pipeline: 1) instruction fetch (Ifetch), 2) register read (Reg), 3) execute (ALU), 4) data memory access (Dmem), 5) register write (Reg)
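The payoff of the 5-stage pipeline can be sketched with a simple cycle count (an illustrative model, assuming one cycle per stage and no stalls; the function names are mine, not from the slides):

```python
# Cycles to execute n instructions: strictly sequential execution takes
# all 5 stages per instruction, while an ideal pipeline overlaps them so
# one instruction completes per cycle after the pipeline fills.

STAGES = 5  # Ifetch, Reg, ALU, Dmem, Reg write

def sequential_cycles(n: int) -> int:
    return n * STAGES

def pipelined_cycles(n: int) -> int:
    # The first instruction takes STAGES cycles to flow through;
    # each later instruction finishes one cycle after its predecessor.
    return STAGES + (n - 1)

n = 1000
speedup = sequential_cycles(n) / pipelined_cycles(n)  # approaches 5 for large n
```

With real hazards and stalls the speedup is lower, which is exactly what the ILP discussion later in the course addresses.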
2. Principle of locality
• Programs access a relatively small portion of the address space at any instant of time
• Two different types of locality:
  • Temporal locality (locality in time): if an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
  • Spatial locality (locality in space): if an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)
• For the last 30 years, hardware has relied on locality for memory performance
  • Guiding principle behind caches
  • To some degree, guides instruction execution, too (90/10 rule)
3. Focus on the common case
• In making a design trade-off, favor the frequent case over the infrequent case
  • E.g., the instruction fetch and decode unit is used more frequently than the multiplier, so optimize it first
  • E.g., if a database server has 50 disks per processor, storage dependability dominates system dependability, so optimize it first
• The frequent case is often simpler and can be made faster than the infrequent case
  • E.g., overflow is rare when adding two numbers, so improve performance by optimizing the common case of no overflow
  • This may slow down overflow, but overall performance improves by optimizing for the normal case
• Quantifying what the frequent case is and how much performance improves by making that case faster leads to Amdahl’s Law
4. Amdahl’s Law
• Speedup_overall = Execution time_old / Execution time_new
  = 1 / ((1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)
• Best you could ever hope to do (as Speedup_enhanced grows without bound):
  Speedup_maximum = 1 / (1 − Fraction_enhanced)
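A worked example helps (the numbers here are illustrative, not from the slides): suppose an enhancement speeds up the portion of a program accounting for 40% of execution time by a factor of 10.

```python
# Amdahl's Law: overall speedup from enhancing a fraction of execution time.

def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    return 1 / ((1 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

overall = amdahl_speedup(0.4, 10)    # 1.5625: far less than the 10x local gain
ceiling = amdahl_speedup(0.4, 1e12)  # approaches 1 / (1 - 0.4), about 1.67
```

No matter how fast the enhanced fraction gets, the unenhanced 60% caps the overall speedup below 1.67x, which is why the common case deserves the optimization effort.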
5. Processor performance

CPU time = Seconds / Program
         = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle)

Which factors affect each term:

               Inst Count   CPI   Clock Rate
Program            X
Compiler           X        (X)
Inst. Set          X         X
Organization                 X         X
Technology                             X
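The performance equation above translates directly into a calculation (the example numbers are illustrative, not from the slides):

```python
# CPU time = instruction count x cycles per instruction x clock cycle time.

def cpu_time(inst_count: int, cpi: float, clock_rate_hz: float) -> float:
    """Execution time in seconds, from the processor performance equation."""
    return inst_count * cpi * (1 / clock_rate_hz)

# E.g., 1 billion instructions at CPI 1.5 on a 2 GHz clock:
t = cpu_time(1_000_000_000, 1.5, 2e9)  # 0.75 seconds
```

The equation also shows why no single factor tells the whole story: a compiler change that lowers instruction count may raise CPI, and an organization change that lowers CPI may lower the clock rate.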
Next week
• Review of ISAs (Appendix B)
• Review of pipelining basics (Appendix A)
• Discussion of architectural simulation
Acknowledgements
• This lecture borrows heavily from David Patterson’s lecture slides for EECS 252: Graduate Computer Architecture, at the University of California, Berkeley
• Many figures and other information are taken from Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th ed., unless otherwise noted