650 likes | 976 Views
Text book. “ Computer Organization & Design — The hardware / software interface ” ---------------------Fourth Edition Author: David A. Patterson and john L. Hennessy. 學期分數計法. 兩次期中考 40% 期末考 30% 平時 30% 每次上課全程出席得學期成績總分 2 分,扣除三次期中、期末考,共 15 次
E N D
Text book “Computer Organization & Design— The hardware / software interface” ---------------------Fourth Edition Author: David A. Patterson and john L. Hennessy Jung-Sheng Fu, DEE, NUU, ROC
學期分數計法 • 兩次期中考 40% • 期末考 30% • 平時 30% • 每次上課全程出席得學期成績總分2分,扣除三次期中、期末考,共15次 • 其他加分: • 找出老師的錯誤,包括課程內容、英文發音、拚字或文法錯誤等等( 加總成績0.5~1分) • 回答問題 (加總成績0.2~1分分) • 問了好問題(加總成績0.1~0,5分) • 其他 • 學期總成績= 期中考1* 20% + 期中考2*20% + 期末考*20% + 平時 + 其他加分 Jung-Sheng Fu, DEE, NUU, ROC
投影片下載位址 • 數位學習系統上會有所有投影片 • 每週投影片會更新至http://web.nuu.edu.tw/~jsfu/course.htm • 請儘量養成看原文書的習慣,因為… Jung-Sheng Fu, DEE, NUU, ROC
學期授課大綱說明 • 1. Computer Abstractions and Technology • 2. Instructions: Language of the Computer • 3. Arithmetic for Computers • 4. The Processor • 5. Large and Fast: Exploiting Memory Hierarchy
Chapter 1 Computer Abstractions and Technology
The Computer Revolution §1.1 Introduction • Progress in computer technology • Makes novel applications feasible • Computers in automobiles • Cell phones • Human genome project • World Wide Web • Search Engines • Computers are pervasive Chapter 1 — Computer Abstractions and Technology — 6
Classes of Computers • Desktop computers • General purpose, variety of software • Subject to cost/performance tradeoff • Server computers • Network based • High capacity, performance, reliability • Range from small servers to building sized • Embedded computers • Hidden as components of systems • Stringent power/performance/cost constraints Chapter 1 — Computer Abstractions and Technology — 7
The Processor Market The number of cell phones, personal computers, and televisions manufactured per year between 1997 and 2007 Chapter 1 — Computer Abstractions and Technology — 8
What You Will Learn • How programs are translated into the machine language • And how the hardware executes them • The hardware/software interface • What determines program performance • And how it can be improved • How hardware designers improve performance • What is parallel processing Chapter 1 — Computer Abstractions and Technology — 9
Understanding Performance • Algorithm (Other books) • Determines number of operations executed • Programming language, compiler, architecture (Chapters 2 & 3) • Determine number of machine instructions executed per operation • Processor and memory system (Chapters 4, 5, 7) • Determine how fast instructions are executed • I/O system (including OS) (Chapters 6) • Determines how fast I/O operations are executed Chapter 1 — Computer Abstractions and Technology — 10
Below Your Program • Application software • Written in high-level language • System software • Compiler: translates HLL code to machine code • Operating System: service code • Handling input/output • Managing memory and storage • Scheduling tasks & sharing resources • Hardware • Processor, memory, I/O controllers §1.2 Below Your Program Chapter 1 — Computer Abstractions and Technology — 11
Levels of Program Code • High-level language • Level of abstraction closer to problem domain • Provides for productivity and portability • Assembly language • Textual representation of instructions • Hardware representation • Binary digits (bits) • Encoded instructions and data Chapter 1 — Computer Abstractions and Technology — 12
Components of a Computer §1.3 Under the Covers • Five classic Components The BIG Picture Chapter 1 — Computer Abstractions and Technology — 13
Components of a Computer §1.3 Under the Covers • All kinds of computer have same components • Desktop, server, embedded • Input/output includes (Chapter 6) • User-interface devices • Display, keyboard, mouse • Storage devices • Hard disk, CD/DVD, flash • Network adapters • For communicating with other computers Chapter 1 — Computer Abstractions and Technology — 14
Anatomy of a Computer Output device Network cable Input device Input device Chapter 1 — Computer Abstractions and Technology — 15
Anatomy of a Mouse • Optical mouse • LED illuminates desktop • Small low-res camera • Basic image processor • Looks for x, y movement • Buttons & wheel • Supersedes roller-ball mechanical mouse Chapter 1 — Computer Abstractions and Technology — 16
Through the Looking Glass • LCD screen: picture elements (pixels) • Mirrors content of frame buffer memory Chapter 1 — Computer Abstractions and Technology — 17
Opening the Box Chapter 1 — Computer Abstractions and Technology — 18
Inside the Processor (CPU) • Datapath: performs operations on data • Control: tells datapath, memory, and I/O devices what to do according to the instruction of the program ... • Cache memory • Small fast SRAM memory for immediate access to data Chapter 1 — Computer Abstractions and Technology — 19
Inside the Processor • AMD Barcelona: 4 processor cores Chapter 1 — Computer Abstractions and Technology — 20
Abstractions • Abstraction helps us deal with complexity • Hide lower-level detail • Instruction set architecture (ISA) • The hardware/software interface • Application binary interface • The ISA plus system software interface • Implementation • A hardware that obeys the architecutre abstraction The BIG Picture Chapter 1 — Computer Abstractions and Technology — 21
A Safe Place for Data • Main memory is volatile • Loses instructions and data when power off • Non-volatile secondary memory • Magnetic disk • Flash memory • Optical disk (CDROM, DVD, Blu-Ray) Chapter 1 — Computer Abstractions and Technology — 22
Networks • Communication and resource sharing • Local area network (LAN): Ethernet • Within a building • Wide area network (WAN: the Internet) • Wireless network: WiFi, Bluetooth Chapter 1 — Computer Abstractions and Technology — 23
Technology Trends • Electronics technology continues to evolve • Increased capacity and performance • Reduced cost DRAM capacity Chapter 1 — Computer Abstractions and Technology — 24
Defining Performance §1.4 Performance • Which airplane has the best performance? Chapter 1 — Computer Abstractions and Technology — 25
Response Time and Throughput • Response time (Execution time) • How long it takes to complete a task • Throughput • Total work done per unit time • e.g., tasks/transactions/… per hour • How are response time and throughput affected by • Replacing the processor with a faster version? • Adding more processors? • We’ll focus on response time for now… Chapter 1 — Computer Abstractions and Technology — 26
Relative Performance • Define Performance = 1/Execution Time • “X is n time faster than Y” • Example: time taken to run a program • 10s on A, 15s on B • Execution TimeB / Execution TimeA= 15s / 10s = 1.5 • So A is 1.5 times faster than B Chapter 1 — Computer Abstractions and Technology — 27
Measuring Execution Time • Elapsed time • Total response time, including all aspects • Processing, I/O, OS overhead, idle time • Determines system performance • CPU time • Time spent processing a given job • Discounts I/O time, other jobs’ shares • Comprises user CPU time and system CPU time Chapter 1 — Computer Abstractions and Technology — 28
Clock period Clock (cycles) Data transferand computation Update state CPU Clocking • Operation of digital hardware governed by a constant-rate clock • Clock period: duration of a clock cycle • e.g., 250ps = 0.25ns = 250×10–12s • Clock frequency (rate): cycles per second • e.g., 4.0GHz = 4000MHz = 4.0×109Hz Chapter 1 — Computer Abstractions and Technology — 29
額外補充 Chapter 1 — Computer Abstractions and Technology — 30
CPU Time • Performance improved by • Reducing number of clock cycles • Increasing clock rate • Hardware designer must often trade off clock rate against cycle count Chapter 1 — Computer Abstractions and Technology — 31
CPU Time Example • Computer A: 2GHz clock, 10s CPU time • Designing Computer B • Aim for 6s CPU time • Can do faster clock, but causes 1.2 × clock cycles of A • How fast must Computer B clock be? Chapter 1 — Computer Abstractions and Technology — 32
Instruction Count and CPI • Instruction Count for a program • Determined by program, ISA and compiler • Average Cycles Per Instruction (CPI) • Determined by CPU hardware • Provides one way of comparing two different implementations of the same ISA Chapter 1 — Computer Abstractions and Technology — 33
CPI Example • Computer A: Cycle Time = 250ps, CPI = 2.0 • Computer B: Cycle Time = 500ps, CPI = 1.2 • Same ISA • Which is faster, and by how much? A is faster… …by this much Chapter 1 — Computer Abstractions and Technology — 34
Classic performance Equation Chapter 1 — Computer Abstractions and Technology — 35
CPI in More Detail • If different instruction classes take different numbers of cycles • Weighted average CPI Relative frequency Chapter 1 — Computer Abstractions and Technology — 36
CPI Example • Alternative compiled code sequences using instructions in classes A, B, C • Which code sequence will be faster? • What is the CPI for each sequence? Chapter 1 — Computer Abstractions and Technology — 37
CPI Example • Sequence 1: IC = 5 • Clock Cycles= 2×1 + 1×2 + 2×3= 10 • Avg. CPI = 10/5 = 2.0 • Sequence 2: IC = 6 • Clock Cycles= 4×1 + 1×2 + 1×3= 9 • Avg. CPI = 9/6 = 1.5 Chapter 1 — Computer Abstractions and Technology — 38
Performance Summary • Performance depends on • Algorithm: affects IC, possibly CPI • Programming language: affects IC, CPI • Compiler: affects IC, CPI • Instruction set architecture: affects IC, CPI, Clock cycle time The BIG Picture Chapter 1 — Computer Abstractions and Technology — 39
Power Trends §1.5 The Power Wall Chapter 1 — Computer Abstractions and Technology — 40
Power Trends §1.5 The Power Wall • In CMOS IC technology, the primary source of power dissipation is so-called dynamic power (power that is consumed during switching) ×30 5V → 1V ×1000 Chapter 1 — Computer Abstractions and Technology — 41
Reducing Power • Suppose a new CPU has • 85% of capacitive load of old CPU • 15% voltage and 15% frequency reduction • The power wall • We can’t reduce voltage further (because of the static power dissipation) • We can’t remove more heat • How else can we improve performance? Chapter 1 — Computer Abstractions and Technology — 42
Uniprocessor Performance §1.6 The Sea Change: The Switch to Multiprocessors Constrained by power, instruction-level parallelism, memory latency Chapter 1 — Computer Abstractions and Technology — 43
Multiprocessors • Multicore microprocessors • More than one processor per chip • Requires explicitly parallel programming • Compare with instruction level parallelism • Hardware executes multiple instructions at once • Hidden from the programmer • Hard to do • Programming for performance (the parallel program must be fast; otherwise, just use the sequential program ) • Load balancing • Optimizing communication and synchronization overhead Chapter 1 — Computer Abstractions and Technology — 44
Manufacturing ICs • Yield: proportion of working dies per wafer §1.7 Real Stuff: The AMD Opteron X4 Chapter 1 — Computer Abstractions and Technology — 45
AMD Opteron X2 Wafer • X2: 300mm wafer, 117 chips, 90nm technology • X4: 45nm technology Chapter 1 — Computer Abstractions and Technology — 46
Integrated Circuit Cost • Costs are not linear to die area • When die area is decreased, how about cost? • Defect rate determined by manufacturing process • Die area determined by architecture and circuit design Chapter 1 — Computer Abstractions and Technology — 47
SPEC CPU Benchmark • Benchmark : Programs used to measure performance • Supposedly typical of actual workload • Standard Performance Evaluation Cooperative (SPEC) • Develops benchmarks for CPU, I/O, Web, … • SPEC CPU2006 • Elapsed time to execute a selection of programs • Negligible I/O, so focuses on CPU performance • Normalize relative to reference machine • Summarize as geometric mean of performance ratios • CINT2006 (integer) and CFP2006 (floating-point) Chapter 1 — Computer Abstractions and Technology — 48
CINT2006 for Opteron X4 2356 High cache miss rates Chapter 1 — Computer Abstractions and Technology — 49
SPEC Power Benchmark (skip) • Power consumption of servers at different workload levels • Performance: ssj_ops/sec • Power: Watts (Joules/sec) Chapter 1 — Computer Abstractions and Technology — 50