ECE 15B Computer Organization, Spring 2010, Dmitri Strukov
Lecture 3: Arithmetic Instructions
Partially adapted from Computer Organization and Design, 4th edition, Patterson and Hennessy, and classes taught by Patterson at Berkeley, Ryan Kastner at UCSB and Mary Jane Irwin at Penn State.

Announcement
• TA office hours for Vivek were moved to Tuesday 11:00 am – 12:00 pm
• Basics of logic design are in Appendix C (P+H)
• SPIM and reading status
ECE 15B Spring 2010

Agenda
• Key concepts from last lecture & several new ones
• C operators and operands
• Variables in Assembly: Registers
• Addition and Subtraction in Assembly

Key Concepts from Last Lecture
• Synchronous circuits
• Clocking & Pipelining & Timing Diagram
• CPU simplified diagram

CPU Clocking
• Operation of digital hardware governed by a constant-rate clock
[Figure: timing diagram showing clock cycles, the clock period, data transfer and computation, and state update]
• Clock period: duration of a clock cycle
  • e.g., 250 ps = 0.25 ns = 250×10^-12 s
• Clock frequency (rate): cycles per second
  • e.g., 4.0 GHz = 4000 MHz = 4.0×10^9 Hz

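The two examples on this slide are reciprocals of each other: a 250 ps period corresponds to a 4.0 GHz rate. A minimal C sketch of that conversion is given here; the function names `clock_rate_hz` and `clock_period_s` are made up for illustration and do not come from the lecture.

```c
#include <assert.h>

/* Clock rate in Hz from a clock period in seconds: f = 1 / T. */
double clock_rate_hz(double period_s) {
    assert(period_s > 0.0);   /* a period must be positive */
    return 1.0 / period_s;
}

/* Clock period in seconds from a clock rate in Hz: T = 1 / f. */
double clock_period_s(double rate_hz) {
    assert(rate_hz > 0.0);    /* a rate must be positive */
    return 1.0 / rate_hz;
}
```

Calling `clock_rate_hz(250e-12)` gives approximately 4.0×10^9 Hz, up to floating-point rounding.
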
CPU Overview
[Figure: simplified CPU diagram]

… with muxes
• Can’t just join wires together
  • Use multiplexers
[Figure: simplified CPU diagram with multiplexers]

Below the Program
High Level Language Program (e.g., C)
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  ↓ Compiler
Assembly Language Program (e.g., MIPS)
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
  ↓ Assembler
Machine Language Program (MIPS)
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  ↓ Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
  ↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)

Assembly Language
• Basic job of a CPU: execute lots of instructions
• Instructions are the primitive operations that the CPU may execute
• Different CPUs implement different sets of instructions
• Instruction Set Architecture (ISA) is the set of instructions a particular CPU implements
  • Examples: Intel 80x86 (Pentium 4), IBM/Motorola PowerPC (Macintosh), MIPS, Intel IA64, ARM

Instruction Set Architectures
• Early trend was to add more and more instructions to new CPUs to do elaborate operations
  • VAX architecture had an instruction to multiply polynomials
• RISC philosophy (Cocke at IBM, Patterson, Hennessy, 1980s)
  • RISC = Reduced Instruction Set Computing
  • Keep the instruction set small and simple, which makes it easier to build fast hardware
  • Let software (compiler) do complicated operations by composing simpler ones

How to Assess Performance of an Instruction Set Architecture?

SPEC CPU Benchmark
• Programs used to measure performance
  • Supposedly typical of actual workload
• Standard Performance Evaluation Corp (SPEC)
  • Develops benchmarks for CPU, I/O, Web, …
• SPEC CPU2006
  • Elapsed time to execute a selection of programs
    • Negligible I/O, so focuses on CPU performance
  • Normalize relative to reference machine
  • Summarize as geometric mean of performance ratios
    • CINT2006 (integer) and CFP2006 (floating-point)

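The "normalize, then summarize" step can be written out as a formula. This is a sketch of the standard geometric-mean definition; the symbol names (SPECratio, reference time, measured time, n benchmarks) are labels chosen here for illustration.

```latex
\[
\text{SPECratio}_i = \frac{\text{reference time}_i}{\text{measured time}_i},
\qquad
\text{geometric mean} = \sqrt[n]{\prod_{i=1}^{n} \text{SPECratio}_i}
\]
```

One property of the geometric mean is that the ratio of two machines' summary scores does not depend on which reference machine was used for normalization.
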
CINT2006 for Opteron X4 2356

Review: Technology Trends
[Figure: Uniprocessor Performance (SPECint), plotted as performance vs. VAX-11/780 over time]
• VAX: 1.25x/year, 1978 to 1986
• RISC + x86: 1.52x/year, 1986 to 2002
• RISC + x86: 1.20x/year, 2002 to present
• “Sea change” in chip design: multiple “cores” or processors per chip

MIPS Architecture
• MIPS
  • Semiconductor company that built one of the first commercial RISC architectures
• We will study the MIPS architecture in detail in this class
• Why MIPS instead of Intel 80x86?
  • MIPS is simple and elegant. Don’t want to get bogged down in gritty details
  • MIPS is widely used in embedded apps
  • There are more embedded computers than PCs

Assembly Variables: Registers
• Unlike HLLs such as C or Java, assembly cannot use variables
  • Why not? Keep hardware simple
• Assembly operands are registers
  • Limited number of special locations built directly into the hardware
  • Operations can only be performed on these
• Benefit: since the register file is small, it is very fast

Assembly Variables: Registers
• Drawback:
  • Since registers are in hardware, there is a predetermined number of them
• Solution:
  • MIPS code must be very carefully put together to use registers efficiently
• 32 registers in MIPS
  • Smaller is faster
• Each MIPS register is 32 bits wide
  • A group of 32 bits is called a word in MIPS

Assembly Variables
• Registers are numbered from 0 to 31
• Each register can be referred to by number or name
• Number references
  • $0, $1, $2, …, $30, $31

Assembly Variables: Registers
• By convention, each register also has a name to make it easier to code
• For now:
    $16 - $23  →  $s0 - $s7  (correspond to C variables)
    $8 - $15   →  $t0 - $t7  (correspond to temporary variables)
  Will explain other 16 register names later
• In general, use names to make your code more readable

C, Java Variables vs. Registers
• In C (and most High Level Languages) variables are declared first and given a type
  • Example:
      int fahr, celsius;
      char a, b, c, d, e;
• Each variable can only represent a value of the type it was declared as (cannot mix and match int and char variables)
• In assembly language the registers have no type
  • Operation determines how register contents are treated

Comments in Assembly
• Another way to make your code more readable: comments
• Hash (#) is used for MIPS comments
  • Anything from the hash mark to the end of the line is a comment and will be ignored
• Note: this differs from C, where comments have the format /* comment */ and can therefore span many lines

Assembly Instructions
• In assembly language, each statement (called an instruction) executes exactly one of a short list of simple commands
• Unlike in C (and most other high level languages), each line of assembly code contains at most one instruction
• Instructions are related to operations (=, +, -, *, /) in C or Java

MIPS Syntax
• Instruction syntax:
    [Label:]  Op-code  [oper. 1], [oper. 2], [oper. 3]  [# comment]
      (0)       (1)       (2)        (3)        (4)         (5)
• Where
    1) operation name
    2, 3, 4) operands
    5) comment
    0) label field is optional, will discuss later
• For arithmetic and logic instructions
    2) operand getting result (“destination”)
    3) 1st operand for operation (“source 1”)
    4) 2nd operand for operation (“source 2”)
• Syntax is rigid
  • 1 operator, 3 operands
  • Why? Keep hardware simple via regularity

Addition and Subtraction of Integers
• Addition in assembly
  • Example: add $s0, $s1, $s2 (in MIPS)
  • Equivalent to: a = b + c (in C)
  • Where MIPS registers $s0, $s1, $s2 are associated with C variables a, b, c
• Subtraction in assembly
  • Example: sub $s3, $s4, $s5 (in MIPS)
  • Equivalent to: d = e - f (in C)
  • Where MIPS registers $s3, $s4, $s5 are associated with C variables d, e, f

Addition and Subtraction of Integers
• How do we write the following C statement in MIPS?
    a = b + c + d - e
• Break into multiple instructions
    add $t0, $s1, $s2   # temp = b + c
    add $t0, $t0, $s3   # temp = temp + d
    sub $s0, $t0, $s4   # a = temp - e
• Notes:
  • A single line of C may break up into several lines of MIPS
  • Everything after the hash mark on each line is ignored (i.e. comments)

Addition and Subtraction of Integers
• How do we do this?
    f = (g + h) - (i + j)
• Use intermediate temporary registers
    add $t0, $s1, $s2   # temp1 = g + h
    add $t1, $s3, $s4   # temp2 = i + j
    sub $s0, $t0, $t1   # f = (g + h) - (i + j)

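The same decomposition can be mirrored in C by allowing only one operation per statement, so each line corresponds to a single three-operand instruction. This is an illustrative sketch: the function names are invented here, and the register assignments in the comments simply repeat the ones used on the slides.

```c
/* f = (g + h) - (i + j), one operation per statement. */
int f_three_operand(int g, int h, int i, int j) {
    int t0 = g + h;   /* add $t0, $s1, $s2 */
    int t1 = i + j;   /* add $t1, $s3, $s4 */
    return t0 - t1;   /* sub $s0, $t0, $t1 */
}

/* The earlier three-instruction sequence: temp = b + c,
   temp = temp + d, a = temp - e, reusing one temporary. */
int a_three_operand(int b, int c, int d, int e) {
    int t0 = b + c;   /* add $t0, $s1, $s2 */
    t0 = t0 + d;      /* add $t0, $t0, $s3 */
    return t0 - e;    /* sub $s0, $t0, $s4 */
}
```

The first function needs two temporaries because both sums must exist at the same time before the subtraction; the second can reuse a single temporary because each step consumes the previous result.
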
Immediates
• Immediates are numerical constants
• They appear often in code, so there are special instructions for them
• Add immediate:
    addi $s0, $s1, 10   # f = g + 10 (in C)
  • Where MIPS registers $s0 and $s1 are associated with C variables f and g
• Syntax similar to add instruction, except that the last argument is a number instead of a register

Immediates
• There is no Subtract Immediate in MIPS: Why?
  • Remove redundant operations, i.e. if an operation can be decomposed into simpler ones, exclude it from the set of instructions
  • addi …, -X is equivalent to subi …, X, so no subi
• Example:
    addi $s0, $s1, -10   # f = g - 10
  • Where MIPS registers $s0 and $s1 are associated with C variables f and g

Register Zero
• One particular immediate, the number zero (0), appears very often in code
• So define register zero ($0 or $zero) to always have the value 0
• Example:
    add $s0, $s1, $zero   # f = g
  • Where MIPS registers $s0 and $s1 are associated with C variables f and g
• Defined in hardware, so an instruction
    addi $zero, $zero, 5
  will not do anything!

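The hardwired-zero behavior can be illustrated with a small C model of a register file. This is a sketch under stated assumptions, not how the hardware or SPIM is implemented: `RegFile`, `reg_read`, `reg_write`, and the two instruction helpers are hypothetical names, and the overflow trap of the real `add`/`addi` instructions is not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 32   /* MIPS has 32 registers, numbered 0 to 31 */

/* Hypothetical register-file model; index 0 stands for $zero. */
typedef struct {
    int32_t regs[NUM_REGS];
} RegFile;

/* Read register r; register 0 always reads as 0. */
int32_t reg_read(const RegFile *rf, unsigned r) {
    assert(r < NUM_REGS);
    return r == 0 ? 0 : rf->regs[r];
}

/* Write register r; writes to register 0 are discarded. */
void reg_write(RegFile *rf, unsigned r, int32_t value) {
    assert(r < NUM_REGS);
    if (r != 0) {
        rf->regs[r] = value;
    }
}

/* addi rd, rs, imm (overflow not modeled) */
void addi(RegFile *rf, unsigned rd, unsigned rs, int32_t imm) {
    reg_write(rf, rd, reg_read(rf, rs) + imm);
}

/* add rd, rs, rt (overflow not modeled) */
void add(RegFile *rf, unsigned rd, unsigned rs, unsigned rt) {
    reg_write(rf, rd, reg_read(rf, rs) + reg_read(rf, rt));
}
```

In this model, `addi(&rf, 0, 0, 5)` leaves register 0 reading as 0, and `add(&rf, 16, 17, 0)` copies register 17 into register 16, which is the `f = g` idiom from the slide.
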
Additional Notes: CPU Time
• Performance improved by
  • Reducing number of clock cycles
  • Increasing clock rate
• Hardware designer must often trade off clock rate against cycle count

Additional Notes: CPU Time Example
• Computer A: 2 GHz clock, 10 s CPU time
• Designing Computer B
  • Aim for 6 s CPU time
  • Can do faster clock, but causes 1.2 × clock cycles
• How fast must Computer B clock be?

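A worked version of this example, assuming the usual relation that CPU time equals clock cycles divided by clock rate (the subscripted names are labels chosen here):

```latex
\begin{align*}
\text{CPU Time} &= \frac{\text{Clock Cycles}}{\text{Clock Rate}} \\
\text{Clock Cycles}_A &= \text{CPU Time}_A \times \text{Clock Rate}_A
  = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9} \\
\text{Clock Cycles}_B &= 1.2 \times \text{Clock Cycles}_A = 24 \times 10^{9} \\
\text{Clock Rate}_B &= \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B}
  = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}
\end{align*}
```

So Computer B needs a clock twice as fast as Computer A's to reach 6 s despite executing 20% more cycles.
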
Additional Notes: Instruction Count and CPI
• Instruction Count for a program
  • Determined by program, ISA and compiler
• Average cycles per instruction (CPI)
  • Determined by CPU hardware
  • If different instructions have different CPI
    • Average CPI affected by instruction mix

Additional Notes: CPI Example
• Computer A: Cycle Time = 250 ps, CPI = 2.0
• Computer B: Cycle Time = 500 ps, CPI = 1.2
• Same ISA
• Which is faster, and by how much?
[Equation: CPU time comparison, annotated “A is faster … by this much”]

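The comparison can be worked through as follows, assuming CPU time is instruction count times CPI times cycle time, and writing I for the instruction count, which is the same for both machines because they share an ISA:

```latex
\begin{align*}
\text{CPU Time} &= \text{Instruction Count} \times \text{CPI} \times \text{Cycle Time} \\
\text{CPU Time}_A &= I \times 2.0 \times 250\,\text{ps} = 500\,I\ \text{ps} \\
\text{CPU Time}_B &= I \times 1.2 \times 500\,\text{ps} = 600\,I\ \text{ps} \\
\frac{\text{CPU Time}_B}{\text{CPU Time}_A} &= \frac{600\,I\ \text{ps}}{500\,I\ \text{ps}} = 1.2
\end{align*}
```

Computer A is therefore 1.2 times faster, even though its CPI is higher.
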
Additional Notes: CPI in More Detail
• If different instruction classes take different numbers of cycles
• Weighted average CPI
[Equation: weighted average CPI, with each class weighted by its relative frequency]

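A minimal C sketch of the weighted average: total cycles are the sum of each class's CPI times its instruction count, and dividing by the total instruction count weights each class by its relative frequency. The function name `weighted_cpi` and the sample numbers below are illustrative assumptions, not values from the lecture.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted average CPI over n instruction classes:
   CPI = sum(cpi[i] * count[i]) / sum(count[i]),
   where count[i] / sum(count) is the relative frequency of class i. */
double weighted_cpi(const double *cpi, const unsigned long *count, size_t n) {
    double cycles = 0.0;
    unsigned long instructions = 0;
    for (size_t i = 0; i < n; i++) {
        cycles += cpi[i] * (double)count[i];
        instructions += count[i];
    }
    assert(instructions > 0);   /* average is undefined for an empty mix */
    return cycles / (double)instructions;
}
```

For example, with class CPIs of 1, 2, and 3 and instruction counts of 2, 1, and 2, the total is 10 cycles over 5 instructions, so the function returns 2.0.
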
Additional Notes: Pipelining Analogy
• Pipelined laundry: overlapping execution
  • Parallelism improves performance
• Four loads:
  • Speedup = 8/3.5 = 2.3
• Non-stop:
  • Speedup = 2n / (0.5n + 1.5) ≈ 4 = number of stages

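The speedup figures follow from a simple timing model, sketched here in C. It assumes four equal stages of 0.5 hours each, which is what the 2n and 0.5n + 1.5 expressions imply; the function names are invented for illustration.

```c
#include <assert.h>

/* Time for n tasks with no overlap: each task passes through every stage
   before the next one starts. */
double sequential_time(unsigned n, unsigned stages, double stage_time) {
    return (double)n * (double)stages * stage_time;
}

/* Time for n tasks with overlapped stages: the first task takes `stages`
   steps, and each later task finishes one step after the previous one. */
double pipelined_time(unsigned n, unsigned stages, double stage_time) {
    assert(n > 0 && stages > 0);
    return (double)(stages + n - 1) * stage_time;
}

/* Ratio of sequential to pipelined time for the same n tasks. */
double pipeline_speedup(unsigned n, unsigned stages, double stage_time) {
    return sequential_time(n, stages, stage_time) /
           pipelined_time(n, stages, stage_time);
}
```

With n = 4, stages = 4, and stage_time = 0.5, the two times are 8 and 3.5 hours, matching the slide. As n grows, the fixed 1.5-hour fill time matters less and the speedup approaches the number of stages.
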
Additional Notes: Pipeline Performance
[Figure: instruction timing, single-cycle (Tc = 800 ps) vs. pipelined (Tc = 200 ps)]

Conclusions
• In MIPS assembly language
  • Registers replace C variables
  • One instruction (simple operation) per line
  • Simpler is faster

Review
• Instructions so far: add, addi, sub
• Registers so far
  • C variables: $s0 - $s7
  • Temporary variables: $t0 - $t9
  • Zero: $zero