Chapter 3

Chapter 3 Arithmetic for Computers

Arithmetic for Computers
§3.1 Introduction
• What about fractions and other real numbers?
• What happens if an operation creates a number bigger than can be represented?
• And underlying these questions is a mystery: How does hardware really multiply or divide numbers?
• Operations on integers
  • Addition and subtraction
  • Multiplication and division
  • Dealing with overflow
• Floating-point real numbers
  • Representation and operations

§3.2 Addition and Subtraction
Integer Addition
• Example: 7 + 6
• Overflow if result out of range
  • Adding +ve and -ve operands: no overflow
  • Adding two +ve operands: overflow if result sign is 1
  • Adding two -ve operands: overflow if result sign is 0
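The sign rules above can be checked with a small sketch. It assumes 32-bit two's-complement operands; the function name `add32` is illustrative and does not come from the slides.

```python
def add32(a, b):
    """Add two signed 32-bit integers; return (wrapped_result, overflow)."""
    mask = 0xFFFFFFFF
    raw = (a + b) & mask                     # keep only the low 32 bits
    result = raw - (1 << 32) if raw & 0x80000000 else raw
    # Overflow is only possible when both operands have the same sign
    # and the sign of the result differs from it.
    same_sign = (a >= 0) == (b >= 0)
    overflow = same_sign and ((result >= 0) != (a >= 0))
    return result, overflow


print(add32(7, 6))            # small operands, no overflow
print(add32(2**31 - 1, 1))    # two positive operands, result sign is 1
```

Adding operands of opposite sign never sets the flag, which matches the first rule on the slide.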

Integer Subtraction
• Add negation of second operand
• Example: 7 - 6 = 7 + (-6)
    +7: 0000 0000 … 0000 0111
    -6: 1111 1111 … 1111 1010
    +1: 0000 0000 … 0000 0001
• Overflow if result out of range
  • Subtracting two +ve or two -ve operands: no overflow
  • Subtracting +ve from -ve operand: overflow if result sign is 0
  • Subtracting -ve from +ve operand: overflow if result sign is 1
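As a rough illustration of "add the negation of the second operand", the sketch below negates by inverting the bits and adding one, then applies the overflow rules listed above. It assumes 32-bit two's complement; `sub32` is a hypothetical helper name.

```python
def sub32(a, b):
    """Compute a - b on signed 32-bit values as a + (-b); return (result, overflow)."""
    mask = 0xFFFFFFFF
    neg_b = (~b + 1) & mask                  # two's-complement negation of b
    raw = ((a & mask) + neg_b) & mask
    result = raw - (1 << 32) if raw & 0x80000000 else raw
    # Overflow is only possible when the operands have different signs
    # and the result sign differs from the sign of the first operand.
    differ = (a >= 0) != (b >= 0)
    overflow = differ and ((result >= 0) != (a >= 0))
    return result, overflow


print(sub32(7, 6))            # 7 + (-6)
print(sub32(-2**31, 1))       # +ve subtracted from -ve, result sign is 0
```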

Overflow
• Cause exceptions on overflow: add, addi, sub
• Do not cause exceptions on overflow: addu, addiu, subu

Dealing with Overflow
• Some languages (e.g., C) ignore overflow
  • Use MIPS addu, addiu, subu instructions
• Other languages (e.g., Ada, Fortran) require raising an exception
  • Use MIPS add, addi, sub instructions
  • On overflow, invoke exception handler
    • Save PC in exception program counter (EPC) register
    • Jump to predefined handler address
    • mfc0 (move from coprocessor reg) instruction can retrieve EPC value, to return after corrective action

Overflow

Arithmetic for Multimedia
• Graphics and media processing operates on vectors of 8-bit and 16-bit data
  • Use 64-bit adder, with partitioned carry chain
    • Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
  • SIMD (single-instruction, multiple-data)
• Saturating operations
  • On overflow, result is largest representable value
    • cf. 2s-complement modulo arithmetic
  • E.g., clipping in audio, saturation in video
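A minimal sketch of the two ideas on this slide, assuming unsigned 8-bit lanes packed into a 64-bit integer. The helper names are illustrative. The loop models a partitioned carry chain by never letting a carry cross a lane boundary.

```python
def sat_add_u8(a, b):
    """Saturating add: clamp to the largest representable 8-bit value."""
    return min(a + b, 0xFF)


def wrap_add_u8(a, b):
    """Modulo add: discard the carry out of the 8-bit lane."""
    return (a + b) & 0xFF


def simd_add_u8x8(x, y, lane_add):
    """Add eight unsigned 8-bit lanes of two 64-bit words independently."""
    out = 0
    for lane in range(8):
        shift = 8 * lane
        a = (x >> shift) & 0xFF
        b = (y >> shift) & 0xFF
        out |= lane_add(a, b) << shift       # carries never leave the lane
    return out


print(hex(simd_add_u8x8(0x01FF, 0x0101, sat_add_u8)))
print(hex(simd_add_u8x8(0x01FF, 0x0101, wrap_add_u8)))
```

With saturation the low lane clips at 0xFF; with modulo arithmetic it wraps to 0x00, which is the audio-clipping contrast the slide refers to.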

§3.3 Multiplication
Multiplication
• Start with long-multiplication approach

    1000   multiplicand
  × 1001   multiplier
    1000
   0000
  0000
 1000
 1001000   product

• Length of product is the sum of operand lengths
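The long-multiplication approach can be sketched as a shift-and-add loop. This is an illustrative sketch for unsigned operands, not a model of the exact hardware datapath; the name `shift_add_multiply` and the `bits` parameter are assumptions.

```python
def shift_add_multiply(multiplicand, multiplier, bits=4):
    """Unsigned multiply by examining one multiplier bit per iteration."""
    product = 0                              # product starts at 0
    for _ in range(bits):
        if multiplier & 1:                   # low multiplier bit is 1:
            product += multiplicand          # add the shifted multiplicand
        multiplicand <<= 1                   # next partial product moves left
        multiplier >>= 1                     # expose the next multiplier bit
    return product


print(bin(shift_add_multiply(0b1000, 0b1001)))
```

Two 4-bit operands can need up to 8 product bits, which is why the product length is the sum of the operand lengths.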

Multiplication Hardware
[Figure: multiplication hardware datapath, with one register labeled "Initially 0"]

Optimized Multiplier
• Perform steps in parallel: add/shift
• One cycle per partial-product addition
  • That's ok, if frequency of multiplications is low

Signed Multiplication
• Convert the operands to positive numbers, remembering the original signs
• Run the algorithm for 31 iterations
• Negate the product if the original signs differ

Faster Multiplier
• Uses multiple adders
  • Cost/performance tradeoff
• Can be pipelined
  • Several multiplications performed in parallel

MIPS Multiplication
• Two 32-bit registers for product
  • HI: most-significant 32 bits
  • LO: least-significant 32 bits
• Instructions
  • mult rs, rt / multu rs, rt
    • 64-bit product in HI/LO
  • mfhi rd / mflo rd
    • Move from HI/LO to rd
    • Can test HI value to see if product overflows 32 bits
  • mul rd, rs, rt
    • Least-significant 32 bits of product -> rd
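To make the HI/LO split concrete, here is a small sketch that models only the arithmetic result of an unsigned 32-bit multiply. It is not an emulator, and the function name `multu` is simply borrowed from the instruction for readability.

```python
def multu(rs, rt):
    """Model the unsigned 64-bit product split into (HI, LO) 32-bit halves."""
    product = (rs & 0xFFFFFFFF) * (rt & 0xFFFFFFFF)
    hi = product >> 32                       # most-significant 32 bits
    lo = product & 0xFFFFFFFF                # least-significant 32 bits
    return hi, lo


hi, lo = multu(0x10000, 0x10000)
print(hi, lo)
print("overflows 32 bits" if hi != 0 else "fits in 32 bits")
```

For unsigned operands, a nonzero HI means the product does not fit in 32 bits, which is the test mentioned on the slide.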

§3.4 Division
Division
• Check for 0 divisor
• Long division approach
  • If divisor ≤ dividend bits: 1 bit in quotient, subtract
  • Otherwise: 0 bit in quotient, bring down next dividend bit
• Restoring division
  • Do the subtract, and if remainder goes < 0, add divisor back
• Signed division
  • Divide using absolute values
  • Adjust sign of quotient and remainder as required
• n-bit operands yield n-bit quotient and remainder

                  1001  quotient
divisor 1000 ) 1001010  dividend
              -1000
                  10
                  101
                  1010
                 -1000
                    10  remainder
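The restoring-division steps above can be sketched as follows. The function names are illustrative. The signed wrapper assumes the common convention that the remainder takes the sign of the dividend, which is one way to "adjust sign as required".

```python
def restoring_divide(dividend, divisor, bits=32):
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("check for 0 divisor first")
    quotient, remainder = 0, 0
    for i in range(bits - 1, -1, -1):
        # Bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        remainder -= divisor                 # trial subtraction
        if remainder < 0:
            remainder += divisor             # restore: add divisor back
            quotient = quotient << 1         # 0 bit in quotient
        else:
            quotient = (quotient << 1) | 1   # 1 bit in quotient
    return quotient, remainder


def signed_divide(dividend, divisor):
    """Divide absolute values, then fix up the signs (assumed convention)."""
    q, r = restoring_divide(abs(dividend), abs(divisor))
    if (dividend < 0) != (divisor < 0):
        q = -q                               # signs differ: negative quotient
    if dividend < 0:
        r = -r                               # remainder follows the dividend
    return q, r


print(restoring_divide(0b1001010, 0b1000))
print(signed_divide(-7, 2))
```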

Division Hardware
[Figure: division hardware datapath; register labels: "Initially divisor in left half", "Initially dividend"]

Optimized Divider
• One cycle per partial-remainder subtraction
• Looks a lot like a multiplier!
  • Same hardware can be used for both

Signed Division -4 +1

Faster Division
• Can't use parallel hardware as in multiplier
  • Subtraction is conditional on sign of remainder
• Faster dividers (e.g., SRT division) generate multiple quotient bits per step
  • Still require multiple steps

MIPS Division
• Use HI/LO registers for result
  • HI: 32-bit remainder
  • LO: 32-bit quotient
• Instructions
  • div rs, rt / divu rs, rt
  • No overflow or divide-by-0 checking
    • Software must perform checks if required
  • Use mfhi, mflo to access result

Floating Point
• Scientific notation
• Normalized
• Floating point
• Fraction
• Exponent

Floating-Point representation
• Overflow
• Underflow
• Double precision
  • Two 32-bit words
  • 52-bit fraction
  • 11-bit exponent
  • Largest magnitude about 2.0 × 10^308
• Single precision
  • A single 32-bit word
• IEEE 754 floating-point standard
  • (-1)^S × (1 + Fraction) × 2^(Exponent - Bias)

Floating-Point representation
• IEEE 754 floating-point standard
  • Integer comparisons
  • Biased exponent

Example • Showing representation of the number -0.75 in IEEE 754
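One way to inspect the single-precision encoding of -0.75 is with Python's standard `struct` module. The helper name `float_bits` is illustrative. The field widths (1 sign bit, 8 exponent bits, 23 fraction bits) and the bias of 127 are those of IEEE 754 single precision.

```python
import struct


def float_bits(x):
    """Return (word, sign, biased_exponent, fraction) for a 32-bit float."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31                        # 1 bit
    exponent = (word >> 23) & 0xFF           # 8 bits, biased by 127
    fraction = word & 0x7FFFFF               # 23 bits
    return word, sign, exponent, fraction


word, sign, exponent, fraction = float_bits(-0.75)
print(f"{word:032b}")
print(sign, exponent - 127, f"{fraction:023b}")
```

Since -0.75 = -1.1two × 2^-1, the sign bit is 1, the biased exponent is -1 + 127 = 126, and the fraction field begins with a single 1 bit.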

Floating-point addition
• Example:
  • 4 decimal digits for the significand
  • 2 digits for the exponent
• Step 1: align the operands
• Step 2: add the significands
• Step 3: normalize the sum, checking for overflow and underflow
• Step 4: round

Floating-point addition

Example
• Try adding the numbers 0.5ten and -0.4375ten in binary
• Step 1:
• Step 2:
• Step 3:
• Step 4:
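The four addition steps can be traced on these two numbers with a small sketch. It assumes a significand with 3 fraction bits and the single-precision exponent range, and it uses exact rational arithmetic instead of guard bits, so it is an illustration rather than a model of real hardware. All names are hypothetical.

```python
from fractions import Fraction


def normalize(value):
    """Return (significand, shift) with 1 <= |significand| < 2 (value != 0)."""
    shift = 0
    while abs(value) >= 2:
        value /= 2
        shift += 1
    while abs(value) < 1:
        value *= 2
        shift -= 1
    return value, shift


def fp_add(a, b, frac_bits=3):
    """Add two (significand, exponent) pairs following the four steps."""
    (sig_a, exp_a), (sig_b, exp_b) = a, b
    if exp_a < exp_b:                        # keep the larger exponent in a
        (sig_a, exp_a), (sig_b, exp_b) = (sig_b, exp_b), (sig_a, exp_a)
    sig_b = sig_b / 2 ** (exp_a - exp_b)     # Step 1: align smaller operand
    total = sig_a + sig_b                    # Step 2: add significands
    if total == 0:
        return Fraction(0), 0
    sig, shift = normalize(total)            # Step 3: normalize
    exp = exp_a + shift
    if not -126 <= exp <= 127:               # overflow/underflow check
        raise OverflowError("exponent out of range")
    scale = 2 ** frac_bits
    sig = Fraction(round(sig * scale), scale)  # Step 4: round (ties to even)
    if abs(sig) >= 2:                        # rounding may need renormalizing
        sig, exp = sig / 2, exp + 1
    return sig, exp


half = (Fraction(1), -1)                     # 0.5     =  1.000two × 2^-1
other = (Fraction(-7, 4), -2)                # -0.4375 = -1.110two × 2^-2
sig, exp = fp_add(half, other)
print(sig, exp, float(sig * Fraction(2) ** exp))
```

Aligning gives -0.111two × 2^-1, the sum is 0.001two × 2^-1, and normalizing yields 1.000two × 2^-4, which is 0.0625.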

Block diagram

Floating-Point Multiplication
• Step 1.
• Step 2.
• Step 3: normalizing
• Step 4: rounding the number
• Step 5: the sign of the product
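The slide leaves steps 1 and 2 blank. The sketch below assumes the usual scheme of adding the exponents and then multiplying the significands, and it works with unbiased exponents, so no bias correction appears. The operand values, the 3 fraction bits, and all names are illustrative assumptions.

```python
from fractions import Fraction


def fp_mul(a, b, frac_bits=3):
    """Multiply two (sign, significand, exponent) triples with significand in [1, 2)."""
    (sign_a, sig_a, exp_a), (sign_b, sig_b, exp_b) = a, b
    exp = exp_a + exp_b                      # Step 1 (assumed): add exponents
    sig = sig_a * sig_b                      # Step 2 (assumed): multiply significands
    if sig >= 2:                             # Step 3: normalize (product is in [1, 4))
        sig, exp = sig / 2, exp + 1
    if not -126 <= exp <= 127:               # overflow/underflow check
        raise OverflowError("exponent out of range")
    scale = 2 ** frac_bits
    sig = Fraction(round(sig * scale), scale)  # Step 4: round (ties to even)
    if sig >= 2:                             # rounding may need renormalizing
        sig, exp = sig / 2, exp + 1
    sign = sign_a ^ sign_b                   # Step 5: sign of the product
    return sign, sig, exp


# 0.5 × -0.4375, i.e. 1.000two × 2^-1 times -1.110two × 2^-2
print(fp_mul((0, Fraction(1), -1), (1, Fraction(7, 4), -2)))
```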

Floating-Point Multiplication

Floating-Point Instructions in MIPS
• MIPS supports the IEEE 754 single-precision and double-precision formats
  • Floating-point addition: single (add.s), double (add.d)
  • Floating-point subtraction: single (sub.s), double (sub.d)
  • Floating-point multiplication: single (mul.s), double (mul.d)
  • Floating-point division: single (div.s), double (div.d)
  • Floating-point comparison: single (c.x.s), double (c.x.d)
  • Floating-point branch: true (bc1t), false (bc1f)
• $f0, $f1, $f2, ... are used for either single precision or double precision
• lwc1 and swc1

Compiling a Floating-Point C Program into MIPS Assembly Code
• Assume that the floating-point argument fahr is passed in $f12 and the result should go in $f0.

Compiling a Floating-Point C Program into MIPS Assembly Code
• X = X + Y * Z

Compiling a Floating-Point C Program into MIPS Assembly Code

Floating-Point
• Elaboration
  • Row-major vs. column-major
  • Paired single version (add.ps)
  • Single instruction that computes a + b × c (PowerPC, SPARC64, AMD SSE5)

Accurate Arithmetic
• Two extra bits (guard and round)
• Example:
• Units in the last place (ulp)
• IEEE 754 rounding modes
  • Round up
  • Round down
  • Truncate
  • Round to nearest even
• Sticky bit
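The four rounding modes can be compared side by side. This sketch uses Python's `decimal` module and rounds to whole numbers, so it illustrates only the direction of each mode, not binary guard, round, or sticky bits. The mapping of slide names to `decimal` constants is an assumption stated in the comments.

```python
from decimal import (Decimal, ROUND_CEILING, ROUND_FLOOR,
                     ROUND_DOWN, ROUND_HALF_EVEN)

# Assumed mapping from the slide's mode names to decimal rounding constants.
MODES = {
    "round up": ROUND_CEILING,        # toward +infinity
    "round down": ROUND_FLOOR,        # toward -infinity
    "truncate": ROUND_DOWN,           # toward zero
    "nearest even": ROUND_HALF_EVEN,  # ties go to the even neighbor
}


def round_all(text):
    """Round a decimal string to an integer under each mode."""
    value = Decimal(text)
    return {name: int(value.quantize(Decimal("1"), rounding=mode))
            for name, mode in MODES.items()}


for sample in ("2.5", "3.5", "-2.5"):
    print(sample, round_all(sample))
```

The halfway cases show why "nearest even" differs from always rounding up: 2.5 goes to 2 while 3.5 goes to 4.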

Arithmetic summary

Parallelism and Computer Arithmetic: Associativity
• Floating-point addition is not associative
• x + (y + z) ≠ (x + y) + z
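A short demonstration of the inequality, using values chosen for illustration (they are not given on the slide). Python floats are IEEE 754 double precision, and the effect shows up because 1.0 is far smaller than the spacing between representable numbers near 1.5e38.

```python
x, y, z = -1.5e38, 1.5e38, 1.0

left = x + (y + z)     # y + z rounds back to y, so the 1.0 is lost
right = (x + y) + z    # x + y is exactly 0.0, so the 1.0 survives

print(left, right, left == right)
```

This matters for parallel code because a different order of summation across threads can change the result.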

The x86 floating-point architecture
• Stack architecture
• Register-memory instructions
• st and st(i)
• 80 bits wide
• Floating-point operations
  • Data movement instructions
  • Arithmetic instructions
  • Comparison
  • Transcendental instructions

x86 FP Instructions
• Optional variations
  • I: integer operand
  • P: pop operand from stack
  • R: reverse operand order
  • But not all combinations allowed

The x86 floating-point architecture

The Intel Streaming SIMD Extension 2 (SSE2)
• In 2001, Intel added 144 instructions
• Eight 64-bit registers
• AMD64, EM64T
• 128-bit double precision

Streaming SIMD Extension 2 (SSE2)
• Adds 8 × 128-bit registers
  • Extended to 16 registers in AMD64/EM64T
• Can be used for multiple FP operands
  • 2 × 64-bit double precision
  • 4 × 32-bit single precision
  • Instructions operate on them simultaneously
    • Single-Instruction Multiple-Data

Fallacies and Pitfalls
• Fallacy: Right shift is the same as an integer division by a power of 2.
  • -5 = 1111 1111 1111 1111 1111 1111 1111 1011two
  • Shifting right by two = 0011 1111 1111 1111 1111 1111 1111 1110two
    • Equal to 1,073,741,822ten
  • Arithmetic shifting = 1111 1111 1111 1111 1111 1111 1111 1110two
    • Equal to -2
• Pitfall: The MIPS instruction add immediate unsigned (addiu) sign-extends its 16-bit immediate field.
• Fallacy: Only theoretical mathematicians care about floating-point accuracy.
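The shift fallacy can be reproduced directly. The helper names are illustrative, and the division helper assumes truncation toward zero, as C and the MIPS div instruction do.

```python
def srl32(value, amount):
    """Logical right shift of a 32-bit pattern: zeros enter from the left."""
    return (value & 0xFFFFFFFF) >> amount


def sra32(value, amount):
    """Arithmetic right shift: Python's >> on ints replicates the sign bit."""
    return value >> amount


def div_trunc(a, b):
    """Integer division that truncates toward zero."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q


print(srl32(-5, 2))      # logical shift of the bit pattern for -5
print(sra32(-5, 2))      # arithmetic shift
print(div_trunc(-5, 4))  # what division by 4 actually gives
```

Neither shift matches the division result of -1: the logical shift produces a large positive number and the arithmetic shift rounds toward negative infinity.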

The frequency of the MIPS instructions for SPEC2006 integer and floating point
