Chapter 3: Arithmetic for Computers. ROBERT CHEN
Outline
• Signed and Unsigned Numbers
• Addition and Subtraction
• Multiplication
• Division
• Floating Point
Signed and Unsigned Numbers
• Numbers can be represented in any base
• For an n-digit number \(a_{n-1}a_{n-2}\ldots a_1a_0\) in base d, the decimal value is
  \[ a_{n-1} \times d^{n-1} + a_{n-2} \times d^{n-2} + \cdots + a_1 \times d^{1} + a_0 \times d^{0} \]
  where \(a_0\) is called the Least Significant Bit (LSB) and \(a_{n-1}\) is called the Most Significant Bit (MSB)
[Ex] 1011two = ?ten
[Ans] (1 × 2^3) + (0 × 2^2) + (1 × 2^1) + (1 × 2^0) = (1 × 8) + (0 × 4) + (1 × 2) + (1 × 1) = 8 + 0 + 2 + 1 = 11ten
Signed and Unsigned Numbers
• Negative number representation
  • Sign-magnitude
  • 1’s complement
  • 2’s complement (adopted by all present-day computers)
• For an n-bit binary number
  • The Most Significant Bit (MSB) represents the sign: 0 = positive, 1 = negative
  • The remaining (n-1) bits represent the magnitude
• Shortcut: to find the 2’s complement representation of a negative number, write the corresponding positive number, complement every bit, and add 1. Mind the number of bits!
Signed and Unsigned Numbers
• Three representations of a 4-bit number
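The comparison table for this slide did not survive extraction. As a minimal sketch, the following Python snippet regenerates the value of every 4-bit pattern under the three representations; the function name `decode_4bit` is a hypothetical helper, not something from the slides.

```python
def decode_4bit(pattern):
    """Return the (sign-magnitude, 1's complement, 2's complement) value of a 4-bit pattern."""
    sign = pattern >> 3              # MSB: 0 = positive, 1 = negative
    low = pattern & 0b0111           # the remaining three bits
    sign_magnitude = -low if sign else low
    ones_complement = pattern - 15 if sign else pattern   # 15 = 2**4 - 1
    twos_complement = pattern - 16 if sign else pattern   # 16 = 2**4
    return sign_magnitude, ones_complement, twos_complement


print("bits  sign-mag  1's comp  2's comp")
for pattern in range(16):
    sm, oc, tc = decode_4bit(pattern)
    print(f"{pattern:04b}  {sm:8d}  {oc:8d}  {tc:8d}")
```

Sign-magnitude and 1’s complement each have two patterns for zero (0000 and 1000, or 0000 and 1111), while 2’s complement has a single zero and reaches -8.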
Signed and Unsigned Numbers
• 2’s complement binary number to decimal conversion (base d = 2)
  \[ a_{n-1} \times (-2^{n-1}) + a_{n-2} \times 2^{n-2} + \cdots + a_1 \times 2^{1} + a_0 \times 2^{0} \]
[Ex] What is the decimal value of the 32-bit binary pattern 1111 1111 1111 1111 1111 1111 1111 1100two?
[Ans] (1 × -2^31) + (1 × 2^30) + (1 × 2^29) + … + (1 × 2^2) + (0 × 2^1) + (0 × 2^0)
  = -2^31 + 2^30 + 2^29 + … + 2^2 + 0 + 0
  = -2147483648ten + 2147483644ten = -4ten
[Ex] Find the 2’s complement representation of -2ten
• Find 2ten = 0000 0000 0000 0000 0000 0000 0000 0010two
• Complement: 1111 1111 1111 1111 1111 1111 1111 1101two
• Add 1: 1111 1111 1111 1111 1111 1111 1111 1101two + 1two = 1111 1111 1111 1111 1111 1111 1111 1110two
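Both worked examples can be checked mechanically. The sketch below is illustrative only; `from_twos_complement` and `negate` are hypothetical helper names. It applies the weighted-sum formula with a negative MSB weight, and the complement-then-add-1 shortcut.

```python
def from_twos_complement(bits):
    """Decode a 2's complement bit string using the weighted-sum formula."""
    bits = bits.replace(" ", "")
    n = len(bits)
    value = -int(bits[0]) * (1 << (n - 1))          # MSB has weight -2^(n-1)
    for i, bit in enumerate(bits[1:], start=1):
        value += int(bit) * (1 << (n - 1 - i))      # other bits have positive weights
    return value


def negate(bits):
    """Negate an n-bit 2's complement bit string: complement every bit, then add 1."""
    bits = bits.replace(" ", "")
    n = len(bits)
    mask = (1 << n) - 1
    flipped = int(bits, 2) ^ mask                   # bitwise complement within n bits
    return format((flipped + 1) & mask, f"0{n}b")


print(from_twos_complement("1111 1111 1111 1111 1111 1111 1111 1100"))
print(negate("0000 0000 0000 0000 0000 0000 0000 0010"))
```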
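Signed and Unsigned Numbers
[Ex] Show that a 2’s-complement number can be converted to a representation with more bits by sign extension. That is, given an n-bit 2’s-complement number X, show that the m-bit 2’s-complement representation of X, where m > n, can be obtained by appending m-n copies of X’s sign bit to the left of the n-bit representation of X. [National Tsing Hua University EE, year 84]
[Ans] Let A = \(a_{m-1} \ldots a_1 a_0\) be the m-bit result. It consists of m-n extension bits (bit m-1 down to bit n) followed by the n bits of X (bit n-1 down to bit 0).
• If X is a positive n-bit number, sign-extend it to A (m bits) by filling bits n through m-1 with 0. Bits \(a_{m-1}\) through \(a_{n-1}\) are then all 0, so the value is unchanged.
• If X is a negative n-bit number, sign-extend it to A (m bits). The converted A must equal the original value of X, so
  \[ -2^{m-1} + \sum_{i=n-1}^{m-2} a_i 2^i + \sum_{i=0}^{n-2} a_i 2^i = -2^{n-1} + \sum_{i=0}^{n-2} a_i 2^i \]
  Since \(\sum_{i=n-1}^{m-2} 2^i = 2^{m-1} - 2^{n-1}\), clearly \(a_i\) (n-1 ≤ i ≤ m-2) must all be 1; that is, the sign bit \(a_{n-1} = 1\) is extended.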
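Signed and Unsigned Numbers
[Ex] Let A = \(a_{n-1}a_{n-2} \ldots a_1a_0\) be a two’s complement integer. Show that
  \[ A = -a_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} a_i 2^i \]
[Ans]
(1) If A is positive, then \(a_{n-1} = 0\) and A = \(a_{n-1}a_{n-2} \ldots a_1a_0\) (two) = \(0a_{n-2} \ldots a_1a_0\) (two) = \(\sum_{i=0}^{n-2} a_i 2^i\)
(2) If A is negative, then \(a_{n-1} = 1\) and A = \(a_{n-1}a_{n-2} \ldots a_1a_0\) (two) = \(1a_{n-2} \ldots a_1a_0\) (two) = \(-2^{n-1} + \sum_{i=0}^{n-2} a_i 2^i\)
The result follows from (1) and (2).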
Signed and Unsigned Numbers
• Sign extension
  • Converting n-bit numbers into numbers with more than n bits
  • Copy the most significant bit (the sign bit) into the other bits
      0010 -> 0000 0010
      1010 -> 1111 1010
  • MIPS “data transfer” instructions (load/store)
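A minimal Python sketch of sign extension on bit strings, reproducing the two examples on this slide. The helper names `sign_extend` and `value_of` are assumptions made for illustration.

```python
def sign_extend(bits, m):
    """Extend an n-bit 2's complement bit string to m bits by copying the sign bit."""
    n = len(bits)
    if m < n:
        raise ValueError("target width must be at least the current width")
    return bits[0] * (m - n) + bits   # prepend m-n copies of the sign bit


def value_of(bits):
    """Signed value of a 2's complement bit string (used only to check the result)."""
    raw = int(bits, 2)
    return raw - (1 << len(bits)) if bits[0] == "1" else raw


for narrow in ("0010", "1010"):
    wide = sign_extend(narrow, 8)
    print(narrow, "->", wide, "value", value_of(narrow), "->", value_of(wide))
```

The value is the same before and after extension, which is the property the proof exercise establishes.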
Signed and Unsigned Numbers
• Signed vs. unsigned numbers
    $s0 = 1111 1111 1111 1111 1111 1111 1111 1111two
    $s1 = 0000 0000 0000 0000 0000 0000 0000 0001two
    slt  $t0, $s0, $s1    # signed number comparison
    sltu $t1, $s0, $s1    # unsigned number comparison
  After executing the above instructions, $t0 = 1 and $t1 = 0. Why?
• Range detection
  [Ex] If ($a1 >= $t2) or ($a1 < 0) then jump to IndexOutOfBounds
    sltu $t0, $a1, $t2                  # temp reg $t0 = 0 if k >= length or k < 0
    beq  $t0, $zero, IndexOutOfBounds   # if out of bounds, then jump
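The difference between the two comparisons can be modelled in a few lines of Python. This is a sketch of the comparison semantics only, not a MIPS simulator, and the function names are hypothetical. A negative index read as unsigned becomes a very large number, so a single `sltu` covers both halves of the bounds check.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a 2's complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs, rt):
    """Signed set-on-less-than: 1 if rs < rt as signed numbers, else 0."""
    return int(to_signed(rs) < to_signed(rt))


def sltu(rs, rt):
    """Unsigned set-on-less-than: 1 if rs < rt as unsigned numbers, else 0."""
    return int((rs & MASK32) < (rt & MASK32))


def index_out_of_bounds(k, length):
    """True if k >= length or k < 0, using one unsigned comparison."""
    return sltu(k, length) == 0


s0, s1 = 0xFFFFFFFF, 0x00000001
print(slt(s0, s1), sltu(s0, s1))   # -1 < 1 signed, but 4294967295 > 1 unsigned
```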
Signed and Unsigned Numbers
[Ex1] Write all 4-bit numbers using the above representations: (1) ordered by decimal value, (2) ordered by bit pattern from 0000 to 1111, (3) compare their ranges, zero representations, and max/min numbers.
[Ex2] Show -18 using the above representations with (1) 8 bits (2) 16 bits.
[Ans]
  Representation    8 bits      16 bits
  sign-magnitude    10010010    1000 0000 0001 0010
  1’s complement    11101101    1111 1111 1110 1101
  2’s complement    11101110    1111 1111 1110 1110
[Ex3] Why is the 2’s complement representation better than the others in computer architecture?
[Ans] 1. One representation of zero 2. Arithmetic works easily 3. Negating is fairly easy
[Ex4] What is overflow? How do we judge whether an overflow has happened?
[Ans] Cin ⊕ Cout = 1 at the MSB
[Ex5] Express +69 and -69 with 8 bits using sign-magnitude, 1’s complement, 2’s complement and excess code representations.
[Ans]
  Representation     +69         -69
  sign-magnitude     01000101    11000101
  1’s complement     01000101    10111010
  2’s complement     01000101    10111011
  excess-128 code    11000101    00111011
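The answers to [Ex2] and [Ex5] can be reproduced with a small encoder. This is a sketch under two assumptions: the excess code uses a bias of 2^(n-1) (128 for 8 bits, as in the answer table), and the caller passes a value that fits the chosen width, since no range checking is done. The function name and scheme labels are hypothetical.

```python
def encode(value, n, scheme):
    """Return the n-bit pattern of value under the named representation."""
    mask = (1 << n) - 1
    if scheme == "sign-magnitude":
        pattern = abs(value) | ((1 << (n - 1)) if value < 0 else 0)
    elif scheme == "ones":
        pattern = value if value >= 0 else ~(-value) & mask   # complement the magnitude
    elif scheme == "twos":
        pattern = value & mask                                # Python ints wrap like 2's complement
    elif scheme == "excess":
        pattern = value + (1 << (n - 1))                      # bias assumed to be 2^(n-1)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return format(pattern, f"0{n}b")


for scheme in ("sign-magnitude", "ones", "twos", "excess"):
    print(f"{scheme:15s} {encode(69, 8, scheme)} {encode(-69, 8, scheme)}")
```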
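Overflow
• Meaning: an arithmetic operation on two numbers produces a result outside the range that can be represented, e.g., adding two n-bit numbers does not yield an n-bit number:
      0111
    + 0001
    ------
      1000
  Here the carry into the MSB (Cin) is 1 and the carry out of the MSB (Cout) is 0.
• Situations where it can occur
  • positive + positive
  • positive - negative
  • negative + negative
  • negative - positive
• Detecting overflow
  • Cin ⊕ Cout = 0 at the MSB: no overflow; discard the carry-out and the result is the correct answer
  • Cin ⊕ Cout = 1 at the MSB: overflow; the result is wrong
• Remedy for overflow
  • Increase the number of bits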
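The Cin ⊕ Cout rule can be sketched in Python as follows. The function `add_nbit` is a hypothetical helper; it computes the carry into and out of the MSB separately and reports overflow when they differ.

```python
def add_nbit(a, b, n):
    """Add two n-bit patterns; return (sum, carry into MSB, carry out of MSB, overflow)."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    low_mask = mask >> 1                                   # all bits below the MSB
    cin = ((a & low_mask) + (b & low_mask)) >> (n - 1)     # carry into the MSB
    total = a + b
    cout = total >> n                                      # carry out of the MSB
    return total & mask, cin, cout, bool(cin ^ cout)


result, cin, cout, overflow = add_nbit(0b0111, 0b0001, 4)
print(f"{result:04b} Cin={cin} Cout={cout} overflow={overflow}")
```

For 0111 + 0001 the two carries differ, so the 4-bit result 1000 (which reads as -8) is flagged as wrong. For 1111 + 0001 (that is, -1 + 1) both carries are 1 and the result 0000 is correct once the carry-out is discarded.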
Effects of Overflow
• An exception (interrupt) occurs
  • Control jumps to a predefined address for the exception
  • The interrupted address is saved for possible resumption
• We don't always want to detect overflow
  • addu, addiu, subu do NOT cause exceptions on overflow
    note: addiu still sign-extends!
    note: sltu, sltiu are for unsigned comparisons
  • add, addi, sub cause exceptions on overflow
• The MIPS C compiler always generates the unsigned arithmetic instructions, so overflow is ignored; the MIPS FORTRAN compiler generates the appropriate arithmetic instructions according to the operand types
• Study the programs on pp. 173-174 of the textbook on your own
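Review: ALU Design
• Building an arithmetic logic unit
  • Start by building a 1-bit arithmetic logic unit (ALU); because MIPS words are 32 bits long, the ALU must also be 32 bits wide
  • The ALU is built from 4 basic hardware components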
Review: ALU Design
• 1-bit adder (full adder): inputs a, b, CarryIn; outputs CarryOut, Sum

  Inputs             Outputs
  a  b  CarryIn      CarryOut  Sum
  0  0  0            0         0
  0  0  1            0         1
  0  1  0            0         1
  0  1  1            1         0
  1  0  0            0         1
  1  0  1            1         0
  1  1  0            1         0
  1  1  1            1         1
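The truth table corresponds to two short Boolean expressions. The sketch below (with a hypothetical function name) implements them with bitwise operators and prints the same eight rows.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: return (CarryOut, Sum) for single-bit inputs."""
    total = a ^ b ^ carry_in                          # Sum is the XOR of all three inputs
    carry_out = (a & b) | (carry_in & (a ^ b))        # carry when at least two inputs are 1
    return carry_out, total


for a in (0, 1):
    for b in (0, 1):
        for carry_in in (0, 1):
            carry_out, total = full_adder(a, b, carry_in)
            print(a, b, carry_in, "->", carry_out, total)
```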
Review: ALU Design
• A 1-bit ALU that performs AND, OR, and addition (Operation = 2 selects the result of the adder)
• A 32-bit ALU constructed from 32 1-bit ALUs
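Review: ALU Design
• Subtraction is the same as adding the inverted operand; the least significant bit (LSB) still has a carry-in input (CarryIn), which is set to 1
  • a + (¬b) + 1 = a + (-b) = a - b
• The 1-bit ALU performs AND, OR, and addition on a and b or on a and ¬b, and therefore subtraction as well
• Settings for subtraction
  • ALU 0: Operation = 2, Binvert = 1, and CarryIn = 1
  • ALU 1-31: Operation = 2 and Binvert = 1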
Review: ALU Design
• Supporting the set on less than (slt) instruction
  • Example: slt $t0, $s1, $s2
    a_i: bit i of register $s1
    b_i: bit i of register $s2
  • Result: 0…001 if $s1 < $s2, otherwise 0…000
Review: ALU Design
• 32-bit ALU
  • Observation: if $s1 - $s2 < 0, then $s1 < $s2
  • Result bit 0 = the sign bit of the adder
• Control signals / inputs
  • Operation = 3
  • Less 1-31 = 0
  • Binvert = 1
  • CarryIn 0 = 1
  • Less 0 = Set
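Review: ALU Design
• 32-bit ALU (continued)
  • Set Binvert = 1 and CarryIn 0 = 1 for subtraction
  • Set Binvert = 0 and CarryIn 0 = 0 for addition and logical operations
  • Combine Binvert and CarryIn 0 into a single control line called Bnegate (see the figure on the next slide)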
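Review: ALU Design
• Supporting conditional branch instructions
  • Branch if two registers are equal, or if they are not equal
• Observation
  • if a = b then a - b = 0
  • if a - b = 0 then Result = 0…00
  • Zero = 0 when not equal
  • Zero = 1 when equal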
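Review: ALU Design
• Values of the ALU control lines and the corresponding functions

  Bnegate  Operation  Function
  0        00         and
  0        01         or
  0        10         add
  1        10         subtract
  1        11         set on less than

• The symbol commonly used to represent the ALU directly (figure)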
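The control table can be read as a tiny behavioural model. The sketch below is an assumption-laden simplification rather than the gate-level design from the slides. The function name `alu` is hypothetical, Bnegate drives both Binvert and CarryIn 0, and set on less than simply forwards the adder's sign bit, as on the slides, so it ignores overflow.

```python
MASK32 = 0xFFFFFFFF


def alu(a, b, bnegate, operation):
    """Behavioural model of the 32-bit ALU; returns (Result, Zero)."""
    a &= MASK32
    b &= MASK32
    if bnegate:
        b = ~b & MASK32                       # Binvert = 1: use the inverted operand
    total = (a + b + bnegate) & MASK32        # CarryIn 0 = Bnegate completes the negation
    if operation == 0b00:
        result = a & b
    elif operation == 0b01:
        result = a | b
    elif operation == 0b10:
        result = total                        # add, or subtract when Bnegate = 1
    elif operation == 0b11:
        result = total >> 31                  # Set: sign bit of the adder goes to Less 0
    else:
        raise ValueError("Operation must be a 2-bit value")
    zero = int(result == 0)                   # Zero = 1 when every result bit is 0
    return result, zero


print(alu(7, 5, 1, 0b10))   # subtract
print(alu(3, 9, 1, 0b11))   # set on less than
```

A branch-on-equal would issue a subtract and test the Zero output, for example `alu(5, 5, 1, 0b10)`.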
Review: ALU Design
• We can build an ALU to support the MIPS instruction set
  • Key idea: use a multiplexer (MUX) to select the output we want
  • Efficiently perform subtraction using two’s complement
  • Replicate a 1-bit ALU to produce a 32-bit ALU
• Important points about hardware
  • All of the gates are always working
  • The speed of a gate is affected by the number of inputs to the gate
  • The speed of a circuit is affected by the number of gates in series (on the “critical path” or the “deepest level of logic”)
Parallel Adder vs. Carry Look Ahead Adder
• Half adder (HA)
  • Adds 2 bits, say x and y
  • Sum = x ⊕ y
  • Carry = xy
• Full adder (FA)
  • Adds 3 bits, say x, y and c
  • Sum = x ⊕ y ⊕ c
  • Carry = (x ⊕ y)c + xy
  • Propagation delay time: tp(net) = tXOR + max(tXOR, 2tNAND)
• An n-bit parallel adder can be implemented with n FAs (figure: FA0 … FAn-1 chained through carries c0 … cn, with inputs x0 … xn-1 and y0 … yn-1 and sums s0 … sn-1)
  • Propagation delay time: tp(net) = (n-1)tc + max(ts, tc), where tp, ts, tc are the propagation delay times of the total path, the sum, and the carry, respectively
• Each FA is a two-level logic circuit; if the propagation delay time of one two-level logic circuit is d, then that of an n-bit parallel adder is nd.
Parallel Adder vs. Carry Look Ahead Adder
• Ripple carry adder (parallel adder)
  • 8-bit binary adder-subtractor
  • Each full adder has a small propagation delay; these delays add up as the carry bits are propagated.
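Parallel Adder vs. Carry Look Ahead Adder
• Carry lookahead adder (4 bits)
  • One solution to the delay problem
  • The first part is the generate term g: g = XY (carry generator)
  • The second part is the propagate term p: p = X ⊕ Y (carry propagation)
  • In general, this can be expressed by the equations
    \[ S_i = p_i \oplus C_i, \qquad C_{i+1} = g_i + p_i C_i \]
  • For the 4-bit adder, expanding the recurrence gives
    \[ C_1 = g_0 + p_0 C_0 \]
    \[ C_2 = g_1 + p_1 g_0 + p_1 p_0 C_0 \]
    \[ C_3 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 C_0 \]
    \[ C_4 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 C_0 \]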
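As an illustrative sketch, the following Python function (hypothetical name `cla4`) computes every carry of a 4-bit adder directly from the generate and propagate terms, taking p as X ⊕ Y. No carry waits for the previous stage, which is the point of the lookahead scheme.

```python
def cla4(x, y, c0=0):
    """4-bit carry lookahead addition; returns (4-bit sum, carry out C4)."""
    xs = [(x >> i) & 1 for i in range(4)]
    ys = [(y >> i) & 1 for i in range(4)]
    g = [a & b for a, b in zip(xs, ys)]       # generate: g_i = x_i y_i
    p = [a ^ b for a, b in zip(xs, ys)]       # propagate: p_i = x_i XOR y_i

    # Every carry depends only on g, p and C0 (two-level logic).
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    total = sum((p[i] ^ carries[i]) << i for i in range(4))   # S_i = p_i XOR C_i
    return total, c4


print(cla4(0b0111, 0b0001))
print(cla4(0b1111, 0b0001))
```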
Parallel Adder vs. Carry Look Ahead Adder
• Carry lookahead adder
  • Block diagram (74182): inputs X0-X3, Y0-Y3 and C0; internal signals p0-p3, g0-g3 and C1-C3; outputs S0-S3 and C4
Parallel Adder vs. Carry Look Ahead Adder
[Ex1] If the propagation delay time of S is 30 ns and that of C is 20 ns in an FA, what is the total propagation delay of a 4-bit ripple carry adder (7483)?
  KEY: calculate the longest path: 30 + 20 × (4-1) = 90 ns
[Ex2] Repeat [Ex1] for a 32-bit ripple adder. The propagation delay time is 30 + 20 × (32-1) = 650 ns
[Ex3] If the propagation delay time of a lookahead carry generator is 20 ns and that of pi and gi is 20 ns, what is the total propagation delay of a carry lookahead adder?
  40 ns (see Fig. on page 4-7)
[Ex4] Repeat [Ex3] for a 32-bit carry lookahead adder. The propagation delay time is 40 ns
• The propagation delay time of a lookahead adder is independent of the number of bits n.
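The ripple-carry answers follow from the formula tp = (n-1)tc + max(ts, tc). A minimal sketch, with a hypothetical function name and delays in nanoseconds:

```python
def ripple_carry_delay(n, t_sum, t_carry):
    """Worst-case delay of an n-bit ripple carry adder.

    The carry ripples through the first n-1 full adders; the last stage
    then contributes whichever of its sum or carry output is slower.
    """
    return (n - 1) * t_carry + max(t_sum, t_carry)


print(ripple_carry_delay(4, 30, 20))    # [Ex1]
print(ripple_carry_delay(32, 30, 20))   # [Ex2]
```

The ripple delay grows linearly with n, in contrast to the fixed 40 ns quoted for the lookahead adder.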
Multiplication
• Unsigned integer multiplication
• Example: (1000)ten × (1011)ten

        1000      multiplicand
      × 1011      multiplier
      ------
        1000
       1000
      0000
     1000
     -------
     1011000

• Example: (0010)two × (0011)two

        0010      multiplicand
      × 0011      multiplier
      ------
        0010
       0010
      0000
     0000
     -------
     0000110
Multiplication
• First version
  • The Product register is initialized to 0
  • It takes almost 100 clock cycles if each step takes one clock cycle
• Algorithm (flowchart)
  Start
  1. Test Multiplier0
     • Multiplier0 = 1: 1a. Add the multiplicand to the product and place the result in the Product register
     • Multiplier0 = 0: no addition
  2. Shift the Multiplicand register left 1 bit
  3. Shift the Multiplier register right 1 bit
  32nd repetition?
     • No (< 32 repetitions): go back to step 1
     • Yes (32 repetitions): Done
Multiplication
• Example for the first-version multiplier
  • Using 4-bit numbers, multiply 2ten × 3ten = 0010two × 0011two
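The step-by-step table for this example is missing from the transcript. The sketch below follows the first-version flowchart for unsigned operands and prints the registers after each repetition. The function name `multiply_v1` and the `trace` flag are assumptions made for illustration.

```python
def multiply_v1(multiplicand, multiplier, n=32, trace=False):
    """First-version shift-and-add multiplier for unsigned n-bit operands."""
    mask = (1 << (2 * n)) - 1
    product = 0                      # Product register (2n bits), initialized to 0
    mcand = multiplicand             # Multiplicand register (2n bits), shifts left
    mplier = multiplier              # Multiplier register (n bits), shifts right
    for step in range(1, n + 1):
        if mplier & 1:                               # 1. test Multiplier0
            product = (product + mcand) & mask       # 1a. add multiplicand to product
        mcand = (mcand << 1) & mask                  # 2. shift Multiplicand left
        mplier >>= 1                                 # 3. shift Multiplier right
        if trace:
            print(f"{step}: multiplier={mplier:0{n}b} "
                  f"multiplicand={mcand:0{2 * n}b} product={product:0{2 * n}b}")
    return product


print(multiply_v1(0b0010, 0b0011, n=4, trace=True))
```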
Multiplication
• Sequential (second) version multiplier
  • The Product register is initialized to 0
  • 32-bit ALU
• Algorithm (flowchart)
  Start
  1. Test Multiplier0
     • Multiplier0 = 1: 1a. Add the multiplicand to the left half of the product and place the result in the left half of the Product register
     • Multiplier0 = 0: no addition
  2. Shift the Product register right 1 bit
  3. Shift the Multiplier register right 1 bit
  32nd repetition?
     • No (< 32 repetitions): go back to step 1
     • Yes (32 repetitions): Done
Multiplication
• Example for the sequential-version multiplier
  • Using 4-bit numbers, multiply 2ten × 3ten = 0010two × 0011two
Multiplication
• Final (third) version multiplier
  • 32-bit ALU
  • The right half of the Product register is initialized to the value of the multiplier
  • 60 clock cycles
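A rough Python sketch of the final version for unsigned operands: the multiplier sits in the right half of the Product register, so one right shift per step both aligns the partial product and exposes the next multiplier bit. The name `multiply_v3` is hypothetical. Python integers are unbounded, so the carry out of the left-half addition is kept automatically, whereas real hardware needs an extra bit for it.

```python
def multiply_v3(multiplicand, multiplier, n=32):
    """Final-version multiplier: the multiplier shares the Product register."""
    product = multiplier & ((1 << n) - 1)       # right half = multiplier, left half = 0
    for _ in range(n):
        if product & 1:                         # test Product0 (the current multiplier bit)
            product += multiplicand << n        # add multiplicand to the left half
        product >>= 1                           # shift the Product register right 1 bit
    return product


print(multiply_v3(0b0010, 0b0011, n=4))
```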
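Multiplication
• Signed multiplication
  • Convert the multiplier and multiplicand to positive numbers and remember the original signs
  • Shifting steps need to extend the sign of the product
  • Negate the product if the original signs disagree
  [NOTE] Signed multiplication: use the third-version multiplier to operate on the first 31 bits, then compare whether the signs (the 32nd bit) of the multiplier and the multiplicand are the same
• A more elegant method: Booth’s algorithm
  • Speeds up processing for multipliers that contain runs of consecutive 1s
  • On meeting the first 1 of a run, the addition becomes a subtraction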
Booth’s Algorithm
• Classifying groups of bits into the beginning, the middle, or the end of a run of 1s (figure: the bit pattern 0 1 1 1 0 with its end of run, middle of run, and beginning of run marked)
• Take a look at 2-bit groups
Booth’s Algorithm
• Depending on the current and previous bits, do one of the following:
  • 00: no arithmetic operation
  • 01: end of a string of 1s, so add the multiplicand to the left half of the product
  • 10: beginning of a string of 1s, so subtract the multiplicand from the left half of the product
  • 11: no arithmetic operation
• Then shift the Product register right 1 bit.
Booth’s Algorithm
• Example
  • 2 × (-3) = -6, or 0010 × 1101 = 1111 1010
  • (trace table columns: current bit, previous bit)
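The trace table for this example is not in the transcript, so here is a hedged Python sketch of Booth's algorithm that reproduces the result. The name `booth_multiply` is hypothetical. One deliberate deviation from the (2n+1)-bit register on the slides: the left half carries one extra guard bit (n+1 bits), an assumption added so that the most negative multiplicand, whose negation does not fit in n bits, is still handled correctly.

```python
def booth_multiply(multiplicand, multiplier, n=4):
    """Booth's algorithm for signed n-bit operands; returns the signed 2n-bit product."""
    left_bits = n + 1                           # left half with one guard bit (assumption)
    width = left_bits + n + 1                   # left half | multiplier | previous bit
    mask = (1 << width) - 1
    shift = n + 1                               # bit position where the left half starts
    add_m = (multiplicand & ((1 << left_bits) - 1)) << shift
    sub_m = (-multiplicand & ((1 << left_bits) - 1)) << shift

    reg = (multiplier & ((1 << n) - 1)) << 1    # multiplier in right half, previous bit = 0
    for _ in range(n):
        pair = reg & 0b11                       # (current bit, previous bit)
        if pair == 0b01:                        # end of a run of 1s: add
            reg = (reg + add_m) & mask
        elif pair == 0b10:                      # beginning of a run of 1s: subtract
            reg = (reg + sub_m) & mask
        sign = reg >> (width - 1)
        reg = (reg >> 1) | (sign << (width - 1))   # arithmetic shift right

    product = (reg >> 1) & ((1 << (2 * n)) - 1)    # drop the previous bit and guard bit
    if product >> (2 * n - 1):                     # reinterpret as signed 2n-bit value
        product -= 1 << (2 * n)
    return product


result = booth_multiply(2, -3)
print(result, format(result & 0xFF, "08b"))
```

The subtraction is performed as an addition of the 2’s complement of the multiplicand, and the right shift is arithmetic so the sign of the partial product is preserved.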
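Booth’s Algorithm
• Multiplying by 2^i is a shift
  • Shift left by one bit --> multiply by 2
  • Shift left by n bits --> multiply by 2^n
• Proof of Booth’s algorithm
  • Why does it work for 2’s complement signed numbers?
  • In other words, depending on the value of \((a_{i-1} - a_i)\):
    • = 0: do nothing
    • = 1: add b
    • = -1: subtract b
  • Booth’s algorithm can be written as (with \(a_{-1} = 0\)):
    \[ (a_{-1} - a_0) \times b \times 2^{0} + (a_0 - a_1) \times b \times 2^{1} + \cdots + (a_{30} - a_{31}) \times b \times 2^{31} \]
    \[ = b \times \left( -a_{31} 2^{31} + a_{30} 2^{30} + \cdots + a_1 2^{1} + a_0 2^{0} \right) = b \times a \]
    since each \(a_i\) (0 ≤ i ≤ 30) appears with coefficient \(2^{i+1} - 2^{i} = 2^{i}\).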
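Integer Multiplication
• Premises
  • All values are represented in 2’s complement
  • multiplicand × multiplier = product
  • (n bits) × (n bits) = (2n bits): sign-extend if there are not enough bits
  • The multiplicand needs an n-bit register and the product needs a (2n+1)-bit register, with the multiplier placed in the right half of the product
  • What is the extra bit for? (It is an auxiliary bit.)
• Booth’s algorithm
  • Depending on the current bit and the previous bit of the multiplier (scanning from the least significant end), do one of the following:
  • 00: no arithmetic operation; shift the product right 1 bit
  • 01: end of a run of 1s; add the multiplicand to the left half of the product, then shift the product right 1 bit
  • 10: beginning of a run of 1s; subtract the multiplicand from the left half of the product, then shift the product right 1 bit
  • 11: no arithmetic operation; shift the product right 1 bit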
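Integer Multiplication
• Booth’s algorithm example (4 bits)
  • 2 × (-3) = -6, or 0010 × 1101 = 1111 1010
  • (trace table columns: current bit, previous bit)
• Tip 1: A - B = A + (-B) = A + (B’ + 1)
• Tip 2: use an arithmetic right shift, so the sign bit stays unchanged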
Faster Multiplication
• Use 32 adders instead of using a single 32-bit adder one step at a time.
Division
• Some definitions:
  • Dividend, Divisor, Quotient, Remainder
  • Dividend = Quotient × Divisor + Remainder
• Example: 1001010 divided by 1000

                    1001       Quotient
    Divisor 1000 ) 1001010     Dividend
                  -1000
                   ----
                      10
                      101
                      1010
                     -1000
                      ----
                        10     Remainder (or modulo result)
Division
• First version (hardware: a 64-bit Divisor register that shifts right, a 64-bit ALU, a 32-bit Quotient register that shifts left, a 64-bit Remainder register with a write control, and a control unit that tests the remainder)
• Algorithm (flowchart)
  Start
  1. Subtract the Divisor register from the Remainder register and place the result in the Remainder register
  Test Remainder
     • Remainder ≥ 0: 2a. Shift the Quotient register to the left, setting the new rightmost bit to 1
     • Remainder < 0: 2b. Restore the original value by adding the Divisor register to the Remainder register and place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0
  3. Shift the Divisor register right 1 bit
  33rd repetition?
     • No (< 33 repetitions): go back to step 1
     • Yes (33 repetitions): Done
Division
• Example
  • Using a 4-bit version to perform 7 ÷ 2 = 3 … 1
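The first-version algorithm can be traced for this example with a short Python sketch for unsigned operands. The name `divide_v1` and the `trace` flag are hypothetical. Python integers can go negative, so the "Remainder < 0" test is written directly instead of inspecting a sign bit.

```python
def divide_v1(dividend, divisor, n=32, trace=False):
    """First-version restoring divider for unsigned n-bit operands."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    mask = (1 << n) - 1
    remainder = dividend             # Remainder register (2n bits) starts as the dividend
    div = divisor << n               # Divisor starts in the left half of its 2n-bit register
    quotient = 0
    for step in range(1, n + 2):     # n + 1 repetitions (33 for 32-bit operands)
        remainder -= div                             # 1. subtract Divisor from Remainder
        if remainder >= 0:
            quotient = ((quotient << 1) | 1) & mask  # 2a. shift in a 1
        else:
            remainder += div                         # 2b. restore the original value
            quotient = (quotient << 1) & mask        #     and shift in a 0
        div >>= 1                                    # 3. shift Divisor right
        if trace:
            print(f"{step}: quotient={quotient:0{n}b} "
                  f"divisor={div:0{2 * n}b} remainder={remainder:0{2 * n}b}")
    return quotient, remainder


print(divide_v1(7, 2, n=4, trace=True))
```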
Division
• Observations on the first version of the division hardware
  • Half of the bits in the divisor are always 0
    • Half of the 64-bit adder is wasted
    • Half of the divisor register is wasted
  • Instead of shifting the divisor to the right, shift the remainder to the left?
  • The 1st step cannot produce a 1 in the quotient bit (otherwise the quotient would be too big)
    • Switch the order to shift first and then subtract, which saves 1 iteration
Division
• Second version of the division hardware (figure: a 32-bit Divisor register, a 32-bit ALU, a 32-bit Quotient register that shifts left, a 64-bit Remainder register that shifts left with a write control, and a control unit that tests the remainder)
Division
• Final version of the division hardware (figure: a 32-bit Divisor register, a 32-bit ALU, a 64-bit Remainder register that shifts left and right with a write control, and a control unit that tests the remainder)
• Execute 7 ÷ 2 = 3 … 1 using the algorithm (DIY)
• Algorithm (flowchart)
  Start
  1. Shift the Remainder register left 1 bit
  2. Subtract the Divisor register from the left half of the Remainder register and place the result in the left half of the Remainder register
  Test Remainder
     • Remainder ≥ 0: 3a. Shift the Remainder register to the left, setting the new rightmost bit to 1
     • Remainder < 0: 3b. Restore the original value by adding the Divisor register to the left half of the Remainder register and place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new rightmost bit to 0
  32nd repetition?
     • No (< 32 repetitions): repeat from step 2
     • Yes (32 repetitions): Done. Shift the left half of the Remainder register right 1 bit
Division
• Signed division
  • Dividend = Quotient × Divisor + Remainder
  • Consider the signs of the dividend and the divisor for 7 ÷ 2:
      +7 ÷ +2 = +3 … +1       7 = (+3) × 2 + 1
      -7 ÷ +2 = (-3) … (-1)   -7 = (-3) × 2 + (-1)
      +7 ÷ -2 = (-3) … +1      7 = (-3) × (-2) + 1
      -7 ÷ -2 = +3 … (-1)     -7 = (+3) × (-2) + (-1)
• Correctly signed division algorithm
  • If the signs of the operands are opposite, negate the Quotient
  • Make the sign of the nonzero Remainder match the Dividend
• In summary: when the operands of a division have opposite signs the quotient is negative, and a nonzero remainder takes the sign of the dividend
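The sign rule can be sketched by dividing the magnitudes and fixing the signs afterwards. The function name `signed_divide` is hypothetical. Python's own `//` and `%` operators round toward negative infinity, so they give different answers for mixed signs; that is why the sketch works on absolute values.

```python
def signed_divide(dividend, divisor):
    """Signed division: divide magnitudes, then apply the sign rules from the slide."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = abs(dividend) // abs(divisor)
    remainder = abs(dividend) % abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient          # operands with opposite signs: negate the quotient
    if dividend < 0:
        remainder = -remainder        # a nonzero remainder takes the sign of the dividend
    return quotient, remainder


for dividend, divisor in ((7, 2), (-7, 2), (7, -2), (-7, -2)):
    q, r = signed_divide(dividend, divisor)
    print(f"{dividend} / {divisor} = {q} remainder {r}")
```

Each result satisfies Dividend = Quotient × Divisor + Remainder, matching the four cases listed on the slide.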