930 likes | 1.08k Views
Lecture 7. Montgomery Multipliers & Exponentiation Units. Motivation:. Public-key ciphers. Secret-key (Symmetric) Cryptosystems. key of Alice and Bob - K AB. key of Alice and Bob - K AB. Network. Decryption. Encryption. Bob. Alice. Key Distribution Problem. Users. Keys. N · (N-1).
E N D
Lecture 7 Montgomery Multipliers & Exponentiation Units
Motivation: Public-key ciphers
Secret-key (Symmetric) Cryptosystems key of Alice and Bob - KAB key of Alice and Bob - KAB Network Decryption Encryption Bob Alice
Key Distribution Problem Users Keys N · (N-1) N - Users Keys 5,000 100 2 1000 500,000
Digital Signature Problem Both corresponding sides have the same information and are able to generate a signature • There is a possibility of the • receiver falsifying the message • sender denying that he/she sent the message
Public Key (Asymmetric) Cryptosystems Private key of Bob - kB Public key of Bob - KB Network Decryption Encryption Bob Alice
Non-repudiation Alice Bob Signature Message Signature Message Hash function Hash function Hash value 1 Hash value yes no Hash value 2 Public key cipher Public key cipher Alice’s public key Alice’s private key
RSA as a trap-door one-way function PUBLIC KEY message ciphertext C = f(M) = Me mod N M C M = f-1(C) = Cd mod N PRIVATE KEY N = P Q P, Q - large prime numbers e d 1 mod ((P-1)(Q-1))
RSA keys PUBLIC KEY PRIVATE KEY { e, N } { d, P, Q } P, Q: P, Q - large prime numbers N: N = P Q e: gcd(e, P-1) = 1 and gcd(e, Q-1) = 1 d: e d 1 mod ((P-1)(Q-1))
Mini-RSA keys PUBLIC KEY PRIVATE KEY { e, N } { d, P, Q } P, Q: P = 5 Q = 11 N = P Q = 55 N: e: gcd(e, 5-1) = 1 and gcd(e, 11-1) = 1 e=3 d: 3 d 1 mod 40 d=27
Mini-RSA as a trap-door one-way function PUBLIC KEY message ciphertext C = f(2) = 23 mod 55 = 8 M=2 C=8 M = f-1(C) = 827 mod 55 = 2 PRIVATE KEY N = 5 11 5, 11 - prime numbers 3 27 1 mod ((5-1)(11-1))
Basic Operations of RSA L < k Encryption public key exponent e C M N mod = public key modulus plaintext ciphertext k-bits k-bits k-bits Decryption L=k private key exponent d M mod = C N ciphertext private key modulus plaintext k-bits k-bits k-bits
Quotient and remainder Given integers a and n, n>0 ! q, r Z such that a = q n + r and 0 r < n a q = q – quotient r – remainder (of a divided by n) = a div n n a r = a - q n = a – n = n = a mod n
mod 5 = • -32 mod 5 =
Integers coungruent modulo n Two integers a and b are congruent modulo n (equivalent modulo n) written a b iff a mod n = b mod n or a = b + kn, k Z or n | a - b
Rules of addition, subtraction and multiplication modulo n a + b mod n = ((a mod n) + (b mod n)) mod n a - b mod n = ((a mod n) - (b mod n)) mod n ab mod n = ((a mod n) (b mod n)) mod n
9 · 13 mod 5 = 25 · 25 mod 26 =
Laws of modular arithmetic Modular addition Regular addition a+ba+c (mod n) iff bc (mod n) a+b = a+c iff b=c Regular multiplication Modular multiplication If a b ac (mod n) and gcd(a, n) = 1 then bc (mod n) If a b = ac and a0 then b=c
Modular Multiplication: Example 18 42 (mod 8) 6 3 6 7 (mod 8) 3 7 (mod 8) 0 1 2 3 4 5 6 7 x 6 x mod 8 0 6 4 2 0 6 4 2 0 1 2 3 4 5 6 7 x 5 x mod 8 0 5 2 7 4 1 6 3
Basic Modular Exponentiation
How to perform exponentiation efficiently? Y = XE mod N = X X X X X … X X mod N E-times E may be in the range of 21024 10308 Problems: 1. huge storage necessary to store XE before reduction 2. amount of computations infeasible to perform Solutions: 1. modulo reduction after each multiplication 2. clever algorithms 200 BC, India, “Chandah-Sûtra”
Exponentiation: Y = XE mod N Right-to-left binary exponentiation Left-to-right binary exponentiation E = (eL-1, eL-2, …, e1, e0)2 Y = 1; S = X; for i=0 to L-1 { if (ei == 1) Y = Y S mod N; S = S2 mod N; } Y = 1; for i=L-1 downto 0 { Y = Y2 mod N; if (ei == 1) Y = Y X mod N; }
Right-to-Left Binary Exponentiation in Hardware X 1 enable S Y E SQR MUL output
Left-to-Right Binary Exponentiation in Hardware 1 Y X Control Logic E MUL output
Algorithms for Modular Multiplication Multiplication Multiplication combined with modular reduction (k2) • Classical • Karatsuba • Schönhage-Strassen (FFT) (klg 3) 2 • Montgomery algorithm (k ln(k)) (k2) Modular Reduction (k2) • Classical • Barrett • Selby-Mitchell complexity same as multiplication used (k2)
Montgomery Modular Multiplication (1) X, Y, M – (n-1)-bit numbers Z = X Y mod M Integer domain Montgomery domain X X’ = X 2n mod M Y Y’ = Y 2n mod M Z’ = MP(X’, Y’, M) = = X’ Y’ 2-n mod M = = (X 2n) (Y 2n) 2-n mod M = = X Y 2nmod M Z = X Y mod M Z’ = Z 2nmod M
Montgomery Modular Multiplication (2) X X’ X’ = MP(X, 22n mod M, M) = = X 22n 2-n mod M = X 2n mod M Z Z’ Z = MP(Z’, 1, M) = = (Z 2n) 1 2-n mod M = Z mod M = Z
Basic version of the Radix-2Montgomery Multiplication Algorithm
Montgomery Product S[0] = 0 S[i+1] = Z = S[n] for i=0 to n-1 S[i]+xiY 2 if qi = S[i] + xiY mod 2= 0 S[i]+xiY + M 2 if qi = S[i] + xiY mod 2= 1 M assumed to be odd
Basic version of the Radix-2Montgomery Multiplication Algorithm
Project 2 Rules • Groups consisting of 2 students (preferred) • or a single student (if needed) • Each group works on different architectures • Each group of two works on two similar architectures. • Members of the group can freely exchange VHDL code • and ideas with each other. • Students working individually work on a single architecture. • They must not exchange code with other students. • Members of the group of two are graded jointly, • unless they agree to split no later than two weeks • before the Project deadline.
Investigated Montgomery Multipliers Non-Scalable Scalable G1 G2 • McIvor, et al. • based on 5-to-2 CSA • based on 4-to-2 CSA • Koc & Tenca • radix 2 • radix 4 G3 • Huang, et al. • Architecture 2 • Huang, et al. • Architecture 1 G4 G5 • Savas et al. • radix 2 • radix 4 • Harris, et al. • radix 2 • radix 4 G6 • Suzuki • Virtex 5 DSP • Stratix III DSP
Investigated Montgomery Multipliers Non-Scalable Scalable • flexible, can handle multiple • operand sizes • operand size is described • by a special input, and can be • changed during run-time • size of the circuit • is constant • dedicated to one particular • operand size • operand size is described • by a generic, and can be • changed only after • reconfiguration • size of the circuit varies • as a function of the operand • size
Assumptions (1) Operand sizes: Evaluated parameters: Max. Clock Frequency [MHz] Min. Latency [clock cycles] Min. Latency [μs] Resource Utilization (CLB slices/ALUTs, DSP Units, Block Memories) Latency x Area [μs x CLB slices/ALUTs]
Project 2 Rules • Montgomery Multiplier - required • Montgomery Exponentiation Unit – bonus • Virtex 5 and Stratix III – required • Virtex 6 and Stratix IV - bonus • 1024 and 2048 bit operand sizes required • 3072 and 4096 bit operand sizes bonus
Assumptions (2) • Uniform Interface • (to be provided, but may need to be tweaked • depending on the architecture) • Test vectors generated using reference software • implementation • (may need to be extended to generate • intermediate results) • Your own testbench.
Montgomery Multipliers based on Carry Save Adders
Carry Save Adder (CSA) cn-1 c2 c1 c0 bn-1 b2 b1 b0 an-1 a2 a1 a0 . . . FA FA FA FA c3 c2 s2 s3 c1 s1 s0 sn-1 cn cn-1
Operation of a Carry Save Adder (CSA) Example 20 22 23 24 21 x y z 0 1 0 1 0 1 1 0 1 1 1 0 1 1 1 s c 0 0 1 1 0 1 1 011 x+y+z = s + c
x3 x2 x1 x0 y3 y2 y1 y0 z3 z2 z1 z0 w3 w2 w1 w0 s3 s2 s1 s0 c4 c3 c2 c1 w3 w2 w1 w0 c4 s3 s2 s1 s0 c4 c3 c2 c1 ’ ’ ’ ’ ’ ’ ’ ’ S5 S4 S3 S2 S1 S0 Carry-save adder for four operands
Carry-save adder for four operands s0 s3 s2 c2 s1 c1 c3 c4 s0 ’ ’ s3 s2 s1 ’ ’ c4 ’ c3 c2 c1 ’ ’ ’
Carry-save adder for four operands z y w x 4 4 4 4 CSA c s CSA s’ c’ CPA S
Carry Save Reduction 4-to-2 U+V+W+Y = S+C
Radix-2 Montgomery Multiplier Based on Carry Save Reduction 4-to-2
Montgomery Multipliers and Exponentiation Units by Mc Ivor, et al.