IELM 511: Information System Design

Introduction
Part 1. ISD for well-structured data: relational and other DBMS
  - Info storage (modeling, normalization)
  - Info retrieval (relational algebra, calculus, SQL)
  - DB-integrated APIs
Part 2. ISD for systems with non-uniformly structured data
  - Basics of web-based IS (www, web2.0, …)
  - Markups: HTML, XML
  - Design tools for Info Sys: UML
Part 3.
  - (Subset of) APIs for mobile apps
  - Security, cryptography
  - IS product lifecycles
  - Algorithm analysis: P, NP, NPC

Agenda
  - The mathematical basis for RSA encryption
  - Modulo mathematics: +, *, ^
  - How RSA is implemented
  - Proof of correctness of RSA
  - Concluding remarks

Need for RSA
In the last lecture, we saw the use of (shared) private-key cryptography.
  Example: e-banking (you may need to collect the password in person)
Shared-key cryptography does not solve all communication problems.
  Example: secure e-commerce (how did you exchange a password with Amazon? With Yahoo Shopping?)
We also saw the need for public-key/private-key encryption systems (digital signatures, secure transmission).
In this lecture, we look at the theoretical basis for the RSA algorithm, which is used (in some form or other) in public-private key cryptography.
The theoretical basis for the RSA algorithm: number theory, algorithms

Modulo mathematics
Given an integer m and a positive integer n, m mod n is the smallest nonnegative integer r such that, for some integer q,
  m = nq + r
Examples:
  27 mod 3 = 0    [since 27 = 3*9 + 0]
  27 mod 4 = 3    [since 27 = 4*6 + 3]
  -27 mod 4 = 1   [since -27 = 4*(-7) + 1]
Note: this definition works for positive and negative m.

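As a quick check of the definition, the sketch below relies on the fact that Python's % operator returns the smallest nonnegative remainder whenever n is positive, so it agrees with the definition even for negative m. (Some languages, such as C, truncate toward zero instead.) The helper name mod is only an illustration and is not part of the lecture.

```python
def mod(m: int, n: int) -> int:
    """Return the smallest nonnegative r such that m = n*q + r (n must be positive)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return m % n  # for n > 0, Python's % already satisfies 0 <= r < n


# The three examples from the slide
print(mod(27, 3), mod(27, 4), mod(-27, 4))  # prints: 0 3 1
```
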
Modulo ring
Z_n is the set of integers {0, 1, ..., n − 1} with two operators:
  addition modulo n, denoted +_n:         i +_n j = (i + j) mod n
  multiplication modulo n, denoted *_n:   i *_n j = (i * j) mod n
Exercises:
  Prove that +_n and *_n satisfy the commutative property.
  Prove that *_n distributes over +_n.

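The two operators can be sketched in a few lines. The exhaustive check below is not a proof of the exercises; it only confirms the commutative and distributive properties for one small n. The function names are assumptions made for this illustration.

```python
from itertools import product


def add_mod(i: int, j: int, n: int) -> int:
    """i +_n j"""
    return (i + j) % n


def mul_mod(i: int, j: int, n: int) -> int:
    """i *_n j"""
    return (i * j) % n


def check_ring_laws(n: int) -> bool:
    """Check commutativity and distributivity over every element of Z_n."""
    elems = range(n)
    commutative = all(
        add_mod(i, j, n) == add_mod(j, i, n) and mul_mod(i, j, n) == mul_mod(j, i, n)
        for i, j in product(elems, repeat=2)
    )
    distributive = all(
        mul_mod(i, add_mod(j, k, n), n)
        == add_mod(mul_mod(i, j, n), mul_mod(i, k, n), n)
        for i, j, k in product(elems, repeat=3)
    )
    return commutative and distributive


print(check_ring_laws(12))  # prints: True
```
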
An insecure private key scheme: +_n
In all discussion, we will assume that a message is a lower-case English text message (with 26 characters).
In most encoding/decoding, we will use the notation a = 0; b = 1; …; z = 25.
Scheme:
  Secret key: integer k
  Encode: replace each letter x by x' = x +_26 k = (x + k) mod 26.
  Decode: replace each letter x' by x' −_26 k = (x' − k) mod 26.
Notes:
  1. (x' − k) can be negative [hence the usefulness of our mod definition!]
  2. Exercise: show that indeed (x +_26 k) −_26 k = x

An insecure private key scheme: +_n
Scheme:
  Secret key: integer k
  Encode: replace each letter x by x' = x +_26 k = (x + k) mod 26.
  Decode: replace each letter x' by x' −_26 k = (x' − k) mod 26.
Q: Why is this scheme insecure?
Answer: A scheme is insecure if an efficient algorithm exists that can decrypt an encrypted message without knowledge of the key, k.
In our scheme, k can have any value (infinitely many possibilities), BUT to find k, how many values do we need to try? Why?
Hint: i mod n = (i + kn) mod n for all integers k, so keys that differ by a multiple of 26 encode identically and at most 26 values need to be tried.

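A minimal sketch of the +_26 scheme and of the brute-force attack follows. It assumes, as the slides do, that the message contains only lower-case letters; the function names and the sample message are illustrative only.

```python
import string

ALPHABET = string.ascii_lowercase  # a = 0, b = 1, ..., z = 25


def shift(text: str, k: int) -> str:
    """Replace each lower-case letter x by (x + k) mod 26."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + k) % 26] for ch in text)


def encode(text: str, k: int) -> str:
    return shift(text, k)


def decode(text: str, k: int) -> str:
    return shift(text, -k)  # (x' - k) mod 26; the result stays in 0..25


def brute_force(ciphertext: str) -> list:
    """Return all 26 candidate plaintexts; keys k and k + 26 are equivalent."""
    return [decode(ciphertext, k) for k in range(26)]


secret = encode("attackatdawn", 1000003)  # a "large" key does not help
print("attackatdawn" in brute_force(secret))  # prints: True
```

An attacker only has to read through 26 candidates and pick the one that looks like English.
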
So +_n does not work; how about *_n?
Scheme:
  1. Code the message into (a series of) number(s): Message = M
  2. Private key: integers a, n
  3. Encode: f_{a,n}(M) = a *_n M = (a * M) mod n
  4. Decode: ??
For this scheme, we need an inverse for multiplication mod n, namely some function g_{a,n}(X) = a^{-1} *_n X such that g_{a,n}(f_{a,n}(M)) = M.
Question: Is there some such function g()? In other words, we are looking for a definition of a multiplicative inverse.

Crypto scheme using *_n …
f_{a,n}(M) = a *_n M
[Table: multiplication table of Z_12, indexed by a and M]
Suppose (a, n, M) = (4, 12, 3): 4 * 3 mod 12 = 0
Impossible to decrypt! The recipient gets message = 0, but in row a = 4 of the Z_12 table four values of M (0, 3, 6 and 9) give 0.

Crypto scheme using *_n …
f_{a,n}(M) = a *_n M
Second try: (a, n, M) = (5, 12, 7): 5 * 7 mod 12 = 11
Only one entry equals 11 in row a = 5 of the Z_12 table, so the recipient decrypts M = 7!
Conclusion: This scheme works iff all entries in row a of the Z_n table are unique (and indeed, are a permutation of the set {0, 1, …, n−1}).
Question: which combinations of values n, a have this property?

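The two rows discussed above can be regenerated directly. The sketch below builds row a of the Z_n multiplication table and tests whether it is a permutation of {0, …, n−1}; the helper names are hypothetical.

```python
def row(a: int, n: int) -> list:
    """Row a of the Z_n multiplication table: a *_n M for M = 0 .. n-1."""
    return [(a * m) % n for m in range(n)]


def is_decodable(a: int, n: int) -> bool:
    """True when row a is a permutation of {0, ..., n-1}."""
    return sorted(row(a, n)) == list(range(n))


print(row(4, 12))  # [0, 4, 8, 0, 4, 8, 0, 4, 8, 0, 4, 8]
print(row(5, 12))  # [0, 5, 10, 3, 8, 1, 6, 11, 4, 9, 2, 7]
print([a for a in range(12) if is_decodable(a, 12)])  # [1, 5, 7, 11]
```
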
Primes, relative primes, and GCDs in *_n
A number > 1 is called a prime if the only numbers that divide it without remainder are itself and 1.
Given two numbers, a and b, we define gcd(a, b) as the largest integer that divides both a and b without remainder.
Two numbers, a and b, are called relatively prime if gcd(a, b) = 1.
Examples:
  2, 3, 5, 7, … are prime numbers. How many prime numbers are there?
  gcd(12, 3) = 3
  gcd(12, 5) = 1
  Given a prime number p, what is gcd(p, n)?

Primes, relative primes, and GCDs in *_n
A useful theorem and corollary
Theorem 1. Given two positive integers j, k, gcd(j, k) = 1 iff there are integers x and y such that jx + ky = 1.
Corollary 2. For any positive integer n, an element a ∈ Z_n has a multiplicative inverse if and only if gcd(a, n) = 1.

How to compute gcd(a, b): Euclid's method
Lemma 3. Let j, k, q, and r be nonnegative integers such that k = jq + r. Then gcd(j, k) = gcd(r, j).
Proof:
Case 1. r = 0
  gcd(r, j) = gcd(0, j) = j (since everything divides 0), and k = jq, therefore gcd(k, j) = j.
Case 2. r > 0
  (i) Let d be a common factor of j and k
      ⇒ there are integers x, y > 0 such that j = xd and k = yd
      ⇒ yd = xdq + r
      ⇒ r = d(y − xq)
      ⇒ d is a factor of r, and hence a common factor of r, j.
  (ii) Let d be a common factor of r, j
      ⇒ there are integers x, y > 0 such that r = dx and j = dy
      ⇒ k = dyq + dx = d(yq + x)
      ⇒ d is a common factor of k, j.
  From (i) and (ii), d is a common factor of r, j iff it is a common factor of j, k, which implies that gcd(j, k) = gcd(r, j).

How to compute gcd(a, b): Euclid's method
Lemma 3. Let j, k, q, and r be nonnegative integers such that k = jq + r. Then gcd(j, k) = gcd(r, j).
Algorithm gcd(k, j)
  1. gcd(k, j) where 0 ≤ j < k
  2. If (j = 0) return(k)
  3. Else
  4.   r = k mod j;   // therefore k = jq + r
  5.   return gcd(j, r)
Example: gcd(235, 141)
  iteration 1: gcd(235, 141): k = 235; j = 141; r = k mod j = 235 − 1 * 141 = 94
  iteration 2: gcd(141, 94):  k = 141; j = 94;  r = 141 − 1 * 94 = 47
  iteration 3: gcd(94, 47):   k = 94;  j = 47;  r = 94 − 2 * 47 = 0
  iteration 4: gcd(47, 0):    returns 47.

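The recursive algorithm translates almost line for line into Python. In this sketch the optional trace argument is an addition for illustration only; it records the (k, j) pair seen at each call so the iterations of the worked example can be compared.

```python
def gcd(k: int, j: int, trace=None) -> int:
    """Euclid's method for 0 <= j < k, following the slide's recursion."""
    if trace is not None:
        trace.append((k, j))  # record the arguments of this call
    if j == 0:
        return k
    r = k % j  # therefore k = j*q + r
    return gcd(j, r, trace)


steps = []
print(gcd(235, 141, steps))  # prints: 47
print(steps)  # [(235, 141), (141, 94), (94, 47), (47, 0)]
```
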
Can we use *_n and its inverse to design asymmetric keys?
Not quite: such a mechanism is not secure. First, let's look at the scheme that works: RSA.
RSA (named after Profs. Rivest, Shamir & Adleman) was proposed in the 1970s at MIT.
It is the basis of almost all eCommerce security today.
Main idea:
- The public key, K_p, provides a mechanism to encode the message.
- Given K_p and the encrypted message M* = rsa(K_p, M), we cannot efficiently compute K_p^{-1}.
- The secret key, K_s, provides an efficient means to compute K_p^{-1}.
Before studying the theory behind RSA, let's first see how RSA functions.

The RSA scheme
  1. Select two large prime numbers, p and q
  2. Let n = pq; let T = (p − 1)(q − 1)
  3. Select a large prime, e (e != 1), such that gcd(e, T) = 1
  4. Calculate d = e^{-1} mod T
  5. The public key, K_p, is (n, e)
  6. The secret key, K_s, is d
Notes:
  Large prime: a prime number with 150 digits or more (later we shall see why)
  Is T prime?
  In step 3, e is selected so that e, T are relatively prime.

RSA: usage and security
Suppose Alice wants to send Bob a message, x (0 < x < n).
  1. Alice gets Bob's public key, (n, e)
  2. Alice computes x* = x^e mod n
  3. Alice sends x* to Bob.
Bob wants to decrypt the message received from Alice:
  1. Bob looks up his secret key, d
  2. Bob computes x** = (x*)^d mod n
Claim: x** = x = original message that Alice wants to send.
To prove that RSA works, we need to prove the following:
  1. Correctness: (x^e mod n)^d mod n = x
  2. Security:
     2.1. A party who knows n, e, and M^e mod n, but not p, q, or d, cannot compute M
     2.2. A party who knows n (public key) cannot find its factors p, q (otherwise they could easily calculate d!)

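The exchange between Alice and Bob can be sketched with toy numbers. The primes 61 and 53, the exponent 17 and the message 65 are assumptions chosen only to keep the arithmetic readable; they are far too small to be secure. The modular inverse uses Python's built-in pow(e, -1, T), which needs Python 3.8 or later, and the function names are illustrative.

```python
from math import gcd


def make_keys(p: int, q: int, e: int):
    """Steps 2 to 6 of the scheme: return the public key (n, e) and the secret key d."""
    n = p * q
    T = (p - 1) * (q - 1)
    if gcd(e, T) != 1:
        raise ValueError("e must be relatively prime to T")
    d = pow(e, -1, T)  # d = e^{-1} mod T (Python 3.8+)
    return (n, e), d


def encrypt(x: int, public_key) -> int:
    n, e = public_key
    return pow(x, e, n)  # x* = x^e mod n


def decrypt(x_star: int, public_key, d: int) -> int:
    n, _ = public_key
    return pow(x_star, d, n)  # x** = (x*)^d mod n


public_key, d = make_keys(61, 53, 17)  # toy primes, NOT secure
x = 65
x_star = encrypt(x, public_key)
print(public_key, d, decrypt(x_star, public_key, d) == x)
```
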
Multiplicative inverse modulo n
RSA involves the following step:
  …
  4. Calculate d = e^{-1} mod T
What is e^{-1}?
In Z_n, we say that a^{-1} is the multiplicative inverse of a (!= 0) iff a *_n a^{-1} = a^{-1} *_n a = 1
Does such an inverse always exist? If so, how can we compute it?
Example (the values shown match Z_12; "-" means no inverse exists):
   a    a^{-1}
  ---   ------
   1     1
   2     -
   3     -
   4     -
   5     5
   6     -
   7     7
   8     -
   9     -
  10     -
  11     11

Computing the multiplicative inverse
Recall Theorem 1. Given two positive integers j, k, gcd(j, k) = 1 iff there are integers x and y such that jx + ky = 1.
We need a solution to a *_n x = 1, which is the same as ax mod n = 1
  ⇒ ax = qn + r (for some integer q, and r = 1)
  ⇒ ax + (−q)n = 1
Claim: If a ∈ Z_n, and x, y are integers such that ax + ny = 1, then a^{-1} = x mod n.
Proof (sketch):
  a *_n x = a *_n x +_n n *_n y     [since n *_n y = 0]
          = (ax + ny) mod n         [since (s + t) mod n = (s mod n + t mod n) mod n; exercise: prove this]
          = 1

Computing the multiplicative inverse..
To solve a *_n x = 1, we need to find two integers x, y such that ax + ny = 1.
The following algorithm, called as gcd_xy(n, a), solves for x (if it exists):
Algorithm gcd_xy(k, j)   // 0 < j < k
  // returns: [x, y, gcd(j, k)] such that jx + ky = gcd(j, k)
  1. If k = jq, return [x = 1, y = 0, gcd(k, j) = j];
  2. Else
  3.   r = k mod j;   // therefore k = jq + r
  4.   q = (k − r)/j
  5.   [x', y', gcd(r, j)] = gcd_xy(j, r)   // so that rx' + jy' = gcd(r, j)
  6.   return [x = y' − qx', y = x', gcd(r, j)]
Exercise: prove that step 6 returns the correct values of x, y

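A runnable sketch of gcd_xy, keeping the slide's argument order (larger number first), is given below. The wrapper inverse is a hypothetical helper added here to show how the result yields a^{-1} in Z_n; it assumes 0 < a < n.

```python
def gcd_xy(k: int, j: int):
    """Return (x, y, g) with j*x + k*y == g == gcd(j, k), assuming 0 < j < k."""
    r = k % j
    if r == 0:  # k = j*q, so gcd(k, j) = j = j*1 + k*0
        return 1, 0, j
    q = (k - r) // j  # k = j*q + r
    x1, y1, g = gcd_xy(j, r)  # r*x1 + j*y1 == g
    # Substitute r = k - j*q:  j*(y1 - q*x1) + k*x1 == g
    return y1 - q * x1, x1, g


def inverse(a: int, n: int):
    """Return a^{-1} in Z_n, or None when gcd(a, n) != 1 (assumes 0 < a < n)."""
    x, _, g = gcd_xy(n, a)
    return x % n if g == 1 else None


print(gcd_xy(12, 5))  # (5, -2, 1), since 5*5 + 12*(-2) = 1
print(inverse(5, 12), inverse(4, 12))  # 5 None
```

Reducing x modulo n at the end matters because the x returned by gcd_xy can be negative or larger than n.
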
Correctness of RSA
  1. Select two large prime numbers, p and q
  2. Let n = pq; let T = (p − 1)(q − 1)
  3. Select a large prime, e (e != 1), such that gcd(e, T) = 1
  4. Calculate d = e^{-1} mod T
  5. The public key, K_p, is (n, e)
  6. The secret key, K_s, is d
We need to prove that: (x^e mod n)^d mod n = x
We will use the following: for any a ∈ Z_n and non-negative integers i, j
  (a) (a^i mod n) *_n (a^j mod n) = a^{i+j} mod n
  (b) (a^i mod n)^j mod n = a^{ij} mod n
and Fermat's little theorem: Let p be a prime number. Then, for every nonzero a ∈ Z_p, a^{p−1} mod p = 1.

Correctness of RSA…
Recall: primes p, q; n = pq; T = (p − 1)(q − 1); e chosen such that gcd(e, T) = 1; d = e^{-1} mod T
We first prove that for prime p (or q), x^{ed} mod p = x mod p.
  ed mod T = 1 ⇒ there is some integer k such that ed = 1 + kT
  x^{ed} mod p = x^{1 + k(p−1)(q−1)} mod p = x * (x^{k(q−1)})^{(p−1)} mod p
  Case 1. x^{k(q−1)} is a multiple of p
    ⇒ x is a multiple of p (since p is prime)
    ⇒ x^{ed} mod p = 0 = x mod p
  Case 2. x^{k(q−1)} is not a multiple of p
    ⇒ (x^{k(q−1)})^{(p−1)} mod p = 1 (Fermat's little theorem)
    ⇒ x^{ed} mod p = x * 1 mod p = x mod p
So x^{ed} mod p = x mod p, and by the same argument x^{ed} mod q = x mod q
  ⇒ p and q both divide x^{ed} − x, i.e. x^{ed} − x = ip = jq for some integers i, j
  ⇒ x^{ed} − x is also divisible by pq [why?]
  ⇒ x^{ed} − x = m(pq) = mn for some integer m
  ⇒ x^{ed} = mn + x
Therefore, for 0 ≤ x < n, x^{ed} mod n = x.

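The proof can be sanity-checked numerically. The sketch below tries every x in Z_n for small, assumed parameters, including the values of x that are multiples of p or q (case 1 of the proof). It is a check on toy inputs, not a substitute for the proof; the function name is hypothetical and pow(e, -1, T) requires Python 3.8 or later.

```python
def rsa_roundtrip_holds(p: int, q: int, e: int) -> bool:
    """Check (x^e mod n)^d mod n == x for every x with 0 <= x < n."""
    n = p * q
    T = (p - 1) * (q - 1)
    d = pow(e, -1, T)  # raises ValueError if gcd(e, T) != 1
    return all(pow(pow(x, e, n), d, n) == x for x in range(n))


# Toy parameters (assumed for illustration): p = 5, q = 11, e = 3
print(rsa_roundtrip_holds(5, 11, 3))  # prints: True
```
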
Security of RSA
To show that RSA is secure, we need some guarantee that
  2.1. A party who knows n, e, and M^e mod n, but not p, q, or d, cannot compute M
  2.2. A party who knows n (public key) cannot find its factors p, q (otherwise they could easily calculate d!)
Given n, e, and M^e mod n, can we work backwards and compute M?
  There is no known efficient algorithm to compute the e-th root of a number mod n.
  [Note: if n were always fixed, we could use a computer to build up a look-up decrypting sheet!]
Given n (public key), can we find its factors p, q, use them to compute T, and then use e to compute d?
  So far, there is no known efficient algorithm to factorize a number.

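To see why point 2.2 depends on the primes being large, the sketch below breaks a toy key by trial division and then recomputes d exactly as the key owner would. The numbers n = 3233 and e = 17 are assumed toy values, the function names are hypothetical, and trial division is used only because it is the simplest factoring method to read.

```python
def factor_semiprime(n: int):
    """Trial division: return (p, q) with p*q == n, p being the smallest factor > 1."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    raise ValueError("n has no nontrivial factor")


def recover_secret_key(n: int, e: int) -> int:
    """Given only the public key (n, e), factor n and recompute d = e^{-1} mod T."""
    p, q = factor_semiprime(n)
    T = (p - 1) * (q - 1)
    return pow(e, -1, T)  # Python 3.8+


print(factor_semiprime(3233))  # (53, 61)
print(recover_secret_key(3233, 17))  # 2753
```

Trial division needs on the order of sqrt(n) divisions in the worst case, which is instant for a 4-digit n but hopeless when p and q each have 150 digits or more.
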
Discussion
RSA is currently the basis for almost all secure eCommerce.
  Examples: banks (e.g. try hsbc.com, standardchartered.com.hk, …), signed emails (e.g. HKUST's ITSC)
Once RSA has established a secure communication channel, two-way symmetric encryption is used, usually some variant of DES, which is a block cipher algorithm.
Three important mathematicians whose works were used in this lecture:
  Euclid (300 BC)
  Fermat (17th century)
  Euler (18th century)
