Information Theory Nathanael Paul Oct. 09, 2002
Claude Shannon: Father of Information Theory • “Communication Theory of Secrecy Systems” (1949) • Cryptography becomes a science • Why is information theory so important in cryptography?
Some Terms • (P,C,K,E,D): plaintext space, ciphertext space, key space, encryption rules, decryption rules • Computational Security • the computational effort required to break the cryptosystem • Provable Security • security proven relative to another, difficult problem • Unconditional Security • Oscar (the adversary) can do whatever he wants, with as much computing power as he wants
Applying probability to cryptography • Each message x in P has a probability, as does each key k in K • Given an x in P and a k in K, a y in C is uniquely determined • Given a k in K and a y in C, an x in P is uniquely determined • These induce a probability distribution on the ciphertext space: for a fixed y, pC(y) = Σ pK(k) · pP(dk(y)), where the sum is over all keys k for which y is a possible ciphertext
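To make the induced distribution concrete, here is a minimal Python sketch, assuming a toy shift cipher over Z3 and made-up plaintext and key distributions (none of which appear in the slides):

```python
from fractions import Fraction

# Hypothetical toy cryptosystem: a shift cipher on Z_3, with assumed
# (illustrative) plaintext and key distributions.
P = K = [0, 1, 2]
p_P = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
p_K = {k: Fraction(1, 3) for k in K}      # uniform (truly random) keys

def encrypt(k, x):
    return (x + k) % 3

# Induced ciphertext distribution: pC(y) = sum of pK(k) * pP(x)
# over all (k, x) pairs with encrypt(k, x) = y.
p_C = {y: Fraction(0) for y in P}
for k in K:
    for x in P:
        p_C[encrypt(k, x)] += p_K[k] * p_P[x]

print(p_C)   # uniform keys flatten it: {0: 1/3, 1: 1/3, 2: 1/3}
```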
Some probability theory… • Probability distribution on X • Joint probability • Conditional probability • Bayes’ Theorem
Probability Distribution of X • p(x) – probability function of X • X takes on a finite (or countably infinite) number of possible values x • Ex. x is a letter in a substitution cipher, where X is the plaintext space • P(X = x) = p(x) >= 0, and Σ p(x) = 1, where this sum is over all possible values of x
Joint Probability • Let X1 and X2 denote random variables • p(x1,x2) = P(X1 = x1, X2 = x2) • “The probability that X1 will take on the value x1 and X2 will take on the value x2” • If X1 and X2 are independent, then • p(x1,x2) = p(x1) * p(x2)
Conditional Probability • “What is the probability of x given y?” • p(x|y) = p(x,y)/p(y), defined when p(y) > 0 • If p(X = x|Y = y) = p(X = x) for all x and y, then X and Y are independent.
Bayes’ Theorem • p(x,y) = p(x) * p(y | x) = p(y) * p(x | y) • Rearranging: p(x | y) = p(x) * p(y | x) / p(y), provided p(y) > 0
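A short sketch checking these identities on a toy joint distribution (the numbers are invented purely for illustration):

```python
from fractions import Fraction

# Hypothetical joint distribution p(x, y) over X in {0, 1} and Y in {0, 1}.
p_XY = {(0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
        (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4)}

def p_X(x):                       # marginal: p(x) = sum over y of p(x, y)
    return sum(p for (xx, _), p in p_XY.items() if xx == x)

def p_Y(y):                       # marginal: p(y) = sum over x of p(x, y)
    return sum(p for (_, yy), p in p_XY.items() if yy == y)

def p_X_given_Y(x, y):            # conditional: p(x | y) = p(x, y) / p(y)
    return p_XY[(x, y)] / p_Y(y)

def p_Y_given_X(y, x):            # conditional: p(y | x) = p(x, y) / p(x)
    return p_XY[(x, y)] / p_X(x)

# Bayes' theorem: p(x | y) = p(x) * p(y | x) / p(y), exactly.
for x in (0, 1):
    for y in (0, 1):
        assert p_X_given_Y(x, y) == p_X(x) * p_Y_given_X(y, x) / p_Y(y)
print("Bayes' theorem verified on the toy distribution")
```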
Perfect Secrecy Defined • A cryptosystem (P,C,K,E,D) has perfect secrecy if the “ciphertext yields no information about the plaintext”, i.e. p(x | y) = p(x) for every x in P and y in C
Perfect Secrecy Defined Suppose a cryptosystem (P,C,K,E,D) has |K| = |C| = |P|. This cryptosystem has P.S. iff the following hold: - Each key is chosen truly at random, i.e. used with probability 1/|K| - For each x in P and y in C, there is a unique key k such that Ek(x) = y.
Perfect Secrecy (P.S.) implies |P| <= |K| and |C| <= |K| • Claim: Perfect Secrecy (P.S.) implies |P| <= |K| and |C| <= |K| • Fix a y in C that occurs with positive probability. For each x in P, P.S. gives pP(x | y) = pP(x) > 0. • So for each x there must exist a k in K with Ek(x) = y; otherwise pP(x | y) would be 0 while pP(x) > 0. • Distinct plaintexts need distinct keys (decrypting y under k gives a unique x), so |K| >= |P|; fixing x and varying y instead gives |K| >= |C| the same way.
Conclusion about Perfect Secrecy “The key space must be at least as large as the message space, and at least as large as the ciphertext space.”
Perfect Secrecy Example • P = C = K = Z26 = {0,1,2,...,24,25} • Ek(x) = x + k mod 26, Dk(y) = y – k mod 26 • p(k) = 1/26 and p(x) = any given distribution • note: the key must be truly random, with a fresh key for each plaintext letter
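A sketch of this shift-cipher example, verifying perfect secrecy exactly: for every fixed y, the posterior p(x | y) equals the prior p(x). The skewed plaintext distribution below is an arbitrary assumption; the check succeeds for any distribution:

```python
from fractions import Fraction
import random

MOD = 26

def encrypt(x, k):
    return (x + k) % MOD

def decrypt(y, k):
    return (y - k) % MOD

# Assumed, skewed plaintext distribution; any distribution works.
p_P = {x: Fraction(x + 1, sum(range(1, MOD + 1))) for x in range(MOD)}
p_K = Fraction(1, MOD)              # the key must be uniform (truly random)

# Check p(x | y) = p(x) exactly, via Bayes: p(x | y) = p(x) * p(y | x) / p(y).
for y in range(MOD):
    p_C_y = sum(p_K * p_P[decrypt(y, k)] for k in range(MOD))  # p(y)
    for x in range(MOD):
        posterior = p_P[x] * p_K / p_C_y    # p(y | x) = p_K: one key maps x to y
        assert posterior == p_P[x]          # perfect secrecy holds

# Usage: encrypt a single letter with a fresh random key.
k = random.randrange(MOD)
y = encrypt(7, k)
print(y, decrypt(y, k))                     # second value is always 7
```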
Entropy • Want to be able to measure the “uncertainty” or “information” of some random variable X. • Entropy • a measure of information • “How much information or uncertainty is in a cryptosystem?”
Entropy (cont.) • Given: • X, a random variable • taking a finite number of values with probabilities p1, ..., pn Entropy is: H(X) = – Σ pi log2(pi), summing over i = 1, ..., n
Entropy examples • X takes values X1, X2 with probabilities 1, 0. Entropy = 0, since there is no choice: X1 will happen 100% of the time. H(X) = 0. • X takes values X1, X2 with probabilities ¾, ¼, so X1 is more likely than X2. H(X) = – (¾ log2(¾) + ¼ log2(¼)) ≈ 0.81
Entropy examples (cont.) • X takes values X1, X2 with probabilities ½, ½. H(X) = – (½ log2(½) + ½ log2(½)) = 1 • X takes values X1, X2, ..., Xn with probabilities 1/n, 1/n, ..., 1/n. H(X) = – (1/n log2(1/n) · n) = log2(n)
Entropy examples (cont.) • If X is a random variable with n possible values: • H(X) <= log2(n), with equality iff each value has equal probability (i.e. 1/n) • By Jensen’s Inequality, log2(n) provides an upper bound on H(X) • If X is the month of the year: H(X) = log2(12) ≈ 3.6 (about 4 bits needed to encode the month)
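All of the examples above follow from the single formula H(X) = – Σ pi log2(pi); a small sketch reproducing them:

```python
import math

def entropy(probs):
    """H(X) = - sum of p_i * log2(p_i); zero-probability terms contribute 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(entropy([1, 0]))          # 0.0   -> no uncertainty at all
print(entropy([0.75, 0.25]))    # ~0.811
print(entropy([0.5, 0.5]))      # 1.0   -> one fair coin flip
print(entropy([1/12] * 12))     # ~3.585 = log2(12), the months example
```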
Unicity Distance • Assume in a given cryptosystem a message is a string x1,x2,...,xn where each xi is in P (xi is a letter or block) • Encrypt each xi individually with the same key k: yi = Ek(xi), 1 <= i <= n • How many ciphertext blocks yi do we need to determine k?
Unicity Distance (cont.) • Setting: a ciphertext-only attack with infinite computing power • Unicity Distance • The smallest n for which n ciphertext blocks (on average) uniquely determine the key • The one-time pad has infinite unicity distance
Defining a language • L: the set of all messages, of any length n >= 1 • “the natural language” • P2 = {(x1,x2) : x1, x2 in P}, the set of digrams • Pn = {(x1,x2,...,xn) : xi in P}, so Pn ⊆ L • each Pi inherits a probability distribution from L (digrams, trigrams, ...) • so H(Pi) makes sense
Entropy and Redundancy of a language What is the entropy of a language? HL = lim(n→∞) H(Pn)/n What is the redundancy of a language? RL = 1 – HL/log2|P|
Application of Entropy and Redundancy • 1 <= HL <= 1.5 in English • H(P) = 4.18 • H(P2) = 3.90 • RL = 1 – HL/log2(26) • about 70%, depending on the exact value of HL
Unicity in substitution cipher • n0 = log2|K| / (RL · log2|P|) • |P| = 26, |K| = 26! (all permutations) • n0 = log2(26!) / (0.70 · log2(26)), which is about 26.8 • Which means: on average, if you have 27 letters of ciphertext from a substitution cipher, you should have enough information to determine the key!
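A sketch of this computation, assuming HL = 1.4 (any value in the slides' 1 to 1.5 range works similarly) to reproduce RL ≈ 0.70:

```python
import math

ALPHABET = 26
H_L = 1.4                                   # assumed entropy of English (1 <= HL <= 1.5)
R_L = 1 - H_L / math.log2(ALPHABET)         # redundancy: RL = 1 - HL/log2(26)
print(f"RL ~ {R_L:.2f}")                    # ~0.70

# Substitution cipher: |K| = 26! permutations, |P| = 26 letters.
n0 = math.log2(math.factorial(ALPHABET)) / (R_L * math.log2(ALPHABET))
print(f"unicity distance ~ {n0:.1f} letters")   # ~26.8, i.e. about 27 letters
```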
Ending notes... • key equivocation • “How much information is revealed by the ciphertext about the key?” • H(K|C) = H(K) + H(P) – H(C) • Spurious keys • incorrect keys that are still consistent with the ciphertext • So reconsider our question: “Why can’t cryptography and math be separated?”
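As a closing illustration, a sketch of key equivocation for the one-letter shift cipher from the perfect-secrecy example (the plaintext distribution is again an arbitrary assumption): with uniform keys H(C) = log2(26), so H(K|C) = H(K) + H(P) – H(C) collapses to H(P):

```python
import math
from fractions import Fraction

MOD = 26
p_P = {x: Fraction(x + 1, sum(range(1, MOD + 1))) for x in range(MOD)}  # assumed
p_K = Fraction(1, MOD)                      # uniform key distribution

def H(probs):
    # H = - sum of p * log2(p), skipping zero-probability terms
    return sum(-float(p) * math.log2(float(p)) for p in probs if p > 0)

# Ciphertext distribution induced by the uniform key (it comes out uniform).
p_C = [sum(p_K * p_P[(y - k) % MOD] for k in range(MOD)) for y in range(MOD)]

H_K = math.log2(MOD)        # H(K) = log2(26) for uniform keys
H_P = H(p_P.values())
H_C = H(p_C)
print(H_K + H_P - H_C)      # key equivocation H(K|C); equals H_P here
```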