280 likes | 291 Views
Learn about cryptographic hash functions and their properties such as one-wayness, collision resistance, and how they can be used to protect data integrity and authenticity. Explore attacks on message digests and the importance of collision-resistant hash functions.
E N D
Overview • Cryptographic hash functions are functions that: • Map an arbitrary-length (but finite) input to a fixed-size output • Are one-way (hard to invert) • Are collision-resistant (difficult to find two values that produce the same output) • Examples: • Message digest functions - protect the integrity of data by creating a fingerprint of a digital document • Message Authentication Codes (MAC) - protect both the integrity and authenticity of data by creating a fingerprint based on both the digital document and a secret key Chapter 4 Hash Functions 1
Checksums vs. Mess. Digests • Checksums: • Used to produce a compact representation of a message • If the message changes the checksum will probably not match • Good: accidental changes to a message can be detected • Bad: easy to purposely alter a message without changing the checksum • Message digests: • Used to produce a compact representation (called the fingerprint or digest) of a message • If the message changes the digest will probably not match • Good: accidental changes to a message can be detected • Good: difficult to alter a message without changing the digest Chapter 4 Hash Functions 2
Hash Functions • Message digest functions are hash functions • A hash function, H(M)=h, takes an arbitrary-length input, M, and produces a fixed-length output, h • Example hash function: • H = sum all the letters of an input word modulo 26 • Input = a word • Output = a number between 0 and 25, inclusive • Example: • H(“Elvis”) = ((‘E’ + ‘L’ + ‘V’ + ‘I’ + ‘S’) mod 26) • H(“Elvis”) = ((5+12+22+9+19) mod 26) • H(“Elvis”) = (67 mod 26) • H(“Elvis”) = 15 Chapter 4 Hash Functions 3
Collisions • For the hash function: • H = sum all the letters of an input word modulo 26 • There are more inputs (words) than possible outputs (numbers 0-25) • Some different inputs must produce the same output • A collision occurs when two different inputs produce the same output: • The values x and y are not the same, but H(x) and H(y) are the same Chapter 4 Hash Functions 4
Collisions - Example • H(“Jumpsuit”) = 25 • (‘J’ + ‘U’ + ‘M’ + ‘P’ + ‘S’ + ‘U’ + ‘I’ + ‘T’) mod 26 • (10+21+13+16+19+21+9+20) mod 26 • 129 mod 26 • 25 • H(“TCB”) = 25 • (‘T’ + ‘C’ + ‘B’) mod 26 • (20+3+2) mod 26 • 25 mod 26 • 25 Chapter 4 Hash Functions 5
Collision-Resistant Hash Functions • Hash functions for which it is difficult to find collisions are called collision-resistant • A collision-resistant hash function, H(M)=h: • For any message, M1 • It is difficult to find another message, M2 such that: • M1 and M2 are not the same • H(M1) and H(M2) are the same Chapter 4 Hash Functions 6
One-Way Hash Functions • A function, H(M)=h, is one-way if: • Forward direction: given M it is easy to compute h • Backward direction: given h it is difficult to compute M • A one-way hash function: • Easy to compute the hash for a given message • Hard to determine what message produced a given hash value Chapter 4 Hash Functions 7
Message Digest Functions • Message digest functions are collision-resistant, one-way hash functions: • Given a message it is easy to compute its digest • Hard to find any message that produces a given digest (one-way) • Hard to find any two messages that have the same digest (collision-resistant) Chapter 4 Hash Functions 8
Using Message Digest Functions • Message digest functions can be used to protect data integrity: • A company makes some software available for download over the World Wide Web • Users want to be sure that they receive a copy that has not been tampered with • Solution: • The company creates a message digest for its software • The digest is transmitted (securely) to users • Users compute their own digest for the software they receive • If the digests match the software probably has not been altered Chapter 4 Hash Functions 9
Attacks on Message Digests • Brute-force search for a collision: • Goal: • Find a message that produces a given digest, d • Assume: • The message digest function is “strong” • The message digest function creates n-bit digests • Approach: • Generate random messages and compute digests for them until one is found with digest d • Approximately 2nrandom messages must be tried to find one that hashes to d Chapter 4 Hash Functions 10
Attacks on MDs (cont) • Birthday attack (based on the birthday paradox): • Goal: • Find any two messages that produce the same digest • Assume: • The message digest function is “strong” • The message digest function creates n-bit digests • Approach: • Generate random messages and compute digests for them until two are found that produce the same digest • Approximately 2n/2random messages must be tried to find one that hashes to d Chapter 4 Hash Functions 11
The Secure Hash Algorithm • The Secure Hash Algorithm: • A Federal Information Processing Standard (FIPS 180-1) adopted by the U.S. government in 1995 • Based on a message digest function called MD4 created by Ron Rivest • Developed by NIST and the NSA • Input: a message of b bits • Output: a 160-bit message digest Chapter 4 Hash Functions 12
SHA - Padding • Input: a message of b bits • Padding makes the message length a multiple of 512 bits • The input is always padded (even if its length is already a multiple of 512) • Padding is accomplished by appending to the input: • A single bit, 1 • Enough additional bits, all 0, to make the final 512-bit block exactly 448 bits long • A 64-bit integer representing the length of the original message in bits Chapter 4 Hash Functions 13
SHA – Padding Example • Consider the following message: • M = 01100010 11001010 1001 (20 bits) • To pad we append: • 1 (1 bit) • 427 0s (427 bits) • 64-bit binary representation of the number 20 (64 bits) • Result: • Pad(M) = 01100010 11001010 10011000 00000000 . . . 00000000 00010100 (512 bits) • 464 0s have been omitted above (denoted by the ellipsis) Chapter 4 Hash Functions 14
SHA – Constant Init. • After padding, constants are initialized to the following hexadecimal values: • Five 32-bit words: • H0= 67452301 • H1= EFCDAB89 • H2= 98BADCFE • H3= 10325476 • H4= C3D2E1F0 • Eighty 32-bit words: • K0– K19= 5A827999 • K20 – K39= 6ED9EBA1 • K40 – K59= 8F1BBCDC • K60– K79= CA62C1D6 Chapter 4 Hash Functions 15
SHA – Step 1 • The padded message contains a whole number of 512-bit blocks, denoted B1, B2, B3, . . ., Bn • Each 512-bit block, Bi, of the padded message is processed in turn: • Bi is divided into 16 32-bit words, W0, W1, . . ., W15 • W0 is composed of the leftmost 32 bits in Bi • W1 is composed of the second 32 bits in Bi … • W15 is composed of the rightmost 32 bits in Bi Chapter 4 Hash Functions 16
SHA – Step 2 • W0, W1, . . ., W15 are used to compute 64 new 32-bit words (W16, W17, . . ., W79) • Wj (16 <j < 79) is computed by: • XORing words Wj-3, Wj-8, Wj-14, and Wj-16 together • Circularly left shifting the result one bit for j = 16 to 79 do Wj= Circular_Left_Shift_1(Wj-3Wj-8Wj-14Wj-16) done Chapter 4 Hash Functions 17
SHA – Step 3 • The values of H0, H1, H2, H3, and H4are copiedinto five words called A, B, C, D, and E: • A = H0 • B = H1 • C = H2 • D = H3 • E = H4 Chapter 4 Hash Functions 18
SHA – Step 4 • Four functions are defined as follows: • For (0 <j < 19): • fj(B,C,D) = (B AND C) OR ((NOT B) AND D) • For (20 <j < 39): • fj(B,C,D) = (B C D) • For (40 <j < 59): • fj(B,C,D) = ((B AND C ) OR (B AND D) OR (C AND D)) • For (60 <j < 79): • fj(B,C,D) = (B C D) Chapter 4 Hash Functions 19
SHA – Step 4 (cont) • For each of the 80 words, W0, W1, . . ., W79, a 32-bit word called TEMP is computed • The values of the words A, B, C, D, and E are updated as shown below: for j = 0 to 79 do TEMP = Circular_Left_Shift_5(A) + fj(B,C,D) + E + Wj+ Kj E = D; D = C; C = Circular_Left_Shift_30(B); B = A; A = TEMP done Chapter 4 Hash Functions 20
SHA – Step 5 • The values of H0, H1, H2, H3, and H4, are updated: • H0= H0+ A • H1= H1+ B • H2= H2+ C • H3= H3+ D • H4= H4+ E Chapter 4 Hash Functions 21
SHA - Overview • Pad the message • Initialize constants • For each 512-bit block (B1, B2, B3, . . ., Bn): • Divide Bi into 16 32-bit words (W0– W15) • Compute 64 new 32-bit words (W16, W17, . . ., W79) • Copy H0 -H4 into A, B, C, D, and E • For each Wj (W0– W79) compute TEMP and update A-E • Update H0 - H4 • The 160-bit message digest is: H0 H1 H2 H3 H4 Chapter 4 Hash Functions 22
Motivation for Message Authentication Codes • Want to use a message digest function to protect files on our computer from viruses: • Calculate digests for important files and store them in a table • Recompute and check from time to time to verify that the files have not been modified • Good: if a virus modifies a file the change will be detected since the digest of that file will be different • Bad: the virus could just compute new digests for modified files and install them in the table Chapter 4 Hash Functions 23
Message Authentication Codes • A message authentication code (MAC) is a key-dependent message digest function • MACK(M) = h • The output, h, is a function of both the hash function and a key, K • The MAC can only be created or verified by someone who knows K • Can turn a one-way hash function into a MAC by encrypting the hash value with a symmetric-key cryptosystem Chapter 4 Hash Functions 24
Using MAC • MAC can be used to protect data integrity and authenticity: • Want to use a MAC to protect files on our computer from viruses: • Calculate MAC values for important files and store them in a table • Recompute and check from time to time to verify that the files haven’t been modified • Good: if a virus modifies a file the hash of that file will be different • Good: virus doesn’t know the proper key so it can’t install new MACs in the table to cover its tracks Chapter 4 Hash Functions 25
Implementing a MAC • Can use a block cipher algorithm: • Pad the message (if necessary) so that its length is a multiple of the cipher’s block size • Divide the message into n blocks equal in length to the cipher’s block size: • m1, m2, . . ., mn • Choose a key, k • Encrypt m1 with k • XOR the result with m2 • Encrypt the result with k • XOR the result with m3 … Chapter 4 Hash Functions 26
Implementing a MAC (cont) Chapter 4 Hash Functions 27
Summary • Message digests • Message digest functions are collision-resistant, one-way hash functions • Collision-resistant: hard to find two values that produce the same output • One-way: hard to determine what input produced a given output • Protects the integrity of a digital document • MAC • A message authentication code is a key-dependent message digest function • The output is a function of both the hash function and a secret key • The MAC can only be created or verified by someone who knows the key • Protects the integrity and authenticity of a digital document Chapter 4 Hash Functions 28