400 likes | 525 Views
CS/ECE Advanced Network Security Dr. Attila Altay Yavuz. Topic 3 Cryptographic Hash Functions Credit: Prof. Dr. Peng Ning Dr. Shai Halevi. Fall 2014. Hash Function Properties. Hash Function. Also known as Message digest One-way transformation One-way function Hash
E N D
CS/ECE Advanced Network SecurityDr. Attila Altay Yavuz Topic 3 Cryptographic Hash Functions Credit: Prof. Dr. Peng Ning Dr. Shai Halevi Dr. Attila Altay Yavuz Fall 2014
Hash Function Properties Dr. Attila Altay Yavuz
Hash Function • Also known as • Message digest • One-way transformation • One-way function • Hash • Length of H(m) much shorter then length of m • Usually fixed lengths: 128 or 160 bits Message of arbitrary length A fixed-length short message Hash
h M1 M2 ML-1 ML … d2 dL-1 dL d1 IV=d0 d=H(M) h h h h How are they built? But not always… Typically using Merkle-Damgård iteration: • Start from a “compression function” • h: {0,1}b+n{0,1}n • Iterate it |M|=b=512 bits c=160 bits d=h(c,M)=160 bits
What are they good for? “Modern, collision resistant hash functions were designed to create small, fixed size message digests so that a digest could act as a proxy for a possibly very large variable length message in a digital signature algorithm, such as RSA or DSA. These hash functions have since been widely used for many other “ancillary” applications, including hash-based message authentication codes, pseudo random number generators, and key derivation functions.” “Request for Candidate Algorithm Nominations”, -- NIST, November 2007
Desirable Properties of Hash Functions • Consider a hash function H • Performance: Easy to compute H(m) • One-way property: Given H(m) but not m, it’s computationally infeasible to find m • Weak collision resistance: Given H(m), it’s computationally infeasible to find m’ such that H(m’) = H(m). • Strong collision resistance: Computationally infeasible to find m1, m2 such that H(m1) = H(m2)
Length of Hash Image • Question • Why do we have 128 bits or 160 bits in the output of a hash function? • If it is too long • Unnecessary overhead • If it is too short • Birthday paradox • Loss of strong collision property
Birthday Paradox (Cont’d) • Implication for hash function H of length m • With probability at least 0.5 • If we hash about 2m/2 random inputs, • Two messages will have the same hash image • Birthday attack • Conclusion • Choose m 128, preferable m 160
Hash Function Use and Applications Dr. Attila Altay Yavuz
Using “imperfect” hash functions • Applications should rely only on “specific security properties” of hash functions • Try to make these properties as “standard” and as weak as possible • Increases the odds of long-term security • When weaknesses are found in hash function, application more likely to survive • E.g., MD5 is badly broken, but HMAC-MD5 is barely scratched
Security requirements • Deterministic hashing • Attacker chooses M, d=H(M) • Hashing with a random salt • Attacker chooses M, then good guychooses public salt, d=H(salt,M) • Hashing random messages • Given M, d=H(M’) e.g., M’=M||r • Hashing with a secret key • Attacker chooses M, d=H(key,M) Stronger Weaker
Deterministic hashing • Collision Resistance • Attacker cannot find M,M’ such that H(M)=H(M’) • Also many other properties • Hard to find fixed-points, near-collisions, M s.t. H(M) has low Hamming weight, etc.
Hashing with public salt • Target-Collision-Resistance (TCR) • Attacker chooses M, then given random salt, cannot find M’ such that H(salt,M)=H(salt,M’) • enhanced TRC (eTCR) • Attacker chooses M, then given random salt, cannot find M’,salt’ s.t. H(salt,M)=H(salt’,M’)
Hashing random messages • Second Preimage Resistance • Given random M, attacker cannot find M’such that H(M)=H(M’) • One-wayness • Given d=H(M) for random M, attacker cannot find M’ such that H(M’)=d • Extraction* • For random salt, high-entropy M, the digest d=H(salt,M) is close to being uniform * Combinatorial, not cryptographic
Hashing with a secret key • Pseudo-Random Functions • The mapping MH(key,M) for secret keylooks random to an attacker • Universal hashing* • For all M,M’, Prkey[ H(key,M)=H(key,M’) ]<e * Combinatorial, not cryptographic
Verifying a signature Message m H(m) Hash Verify Valid /Not Valid Signature Bob’s Public key Application: DigitalSignatures • Only one party (Bob) knows the private key Generating a signature H(m) Message m Signature (encryptedhash) Hash Sign Bob’s Private key
Application 1:Digital signatures • Hash-then-sign paradigm • First shorten the message, d = H(M) • Then sign the digest, s = SIGN(d) • Relies on collision resistance • If H(M)=H(M’) then s is a signature on both Attacks on MD5, SHA-1 threaten current signatures • MD5 attacks can be used to get bad CA cert[Stevens et al. 2009]
Collision resistance is hard • Attacker works off-line (find M,M’) • Can use state-of-the-art cryptanalysis, as much computation power as it can gather, without being detected !! • Helped by birthday attack (e.g., 280 vs 2160) • Well worth the effort • One collision forgery for any signer
same salt (since salt is explicitly signed) Signatures without CRHF [Naor-Yung 1989, Bellare-Rogaway 1997] • Use randomized hashing • To sign M, first choose fresh random salt • Set d= H(salt, M), s= SIGN( salt || d ) • Attack scenario (collision game): • Attacker chooses M, M’ • Signer chooses random salt • Attacker must find M' s.t. H(salt,M) = H(salt,M') • Attack is inherently on-line • Only rely on target collision resistance
TCR hashing for signatures • Not every randomization works • H(M|salt) may be subject to collision attacks • when H is Merkle-Damgård • Yet this is what PSS does (and it’s provable in the ROM) • Many constructions “in principle” • From any one-way function • Some engineering challenges • Most constructions use long/variable-size randomness, don’t preserve Merkle-Damgård • Also, signing salt means changing the underlying signature schemes
attacker can use different salt’ Signatures with enhanced TCR [H-Krawczyk 2006] • Use “stronger randomized hashing”, eTCR • To sign M, first choose fresh random salt • Set d = H(salt, M), s = SIGN( d ) • Attack scenario (collision game): • Attacker chooses M • Signer chooses random salt • Attacker needs M‘,salt’ s.t. H(salt,M)=H(salt',M') • Attack is still inherently on-line
Authentication with HMAC [Bellare-Canetti-Krawczyk 1996] • Simple key-prepend/append have problems when used with a Merkle-Damgård hash • tag=H(key | M) subject to extension attacks • HMAC: Compute tag = H(key | H(key | M)) • About as fast as key-prepend for a MD hash • Relies only on PRF quality of hash • MH(key|M) looks random when key is secret
Application: File Authentication • Want to detect if a file has been changed by someone after it was stored • Method • Compute a hash H(F) of file F • Store H(F) separately from F • Can tell at any later time if F has been changed by computing H(F’) and comparing to stored H(F) • Why not just store a duplicate copy of F???
Application: User Authentication • Alice wants to authenticate herself to Bob • assuming they already share a secret key K • Protocol: Alice Bob “I’m Alice” picks randomnumber R R computesY=H(R|K) Y verifies thatY=H(R|K) time Dr. Peng Ning
User Authentication… (cont’d) • Why not just send… • …K, in plaintext? • …H(K)? , i.e., what’s the purpose of R?
Application: Commitment Protocols • Ex.: A and B wish to play the game of “odd or even” over the network • A picks a number X • B picks another number Y • A and B “simultaneously” exchange X and Y • A wins if X+Y is odd, otherwise B wins • If A gets Y before deciding X, A can easily cheat (and vice versa for B) • How to prevent this?
Commitment… (Cont’d) • Proposal: A must commit to X before B will send Y • Protocol: • Can either A or B successfully cheat now? A B A picks X and computes Z=H(X) Z = H(X) Picks Y Y X verifies thatH(X) = Z
Commitment… (Cont’d) • Why is sending H(X) better than sending X? • Why is sending H(X) good enough to prevent A from cheating? • Why is it not necessary for B to send H(Y) (instead of Y)? • What problems are there if: • The set of possible values for X is small? • B can predict the next value X that A will pick?
Application: Message Encryption • Assume A and B share a secret key K • but don’t want to just use encryption of the message with K • A sends B the (encrypted) random number R1, B sends A the (encrypted) random number R2 • And then…
Initialization Vector = Concatenate, then Hash C+H R1 | R2 one-time pad one-time pad 64 64 Key Key C+H C+H C+H C+H E E E E M1 M2 M3 M4 M1 M2 M3 M4 64 64 64 46 + padding 64 64 64 46 + padding 64 64 64 64 64 64 64 64 C1 C2 C3 C4 C1 C2 C3 C4 • R1 | R2 is used like the IV of OFB mode, but C+H replaces encryption; as good as encryption?
Application: Message Authentication • A wishes to authenticate (but not encrypt) a message M (and A, B share secret key K) A B • picks random number R • computes Y = H(M|K|R) M, R, Y verifies thatY = H(M|K|R) • Why is R needed? Why is K needed?
Is Encryption a Good Hash Function? • Building hash using block chaining techniques • Encryption block size may be too short (DES=64) • Birthday attack • Can construct a message with a particular hash fairly easily • Extension attacks constant M1 M2 M3 M4 E E E E 64 Hash
Modern Hash Functions • MD5 • Previous versions (i.e., MD2, MD4) have weaknesses. • Broken; collisions published in August 2004 • Too weak to be used for serious applications • SHA (Secure Hash Algorithm) • Weaknesses were found • SHA-1 • Broken, but not yet cracked • Collisions in 269 hash operations, much less than the brute-force attack of 280 operations • Results were circulated in February 2005, and published in CRYPTO ’05 in August 2005 • SHA-256, SHA-384, …
(In)security of MD5 • A few recently discovered methods can find collisions in a few hours • A few collisions were published in 2004 • Can find many collisions for 1024-bit messages • More discoveries afterwards • In 2005, two X.509 certificates with different public keys and the same MD5 hash were constructed • This method is based on differential analysis • 8 hours on a 1.6GHz computer • Much faster than birthday attack
Comparison: SHA-1 vs. MD5 • SHA-1 is a stronger algorithm • brute-force attacks require on the order of 280 operations vs. 264 for MD5 • SHA-1 is about twice as expensive to compute • Both MD-5 and SHA-1 are much faster to compute than DES
Security of SHA-1 • SHA-1 • “Broken”, but not yet cracked • Collisions in 269 hash operations, much less than the brute-force attack of 280 operations • Results were circulated in February 2005, and published in CRYPTO ’05 in August 2005 • SHA-256, SHA-384, SHA-512
The Hashed Message Authentication Code (HMAC) Attila Altay Yavuz
HMAC • Given M1, and secret key K, can easily concatenate and compute the hash:H(K|M1|padding) • Given M1, M2, and H(K|M1|padding) easy to compute H(K|M1|padding|M2|newpadding) for some new message M2 • Simply use H(K|M1|padding) as the IV for computing the hash of M2|newpadding • does not require knowing the value of the secret key K
0x5c5c5c…5c concatenate computemessage digest HMAC(key,message) HMAC Processing Key K 0x363636…36 pad on right with 0’s to 512 bits in length Message M concatenate computemessage digest Dr. Peng Ning
Summary • Hashing is fast to compute • Has many applications (some making use of a secret key) • Hash images must be at least 128 bits long • but longer is better 256 is ideal • Hash function details are tedious • HMAC protects message digests from extension attacks