Progress in hashing cryptanalysis • Arjen K. Lenstra (1, 2), joint work with Benne de Weger (2) • (1) Lucent Technologies’ Bell Laboratories • (2) Technische Universiteit Eindhoven

Outline • Brief summary of cryptographic hash functions: • purpose, design criteria, iterative design approach • popular hash functions • Cryptanalysis until August 2004 • Dobbertin, Dean et al. • Recent cryptanalytic developments • random collisions (Wang et al.) • cascading iterative hashes (Joux, etc.) • The possibility of undesirable constructions • Conclusion • what’s next? • how to respond?

Brief summary of cryptographic hash functions • purpose: fixed-size ‘fingerprint’ for message integrity applications • design criteria for $L$-bit hash function $H$: • $H$ must be quickly computable • given $L$-bit $y$, finding $x$ with $H(x) = y$ should take effort $\approx 2^L$ (i.e., brute force): 1st pre-image resistant • given $x$, finding $x' \neq x$ with $H(x) = H(x')$ should also take $\approx 2^L$ (i.e., same brute force): 2nd pre-image resistant • ‘grey area’: finding not-entirely-random collisions should be hard too • finding random $x$, $x'$ with $x \neq x'$ and $H(x) = H(x')$ should take effort $\approx 2^{L/2}$: random collision resistant (can’t achieve better than this due to birthday paradox) • outputs indistinguishable from ‘random’: random oracle
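The birthday bound on random collisions can be observed directly on a deliberately small hash. The sketch below is only an illustration: it assumes SHA-256 truncated to $L$ bits as a stand-in for an $L$-bit hash, and the names `truncated_hash` and `birthday_collision` are made up for this example.

```python
import hashlib

def truncated_hash(x: bytes, L: int) -> int:
    """Top L bits of SHA-256, used here as a stand-in L-bit hash (assumption)."""
    digest = int.from_bytes(hashlib.sha256(x).digest(), "big")
    return digest >> (256 - L)

def birthday_collision(L: int):
    """Hash successive inputs until two of them share an L-bit hash value."""
    seen = {}                       # hash value -> first input that produced it
    i = 0
    while True:
        x = i.to_bytes(8, "big")
        y = truncated_hash(x, L)
        if y in seen:
            return seen[y], x, i + 1   # colliding pair and number of evaluations
        seen[y] = x
        i += 1

if __name__ == "__main__":
    for L in (16, 20, 24):
        x1, x2, evaluations = birthday_collision(L)
        # Compare the work done with the 2^(L/2) birthday estimate.
        print(f"L={L}: {evaluations} evaluations, 2^(L/2) = {2 ** (L // 2)}")
```

The evaluation counts should land within a small factor of $2^{L/2}$, far below the $\approx 2^L$ expected for a pre-image search.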

Iterative design of cryptographic hash functions • Iterative $L$-bit hash function $H$: • Compression function $f$: maps pair (512-bit block, $L$-bit string) to $L$-bit string • Fixed $L$-bit string $h_0$: initialization vector (IV) • Input $x$ written as concatenation of 512-bit blocks: $x_1 \| x_2 \| x_3 \| \dots \| x_m$, where $x_m$ contains padding and $x$’s length (MD-strengthening) • For $i = 1, 2, \dots, m$ in succession, compute $h_i = f(x_i, h_{i-1})$ • Resulting hash $H(x)$ of $x$ equals final $L$-bit string $h_m$ • Nice property: if $f$ is collision resistant, then $H$ is collision resistant • But (2004): the design falls apart as soon as collisions can be found
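The iteration above can be sketched in a few lines. This is a toy under stated assumptions, not any standardized hash: the compression function `f` is SHA-256 truncated to a hypothetical 32-bit state, and the padding follows the usual MD-strengthening layout (a 0x80 byte, zeros, then the 64-bit message length).

```python
import hashlib
import struct

BLOCK_BYTES = 64   # 512-bit blocks, as on the slide
L_BYTES = 4        # toy chaining size (L = 32 bits), chosen only for illustration

def f(block: bytes, h: bytes) -> bytes:
    """Toy compression function: (512-bit block, L-bit string) -> L-bit string."""
    return hashlib.sha256(h + block).digest()[:L_BYTES]

def pad(x: bytes) -> bytes:
    """MD-strengthening: 0x80, zero bytes, then the bit length of x as 64 bits."""
    bit_len = 8 * len(x)
    x += b"\x80"
    x += b"\x00" * ((-len(x) - 8) % BLOCK_BYTES)
    return x + struct.pack(">Q", bit_len)

def H(x: bytes, h0: bytes = b"\x00" * L_BYTES) -> bytes:
    """Iterate h_i = f(x_i, h_{i-1}) over the padded blocks and return h_m."""
    data = pad(x)
    h = h0
    for i in range(0, len(data), BLOCK_BYTES):
        h = f(data[i:i + BLOCK_BYTES], h)
    return h

if __name__ == "__main__":
    print(H(b"abc").hex())
```

Because the final block carries the length, two inputs of different lengths cannot collide merely by sharing padded blocks, which is part of what the collision-resistance argument relies on.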

Popular cryptographic hash functions • Most eggs in the Message Digest basket: • MD4, L = 128 • tweaked version: MD5, L = 128 • length extension: SHA-0, L = 160 • surprise tweak: SHA-1, L = 160 • more tweaks, more length extensions: • SHA-224/256/384/512, L = 224/256/384/512 • all in the same family, all iterative

Hashing cryptanalysis until August 2004 • MD4 considered broken: Den Boer, Bosselaers, and Dobbertin, 1996, ‘meaningful’ collisions • MD5 considered weak: Dobbertin, 1996, collisions in the MD5 compression function • Iterated hash functions for which compression function fixed points can be found (i.e., all hashes in the SHA family): Drew Dean et al. (1999) found 2nd preimage weakness (hidden in Dean’s thesis, never published) • MD5 and up: security of practical applications not seriously questioned • Strong belief in effectiveness of tweaks

Recent cryptanalytic developments in hashing • August 2004: • X. Wang et al.: actual random collisions in MD4 (‘no time’), MD5 in time $\approx 2^{39}$, etc., for any IV • A. Joux: cascading of iterated $L$-bit and perfect $M$-bit hash does not result in $(L+M)$-bit hash, contrary to what was commonly believed • Last result particularly worrisome because of its simplicity

Intermezzo: Joux’s result • cascading of iterated $L$-bit and perfect $M$-bit hash does not result in $(L+M)$-bit hash: collisions much faster than in time $2^{(L+M)/2}$ • find $y_{1,1}, y_{1,2}$ with $f(y_{1,1}, h_0) = f(y_{1,2}, h_0) = h_1$ in time $\approx 2^{L/2}$ • find $y_{2,1}, y_{2,2}$ with $f(y_{2,1}, h_1) = f(y_{2,2}, h_1) = h_2$ in time $\approx 2^{L/2}$ • … • find $y_{K,1}, y_{K,2}$ with $f(y_{K,1}, h_{K-1}) = f(y_{K,2}, h_{K-1}) = h_K$ in time $\approx 2^{L/2}$ • Then: $y_{1,u} \| y_{2,v} \| \dots \| y_{K,w}$ for all $u, v, \dots, w \in \{1, 2\}$ all collide • $\Rightarrow$ $2^K$-fold collision in time $K \cdot 2^{L/2}$ • With $K = M/2$: for any $M$-bit hash function, by the birthday paradox there will likely be a pair among the $2^{M/2}$-fold collision that collides for that $M$-bit hash as well • $\Rightarrow$ simultaneous collision for $L$-bit hash and $M$-bit hash in time $(M/2) \cdot 2^{L/2} + 2^{M/2}$, for iterated $L$-bit hash and any $M$-bit hash
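The multicollision step can be reproduced on a toy scale. The sketch below assumes a 16-bit compression function built from truncated SHA-256 and short 9-byte blocks, both chosen only to keep the search fast; `find_block_collision`, `multicollision` and `iterate` are illustrative names. Padding is left out, which does no harm here because all $2^K$ messages have the same length and would receive the same final block.

```python
import hashlib
from itertools import product

L_BYTES = 2   # toy chaining size (L = 16 bits), an assumption for speed

def f(block: bytes, h: bytes) -> bytes:
    """Toy compression function standing in for the iterated hash's f."""
    return hashlib.sha256(h + block).digest()[:L_BYTES]

def find_block_collision(h: bytes, tag: bytes):
    """Birthday search for two distinct blocks that collide under f from h."""
    seen = {}
    counter = 0
    while True:
        block = tag + counter.to_bytes(8, "big")
        out = f(block, h)
        if out in seen:
            return seen[out], block, out
        seen[out] = block
        counter += 1

def multicollision(h0: bytes, K: int):
    """K successive block collisions; returns the K pairs and the final h_K."""
    pairs, h = [], h0
    for i in range(K):
        y1, y2, h = find_block_collision(h, bytes([i]))
        pairs.append((y1, y2))
    return pairs, h

def iterate(blocks, h0: bytes) -> bytes:
    """Run the chaining h_i = f(block_i, h_{i-1}) over a list of blocks."""
    h = h0
    for block in blocks:
        h = f(block, h)
    return h

if __name__ == "__main__":
    h0 = b"\x00" * L_BYTES
    K = 4
    pairs, hK = multicollision(h0, K)
    messages = list(product(*pairs))          # 2^K choices of one block per pair
    assert all(iterate(m, h0) == hK for m in messages)
    print(len(messages), "messages share chaining value", hK.hex())
```

The cost is $K$ birthday searches of about $2^{L/2}$ each, yet the number of colliding messages doubles with every extra pair.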

Recent cryptanalytic developments in hashing • August 2004: • X. Wang et al.: actual random collisions in MD4 (‘no time’), MD5 in time $\approx 2^{39}$, etc., for any IV • A. Joux: cascading of iterated $L$-bit and perfect $M$-bit hash does not result in $(L+M)$-bit hash, contrary to what was commonly believed • A. Joux: actual random collision for SHA-0 in time $\approx 2^{51}$ • E. Biham: cryptanalysis of SHA-1 variants • October 2004, Kelsey/Schneier (based on Joux): 2nd preimage weakness in any iterated hash (improving Dean) • February 7, 2005, NIST announcement: • recent developments have no effect on SHA-1 • phase out SHA-1 by 2010, purely based on ‘Moore’s law’ • February 14, 2005, X. Wang et al. (based on Wang/Joux/Biham): • actual random collision for SHA-0 in time $\approx 2^{39}$ • random collision possibility for SHA-1 in time $\approx 2^{69}$ (or $2^{66}$) • ($2^{69} < 2^{80}$; no reaction or retraction from NIST yet)

Long term prospects of popular cryptographic hash functions • Most eggs in the Message Digest basket: • MD4, L = 128: out, no news • tweaked version: MD5, L = 128: out, no news • length extension: SHA-0, L = 160: (never in) • surprise tweak: SHA-1, L = 160: out, an unpleasant surprise • more tweaks, more length extensions: SHA-224/256/384/512, L = 224/256/384/512: ?? • all in the same family, all iterative • Even given the ‘substantial changes’ compared to SHA-1, to what extent can we still trust SHA-224/256/384/512?

How do the Wang et al. collision attacks work? • Interestingly: • people are still trying to figure it out • V. Klima succeeded: improved the MD5 attack to $\approx 2^{33}$ • Very roughly speaking: • differential paths of compression function $f$ are analysed • find $M_0$ and low Hamming weight $\delta_{1,\mathrm{in}}$ and $\delta_{\mathrm{out}}$ such that $f(M_0 + \delta_{1,\mathrm{in}}, h_0) = f(M_0, h_0) + \delta_{\mathrm{out}} = h_1 + \delta_{\mathrm{out}}$ • MD4 is so weak that $\delta_{\mathrm{out}} = 0$ for low Hamming weight $\delta_{1,\mathrm{in}} \neq 0$ • find $M_1$ and low Hamming weight $\delta_{2,\mathrm{in}}$ such that $f(M_1, h_1) = f(M_1 + \delta_{2,\mathrm{in}}, h_1 + \delta_{\mathrm{out}})$ • as a result $M_0 \| M_1$ and $(M_0 + \delta_{1,\mathrm{in}}) \| (M_1 + \delta_{2,\mathrm{in}})$ collide • for SHA-0, $\delta_{1,\mathrm{in}} = 0$ (and thus $\delta_{\mathrm{out}} = 0$) to get ‘convenient’ $h_1$ (later version omits $M_0$ altogether: single block SHA-0 collision)

The possibility of undesirable constructions • Often repeated argument: • random collisions are not good for anything • all collisions so far are ‘random’, so we’re fine • Despite this argument: • published random collisions used for actual attack examples involving integrity checks for downloadable files (Tripwire etc.) • random collisions combined with iterative structure suffice for interesting X.509 constructions • so far no truly ‘disastrous’ applications (imo)

X.509 certificate • X.509 allows following format of data that will be hashed and signed: p1 || m || p2 • where: • p1 contains header, distinguished names, and header of public key part; may assume that p1 consists of whole number of blocks • m is an RSA modulus • p2 contains public exponent and all other data until signature part • For collision purposes, the obvious place to ‘hide’ random data would be the RSA modulus

X.509 collision construction ingredients • 1. Identical stuff of one’s choice can be prepended to new collision: if collisions can be found for any IV, then collisions can be concocted such that they have same prescribed initial blocks • 2. Random collision can be promoted to meaningful data: proper (and identical) data appended to random data pairs turns random pair plus appendix into pair of valid RSA moduli • 3. Identical stuff of one’s choice can be appended to any collision: arbitrarily selected data can be appended to colliding messages of same length, and they will still collide • 1 & 3: due to iterative nature of hashes • 2: a new trick for RSA moduli construction

X.509 construction details • Construct colliding p1|| m || p2 and p1|| m’ || p2 as follows: • Prepend: • pick properly formatted p1 with names etc., whole # blocks • compute p1’s intermediate hash value h • ask X. Wang to find random collision m1, m2 with h as IV • p1||m1 and p1||m2 now collide as well • Promote: • find m3 s.t. m1||m3 = m and m2||m3 = m’ are RSA moduli • random m1, m2 extended to meaningful m1||m3 and m2||m3 • Append: • p1||m1||m3 = p1|| m and p1||m2||m3 = p1|| m’ still collide • and so do p1|| m ||p2 and p1|| m’ ||p2 for any p2
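The prepend and append steps rely only on the iterative structure, which a toy model makes visible. The following sketch assumes a 16-bit compression function (truncated SHA-256) and 8-byte blocks so that a brute-force birthday search can play the role of the collision finder; the byte strings standing in for p1 and p2 are placeholders, and the promote step (turning the colliding blocks into RSA moduli) is not modelled.

```python
import hashlib

L_BYTES = 2   # toy chaining size (L = 16 bits); real hashes use far more
BLOCK = 8     # toy block size in bytes (MD-style hashes use 64)
IV = b"\x00" * L_BYTES

def f(block: bytes, h: bytes) -> bytes:
    """Toy compression function."""
    return hashlib.sha256(h + block).digest()[:L_BYTES]

def chain(data: bytes, h: bytes = IV) -> bytes:
    """Chaining value after processing data, a whole number of blocks."""
    assert len(data) % BLOCK == 0
    for i in range(0, len(data), BLOCK):
        h = f(data[i:i + BLOCK], h)
    return h

def find_collision(iv: bytes):
    """Brute-force stand-in for a collision finder that accepts any IV."""
    seen = {}
    counter = 0
    while True:
        m = counter.to_bytes(BLOCK, "big")
        out = f(m, iv)
        if out in seen:
            return seen[out], m
        seen[out] = m
        counter += 1

if __name__ == "__main__":
    p1 = b"hdr+name" + b"+keyhdr:"      # placeholder prefix, two whole blocks
    p2 = b"e=65537;"                    # placeholder suffix, one block
    h = chain(p1)                       # intermediate hash value after p1
    m1, m2 = find_collision(h)          # collision with h as IV
    assert m1 != m2
    assert chain(p1 + m1) == chain(p1 + m2)              # prepend
    assert chain(p1 + m1 + p2) == chain(p1 + m2 + p2)    # append
    print("colliding pair:", m1.hex(), m2.hex())
```

Since both messages have the same length, the final padding block of a real MD-style hash would be identical for both and the collision would survive it.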

Applications of X.509 colliding certificates? • Can get one certificate for the price of two • Sign using one certificate, later deny based on other certificate • Keys involved must have been generated simultaneously, so detection of this fraud attempt is easy • No other attack scenarios that are facilitated by these types of collisions (see www.win.tue.nl/~bdeweger/CollidingCertificates)

Outsider’s X.509 collision assessment • CA can no longer establish proof of possession of private key • Does not seem to lead to dangerous attack scenarios (we refer to it as a ‘construction’, not an ‘attack’) • Insider’s point of view may be different • Greater danger if m and m’ could be made to contain different ‘subject distinguished name’ information • This may be possible if ‘grey area’ is exploited • Stay tuned: it seems more is possible than we thought • Problems can be avoided by making sure that no one can predict any prefix of the hashed & signed part before hashing and signing take place (this is not part of the X.509 specs)

RSA moduli construction • The problem: for any m1 and m2 of same length N, find m3 such that m1||m3 and m2||m3 are secure RSA moduli • The solution (for ‘any’ M > 0): • repeatedly pick two M/2-bit primes p and q • use Chinese remaindering to find M-bit m3 such that p divides m1||m3 and q divides m2||m3 • until (m1||m3)/p and (m2||m3)/q are both prime • N = 1024, M = 1024: 2048-bit moduli, 512-bit smallest factors: secure • N = 512, M = 512: 1024-bit moduli, 256-bit smallest factors: not secure
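The loop above is short enough to sketch directly. The code below is a toy-sized illustration, not the authors’ implementation: it reads m1||m3 as the integer m1·2^M + m3, uses a Miller-Rabin test written for this example, and runs with N = M = 32 so that it finishes quickly; such sizes are of course far from secure. All function names are made up, and `pow(p, -1, q)` requires Python 3.8 or later.

```python
import random

def is_probable_prime(n: int, rounds: int = 24) -> bool:
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for sp in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % sp == 0:
            return n == sp
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def random_prime(bits: int) -> int:
    """Random prime with exactly `bits` bits."""
    while True:
        c = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(c):
            return c

def colliding_moduli_suffix(m1: int, m2: int, M: int):
    """Find m3 < 2^M with p | m1||m3, q | m2||m3 and both cofactors prime."""
    shift = 1 << M
    while True:
        p, q = random_prime(M // 2), random_prime(M // 2)
        if p == q:
            continue
        # Chinese remaindering: m3 = -m1*2^M (mod p) and m3 = -m2*2^M (mod q).
        a = (-m1 * shift) % p
        b = (-m2 * shift) % q
        m3 = a + p * (((b - a) * pow(p, -1, q)) % q)   # 0 <= m3 < p*q < 2^M
        n1, n2 = m1 * shift + m3, m2 * shift + m3
        if is_probable_prime(n1 // p) and is_probable_prime(n2 // q):
            return m3, p, q

if __name__ == "__main__":
    N = M = 32
    m1 = random.getrandbits(N) | (1 << (N - 1))   # stand-ins for a colliding pair
    m2 = random.getrandbits(N) | (1 << (N - 1))
    m3, p, q = colliding_moduli_suffix(m1, m2, M)
    print(hex((m1 << M) + m3), "=", p, "*", ((m1 << M) + m3) // p)
    print(hex((m2 << M) + m3), "=", q, "*", ((m2 << M) + m3) // q)
```

Each attempt succeeds only when both cofactors happen to be prime, so the expected number of attempts grows with the square of the cofactor bit length.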

Other strange RSA moduli • Variants of our new RSA moduli construction allow: • ‘Twin RSA’: RSA moduli (n, n+2), with factors of same size • ‘predetermined Twin RSA’: RSA moduli (n, n+2) with • fixed leading half bits, and factors of different sizes

Predetermined Twin RSA, 2048-bit example • n = 80000000 00000000 00000000 00000000 • 00000000 00000000 00000000 00000000 • 00000000 00000000 00000000 00000000 • 00000000 00000000 00000000 00000000 • 00000000 00000000 00000000 00000000 • 00000000 00000000 00000000 00000000 • 00000000 00000000 00000000 00000000 • 00000000 00000000 00000000 00000000 • 396099A3 5F9D2B49 E7BB729E 9542A7B0 • A1FAD34B EE884199 E29A5DB4 E49DE1C8 • 279682F4 2A92FBFF 4F0F891F 65638997 • B28D26DA 10B7529A 40CFA534 8BB95BE8 • ADF4A21B 7DC562D4 93590D53 6B6124C5 • 6DB5D693 1004A7B4 C031C401 A4B6E1E8 • EA5C8362 E7B2DB3F BFDEF87D 75311FEA • 7D9BF1C3 9E3E64DF 9163E468 6D5D2711 • and n+2 are secure RSA moduli, with independent factorizations • $\Rightarrow$ two 2048-bit RSA moduli for just 1024 bits

Other strange RSA moduli • Variants of our new RSA moduli construction allow: • ‘Twin RSA’: RSA moduli (n, n+2), with factors of same size • ‘predetermined Twin RSA’: RSA moduli (n, n+2) with • fixed leading half bits, and factors of different sizes • These moduli look highly suspicious • Questions: • Can anyone break them? • Can anyone break n given factorization of n+2 (or vice versa)? • Good for what? (One to sign, other to encrypt? Backup key?)

Conclusion • What’s next? • Continued improvements of random collision attacks very likely • Exploitation of ‘grey area’ looks promising • More ‘interesting’ constructions may emerge • How to respond? • short term: case by case risk analysis, in most cases no need for changes, migrate in high risk cases only (to what? SHA-256?) • long term: • hard to defend long term application of SHA family • also hard to defend application of SHA-1 until 2010 • current iterated approach has undesirable properties • what about AES-like competition for AHS, ‘Advanced Hash Standard’?
