This presentation covers hash algorithms, collision probabilities, and why properly designed systems matter in cryptography. The immense number of possible hashes makes random guessing practically impossible. It looks at the MD5 and SHA functions, their vulnerabilities, and the need for stronger standards, and stresses the importance of migrating to more secure hash functions for data protection.
If the hash algorithm is properly designed and distributes the hashes uniformly over the output space, finding a hash collision by random guessing is exceedingly unlikely (it's more likely that a million people will correctly guess all the California Lottery numbers every day for a billion trillion years).
• This astonishing fact is due to the enormous number of possible hashes: a 128-bit hash can take 2^128, or about 3.4 x 10^38, possible values, which is 340,282,366,920,938,463,463,374,607,431,768,211,456 possible hashes.
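Guessing rate: 1 giga (10^9 ≈ 2^30) hashes per second.
Trying all 2^128 values of a 128-bit hash takes 2^128 / 2^30 = 2^98 seconds.
With 1 year ≈ 2^25 seconds, that is 2^73 years, or roughly 10^22 years (10,000,000,000,000,000,000,000 years).
For comparison, the number of atoms in the universe is estimated at 10^78 to just under 10^81, i.e. about 2^259 to 2^269.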
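The arithmetic above can be checked with exact integers. This is a minimal sketch: the rate of 10^9 guesses per second comes from the slide, while the function name `years_to_enumerate` and the 365.25-day year are assumptions made for illustration.

```python
# Assumption: a Julian year of 365.25 days, which is close to 2^25 seconds.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_enumerate(bits: int, guesses_per_second: float = 1e9) -> float:
    """Years needed to try every value of a `bits`-bit hash at the given rate."""
    return (2 ** bits) / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Exact size of the 128-bit output space.
    print(f"2^128 = {2 ** 128:,}")
    # Time to walk through the whole space at one billion guesses per second.
    print(f"years to enumerate: {years_to_enumerate(128):.2e}")
```

Using exact values rather than the powers-of-two shortcuts gives a figure slightly above 10^22 years, in line with the rough estimate.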
MD5 Hashing
$ cat smallfile
This is a very small file with a few characters
$ cat bigfile
This is a larger file that contains more characters.
This demonstrates that no matter how big the input
stream is, the generated hash is the same size (but
of course, not the same value). If two files have
a different hash, they surely contain different data.
$ ls -l empty-file smallfile bigfile linux-kernel
-rw-rw-r-- 1 steve steve       0 2004-08-20 08:58 empty-file
-rw-rw-r-- 1 steve steve      48 2004-08-20 08:48 smallfile
-rw-rw-r-- 1 steve steve     260 2004-08-20 08:48 bigfile
-rw-r--r-- 1 root  root  1122363 2003-02-27 07:12 linux-kernel
$ md5sum empty-file smallfile bigfile linux-kernel
d41d8cd98f00b204e9800998ecf8427e  empty-file
75cdbfeb70a06d42210938da88c42991  smallfile
6e0b7a1676ec0279139b3f39bd65e41a  bigfile
c74c812e4d2839fa9acf0aa0c915e022  linux-kernel
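The same fixed-size behaviour can be reproduced with Python's standard `hashlib` module. This is a sketch only: the in-memory byte strings are stand-ins for the files in the session above (the "large" sample in particular is made up), and `md5_hex` is a helper name chosen for illustration.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


# Inputs of very different sizes (illustrative stand-ins, not the real files).
samples = {
    "empty": b"",
    "small": b"This is a very small file with a few characters\n",
    "large": b"x" * 1_000_000,
}

for name, data in samples.items():
    # Every digest is 128 bits (32 hex characters), whatever the input size.
    print(f"{len(data):>8} bytes  {md5_hex(data)}  {name}")
```

The digest of the empty input should match the `empty-file` line in the session, since MD5 of zero bytes is a well-known constant.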
Avalanche Effect
$ cat file1
This is a very small file with a few characters
$ cat file2
this is a very small file with a few characters
$ md5sum file?
75cdbfeb70a06d42210938da88c42991  file1
6fbe37f1eea0f802bd792ea885cd03e2  file2
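The avalanche effect can be quantified by counting how many of the 128 output bits change. The sketch below assumes the two files contain exactly the lines shown plus a trailing newline; `differing_bits` is a hypothetical helper name. Note that "T" (0x54) and "t" (0x74) differ in a single bit, so the two inputs are one bit apart.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which the MD5 digests of `a` and `b` differ."""
    da = int.from_bytes(hashlib.md5(a).digest(), "big")
    db = int.from_bytes(hashlib.md5(b).digest(), "big")
    # XOR leaves a 1 wherever the digests disagree; count those ones.
    return bin(da ^ db).count("1")


upper = b"This is a very small file with a few characters\n"
lower = b"this is a very small file with a few characters\n"

print(differing_bits(upper, lower), "of 128 bits differ")
```

For a well-mixing hash, a one-bit input change is expected to flip roughly half of the output bits.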
Merkle-Damgård Compression
e.g. MD5 processes the message in 512-bit blocks. Each block goes through one call of the compression function, which runs four rounds and updates a 128-bit chaining state.
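The block-by-block chaining can be sketched as follows. The padding follows the MD5 rule (a 0x80 byte, zero bytes, then the message length in bits as a 64-bit little-endian integer). The compression function here, `toy_compress`, is a deliberately trivial placeholder invented for illustration; it is not MD5's compression function and offers no security.

```python
import struct

BLOCK_BYTES = 64  # MD5 block size: 512 bits


def md5_pad(message: bytes) -> bytes:
    """MD5-style padding: 0x80, zero bytes, then the 64-bit little-endian bit length."""
    bit_length = (len(message) * 8) % (1 << 64)
    padded = message + b"\x80"
    # Zero-fill so that exactly 8 bytes remain in the final block for the length.
    padded += b"\x00" * ((56 - len(padded)) % BLOCK_BYTES)
    return padded + struct.pack("<Q", bit_length)


def blocks(padded: bytes):
    """Yield successive 512-bit blocks of the padded message."""
    for offset in range(0, len(padded), BLOCK_BYTES):
        yield padded[offset:offset + BLOCK_BYTES]


def toy_compress(state: int, block: bytes) -> int:
    """Toy stand-in for a compression function (NOT MD5's): fold a block into a 128-bit state."""
    return (state * 31 + int.from_bytes(block, "little")) % (1 << 128)


def merkle_damgard(message: bytes, compress, initial_state):
    """Chain `compress` over the padded blocks, feeding each output into the next call."""
    state = initial_state
    for block in blocks(md5_pad(message)):
        state = compress(state, block)
    return state


print(hex(merkle_damgard(b"abc", toy_compress, 0)))
```

Because the state has a fixed width and each call consumes one fixed-size block, the output size does not depend on the message length.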
One MD5 operation. MD5 consists of 64 of these operations, grouped in four rounds of 16 operations. A, B, C and D are 32-bit words of the state. F is a nonlinear function; one function is used in each round. M_i denotes a 32-bit block of the message input, and K_i denotes a 32-bit constant, different for each operation. <<< s denotes a left bit rotation by s places; s varies for each operation. + denotes addition modulo 2^32.
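A single operation of this kind can be written in a few lines. This is a sketch of one step only, using the round-1 function F(B, C, D) = (B AND C) OR (NOT B AND D) as defined in RFC 1321; the names `rotl32`, `f_round1` and `md5_step` are chosen here for illustration, and the message schedule and per-step constants are left to the caller.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits


def rotl32(x: int, s: int) -> int:
    """Rotate the 32-bit value `x` left by `s` places."""
    return ((x << s) | (x >> (32 - s))) & MASK32


def f_round1(b: int, c: int, d: int) -> int:
    """Round-1 nonlinear function: bitwise 'if b then c else d'."""
    return (b & c) | (~b & d & MASK32)


def md5_step(state, m_i: int, k_i: int, s: int, f=f_round1):
    """Apply one MD5 operation to the state (A, B, C, D) and return the new state."""
    a, b, c, d = state
    total = (a + f(b, c, d) + m_i + k_i) & MASK32   # additions modulo 2^32
    new_b = (b + rotl32(total, s)) & MASK32          # rotate, then add B
    return (d, new_b, b, c)                          # registers shift one place


# Example call with MD5's initial state, its first constant and first shift
# amount, and an all-zero message word (the message word is an assumption).
state0 = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
print([hex(w) for w in md5_step(state0, m_i=0, k_i=0xD76AA478, s=7)])
```

The other three rounds would pass a different `f` and different shift amounts; that part is omitted here.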
SHA-1
One iteration within the SHA-1 compression function. A, B, C, D and E are 32-bit words of the state; F is a nonlinear function that varies; <<< n denotes a left circular shift by n places; K_t is a constant.
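One SHA-1 iteration can be sketched in the same style. The function below follows the standard SHA-1 round structure; `w_t` stands for the message word fed into the iteration, which the caption above does not name, and `ch` is the nonlinear function used in the first 20 iterations. The helper names are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits


def rotl32(x: int, n: int) -> int:
    """Left circular shift of the 32-bit value `x` by `n` places."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def ch(b: int, c: int, d: int) -> int:
    """Nonlinear function for iterations 0-19: bitwise 'if b then c else d'."""
    return (b & c) | (~b & d & MASK32)


def sha1_step(state, w_t: int, k_t: int, f=ch):
    """Apply one SHA-1 iteration to the state (A, B, C, D, E) and return the new state."""
    a, b, c, d, e = state
    temp = (rotl32(a, 5) + f(b, c, d) + e + k_t + w_t) & MASK32
    # A moves to B, B is rotated by 30 into C, C moves to D, D moves to E.
    return (temp, a, rotl32(b, 30), c, d)


# Example call with the constant for iterations 0-19 and an all-zero message
# word (the state and message word are arbitrary illustrative values).
print([hex(w) for w in sha1_step((1, 2, 3, 4, 5), w_t=0, k_t=0x5A827999)])
```

Compared with the MD5 operation, the state has five words instead of four and two fixed rotations (5 and 30) replace the per-step shift amount.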
Some very bright researchers in China presented a paper in August 2004, and it's shaken up the security world considerably. This was some outstanding cryptography research.
[Figure: one MD5 hash collision]
Opinion: Cryptanalysis of MD5 and SHA: Time for a new standard – BRUCE SCHNEIER
• But there's an old saying inside the NSA: "Attacks always get better; they never get worse."
• It's time for us all to migrate away from SHA-1.
• Luckily, there are alternatives. The National Institute of Standards and Technology (NIST) already has standards for longer -- and harder-to-break -- hash functions: SHA-224, SHA-256, SHA-384 and SHA-512. They're already government standards and can already be used. This is a good stopgap, but I'd like to see more