1 / 28

Universal Hash Families

Universal Hash Families. Universal hash families. Family of hash functions Finite multiset H of string-valued functions, each h ∈ H having the same nonempty domain A ⊆{0,1} * and range B ⊆{0,1} * , for some constant b

armina
Download Presentation

Universal Hash Families

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Universal Hash Families

  2. Universal hash families • Family of hash functions • Finite multiset H of string-valued functions, each h ∈ H having the same nonempty domain A ⊆{0,1}* and range B ⊆{0,1}*, for some constant b • [Definition] -almost universal2 (-AU2) & -almost XOR universal2 (-AXU2) • A family of hash functions H ={h: A{0,1}b} is -almost universal2 , written -AU2, if, for all distinct x, x’∈A, Prh∈ H [h(x)=h(x’)] . • The family of hash functions H is -almost XOR universal2 , written -AXU2, if, for all distinct x, x’∈A, and for all c∈{0,1}b, Prh∈ H [h(x)h(x’)=c] . •  =maxxx’{Prh [h(x)=h(x’)]} : collision probability • Principle measure of an AU2 : How small its collision probability is and how fast one can compute its functions

  3. Composition of universal hash families • Make the domain of a hash family bigger • H ={h: {0,1}a{0,1}b} • Hm ={h: {0,1}am{0,1}bm} • its elements are the same as in H where h(x1 x2… xm), for | xi |=a, is defined by h(x1)|| h(x2)||…|| h(xm) • [Proposition] If H is -AU2, then Hm is -AU2 • Make the collision probability smaller • H1 ={h: A{0,1}b1}, H2 ={h: A{0,1}b2} • H1 & H2 ={h: A{0,1}b1+ b2} • its elements are pairs of functions (h1, h2)∈H1H2 and where (h1, h2)(x) is defined as h1(x)||h2(x) • [Proposition] If H1 is 1-AU2 and H2 is 2-AU2, then H1 & H2 is 12-AU2

  4. Composition of universal hash families(cont.) • Make the image of a hash function shorter • H1 ={h: {0,1}a{0,1}b}, H2 ={h: {0,1}b{0,1}c} • H2○H1 ={h: {0,1}a{0,1}c} • its elements are pairs of functions (h1, h2)∈H1H2 and where (h1, h2)(x) is defined as h2(h1(x)) • [Proposition] If H1 is 1-AU2 and H2 is 2-AU2, then H2○H1 is (1+2)-AU2 • Turn an AU2 family H1 and AXU2 family H2 into an AXU2 family H2○H1 • [Proposition] Suppose H1 ={h: AB} is 1-AU2 and H2 ={h: BC} is 2-AXU2. Then H2○H1 ={h: AC} is (1+2)-AXU2

  5. Related researches • Carter-Wegman(1979, 1981) • Efficient authentication code under strongly universal hash functions • Key observations • Long messages can be authenticated efficiently using short keys if the number of bits in the authentication tag is increased slightly compared to ‘perfect’ schemes • If a message is hashed to a short authentication tag, weaker properties are sufficient for the first stage of the compression • Under certain conditions, the hash function can remain the same for many plaintexts, provided that hash results is encrypted using a one-time pad • Stinson(1994) • Improves the works by Wegman-Carter and establishes an explicit link between authentication codes and strongly universal hash functions • Johansson-Kabatianskii-Smeets(1996) • Establish a relation between authentication codes and codes correcting independent errors

  6. Related researches(cont.) • Krawczyk(1994, 1995) • Propose universal hash functions that are linear with respect to bitwise XOR • Makes it easier to reuse the authentication code • Encrypt the m-bit hash result for each new message using a one-time pad • Simple and efficient constructions based on polynomials and LFSR • Shoup(1996) • Propose and analyze the constructions based on polynomials over finite fields • Rogaway : Bucket hashing (1995) • Halevi-Krawczyk : MMH (1997) • Make optimal used of the multiply and accumulate instruction of the Pentium MMX processor • Black-Halevi-Krawczyk-Krovertz-Rogaway : UMAC (1999) • Further improved the performance on high end processors

  7. Constructions • Bucket hashing • is an -AU introduced by Rogaway • Defining the Bucket Hash Family B • word size w(1), parameters n(1), N(3) • domain D={0,1}wn, range R={0,1}wN • Let h Band let X=X1Xnbe the string we want to hash, where each |Xi|=w. Then h(X) is defined by the following algorithm. First, for each j{1,,N}, initialize Yjto 0w. Then, for each i{1,, n} and k  hi, replace Ykby Yk  Xi. When done, set h(X) = Y1||Y2||||YN. • Pseudocode for j 1 to N do Yj 0w for i  1 to n do Yhi1 Yhi1 Xi Yhi2 Yhi2 Xi Yhi3 Yhi3 Xi return Y1||Y2||||YN

  8. Constructions(cont.) • Bucket Hashing with Small Key Size • N=2s/L • Each hash function hB’[w,M,N] is specified by a list of length M • each entry contains L integers in the interval [0, N-1] • L arrays are introduced, each containing N buckets • Next, each array is compressed to s/L words, using a fixed primitive element GF(2s/L) • The hash result is equal to the concatenation of the L compressed arrays, each containing s/L words

  9. Constructions(cont.) • Hash Family Based on Fast Polynomial Evaluation • is based on polynomial evaluation over a finite field • q = 2r, Q = 2m = 2r+s, n = 1+2s,  : a linear mapping from GF(Q) onto GF(q) • Q = q0m, q = q0r , q0 : a prime power • fa(x) =a0 + a1x +  + an-1xn-1 • x, y, a0, a1, , an-1  GF(Q), z  GF(q) • H = {hx,y,z : hx,y,z(a) = hx,y,z(a0, a1, , an-1) =  (y fa(x)) + z}

  10. Constructions(cont.) • Hash Family Using Toeplitz Matrices • Toeplitz matrices are matrices with constant values on the left-to-right diagonals • A Toeplitz matrix of dimension n  m can be used to hash messages of length m to hash results of length n by vector-matrix multiplication • The Toeplitz construction uses matrices generated by sequences of length n + m - 1 drawn from -biased distributions • -biased distributions are a tool for replacing truly random sequences by more compact and easier to generate sequences • The lower , the more random the sequence is • Krawczyk proves that the family of hash functions associated with a family of Toeplitz-matrices corresponding to sequences selected from a -biased distribution is  -AXU with  = 2-n + 

  11. Constructions(cont.) • Evaluation Hash Function • is one of the variants analyzed by Shoup • The input (of length  tn) : viewed as a polynomial M(x) of degree < t over GF(2n) • The key : a random element   GF(2n) • the hash result : equal to M() GF(2n) • This family of hash functions is -AXU with  = t/2n

  12. Constructions(cont.) • Division Hash Function • represents the input as a polynomial M(x) of degree less than tn over GF(2) • The hash key : a random irreducible polynomial p(x) of degree n over GF(2) • The hash result : m(x) xn mod p(x) • This family of hash functions is -AXU with  = tn/2n • The total number of irreducible polynomials of degree n is roughly equal to 2n/n

  13. Constructions(cont.) • MMH(Multilinear Modular Hashing) hashing • consists of a (modified) inner product between message and key modulo a prime p (close to 2w, with w the word length; below w = 32) • is an -AXU2, but with xor replaced by subtraction modulo p • The core hash function maps 32 32-bit message words and 32 32-bit key words to a 32-bit result • The key size is 1024 bits and  = 1.5/ 230 • For larger messages, a tree construction can be used • the value of  and the key length have to be multiplied by the height of the tree • This algorithm is very fast on the Pentium Pro, which has a multiply and accumulate instruction • On a 32-bit machine, MMH requires only 2 instructions per byte for a 32-bit result

  14. Comparing the Hash Functions

  15. Comparing the Hash Functions(cont.) • Scheme A • the input : divided into 32 blocks of 8 Kbyte • each block is hashed using the same bucket hash function with N = 160 • results in an intermediate string of 20480 bytes • Scheme B • the input : divided into 64 blocks of 4 Kbyte • each block is hashed using the same bucket hash function with short key(s=42, L=6, N=128) • results in an intermediate string of 10752 bytes • Scheme C • the input is divided into 64 blocks of 4 Kbyte • each block is hashed using a 331024 Toeplitz matrix, based on a -biased sequence of length 1056 generated using an 88-bit LFSR • The length of the intermediate string is 8448 bytes

  16. Comparing the Hash Functions(cont.) • Scheme D • the input : hashed twice using the polynomial evaluation hash function with  = 2-15 resulting in a combined value of 2-30 • W = 5 • The performance is slightly key dependent. Therefore an average over a number of keys has been computed. • Scheme E • this is simply the evaluation hash function with t = 32768 • the resulting value of  is too small • However, choosing a smaller value of n that is not a multiple of 32 induces a performance penalty • Scheme F • the input : divided into 2048 blocks of 128 bytes • each block is hashed twice using MMH • the length of the intermediate string is 16384 bytes • It is not possible to obtain a value of closer to 2-32 in an efficient way

  17. Universal Hashing MAC

  18. Message authentication based on Universal hashing • Message authentication based on Universal hashing • Wegman-Carter approach • The parties share a secret key k=(h,P) • P : infinite random string • h : function drawn randomly from a strongly universal2 family of hash functions H • H is strongly universal2 if, for all xx’, the random variable h1(x)||h2(x), for h ∈ H, is uniformly distributed • To authenticate a message x, the sender transmits h(x) xored with the next piece of the pad P • Standard cryptographic technique • use of a pseudorandom function family, F • [Theorem] Assume H is -AXU2, and that F is replaced by the truly random function family R of functions. In this case, if an adversary makes q1 queries to the authentication algorithm S and q2 queries to the verification algorithm V, the probability of forging a MAC is at most q2

  19. Universal hashing MAC • Why Universal hashing MAC? • The speed of a universal hashing MAC depends on the speed of the hashing step and encrypting step • The encryption does not take long • hash function compresses messages => the encrypting message is short • The combinatorial properties of the universal hash function family is mathematically proven • needs no “over-design” or “safe margin” the way a cryptographic primitive would • Universal hashing MAC makes for desirable security properties • can select a cryptographically conservative design for the encrypting step • can pay with only a minor impact on speed • the cryptographic primitive is applied only to the much shorter hashed image of the message • security and efficiency are not conflicting requirements

  20. UMAC • The UMAC algorithm • species how the message, key, and nonce determine an authentication tag • The sender • will need to provide the receiver with the message, nonce, and tag • The receiver • can then compute what “should be” the tag for this particular message and nonce, and see if it matches the received tag • employs a subkey generation process in which the shared (convenient-length) key is mapped into UMAC's internal keys • subkey generation is done just once, at the beginning of a communication session during which the key does not change, and so subkey-generation is usually not performance-critical • UMAC depends on a few different parameters

  21. UMAC(cont.) • An illustrative special case of UMAC • Subkey generation: • Using a PRG, map Key to K = K1K2  K1024 and to A • each Ki : a 32-bit word, |A| = 512 • Hashing the message Msg to HM = NHXKey(Msg): • Let Len be |Msg| mod 4096, encoded as a 2-byte string • Append to Msg the minimum number of 0 bits to make |Msg| divisible by 8 • Let Msg = Msg1 || Msg2 ||  || Msgt where each Msgi is 1024 words except for Msgt, which has between 2 and 1024 words • Let HM = NHK(Msg1) || NHK(Msg2) || || NHK(Msgt) || Len • Computing the authentication tag: • The tag is Tag = HMAC-SHA1A(HM || Nonce)

  22. UMAC(cont.) • Definition of NH • blocksize n  2, wordsize w  1 • domain : A = {0, 1}2w {0, 1}2w  {0, 1}nw • range : B = {0, 1}2w • a random function in NH[n,w] is given by a random nw-bit string K • Uw : {0,  , 2w-1}, U2w : {0,  , 22w -1} • for integers x, ylet (x +wy) denote (x + y) mod 2w • M  A and M = M1  Ml, |M1| =  = |Ml|= w • K = {0, 1}nwandK = K1  Kn, |K1| =  = |Kn|= w

  23. UMAC(cont.) • where miUw is the number that Mi represents (as an unsigned integer) • where kiUw is the number that Ki represents (as an unsigned integer) • the right-hand side of the above equation is understood to name the (unique) 2w-bit string which represents (as an unsigned integer) the U2w-valued integer result

  24. Compression Function of MD4

  25. Compression Function of MD5

  26. Compression Function of RIPEMD-160

  27. Compression Function of SHA-1

  28. Conclusions

More Related