
Cryptography and Privacy Preserving Operations Lecture 2: Pseudo-randomness

This lecture delves into the key ideas of cryptography: one-way functions, computational intractability, and pseudo-randomness. It covers the notion of reduction between cryptographic primitives, the amplification of weak one-way functions, the encryption problem, pseudo-random generators, and the concept of computational indistinguishability, and concludes with hardcore predicates, the Goldreich-Levin theorem, and pseudo-random functions.


Presentation Transcript


  1. Cryptography and Privacy Preserving Operations, Lecture 2: Pseudo-randomness. Lecturer: Moni Naor, Weizmann Institute of Science

  2. Recap of Lecture 1 • Key idea of cryptography: use computational intractability to your advantage • One-way functions are necessary and sufficient to solve the two-guard identification problem • Notion of reduction between cryptographic primitives • Amplification of weak one-way functions • Things are a bit more complex in the computational world (than in the information-theoretic one) • Encryption: easy when you share very long strings • Started with the notion of pseudo-randomness

  3. Is there an ultimate one-way function? • If f1:{0,1}* → {0,1}* and f2:{0,1}* → {0,1}* are guaranteed to • be polynomial time computable, and • at least one of them is one-way, then we can construct a function g:{0,1}* → {0,1}* which is one-way: g(x1,x2) = (f1(x1), f2(x2)) (a sketch follows below) • If a 5n² time one-way function is guaranteed to exist, we can construct an O(n² log n) time one-way function g: • Idea: enumerate Turing machines and make sure each runs at most 5n² steps: g(x1,x2,…,x_log n) = M1(x1), M2(x2), …, M_log n(x_log n) • If a one-way function is guaranteed to exist, then there exists a 5n² time one-way function • Idea: concentrate on a prefix of the input (a 1/p(n) fraction of it)
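
A minimal sketch of the combiner, assuming two candidate functions; the hashes below are only stand-ins for f1 and f2, and nothing here certifies one-wayness:

```python
import hashlib

def f1(x: bytes) -> bytes:
    # First candidate one-way function (hash used as a stand-in).
    return hashlib.sha256(x).digest()

def f2(x: bytes) -> bytes:
    # Second candidate one-way function (stand-in).
    return hashlib.blake2b(x).digest()

def g(x: bytes) -> tuple[bytes, bytes]:
    # Split the input in half and apply both candidates; inverting g
    # requires inverting both f1 and f2 on independent inputs, so g is
    # one-way as long as at least one of the candidates is.
    half = len(x) // 2
    return f1(x[:half]), f2(x[half:])
```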

  4. Conclusions • Be careful what you wish for • Problems with the resulting one-way function: • Cannot learn about its behavior on large inputs from small inputs • The whole rationale of considering asymptotic results is eroded • The construction does not work for non-uniform one-way functions

  5. The encryption problem • Alice wants to send a message m ∈ {0,1}^n to Bob • The set-up phase is secret • They want to prevent Eve from learning anything about the message m [Diagram: Alice sends to Bob over a channel observed by Eve]

  6. The encryption problem • Relevant both in the shared-key and in the public-key setting • Want to use the key many times • Also want to add authentication… • …and handle other disruptions by Eve

  7. What does `learn’ mean? • Whatever knowledge Eve has about m should remain the same: • Probability of guessing m • Min-entropy of m • Probability of guessing whether m is m0 or m1 • Probability of computing some function f of m • Ideally: the message sent is independent of the message m • This implies all of the above • Shannon: achievable only if the entropy of the shared secret is at least as large as the entropy of the message m • If there is no special knowledge about m, then that entropy is |m| • Achievable: the one-time pad. • Let r ∈R {0,1}^n • Think of r and m as elements in a group • To encrypt m send r+m • To decrypt z compute m = z−r (a sketch follows below)
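
A minimal sketch of the one-time pad over the group ({0,1}^n, ⊕), where addition and subtraction coincide:

```python
import secrets

def otp_encrypt(message: bytes, pad: bytes) -> bytes:
    # XOR the message with the pad; over ({0,1}^n, XOR), r + m and
    # z - r are the same operation, so this also decrypts.
    assert len(pad) == len(message)
    return bytes(m ^ r for m, r in zip(message, pad))

pad = secrets.token_bytes(5)                      # r, chosen uniformly
ciphertext = otp_encrypt(b"hello", pad)           # z = r + m
assert otp_encrypt(ciphertext, pad) == b"hello"   # m = z - r
```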

  8. Pseudo-random generators • Would like to stretch a short secret (seed) into a long one • The resulting long string should be usable in any case where a long string is needed • In particular: as a one-time pad • Important notion: indistinguishability - two probability distributions that cannot be distinguished • Statistical indistinguishability: distance between probability distributions • New notion: computational indistinguishability

  9. Computational indistinguishability Definition: two sequences of distributions {Dn} and {D′n} on {0,1}^n are computationally indistinguishable if for every probabilistic polynomial time adversary A that receives input y ∈ {0,1}^n and tries to decide whether y was generated by Dn or D′n, for every polynomial p(n) and sufficiently large n: |Prob[A=‘0’ | Dn] − Prob[A=‘0’ | D′n]| < 1/p(n) Without the restriction to probabilistic polynomial time tests this is equivalent to the variation distance being negligible: ∑_{β ∈ {0,1}^n} |Prob[Dn = β] − Prob[D′n = β]| < 1/p(n)

  10. Pseudo-random generators Definition: a function g:{0,1}* → {0,1}* is said to be a (cryptographic) pseudo-random generator if • It is polynomial time computable • It stretches the input: |g(x)| > |x| • denote by ℓ(n) the length of the output on inputs of length n • If the input (seed) is random, then the output is indistinguishable from random: for any probabilistic polynomial time adversary A that receives input y of length ℓ(n) and tries to decide whether y = g(x) for a random seed x or y is a random string from {0,1}^ℓ(n), for any polynomial p(n) and sufficiently large n |Prob[A=`rand’ | y=g(x)] − Prob[A=`rand’ | y ∈R {0,1}^ℓ(n)]| < 1/p(n) Want to use the output of a pseudo-random generator whenever long random strings are used, especially for encryption (we have not defined the desired properties yet). “Anyone who considers arithmetical methods of producing random numbers is, of course, in a state of sin.” J. von Neumann
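
A sketch of the interface only, with SHA-256 in counter mode as a heuristic stand-in (not a proven PRG; provable constructions follow in the next slides):

```python
import hashlib

def prg(seed: bytes, out_len: int) -> bytes:
    # Heuristic stand-in for a PRG: SHA-256 in counter mode. Illustrates
    # only the interface: a short seed stretched to out_len bytes.
    out = b""
    counter = 0
    while len(out) < out_len:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:out_len]

# Use the stretched output in place of a one-time pad:
pad = prg(b"short seed", 5)
ciphertext = bytes(m ^ p for m, p in zip(b"hello", pad))
```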

  11. Important issues • Why is the adversary bounded by polynomial time? • Why is the indistinguishability not perfect?

  12. Construction of pseudo-random generators • Idea: given a one-way function, there is a hard decision problem hidden in it • If balanced enough: it looks random • Such a problem is a hardcore predicate • Possibilities: • Last bit • First bit • Inner product

  13. Hardcore predicate Definition: let f:{0,1}* → {0,1}* be a function. We say that h:{0,1}* → {0,1} is a hardcore predicate for f if • It is polynomial time computable • For any probabilistic polynomial time adversary A that receives input y = f(x) and tries to compute h(x), for any polynomial p(n) and sufficiently large n |Prob[A(y)=h(x)] − 1/2| < 1/p(n) where the probability is over the choice of y and the random coins of A • Sources of hardcoreness: • not enough information about x • not of interest for generating pseudo-randomness • enough information about x, but hard to compute it

  14. Exercises Assume one-way functions exist • Show that the last bit/first bit are not necessarily hardcore predicates • Generalization: show that for any fixed function h:{0,1}* → {0,1} there is a one-way function f:{0,1}* → {0,1}* such that h is not a hardcore predicate of f • Show a one-way function f such that given y=f(x) each input bit of x can be guessed with probability at least 3/4

  15. Single-bit expansion • Let f:{0,1}^n → {0,1}^n be a one-way permutation • Let h:{0,1}^n → {0,1} be a hardcore predicate for f • Consider g:{0,1}^n → {0,1}^{n+1} where g(x) = (f(x), h(x)) Claim: g is a pseudo-random generator Proof: a distinguisher between (f(x), h(x)) and (f(x), 1−h(x)) can be used to guess h(x) from f(x)
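
A toy sketch in the style of Blum-Micali, assuming discrete exponentiation is one-way; the "lower half" predicate is its classic hardcore bit. The parameters below are illustrative only (G is assumed to generate a large subgroup):

```python
P, G = 2**61 - 1, 3   # toy Mersenne prime and assumed generator

def f(x: int) -> int:
    # Candidate one-way permutation: discrete exponentiation mod P.
    return pow(G, x, P)

def h(x: int) -> int:
    # Classic hardcore predicate for discrete log: is x in the lower half?
    return 1 if x < (P - 1) // 2 else 0

def g(x: int) -> tuple[int, int]:
    # Single-bit expansion: n input bits -> n + 1 output bits.
    return f(x), h(x)
```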

  16. Hardcore predicate with public information Definition: let f:{0,1}* → {0,1}* be a function. We say that h:{0,1}* × {0,1}* → {0,1} is a hardcore predicate for f if • h(x,r) is polynomial time computable • For any probabilistic polynomial time adversary A that receives input y = f(x) and public randomness r and tries to compute h(x,r), for any polynomial p(n) and sufficiently large n |Prob[A(y,r)=h(x,r)] − 1/2| < 1/p(n) where the probability is over the choice of y, the choice of r, and the random coins of A Alternative view: think of the public randomness as modifying the one-way function f: f′(x,r) = (f(x), r).

  17. Example: a weak hardcore predicate • Let h(x,i) = x_i, i.e. h selects the ith bit of x • For any one-way function f, no polynomial time algorithm A(y,i) can have probability of success better than 1 − 1/(2n) of computing h(x,i) • Exercise: let c:{0,1}* → {0,1}* be a good error-correcting code: • |c(x)| is O(|x|) • the distance between any two codewords c(x) and c(x′) is a constant fraction of |c(x)| • it is possible to correct in polynomial time errors in a constant fraction of |c(x)| Show that for h(x,i) = c(x)_i and any one-way function f, no polynomial time algorithm A(y,i) can have probability of success better than some constant of computing h(x,i)

  18. Inner product hardcore bit • The inner product bit: choose r ∈R {0,1}^n, let h(x,r) = r∙x = ∑ x_i r_i mod 2 Theorem [Goldreich-Levin]: for any one-way function the inner product is a hardcore predicate Proof structure: an algorithm A′ for inverting f • There are many x’s for which A returns a correct answer (r∙x) on ½+ε of the r’s • Reconstruction algorithm R: take an algorithm A that guesses h(x,r) correctly with probability ½+ε over the r’s and output a list of candidates for x - this is the main step! • R makes no use of the y info (except feeding it to A) • Choose from the list the/an x such that f(x) = y
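
The predicate itself is trivial to compute; a sketch:

```python
import secrets

def gl_bit(x: int, r: int) -> int:
    # Goldreich-Levin predicate: inner product of the bit vectors mod 2,
    # computed as the parity of the bitwise AND of the two words.
    return bin(x & r).count("1") & 1

n = 16
x, r = secrets.randbits(n), secrets.randbits(n)
b = gl_bit(x, r)   # the hardcore bit h(x, r) = r . x
```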

  19. Why a list? • Cannot have a unique answer! • Suppose A has two candidates x and x′ • On query r it returns at `random’ either r∙x or r∙x′: Prob[A(y,r) = r∙x] = ½ + ½·Prob[r∙x = r∙x′] = ¾

  20. [Diagram] A: algorithm for guessing r∙x. R: reconstruction algorithm that outputs a list of candidates for x. A′: algorithm for inverting f on a given y. A′ passes y to R; R queries A on (y,r1), (y,r2), …, (y,rk) and collects the guesses z1 = r1∙x, z2 = r2∙x, …, zk = rk∙x; from z1, z2, …, zk it outputs candidates x1, x2, …, xk; A′ checks whether f(xi) = y for some candidate xi = x.

  21. Warm-up (1) If A returns a correct answer on a 1 − 1/(2n) fraction of the r’s: • Choose r1, r2, …, rn ∈R {0,1}^n • Run A(y,r1), A(y,r2), …, A(y,rn) • Denote the responses z1, z2, …, zn • If r1, r2, …, rn are linearly independent then there is a unique x satisfying ri∙x = zi for all 1 ≤ i ≤ n • Prob[zi = A(y,ri) = ri∙x] ≥ 1 − 1/(2n), so by a union bound the probability that all the zi’s are correct is at least ½ • Do we need complete independence of the ri’s? `One-wise’ independence is sufficient: choose r ∈R {0,1}^n and set ri = r + ei, where ei = 0^{i−1} 1 0^{n−i} • All the ri’s are linearly independent • Each one is uniform in {0,1}^n (see the sketch below)
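
A sketch of the self-correction step under the (strong) assumption that A is correct on almost all r; the demo oracle below is perfect, standing in for such an A:

```python
import secrets

def dot(a: int, b: int) -> int:
    return bin(a & b).count("1") & 1

def recover(A, y, n: int) -> int:
    # Self-correction: (r XOR e_i) . x = (r . x) XOR x_i, so two calls to A
    # on correlated points recover bit i; each query point is still uniform.
    r = secrets.randbits(n)
    x = 0
    for i in range(n):
        x |= (A(y, r) ^ A(y, r ^ (1 << i))) << i
    return x

# Demo with a perfect oracle for a hidden x (it errs on no r at all):
n = 16
hidden = secrets.randbits(n)
assert recover(lambda y, r: dot(hidden, r), y=None, n=n) == hidden
```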

  22. Warm-up (2) If A returns a correct answer on a 3/4+ε fraction of the r’s, we can amplify the probability of success! Given any r ∈ {0,1}^n, procedure A′(y,r): • Repeat for j = 1, 2, … • Choose r′ ∈R {0,1}^n • Run A(y,r+r′) and A(y,r′), denote the sum of the responses by zj • Output the majority of the zj’s Analysis: Pr[zj = r∙x] ≥ Pr[A(y,r′) = r′∙x ∧ A(y,r+r′) = (r+r′)∙x] ≥ ½ + 2ε • This does not work for ½+ε, since success on r′ and r+r′ is not independent • Each one of the events `zj = r∙x’ is independent of the others • Therefore by taking sufficiently many j’s we can amplify to a value as close to 1 as we wish • Need roughly 1/ε² samples Idea for improvement: fix a few of the r′’s (sketch below)
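
A sketch of the pair-sampling majority vote, with A an assumed oracle:

```python
import secrets

def guess_bit(A, y, r: int, n: int, trials: int = 101) -> int:
    # Each trial picks a fresh r' and sums the two answers; the pair equals
    # r . x whenever both answers are correct, i.e. with probability at
    # least 1/2 + 2*eps. The trials are independent, so a majority vote
    # amplifies the advantage toward certainty.
    votes = 0
    for _ in range(trials):
        r_prime = secrets.randbits(n)
        votes += A(y, r ^ r_prime) ^ A(y, r_prime)
    return 1 if 2 * votes > trials else 0
```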

  23. The real thing • Choose r1, r2, …, rk ∈R {0,1}^n • Guess for j = 1, 2, …, k the value zj = rj∙x • Go over all 2^k possibilities • For all nonempty subsets S ⊆ {1,…,k}: • Let rS = ∑_{j∈S} rj • The implied guess is zS = ∑_{j∈S} zj • For each position xi: • for each S ⊆ {1,…,k} run A(y, ei−rS) • output the majority value of {zS + A(y, ei−rS)} Analysis: • Each one of the vectors ei−rS is uniformly distributed • A(y, ei−rS) is correct with probability at least ½+ε • Claim: for every pair of nonempty subsets S ≠ T ⊆ {1,…,k}, the two vectors rS and rT are pairwise independent • Therefore the variance is as in completely independent trials • If I is the number of correct A(y, ei−rS), then VAR(I) ≤ 2^k(½+ε) • Use Chebyshev’s Inequality: Pr[|I − E(I)| ≥ λ√VAR(I)] ≤ 1/λ² • Need 2^k = n/ε² to get the probability of error down to 1/n • So the process is successful simultaneously for all positions xi, i ∈ {1,…,n} (the query points are sketched below)
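
A sketch of how the 2^k − 1 pairwise-independent query points and their implied guesses arise from k seeds:

```python
from itertools import combinations
import secrets

n, k = 16, 4
seeds = [secrets.randbits(n) for _ in range(k)]   # r_1, ..., r_k
z = [secrets.randbits(1) for _ in range(k)]       # one guessed bit z_j per seed

# Every nonempty S yields a query point r_S and an implied guess z_S;
# the 2^k - 1 points are pairwise independent, though far from mutually so.
queries = {}
for size in range(1, k + 1):
    for S in combinations(range(k), size):
        r_S, z_S = 0, 0
        for j in S:
            r_S ^= seeds[j]   # sum over GF(2) is XOR
            z_S ^= z[j]
        queries[S] = (r_S, z_S)
```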

  24. Analysis Number of invocations of A: • 2^k ∙ n ∙ (2^k − 1) = poly(n, 1/ε) ≈ n³/ε⁴ (guesses × positions × subsets) Size of the resulting list of candidates for x: for each guess of z1, z2, …, zk there is a unique x • 2^k = poly(n, 1/ε) ≈ n/ε² Conclusion: single-bit expansion of a one-way permutation is a pseudo-random generator [Diagram: n-bit x maps to the n+1 bits (f(x), h(x,r))]

  25. Reducing the size of the list of candidates Idea: bootstrap Given any r ∈ {0,1}^n, procedure A′(y,r): • Choose r1, r2, …, rk ∈R {0,1}^n • Guess for j = 1, 2, …, k the value zj = rj∙x • Go over all 2^k possibilities • For all nonempty subsets S ⊆ {1,…,k}: • Let rS = ∑_{j∈S} rj • The implied guess is zS = ∑_{j∈S} zj • for each S ⊆ {1,…,k} run A(y, r−rS) • output the majority value of {zS + A(y, r−rS)} • For 2^k = 1/ε² the probability of error is, say, 1/8 Fix the same r1, r2, …, rk for subsequent executions: they are good for 7/8 of the r’s Run warm-up (2) The size of the resulting list of candidates for x is ≈ 1/ε²

  26. Application: Diffie-Hellman The Diffie-Hellman assumption: let G be a group and g an element in G. Given g, a = g^x and b = g^y, it is hard to find c = g^{xy} for random x and y, i.e. the probability that a poly-time machine outputs g^{xy} is negligible. More accurately: a sequence of groups • We don’t know how to verify, given c′, whether it is equal to g^{xy} • Exercise: show that under the DH assumption, given a = g^x, b = g^y and r ∈ {0,1}^n, no polynomial time machine can guess r∙g^{xy} with advantage 1/poly • for random x, y and r

  27. Application: if subset sum is one-way, then it is a pseudo-random generator • Subset sum problem: given • n numbers 0 ≤ a1, a2, …, an ≤ 2^m • a target sum y • find a subset S ⊆ {1,…,n} with ∑_{i∈S} ai = y • Subset sum one-way function f:{0,1}^{mn+n} → {0,1}^{mn+m}: f(a1, a2, …, an, x1, x2, …, xn) = (a1, a2, …, an, ∑_{i=1}^n xi·ai mod 2^m) If m < n then we get out fewer bits than we put in; if m > n then we get out more bits than we put in. Theorem: if for m > n subset sum is a one-way function, then it is also a pseudo-random generator (sketch of the function below)
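
A sketch of the function itself, with hypothetical toy parameters:

```python
import secrets

def subset_sum_f(a: list[int], x: list[int], m: int) -> tuple[list[int], int]:
    # f(a_1..a_n, x_1..x_n) = (a_1..a_n, sum of the selected a_i mod 2^m);
    # the a_i appear in both input and output, x selects the subset.
    y = sum(ai for ai, xi in zip(a, x) if xi) % (2 ** m)
    return a, y

n, m = 8, 12   # m > n: mn + m output bits from mn + n input bits
a = [secrets.randbits(m) for _ in range(n)]
x = [secrets.randbits(1) for _ in range(n)]
_, y = subset_sum_f(a, x, m)
```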

  28. Subset sum generator Idea of proof: use the distinguisher A to compute r∙x For simplicity, do the computation mod P for a large prime P • Given r ∈ {0,1}^n and (a1, a2, …, an, y), generate a new problem (a′1, a′2, …, a′n, y′): • Choose c ∈R Z_P • Let a′i = ai if ri = 0 and a′i = ai + c mod P if ri = 1 • Guess k ∈R {0,…,n} - the value of ∑ xi ri, the number of locations where x and r are both 1 • Let y′ = y + c·k mod P Run the distinguisher A on (a′1, a′2, …, a′n, y′) and output what A says, XORed with parity(k) Claim: if k is correct, then (a′1, a′2, …, a′n, y′) is pseudo-random Claim: for any incorrect k, (a′1, a′2, …, a′n, y′) is random, since y′ = z + (k−h)c mod P where z = ∑_{i=1}^n xi·a′i mod P and h = ∑ xi ri Prob[A=‘0’ | pseudo] = ½+ε and Prob[A=‘0’ | random] = ½ Therefore the probability of guessing r∙x correctly is 1/n·(½+ε) + (n−1)/n·(½) = ½ + ε/n

  29. Interpretations of the Goldreich-Levin theorem • A tool for constructing pseudo-random generators; the main part of the proof: • A mechanism for translating `general confusion’ into randomness • Diffie-Hellman example • List decoding of Hadamard codes • works in the other direction as well (for any code with good list decoding) • List decoding, as opposed to unique decoding, allows getting much closer to the distance of the code • `Explains’ unique decoding in the warm-up, where the prediction probability was 3/4+ε • Finding all linear functions agreeing with a function given as a black box • Learning all Fourier coefficients larger than ε • If the Fourier coefficients are concentrated on a small set - they can be found • True for AC0 circuits • and for decision trees

  30. Composing PRGs Composition: let • g1 be an (ℓ1, ℓ2)-pseudo-random generator • g2 be an (ℓ2, ℓ3)-pseudo-random generator and consider g(x) = g2(g1(x)) Claim: g is an (ℓ1, ℓ3)-pseudo-random generator Proof: consider three distributions on {0,1}^ℓ3: • D1: y uniform in {0,1}^ℓ3 • D2: y = g(x) for x uniform in {0,1}^ℓ1 • D3: y = g2(z) for z uniform in {0,1}^ℓ2 Suppose, by contradiction, that there is a distinguisher A between D1 and D2. By the triangle inequality, A must either distinguish between D1 and D3 - and can then be used to distinguish g2 - or distinguish between D2 and D3 - and can then be used to distinguish g1 (a composition sketch follows)
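
A minimal sketch of the composition, with g1 and g2 assumed PRGs on byte strings:

```python
def compose(g1, g2):
    # g maps l1 bits to l3 bits; a distinguisher for g must distinguish
    # either D1 from D3 (breaking g2) or D2 from D3 (breaking g1).
    def g(x: bytes) -> bytes:
        return g2(g1(x))
    return g

# e.g. with the hash-based stand-in from slide 10 (lengths in bytes):
# g = compose(lambda x: prg(x, 32), lambda x: prg(x, 64))
```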

  31. Composing PRGs When composing • a generator secure against advantage ε1 and • a generator secure against advantage ε2, we get security against advantage ε1+ε2 When composing the single-bit expansion generator n times, a distinguisher with advantage ε for the result yields a distinguisher with advantage at least ε/n for a single step Hybrid argument: to prove that two distributions D and D′ are indistinguishable, suggest a collection of distributions D = D0, D1, …, Dk = D′ such that if D and D′ can be distinguished, there is a pair Di and Di+1 that can be distinguished: advantage ε between D and D′ means advantage ε/k between some Di and Di+1. Use such a distinguisher to derive a contradiction

  32. From single-bit expansion to many-bit expansion [Diagram: starting from seed x and public randomness r, iterate f; output the bits h(x,r), h(f(x),r), h(f(2)(x),r), …, h(f(m−1)(x),r); the internal states are f(x), f(2)(x), f(3)(x), …, f(m)(x)] • Can make r and f(m)(x) public • But not any other internal state • Can make m as large as needed
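
A sketch of the iteration pictured above, with f and h assumed (e.g. the toy candidates from slide 15 together with the Goldreich-Levin bit):

```python
def iterated_generator(f, h, x, r, m: int) -> list[int]:
    # Output h(x, r), h(f(x), r), ..., h(f^(m-1)(x), r). The final state
    # f^(m)(x) and the public randomness r may be revealed; none of the
    # intermediate states may be.
    bits = []
    for _ in range(m):
        bits.append(h(x, r))
        x = f(x)
    return bits
```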

  33. Exercise • Let {Dn} and {D’n} be two distributions that are • Computationally indistinguishable • Polynomial time samplable • Suppose that {y1,… ym} are all sampled according to {Dn} or all are sampled according to {D’n} • Prove: no probabilistic polynomial time machine can tell, given {y1,… ym}, whether they were sampled from {Dn} or {D’n}

  34. Existence of PRGs What we have proved: Theorem: if pseudo-random generators stretching by a single bit exist, then pseudo-random generators stretching by any polynomial factor exist Theorem: if one-way permutations exist, then pseudo-random generators exist A harder theorem to prove Theorem [HILL]: if one-way functions exist, then pseudo-random generators exist Exercise: show that if pseudo-random generators exist, then one-way functions exist

  35. Next-bit test Definition: a function g:{0,1}* → {0,1}* is said to pass the next-bit test if • It is polynomial time computable • It stretches the input: |g(x)| > |x| • denote by ℓ(n) the length of the output on inputs of length n • If the input (seed) is random, then the output passes the next-bit test: for any prefix length 0 ≤ i < ℓ(n), for any probabilistic polynomial time adversary A that receives the first i bits of y = g(x) and tries to guess the next bit, for any polynomial p(n) and sufficiently large n |Prob[A(y1,y2,…,yi) = yi+1] − 1/2| < 1/p(n) Theorem: a function g:{0,1}* → {0,1}* passes the next-bit test if and only if it is a pseudo-random generator

  36. Next-block unpredictability Suppose that the function G maps a given seed into a sequence of blocks y1, y2, …; let ℓ(n) be the number of blocks produced from a seed of length n • If the input (seed) is random, then the output passes the next-block unpredictability test: for any prefix length 0 ≤ i < ℓ(n), for any probabilistic polynomial time adversary A that receives the first i blocks of y = G(x) and tries to guess the next block yi+1, for any polynomial p(n) and sufficiently large n Prob[A(y1,y2,…,yi) = yi+1] < 1/p(n) Exercise: show how to convert a next-block unpredictable generator into a pseudo-random generator.

  37. Pseudo-random generators, concrete version Gn: {0,1}^m → {0,1}^n A cryptographically strong pseudo-random sequence generator passes all polynomial time statistical tests; (t,ε)-pseudo-random means no test A running in time t can distinguish with advantage ε

  38. Three basic issues in cryptography • Identification • Authentication • Encryption Solve them in a shared-key environment [Diagram: A and B each hold the shared key S]

  39. Identification - remote login using a pseudo-random sequence A and B share a key S ∈ {0,1}^k In order for A to identify itself to B: • Generate the sequence Gn(S) • For each identification session, send the next block of Gn(S)

  40. Problems... • More than two parties • Malicious adversaries - may add noise • Coordinating the location (block number) • Better approach: challenge-response

  41. Challenge-response protocol • B selects a random location and sends it to A (“What’s this?”) • A sends the value at that location
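
A minimal sketch, with a hash standing in for random access into the shared pseudo-random sequence (heuristic, illustration only):

```python
import hashlib, secrets

def block_at(seed: bytes, location: int) -> bytes:
    # Stand-in for random access into the shared sequence G(S):
    # a hash of the seed and the index, in place of a real PRF.
    return hashlib.sha256(seed + location.to_bytes(8, "big")).digest()

shared = secrets.token_bytes(16)        # S, known to both A and B
challenge = secrets.randbits(32)        # B: pick a random location
response = block_at(shared, challenge)  # A: value at that location
assert response == block_at(shared, challenge)  # B verifies the reply
```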

  42. Desired properties • Very long string - prevents repetitions • Random access to the sequence • Unpredictability - cannot guess the value at a random location • even after seeing values at many locations of the adversary’s choice • Pseudo-randomness implies unpredictability • Not the other way around for blocks

  43. Authenticating messages • A wants to send a message M ∈ {0,1}^n to B • B should be confident that A is indeed the sender of M One-time application: S = (a,b), where a,b ∈R {0,1}^n To authenticate M: supply a·M ⊕ b The computation is done in GF[2^n]
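
A sketch of the tag computation in a toy field GF(2^8) (the slide's scheme uses GF(2^n) for n-bit messages); 0x11B is the AES-style irreducible polynomial x^8 + x^4 + x^3 + x + 1:

```python
def gf_mul(a: int, b: int, poly: int = 0x11B, nbits: int = 8) -> int:
    # Carry-less multiply of a and b, then reduce mod the irreducible poly.
    result = 0
    for i in range(nbits):
        if (b >> i) & 1:
            result ^= a << i
    for i in range(2 * nbits - 2, nbits - 1, -1):  # clear high bits
        if (result >> i) & 1:
            result ^= poly << (i - nbits)
    return result

def one_time_mac(message: int, a: int, b: int) -> int:
    # tag = a*M + b; addition in GF(2^n) is XOR.
    return gf_mul(a, message) ^ b
```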

  44. Problems and solutions • The problems are the same as for identification • If a very long random string is available, it can be used for one-time authentication • This works even if a and b are only random-looking [Diagram: A and B share a long string and use the next unused segment]

  45. Encryption of messages • A wants to send a message M ∈ {0,1}^n to B • Only B should be able to learn M One-time application: S = a, where a ∈R {0,1}^n To encrypt M: send a ⊕ M

  46. Encryption of messages • If a very long random-looking string is available, it can be used as in one-time encryption [Diagram: A and B share a long string and use the next unused segment]

  47. Pseudo-random functions Concrete treatment: F: {0,1}^k × {0,1}^n → {0,1}^m (key × domain → range) Denote Y = FS(X) A family of functions Φk = {FS | S ∈ {0,1}^k} is (t,ε,q)-pseudo-random if it is • efficiently computable - with random access - and…

  48. (t,,q)-pseudo-random The tester A that can choose adaptively • X1 and get Y1= FS (X1) • X2 and get Y2 = FS (X2 ) … • Xq and get Yq= FS (Xq) • Then A has to decide whether • FS R Φkor • FS R R n  m =  F| F:0,1n  0,1m 

  49. (t,,q)-pseudo-random For a function F chosen at random from (1) Φk ={FS | S0,1k  (2)R n  m =  F| F:0,1n  0,1m  For all t-time machines A that choose qlocations and try to distinguish (1) from (2)  ProbA ‘1’  FR Fk - ProbA ‘1’  FRR n  m   

  50. Equivalent/non-equivalent definitions • Instead of the next-bit test: for X ∉ {X1, X2, …, Xq} chosen by A, decide whether a given Y is • Y = FS(X) or • Y ∈R {0,1}^m • Adaptive vs. non-adaptive queries • Unpredictability vs. pseudo-randomness • A pseudo-random sequence generator g: {0,1}^m → {0,1}^n is • a pseudo-random function on the small domain {0,1}^{log n} → {0,1} with key in {0,1}^m (sketch below)
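
A sketch of the last equivalence: a PRG stretching m bits to n bits, viewed as a function table, gives a pseudo-random function on a domain of size n. Here g is an assumed PRG returning its output as bytes (the hash-based stand-in from slide 10 would do for experimentation):

```python
def prf_from_prg(g, seed: bytes, n_bits: int):
    # F_S(i) is simply bit i of g(S); the whole table is one PRG output.
    stream = g(seed)   # assumed to contain at least n_bits bits
    def F(i: int) -> int:
        # Domain is {0, ..., n_bits - 1}; range is {0, 1}.
        return (stream[i // 8] >> (i % 8)) & 1
    return F
```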
