Extractors via Low-degree Polynomials
1. Joint with A. Ta-Shma & D. Zuckerman
2. Improved: R. Shaltiel and C. Umans
Slides: Adi Akavia
Definitions
Def: The min-entropy of a random variable X over {0,1}^n is defined as H∞(X) = min_x log₂(1/Pr[X=x]). Thus a random variable X has min-entropy at least k if Pr[X=x] ≤ 2^{-k} for all x. [Maximum possible min-entropy for such a R.V. is n]
Def (statistical distance): Two distributions on a domain D are ε-close if the probabilities they give to any A ⊆ D differ by at most ε (namely, half the ℓ1-norm of their difference).
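As a quick numeric illustration of these two definitions (not part of the original slides), the following sketch computes the min-entropy of an explicitly given distribution and the statistical distance between two distributions; the toy distributions at the bottom are made up for the example.

```python
# A minimal numeric sketch: min-entropy and statistical distance for
# explicitly given distributions over {0,1}^n.
from math import log2
from itertools import product

def min_entropy(dist):
    """dist: dict mapping strings to probabilities. Returns min_x log2(1/Pr[x])."""
    return min(-log2(p) for p in dist.values() if p > 0)

def statistical_distance(d1, d2):
    """Half the L1 distance = max over events A of |Pr_d1[A] - Pr_d2[A]|."""
    support = set(d1) | set(d2)
    return 0.5 * sum(abs(d1.get(x, 0.0) - d2.get(x, 0.0)) for x in support)

# Example: a source over {0,1}^3 that is flat on 4 strings has min-entropy 2.
flat = {x: 0.25 for x in ['000', '001', '010', '011']}
uniform3 = {''.join(b): 1/8 for b in product('01', repeat=3)}
print(min_entropy(flat))                      # 2.0
print(statistical_distance(flat, uniform3))   # 0.5
```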
Definitions
Def: A (k, ε)-extractor is a function E: {0,1}^n × {0,1}^t → {0,1}^m s.t. for any R.V. X with min-entropy ≥ k, E(X, U_t) is ε-close to U_m (where U_m denotes the uniform distribution over {0,1}^m).
[Figure: a weak random source of n bits and a seed of t bits enter E, which outputs a random string of m bits.]
Parameters
The relevant parameters are:
• min-entropy of the weak random source – k. Relevant values: log(n) ≤ k ≤ n (the seed length is t ≥ log(n), hence there is no point in considering lower min-entropy).
• seed length t ≥ log(n)
• quality of the output: ε
• size of the output m = f(k). The optimum is m = k.
Extractors
[Figure: E maps a high min-entropy distribution over 2^n strings, together with a uniform seed out of 2^t possibilities, to an output over 2^m strings that is close to uniform.]
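A brute-force sanity check of the definition can also be written down; the sketch below assumes the standard fact that it suffices to test flat sources (uniform on a set of size 2^k), and the toy function at the end is only there to exercise the checker, not a construction from the slides.

```python
from itertools import combinations, product

def is_extractor(E, n, t, m, k, eps):
    """Check whether E: {0,1}^n x {0,1}^t -> {0,1}^m is a (k, eps)-extractor by
    enumerating all flat sources supported on 2^k strings. Exponential; toy sizes only."""
    strings_n = [''.join(b) for b in product('01', repeat=n)]
    seeds     = [''.join(b) for b in product('01', repeat=t)]
    outputs   = [''.join(b) for b in product('01', repeat=m)]
    for support in combinations(strings_n, 2**k):          # one flat source X per subset
        counts = {z: 0 for z in outputs}
        for x in support:
            for s in seeds:
                counts[E(x, s)] += 1
        total = len(support) * len(seeds)
        dist = 0.5 * sum(abs(c / total - 1 / 2**m) for c in counts.values())
        if dist > eps:
            return False
    return True

# Toy run: this E just XORs one source bit with the seed bit, so its output is uniform
# for every fixed x and it passes trivially; real extractors cannot lean on the seed
# alone, since the seed is far shorter than the output.
E_toy = lambda x, s: str(int(x[0]) ^ int(s[0]))
print(is_extractor(E_toy, n=3, t=1, m=1, k=2, eps=0.1))    # True
```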
Next Bit Predictors
Claim: to prove E is an extractor, it suffices to prove that for all 0 < i < m+1 and all predictors f: {0,1}^{i-1} → {0,1}:
Pr[f(E(X,U_t)_{1..i-1}) = E(X,U_t)_i] ≤ ½ + ε/m
Proof: Assume E is not an extractor; then there exists a distribution X s.t. E(X, U_t) is not ε-close to U_m, that is, there exists a set A ⊆ {0,1}^m s.t. Pr[E(X,U_t) ∈ A] − Pr[U_m ∈ A] > ε (WLOG, replacing A by its complement if needed).
Proof
Now define the following hybrid distributions: for 0 ≤ i ≤ m, let D_i be the distribution over {0,1}^m whose first i bits are E(X,U_t)_{1..i} and whose remaining m−i bits are uniform and independent; so D_0 = U_m and D_m = E(X,U_t).
Proof
Summing, over all the hybrid distributions, the differences of the probabilities they give to the set A yields a telescoping sum:
∑_{i=1}^{m} (Pr[D_i ∈ A] − Pr[D_{i−1} ∈ A]) = Pr[D_m ∈ A] − Pr[D_0 ∈ A] > ε
And because |∑a_i| ≤ ∑|a_i|, there exists an index 0 < i < m+1 for which:
Pr[D_i ∈ A] − Pr[D_{i−1} ∈ A] > ε/m
The Predictor
We now define a function f: {0,1}^{i-1} → {0,1} that can predict the i'th bit with probability at least ½ + ε/m (“a next bit predictor”):
On input b_1,…,b_{i-1}, the function f uniformly and independently draws the bits y_i,…,y_m and outputs y_i if (b_1,…,b_{i-1},y_i,…,y_m) ∈ A, and 1−y_i otherwise.
Note: the above definition is not constructive, as A is not known!
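Written as code, the (hypothetical) predictor looks as follows; the membership test for A is assumed to be given as an oracle, which is exactly why the definition is non-constructive.

```python
import random

def make_predictor(A_membership, i, m):
    """A sketch of the next-bit predictor from the proof: given the first i-1 bits,
    complete the string with fresh uniform bits y_i..y_m and answer according to
    whether the completion lands in the distinguishing set A."""
    def f(prefix_bits):
        suffix = [random.randint(0, 1) for _ in range(m - (i - 1))]   # y_i, ..., y_m
        y_i = suffix[0]
        candidate = list(prefix_bits) + suffix
        # Guess y_i if the completed string is in A, and the complement bit otherwise.
        return y_i if A_membership(candidate) else 1 - y_i
    return f
```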
Proof
And f is indeed a next bit predictor: its advantage over ½ equals Pr[D_i ∈ A] − Pr[D_{i−1} ∈ A] > ε/m. Q.E.D.
Next-q-it List-Predictor
f is allowed to output a small list of ℓ possible next elements.
q-ary Extractor
Def: Let F be a field with q elements. A (k, ℓ) q-ary extractor is a function E: {0,1}^n × {0,1}^t → F^m s.t. for all R.V. X with min-entropy ≥ k, all 0 < i < m and all list-predictors f: F^{i-1} → F^ℓ:
Generator
Def: Define a generator matrix for the vector space F^d as a d×d matrix A s.t. for any non-zero vector v ∈ F^d: {A^i v}_i = F^d \ {0} (that is, any vector 0 ≠ v ∈ F^d multiplied by all powers of A generates the entire vector space F^d except for 0).
Lemma: Such a generator matrix exists and can be found in time q^{O(d)}.
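For small prime q and small d, one illustrative way to realize the lemma is a q^{O(d)}-time brute-force search over companion matrices of monic degree-d polynomials, as in the sketch below (the helper names and the restriction to prime q are assumptions made for the example, not the paper's procedure):

```python
from itertools import product

def mat_vec(A, v, q):
    return tuple(sum(a * x for a, x in zip(row, v)) % q for row in A)

def is_generator(A, q, d):
    """Check that the powers of A applied to one nonzero vector hit all of F_q^d \\ {0}."""
    v = (1,) + (0,) * (d - 1)
    seen, w = set(), v
    for _ in range(q**d - 1):
        seen.add(w)
        w = mat_vec(A, w, q)
    return len(seen) == q**d - 1

def find_generator(q, d):
    """Try companion matrices of all monic degree-d polynomials; time roughly q^O(d)."""
    for coeffs in product(range(q), repeat=d):   # x^d + c_{d-1} x^{d-1} + ... + c_0
        A = [[0] * d for _ in range(d)]
        for i in range(1, d):
            A[i][i - 1] = 1                      # sub-diagonal of the companion matrix
        for i in range(d):
            A[i][d - 1] = (-coeffs[i]) % q       # last column from the coefficients
        if is_generator(A, q, d):
            return A
    return None

print(find_generator(2, 4))   # a 4x4 binary matrix of multiplicative order 15
```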
Strings as Low-degree Polynomials
• Let F be a field with q elements
• Let F^d be a vector space over F
• Let h be the smallest integer s.t. a d-variate polynomial of total degree h−1 has at least n coefficients
• For x ∈ {0,1}^n, let x̂ denote the unique d-variate polynomial of total degree h−1 whose coefficients are specified by x.
Note that for such a polynomial, the number of coefficients is exactly C(h+d−2, d−1) (“choosing where to put d−1 bars between h−1 balls”).
The [SU] Extractor
The definition of the q-ary extractor E: {0,1}^n × {0,1}^{d log q} → F^m: the seed is interpreted as a vector v ∈ F^d, and with A the generator matrix,
E(x; v) = x̂(Av) ∘ x̂(A²v) ∘ … ∘ x̂(A^m v).
[Figure: the seed v and its images Av, …, A^i v, …, A^m v in F^d, at which x̂ is evaluated.]
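Putting the pieces together for toy parameters, a sketch of E might look as follows; it assumes a prime field, reuses find_generator from the earlier sketch, and takes the coefficient vector of x̂ directly as input (the monomial ordering and function names are illustrative choices, not the paper's):

```python
from itertools import product

def monomials(d, h):
    """Exponent vectors of the d-variate monomials of total degree at most h-1."""
    return [e for e in product(range(h), repeat=d) if sum(e) <= h - 1]

def poly_eval(coeffs, expos, point, q):
    """Evaluate sum_j coeffs[j] * prod_i point[i]^expos[j][i] over F_q (q prime)."""
    total = 0
    for c, e in zip(coeffs, expos):
        term = c
        for xi, ei in zip(point, e):
            term = (term * pow(xi, ei, q)) % q
        total = (total + term) % q
    return total

def su_extract(x_coeffs, v, A, m, q, d, h):
    """Output the m field elements xhat(A v), xhat(A^2 v), ..., xhat(A^m v)."""
    expos = monomials(d, h)
    out, w = [], v
    for _ in range(m):
        w = tuple(sum(a * t for a, t in zip(row, w)) % q for row in A)   # w <- A w
        out.append(poly_eval(x_coeffs, expos, w, q))
    return out

# Tiny usage: q=5, d=2, h=3 gives 6 monomials, so x is encoded by 6 base-5 coefficients.
A = find_generator(5, 2)                    # reusing the sketch after the Generator slide
print(su_extract([1, 2, 0, 3, 4, 1], (1, 0), A, m=4, q=5, d=2, h=3))
```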
Main Theorem
Thm: For any n, q, d and h as previously defined, E is a (k, ℓ) q-ary extractor if:
Alternatively, E is a (k, ℓ) q-ary extractor if:
What’s Ahead • “counting argument” and how it works • The reconstruction paradigm • Basic example – lines in space • Proof of the main theorem
Extension Fields
A field F₂ is called an extension of another field F if F is contained in F₂ as a subfield.
Thm: For every prime power p^k (p prime, k > 0) there is a unique (up to isomorphism) finite field containing p^k elements. These fields are denoted GF(p^k) and comprise all finite fields.
Def: A polynomial is called irreducible in GF(p) if it does not factor over GF(p).
Thm: Let f(x) be an irreducible polynomial of degree k over GF(p). The set of polynomials of degree at most k−1 over Z_p, with addition coordinate-wise and multiplication modulo f(x), forms the finite field GF(p^k).
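The last theorem is easy to exercise in code: the sketch below multiplies two elements of GF(p^k) represented as coefficient lists, reducing modulo a chosen irreducible f(x) (a minimal illustration, assuming f is given).

```python
def poly_mul_mod(a, b, f, p):
    """Multiply two elements of GF(p^k), given as coefficient lists (low degree first),
    reducing modulo the monic irreducible polynomial f of degree k."""
    k = len(f) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Reduce modulo f: replace x^k by -(f[0] + f[1] x + ... + f[k-1] x^(k-1)).
    for deg in range(len(prod) - 1, k - 1, -1):
        c = prod[deg]
        if c:
            prod[deg] = 0
            for j in range(k):
                prod[deg - k + j] = (prod[deg - k + j] - c * f[j]) % p
    return [v % p for v in prod[:k]] + [0] * max(0, k - len(prod))

# GF(4) = GF(2^2) built from the irreducible f(x) = x^2 + x + 1 (coeffs low-first: [1,1,1]).
f = [1, 1, 1]
x = [0, 1]                        # the element "x"
print(poly_mul_mod(x, x, f, 2))   # x^2 = x + 1  ->  [1, 1]
```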
Counting Argument
For Y ⊆ X, denote μ(Y) = ∑_{y∈Y} Pr[y] (“the weight of Y”).
Assume a mapping R: {0,1}^a → {0,1}^n s.t. Pr_{x~X}[∃z: R(z) = x] ≥ ½. Then, writing S = {0,1}^a:
• for X uniform over a subset of {0,1}^n, |X| ≤ 2|R(S)|
• for an arbitrary distribution X, μ(X) ≤ 2μ(R(S))
If X is of min-entropy k, then μ(R(S)) ≤ 2^a · 2^{−k} = 2^{a−k} and therefore k ≤ a + 1 (since 1 = μ(X) ≤ 2μ(R(S)) ≤ 2^{1+a−k}).
[Figure: R maps the 2^a strings of S into {0,1}^n; the image R(S) covers at least half the weight of X.]
“Reconstruction Proof Paradigm”
Proof sketch: For a certain R.V. X with min-entropy k, assume, by way of contradiction, a predictor f for the q-ary extractor. For a << k, construct a function R: {0,1}^a → {0,1}^n -- the “reconstruction function” -- that uses f as an oracle and satisfies Pr_{x~X}[∃z: R(z) = x] ≥ ½.
By the “counting argument”, this implies X has min-entropy much smaller than k.
Basic Example – Lines
Construction:
• Let BC: F → {0,1}^s be an (inefficient) binary code
• Given
  • x, a weak random source, interpreted as a polynomial x̂: F² → F, and
  • s, a seed, interpreted as a random point (a,b) and an index j into the binary code.
• Def: E(x; ((a,b), j)) = BC(x̂(a,b))_j ∘ BC(x̂(a,b+1))_j ∘ … ∘ BC(x̂(a,b+m−1))_j
Basic Example – Illustration of Construction
[Figure: x is interpreted as x̂; with seed s = ((a,b), 2), the values x̂(a,b), x̂(a,b+1), …, x̂(a,b+m) along the line are encoded by the (inefficient) binary code (here 001, 110, 000, 101, 110) and the 2nd bit of each codeword is output, giving E(x,s) = 01001.]
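A small sketch of this lines-based construction (assuming a prime field, x̂ given as an evaluation oracle, and a lookup table standing in for the binary code BC):

```python
def line_extract(xhat, BC, a, b, j, m, q):
    """Walk m points up the vertical line through (a, b) and output the j-th bit
    of the binary encoding of xhat at each point."""
    bits = []
    for step in range(m):
        value = xhat(a, (b + step) % q)       # the field element xhat(a, b+step)
        bits.append(BC[value][j])             # j-th bit of its codeword
    return ''.join(bits)

# Tiny usage: F_5, xhat(u,w) = u + w^2 mod 5, and a 3-bit lookup table standing in
# for a real binary code.
BC = {v: format(v, '03b') for v in range(5)}
print(line_extract(lambda u, w: (u + w * w) % 5, BC, a=2, b=1, j=1, m=4, q=5))
```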
Basic Example – Proof Sketch
• Assume, by way of contradiction, that there exists a predictor function f.
• Next, show a reconstruction function R s.t. Pr_{x~X}[∃z: R(z) = x] ≥ ½.
• Conclude: a contradiction! (to the min-entropy assumption on X)
Basic Example – Reconstruction Function
[Figure: parameters h ≈ n^{1/2}, j ≈ lg n, m ≈ desired entropy. The “advice” is the value of x̂ at “few” (red) points, a = m·j·O(h) bits. Pick a random line; use the predictor f for list decoding along it; resolve the list into one value on the line; repeat using the new points until all of F^d is evaluated.]
Problems with the above Construction • Too many lines! • Takes too many bits to define a subspace
The Reconstruction Function (R)
• Task: allow many strings x in the support of X to be reconstructed from very short advice strings.
• Outline:
  • Use f in a sequence of prediction steps to evaluate z on all points of F^d,
  • interpolate to recover the coefficients of z,
  • which gives x.
Next We Show: there exists a sequence of prediction steps that works for many x in the support of X and requires few advice strings.
Curves
• Let r = Θ(d).
• Pick random vectors and values:
  • 2r random points y_1,…,y_{2r} ∈ F^d, and
  • 2r values t_1,…,t_{2r} ∈ F, and
• Define degree 2r−1 polynomials p_1, p_2:
  • p_1: F → F^d defined by p_1(t_i) = y_i, i = 1,…,2r.
  • p_2: F → F^d defined by p_2(t_i) = A y_i, i = 1,…,r, and p_2(t_i) = y_i, i = r+1,…,2r.
• Define vector sets P_1 = {p_1(z)}_{z∈F} and P_2 = {p_2(z)}_{z∈F}
• ∀i > 0 define P_{2i+1} = A·P_{2i−1} and P_{2i+2} = A·P_{2i}
({P_i}, the sequence of prediction steps, are low-degree curves in F^d, chosen using the coin tosses of R)
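Concretely, the curves can be produced by coordinate-wise Lagrange interpolation through the chosen points; the sketch below assumes a prime field and is only meant to make the definitions of p_1 and p_2 explicit.

```python
def lagrange_eval(ts, vals, z, q):
    """Evaluate at z (over F_q, q prime) the unique degree-(len(ts)-1) polynomial
    with p(ts[i]) = vals[i]."""
    total = 0
    for i, (ti, vi) in enumerate(zip(ts, vals)):
        num, den = 1, 1
        for j, tj in enumerate(ts):
            if j != i:
                num = (num * (z - tj)) % q
                den = (den * (ti - tj)) % q
        total = (total + vi * num * pow(den, q - 2, q)) % q   # invert den via Fermat
    return total

def curve_point(ts, points, z, q):
    """The curve p: F -> F^d through the given points of F^d, evaluated at z coordinate-wise."""
    d = len(points[0])
    return tuple(lagrange_eval(ts, [pt[c] for pt in points], z, q) for c in range(d))

# p1 interpolates y_1..y_{2r} at t_1..t_{2r}; p2 interpolates A*y_1..A*y_r, y_{r+1}..y_{2r}
# at the same t's. Requires the t_i to be distinct (so 2r <= q).
```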
Curves
[Figure: the points y_1,…,y_{2r} in F^d and their images A y_j, A² y_j, A³ y_j, … under powers of A, parameterized by t_1,…,t_{2r} ∈ F; consecutive curves P_i share r of these points.]
Simple Observations
• A is a non-singular linear transform, hence ∀i:
  • P_i is a 2r-wise independent collection of points
  • P_i and P_{i+1} intersect at r random points
• z|_{P_i} is a univariate polynomial of degree at most 2hr.
• Given the evaluation of z on Av, A²v,…,A^m v, we may use the predictor function f to predict z(A^{m+1}v) to within ℓ values.
• We need an advice string: the 2hr coefficients of z|_{P_i} for i = 1,…,m (length: at most mhr log q ≤ a).
Using N.B.P.
[Figure: using the next-bit predictor along a single curve, each predicted point comes with a list of ℓ candidate values – cannot resolve into one value!]
Using N.B.P.
[Figure: the same prediction step on the second, intersecting curve – can resolve into one value using the second curve!]
Open Problems
• Is the [SU] extractor optimal? Just run it for longer sequences
• Reconstruction technique requires interpolation from h (the degree) points, hence the maximal entropy extracted is k/h
• The seed -- a point -- requires a logarithmic number of bits
Main Lemma Proof Cont.
• Claim: with probability at least 1 − 1/(8q^d) over the coin tosses of R:
• Proof: We use the following tail bound: Let t > 4 be an even integer, and X_1,…,X_n be t-wise independent R.V.s with values in [0,1]. Let X = ∑X_i, μ = E[X], and A > 0. Then Pr[|X − μ| ≥ A] ≤ 8·((tμ + t²)/A²)^{t/2}.
Main Lemma Proof Cont.
• According to the next bit predictor, the probability of a successful prediction is at least 1/(2√ℓ).
• In the i'th iteration we make q predictions (as many points as there are on the curve).
• Using the tail bound provides the result. Q.E.D (of the claim).
Main Lemma Proof (cont.): Therefore, w.h.p. there are at least q/(4√ℓ) evaluation points of P_i that agree with the degree-2hr polynomial on the i'th curve (out of a total of at most ℓq).
Main Lemma Proof Cont.
• A list decoding bound: given n distinct pairs (x_i, y_i) in a field F and parameters k and d with k > (2dn)^{1/2}, there are at most 2n/k degree-d polynomials g such that g(x_i) = y_i for at least k pairs. Furthermore, a list of all such polynomials can be computed in time poly(n, log|F|).
• Using this bound and the previous claim, at most 8ℓ^{3/2} degree-2rh polynomials agree on this number of points (q/(4√ℓ)).
Main Lemma Proof Cont.
• Now,
  • P_i intersects P_{i−1} at r random positions, and
  • we know the evaluation of z at the points in P_{i−1}.
• Two degree-2rh polynomials can agree on at most a 2rh/q fraction of their points,
• so the probability that an “incorrect” polynomial among our candidates agrees on all r random points is at most (2rh/q)^r.
Main Lemma Proof Cont.
• So, w.h.p. we learn the points of P_i successfully.
• After 2q^d prediction steps, we have learned z on F^d \ {0} (since A is a generator of F^d \ {0}).
• By the union bound, the probability that every step of the reconstruction is successful is at least ½. Q.E.D (main lemma)
Proof of Main Theorem Cont.
• First,
• By an averaging argument:
• Therefore, there must be a fixing of the coins of R such that:
Using N.B.P. – Take 2
Use the N.B.P. over all points in F, so that we get enough “good evaluations”.
[Figure: the same curve picture as before, now with the predictor applied at every point of F.]
Proof of Main Theorem Cont.
• According to the counting argument, this implies that the min-entropy of X is at most a + 1.
• Recall that r = Θ(d).
• A contradiction to the parameter choice (which ensures k is much larger than the advice length a). Q.E.D (main theorem)!
From q-ary extractors to (regular) extractors
The simple technique – using error correcting codes:
Lemma: Let F be a field with q elements. Let C: {0,1}^{log(q)} → {0,1}^{n̄} be a binary error correcting code with relative distance at least ½ − O(ρ²). If E: {0,1}^n × {0,1}^t → F^m is a (k, O(ρ)) q-ary extractor, then E': {0,1}^n × {0,1}^{t+log(n̄)} → {0,1}^m defined by
E'(x; (y, j)) = C(E(x;y)_1)_j ∘ C(E(x;y)_2)_j ∘ … ∘ C(E(x;y)_m)_j
is a (k, ρm) binary extractor.
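The transformation in the lemma is just “take the j-th bit of the encoding of every output symbol”; a minimal sketch, assuming E and the code C are given as black boxes:

```python
def binary_from_qary(E, C, x, y, j):
    """E': the seed is (y, j); output the j-th bit of the encoding of each of the
    m field elements produced by the q-ary extractor E."""
    symbols = E(x, y)                         # m elements of F
    return ''.join(C[s][j] for s in symbols)  # one bit per symbol -> m output bits
```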
From q-ary extractors to (regular) extractors
A more complex transformation from q-ary extractors to binary extractors achieves the following parameters:
Thm: Let F be a field with q < 2^m elements. There is a polynomial-time computable function B such that for any (k, ρ) q-ary extractor E, E'(x; (y, j)) = B(E(x; y), j) is a (k, ρ·log*m) binary extractor.
From q-ary extractors to (regular) extractors
The last theorem allows using Theorem 1 with ρ = O(ε/log*m), and implies a (k, ε) extractor with seed length t = O(log n) and output length m = k/(log n)^{O(1)}.
Extractor ⇒ PRG
• Identify:
  • a string x ∈ {0,1}^n with the
  • function x: {0,1}^{log n} → {0,1} by setting x(i) = x_i
• Denote by S(x) the size of the smallest circuit computing the function x
Def (PRG): an ε-PRG for size s is a function G: {0,1}^t → {0,1}^m with the following property: ∀ 1 ≤ i ≤ m and all functions f: {0,1}^{i−1} → {0,1} with size-s circuits, Pr[f(G(U_t)_{1...i−1}) = G(U_t)_i] ≤ ½ + ε/m
This implies: for all size s−O(1) circuits C, |Pr[C(G(U_t))=1] − Pr[C(U_m)=1]| ≤ ε
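For intuition about the next-bit condition, the sketch below computes, by brute force, the advantage of the best possible (information-theoretic) next-bit predictor for a toy generator; a real PRG only needs to fool size-s circuits, so this is a strictly stronger test and works only for tiny parameters.

```python
from itertools import product
from collections import defaultdict

def best_next_bit_advantage(G, t, m):
    """Max over positions i of (success probability of the optimal unbounded
    next-bit predictor) - 1/2, for G: {0,1}^t -> {0,1}^m given as a callable."""
    worst = 0.0
    outputs = [G(''.join(s)) for s in product('01', repeat=t)]
    for i in range(m):
        stats = defaultdict(lambda: [0, 0])          # prefix -> counts of next bit 0/1
        for out in outputs:
            stats[out[:i]][int(out[i])] += 1
        # The optimal predictor answers the majority bit for each prefix.
        correct = sum(max(c) for c in stats.values())
        worst = max(worst, correct / len(outputs) - 0.5)
    return worst   # G fools unbounded next-bit predictors iff this is <= eps/m

# Toy generator: output the 2-bit seed followed by its parity; the last bit is
# perfectly predictable, so the advantage is 0.5.
print(best_next_bit_advantage(lambda s: s + str(int(s[0]) ^ int(s[1])), t=2, m=3))
```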
q-ary PRG
Def (q-ary PRG): Let F be the field with q elements. An ε-q-ary PRG for size s is a function G: {0,1}^t → F^m with the following property: ∀ 1 ≤ i ≤ m and all functions f: F^{i−1} → F^{ε^{−2}} with size-s circuits, Pr[∃j: f(G(U_t)_{1...i−1})_j = G(U_t)_i]
Fact: an O(ε)-q-ary PRG for size s can be transformed into a (regular) εm-PRG for size not much smaller than s
The Construction
We show: x is hard ⇒ at least one G_x^{(j)} is a q-ary PRG
Plan for building a PRG G_x: {0,1}^t → {0,1}^m:
• use a hard function x: {0,1}^{log n} → {0,1}
• let z be the low-degree extension of x
• obtain ℓ “candidate” PRGs, where ℓ = d(log q / log m), as follows: For 0 ≤ j < ℓ define G_x^{(j)}: {0,1}^{d log q} → F^m by G_x^{(j)}(v) = z(A^{1·m^j} v) ∘ z(A^{2·m^j} v) ∘ … ∘ z(A^{m·m^j} v), where A is a generator of F^d \ {0}
Note: G_x^{(j)} corresponds to using our q-ary extractor construction with the “successor function” A^{m^j}
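A sketch of the candidate generators, assuming z is given as an evaluation oracle on F_q^d and A as a generator matrix (prime q; matrix powers computed by square-and-multiply):

```python
def mat_pow_vec(A, e, v, q):
    """Compute A^e v over F_q by square-and-multiply on the matrix."""
    d = len(A)
    def mat_mul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(d)) % q for j in range(d)]
                for i in range(d)]
    P = [[int(i == j) for j in range(d)] for i in range(d)]   # identity
    B = A
    while e:
        if e & 1:
            P = mat_mul(P, B)
        B = mat_mul(B, B)
        e >>= 1
    return tuple(sum(P[i][k] * v[k] for k in range(d)) % q for i in range(d))

def candidate_prg(z, A, v, m, j, q):
    """The j-th candidate generator G_x^(j) evaluated at seed v: the extractor
    construction run with the successor A^(m^j)."""
    step = m ** j
    return [z(mat_pow_vec(A, (i + 1) * step, v, q)) for i in range(m)]
```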
Getting into Details
Let F' be a subfield of F of size h (note F'^d is a subset of F^d; think of F^d as both a vector space and the extension field of F).
Lemma: there exist invertible d×d matrices A and A' with entries from F which satisfy:
• ∀v ∈ F^d s.t. v ≠ 0, {A^i v}_i = F^d \ {0}
• ∀v ∈ F'^d s.t. v ≠ 0, {A'^i v}_i = F'^d \ {0}
• A' = A^p for p = (q^d − 1)/(h^d − 1)
• A and A' can be found in time q^{O(d)}
Proof: (in short, immediate from the correspondence between the cyclic multiplicative group of GF(q^d) and F^d \ {0}) In detail:
• There is a natural correspondence between F^d and GF(q^d), and between F'^d and GF(h^d).
• The multiplicative group of GF(q^d) is cyclic of order q^d − 1, i.e. there exists a generator g.
• g^p generates the unique subgroup of order h^d − 1, the multiplicative group of GF(h^d).
• A and A' are the linear transforms corresponding to multiplication by g and g^p respectively.
• Recall: For 0 ≤ j < ℓ, G_x^{(j)}: {0,1}^{d log q} → F^m is defined by G_x^{(j)}(v) = z(A^{1·m^j} v) ∘ z(A^{2·m^j} v) ∘ … ∘ z(A^{m·m^j} v)
• Require h^d > n.
• Define z as follows: z(A'^i · 1̄) = x(i), where 1̄ is the all-1s vector (low-degree extension).
• Since h^d > n, there are enough “slots” to embed all of x in a d-dimensional cube of size h^d,
• and since A' generates F'^d \ {0}, x is indeed embedded in a d-dimensional cube of size h^d.
• Note: h denotes the degree in the individual variables, and the total degree is at most hd.
• The computation of z from x can be done in poly(n, q^d) = q^{O(d)} time.
Theorem (PRG main): for every n, d, and h satisfying h^d > n, at least one of the G_x^{(j)} is an ε-q-ary PRG for size Ω(ε^{−4} h d² log² q). Furthermore, all the G_x^{(j)}s are computable in time poly(q^d, n) with oracle access to x.