
Reconstruction of Depth-3 circuits

Reconstruction of Depth-3 circuits. Amir Shpilka, Technion. Based on work with Zohar Karnin (Technion). Plan of talk: Background; Problem definition; Depth-3 circuits; Results; Proof idea: structural theorem for zero depth-3 circuits, reconstruction of depth-3 circuits.



  1. Reconstruction of Depth-3 circuits • Amir Shpilka, Technion • Based on work with Zohar Karnin (Technion)

  2. Plan of talk • Background • Problem definition • Depth-3 circuits • Results • Proof idea: • Structural theorem for zero depth-3 circuits • Reconstruction of Depth 3 circuits

  3. Reconstruction of arithmetic circuits • Input: Black-Box arithmetic circuit C, over a finite field F, computing a polynomial f(x_1,...,x_n) • Goal: Find a small circuit for f, using few queries • Motivation: natural problem, algebraic analog of learning • Caveat: queries from F or an extension field of F
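
The black-box access model on this slide can be sketched as a small wrapper that answers evaluation queries and counts them. This is only an illustration: the class name `BlackBox`, the prime `P`, and the hidden polynomial are assumptions made for the example, not part of the talk.

```python
P = 101  # hypothetical small prime; the field F is taken to be GF(P)


class BlackBox:
    """Query access to a hidden polynomial over GF(P), counting queries."""

    def __init__(self, poly):
        self._poly = poly   # hidden from the learner
        self.queries = 0    # number of evaluation queries made so far

    def query(self, point):
        """Return f(point) mod P and record one query."""
        self.queries += 1
        return self._poly(*point) % P


# Hidden polynomial (an assumption for this example):
# f(x1, x2, x3) = (x1 + x2) * (x2 + 2*x3) + 5
box = BlackBox(lambda x1, x2, x3: (x1 + x2) * (x2 + 2 * x3) + 5)
value = box.query((1, 2, 3))  # (1 + 2) * (2 + 6) + 5 = 29
```

A reconstruction algorithm only ever calls `query`; its cost is measured by `box.queries` together with its running time.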

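  4. Depth 3 circuits - ΣΠΣ(k) circuits • [Figure: depth-3 circuit; a top + gate of fan-in k sums multiplication gates M_1,...,M_k, each a product of linear functions L_{i,j} of the inputs x_1,...,x_n and the constant 1] • Depth-3 = sums of products of linear functions • L_{i,j} = Σ_{t=1..n} a_t·x_t + a_0 • M_i = Π_{j=1..d_i} L_{i,j} • C = Σ_{i=1..k} M_i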

  5. Depth 3 circuits - ΣΠΣ(k) circuits • [Figure: the same depth-3 circuit, top fan-in = k, with gates M_1,...,M_k and linear functions L_{1,1},...,L_{1,d}] • L_{i,j} = Σ_{t=1..n} a_t·x_t + a_0 • M_i = Π_{j=1..d_i} L_{i,j} • C = Σ_{i=1..k} M_i • Alternative view: shifted sparse polynomials • g = y_1^5·y_2^3···y_m^7 + ... (k monomials) • Replace each variable with a linear function in {x_i}
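
As a concrete reading of the definition, a depth-3 circuit can be stored as a list of multiplication gates, each gate being a list of linear functions given by their coefficients. The sketch below uses that representation; the function names and the example circuit are illustrative assumptions.

```python
from math import prod


def eval_linear(form, point):
    """form = (a_1, ..., a_n, a_0) stands for a_1*x_1 + ... + a_n*x_n + a_0."""
    *coeffs, const = form
    return sum(a * x for a, x in zip(coeffs, point)) + const


def eval_circuit(circuit, point):
    """Sum over multiplication gates of the product of their linear functions."""
    return sum(prod(eval_linear(L, point) for L in gate) for gate in circuit)


# Example with top fan-in k = 2 (chosen for illustration):
# C = (x1 + x2) * (x1 - x2) + (x2 + 1) * (x2 + 1)
C = [
    [(1, 1, 0), (1, -1, 0)],
    [(0, 1, 1), (0, 1, 1)],
]
result = eval_circuit(C, (3, 2))  # 5 * 1 + 3 * 3 = 14
```

In the "shifted sparse polynomial" view, each gate is a monomial in fresh variables y_1,...,y_m, and `eval_linear` plays the role of substituting a linear function for each y.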

  6. Why study depth-3 circuits? • Easiest model for which lower bounds are difficult (only Ω(n^2) over C) • Depth-4 circuits almost equivalent to general circuits [Agrawal Vinay] • Exponential lower bounds for depth-4 circuits imply exponential lower bounds for general circuits • Polynomial time black-box polynomial identity testing for depth-4 implies derandomization of identity testing for general circuits • Understanding depth-3 is important

  7. Known results for depth-3 circuits • Lower bounds: • Exponential lower bounds over finite fields [Grigoriev Karpinski, Grigoriev Razborov]. • Quadratic lower bounds over R, C [S Wigderson]. • Zero Testing (only for ΣΠΣ(k), with k=O(1)): • Polynomial time when the circuit is given to us [Kayal Saxena]. • Quasi-polynomial time in the black-box model [Karnin S]. • Recall: for depth-2 circuits everything is known. Depth-4 is closely related to the general case.

  8. Our results • Reconstruction of ΣΠΣ(k) circuits: quasi-polynomial time algorithm • Reconstruction of read-k depth-3 circuits (every variable appears in at most k linear functions): polynomial time algorithm • Corollary: polynomial time reconstruction of multilinear ΣΠΣ(k) circuits

  9. Comparison to previous results • Poly-time reconstruction of sparse polynomials (depth-2 circuits) [Ben-Or Tiwari], [Grigoriev Karpinski Singer], ..., [Klivans Spielman] • More generally: randomized reconstruction of polynomials whose "ordered" partial derivatives span a low dimensional space [Beimel Bergadano Bshouty Kushilevitz Varricchio], [Klivans S] • Reconstruction of Read-Once arith. formulas: • Poly-time randomized [Hancock Hellerstein], [Hancock Hellerstein Bshouty], [Bshouty Bshouty]. • Sub-exponential deterministic [S Volkovich] • Reconstruction of a circuit class C ⇒ ZPEXP^RP ⊄ C [Fortnow Klivans]

  10. Proof technique • Proof combines and extends several previous works: • Theorem on structure of zero ΣΠΣ(k) circuits [Dvir S] • Black-box zero testing of ΣΠΣ(k) circuits [Karnin S] • Reconstruction of ΣΠΣ(2) circuits [S] • Today: first give background on depth-3 circuits, then ΣΠΣ(2) circuits and finally (hopefully) cover ΣΠΣ(k) circuits

  11. What's next: • Structural theorem for zero ΣΠΣ(k) • Reconstruction of ΣΠΣ(2) • Reconstruction of ΣΠΣ(k)

  12. More on depth-3 circuits • Depth-3 = sum of products of linear functions • L_{i,j} = Σ_{t=1..n} a_t·x_t + a_0 • M_i = Π_{j=1..d_i} L_{i,j} • C = Σ_{i=1..k} M_i • M_i = multiplication gate = product of lin. functions • deg(C) = max_{i=1..k} deg(M_i) • gcd(C) = greatest common divisor of mult. gates = g.c.d.(M_1,M_2,...,M_k) • Note: gcd(C) = product of linear functions • Simplification: sim(C) := C/gcd(C), also depth-3 • Main def: rank(C) = dimension of span of linear functions in sim(C)

  13. Example • C = x^2·(y+x)·(z-x-y) + x·y·(z-x-y) - 2(z-x-y)·x^2 • M_1 = x^2·(y+x)·(z-x-y) • M_2 = x·y·(z-x-y) • M_3 = -2(z-x-y)·x^2 • deg(C) = 4 • gcd(C) = x·(z-x-y) • sim(C) = C/gcd(C) = x·(y+x) + y - 2x • rank(C) = dim(span{x, y+x, y, -2x}) = 2 • Note: without removing gcd, rank is 3. • Why define rank this way?
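
The quantities in this example can be checked mechanically. The sketch below is one possible way to do it, under the assumptions that each linear function is stored as its coefficient vector over (x, y, z), that scalar multiples such as the factor -2 are ignored (they do not affect gcd or rank), and that the helper names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from functools import reduce


def normalize(v):
    """Scale v so its first nonzero coefficient is 1 (identifies scalar multiples)."""
    lead = next(c for c in v if c != 0)
    return tuple(Fraction(c) / lead for c in v)


def rank(vectors):
    """Rank over the rationals by Gaussian elimination."""
    rows = [list(map(Fraction, v)) for v in vectors]
    r, ncols = 0, (len(rows[0]) if rows else 0)
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def gcd_and_sim(gates):
    """gcd = multiset of linear factors common to all gates; sim = gates with gcd removed."""
    counters = [Counter(normalize(L) for L in gate) for gate in gates]
    gcd = reduce(lambda a, b: a & b, counters)
    return gcd, [c - gcd for c in counters]


def circuit_rank(gates):
    _, sim = gcd_and_sim(gates)
    return rank([L for gate in sim for L in gate])


x, y, x_plus_y, w = (1, 0, 0), (0, 1, 0), (1, 1, 0), (-1, -1, 1)  # w = z - x - y
gates = [[x, x, x_plus_y, w], [x, y, w], [w, x, x]]                # M_1, M_2, M_3
gcd, _ = gcd_and_sim(gates)
rank_simplified = circuit_rank(gates)                               # 2, as on the slide
rank_with_gcd = rank([L for gate in gates for L in gate])           # 3, as in the note
```

The two ranks differ because the common factor z-x-y contributes a direction that is not present in sim(C).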

  14. Zero depth-3 circuits • C is zero: if C computes the zero polynomial • C is minimal: if no proper subset of multiplication gates sums to zero • Structural theorem: • if a degree d ΣΠΣ(k) circuit C is minimal and zero then: • rank(C) = O(log(d)^{k-2}) • Note: rank of an arbitrary ΣΠΣ(k) circuit can be n

  15. What is it good for? • Black-Box polynomial identity testing of ΣΠΣ(k) circuits in quasi-polynomial time [Karnin S] • Implies uniqueness: Corollary: If f is computed by a minimal ΣΠΣ(k) circuit C of rank Ω(log(d)^{2k-2}), then C is the unique ΣΠΣ(k) circuit for f. • Will play an important role later
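
For intuition about what "identity testing" means here, the sketch below checks whether a depth-3 circuit is zero by evaluating it at random points modulo a large prime. Note the substitution: this is the standard randomized test, not the deterministic quasi-polynomial black-box algorithm of [Karnin S] that the slide refers to. The prime `P`, the function names, and the example circuit are assumptions for the illustration.

```python
import random

P = 2**31 - 1  # a prime, assumed to be much larger than the degree d


def eval_circuit(circuit, point):
    """Evaluate a sum of products of linear forms (a_1, ..., a_n, a_0) modulo P."""
    total = 0
    for gate in circuit:
        term = 1
        for *coeffs, const in gate:
            value = sum(a * x for a, x in zip(coeffs, point)) + const
            term = term * value % P
        total = (total + term) % P
    return total


def probably_zero(circuit, n, trials=20, seed=0):
    """True if the circuit evaluates to 0 at every sampled random point."""
    rng = random.Random(seed)
    return all(
        eval_circuit(circuit, [rng.randrange(P) for _ in range(n)]) == 0
        for _ in range(trials)
    )


# (x + y)(x - y) + (-x)(x) + (y)(y) = 0: a zero circuit with three gates,
# and no proper subset of its gates sums to zero, so it is also minimal.
zero_circuit = [
    [(1, 1, 0), (1, -1, 0)],
    [(-1, 0, 0), (1, 0, 0)],
    [(0, 1, 0), (0, 1, 0)],
]
nonzero_circuit = zero_circuit[:2]  # equals -y^2, not the zero polynomial
```

Here the linear functions of the zero circuit span only {x, y}, so its rank is 2, in line with the structural theorem's message that minimal zero circuits have low rank.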

  16. What's next: • Structural theorem for zero ΣΠΣ(k) • Reconstruction of ΣΠΣ(2) • Reconstruction of ΣΠΣ(k)

  17. Reconstruction of (2) • Input: Black-Box holding a (2) circuit • C = M1 + M2 = L1(X)¢L2(X)Ld(X) + L'1(X)¢L'2(X)L'd(X) • Goal: Reconstruct C using a few queries. • Two different cases: • C is of low rank (i.e. rank(C) ≤ log(d)2) • C is of high rank (i.e. rank(C) ≥ log(d)2)

  18. High rank case • High level idea: if C = M_1 = L_1(X)·L_2(X)···L_d(X) then reconstruction = factoring. E.g. can use [Kaltofen] • Problem: C = M_1 + M_2 • Idea: eliminate M_2 by an appropriate restriction to a co-dim 1 space (i.e. make L'_1 vanish) • Problems: • How do we find such a subspace? • How do we reconstruct M_1? (we only have its restriction)

  19. High rank case cont. • Input: Black-Box holding a ΣΠΣ(2) circuit C = M_1 + M_2 = L_1(X)·L_2(X)···L_d(X) + L'_1(X)·L'_2(X)···L'_d(X) • Goal: Reconstruct C using a few queries. • Idea: eliminate M_2 by restriction to a subspace (i.e. make L'_1 vanish) • Problems: How to find such a subspace? • How do we reconstruct M_1? • Basic approach: First learn C|_V for low dimensional V. Then "lift" C|_V to C. • Intuition: If V is of low dimension then we can use brute-force search to eliminate M_2. • Problems: computing M_1|_V (solution: eliminate M_2 in many ways; requires high rank) and lifting C|_V to C (solved using the structural theorem)

  20. High rank case cont. • C = M_1 + M_2 = L_1(X)·L_2(X)···L_d(X) + L'_1(X)·L'_2(X)···L'_d(X) • High level algorithm: (1) Restrict the circuit to a random subspace V. (2) Guess linearly independent linear functions L'_1,...,L'_t from M_2|_V. (3) Restrict further to V_i = V|_{L'_i=0}. (4) Learn M_1|_{V_i} by factoring. (5) Glue the different factors together. (6) Lift the circuit found in step 5.
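
The key step, restricting to the subspace where one linear function of the second gate vanishes, can be illustrated numerically: on that subspace the second gate is zero, so the circuit agrees with the first gate and can be handed to a factoring routine. The sketch below only demonstrates this cancellation; the helper names and the example gates are assumptions, and it does not implement the guessing or factoring steps.

```python
from fractions import Fraction
from math import prod


def eval_linear(form, point):
    """form = (a_1, ..., a_n, a_0) stands for a_1*x_1 + ... + a_n*x_n + a_0."""
    *coeffs, const = form
    return sum(a * x for a, x in zip(coeffs, point)) + const


def eval_gate(gate, point):
    return prod(eval_linear(L, point) for L in gate)


def restrict_to_zero_set(form, free_point):
    """Extend values for x_1..x_{n-1} to a point on {form = 0} by solving for x_n.

    Assumes the coefficient of x_n in `form` is nonzero.
    """
    *coeffs, const = form
    partial = sum(a * x for a, x in zip(coeffs[:-1], free_point)) + const
    return list(free_point) + [-Fraction(partial) / coeffs[-1]]


# Example gates in three variables (chosen for illustration):
M1 = [(1, 0, 0, 0), (0, 1, 0, 1)]    # x1 * (x2 + 1)
M2 = [(1, 1, 1, 0), (1, -1, 0, 2)]   # (x1 + x2 + x3) * (x1 - x2 + 2)

point = restrict_to_zero_set(M2[0], (2, 5))       # forces x1 + x2 + x3 = 0
value_C = eval_gate(M1, point) + eval_gate(M2, point)
value_M1 = eval_gate(M1, point)
```

Every point produced by `restrict_to_zero_set(M2[0], ...)` lies in the co-dimension 1 space where M2 vanishes, so black-box queries to C at such points are really queries to the restriction of M1.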

  21. Gluing different factors • The Problem: we want to find N = Π_{i=1..d} L_i • Input: N_1 = N|_{x_1=0},...,N_t = N|_{x_t=0}, for large t • We want to reconstruct the matrix from its deck of column-deleted sub-matrices. • L_1 = a_{1,1}·x_1 + ... + a_{1,t}·x_t + a_{1,0} • L_2 = a_{2,1}·x_1 + ... + a_{2,t}·x_t + a_{2,0} • ... • L_d = a_{d,1}·x_1 + ... + a_{d,t}·x_t + a_{d,0}

  22. Gluing different factors • The Problem: we want to find N = Π_{i=1..d} L_i • Input: N_1 = N|_{x_1=0},...,N_t = N|_{x_t=0}, for large t • Idea: find L in N_1 and L' in N_2 that agree on coordinates 3,4,... and glue them together • Problem: maybe many such L'

  23. Gluing different factors • The Problem: we want to find N = Π_{i=1..d} L_i • Input: N_1 = N|_{x_1=0},...,N_t = N|_{x_t=0} • Idea: find L ∈ N_1, L' ∈ N_2 that agree on coordinates 3,4,5,...,t and glue them • Problem: maybe many such L' • Look at (**01) • Hard to tell which of the 4 values is missing (0001)

  24. Gluing different factors • The Problem: we want to find N = Π_{i=1..d} L_i • Input: N_1 = N|_{x_1=0},...,N_t = N|_{x_t=0} • Idea: find L ∈ N_1, L' ∈ N_2 that agree on coordinates 3,4,5,...,t and glue them • Problem: maybe many such L' • Idea: find L for which there is a unique L' • Problem: why does such an L exist? • Proof: isoperimetric inequality/information theory/lower bounds for locally-decodable-codes...

  25. Gluing different factors • The Problem: we want to find N = Π_{i=1..d} L_i • Input: N_1 = N|_{x_1=0},...,N_t = N|_{x_t=0} • Idea: find L ∈ N_1, L' ∈ N_2 that agree on coordinates 3,4,5,...,t and glue them • Problem: maybe many such L' • Idea: find L for which there is a unique L' • Problem: why does such an L exist? • Claim: If there is no such L then the rows give a subset of {0,1}^t with too many edges (contradicts the isoperimetric ineq.)

  26. Back to gluing different factors • The Problem: we want to find N = Π_{i=1..d} L_i • Input: N_1 = N|_{x_1=0},...,N_t = N|_{x_t=0} • Idea: find L ∈ N_1, L' ∈ N_2 that agree on coord. 3,4,5,...,t and glue them • Problem: maybe many such L' • Idea: find L for which there is a unique L' • Problem: why does such an L exist? • Claim: if there is no such L then the set of rows has too many edges • Proof: Consider L. If ∀i ∃L_i ≠ L in N_i agreeing with L on all other coordinates, then L has a neighbor in the i'th coordinate. If t is large then we have too many edges.
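
The gluing idea for two restrictions can be sketched as follows: group the rows of N_1 and N_2 by their coordinates 3,...,t, and glue only where the match is unique. This is a simplified illustration, assuming rows are 0/1 coefficient vectors already normalized (no scalar ambiguity) and ignoring constant terms; the function names are hypothetical, and the existence argument via the isoperimetric inequality is not reproduced.

```python
from collections import defaultdict


def delete_column(rows, j):
    """Simulate the restriction x_j = 0: zero out coordinate j of every row."""
    return [tuple(0 if i == j else a for i, a in enumerate(row)) for row in rows]


def glue_unique(n1, n2):
    """Glue rows of N_1 (column 0 deleted) and N_2 (column 1 deleted).

    A pair is glued only if it is the unique pair agreeing on coordinates 2, 3, ...
    Coordinate 0 is read from the N_2 row and coordinate 1 from the N_1 row.
    """
    by_tail_1, by_tail_2 = defaultdict(list), defaultdict(list)
    for row in n1:
        by_tail_1[row[2:]].append(row)
    for row in n2:
        by_tail_2[row[2:]].append(row)
    glued = []
    for tail, rows1 in by_tail_1.items():
        rows2 = by_tail_2.get(tail, [])
        if len(rows1) == 1 and len(rows2) == 1:
            glued.append((rows2[0][0], rows1[0][1]) + tail)
    return glued


# Three rows share the tail (0, 1), as in the (**01) example where (0001)
# is missing, so they cannot be glued from N_1 and N_2 alone.
rows = [(0, 1, 0, 1), (1, 0, 0, 1), (1, 1, 0, 1), (1, 0, 1, 1)]
n1 = delete_column(rows, 0)
n2 = delete_column(rows, 1)
recovered = glue_unique(n1, n2)  # only the row with the unique tail (1, 1)
```

The ambiguous rows are exactly the situation the slides describe; the claim is that when t is large some row always has a unique partner, so the process can make progress.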

  27. Lifting the circuit • So far: reconstructed M_1|_V. Implies reconstruction of C|_V. Need to lift C|_V. • Idea: Learn C on many low-dimensional subspaces • Let V_i = span{V, e_i}. • Find C|_{V_i} for i=1...n. • Glue the circuits together. • Problem: Maybe the circuits cannot be glued (i.e. many different equivalent ΣΠΣ(2) circuits) • Structural Theorem implies we can glue (in the high rank case)

  28. Lifting the circuit • Idea: Assume we can learn C|_V for low dim. V. • Let V_i = span{V, e_i}. • Find C|_{V_i} for i=1...n. • Glue the circuits together. • Problem: Maybe we cannot glue (e.g. many different equivalent circuits) • Claim: If rank(C' := C|_V) ≥ log(d)^2 then C' is unique • Proof: Assume C'' is another ΣΠΣ(2) circuit computing the same polynomial, so C' - C'' = 0. • By the structural theorem rank(C' - C'') < log(d)^2, a contradiction • Corollary: ∀i, (C|_{V_i})|_V ≡ C|_V. Glue together linear functions that look the same on V. • Fact: we succeed w.h.p. over the choice of V
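
"Glue together linear functions that look the same on V" can be sketched in a simplified setting. The assumptions here are loud ones: V is taken to be the span of the first m coordinates rather than a random subspace, linear functions are normalized so that restrictions match exactly, and the names `lift_forms` and `restrictions` are hypothetical.

```python
def lift_forms(forms_on_V, restrictions, n):
    """Lift linear forms from V = span{e_0..e_{m-1}} to all n coordinates.

    forms_on_V:   list of coefficient tuples of length m (the forms restricted to V).
    restrictions: dict i -> list of (coefficients on V, coefficient of x_i),
                  i.e. the forms learned on V_i = span{V, e_i}.
    Matching is by the V-part, so the V-parts must be pairwise distinct.
    """
    m = len(forms_on_V[0])
    if len(set(forms_on_V)) != len(forms_on_V):
        raise ValueError("restrictions to V are not distinct; gluing is ambiguous")
    lifted = {v: list(v) + [0] * (n - m) for v in forms_on_V}
    for i, forms in restrictions.items():
        for v_part, coeff in forms:
            lifted[v_part][i] = coeff  # fill in the coefficient of x_i
    return [tuple(lifted[v]) for v in forms_on_V]


# Hidden forms (for illustration): L1 = x0 + 2*x1 + 3*x2, L2 = x0 - x1 + 5*x3
forms_on_V = [(1, 2), (1, -1)]
restrictions = {
    2: [((1, 2), 3), ((1, -1), 0)],   # learned on V_2 = span{V, e_2}
    3: [((1, -1), 5), ((1, 2), 0)],   # learned on V_3 = span{V, e_3}, in another order
}
lifted = lift_forms(forms_on_V, restrictions, n=4)
```

The `ValueError` branch corresponds to the failure mode on the slide: if two linear functions become indistinguishable on V, the circuits on the V_i cannot be glued, which is why the subspace has to preserve enough of the circuit's structure.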

  29. Low rank case • C = M_1 + M_2 = L_1(X)·L_2(X)···L_d(X) + L'_1(X)·L'_2(X)···L'_d(X) • dim(span{L_1,...,L_d,L'_1,...,L'_d}) ≤ log(d)^2 • Observation: C can be written as a polynomial in log(d)^2 linear functions. • Reconstruction idea: find those log(d)^2 linear functions, and then do interpolation to find C. • Problem: Finding the relevant linear functions.

  30. Low rank case cont. • C = M_1 + M_2 = L_1(X)·L_2(X)···L_d(X) + L'_1(X)·L'_2(X)···L'_d(X) • dim(span{L_1,...,L_d,L'_1,...,L'_d}) ≤ log(d)^2 • Observation: C is a polynomial in log(d)^2 linear functions. • Reconstruction idea: find those log(d)^2 lin. functions, and interpolate C • Problem: Finding the relevant linear functions. • Algorithm sketch: • Pick a random subspace V of dimension 2·log(d)^2. • Brute force, find a basis {L_i}_{i=1..r} (over V) for C|_V • Find a polynomial Q s.t. C|_V = Q(L_1,...,L_r) • Fact: ∃! linear functions {L*_i}_{i=1..r} over F^n s.t. C = Q(L*_1,...,L*_r) and L*_i|_V = L_i • Lift {L_i}_{i=1..r} to a basis for C over F^n

  31. Reconstruction algorithm • Restrict to a random subspace • Guess high rank or low rank • high rank: • Learn one multiplication gate by looking at many restrictions to co-dim 1 subspaces, factoring the restricted circuit and gluing • Find the second gate by factoring • low rank: guess a basis and reconstruct • Lift the dimension by 1 • high rank: reconstruct using uniqueness • low rank: guess the lift of the basis • Verify using identity testing

  32. What's next: • Structural theorem for zero ΣΠΣ(k) • Black Box PIT for ΣΠΣ(k) • Reconstruction of ΣΠΣ(k)

  33. Higher values of 2 • Bad news: work so far is just a warm up... • How to generalize to ΣΠΣ(k)? • Two possible problems: • For k=2 there are different algorithms for low rank and high rank • A ΣΠΣ(k) circuit may be the sum of low rank and high rank circuits • In the case of high rank we "singled out" one gate • How do we single out a gate when there are many gates? Is it possible? • Need to understand the algorithm better

  34. New ideas • Canonical circuits: • We define a distance function for multiplication gates • Cluster "close by" multiplication gates • A cluster has low rank (after removing g.c.d.) • Theorem: every ΣΠΣ(k) circuit can be written uniquely as a sum of clusters. • Note: a ΣΠΣ(2) circuit is either low rank (one cluster) or high rank (two clusters) • Isolation lemma: can find many restrictions that eliminate all clusters but one.

  35. Canonical circuits • Notation: C_I is the sum of the gates in I • Clustering lemma: if C is ΣΠΣ(k) then ∃ partition I_1 ⊔ I_2 ⊔ ... ⊔ I_m = [k] s.t. • rank(C_{I_j}) ≤ log(d)^{a(k)} • ∀ l ≠ j, dist(C_{I_l}, C_{I_j}) := rank(C_{I_l} + C_{I_j}) ≥ log(d)^{A(k)} • Intuitively: A cluster is more "robust" than a multiplication gate • Theorem: If C = C_1 + ... + C_r and C = C'_1 + ... + C'_t are two clustered representations of f, then ∃ permutation π s.t. C_i = C'_{π(i)} (as polynomials) • Corollary: ∃ unique canonical representation for C
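
To make "distance" and "cluster" concrete, the sketch below measures the distance between two multiplication gates as the rank of their sum after removing common factors, and merges gates that are closer than a threshold. This is a deliberately simplified illustration, not the clustering procedure of the paper: the threshold `tau` stands in for the unspecified bounds log(d)^{a(k)} and log(d)^{A(k)}, merging is a plain union-find pass, and all names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def normalize(v):
    """Scale v so its first nonzero coefficient is 1."""
    lead = next(c for c in v if c != 0)
    return tuple(Fraction(c) / lead for c in v)


def rank(vectors):
    """Rank over the rationals by Gaussian elimination."""
    rows = [list(map(Fraction, v)) for v in vectors]
    r, ncols = 0, (len(rows[0]) if rows else 0)
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def pair_distance(g1, g2):
    """rank(sim(g1 + g2)): drop common linear factors, then take the span's dimension."""
    c1, c2 = Counter(map(normalize, g1)), Counter(map(normalize, g2))
    common = c1 & c2
    return rank(list(c1 - common) + list(c2 - common))


def cluster(gates, tau):
    """Merge gates whose pairwise distance is below tau (union-find)."""
    parent = list(range(len(gates)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(gates)):
        for j in range(i + 1, len(gates)):
            if pair_distance(gates[i], gates[j]) < tau:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(gates)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


e1, e2, e3, e4 = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
gates = [
    [e1, e2, e3],                                  # x1 * x2 * x3
    [e1, e2, e4],                                  # x1 * x2 * x4
    [(1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 1, 1)],    # (x1+x2)(x2+x3)(x3+x4)
]
clusters = cluster(gates, tau=3)
```

In this toy example the first two gates share the factors x1 and x2 and are at distance 2, while the third gate is at distance 4 from each of them, so it forms its own cluster.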

  36. More on canonical circuits • Def: V is D-rank-preserving for C if: • no two linearly independent linear functions in C are linearly dependent on V • rank(C_I|_V) ≥ min{rank(C_I), D} • Intuitively: linear functions remain as independent as possible • Theorem: If V is (log(d)^{2r-2})-rank-preserving for C and C = C_1 + ... + C_r is the canonical circuit for C then C|_V = C_1|_V + ... + C_r|_V is the (unique) canonical circuit for C|_V • Corollary: if we reconstruct the restrictions of the clusters to V, {C_i|_V}, and lift each cluster separately then we get C. • Note: this is similar in nature to the previous algorithm.

  37. ΣΠΣ(2) revisited • Algorithm has the following form: • restrict to a (log(d)^2)-rank-preserving subspace V • reconstruct the canonical circuit of C|_V • one cluster if the rank is low • two clusters if the rank is high • Lift each cluster separately to F^n • We shall generalize this view of the algorithm • Need to show how to learn a cluster

  38. Isolation lemma • Theorem: ∃ many "high"-dimensional subspaces V_i ⊆ V on which all but one cluster vanish. • Proof: main technical difficulty of the paper (generalizes main lemma of structural theorem) • Theorem: Given restrictions of a cluster {C_1|_{V_i}} there exists an efficient gluing algorithm that outputs C_1|_V. • Theorem: Lifting is possible due to uniqueness • Corollary: If we can find those subspaces then we can learn C. • Question: how to find such subspaces?

  39. Separating the clusters • Assume C|_V is on poly(log n) variables. • Question: how to single out a cluster in C? • Claim: ∃ many "high"-dimensional subspaces on which all but one cluster vanish. • Claim: If we can find those subspaces then we can learn the special cluster • Corollary: If we can find such a subspace then we can learn C. • Question: how to find such a subspace? • Answer: Go over all possible subspaces. Requires exp(poly(log n)) time • Question: how to verify that we have a cluster? • Answer: be patient and wait till the end...

  40. The reconstruction algorithm • Let V be a random poly(log(n)) dimensional subspace. Consider C|_V • Guess subspaces V_1,...,V_{t'} ⊆ V • Assume C|_{V_i} is a uni-cluster circuit • Learn C|_{V_i} by low-rank reconstruction • Glue {C|_{V_i}} with the gluing algorithm, to get C_1|_V • Recursively learn C'|_V = C|_V - C_1|_V • Verify correctness by Black-Box identity testing • Lift each cluster, separately, to C

  41. Final Remarks • We can make the above algorithm deterministic (i.e. can find rank-preserving subspace in an efficient way). • Can we break the O(1) (or actually o(n)) barrier on the number of multiplication gates?(both for identity testing and reconstruction)

  42. A concrete open problem • Tightness of structural thm: is it true that if C ≡ 0 is a simple and minimal ΣΠΣ(3) circuit then rank(C) = O(1)? • (over characteristic zero!) • Namely, if Π_i A_i(x) + Π_i B_i(x) + Π_i C_i(x) = 0, with no g.c.d., then rank{A_i,B_j,C_l} = O(1)? • If true then we get a Black-Box PIT in poly. time for ΣΠΣ(k) (over char. 0) • If true over a finite field, then it implies poly time reconstruction.

  43. Thank You
