Techniques for Time-Space Tradeoff Lower Bounds for Branching Programs: Part II
Paul Beame, University of Washington
Joint work with Erik Vee, Mike Saks, T.S. Jayram, Xiaodong Sun
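The Trace of an Input
• Partition a subset of the layers $L_j$ into sets $\mathcal{L}_1, \mathcal{L}_2$
• The trace of input $x$: the sequence of nodes reached on input $x$ as the computation moves from one set $\mathcal{L}_i$ to the other
• E.g. $\mathrm{trace}(x) = (v_1, v_2, v_3)$
• $a$ = length of trace = # of alternations in the partition
• $2^{Sa}$ possible traces
[Figure: branching program of length $kn$ with layers $L_1, \dots, L_5$, nodes $v_0, \dots, v_3$ on the path of $x$, and sinks 0 and 1]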
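Embedded $(m,\alpha)$-rectangles
• An embedded $(m,\alpha)$-rectangle $R \subseteq D^n$ is a subset defined by
  • disjoint sets $A, B \subseteq \{1,\dots,n\}$ (feet)
  • a partial assignment $\sigma \in D^{\overline{A \cup B}}$ (spine)
  • sets of assignments $R_A \subseteq D^A$, $R_B \subseteq D^B$ (legs)
• $R = \{ z \mid z_{\overline{A \cup B}} = \sigma,\ z_A \in R_A,\ z_B \in R_B \}$
• $|A|, |B| = m$
• $|R_A|/|D^A|,\ |R_B|/|D^B| \ge \alpha$ (density)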
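The definition of an embedded rectangle can be read as a membership test. The sketch below is only an illustration under assumed representations that are not part of the slides: an input `z` is a dict from variable index to value, the feet `A` and `B` are tuples of indices, the spine `sigma` is a dict on the remaining indices, and the legs `R_A`, `R_B` are sets of value tuples.

```python
from fractions import Fraction

def in_embedded_rectangle(z, A, B, sigma, R_A, R_B):
    """Return True iff z agrees with the spine and projects into both legs."""
    if set(A) & set(B):
        raise ValueError("feet A and B must be disjoint")
    on_spine = all(z[i] == v for i, v in sigma.items())
    z_A = tuple(z[i] for i in A)  # projection of z onto foot A
    z_B = tuple(z[i] for i in B)  # projection of z onto foot B
    return on_spine and z_A in R_A and z_B in R_B

def density(leg, domain_size, foot):
    """|R_S| / |D^S| for a leg R_S over the foot S."""
    return Fraction(len(leg), domain_size ** len(foot))

# Hypothetical toy instance: n = 4, D = {0, 1}, m = 1.
A, B = (1,), (3,)
sigma = {2: 0, 4: 1}
R_A = {(0,), (1,)}
R_B = {(1,)}
z_in = {1: 1, 2: 0, 3: 1, 4: 1}
z_out = {1: 1, 2: 1, 3: 1, 4: 1}  # disagrees with the spine on variable 2
```

In this toy instance the rectangle is an embedded $(1, 1/2)$-rectangle, since the smaller of the two leg densities is 1/2.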
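An embedded $(m,\alpha)$-rectangle
[Figure: variables $x_1, \dots, x_n$ with feet $A$ and $B$ of size $m$ each, legs $R_A \subseteq D^A$ and $R_B \subseteq D^B$, and spine $\sigma$ fixing the remaining coordinates of $D^n$]
• $R_A$ and $R_B$ each have density at least $\alpha$
• Wlog AB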
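Properties of a set of layers
• $r$ layers (of height $kn/r$)
• Let $\mathrm{Layers}(x,i)$ be the set of layers in which variable $x_i$ is read on input $x$
• For a set $\mathcal{L}$ of layers,
  • $\mathrm{unread}(x, \mathcal{L}) = \{ i : \mathrm{Layers}(x,i) \cap \mathcal{L} = \emptyset \}$
  • $\mathrm{core}(x, \mathcal{L}) = \{ i : \mathrm{Layers}(x,i) \subseteq \mathcal{L} \}$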
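As a concrete reading of these two definitions, here is a minimal sketch. The dict `layers_of_x` (variable index to the set of layers in which it is read on one input) and the layer numbering are hypothetical. The sketch also assumes a core variable must be read at least once, which only matters when some variable is never read.

```python
def unread(layers_of, layer_set):
    """Indices whose variable is read in no layer of layer_set."""
    return {i for i, ls in layers_of.items() if not (ls & layer_set)}

def core(layers_of, layer_set):
    """Indices whose variable is read at least once, and only in layers of layer_set.

    Requiring a non-empty set of layers is an assumption of this sketch.
    """
    return {i for i, ls in layers_of.items() if ls and ls <= layer_set}

# Hypothetical access record for one input x over layers 0..3.
layers_of_x = {1: {0, 2}, 2: {1}, 3: {0, 1, 3}, 4: {2}}
all_layers = {0, 1, 2, 3}
L1 = {0, 2}
L2 = {1}
```

When every variable is read at least once, `core(layers_of_x, L1)` coincides with `unread(layers_of_x, all_layers - L1)`, which the example satisfies.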
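Embedded rectangle partition $\mathcal{R}(\mathcal{L}_1,\mathcal{L}_2)$ of $f^{-1}(1)$ induced by $\mathcal{L}_1, \mathcal{L}_2$
• Two inputs $x, y \in f^{-1}(1)$ are equivalent iff
  • $\mathrm{trace}(x, \mathcal{L}_1,\mathcal{L}_2) = \mathrm{trace}(y, \mathcal{L}_1,\mathcal{L}_2)$
  • $\mathrm{core}(x, \mathcal{L}_1) = \mathrm{core}(y, \mathcal{L}_1)$
  • $\mathrm{core}(x, \mathcal{L}_2) = \mathrm{core}(y, \mathcal{L}_2)$
  • $\mathrm{stem}(x, \mathcal{L}_1,\mathcal{L}_2) = \mathrm{stem}(y, \mathcal{L}_1,\mathcal{L}_2)$, where $\mathrm{stem}(x, \mathcal{L}_1,\mathcal{L}_2)$ is the partial assignment that has the values of $x$ outside $\mathrm{core}(x, \mathcal{L}_1)$ and $\mathrm{core}(x, \mathcal{L}_2)$
• Fixing the trace and the two cores induces the partition into pseudo-rectangles we used before
• Fixing the stems fixes the common part of each pseudo-rectangle and produces the embedded rectangles we later reasoned about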
Previous argument
• Throw out all embedded rectangles in $\mathcal{R}(\mathcal{L}_1,\mathcal{L}_2)$ for which $|\mathrm{core}(\cdot, \mathcal{L}_1)|$ or $|\mathrm{core}(\cdot, \mathcal{L}_2)|$ is smaller than $m$
• Compute density bound $\alpha$ on what's left
• Problem with applying it to the Boolean case
  • The density bound $\alpha$ is too small
  • Denominator contains
• Better density bounds?
Boolean bounds
• This talk will cover
  • Density-bounding technique from [Ajtai 99a] with improvements from [B-Saks-Sun-Vee 00]
  • Yields density $2^{-\varepsilon m}$, which is large enough for the Boolean case
  • Yields
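Generalized method for choosing $\mathcal{L}_1, \mathcal{L}_2$
• Generalization of the method from [BRS 89], [BST 98]
• Distribution indexed by a probability $q \le 1/2$
  • $\Pr[L_i \in \mathcal{L}_1] = \Pr[L_i \in \mathcal{L}_2] = q$
  • $\Pr[L_i \notin \mathcal{L}_1 \cup \mathcal{L}_2] = 1 - 2q$
  • Independent for each $i$
• $E[|\mathrm{core}(x, \mathcal{L}_1)|] = E[|\mathrm{core}(x, \mathcal{L}_2)|] \ge n q^k$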
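Second Moment Method
• $\mathrm{Var}[|\mathrm{core}(x, \mathcal{L})|] \le (k^2 n/r)\, E[|\mathrm{core}(x, \mathcal{L})|] = (k^2 n/r)\,\mu$
• By Chebyshev's inequality
  • $\Pr[\mu/2 \le |\mathrm{core}(x, \mathcal{L})| \le 3\mu/2] \ge 1 - \mathrm{Var}[|\mathrm{core}(x, \mathcal{L})|]/(\mu/2)^2 \ge 1 - 4k^2(1/q)^k/r$, since $\mu \ge n q^k$
• Choose $r = 8k^2(1/q)^k$, which makes this probability at least 1/2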
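The arithmetic behind the choice of $r$ can be checked exactly. The sketch below takes the variance bound on this slide as given and only evaluates the resulting expressions; the per-variable layer counts in `layer_counts` are a hypothetical example with $n = 4$ and $k = 2$.

```python
from fractions import Fraction

def expected_core_size(layer_counts, q):
    """mu = sum over variables of q**t_i, where variable i is read in t_i layers."""
    return sum(q ** t for t in layer_counts)

def chebyshev_failure_bound(k, q, r):
    """The bound 4 k^2 (1/q)^k / r on Pr[|core| falls outside [mu/2, 3 mu/2]]."""
    return Fraction(4 * k * k, r) / q ** k

k = 2
q = Fraction(1, 4)
layer_counts = [1, 2, 3, 2]          # hypothetical: 8 = k*n reads in total
n = len(layer_counts)
r = 8 * k * k * int((1 / q) ** k)    # the slide's choice r = 8 k^2 (1/q)^k
mu = expected_core_size(layer_counts, q)
failure = chebyshev_failure_bound(k, q, r)
```

With these numbers `mu` is 25/64, which is at least $n q^k = 1/4$, and the failure bound evaluates to exactly 1/2.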
First fix the trace
• $f^{-1}(1)$ and $\mathcal{R}(\mathcal{L}_1,\mathcal{L}_2)$ are both disjoint unions over the $2^{Sa}$ choices of the trace
• We'll bound densities in each separately
• From now on, when working with a fixed partition, we will usually assume without saying so explicitly that the function $f = f_t$ for some trace $t$
A simplifying assumption
• On every input the BP reads every variable at least once
• Can easily ensure this by starting with $n$ dummy queries
• Why bother?
  • It gives an alternate characterization of $\mathrm{core}(x, \mathcal{L}_1)$
  • $\mathrm{core}(x, \mathcal{L}_1) = \mathrm{unread}(x, \overline{\mathcal{L}_1})$
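Analyzing density - key observations
• Embedded rectangle $R$ in $\mathcal{R}(\mathcal{L}_1,\mathcal{L}_2)$
• Every $x$ in $R$ has $A = \mathrm{core}(x, \mathcal{L}_1)$ and $B = \mathrm{core}(x, \mathcal{L}_2)$
• $|R_A|$ = # of ways of varying $x$ on $A$ and staying in $R$
• Let $\rho$ (the super-stem) be the part outside $A$ of some $x$ in $R$
• $|R_A|$ = # of ways of extending $\rho$ and staying in $R$
[Figure: rectangle with spine $\sigma$, super-stem $\rho$, and legs $R_A$, $R_B$]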
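How the cores can vary
[Figure: layers marked $\mathcal{L}_1$, $\mathcal{L}_2$, rest; nodes $v_0, \dots, v_4$ and sinks 0 and 1; the path of $x$ and the path of $y$, both agreeing with $\rho$]
• $i \in \mathrm{core}(x, \mathcal{L}_1)$, so $x_i$ is not read outside $\mathcal{L}_1$ on input $x$
• So $x_i$ is not read outside $\mathcal{L}_1$ on input $y$, hence $i \in \mathrm{core}(y, \mathcal{L}_1)$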
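Analyzing density - key observations
• Embedded rectangle $R$ in $\mathcal{R}(\mathcal{L}_1,\mathcal{L}_2)$
• Every $x$ in $R$ has $A = \mathrm{core}(x, \mathcal{L}_1)$ and $B = \mathrm{core}(x, \mathcal{L}_2)$
• $|R_A|$ = # of ways of varying $x$ on $A$ and staying in $R$
• Let $\rho$ (the super-stem) be the part outside $A$ of some $x$ in $R$
• $|R_A|$ = # of ways of extending $\rho$ and staying in $R$
• Any input $y \in D^n$ agreeing with $\rho$ has $A = \mathrm{core}(y, \mathcal{L}_1)$
[Figure: rectangle with spine $\sigma$, super-stem $\rho$, and legs $R_A$, $R_B$]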
Lower-bounding density of rectangles
• Look at rectangles that contain assignments in $f^{-1}(1) \cap (D^A \times \{\rho\})$
• $R^1_A, R^2_A, R^3_A, \dots$ partition the projection of $f^{-1}(1) \cap (D^A \times \{\rho\})$ on $A$
• To show that most inputs are in rectangles $R$ with large $|R_A|$ it suffices to show that
  • Any assignment $\rho \in \text{super-stems}(\mathcal{L}_1)$ is consistent with very few rectangles: $\mathrm{numrects}(\rho)$
  • I.e., show $\mathrm{numrects}(\rho)$ is small relative to $|D|^{n-|\rho|}$
Bounding $\mathrm{numrects}(\rho)$
• For $\rho \in \text{super-stems}(\mathcal{L}_1)$, any rectangle containing $\rho$ has the same $A = \mathrm{core}(\cdot, \mathcal{L}_1)$
• Only option is choice of $B = \mathrm{core}(\cdot, \mathcal{L}_2)$ since the stem will be fixed by $\rho$
• To count # of choices it suffices to show that $B \,\Delta\, B'$ is small for any rectangles $R, R'$ agreeing with $\rho$
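One standard way to turn a symmetric-difference bound into a count, which is an illustration rather than something stated on the slide: if every admissible $B'$ satisfies $|B \,\Delta\, B'| \le d$ for one fixed $B \subseteq \{1,\dots,n\}$, then there are at most $\sum_{j \le d} \binom{n}{j}$ choices of $B'$. The function name `num_sets_within` is hypothetical.

```python
from math import comb

def num_sets_within(n, d):
    """Number of subsets B' of an n-element ground set with |B Δ B'| <= d.

    The count does not depend on the fixed set B, because B' -> B Δ B'
    is a bijection on subsets.
    """
    return sum(comb(n, j) for j in range(d + 1))

small_example = num_sets_within(10, 2)  # 1 + 10 + 45
```

For small $d$ this is far below the $2^n$ subsets available without the symmetric-difference constraint.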
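New Goal: Bounding Symmetric Differences
• For $\rho \in \text{super-stems}(\mathcal{L}_1)$ and $x, y$ agreeing with $\rho$, show $|\mathrm{core}(x, \mathcal{L}_2) \,\Delta\, \mathrm{core}(y, \mathcal{L}_2)|$ is small
• ... and the same with the roles of $\mathcal{L}_1, \mathcal{L}_2$ reversed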
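How the cores can vary
[Figure: layers marked $\mathcal{L}_1$, $\mathcal{L}_2$, rest; nodes $v_0, \dots, v_4$ and sinks 0 and 1; the path of $x$ and the path of $y$, both agreeing with $\rho$]
• Variables read outside $\mathcal{L}_1$ are the same on $x$ and $y$ since all are set by $\rho$
• Only way $i \in \mathrm{core}(x, \mathcal{L}_2) \setminus \mathrm{core}(y, \mathcal{L}_2)$ is if $x_i$ is read in $\mathcal{L}_1$ on input $y$ but not on input $x$
• Key: variables in the symmetric difference are read more!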
Using the access pattern to bound the core difference
• Partition $f^{-1}(1)$ into classes depending on the access pattern of the input
• For $x \in f^{-1}(1)$ define $\mathrm{pattern}_x : [r] \to [n]$ given by
  • $\mathrm{pattern}_x(t) = \#\{ i : |\mathrm{Layers}(x,i)| = t \}$
  • number of variables read in exactly $t$ layers
• For each class $C$ will define $\mathcal{L}_1, \mathcal{L}_2$ so that
  • for all $x$ in $C$, variables read in $t$ layers will account for almost all of $\mathrm{core}(x, \mathcal{L}_1)$, $\mathrm{core}(x, \mathcal{L}_2)$
  • Variables in $\mathrm{core}(x, \mathcal{L}_2) \,\Delta\, \mathrm{core}(y, \mathcal{L}_2)$ will be read in $\ne t$ layers on input either $x$ or $y$
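The access pattern is just a histogram of how many layers read each variable. A minimal sketch, where `layers_of_x` (variable index to set of layers) is a hypothetical representation of one input's accesses:

```python
from collections import Counter

def access_pattern(layers_of, r):
    """pattern(t) = number of variables read in exactly t of the r layers, for t = 1..r."""
    counts = Counter(len(ls) for ls in layers_of.values())
    return {t: counts.get(t, 0) for t in range(1, r + 1)}

# Hypothetical access record over r = 4 layers.
layers_of_x = {1: {1, 2}, 2: {0}, 3: {0, 3}, 4: {1}}
pattern_x = access_pattern(layers_of_x, 4)
```

When every variable is read at least once, the values of the pattern sum to $n$.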
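More precise characterization
• For any $t$, $\mathrm{core}(x, \mathcal{L}_2) \,\Delta\, \mathrm{core}(y, \mathcal{L}_2)$ is contained in $G_2(x,t) \cup G_2(y,t) \cup H_2(x,t) \cup H_2(y,t)$ where
  • $i \in G_2(z,t)$ iff $i \in \mathrm{core}(z, \mathcal{L}_2)$ but $|\mathrm{Layers}(z,i)| \ne t$
  • $i \in H_2(z,t)$ iff $|\mathrm{Layers}(z,i) \cap \mathcal{L}_2| \ge t$, $|\mathrm{Layers}(z,i) \cap \mathcal{L}_1| \ge 1$, and $\mathrm{Layers}(z,i) \subseteq \mathcal{L}_1 \cup \mathcal{L}_2$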
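The containment can be checked on a toy pair of inputs. The relational symbols in the definitions of $G_2$ and $H_2$ did not survive in the slide text, so the sketch below assumes the reading "$\ne t$" for $G_2$ and "$\ge t$", "$\ge 1$", "$\subseteq$" for $H_2$. The access records for `x` and `y` are hypothetical and are built so that both inputs read the same variables outside $\mathcal{L}_1$.

```python
def core(layers_of, S):
    """Variables read at least once, and only in layers of S."""
    return {i for i, ls in layers_of.items() if ls and ls <= S}

def G2(layers_of, L2, t):
    """Core variables for L2 that are read in a number of layers other than t (assumed reading)."""
    return {i for i in core(layers_of, L2) if len(layers_of[i]) != t}

def H2(layers_of, L1, L2, t):
    """Variables read in >= t layers of L2, >= 1 layer of L1, and nowhere else (assumed reading)."""
    return {i for i, ls in layers_of.items()
            if len(ls & L2) >= t and len(ls & L1) >= 1 and ls <= (L1 | L2)}

L1, L2 = {0}, {1, 2}                         # layer 3 belongs to neither set
x = {1: {1, 2}, 2: {0}, 3: {0, 3}, 4: {1}}   # hypothetical accesses on input x
y = {1: {0, 1, 2}, 2: {0}, 3: {3}, 4: {1}}   # same reads outside L1, differs inside L1

sym_diff = core(x, L2) ^ core(y, L2)

def cover(t):
    return G2(x, L2, t) | G2(y, L2, t) | H2(x, L1, L2, t) | H2(y, L1, L2, t)
```

Here variable 1 is in the symmetric difference because input `y` reads it once more, inside $\mathcal{L}_1$, and it is picked up by $H_2(y,2)$.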
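Recall method for choosing $\mathcal{L}_1, \mathcal{L}_2$
• Distribution indexed by a probability $q \le 1/2$
  • $\Pr[L_i \in \mathcal{L}_1] = \Pr[L_i \in \mathcal{L}_2] = q$
  • $\Pr[L_i \notin \mathcal{L}_1 \cup \mathcal{L}_2] = 1 - 2q$
  • Independent for each $i$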
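Choosing the probabilities
• Claim: There is a set $Q$ of $2k$ probabilities $q$, each at least $k^{-16k}$, such that for almost all $z$ there is an integer $t = t(z) \le k$ with $E[|G_2(z,t) \cup H_2(z,t)|] \ll E[|\mathrm{core}(z, \mathcal{L}_2)|]$ for $\mathcal{L}_1, \mathcal{L}_2$ chosen from the distribution for $q$, where $q = q(z) \in Q$
• With these values $E[|\mathrm{core}(z, \mathcal{L}_2)|] \ge n q^k \ge n (k^{-16k})^k = n k^{-16k^2}$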
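Some issues
• Inputs $x$ and $y$ extending some $\rho \in \text{super-stems}(\mathcal{L}_1)$ may not have the same $q$ and $t$
  • We actually apply the above reasoning separately for disjoint subsets $I_{q,t} \subseteq f^{-1}(1)$ of inputs
• We can bound $|\mathrm{core}(x, \mathcal{L}_2) \,\Delta\, \mathrm{core}(y, \mathcal{L}_2)|$ relative to $\max\{|\mathrm{core}(x, \mathcal{L}_2)|, |\mathrm{core}(y, \mathcal{L}_2)|\}$ but need it in terms of $|\mathrm{core}(x, \mathcal{L}_1)| = |\mathrm{core}(y, \mathcal{L}_1)|$
  • Expectations of the cores of an input on $\mathcal{L}_1$ and $\mathcal{L}_2$ are the same, and concentration of $\mathrm{core}(z, \mathcal{L}_1)$ about its mean says these are similar for $x$ and $y$ because $\mathrm{core}(x, \mathcal{L}_1) = \mathrm{core}(y, \mathcal{L}_1)$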
Randomized Lower Bounds
• Recall: once $\mathcal{L}_1, \mathcal{L}_2$ are fixed we obtain the partition $\mathcal{R}(\mathcal{L}_1,\mathcal{L}_2)$ of $f^{-1}(1)$ into embedded rectangles
• We only keep the good part of each partition
• There are 2k choices of $\mathcal{L}_1, \mathcal{L}_2$ that suffice to cover most of $f^{-1}(1)$
• Each input in the good part of $f^{-1}(1)$ is contained in at most 2k embedded rectangles
• Implies original error multiplied at most 2k times when looking at embedded rectangles
• Works with initial error $O(1/k)$
Proof of the Claim: Tailoring $q$ to the access pattern to bound $G_2(z,t)$ and $H_2(z,t)$
• Let $p_t = \mathrm{pattern}_z(t) = \#\{ i : |\mathrm{Layers}(z,i)| = t \}$
• Define $\mu(z,q) = \sum_t p_t q^t$
  • Note $\mu(z,q) = E[|\mathrm{core}(z, \mathcal{L}_1)|] = E[|\mathrm{core}(z, \mathcal{L}_2)|]$
• Let $t(q)$ be the index of the largest term $p_t q^t$ in $\sum_t p_t q^t$
  • Pick the smallest such index if there are ties
• Want to choose $q$ so the term with index $t(q)$ is $\gg$ the rest
$\mu(z,q) = \sum_t p_t q^t$
• $t(q)$ is non-decreasing in $q$
  • Decreasing $q$ shifts weight away from terms with larger $t$
• If $q \le 1/(4k)$ then $t(q) \le k$
  • Since $\sum_t p_t = n$ it follows that $\sum_{t>k} p_t q^t \le n q^{k+1}$
  • $\sum_t p_t q^t = E[|\mathrm{core}(z, \mathcal{L}_1)|] \ge n q^k$
  • First $k$ terms add up to all but a $q \le 1/(4k)$ fraction of $\sum_t p_t q^t$
  • One of the first $k$ terms must be larger than all the other terms
Choices of $q$
• $Q = \{ q_b = k^{-8b} : 1 \le b \le 2k \}$
• Since $1 \le t(q_b) \le k$ and $t(q_{b+1}) \le t(q_b)$ are integral, by PHP there must be a $b$ such that $t(q_{b+1}) \ge t(q_b) \ge t(q_{b-1})$
• Set $q(z) = q_b$
  • $t(q_{b+1}) \ge t(q_b)$ implies term $\gg$ terms with smaller $t$
  • $t(q_b) \ge t(q_{b-1})$ implies term $\gg$ terms with larger $t$
• This bounds $G_2(z,t)$
• Bounding $H_2(z,t)$ is a little trickier since accesses are divided between $\mathcal{L}_1, \mathcal{L}_2$; this forces at least a factor of $k$ decrease between $q_b$ and $q_{b+1}$
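The selection of $q$ can be sketched directly from the definitions: compute $t(q_b)$ for each $q_b = k^{-8b}$ and look for an index $b$ at which the dominant term does not move between neighbouring values. The stability condition used below, $t(q_{b+1}) \ge t(q_b) \ge t(q_{b-1})$, is an assumed reading of the slide (the comparison symbols were lost), and the pattern is a hypothetical example.

```python
from fractions import Fraction

def t_of(pattern, q):
    """Smallest index t that maximizes pattern[t] * q**t."""
    best_t, best_val = None, None
    for t in sorted(pattern):
        val = pattern[t] * q ** t
        if best_val is None or val > best_val:  # strict '>' keeps the smallest index on ties
            best_t, best_val = t, val
    return best_t

def candidate_qs(k):
    """Q = { q_b = k^(-8b) : 1 <= b <= 2k }, keyed by b."""
    return {b: Fraction(1, k ** (8 * b)) for b in range(1, 2 * k + 1)}

def stable_bs(pattern, k):
    """Indices b with t(q_{b+1}) >= t(q_b) >= t(q_{b-1}) (assumed reading of the slide)."""
    t = {b: t_of(pattern, q) for b, q in candidate_qs(k).items()}
    return [b for b in range(2, 2 * k) if t[b + 1] >= t[b] >= t[b - 1]]

# Hypothetical pattern with k = 2: one variable read once, 1000 variables read twice.
pattern = {1: 1, 2: 1000}
ts = [t_of(pattern, q) for q in candidate_qs(2).values()]
```

In this example $t(q_b)$ drops from 2 to 1 as $b$ grows, so the dominant index moves to smaller $t$ as $q$ shrinks, and `stable_bs(pattern, 2)` picks out the index where it has settled.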
What functions are this hard?
• Computing $x^T M_y x \equiv 0 \pmod 2$ for $x \in \{0,1\}^n$, $y \in \{0,1\}^{2n-1}$
  • Defined in [Ajtai 99b]
• Given $x \in \{0,1\}^n$, compute the parity of the number of $(i,j)$ such that $x_i \wedge x_j \wedge x_{i+j}$ is true
  • By reduction from previous problem [Ajtai 99b]
• Element distinctness: Given $x \in [n^2]^n$ determine whether or not all $x_i$ are distinct.
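Two of these functions are easy to state as code, which makes the input domains concrete. This is a naive sketch, not a branching program. For the parity function it assumes 1-based indices, ordered pairs $(i,j)$, and only pairs with $i + j \le n$; the slide does not spell out those conventions.

```python
def element_distinctness(x):
    """True iff all entries of x are distinct."""
    return len(set(x)) == len(x)

def triple_parity(x):
    """Parity of the number of ordered pairs (i, j), i + j <= n, with x_i, x_j, x_{i+j} all 1.

    x holds the bits x_1..x_n in positions 0..n-1 (index conventions are assumptions).
    """
    n = len(x)
    count = 0
    for i in range(1, n + 1):
        for j in range(1, n - i + 1):
            if x[i - 1] and x[j - 1] and x[i + j - 1]:
                count += 1
    return count % 2

ed_yes = element_distinctness([3, 1, 4, 2])
ed_no = element_distinctness([3, 1, 3, 2])
parity = triple_parity([1, 1, 1, 0])  # pairs (1,1), (1,2), (2,1) qualify
```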
Why ED doesn't have large embedded rectangles
• Let $R_A \subseteq D^A$ and $R_B \subseteq D^B$ have density more than $2^{-|A|}$ and $2^{-|B|}$ respectively
  • Then more than $|D|/2$ elements of $D$ appear in $R_A$, and similarly for $R_B$
  • Rectangle contains non-distinct input vector
• If $|D| \ge n^2$ then $|ED^{-1}(1)| \ge |D^n|/e$
• Randomized bounds extend set-disjointness technique of [Babai-Frankl-Simon 86]
  • $n^{-2}$ error
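The claim that a constant fraction of inputs are distinct when $|D| \ge n^2$ can be checked exactly for small $n$. The sketch below computes the fraction $\prod_{i<n} (|D| - i)/|D|$ of all-distinct vectors in $D^n$; the function name is hypothetical.

```python
from fractions import Fraction
from math import e

def distinct_fraction(n, d):
    """Fraction of vectors in [d]^n whose n entries are all distinct."""
    frac = Fraction(1)
    for i in range(n):
        frac *= Fraction(d - i, d)  # the (i+1)-th entry must avoid i earlier values
    return frac

example = distinct_fraction(3, 9)  # n = 3, |D| = n^2 = 9
```

For $n = 3$ this gives 56/81, comfortably above $1/e$.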
The end
• Bounds for quadratic form based on rigidity argument [Ajtai 99b]
• Given rigidity, randomized bounds follow from discrepancy argument using pairwise independence (Lindsey's Lemma) [BSSV 00]
• Open: better bounds, more functions