Time-space tradeoff lower bounds for non-uniform computation

Paul Beame, University of Washington, 4 July 2000

Why study time-space tradeoffs? • To understand relationships between the two most critical measures of computation • unified comparison of algorithms with varying time and space requirements. • non-trivial tradeoffs arise frequently in practice • avoid storing intermediate results by re-computing them

e.g. Sorting n integers from [1, n^2] • Merge sort • S = O(n log n), T = O(n log n) • Radix sort • S = O(n log n), T = O(n) • Selection sort • only need - smallest value output so far - index of current element • S = O(log n), T = O(n^2)
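The selection-sort end of this tradeoff can be sketched in a few lines. This is an illustrative sketch, not taken from the talk: the function name `low_space_sorted` is hypothetical, the output is produced in increasing order (so the remembered value is the last one output), and the index of the last output is kept as well so that duplicate values are handled. The input is only read, never modified, and the working state is a constant number of values and indices, each O(log n) bits.

```python
def low_space_sorted(xs):
    """Yield the elements of xs in nondecreasing order.

    Working state: the (value, index) of the last output and of the current
    candidate, plus loop counters. Time is O(n^2) reads of the input.
    """
    n = len(xs)
    last = None  # (value, index) of the most recently output element
    for _ in range(n):
        best = None  # smallest (value, index) strictly greater than `last`
        for i in range(n):  # one read-only pass over the input
            key = (xs[i], i)
            if last is not None and key <= last:
                continue  # already output in an earlier round
            if best is None or key < best:
                best = key
        last = best
        yield best[0]
```

Each of the n rounds rescans the whole input instead of storing intermediate results, which is exactly the re-computation that buys the small space.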

Complexity theory • Hard problems • prove L ≠ P • prove non-trivial time lower bounds for natural decision problems in P • First step • Prove a space lower bound, e.g. S = ω(log n), given an upper bound on time T, e.g. T = O(n), for a natural problem in P

An annoyance • Time hierarchy theorems imply • unnatural problems in P not solvable in time O(n) • Makes ‘first step’ vacuous for unnatural problems

Non-uniform computation • Non-trivial time lower bounds still open for problems in P • First step still very interesting even without the restriction to natural problems • Can yield bounds with precise constants • But proving lower bounds may be harder

Talk outline • The right non-uniform model (for now) • branching programs • Early success • multi-output functions, e.g. sorting • Progress on problems in P • Crawling • restricted branching programs • That breakthrough first step (and more) • true time-space tradeoffs • The path ahead

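Branching programs • [Figure: a branching program drawn as a directed acyclic graph; internal nodes are labeled with input variables (x_1, x_2, x_3, x_4, x_5, x_7, x_8 appear), outgoing edges are labeled 0 and 1, and the two sinks are labeled 0 and 1]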

Branching programs • To compute f: {0,1}^n → {0,1} on input (x_1, …, x_n), follow the path from the source to a sink • [Figure: the same branching program, shown with the input x = (0,0,1,0,...)]
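Evaluation by path-following can be sketched as below. The representation is an assumption made for illustration: a program is a dictionary mapping a node name to `(variable index, successor on 0, successor on 1)`, and the sinks are the integers 0 and 1. The example program computes the XOR of two bits; it is not the program drawn on the slide.

```python
def evaluate(bp, source, x):
    """Follow the path that input x takes from `source` to a sink.

    Returns (sink value, number of variable queries made on the path).
    """
    node = source
    steps = 0
    while node not in (0, 1):  # 0 and 1 are the sinks
        var, succ0, succ1 = bp[node]
        node = succ1 if x[var] else succ0
        steps += 1
    return node, steps


# Hypothetical example: XOR of x[0] and x[1].
xor_bp = {
    "s": (0, "a", "b"),  # query x[0]
    "a": (1, 0, 1),      # x[0] = 0: output x[1]
    "b": (1, 1, 0),      # x[0] = 1: output NOT x[1]
}
```

For instance, `evaluate(xor_bp, "s", [1, 0])` follows s, then b, then reaches sink 1 after two queries.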

Branching program properties • Length = length of longest path • Size = # of nodes • Simulate TMs • node = configuration with input bits erased • time T = Length • space S = log_2(Size) = TM space + log_2 n (head) = space on an index TM • polysize = non-uniform L
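The two measures can be computed directly from a program description. The sketch below assumes a hypothetical dictionary representation (node name to `(variable index, successor on 0, successor on 1)`, sinks are the integers 0 and 1) and assumes that the two sinks count towards Size; neither choice comes from the talk.

```python
import math
from functools import lru_cache


def length_and_space(bp, source):
    """Return (Length, S) where Length is the longest source-to-sink path
    and S = log_2(Size), with Size counting internal nodes plus two sinks."""

    @lru_cache(maxsize=None)
    def longest(node):
        if node in (0, 1):  # sinks end the path
            return 0
        _, succ0, succ1 = bp[node]
        return 1 + max(longest(succ0), longest(succ1))

    size = len(bp) + 2  # assumption: both sinks are counted as nodes
    return longest(source), math.log2(size)


# Hypothetical example: XOR of two bits, 3 internal nodes.
xor_bp = {"s": (0, "a", "b"), "a": (1, 0, 1), "b": (1, 1, 0)}
```

Memoizing `longest` keeps the computation linear in the number of edges, since a branching program is a DAG and nodes are typically shared between paths.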

TM space complexity • Space = # of bits of working storage • [Figure: a Turing machine with a read-only input tape x_1 x_2 x_3 x_4 … x_n, working storage, and an output]

Branching program properties • Simulate random-access machines (RAMs) • not just sequential access • Generalizations • Multi-way version for xi in arbitrary domain D • good for modeling RAM input registers • Outputs on the edges • good for modeling output tape for multi-output functions such as sorting • BPs can be leveled w.l.o.g. • like adding a clock to a TM

Success for multi-output problems • Sorting • T·S = Ω(n^2 / log n) [Borodin-Cook 82] • T·S = Ω(n^2) [Beame 89] • Matrix-vector product • T·S = Ω(n^3) [Abrahamson 89] • Many others including • Matrix multiplication • Pattern matching

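Proof ideas: layers and trees • m outputs on input x • at least m/r outputs in some tree T_v • Only 2^S trees T_v • Typical claim • if T/r = εn, each tree T_v outputs p correct answers on only a c^{-p} fraction of inputs • Correct for all x implies 2^S c^{-m/r} is at least 1 • S = Ω(m/r) = Ω(mn/T) • [Figure: a leveled branching program of length T cut into r layers of height T/r at nodes v_0, v_1, …, v_{r-1}, v_r, with a tree T_v rooted at a node v, ending in sinks 0 and 1]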
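The counting step can be written out as follows. This is a sketch of the arithmetic only, using the slide's symbols (m outputs, r layers, at most 2^S trees, constants c > 1 and ε); reading the "typical claim" as a union bound over the trees is an assumption about how it is applied.

```latex
% On every input some tree T_v must produce at least m/r correct outputs.
% A fixed tree does so on at most a c^{-m/r} fraction of the inputs,
% and there are at most 2^S trees, so covering all inputs requires
\[
  1 \;\le\; 2^{S} \cdot c^{-m/r}
  \quad\Longrightarrow\quad
  S \;\ge\; \frac{m}{r}\,\log_2 c .
\]
% With layer height T/r = \varepsilon n, i.e. r = T/(\varepsilon n):
\[
  S \;\ge\; \frac{\varepsilon\, m n}{T}\,\log_2 c
  \;=\; \Omega\!\left(\frac{mn}{T}\right).
\]
```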

Limitation of the technique • Never more than T·S = Ω(nm), where m is the number of outputs • “It is unfortunately crucial to our proof that sorting requires many output bits, and it remains an interesting open question whether a similar lower bound can be made to apply to a set recognition problem, such as recognizing whether all n input numbers are distinct.” [Cook: Turing Award Lecture, 1983]

Talk outline • The right non-uniform model (for now) • branching programs • Early success • multi-output functions, e.g. sorting • Problems in P • Crawling • restricted branching programs • That breakthrough first step (and more) • true time-space tradeoffs • The path ahead

Restricted branching programs • Constant-width - only a constant number of nodes per level • [Chandra-Furst-Lipton 83] • Read-once - every variable read at most once per path • [Wegener 84], [Simon-Szegedy 89], etc. • Oblivious - same variable queried per level • [Babai-Pudlak-Rodl-Szemeredi 87], [Alon-Maass 87], [Babai-Nisan-Szegedy 89] • BDD = Oblivious read-once

BDDs and best-partition communication complexity • Given f: {0,1}^8 → {0,1} • Two-player game • Player A has {x_1, x_3, x_6, x_7} • Player B has {x_2, x_4, x_5, x_8} • Goal: communicate the fewest bits possible to compute f • Possible protocol: Player A sends the name of the node it reaches. • BDD space ≥ # of bits sent for best partition into A and B • [Figure: a BDD whose upper levels query x_7, x_1, x_6, x_3 (marked A) and whose lower levels query x_2, x_8, x_4, x_5 (marked B), ending in sinks 0 and 1]
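The "send the name of the node" protocol can be simulated on a small example. Everything concrete here is an assumption for illustration: the BDD is given as a list of levels `(variable index, {node: (successor on 0, successor on 1)})`, all of Player A's variables are read before any of Player B's, the last level's successors are the output values, and the function names are hypothetical. The example computes the parity of four bits, with A holding x[0], x[2] and B holding x[1], x[3].

```python
import math


def run_levels(levels, start, x):
    """Run an oblivious program level by level, starting from node `start`."""
    node = start
    for var, trans in levels:
        node = trans[node][x[var]]
    return node


def two_party_protocol(levels, cut, start, x):
    """A runs the first `cut` levels and sends the node reached; B finishes.

    Returns (output, number of bits in A's message).
    """
    a_levels, b_levels = levels[:cut], levels[cut:]
    message = run_levels(a_levels, start, x)  # A looks only at its own variables
    width = len(b_levels[0][1])               # nodes B could be handed at the cut
    bits = max(1, math.ceil(math.log2(width)))
    return run_levels(b_levels, message, x), bits


# Parity BDD: the node name is the parity of the bits read so far.
step = {0: (0, 1), 1: (1, 0)}
parity_levels = [(0, step), (2, step), (1, step), (3, step)]
```

Because the message only has to distinguish the nodes at the cut, its length is at most log_2 of the program's size, which is the direction of the inequality on the slide.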

Communication complexity ideas • Each conversation for f: {0,1}^A × {0,1}^B → {0,1} corresponds to a rectangle Y_A × Y_B of inputs, with Y_A ⊆ {0,1}^A and Y_B ⊆ {0,1}^B • BDD lower bounds • size ≥ min over (A,B) of the # of rectangles in a tiling of the inputs by f-constant rectangles with partition (A,B) • Read-once bounds • same tiling as BDD bounds but each rectangle in the tiling may have a different partition

Restricted branching programs • Read-k - no variable queried >k times on • any path - syntactic read-k • [Borodin-Razborov-Smolensky 89], [Okol’nishnikova 89], etc. • any consistent path - semantic read-k • many years of no results • nothing for general branching programs either

Uniform tradeoffs • SAT is not solvable using O(n^{1-ε}) space if time is n^{1+o(1)}. [Fortnow 97] • uses diagonalization • works for co-nondeterministic TMs • Extensions for SAT • S = log^{O(1)} n implies T = Ω(n^{1.4142...-ε}) deterministic [Lipton-Viglas 99] • with up to n^{o(1)} advice [Tourlakis 00] • S = O(n^{1-ε}) implies T = Ω(n^{1.618...-ε}). [Fortnow-van Melkebeek 00]

Non-uniform computation • [Beame-Saks-Thathachar FOCS 98] • Syntactic read-k branching programs exponentially weaker than semantic read-twice. • f(x) = “x^T M x = 0 (mod q)”, x ∈ GF(q)^n • ε n log log n time implies Ω(n log^{1-ε} n) space for q ~ n • f(x) = “x^T M x = 0 (mod 3)”, x ∈ {0,1}^n • 1.017n time implies Ω(n) space • first Boolean result above time n for general branching programs

Non-uniform computation • [Ajtai STOC 99] • 0.5 log n Hamming distance for x ∈ [1,n^2]^n • kn time implies Ω(n log n) space • follows from [Beame-Saks-Thathachar 98] • improved to Ω(n log n) time by [Pagter 00] • element distinctness for x ∈ [1,n^2]^n • kn time implies Ω(n) space • requires significant extension of techniques

That breakthrough first step! • [Ajtai FOCS 99] • f(x,y) = “x^T M_y x (mod 2)”, x ∈ {0,1}^n, y ∈ {0,1}^{2n-1} • kn time implies Ω(n) space • First result for non-uniform Boolean computation showing • time O(n) ⇒ space ω(log n)

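Ajtai’s Boolean function • f(x,y) = x^T M_y x (mod 2) • M_y is a modified Hankel matrix • [Figure: the n × n matrix M_y, whose entries are taken from y_1, y_2, …, y_{2n-1} and are constant along anti-diagonals, with some entries set to 0]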
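A small sketch of evaluating such a quadratic form over GF(2) follows. The slide does not say how the Hankel matrix is modified, so the choice below is an assumption made purely for illustration: entry (i, j) is y[i + j] (0-indexed) when i < j and 0 on and below the diagonal. Some modification is needed because a symmetric matrix would make the off-diagonal terms cancel mod 2. The function names are hypothetical.

```python
def modified_hankel(y, n):
    """Build an n x n 0/1 matrix from the 2n-1 bits of y.

    ASSUMPTION: the 'modification' keeps the Hankel entry y[i + j] strictly
    above the diagonal and puts 0 elsewhere; the talk does not specify this.
    """
    assert len(y) == 2 * n - 1
    return [[y[i + j] if i < j else 0 for j in range(n)] for i in range(n)]


def quad_form_mod2(x, y):
    """Return x^T M_y x (mod 2) for bit vectors x (length n), y (length 2n-1)."""
    n = len(x)
    m = modified_hankel(y, n)
    total = 0
    for i in range(n):
        for j in range(n):
            total += x[i] * m[i][j] * x[j]
    return total % 2
```

With this choice the form equals the sum over i < j of y[i + j]·x[i]·x[j] mod 2, so the function genuinely depends on products of pairs of x bits.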

Superlinear lower bounds • [Beame-Saks-Sun-Vee FOCS 00] • Extension to ε-error randomized non-uniform algorithms • Better time-space tradeoffs • Apply to both element distinctness and f(x,y) = “x^T M_y x (mod 2)”

(m,α)-rectangles • An (m,α)-rectangle R ⊆ D^X is a subset defined by disjoint sets A, B ⊆ X, σ ∈ D^{X−(A∪B)}, S_A ⊆ D^A, S_B ⊆ D^B such that • R = { z | z_{X−(A∪B)} = σ, z_A ∈ S_A, z_B ∈ S_B } • |A|, |B| ≥ m • |S_A|/|D^A|, |S_B|/|D^B| ≥ α
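The definition can be made concrete on a toy instance. The sketch below is illustrative only: the function names are hypothetical, variables are indexed 0..n-1, σ is a dictionary on the variables outside A ∪ B, and the size condition is read as |A|, |B| ≥ m with both densities at least α.

```python
def rectangle_points(n, A, B, sigma, S_A, S_B):
    """Enumerate R = { z : z agrees with sigma off A∪B, z_A in S_A, z_B in S_B }."""
    points = set()
    for a in S_A:
        for b in S_B:
            z = dict(sigma)          # fixed part outside A ∪ B
            z.update(zip(A, a))      # free choice on A
            z.update(zip(B, b))      # independent free choice on B
            points.add(tuple(z[i] for i in range(n)))
    return points


def is_m_alpha_rectangle(domain_size, A, B, S_A, S_B, m, alpha):
    """Check |A|,|B| >= m and that S_A, S_B have density >= alpha."""
    density_a = len(S_A) / domain_size ** len(A)
    density_b = len(S_B) / domain_size ** len(B)
    return min(len(A), len(B)) >= m and min(density_a, density_b) >= alpha
```

The product structure is visible in the double loop: every choice on A is combined with every choice on B, which is what makes the set a rectangle rather than an arbitrary subset.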

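An (m,α)-rectangle • S_A and S_B each have density at least α • In general A and B may be interleaved in [1,n] • [Figure: the variables x_1, …, x_n with two blocks A and B of size m and the remaining variables fixed to σ; S_A is shown as a subset of D^A and S_B as a subset of D^B]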

Key lemma [BST 98] • Let program P use • time T = kn • space S • accept fraction δ of its inputs in D^n • then P accepts all inputs in some (m,α)-rectangle where • m = βn • α is at least δ 2^{-4(k+1)m - (S+1)r} • β^{-1} ~ 2^k and r ~ k^2 2^k

Improved key lemma [Ajtai 99 s] • Let program P use • time T = kn • space S • accept fraction δ of its inputs in D^n • then P accepts all inputs in some (m,α)-rectangle where • m = βn • α is at least [bound lost in extraction] • β^{-1} and r are constants depending on k

Proving lower bounds using the key lemmas • Show that the desired function f • evaluates to 1 a large fraction of the time • i.e., δ is large • evaluates to 0 on some input in any large (m,α)-rectangle • where large is given by the lemma bounds • or ... do the same for ¬f

Our new key lemma • Let program P use time T = kn and space S, and accept fraction δ of its inputs in D^n • Almost all inputs P accepts are in (m,α)-rectangles accepted by P where • m = βn • α is at least [bound lost in extraction] • β^{-1} and r are [bound lost in extraction] • no input is in more than O(k) rectangles

Proving randomized lower bounds from our key lemma • Show that the desired function f • evaluates to 1 a large fraction of the time • i.e., δ is large • evaluates to 0 on a γ fraction of inputs in any large-enough (m,α)-rectangle • or ... do the same for ¬f • Gives space lower bound for O(γδ/k)-error randomized algorithms running in time kn

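Proof ideas: layers and trees • f = OR over all (v_1,…,v_{r-1}) of f_{(v_1,…,v_{r-1})} • # of (v_1,…,v_{r-1}) is 2^{S(r-1)} • f_{(v_1,…,v_{r-1})} = AND for i = 1, …, r of f_{v_{i-1} v_i} • each f_{v_{i-1} v_i} can be computed in kn/r height • [Figure: a leveled branching program of length kn cut into r layers of height kn/r at nodes v_0, v_1, v_2, …, v_{r-1}, v_r, ending in sinks 0 and 1]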

(r,ε)-decision forest • The conjunction of r decision trees (BPs that are trees) of height εn • Each f_{(v_1,…,v_{r-1})} is computed by an (r,k/r)-decision forest • Only 2^{S(r-1)} of them • The various f_{(v_1,…,v_{r-1})} accept disjoint sets of inputs

Decision forest • Assume wlog all variables read on every input • Fix an input x accepted by the forest • Each tree reads only a small fraction of the variables on input x • Fix two disjoint subsets of trees, F and G • [Figure: trees T_1, T_2, T_3, T_4, …, T_r, each of height kn/r]

Core variables • Can split the set of variables into • core(x,F) = variables read only in F (= not read outside F) • core(x,G) = variables read only in G (= not read outside G) • remaining variables • stem(x,F,G) = assignment to remaining variables • General idea: use core(x,F), core(x,G), and stem(x,F,G) to define (m,α)-rectangles • [Figure: trees T_1, T_2, T_3, T_4, …, each of height kn/r]
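The split into cores and stem can be computed mechanically once the set of variables each tree reads on x is known. The sketch below is an illustration under assumed conventions: a decision tree is either a Boolean leaf or a tuple `(variable index, subtree on 0, subtree on 1)`, a forest is a list of trees, F and G are lists of tree indices, and the function names are hypothetical.

```python
def run_tree(tree, x):
    """Follow x down one decision tree; return (leaf value, variables read)."""
    read = set()
    while not isinstance(tree, bool):
        var, if0, if1 = tree
        read.add(var)
        tree = if1 if x[var] else if0
    return tree, read


def cores_and_stem(forest, x, F, G):
    """Return (core(x,F), core(x,G), stem(x,F,G)) for tree-index sets F, G."""
    reads = [run_tree(t, x)[1] for t in forest]

    def read_only_in(group):
        inside = set().union(*(reads[i] for i in group))
        outside = set().union(
            *(reads[i] for i in range(len(forest)) if i not in group)
        )
        return inside - outside  # read in the group and nowhere else

    core_f = read_only_in(F)
    core_g = read_only_in(G)
    # The stem is the assignment x gives to every variable in neither core.
    stem = {i: x[i] for i in range(len(x)) if i not in core_f and i not in core_g}
    return core_f, core_g, stem
```

Note that which variables a tree reads depends on the input, so the cores are functions of x as well as of F and G, matching the notation core(x,F).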

A partition of accepted inputs • Fix F, G, and x accepted by P • R_{x,F,G} = { y | core(y,F) = core(x,F), core(y,G) = core(x,G), stem(y,F,G) = stem(x,F,G), and P accepts y } • For each F, G the R_{x,F,G} partition the accepted inputs into equivalence classes • Claim: the R_{x,F,G} are (m,α)-rectangles

Classes are rectangles • Let A = core(x,F), B = core(x,G), σ = stem(x,F,G) • S_A = { y_A | y in R_{x,F,G} }, S_B = { z_B | z in R_{x,F,G} } • Let w = (σ, y_A, z_B) • w agrees with y in all trees outside G • core(w,G) = core(y,G) = core(x,G) • w agrees with z in all trees outside F • core(w,F) = core(z,F) = core(x,F) • stem(w,F,G) = σ = stem(x,F,G) • P accepts w since it accepts y and z • So... w is in R_{x,F,G}

Few partitions suffice • Only 4k pairs F, G suffice to cover almost all inputs accepted by P by large (m,α)-rectangles R_{x,F,G} • Choose F, G uniformly at random of suitable size, depending on access pattern of input • probability that F, G isn’t good is tiny • one such pair will work for almost all inputs with the given access pattern • Only 4k sizes needed.

Special case: oblivious BPs • core(x,F), core(x,G) don’t depend on x • Choose T_i in F with prob q, in G with prob q, in neither with prob 1-2q
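The random choice of F and G for the oblivious case is simple to sketch. The function name and the explicit random-generator argument are assumptions for illustration; the talk does not fix a value of q, so it is left as a parameter.

```python
import random


def sample_F_G(r, q, rng):
    """Independently place each tree index 0..r-1 in F with probability q,
    in G with probability q, and in neither with probability 1 - 2q."""
    assert 0 <= q <= 0.5
    F, G = [], []
    for i in range(r):
        u = rng.random()  # uniform in [0, 1)
        if u < q:
            F.append(i)
        elif u < 2 * q:
            G.append(i)
    return F, G
```

A call such as `sample_F_G(20, 0.25, random.Random(0))` returns two disjoint lists of tree indices; passing a seeded generator keeps the choice reproducible.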

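x^T M_y x on an (m,α)-rectangle • For every σ on the variables outside A∪B, f(x_{A∪B}, σ, y) = x_A^T M_AB x_B + g(x_A, y) + h(x_B, y) • [Figure: the matrix M_y with the rows indexed by A and the columns indexed by B marked, showing the minor M_AB]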

Rectangles, rank, & rigidity • largest rectangle on which x_A^T M x_B is constant has α ≤ 2^{-rank(M)} • [Borodin-Razborov-Smolensky 89] • Lemma [Ajtai 99]: can fix y s.t. every βn × βn minor M_AB of M_y has rank(M_AB) ≥ cβn / log^2(1/β) • improvement of bounds of [Beame-Saks-Thathachar 98] & [Borodin-Razborov-Smolensky 89] for Sylvester matrices

High rank implies balance • For any rectangle S_A × S_B ⊆ {0,1}^A × {0,1}^B with μ(S_A × S_B) ≥ |A||B| 2^{3-rank(M)} • Pr[x_A^T M x_B = 1 | x_A ∈ S_A, x_B ∈ S_B] ≥ 1/32 • Pr[x_A^T M x_B = 0 | x_A ∈ S_A, x_B ∈ S_B] ≥ 1/32 • derived from result for inner product in r dimensions • So rigidity also implies balance for all large rectangles and so • Also follows for element distinctness • [Babai-Frankl-Simon 86]

Improving the bounds • What is the limit? • T = Ω(n log(n/S))? • T = Ω(n^2/S)? • Current bounds for general BPs are almost equal to best current bounds for oblivious BPs! • T = Ω(n log(n/S)) using 2-party CC [AM] • T = Ω(n log^2(n/S)) using multi-party CC [BNS]

Improving the bounds • (m,α)-rectangles are a 2-party CC idea • insight: generalizing to non-oblivious BPs • yields same bound as [AM] for oblivious BPs • Generalize to multi-party CC ideas to get better bounds for general BPs? • similar framework yields same bound as [BNS] for oblivious BPs • Improve oblivious BP lower bounds? • ideas other than communication complexity?

Extension to other problems • Problem should be hard for (best-partition) 2-party communication complexity (after most variables fixed). • try oblivious BPs first • Prime candidate: (directed) st-connectivity • Many non-uniform lower bounds in structured JAG models [Cook-Rackoff], [BBRRT], [Edmonds], [Barnes-Edmonds], [Achlioptas-Edmonds-Poon] • Best-partition communication complexity bounds known

Limitations of current method • Need n > T/r = decision tree height • else all functions trivial • so r > T/n • A decision forest works on a 2^{-Sr} fraction of the accepted inputs • only place space bound is used • So need Sr < n, else d.f. need only work on one input • implies ST/n < n, i.e. T < n^2/S
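The last implication is just the two constraints chained together. A short worked version, using only the symbols on the slide:

```latex
% From the tree-height constraint r > T/n and the forest constraint S r < n:
\[
  r > \frac{T}{n}
  \quad\text{and}\quad
  S r < n
  \quad\Longrightarrow\quad
  \frac{S\,T}{n} \;<\; S r \;<\; n
  \quad\Longrightarrow\quad
  T \;<\; \frac{n^{2}}{S}.
\]
```

So the method, as it stands, cannot give time lower bounds beyond n^2/S.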
