Circuit Depth & Space Complexity Slides by Michael Lewin & Robert Sayegh. Adapted from Oded Goldreich’s course lecture notes by Vered Rosen and Alon Rosen.
Introduction
In this lecture we will study some of the relations between Boolean circuits and Turing machines:
• We will define and explore the classes NC and AC.
• We will establish a strong connection between space complexity and the depth of circuits.
Boolean Circuits definitions 20.1.1
Definition: A Boolean circuit is a directed acyclic graph with labeled vertices:
• The input vertices are labeled with a variable x_i or a constant (0 or 1) and have fan-in 0.
• The gate vertices have fan-in k > 0 and are labeled with a Boolean function on k inputs (∧, ∨, ¬).
  - The ¬ gate always has fan-in 1.
• The output vertices are labeled ‘output’ and have fan-out 0.
Given an assignment σ ∈ {0,1}^m to the variables x_1,…,x_m, C(σ) denotes the value of the circuit’s output, obtained by assigning to each vertex the value of its Boolean operation.
Boolean Circuits definitions (2)
• Size(C) denotes the number of gates in the circuit C.
• Depth(C) denotes the maximum distance (the length of a longest path) from an input to an output.
• A bounded fan-in circuit is a circuit with an a priori upper bound on the fan-in of its AND and OR gates.
• An unbounded fan-in circuit is a circuit with no limit on the fan-in of its AND and OR gates.
Boolean Circuits observations 20.1.2
• Any circuit with bounded fan-in K can be transformed into a circuit with fan-in 2, paying only a constant factor in its depth and size.
• Using De Morgan’s laws, any circuit can be modified so that all negations appear only in the input layer.
• Any unbounded fan-in circuit can be put in a special form in which all ∧ and ∨ gates are organized into alternating layers, with edges only between adjacent layers.
Boolean Circuits observations (2)
• For any Turing machine M running on an input x ∈ {0,1}^n in time T_M(n), we can construct a circuit of size T_M^2(n) and of depth bounded by T_M(n).
• Circuits may be organized into disjoint layers, where each layer consists of gates at equal distance from the input vertices. Such circuit presentations capture a notion of parallel programming (more on this later).
Families of Circuits definitions 20.1.3
Definition: A language L is said to be decided by a family of circuits {C_n}, where C_n takes n variables as input, iff ∀n ∀x ∈ {0,1}^n: C_n(x) = L(x).
Definition: A family {C_n} has depth D and size S if
∀n: Size(C_n) ≤ s(n) and Depth(C_n) ≤ d(n),
where s(·) ∈ S and d(·) ∈ D.
Log-space uniformity
Definition: A family {C_n} is called log-space uniform if there exists a DTM M s.t. ∀n M(1^n) = <C_n> and M runs in space log(|<C_n>|), where <C_n> is the description of the circuit.
!! Note that M runs in space logarithmic in the output size, so that we can produce circuits of super-polynomial size.
• By requiring uniformity we correlate the size and depth of a family {C_n} that decides a language L with the complexity of a Turing machine for L.
!! Otherwise, a family of circuits with constant output (true or false) on input 1^n can easily decide languages even outside R (the decidable languages), simply by hard-wiring the truth table of the language.
Small-depth Circuits 20.2
Motivation: A small-depth circuit is a polynomial-size circuit whose depth is poly-logarithmic in its size. That is, a circuit with size = p(n) and depth = O(log^k n).
• Next we show that for such circuits unbounded fan-in does not add much power.
• We also show that such circuits capture the notion of efficient parallel computation.
Classes NC and AC 20.2.1
Definition (class NC): For k ≥ 0, NC^k is the class of languages that can be decided by families of bounded fan-in, small circuits: {C_n} s.t. size(C_n) = p(n) and depth(C_n) = O(log^k n).
Definition (class AC): For k ≥ 0, AC^k is the class of languages that can be decided by families of unbounded fan-in, small circuits: {C_n} s.t. size(C_n) = p(n) and depth(C_n) = O(log^k n).
NC = ∪_k NC^k, AC = ∪_k AC^k
AC = NC
Theorem: ∀k ≥ 0, NC^k ⊆ AC^k ⊆ NC^(k+1)
Proof:
• The first inclusion is trivial.
• The second inclusion is easy to observe: the fan-in of the gates in an AC^k circuit is bounded by poly(n), so each gate can be converted into a tree of identical gates with fan-in 2 and depth O(log n). Transforming all gates of the AC^k circuit, we get a circuit with fan-in 2 whose size grows by a polynomial factor and whose depth grows by a logarithmic factor. This is an NC^(k+1) circuit.
Open questions:
- Does the hierarchy collapse?
- What is the exact relation between NC and P?
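The gate-by-gate conversion in the proof can be sketched as follows. This is an illustrative sketch only: the nested-tuple representation and the names `binary_tree`, `tree_depth` and `tree_size` are assumptions, not part of the notes.

```python
def binary_tree(op, inputs):
    """Replace one fan-in-len(inputs) gate by a balanced tree of fan-in-2 gates.

    Returns a nested tuple (op, left, right); a single input is returned as is.
    """
    if len(inputs) == 1:
        return inputs[0]
    mid = len(inputs) // 2
    return (op, binary_tree(op, inputs[:mid]), binary_tree(op, inputs[mid:]))


def tree_depth(node):
    """Depth of the tree, counting fan-in-2 gates on a longest path."""
    if not isinstance(node, tuple):
        return 0
    return 1 + max(tree_depth(node[1]), tree_depth(node[2]))


def tree_size(node):
    """Number of fan-in-2 gates in the tree."""
    if not isinstance(node, tuple):
        return 0
    return 1 + tree_size(node[1]) + tree_size(node[2])
```

A gate of fan-in m becomes m-1 gates of fan-in 2 arranged in depth about log2(m), which is O(log n) when m is bounded by poly(n).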
AC^0 ⊊ NC^1 20.2.2
Theorem: AC^0 ⊊ NC^1
We next sketch a proof that the parity problem is in NC^1 but not in AC^0.
Definition: Parity(x_1,…,x_n) = Σ_i x_i (mod 2)
- The theorem implies that uniform-AC^0 ⊊ P.
- Whereas the question of whether uniform-NC^1 ⊊ NP is open!
AC^0 ⊊ NC^1 (2)
Claim: Parity ∈ NC^1
Proof: Parity can be computed by a binary tree of xor gates. We then replace each xor gate with three gates: a ⊕ b = (a ∧ ¬b) ∨ (¬a ∧ b). This increases the size by a factor of 3 and the depth by a factor of 2 (not counting negations). Consequently, Parity is computed by circuits of logarithmic depth and polynomial size; thus it is in NC^1.
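The construction in the claim can be checked on small inputs with a short sketch. The function names `xor` and `parity_tree` are illustrative assumptions; the depth returned counts xor levels, before each xor is expanded into AND/OR/NOT gates.

```python
def xor(a, b):
    """a XOR b using only AND, OR and NOT: (a AND NOT b) OR (NOT a AND b)."""
    return (a and not b) or (not a and b)


def parity_tree(bits):
    """Evaluate parity with a balanced binary tree of xor gates.

    Returns (value, depth), where depth is the number of xor levels used.
    """
    if len(bits) == 1:
        return bool(bits[0]), 0
    mid = len(bits) // 2
    left_value, left_depth = parity_tree(bits[:mid])
    right_value, right_depth = parity_tree(bits[mid:])
    return xor(left_value, right_value), 1 + max(left_depth, right_depth)
```

For n = 8 the tree has 3 xor levels, matching the logarithmic depth claimed above.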
AC^0 ⊊ NC^1 (3)
Claim: Parity ∉ AC^0
Proof (sketch): We show that every constant-depth circuit computing Parity must have at least sub-exponential (hence super-polynomial) size. Therefore Parity cannot be in AC^0. The proof is by induction on the depth d:
- First we prove that depth-2 circuits computing Parity must be large.
- Then we prove that a small depth-d circuit computing Parity can be converted into a small circuit of depth d-1, contradicting the induction hypothesis.
Theorem: For every constant d, a depth-d circuit computing Parity must have size exp(Ω(n^(1/(d-1)))).
AC^0 ⊊ NC^1 (4)
Base (d=2): Assume the circuit is an OR of AND gates (the argument applies symmetrically to an AND of ORs circuit):
- Any AND gate evaluating to ‘1’ determines the value of the circuit.
- Every AND gate must have fan-in n; otherwise it evaluates to ‘1’ independently of some x_i, and so accepts assignments of both parities.
- Thus each AND gate represents a single assignment of the input variables. There must be at least 2^(n-1) AND gates; otherwise there is an assignment σ = x_1,…,x_n with Parity(σ) = 1 on which no AND gate evaluates to ‘1’.
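The two facts used in the base case can be checked by brute force for small n. This is only a sanity-check sketch; the names `odd_assignments` and `term_accepts`, and the encoding of an AND gate as a dict from variable index to required bit, are assumptions for illustration.

```python
from itertools import product


def odd_assignments(n):
    """All assignments in {0,1}^n with Parity = 1."""
    return [x for x in product([0, 1], repeat=n) if sum(x) % 2 == 1]


def term_accepts(term, x):
    """An AND gate over literals: term maps variable index -> required bit."""
    return all(x[i] == bit for i, bit in term.items())


# An AND gate on n = 4 variables that does not mention x_3 (index 3).
SHORT_TERM = {0: 1, 1: 0, 2: 0}
```

`SHORT_TERM` accepts both (1,0,0,0) and (1,0,0,1), which have different parities, so such a gate cannot appear in an OR of ANDs computing Parity; and there are 2^(n-1) odd assignments, each needing its own full AND gate.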
AC^0 ⊊ NC^1 (5)
The induction step: The induction is based on Håstad's lemma: Given a depth-2 circuit, say an AND of ORs, if one gives random values to a randomly selected subset of the variables, then with very high probability the induced circuit can be written as an OR of ANDs.
• Given a depth-d circuit computing parity, we assign random values to a large number of its inputs. Consequently we obtain a simplified circuit with fewer variables that still computes parity (or its negation) on the remaining variables.
• By virtue of the lemma, we can interchange the two layers closest to the input, and then merge the two now-adjacent layers that have the same connective, thus decreasing the depth of the circuit to d-1.
AC^0 ⊊ NC^1 (6)
Formally: On input variables x_1,…,x_n, the random restriction treats each x_i independently as follows:
- w.p. (1-ρ)/2 set x_i = 0
- w.p. (1-ρ)/2 set x_i = 1
- w.p. ρ leave x_i as a variable
The expected number of remaining variables is m = ρn.
We would like the size of the transformed circuit to be smaller than exp(o(m^(1/(d-2)))) (the o here is little-o).
Requiring n^(1/(d-1)) ≤ m^(1/(d-2)) = (ρn)^(1/(d-2)), we get ρ ≥ n^(-1/(d-1)); thus we choose ρ = n^(-1/(d-1)).
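The random restriction can be simulated with a few lines of Python. This is a sketch under assumptions: the symbol for the probability of leaving a variable free is written `rho` here, and the names `random_restriction` and `restrict_parity` are hypothetical.

```python
import random


def random_restriction(n, rho, rng):
    """Restrict each of n variables independently.

    Returns a list whose i-th entry is 0 or 1 (variable fixed) or None
    (variable left free, which happens with probability rho).
    """
    restriction = []
    for _ in range(n):
        u = rng.random()
        if u < rho:
            restriction.append(None)          # leave x_i as a variable
        elif u < rho + (1 - rho) / 2:
            restriction.append(0)             # set x_i = 0
        else:
            restriction.append(1)             # set x_i = 1
    return restriction


def restrict_parity(restriction):
    """Parity under a restriction is parity of the free variables XOR a constant.

    Returns (indices of free variables, that constant bit).
    """
    free = [i for i, v in enumerate(restriction) if v is None]
    constant = sum(v for v in restriction if v is not None) % 2
    return free, constant
```

For example, with d = 3 and n = 10000 the choice rho = n ** (-1 / (d - 1)) is about 0.01, so roughly 100 variables remain free in expectation, and the restricted function is still parity (or its negation) on them.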
NC and Parallel Computation 20.2.3
Definition (PRAM): A PRAM consists of several independent RAMs, each having a separate set of registers. In addition, there is an infinite shared memory accessible by all RAMs.
We denote by PRAM(t(·), p(·)) the class of languages decidable by a PRAM working in parallel time t(·) and using p(·) processors.
A parallel computation is said to be efficient if it achieves an exponential drop in running time compared to sequential machines solving the same problem.
NC and Parallel Computation (2)
Theorem: uniform-NC = PRAM(polylog, poly)
The class NC captures the notion of efficient computation by PRAMs, similarly to the way the class P captures the notion of efficiency for RAM machines.
The class NC ignores two important aspects of parallel computation:
- Communication between processors.
- The real bottleneck, which is the number of processors. A PRAM(log^2 n, n) algorithm is more likely to be useful than a PRAM(log n, n^2) one.
Circuit Depth and Space Complexity 20.3
Definition: Depth/Size(d(·), s(·)) is the class of all languages that can be decided by a uniform family of bounded fan-in circuits of depth d(·) and size s(·).
Definition: Depth(d(·)) is the class of all languages that can be decided by a uniform family of bounded fan-in circuits of depth d(·).
• NC = Depth/Size(polylog, poly)
• NC ⊆ Depth(polylog)
- Actually, Depth(d(·)) = Depth/Size(d(·), 2^(O(d(·)))).
- Therefore, Depth(polylog) potentially contains languages that do not belong to NC.
Circuit Depth and Space Complexity (2)
Theorem: For any integer function s(·) ≥ log(·), NSPACE(s) ⊆ Depth/Size(O(s^2), 2^(O(s)))
Proof: (on the next slide we use a claim to continue the proof)
Given an s(n)-space NTM M, we construct a family {C_n} of depth O(s^2) and size 2^(O(s)) s.t. ∀x ∈ {0,1}*: C_|x|(x) = M(x).
Recall that the computation of M on x can be represented by the configuration graph G_{M,x}, so that deciding whether M accepts x is the problem of connectivity from the initial configuration vertex to the accepting configuration vertex.
Circuit Depth and Space Complexity (3)
Claim: CONN ∈ NC^2
Proof:
- Given a directed graph G on n vertices, let A be its adjacency matrix.
- Let B = A + I (allowing self loops).
- Let B^2_{i,j} = ∨_k (B_{i,k} ∧ B_{k,j}). Then B^2_{i,j} = 1 iff there is a path of length ≤ 2 from i to j.
- Using log n such Boolean squarings, we can compute the matrix B^n, which is the adjacency matrix of the (reflexive) transitive closure of A.
- Finally: the squaring operation is in AC^0, thus in NC^1. Therefore, log n NC^1 squarings are in NC^2.
Corollary: NL ⊆ NC^2
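The repeated-squaring argument can be sketched sequentially in Python. The function names `bool_square` and `reachability` are illustrative assumptions; the sketch computes the same matrices as the circuit but says nothing about parallel depth by itself.

```python
def bool_square(B):
    """Boolean matrix square: result[i][j] = OR over k of (B[i][k] AND B[k][j])."""
    n = len(B)
    return [
        [any(B[i][k] and B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def reachability(A):
    """Reflexive transitive closure of adjacency matrix A by repeated squaring.

    Returns (closure, number of squarings performed).
    """
    n = len(A)
    # B = A + I: add self loops so that paths of length <= l are captured.
    B = [[bool(A[i][j]) or i == j for j in range(n)] for i in range(n)]
    squarings = 0
    length = 1  # B currently describes paths of length <= length
    while length < n:
        B = bool_square(B)
        length *= 2
        squarings += 1
    return B, squarings
```

Each squaring doubles the path length covered, so about log2(n) squarings suffice; each entry of a square is an unbounded fan-in OR of fan-in-2 ANDs, which is the AC^0 step mentioned above.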
Circuit Depth and Space Complexity (4)
Proof (cont.): The circuit we build is a composition of two circuits.
- The first circuit generates G_{M,x} (i.e. the matrix A) for input x and M.
Given x and M, there are 2^(O(s)) configurations, each represented by O(s) bits. For each pair of configurations, we check whether they are adjacent by comparing the contents of the work tape in the two configurations, which is a depth O(log s) operation.
Circuit Depth and Space Complexity (5)
Proof (cont.):
- The second circuit accepts the matrix A as input and decides the CONN problem on G_{M,x}.
Since G_{M,x} has 2^(O(s)) vertices, the claim implies that CONN can be decided on G_{M,x} in depth O(s^2) and size 2^(O(s)).
Overall, we obtain a circuit C_n of depth O(s^2) and size 2^(O(s)) s.t. C_n(x) = M(x).
We have actually proved NSPACE(s) ⊆ Depth(O(s^2)).
Depth(d) ⊆ DSPACE(d)
Theorem: For any integer function d(·) ≥ log(·), Depth(d) ⊆ DSPACE(d)
Proof: Given a uniform family {C_n} of depth d(n), we construct a d(n)-space DTM M s.t. ∀x ∈ {0,1}*: M(x) = C_|x|(x).
The algorithm is the composition of two algorithms, each using O(d(n)) space.
Lemma: Let M1, M2 be two s(n)-space Turing machines. Then there exists an s(n)-space TM M that on input x outputs M2(M1(x)).
Depth(d) ⊆ DSPACE(d) (2)
The algorithm (given input x ∈ {0,1}^n):
1. Obtain a description of C_n
 • A list of gates and their predecessors
 • The description may be exponential in d(n) (as the number of gates is)
2. Evaluate C_n(x)
The proof proceeds by proving the following two claims:
Claim: <C_n> can be generated using O(d(n)) space.
Proof: By the uniformity of {C_n}, there exists a DTM M s.t. M(1^n) = <C_n> (the description of C_n), using space log(|<C_n>|). Since |<C_n>| ≤ 2^(O(d(n))), M uses O(d(n)) space as required.
Depth(d) ⊆ DSPACE(d) (3)
(Assuming fan-in = 2.)
Claim: Circuit evaluation for bounded fan-in circuits can be solved in space O(circuit depth).
Proof: Given a circuit C of depth d and an input x, we want to compute C(x). Our implementation is recursive. A natural recursion defines, for every node op in the circuit, Value(C_x, op):
- If the node is a leaf, simply return its value.
- Otherwise, Value(C_x, op) = Value(C_x, v) op Value(C_x, u), where v and u are the predecessors of op.
!! However, such a recursion consumes space O(d^2): there are O(d) recursion levels, and at each level we remember a vertex name, which also takes O(d) bits.
Depth(d) ⊆ DSPACE(d) (4)
• We represent each vertex by a path reaching it from the output vertex.
- The output vertex is represented by λ, the empty string.
- Then its right predecessor is represented by 0, and the left one by 1.
Consequently, each vertex is represented by a binary string of length O(d), and its predecessors are obtained by concatenating 0 and 1 respectively!
For a path representing a leaf or an operation node:
- If path is a leaf, then return its value.
- Otherwise, Value(C_x, path) = Value(C_x, path∘1) op Value(C_x, path∘0), where op is the operation at the vertex reached by path.
! At each recursion level, path determines precisely all the previous recursion levels; thus the space consumption is O(d).
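The path-based recursion can be sketched in Python as follows. The circuit encoding, the example `CIRCUIT`, and the names `resolve` and `value` are assumptions for illustration. The sketch keeps one shared `path` that is extended and shortened in place; Python's own call stack is not a faithful model of the O(d) space bound, so this only illustrates the addressing scheme.

```python
# Hypothetical description: vertex -> ("in", input index) or (op, left, right).
CIRCUIT = {
    "out": ("or", "g1", "g2"),
    "g1": ("and", "a", "b"),
    "g2": ("and", "b", "c"),
    "a": ("in", 0),
    "b": ("in", 1),
    "c": ("in", 2),
}


def resolve(circuit, path):
    """Follow `path` from the output: '1' = left predecessor, '0' = right."""
    vertex = "out"
    for step in path:
        _, left, right = circuit[vertex]
        vertex = left if step == "1" else right
    return vertex


def value(circuit, x, path):
    """Value of the vertex addressed by `path` (a list of '0'/'1') on input x."""
    node = circuit[resolve(circuit, path)]
    if node[0] == "in":
        return bool(x[node[1]])
    path.append("1")                 # descend to the left predecessor
    left = value(circuit, x, path)
    path.pop()
    path.append("0")                 # descend to the right predecessor
    right = value(circuit, x, path)
    path.pop()
    return (left and right) if node[0] == "and" else (left or right)
```

Calling `value(CIRCUIT, (1, 1, 0), [])` evaluates (a ∧ b) ∨ (b ∧ c). Note that the shared vertex `b` has two names, the paths 10 and 01, which is fine because a path only needs to identify a vertex, not name it uniquely.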
Corollary
• We summarize the relation between circuit depth and space complexity: NC^1 ⊆ L ⊆ NL ⊆ NC^2