Complexity Theory, Lecture 12. Lecturer: Moni Naor
Recap Last week: Hardness and Randomness • Semi-random sources • Extractors. This week: Finish Hardness and Randomness • Circuit Complexity • The class NC • Formulas = NC1 • Lower bound for Andreev's function • Communication characterization of depth
Derandomization A major research question: how to make efficient the construction of • a small sample space `resembling' a large one • hitting sets. Successful approach: randomness from hardness • (Cryptographic) pseudo-random generators • Complexity-oriented pseudo-random generators
Extending the result Theorem: if E contains 2^Ω(n)-unapproximable functions, then BPP = P. • The assumption is an average-case one • Based on non-uniformity Improvement: Theorem: if E contains functions that require size-2^Ω(n) circuits (for the worst case), then E contains 2^Ω(n)-unapproximable functions. Corollary: if E requires exponential-size circuits, then BPP = P.
How to extend the result • Recall the worst-case to average-case reduction for the permanent • The idea: encode the function in a form that lets you translate a few worst-case errors into random errors
Properties of a code Want a code C: {0,1}^(2^n) → {0,1}^(2^ℓ) where: • 2^ℓ is polynomial in 2^n • C is polynomial-time computable • efficient encoding • certain local decoding properties
Codes and Hardness • Use for worst-case to average-case: encode the truth table of f: {0,1}^n → {0,1} (worst-case hard) as the truth table of f': {0,1}^ℓ → {0,1} (average-case hard). With m_f the truth table of f, the truth table of f' is the codeword C(m_f).
Codes and Hardness • if 2^ℓ is polynomial in 2^n, then f ∈ E implies f' ∈ E • Want to be able to prove: if f' is s'-approximable, then f is computable by a circuit of size s = poly(s')
Codes and Hardness Key point: a circuit C that approximates f' implicitly defines a received word R_C that is not far from C(m_f) • Want the decoding procedure D to compute f exactly from R_C • Requires a special notion of efficient (local) decoding
Decoding requirements • Want that • for any received word R that is not far from C(m), • for any input bit 1 ≤ i ≤ 2^n, we can reconstruct m(i) with probability 2/3 by accessing only poly(n) locations in R. Example of a code with good local decoding properties: Hadamard, but it has exponential length. This gives a probabilistic circuit for f of size poly(n) · size(C) + the size of the decoding circuit. Since probabilistic circuits have deterministic versions of similar size: contradiction.
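As a concrete illustration of local decoding, here is a minimal sketch (my own toy code, not from the lecture) of the Hadamard code's classic two-query local decoder: to recover message bit i, pick a random position x and XOR the received bits at x and x⊕e_i, then take a majority vote over trials.

```python
import random

def hadamard_encode(m):
    # Codeword position x (an integer bitmask) holds <m, x> mod 2.
    k = len(m)
    return [sum(m[j] & ((x >> j) & 1) for j in range(k)) % 2
            for x in range(1 << k)]

def local_decode_bit(R, k, i, trials=101):
    # Two queries per trial: R[x] XOR R[x ^ e_i] equals m_i whenever
    # both queried positions are uncorrupted; majority-vote the trials.
    e_i = 1 << i
    votes = sum(R[x] ^ R[x ^ e_i]
                for x in (random.randrange(1 << k) for _ in range(trials)))
    return 1 if 2 * votes > trials else 0

random.seed(0)
m = [1, 0, 1, 1, 0, 1, 0, 0]
R = hadamard_encode(m)
for pos in random.sample(range(len(R)), len(R) // 20):
    R[pos] ^= 1                     # corrupt ~5% of the positions
decoded = [local_decode_bit(R, len(m), i) for i in range(len(m))]
print(decoded)
```

Note the exponential length the slide warns about: an 8-bit message already yields a 256-bit codeword.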
Extractor • Extractor: a universal procedure for "purifying" an imperfect source • The function Ext(x, y) should be efficiently computable • a truly random seed acts as a "catalyst" • Parameters: (n, k, m, t, ε). [Diagram: a source string x ∈ {0,1}^n, drawn from a source of 2^k strings, plus a truly random t-bit seed y, are mapped by Ext to a near-uniform m-bit output.]
Extractor: Definition (k, ε)-extractor: for all random variables X with min-entropy k: • the output fools all tests T: |Pr_z[T(z) = 1] − Pr_{y ∈_R {0,1}^t, x ← X}[T(Ext(x, y)) = 1]| ≤ ε • the distributions Ext(X, U_t) and U_m are ε-close (L1 distance ≤ 2ε), where U_m is the uniform distribution on {0,1}^m • Comparison to pseudo-random generators • the output of a PRG fools all efficient tests • the output of an extractor fools all tests
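To make ε-closeness concrete, here is a small sketch (my own toy example, not a construction from the lecture) that exhaustively measures the statistical distance between an extractor's output and uniform, using a one-bit inner-product extractor on a flat source:

```python
def stat_dist(p, q):
    # Statistical distance = half the L1 distance between distributions.
    keys = set(p) | set(q)
    return sum(abs(p.get(z, 0.0) - q.get(z, 0.0)) for z in keys) / 2

def ip_ext(x, y):
    # Toy one-bit extractor: Ext(x, y) = <x, y> mod 2.
    return bin(x & y).count("1") % 2

n = t = 6
k = 4
# Flat source: uniform over the first 2^k strings (min-entropy exactly k).
support = list(range(1 << k))
out = {0: 0.0, 1: 0.0}
for x in support:
    for y in range(1 << t):
        out[ip_ext(x, y)] += 1.0 / (len(support) * (1 << t))
eps = stat_dist(out, {0: 0.5, 1: 0.5})
print(eps)
```

The small bias here comes entirely from x = 0 (whose inner product with every seed is 0); every other source string yields a perfectly unbiased output bit over a uniform seed.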
Extractors: Applications • Using extractors • use the output in place of randomness in any application • alters the probability of any outcome by at most ε • Main motivation: • use the output in place of randomness in an algorithm • but how to get a truly random seed? • enumerate all seeds and take a majority vote
Extractor as a Graph [Diagram: bipartite graph with left side {0,1}^n and right side {0,1}^m; each left node has degree 2^t, one edge per seed.] Want every subset of size 2^k on the left to see almost all of the right-hand side with nearly equal probability.
Extractors: desired parameters • Goals (good / optimal): • short seed: O(log n) / log n + O(1) • long output: m = k^Ω(1) / m = k + t − O(1) • many k's: k = n^Ω(1) / any k = k(n). A short (logarithmic) seed allows going over all seeds.
Extractors • A random construction for Ext achieves the optimal parameters! • but we need explicit constructions • otherwise we cannot derandomize BPP • an optimal construction of extractors is still open • Trevisan Extractor: • idea: any string defines a function • a string C of length ℓ defines a function f_C: {1…ℓ} → {0,1} by f_C(i) = C[i] • use the NW generator with the source string in place of the hard function. From complexity to combinatorics!
Trevisan Extractor • Tools: • An error-correcting code C: {0,1}^n → {0,1}^ℓ • distance between codewords: (½ − ¼m^(−4))ℓ • important: in any ball of radius ½ − δ there are at most 1/δ^2 codewords, where δ = ½m^(−2) • block length ℓ = poly(n) • polynomial-time encoding • decoding time does not matter • An (a, h)-design S_1, S_2, …, S_m ⊆ {1…t} where • h = log ℓ • a = δ log n / 3 • t = O(log ℓ) • Construction: Ext(x, y) = C(x)[y|S_1] ◦ C(x)[y|S_2] ◦ … ◦ C(x)[y|S_m]
Trevisan Extractor Ext(x, y) = C(x)[y|S_1] ◦ C(x)[y|S_2] ◦ … ◦ C(x)[y|S_m] Theorem: Ext is an extractor for min-entropy k = n^δ, with • output length m = k^(1/3) • seed length t = O(log ℓ) = O(log n) • error ε ≤ 1/m. [Diagram: the seed y, restricted to each set S_i, selects one position of the codeword C(x).]
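A toy-scale sketch of the construction (my own parameter choices, far smaller than the theorem's, and a Hadamard code standing in for C): a greedily built weak design plus one codeword bit read per set S_i.

```python
from itertools import combinations

def greedy_design(m, t, h, a):
    # Greedily collect m size-h subsets of {0..t-1} with pairwise
    # intersections of size at most a (a weak (a, h)-design).
    sets = []
    for S in combinations(range(t), h):
        if all(len(set(S) & set(T)) <= a for T in sets):
            sets.append(S)
            if len(sets) == m:
                return sets
    raise ValueError("no design with these parameters")

def hadamard(x, h):
    # Stand-in code C: codeword of x has length 2^h, bit r = <x, r> mod 2.
    return [bin(x & r).count("1") % 2 for r in range(1 << h)]

def trevisan_ext(x, y, sets, h):
    # Ext(x, y) = C(x)[y|S_1] o ... o C(x)[y|S_m]
    cw = hadamard(x, h)
    out = []
    for S in sets:
        idx = 0
        for pos, coord in enumerate(S):   # restrict the seed y to S
            idx |= ((y >> coord) & 1) << pos
        out.append(cw[idx])
    return out

sets = greedy_design(m=3, t=8, h=4, a=2)
print(sets)
print(trevisan_ext(x=0b1011, y=0b10110101, sets=sets, h=4))
```

Each output bit reads the codeword at the position named by the seed bits in S_i; because the sets overlap in at most a coordinates, fixing the seed outside one set leaves the other restrictions ranging over few values, which is exactly what the proof below exploits.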
Proof of Trevisan Extractor Assume X ⊆ {0,1}^n is a min-entropy-k random variable failing to ε-pass a statistical test T: |Pr_z[T(z) = 1] − Pr_{x ← X, y ← {0,1}^t}[T(Ext(x, y)) = 1]| > ε. By the usual hybrid argument, there is a predictor A and an index 1 ≤ i ≤ m with: Pr_{x ← X, y ← {0,1}^t}[A(Ext(x, y)_{1…i−1}) = Ext(x, y)_i] > ½ + ε/m
The set on which A predicts well Consider the set B of x's such that Pr_{y ← {0,1}^t}[A(Ext(x, y)_{1…i−1}) = Ext(x, y)_i] > ½ + ε/2m. By averaging, Pr_x[x ∈ B] ≥ ε/2m. Since X has min-entropy k, there are at least (ε/2m) 2^k different x ∈ B. The contradiction will come from a succinct encoding of each x ∈ B.
…Proof of Trevisan Extractor (i, A and B are fixed) Fix the bits outside of S_i to an assignment α and let y' vary over all possible assignments to the bits in S_i. Then Ext(x, y')_i = C(x)[y'|S_i] = C(x)[y'] goes over all the bits of C(x). For every x ∈ B, a short description of a string z close to C(x): • fix the bits outside of S_i to α, preserving the advantage: Pr_{y'}[A(Ext(x, y')_{1…i−1}) = C(x)[y']] > ½ + ε/(2m), where α is the assignment to {1…t}\S_i maximizing the advantage of A • for j ≠ i, as y' varies, y'|S_j varies over only 2^a values! • so (i−1) tables of 2^a values suffice to supply Ext(x, y')_{1…i−1}
Trevisan Extractor: short description of a string z agreeing with C(x) The output of A is C(x)[y'] with probability ½ + ε/(2m) over y' ∈ {0,1}^(log ℓ).
…Proof of Trevisan Extractor Up to (m−1) tables of size 2^a describe a string z that has ½ + ε/(2m) agreement with C(x). • Johnson Bound: a binary code with distance (½ − δ^2)n has at most O(1/δ^2) codewords in any ball of radius (½ − δ)n; C has minimum distance (½ − ¼m^(−4))ℓ • Number of codewords of C agreeing with z on a ½ + ε/(2m) fraction of the places: O(1/δ^2) = O(m^4), so given z there are at most O(m^4) corresponding x's • Number of strings z with such a description: 2^((m−1)2^a) = 2^(n^(2δ/3)) = 2^(k^(2/3)) • Total number of x ∈ B: at most O(m^4) 2^(k^(2/3)) << 2^k (ε/2m), a contradiction.
Conclusion • Given a source of n random bits with min-entropy k = n^Ω(1), it is possible to run any BPP algorithm with it and obtain the correct answer with high probability
Application: strong error reduction • L ∈ BPP if there is a p.p.t. TM M: x ∈ L ⇒ Pr_y[M(x,y) accepts] ≥ 2/3; x ∉ L ⇒ Pr_y[M(x,y) rejects] ≥ 2/3 • Want: x ∈ L ⇒ Pr_y[M(x,y) accepts] ≥ 1 − 2^(−k); x ∉ L ⇒ Pr_y[M(x,y) rejects] ≥ 1 − 2^(−k) • Already know: repeat O(k) times and take the majority • This uses n = O(k)·|y| random bits; of the 2^n random strings, 2^(n−k) can be bad
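The naive O(k)-repetition amplification is easy to simulate; this sketch (with my own stand-in decider, not the lecture's M) estimates the error of a 75-vote majority over a 2/3-correct procedure:

```python
import random

def noisy_decider(x, rng):
    # Stand-in for M(x, y): decides "is x even?" correctly w.p. 2/3.
    correct = (x % 2 == 0)
    return correct if rng.random() < 2 / 3 else not correct

def majority_amplify(x, reps, rng):
    # Run the decider independently `reps` times; take the majority vote.
    votes = sum(noisy_decider(x, rng) for _ in range(reps))
    return 2 * votes > reps

rng = random.Random(1)
trials = 1000
errors = sum(not majority_amplify(6, 75, rng) for _ in range(trials))
print(f"empirical error rate of 75-vote majority: {errors / trials}")
```

A Chernoff bound puts the failure probability of a 75-vote majority near 10^-3 here, so almost all of the 1000 trials come out correct; the slide's point is that the fresh random bits this consumes grow linearly with the number of repetitions.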
Strong error reduction Better: let Ext be an extractor for k = |y|^3 = n^δ, ε < 1/6 • pick a random w ∈_R {0,1}^n • run M(x, Ext(w, z)) for all z ∈ {0,1}^t • take the majority of the answers • call w "bad" if maj_z M(x, Ext(w, z)) is incorrect, i.e. |Pr_z[M(x, Ext(w,z)) = b] − Pr_y[M(x,y) = b]| ≥ 1/6 • extractor property: at most 2^k bad w • n random bits; only 2^(n^δ) bad strings
Strong error reduction [Diagram: the bad strings for an input are those most of whose extractor-neighbors make the original randomized algorithm err; since every subset of size 2^k sees almost all of the right-hand side {0,1}^m with nearly equal probability, and the strings bad for the original run are at most 1/4 of the right-hand side, the set of bad w on the left {0,1}^n has size at most 2^k.]
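The seed-enumeration pattern can be sketched end to end; everything here (the decider M, the toy inner-product "extractor") is an illustrative stand-in of my own, not the construction from the slides:

```python
def ip_bit(a, b):
    # Inner product of two bit-vectors (as integers), mod 2.
    return bin(a & b).count("1") % 2

def M(x, r):
    # Stand-in randomized decider: answers "x % 3 == 0", but errs
    # whenever its 8-bit randomness r lies in a bad 1/4 of the strings.
    return (x % 3 == 0) ^ (r % 4 == 0)

def derandomized_run(x, w, t=8):
    # One imperfect n-bit sample w; enumerate every seed z, feed M the
    # extracted randomness, and take the majority of the 2^t answers.
    votes = 0
    for z in range(1 << t):
        # Toy "extractor": bit j of r is <byte j of w, z> mod 2.
        r = sum(ip_bit((w >> (8 * j)) & 0xFF, z) << j for j in range(8))
        votes += M(x, r)
    return 2 * votes > (1 << t)

w = 0x0123456789ABCDEF          # the single weak random sample
print(derandomized_run(9, w), derandomized_run(10, w))
```

Because the low two bytes of this w are nonzero and distinct, the first two extracted bits are jointly uniform over the seeds, so exactly 1/4 of the 256 runs hit M's bad set and the majority is correct for both inputs.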
Two Surveys on Extractors • Nisan and Ta-Shma, Extracting Randomness: A Survey and New Constructions 1999, (predates Trevisan) • Shaltiel, Recent developments in Extractors, 2002, www.wisdom.weizmann.ac.il/~ronens/papers/survey.ps Some of the slides based on C. Umans course: www.cs.caltech.edu/~umans/cs151-sp04/index.html
Circuit Complexity • We will consider several issues regarding circuit complexity
Parallelism • Refinement of polynomial time via (uniform) circuits • circuit depth corresponds to parallel time; circuit size corresponds to parallel work • the depth of a circuit is the length of the longest path from input to output; it represents circuit latency
Parallelism • the NC Hierarchy (of logspace-uniform circuits): NCk = O(log^k n)-depth, poly(n)-size circuits with bounded fan-in (2); NC = ∪_k NCk • Aim: to capture efficiently parallelizable problems • Not realistic? • overly generous in size • does not capture all aspects of parallelism • but does capture latency • sufficient for proving (presumed) lower bounds on best latency. What is NC0?
Matrix Multiplication (n×n matrix A, n×n matrix B, output the n×n matrix AB) • Parallel complexity of this problem? • work = poly(n) • time = log^k(n)? • which k?
Matrix Multiplication arithmetic matrix multiplication: A = (a_{i,k}), B = (b_{k,j}), (AB)_{i,j} = Σ_k (a_{i,k} × b_{k,j}) … vs. Boolean matrix multiplication: (AB)_{i,j} = ∨_k (a_{i,k} ∧ b_{k,j}) • single output bit: to make matrix multiplication a language, on input A, B, (i, j) output (AB)_{i,j}
Matrix Multiplication • Boolean Matrix Multiplication is in NC1 • level 1: compute n ANDs a_{i,k} ∧ b_{k,j} • next log n levels: a tree of ORs • n^2 such subtrees, one for each pair (i, j) • select the correct one and output
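A direct sketch of that structure (plain Python, helper names mine): each output bit is an OR over n AND gates, and a balanced binary OR-tree over n leaves has depth ⌈log2 n⌉.

```python
def bool_matmul(A, B):
    # (AB)[i][j] = OR over k of (A[i][k] AND B[k][j])
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def or_tree_depth(n):
    # Depth of a balanced binary OR tree with n leaves = ceil(log2 n).
    return (n - 1).bit_length()

A = [[1, 0], [1, 1]]
B = [[0, 1], [1, 0]]
print(bool_matmul(A, B))
print(or_tree_depth(8))
```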
Boolean formulas and NC1 • The circuit for Boolean Matrix Multiplication is actually a formula. • Formula: fan-out 1; the circuit looks like a tree. This is no accident: Theorem: L ∈ NC1 iff L is decidable by a polynomial-size uniform family of Boolean formulas.
Boolean formulas and NC1: from small-depth circuits to formulas • Proof: • convert an NC1 circuit into a formula • recursively • note: a logspace transformation • stack depth log n, stack record 1 bit: "left" or "right"
Boolean formulas and NC1: from formulas to small-depth circuits • convert a formula of size n into a formula of depth O(log n) • note: size ≤ 2^depth, so the new formula has poly(n) size. Key transformation: pick a subformula D of C and rewrite C = (D ∧ C_{D=1}) ∨ (¬D ∧ C_{D=0}), where C_{D=b} is C with the subtree D replaced by the constant b.
Boolean formulas and NC1 • take D to be any minimal subtree with size at least n/3 • minimality implies size(D) ≤ 2n/3 • define T(n) = maximum depth required for any size-n formula • C_{D=1}, C_{D=0}, D all have size ≤ 2n/3, so T(n) ≤ T(2n/3) + 3, which implies T(n) ≤ O(log n)
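The recurrence can be checked numerically; a small sketch (assuming the worst-case shrink factor 2/3 at every step) unrolls T(n) ≤ T(2n/3) + 3 and compares it with the closed form 3·log_{3/2}(n):

```python
import math

def depth_bound(n):
    # Unroll T(n) <= T(2n/3) + 3 with T(1) = 0.
    d = 0
    while n > 1:
        n = (2 * n) // 3
        d += 3
    return d

for n in [8, 64, 512, 4096]:
    print(n, depth_bound(n), round(3 * math.log(n, 1.5), 1))
```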
Relation to other classes • Clearly NC ⊆ P • P: uniform poly-size circuits • NC1 ⊆ Logspace: on input x, compose logspace algorithms for • generating C_{|x|} • converting it to a formula • FVAL(C_{|x|}, x) • FVAL: given a formula and an assignment, what is the value of the output? • logspace composes!
Relation to other classes • NL ⊆ NC2: Claim: directed S-T-CONN ∈ NC2 • given a directed graph G = (V, E) and vertices s, t • A = adjacency matrix (with self-loops) • (A^2)_{i,j} = 1 iff there is a path of length at most 2 from node i to node j • (A^n)_{i,j} = 1 iff there is a path of length at most n from node i to node j • repeated squaring! compute A^n with a tree of log n Boolean matrix multiplications of depth log n each, and output entry (s, t) • log^2 n depth in total
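A sequential sketch of the repeated-squaring idea (plain Python, function names mine): add self-loops, square the Boolean adjacency matrix ⌈log2 n⌉ times, and read off entry (s, t).

```python
def bool_matmul(A, B):
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def reachability(adj):
    # With self-loops, entry (i, j) of the 2^r-th Boolean power says
    # "there is a path of length at most 2^r from i to j".
    n = len(adj)
    A = [[1 if i == j else adj[i][j] for j in range(n)] for i in range(n)]
    for _ in range((n - 1).bit_length()):   # ceil(log2 n) squarings
        A = bool_matmul(A, A)
    return A

# Path 0 -> 1 -> 2 -> 3; no edges back.
adj = [[0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
R = reachability(adj)
print(R[0][3], R[3][0])
```

Each squaring is one Boolean matrix multiplication, hence depth O(log n) in circuit form; log n squarings give the O(log^2 n) total depth the slide claims.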
NC vs. P Can every efficient algorithm be efficiently parallelized? Is NC = P? • Common belief: NC ⊊ P
P-Completeness A language L is P-complete if: • L ∈ P • any other language in P is reducible to L via a logspace reduction. P-complete problems are the least likely to be parallelizable: if a P-complete problem is in NC, then P = NC • we use logspace reductions to show a problem P-complete, and we have seen Logspace ⊆ NC
Some P-Complete Problems • CVAL, the circuit value problem • given a circuit and an assignment, what is the value of the output of the circuit? • the canonical P-complete problem • Lexicographically first maximal independent set • Linear Programming • Finding a happy coloring of a graph
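CVAL is trivial to solve sequentially, gate by gate in topological order, and it is exactly this gate-by-gate dependence that resists parallelization. A minimal sketch with my own (hypothetical, non-standard) gate encoding:

```python
def cval(gates, inputs):
    # Evaluate a topologically ordered circuit. Each gate is one of
    # ("IN", i), ("NOT", a), ("AND", a, b), ("OR", a, b), where a and b
    # are indices of earlier gates; the last gate is the output.
    vals = []
    for g in gates:
        if g[0] == "IN":
            vals.append(inputs[g[1]])
        elif g[0] == "NOT":
            vals.append(1 - vals[g[1]])
        elif g[0] == "AND":
            vals.append(vals[g[1]] & vals[g[2]])
        else:                          # "OR"
            vals.append(vals[g[1]] | vals[g[2]])
    return vals[-1]

# (x0 AND x1) OR (NOT x2)
gates = [("IN", 0), ("IN", 1), ("IN", 2),
         ("AND", 0, 1), ("NOT", 2), ("OR", 3, 4)]
print(cval(gates, [1, 1, 0]), cval(gates, [0, 1, 1]))
```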
NC vs. P Can every uniform, poly-size Boolean circuit family be converted into a uniform, poly-size Boolean formula family? Is NC1 = P? Is the NC hierarchy proper: is it true that NCi ⊊ NCi+1 for all i? Define ACk = O(log^k n)-depth, poly(n)-size circuits with unbounded fan-in ∧ and ∨ gates. Is it true that ACi ⊊ NCi+1 ⊊ ACi+1?
Lower bounds • Recall: "NP does not have polynomial-size circuits" (NP ⊄ P/poly) implies P ≠ NP • Major goal: prove lower bounds on (non-uniform) circuit size for problems in NP • Belief: an exponential lower bound holds • a super-polynomial lower bound is enough for P ≠ NP • Best bound known: 4.5n • we don't even have super-polynomial bounds for problems in NEXP!
Lower bounds • lots of work on lower bounds for restricted classes of circuits • Formulas • out-degree of each gate is 1 • Monotone circuits • no NOTs (even at the input level) • Constant-depth circuits • polynomial size but unbounded fan-in
Counting argument for formulas • frustrating fact: almost all functions require huge formulas Theorem [Shannon]: With probability at least 1 − o(1), a random function f: {0,1}^n → {0,1} requires a formula of size Ω(2^n/log n).
Shannon's counting argument • Proof (counting): • B(n) = 2^(2^n) = # functions f: {0,1}^n → {0,1} • # formulas with n inputs and size s is at most F(n, s) ≤ 4^s · 2^s · (n+2)^s • 4^s binary trees with s internal nodes • 2 gate choices per internal node • n+2 choices per leaf
Shannon's counting argument • F(n, c2^n/log n) < (16n)^(c2^n/log n) = 2^((c2^n/log n)·log(16n)) = 2^((1+o(1)) c2^n) = o(1)·2^(2^n) (if c ≤ ½). So the probability that a random function has a formula of size s = (½)2^n/log n is at most F(n, s)/B(n) < o(1).
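The gap between the two counts can be sanity-checked numerically by comparing their base-2 logarithms; a small sketch with c = ½:

```python
import math

def log2_formula_bound(n, s):
    # log2 of the bound F(n, s) <= 4^s * 2^s * (n+2)^s.
    return s * (2 + 1 + math.log2(n + 2))

for n in [16, 24, 32]:
    s = 0.5 * 2**n / n              # size (1/2) * 2^n / log2(2^n)
    # Compare against log2 B(n) = 2^n: formulas of this size are
    # outnumbered by functions, so most functions have no such formula.
    print(n, log2_formula_bound(n, s) < 2.0**n)
```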
Andreev's function • best lower bound known for formulas: Theorem (Andreev, Håstad '93): the Andreev function requires (∧, ∨, ¬)-formulas of size Ω(n^(3−o(1))).