Complexity Theory, Lecture 12. Lecturer: Moni Naor
Recap Last week: Hardness and Randomness • Semi-random sources • Extractors. This week: Finish Hardness and Randomness • Circuit Complexity • The class NC • Formulas = NC1 • Lower bound for Andreev's function • Communication characterization of depth
Derandomization A major research question: how to make efficient the construction of • a small sample space `resembling' a large one • hitting sets. Successful approach: randomness from hardness • (Cryptographic) pseudo-random generators • Complexity-oriented pseudo-random generators
Extending the result Theorem: if E contains 2^Ω(n)-unapproximable functions, then BPP = P. • The assumption is an average-case one • Based on non-uniformity Improvement: Theorem: if E contains functions that require size-2^Ω(n) circuits (for the worst case), then E contains 2^Ω(n)-unapproximable functions. Corollary: if E requires exponential-size circuits, then BPP = P.
How to extend the result • Recall the worst-case to average-case reduction for the permanent • The idea: encode the function in a form that lets you translate a few worst-case errors into random errors
Properties of a code Want a code C: {0,1}^(2^n) → {0,1}^(2^ℓ) where: • 2^ℓ is polynomial in 2^n • C is polynomial-time computable • efficient encoding • certain local decoding properties
Codes and Hardness • Use for worst-case to average-case: encode the truth table of f: {0,1}^n → {0,1} (worst-case hard) as the truth table of f': {0,1}^ℓ → {0,1} (average-case hard). With m_f the truth table of f, the truth table of f' is the codeword C(m_f).
Codes and Hardness • if 2^ℓ is polynomial in 2^n, then f ∈ E implies f' ∈ E • Want to be able to prove: if f' is s'-approximable, then f is computable by a circuit of size s = poly(s')
Codes and Hardness Key point: a circuit C that approximates f' implicitly defines a received word R_C that is not far from C(m_f) • Want the decoding procedure D to compute f exactly from R_C • Requires a special notion of efficient (local) decoding
Decoding requirements • Want that • for any received word R that is not far from C(m), • for any input bit 1 ≤ i ≤ 2^n, we can reconstruct m(i) with probability 2/3 by accessing only poly(n) locations in R. Example of a code with good local decoding properties: Hadamard, but it has exponential length. This gives a probabilistic circuit for f of size poly(n) · size(C) + the size of the decoding circuit. Since probabilistic circuits have deterministic versions of similar size: contradiction.
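As a concrete illustration of local decoding, here is a minimal sketch (my own toy code, not from the lecture) of the Hadamard code's classic two-query local decoder: to recover message bit i, pick a random position x and XOR the received bits at x and x⊕e_i, then take a majority vote over trials.

```python
import random

def hadamard_encode(m):
    # Codeword position x (an integer bitmask) holds <m, x> mod 2.
    k = len(m)
    return [sum(m[j] & ((x >> j) & 1) for j in range(k)) % 2
            for x in range(1 << k)]

def local_decode_bit(R, k, i, trials=101):
    # Two queries per trial: R[x] XOR R[x ^ e_i] equals m_i whenever
    # both queried positions are uncorrupted; majority-vote the trials.
    e_i = 1 << i
    votes = sum(R[x] ^ R[x ^ e_i]
                for x in (random.randrange(1 << k) for _ in range(trials)))
    return 1 if 2 * votes > trials else 0

random.seed(0)
m = [1, 0, 1, 1, 0, 1, 0, 0]
R = hadamard_encode(m)
for pos in random.sample(range(len(R)), len(R) // 20):
    R[pos] ^= 1                     # corrupt ~5% of the positions
decoded = [local_decode_bit(R, len(m), i) for i in range(len(m))]
print(decoded)
```

Note the exponential length the slide warns about: an 8-bit message already yields a 256-bit codeword.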
Extractor • Extractor: a universal procedure for "purifying" an imperfect source • The function Ext(x, y) should be efficiently computable • a truly random seed acts as a "catalyst" • Parameters: (n, k, m, t, ε). [Diagram: a source string x ∈ {0,1}^n, drawn from a source of 2^k strings, plus a truly random t-bit seed y, are mapped by Ext to a near-uniform m-bit output.]
Extractor: Definition (k, ε)-extractor: for all random variables X with min-entropy k: • the output fools all tests T: |Pr_z[T(z) = 1] − Pr_{y ∈_R {0,1}^t, x ← X}[T(Ext(x, y)) = 1]| ≤ ε • the distributions Ext(X, U_t) and U_m are ε-close (L1 distance ≤ 2ε), where U_m is the uniform distribution on {0,1}^m • Comparison to pseudo-random generators • the output of a PRG fools all efficient tests • the output of an extractor fools all tests
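To make ε-closeness concrete, here is a small sketch (my own toy example, not a construction from the lecture) that exhaustively measures the statistical distance between an extractor's output and uniform, using a one-bit inner-product extractor on a flat source:

```python
def stat_dist(p, q):
    # Statistical distance = half the L1 distance between distributions.
    keys = set(p) | set(q)
    return sum(abs(p.get(z, 0.0) - q.get(z, 0.0)) for z in keys) / 2

def ip_ext(x, y):
    # Toy one-bit extractor: Ext(x, y) = <x, y> mod 2.
    return bin(x & y).count("1") % 2

n = t = 6
k = 4
# Flat source: uniform over the first 2^k strings (min-entropy exactly k).
support = list(range(1 << k))
out = {0: 0.0, 1: 0.0}
for x in support:
    for y in range(1 << t):
        out[ip_ext(x, y)] += 1.0 / (len(support) * (1 << t))
eps = stat_dist(out, {0: 0.5, 1: 0.5})
print(eps)
```

The small bias here comes entirely from x = 0 (whose inner product with every seed is 0); every other source string yields a perfectly unbiased output bit over a uniform seed.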
Extractors: Applications • Using extractors • use the output in place of randomness in any application • alters the probability of any outcome by at most ε • Main motivation: • use the output in place of randomness in an algorithm • but how to get a truly random seed? • enumerate all seeds and take a majority vote
Extractor as a Graph [Diagram: bipartite graph with left side {0,1}^n and right side {0,1}^m; each left node has degree 2^t, one edge per seed.] Want every subset of size 2^k on the left to see almost all of the right-hand side with nearly equal probability.
Extractors: desired parameters • Goals (good / optimal): • short seed: O(log n) / log n + O(1) • long output: m = k^Ω(1) / m = k + t − O(1) • many k's: k = n^Ω(1) / any k = k(n). A short (logarithmic) seed allows going over all seeds.
Extractors • A random construction for Ext achieves the optimal parameters! • but we need explicit constructions • otherwise we cannot derandomize BPP • an optimal construction of extractors is still open • Trevisan Extractor: • idea: any string defines a function • a string C of length ℓ defines a function f_C: {1…ℓ} → {0,1} by f_C(i) = C[i] • use the NW generator with the source string in place of the hard function. From complexity to combinatorics!
Trevisan Extractor • Tools: • An error-correcting code C: {0,1}^n → {0,1}^ℓ • distance between codewords: (½ − ¼m^(−4))ℓ • important: in any ball of radius ½ − δ there are at most 1/δ^2 codewords, where δ = ½m^(−2) • block length ℓ = poly(n) • polynomial-time encoding • decoding time does not matter • An (a, h)-design S_1, S_2, …, S_m ⊆ {1…t} where • h = log ℓ • a = δ log n / 3 • t = O(log ℓ) • Construction: Ext(x, y) = C(x)[y|S_1] ◦ C(x)[y|S_2] ◦ … ◦ C(x)[y|S_m]
Trevisan Extractor Ext(x, y) = C(x)[y|S_1] ◦ C(x)[y|S_2] ◦ … ◦ C(x)[y|S_m] Theorem: Ext is an extractor for min-entropy k = n^δ, with • output length m = k^(1/3) • seed length t = O(log ℓ) = O(log n) • error ε ≤ 1/m. [Diagram: the seed y, restricted to each set S_i, selects one position of the codeword C(x).]
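A toy-scale sketch of the construction (my own parameter choices, far smaller than the theorem's, and a Hadamard code standing in for C): a greedily built weak design plus one codeword bit read per set S_i.

```python
from itertools import combinations

def greedy_design(m, t, h, a):
    # Greedily collect m size-h subsets of {0..t-1} with pairwise
    # intersections of size at most a (a weak (a, h)-design).
    sets = []
    for S in combinations(range(t), h):
        if all(len(set(S) & set(T)) <= a for T in sets):
            sets.append(S)
            if len(sets) == m:
                return sets
    raise ValueError("no design with these parameters")

def hadamard(x, h):
    # Stand-in code C: codeword of x has length 2^h, bit r = <x, r> mod 2.
    return [bin(x & r).count("1") % 2 for r in range(1 << h)]

def trevisan_ext(x, y, sets, h):
    # Ext(x, y) = C(x)[y|S_1] o ... o C(x)[y|S_m]
    cw = hadamard(x, h)
    out = []
    for S in sets:
        idx = 0
        for pos, coord in enumerate(S):   # restrict the seed y to S
            idx |= ((y >> coord) & 1) << pos
        out.append(cw[idx])
    return out

sets = greedy_design(m=3, t=8, h=4, a=2)
print(sets)
print(trevisan_ext(x=0b1011, y=0b10110101, sets=sets, h=4))
```

Each output bit reads the codeword at the position named by the seed bits in S_i; because the sets overlap in at most a coordinates, fixing the seed outside one set leaves the other restrictions ranging over few values, which is exactly what the proof below exploits.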
Proof of Trevisan Extractor Assume X ⊆ {0,1}^n is a min-entropy-k random variable failing to ε-pass a statistical test T: |Pr_z[T(z) = 1] − Pr_{x ← X, y ← {0,1}^t}[T(Ext(x, y)) = 1]| > ε. By the usual hybrid argument, there is a predictor A and an index 1 ≤ i ≤ m with: Pr_{x ← X, y ← {0,1}^t}[A(Ext(x, y)_{1…i−1}) = Ext(x, y)_i] > ½ + ε/m
The set on which A predicts well Consider the set B of x's such that Pr_{y ← {0,1}^t}[A(Ext(x, y)_{1…i−1}) = Ext(x, y)_i] > ½ + ε/2m. By averaging, Pr_x[x ∈ B] ≥ ε/2m. Since X has min-entropy k, there are at least (ε/2m) 2^k different x ∈ B. The contradiction will come from a succinct encoding of each x ∈ B.
…Proof of Trevisan Extractor (i, A and B are fixed) Fix the bits outside of S_i to an assignment α and let y' vary over all possible assignments to the bits in S_i. Then Ext(x, y')_i = C(x)[y'|S_i] = C(x)[y'] goes over all the bits of C(x). For every x ∈ B, a short description of a string z close to C(x): • fix the bits outside of S_i to α, preserving the advantage: Pr_{y'}[A(Ext(x, y')_{1…i−1}) = C(x)[y']] > ½ + ε/(2m), where α is the assignment to {1…t}\S_i maximizing the advantage of A • for j ≠ i, as y' varies, y'|S_j varies over only 2^a values! • so (i−1) tables of 2^a values suffice to supply Ext(x, y')_{1…i−1}
Trevisan Extractor: short description of a string z agreeing with C(x) The output of A is C(x)[y'] with probability ½ + ε/(2m) over y' ∈ {0,1}^(log ℓ).
…Proof of Trevisan Extractor Up to (m−1) tables of size 2^a describe a string z that has ½ + ε/(2m) agreement with C(x). • Johnson Bound: a binary code with distance (½ − δ^2)n has at most O(1/δ^2) codewords in any ball of radius (½ − δ)n; C has minimum distance (½ − ¼m^(−4))ℓ • Number of codewords of C agreeing with z on a ½ + ε/(2m) fraction of the places: O(1/δ^2) = O(m^4), so given z there are at most O(m^4) corresponding x's • Number of strings z with such a description: 2^((m−1)2^a) = 2^(n^(2δ/3)) = 2^(k^(2/3)) • Total number of x ∈ B: at most O(m^4) 2^(k^(2/3)) << 2^k (ε/2m), a contradiction.
Conclusion • Given a source of n random bits with min-entropy k = n^Ω(1), it is possible to run any BPP algorithm with it and obtain the correct answer with high probability
Application: strong error reduction • L ∈ BPP if there is a p.p.t. TM M: x ∈ L ⇒ Pr_y[M(x,y) accepts] ≥ 2/3; x ∉ L ⇒ Pr_y[M(x,y) rejects] ≥ 2/3 • Want: x ∈ L ⇒ Pr_y[M(x,y) accepts] ≥ 1 − 2^(−k); x ∉ L ⇒ Pr_y[M(x,y) rejects] ≥ 1 − 2^(−k) • Already know: repeat O(k) times and take the majority • This uses n = O(k)·|y| random bits; of the 2^n random strings, 2^(n−k) can be bad
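The naive O(k)-repetition amplification is easy to simulate; this sketch (with my own stand-in decider, not the lecture's M) estimates the error of a 75-vote majority over a 2/3-correct procedure:

```python
import random

def noisy_decider(x, rng):
    # Stand-in for M(x, y): decides "is x even?" correctly w.p. 2/3.
    correct = (x % 2 == 0)
    return correct if rng.random() < 2 / 3 else not correct

def majority_amplify(x, reps, rng):
    # Run the decider independently `reps` times; take the majority vote.
    votes = sum(noisy_decider(x, rng) for _ in range(reps))
    return 2 * votes > reps

rng = random.Random(1)
trials = 1000
errors = sum(not majority_amplify(6, 75, rng) for _ in range(trials))
print(f"empirical error rate of 75-vote majority: {errors / trials}")
```

A Chernoff bound puts the failure probability of a 75-vote majority near 10^-3 here, so almost all of the 1000 trials come out correct; the slide's point is that the fresh random bits this consumes grow linearly with the number of repetitions.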
Strong error reduction Better: let Ext be an extractor for k = |y|^3 = n^δ, ε < 1/6 • pick a random w ∈_R {0,1}^n • run M(x, Ext(w, z)) for all z ∈ {0,1}^t • take the majority of the answers • call w "bad" if maj_z M(x, Ext(w, z)) is incorrect, i.e. |Pr_z[M(x, Ext(w,z)) = b] − Pr_y[M(x,y) = b]| ≥ 1/6 • extractor property: at most 2^k bad w • n random bits; only 2^(n^δ) bad strings
Strong error reduction [Diagram: the bad strings for an input are those most of whose extractor-neighbors make the original randomized algorithm err; since every subset of size 2^k sees almost all of the right-hand side {0,1}^m with nearly equal probability, and the strings bad for the original run are at most 1/4 of the right-hand side, the set of bad w on the left {0,1}^n has size at most 2^k.]
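The seed-enumeration pattern can be sketched end to end; everything here (the decider M, the toy inner-product "extractor") is an illustrative stand-in of my own, not the construction from the slides:

```python
def ip_bit(a, b):
    # Inner product of two bit-vectors (as integers), mod 2.
    return bin(a & b).count("1") % 2

def M(x, r):
    # Stand-in randomized decider: answers "x % 3 == 0", but errs
    # whenever its 8-bit randomness r lies in a bad 1/4 of the strings.
    return (x % 3 == 0) ^ (r % 4 == 0)

def derandomized_run(x, w, t=8):
    # One imperfect n-bit sample w; enumerate every seed z, feed M the
    # extracted randomness, and take the majority of the 2^t answers.
    votes = 0
    for z in range(1 << t):
        # Toy "extractor": bit j of r is <byte j of w, z> mod 2.
        r = sum(ip_bit((w >> (8 * j)) & 0xFF, z) << j for j in range(8))
        votes += M(x, r)
    return 2 * votes > (1 << t)

w = 0x0123456789ABCDEF          # the single weak random sample
print(derandomized_run(9, w), derandomized_run(10, w))
```

Because the low two bytes of this w are nonzero and distinct, the first two extracted bits are jointly uniform over the seeds, so exactly 1/4 of the 256 runs hit M's bad set and the majority is correct for both inputs.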
Two Surveys on Extractors • Nisan and Ta-Shma, Extracting Randomness: A Survey and New Constructions 1999, (predates Trevisan) • Shaltiel, Recent developments in Extractors, 2002, www.wisdom.weizmann.ac.il/~ronens/papers/survey.ps Some of the slides based on C. Umans course: www.cs.caltech.edu/~umans/cs151-sp04/index.html
Circuit Complexity • We will consider several issues regarding circuit complexity
Parallelism • Refinement of polynomial time via (uniform) circuits • circuit depth corresponds to parallel time; circuit size corresponds to parallel work • the depth of a circuit is the length of the longest path from input to output; it represents circuit latency
Parallelism • the NC Hierarchy (of logspace-uniform circuits): NCk = O(log^k n)-depth, poly(n)-size circuits with bounded fan-in (2); NC = ∪_k NCk • Aim: to capture efficiently parallelizable problems • Not realistic? • overly generous in size • does not capture all aspects of parallelism • but does capture latency • sufficient for proving (presumed) lower bounds on best latency. What is NC0?
Matrix Multiplication (n×n matrix A, n×n matrix B, output the n×n matrix AB) • Parallel complexity of this problem? • work = poly(n) • time = log^k(n)? • which k?
Matrix Multiplication arithmetic matrix multiplication: A = (a_{i,k}), B = (b_{k,j}), (AB)_{i,j} = Σ_k (a_{i,k} × b_{k,j}) … vs. Boolean matrix multiplication: (AB)_{i,j} = ∨_k (a_{i,k} ∧ b_{k,j}) • single output bit: to make matrix multiplication a language, on input A, B, (i, j) output (AB)_{i,j}
Matrix Multiplication • Boolean Matrix Multiplication is in NC1 • level 1: compute n ANDs a_{i,k} ∧ b_{k,j} • next log n levels: a tree of ORs • n^2 such subtrees, one for each pair (i, j) • select the correct one and output
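A direct sketch of that structure (plain Python, helper names mine): each output bit is an OR over n AND gates, and a balanced binary OR-tree over n leaves has depth ⌈log2 n⌉.

```python
def bool_matmul(A, B):
    # (AB)[i][j] = OR over k of (A[i][k] AND B[k][j])
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def or_tree_depth(n):
    # Depth of a balanced binary OR tree with n leaves = ceil(log2 n).
    return (n - 1).bit_length()

A = [[1, 0], [1, 1]]
B = [[0, 1], [1, 0]]
print(bool_matmul(A, B))
print(or_tree_depth(8))
```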
Boolean formulas and NC1 • The circuit for Boolean Matrix Multiplication is actually a formula. • Formula: fan-out 1; the circuit looks like a tree. This is no accident: Theorem: L ∈ NC1 iff L is decidable by a polynomial-size uniform family of Boolean formulas.
Boolean formulas and NC1: from small-depth circuits to formulas • Proof: • convert an NC1 circuit into a formula • recursively • note: a logspace transformation • stack depth log n, stack record 1 bit: "left" or "right"
Boolean formulas and NC1: from formulas to small-depth circuits • convert a formula of size n into a formula of depth O(log n) • note: size ≤ 2^depth, so the new formula has poly(n) size. Key transformation: pick a subformula D of C and rewrite C = (D ∧ C_{D=1}) ∨ (¬D ∧ C_{D=0}), where C_{D=b} is C with the subtree D replaced by the constant b.
Boolean formulas and NC1 • take D to be any minimal subtree with size at least n/3 • minimality implies size(D) ≤ 2n/3 • define T(n) = maximum depth required for any size-n formula • C_{D=1}, C_{D=0}, D all have size ≤ 2n/3, so T(n) ≤ T(2n/3) + 3, which implies T(n) ≤ O(log n)
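The recurrence can be checked numerically; a small sketch (assuming the worst-case shrink factor 2/3 at every step) unrolls T(n) ≤ T(2n/3) + 3 and compares it with the closed form 3·log_{3/2}(n):

```python
import math

def depth_bound(n):
    # Unroll T(n) <= T(2n/3) + 3 with T(1) = 0.
    d = 0
    while n > 1:
        n = (2 * n) // 3
        d += 3
    return d

for n in [8, 64, 512, 4096]:
    print(n, depth_bound(n), round(3 * math.log(n, 1.5), 1))
```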
Relation to other classes • Clearly NC ⊆ P • P: uniform poly-size circuits • NC1 ⊆ Logspace: on input x, compose logspace algorithms for • generating C_{|x|} • converting it to a formula • FVAL(C_{|x|}, x) • FVAL: given a formula and an assignment, what is the value of the output? • logspace composes!
Relation to other classes • NL ⊆ NC2: Claim: directed S-T-CONN ∈ NC2 • given a directed graph G = (V, E) and vertices s, t • A = adjacency matrix (with self-loops) • (A^2)_{i,j} = 1 iff there is a path of length at most 2 from node i to node j • (A^n)_{i,j} = 1 iff there is a path of length at most n from node i to node j • repeated squaring! compute A^n with a tree of log n Boolean matrix multiplications of depth log n each, and output entry (s, t) • log^2 n depth in total
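A sequential sketch of the repeated-squaring idea (plain Python, function names mine): add self-loops, square the Boolean adjacency matrix ⌈log2 n⌉ times, and read off entry (s, t).

```python
def bool_matmul(A, B):
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]

def reachability(adj):
    # With self-loops, entry (i, j) of the 2^r-th Boolean power says
    # "there is a path of length at most 2^r from i to j".
    n = len(adj)
    A = [[1 if i == j else adj[i][j] for j in range(n)] for i in range(n)]
    for _ in range((n - 1).bit_length()):   # ceil(log2 n) squarings
        A = bool_matmul(A, A)
    return A

# Path 0 -> 1 -> 2 -> 3; no edges back.
adj = [[0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
R = reachability(adj)
print(R[0][3], R[3][0])
```

Each squaring is one Boolean matrix multiplication, hence depth O(log n) in circuit form; log n squarings give the O(log^2 n) total depth the slide claims.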
NC vs. P Can every efficient algorithm be efficiently parallelized? Is NC = P? • Common belief: NC ⊊ P
P-Completeness A language L is P-complete if: • L ∈ P • any other language in P is reducible to L via a logspace reduction. P-complete problems are the least likely to be parallelizable: if a P-complete problem is in NC, then P = NC • we use logspace reductions to show a problem P-complete, and we have seen Logspace ⊆ NC
Some P-Complete Problems • CVAL, the circuit value problem • given a circuit and an assignment, what is the value of the output of the circuit? • the canonical P-complete problem • Lexicographically first maximal independent set • Linear Programming • Finding a happy coloring of a graph
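CVAL is trivial to solve sequentially, gate by gate in topological order, and it is exactly this gate-by-gate dependence that resists parallelization. A minimal sketch with my own (hypothetical, non-standard) gate encoding:

```python
def cval(gates, inputs):
    # Evaluate a topologically ordered circuit. Each gate is one of
    # ("IN", i), ("NOT", a), ("AND", a, b), ("OR", a, b), where a and b
    # are indices of earlier gates; the last gate is the output.
    vals = []
    for g in gates:
        if g[0] == "IN":
            vals.append(inputs[g[1]])
        elif g[0] == "NOT":
            vals.append(1 - vals[g[1]])
        elif g[0] == "AND":
            vals.append(vals[g[1]] & vals[g[2]])
        else:                          # "OR"
            vals.append(vals[g[1]] | vals[g[2]])
    return vals[-1]

# (x0 AND x1) OR (NOT x2)
gates = [("IN", 0), ("IN", 1), ("IN", 2),
         ("AND", 0, 1), ("NOT", 2), ("OR", 3, 4)]
print(cval(gates, [1, 1, 0]), cval(gates, [0, 1, 1]))
```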
NC vs. P Can every uniform, poly-size Boolean circuit family be converted into a uniform, poly-size Boolean formula family? Is NC1 = P? Is the NC hierarchy proper: is it true that NCi ⊊ NCi+1 for all i? Define ACk = O(log^k n)-depth, poly(n)-size circuits with unbounded fan-in ∧ and ∨ gates. Is it true that ACi ⊊ NCi+1 ⊊ ACi+1?
Lower bounds • Recall: "NP does not have polynomial-size circuits" (NP ⊄ P/poly) implies P ≠ NP • Major goal: prove lower bounds on (non-uniform) circuit size for problems in NP • Belief: an exponential lower bound holds • a super-polynomial lower bound is enough for P ≠ NP • Best bound known: 4.5n • we don't even have super-polynomial bounds for problems in NEXP!
Lower bounds • lots of work on lower bounds for restricted classes of circuits • Formulas • out-degree of each gate is 1 • Monotone circuits • no NOTs (even at the input level) • Constant-depth circuits • polynomial size but unbounded fan-in
Counting argument for formulas • frustrating fact: almost all functions require huge formulas Theorem [Shannon]: With probability at least 1 − o(1), a random function f: {0,1}^n → {0,1} requires a formula of size Ω(2^n/log n).
Shannon's counting argument • Proof (counting): • B(n) = 2^(2^n) = # functions f: {0,1}^n → {0,1} • # formulas with n inputs and size s is at most F(n, s) ≤ 4^s · 2^s · (n+2)^s • 4^s binary trees with s internal nodes • 2 gate choices per internal node • n+2 choices per leaf
Shannon's counting argument • F(n, c2^n/log n) < (16n)^(c2^n/log n) = 2^((c2^n/log n)·log(16n)) = 2^((1+o(1)) c2^n) = o(1)·2^(2^n) (if c ≤ ½). So the probability that a random function has a formula of size s = (½)2^n/log n is at most F(n, s)/B(n) < o(1).
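The gap between the two counts can be sanity-checked numerically by comparing their base-2 logarithms; a small sketch with c = ½:

```python
import math

def log2_formula_bound(n, s):
    # log2 of the bound F(n, s) <= 4^s * 2^s * (n+2)^s.
    return s * (2 + 1 + math.log2(n + 2))

for n in [16, 24, 32]:
    s = 0.5 * 2**n / n              # size (1/2) * 2^n / log2(2^n)
    # Compare against log2 B(n) = 2^n: formulas of this size are
    # outnumbered by functions, so most functions have no such formula.
    print(n, log2_formula_bound(n, s) < 2.0**n)
```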
Andreev's function • best lower bound known for formulas: Theorem (Andreev, Håstad '93): the Andreev function requires (∧, ∨, ¬)-formulas of size Ω(n^(3−o(1))).