Computability & Complexity I

Computability & Complexity I Chris Umans Caltech CBSSS

Outline • Turing Machines, languages • Halting Problem • reductions • Rice’s Theorem • Time Complexity Classes • P vs. EXP

From previous lectures… • many equivalent models of computation • we will standardize on TMs • consider decision problems, also called “languages” (sets of strings) • [Figure: a Turing Machine with an input tape, a finite control in state q0, and a read/write head]

From previous lectures… • Can describe a TM to implement any procedure for which we could imagine writing a program or algorithm • but perhaps tedious to write out… • Can construct a Universal TM that recognizes the language: {<M, w> : M is a TM and M accepts w} • there is a general purpose TM whose input can be a “program” to run
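
The idea of a general-purpose machine whose input is a "program" can be sketched in Python. This is only an illustration under stated assumptions: the transition-table encoding, the names `run_tm` and `ENDS_IN_ONE`, and the `max_steps` bound are all hypothetical choices made here, not part of the lecture. A true Universal TM has no step bound and may run forever.

```python
from typing import Dict, Tuple

# Transition table: (state, symbol) -> (new_state, symbol_to_write, head_move)
Delta = Dict[Tuple[str, str], Tuple[str, str, int]]

BLANK = "_"  # assumed blank symbol


def run_tm(delta: Delta, start: str, accept: str, reject: str,
           w: str, max_steps: int = 10_000) -> bool:
    """Simulate the machine described by `delta` on input w.

    Returns True on accept and False on reject. The step bound is an
    assumption of this sketch so that the function always returns.
    """
    tape = dict(enumerate(w))  # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            return True
        if state == reject:
            return False
        symbol = tape.get(head, BLANK)
        # A missing transition is treated as an implicit reject.
        state, write, move = delta.get((state, symbol), (reject, symbol, 0))
        tape[head] = write
        head = max(0, head + move)  # tape is infinite to the right only
    raise TimeoutError("step bound exceeded; the machine may loop forever")


# Example "program": accept binary strings that end in 1.
ENDS_IN_ONE: Delta = {
    ("scan", "0"): ("scan", "0", +1),
    ("scan", "1"): ("scan", "1", +1),
    ("scan", BLANK): ("check", BLANK, -1),
    ("check", "1"): ("acc", "1", 0),
    ("check", "0"): ("rej", "0", 0),
}
```

Calling `run_tm(ENDS_IN_ONE, "scan", "acc", "rej", "1101")` runs the table on the input the same way for any table, which is the sense in which the simulator is "universal".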

Church-Turing Thesis • the belief that TMs formalize our intuitive notion of an algorithm is the Church-Turing Thesis: everything we can compute on a physical computer can be computed on a Turing Machine • Note: this is a belief, not a theorem.

Deciding and Recognizing • on each input, TM M either accepts, rejects, or loops forever • L(M) = strings that M accepts • M recognizes L(M) • set of languages recognized by some TM is called Turing-recognizable or recursively enumerable (RE) • if M rejects all x ∉ L(M) we say it decides L(M) • set of languages decided by some TM is called Turing-decidable or decidable or recursive

Do all problems have an algorithm that solves them? • decidable ⊆ RE ⊆ all languages • are these containments proper? • [Figure: nested regions, decidable inside RE inside all languages]

Undecidability • Definition of the “Halting Problem”: HALT = { <M, w> : TM M halts on input w } • HALT is Turing-recognizable (RE) • proof? Theorem: HALT is not decidable (undecidable).

The Halting Problem HALT = { <M, w> : TM M halts on input w } Proof: • suppose TM H decides HALT • define new TM H’: on input <M> • if H accepts <M, M> then loop • if H rejects <M, M> then halt • consider H’ on input <H’>: • if it halts, then H rejects <H’, H’>, which implies it cannot halt • if it loops, then H accepts <H’, H’>, which implies it must halt • contradiction.

Diagonalization • [Figure: a table with rows indexed by Turing Machines and columns indexed by inputs; box (M, w) records whether M halts on w (Y/n); the row for H’ flips every entry on the diagonal] • The existence of H, which tells us yes/no for each box, allows us to construct a TM H’ that cannot be in the table.
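
The diagonal construction can be tried on a finite table. The sketch below is only a toy: the real table is infinite, and the 3-by-3 entries and the name `diagonal_row` are made up for illustration. It builds a row that disagrees with row i in column i, so it cannot equal any row of the table.

```python
from typing import List


def diagonal_row(table: List[List[bool]]) -> List[bool]:
    """Return a row that differs from row i of `table` in column i."""
    return [not table[i][i] for i in range(len(table))]


# Hypothetical halting table: table[i][j] is True if machine i halts on input j.
table = [
    [True,  False, True],
    [False, False, True],
    [True,  True,  True],
]
flipped = diagonal_row(table)  # plays the role of the row for H'
```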

So far… • [Figure: nested regions, decidable inside RE inside all languages, with HALT in RE but outside decidable] • Can we exhibit a natural language that is non-RE? • on problem set

Reductions • Given a new problem NEW, want to determine if it is easy or hard • right now, “easy” means decidable • right now, “hard” means undecidable • One option: • prove from scratch that the problem is decidable, or • prove from scratch that the problem is undecidable (dream up a diagonalization argument)

Reductions • A better option: • to prove NEW is decidable, show how to transform it into a known decidable problem OLD so that solution to OLD can be used to solve NEW. • to prove NEW is undecidable, show how to transform a known undecidable problem OLD into NEW so that solution to NEW can be used to solve OLD. • called a reduction

Definition of reduction • More refined notion of reduction: “many-one” reduction • [Figure: a reduction f from language A to language B, mapping yes-instances of A to yes-instances of B and no-instances of A to no-instances of B]

Definition of reduction • YES maps to YES • NO maps to NO • function f should be computable • Definition: f : Σ* → Σ* is computable if there exists a TM Mf such that on every w ∈ Σ*, Mf halts on w with f(w) written on its tape. • [Figure: f maps yes-instances of A to yes-instances of B and no-instances of A to no-instances of B]

Definition of reduction • Notation: “A many-one reduces to B” is written A ≤m B • Meaning: B is at least as “hard” as A • more accurate: B is at least as “expressive” as A

Using reductions Definition: A ≤m B if there is a computable function f such that for all w: w ∈ A ⇔ f(w) ∈ B Theorem: if A ≤m B and B is decidable then A is decidable Proof: • decider for A: on input w, compute f(w), run decider for B, do whatever it does.
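
The proof that a reduction plus a decider for B gives a decider for A is a one-line composition. The sketch below uses two toy languages chosen purely for illustration (they are not from the lecture): A is the set of binary strings with an even number of 1s, B is the set of strings of even length, and f deletes every 0. The names `decider_via_reduction`, `drop_zeros`, and `decide_even_length` are hypothetical.

```python
from typing import Callable


def decider_via_reduction(f: Callable[[str], str],
                          decide_b: Callable[[str], bool]) -> Callable[[str], bool]:
    """Given a computable f with (w in A) iff (f(w) in B), build a decider for A."""
    def decide_a(w: str) -> bool:
        # Compute f(w), run the decider for B, and do whatever it does.
        return decide_b(f(w))
    return decide_a


def drop_zeros(w: str) -> str:
    """Toy reduction f: keep only the 1s of w."""
    return w.replace("0", "")


def decide_even_length(s: str) -> bool:
    """Toy decider for B: strings of even length."""
    return len(s) % 2 == 0


decide_even_ones = decider_via_reduction(drop_zeros, decide_even_length)
```

Here `decide_even_ones("1010")` is True and `decide_even_ones("111")` is False, since w has an even number of 1s exactly when `drop_zeros(w)` has even length.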

Using reductions • Main use: given language NEW, prove it is undecidable by showing OLD ≤m NEW, where OLD is known to be undecidable • proof by contradiction • if NEW decidable, then OLD decidable • OLD undecidable. Contradiction. • a common mistake is to reduce in the wrong direction. • review this argument to check yourself.

Many-one reduction example • Consider the language: NONEMPTY = {<M> : L(M) ≠ Ø} • reduce from HALT: f(<M, w>) = <M’> • where M’ is TM that • on input x, if x ≠ w, then reject • else simulate M on x, and accept if M halts • f clearly computable • [Figure: f maps yes-instances of HALT to yes-instances of NONEMPTY and no-instances to no-instances]

Many-one reduction example • f(<M, w>) = <M’> • where M’ is TM that • on input x, if x ≠ w, then reject • else simulate M on x, and accept if M halts • yes maps to yes? • if <M, w> ∈ HALT then L(M’) = {w}, so f(<M, w>) ∈ NONEMPTY • no maps to no? • if <M, w> ∉ HALT then L(M’) = Ø, so f(<M, w>) ∉ NONEMPTY
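
The construction of M’ from <M, w> is purely mechanical, which is why f is computable: f writes down a description of M’ and never runs M. A loose Python analogy, assuming machines are modeled as callables that either return (halt) or never return, might look like this. The name `reduce_halt_to_nonempty` and the modeling of TMs as Python functions are assumptions of this sketch.

```python
from typing import Callable

Machine = Callable[[str], object]  # assumed model: returning means "halts"


def reduce_halt_to_nonempty(m: Machine, w: str) -> Callable[[str], bool]:
    """Build M' from (M, w). Note that m is not called here."""
    def m_prime(x: str) -> bool:
        if x != w:
            return False   # reject every input other than w
        m(x)               # simulate M on x (which equals w); may never return
        return True        # reached only if M halted
    return m_prime


def always_halts(x: str) -> None:
    """A toy machine that halts on every input."""
    return None


m_prime = reduce_halt_to_nonempty(always_halts, "0110")
```

With a halting M, `m_prime` accepts exactly the one string w, so its language is nonempty. If M did not halt on w, `m_prime` would accept nothing.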

Undecidable problems Theorem: The language REGULAR = {<M> : M is a TM and L(M) is a regular language} is undecidable. • a regular language is a set of strings described by an expression built from a finite alphabet, concatenation, union, and “*” • fact: {0,1}* is regular • fact: {0^n 1^n : n ≥ 0} is not regular

Many-one reduction example • HALT = { <M, w> : TM M halts on input w } • REGULAR = {<M> : M is a TM and L(M) is a regular language} • what should f(<M, w>) produce? • [Figure: f maps yes-instances of HALT to yes-instances of REGULAR and no-instances to no-instances]

Many-one reduction example Proof: • f(<M, w>) = <M’>, where M’ is the TM that, on input x: • if x has form 0^n 1^n, accept • else simulate M on w and accept if M halts • is f computable? • YES maps to YES? • <M, w> ∈ HALT ⇒ L(M’) = {0,1}* ⇒ f(<M, w>) ∈ REGULAR • NO maps to NO? • <M, w> ∉ HALT ⇒ L(M’) = {0^n 1^n : n ≥ 0} ⇒ f(<M, w>) ∉ REGULAR

Rice’s Theorem • We have seen that the following properties of TMs are undecidable: • TM halts • TM accepts a nonempty language • TM accepts a regular language • How widespread is the undecidability phenomenon?

Rice’s Theorem Rice’s Theorem: Every nontrivial TM property is undecidable. • A TM property is a language P for which • if L(M1) = L(M2) then <M1> ∈ P iff <M2> ∈ P • TM property P is nontrivial if • there exists a TM M1 for which <M1> ∈ P, and • there exists a TM M2 for which <M2> ∉ P.

Rice’s Theorem • The setup: • let TØ be a TM for which L(TØ) = Ø • assume <TØ> ∉ P • technicality: if <TØ> ∈ P then work with property complement-of-P instead of P • non-triviality ensures existence of TM M1 such that <M1> ∈ P

Rice’s Theorem Proof: (know: <TØ> ∉ P and <M1> ∈ P) • reduce from HALT (i.e. show HALT ≤m P) • what should f(<M, w>) produce? • f(<M, w>) = <M’>, where M’ is the TM that, on input x, accepts iff M halts on w and M1 accepts x • f computable? • YES maps to YES? • <M, w> ∈ HALT ⇒ L(f(<M, w>)) = L(M1) ⇒ f(<M, w>) ∈ P

Rice’s Theorem Proof: • reduce from HALT (i.e. show HALT ≤m P) • what should f(<M, w>) produce? • f(<M, w>) = <M’>, where M’ is the TM that, on input x, accepts iff M halts on w and M1 accepts x • NO maps to NO? • <M, w> ∉ HALT ⇒ L(f(<M, w>)) = L(TØ) ⇒ f(<M, w>) ∉ P

Computability summary • Main message: some problems have no algorithms • proof by diagonalization • can use reductions from a known undecidable problem to a new problem to prove undecidability of the new problem • undecidability is a widespread phenomenon

Complexity • So far we have classified problems by whether they have an algorithm at all. • In real world, we have limited resources with which to run an algorithm: • time • storage space • need to further classify decidable problems according to resources they require

Complexity • Complexity Theory = study of what is computationally feasible (or tractable) with limited resources: • running time • storage space • number of random bits • degree of parallelism • rounds of interaction • others…

Worst-case analysis • Always measure resource (e.g. running time) in the following way: • as a function of the input length • function value is the maximum quantity of resource used over all inputs of given length • called worst-case analysis

Time complexity Definition: the running time (“time complexity”) of a TM M is a function f : N → N where f(n) is the maximum number of steps M uses on any input of length n. • “M runs in time f(n),” “M is an f(n) time TM”

Time complexity • We care about the behavior on large inputs. Why? • general-purpose algorithm should be “scalable” • overhead (e.g. for initialization) shouldn’t matter in big picture

Time complexity • Measure time complexity using asymptotic notation (“big-oh notation”) • disregard lower-order terms in running time • disregard coefficient on highest order term • example: f(n) = 6n^3 + 2n^2 + 100n + 102781 • “f(n) is order n^3” • write f(n) = O(n^3)
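
To see why the example function is O(n^3), one possible choice of constants can be worked out directly. The constants c = 102889 and n0 = 1 below are just one convenient choice (many others work), using the standard definition that f(n) = O(g(n)) when f(n) ≤ c·g(n) for all n ≥ n0.

```latex
\begin{align*}
f(n) &= 6n^3 + 2n^2 + 100n + 102781 \\
     &\le 6n^3 + 2n^3 + 100n^3 + 102781\,n^3 && \text{for all } n \ge 1 \\
     &= 102889\,n^3 .
\end{align*}
```

So f(n) ≤ c·n^3 for all n ≥ n0 with c = 102889 and n0 = 1, which shows that both the lower-order terms and the leading coefficient are absorbed by the notation.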

Time complexity Definition: TIME(t(n)) = {L : there exists a TM M that decides L in time O(t(n))} Definition: “P” or “polynomial-time” is P = ∪_{k ≥ 1} TIME(n^k) Definition: “EXP” or “exponential-time” is EXP = ∪_{k ≥ 1} TIME(2^{n^k})

Time complexity • interested in a coarse classification of problems. For this purpose, • treat any polynomial running time as “efficient” • problems in P are “tractable” • treat any exponential running time as inefficient • problems that require exponential time are “intractable”

Time complexity • Why polynomial-time? • insensitive to particular deterministic model of computation chosen • closed under modular composition • empirically: qualitative breakthrough to achieve polynomial running time is followed by quantitative improvements from impractical (e.g. n^100) to practical (e.g. n^3 or n^2)

A puzzle • Find an efficient algorithm to solve the following problem. • Input: sequence of pairs of symbols e.g. (A, b), (E, D), (d, C), (b, a) • Goal: determine if it is possible to circle at least one symbol in each pair without circling upper and lower case of same symbol.

2SAT • This is a disguised version of the language 2SAT = {formulas in Conjunctive Normal Form with 2 literals per clause for which there exists a satisfying truth assignment} • CNF = “AND of ORs” • e.g. (A, b), (E, D), (d, C), (b, a) becomes (x1 ∨ ¬x2)(x5 ∨ x4)(¬x4 ∨ x3)(¬x2 ∨ ¬x1) • satisfying truth assignment = assignment of TRUE/FALSE to each variable so that whole formula is TRUE
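
The translation from the puzzle to a CNF formula can be sketched in a few lines. This assumes the convention suggested by the example: the k-th letter of the alphabet stands for variable xk, an upper-case symbol is the positive literal, and a lower-case symbol is the negated literal. Representing literals as signed integers and the names `symbol_to_literal` and `puzzle_to_cnf` are choices made for this sketch.

```python
from typing import List, Sequence, Tuple


def symbol_to_literal(sym: str) -> int:
    """Map 'A' -> 1 (x1), 'a' -> -1 (NOT x1), 'B' -> 2, 'b' -> -2, and so on."""
    index = ord(sym.upper()) - ord("A") + 1
    return index if sym.isupper() else -index


def puzzle_to_cnf(groups: Sequence[Sequence[str]]) -> List[Tuple[int, ...]]:
    """Turn each group of symbols into one clause of signed-integer literals."""
    return [tuple(symbol_to_literal(s) for s in group) for group in groups]


clauses = puzzle_to_cnf([("A", "b"), ("E", "D"), ("d", "C"), ("b", "a")])
```

Circling a symbol corresponds to making its literal TRUE, and the rule against circling both cases of a letter corresponds to a variable having a single truth value.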

Algorithm for 2SAT • Build a graph with separate nodes for each literal. • add directed edge (x, y) iff formula includes clause (¬x ∨ y) (equiv. to x ⇒ y); since clauses are unordered, a clause (a ∨ b) contributes both edges (¬a, b) and (¬b, a) • e.g. (x1 ∨ ¬x2)(x5 ∨ x4)(¬x4 ∨ x3)(¬x2 ∨ ¬x1) • [Figure: derived graph on the ten literals x1, …, x5, ¬x1, …, ¬x5]

Algorithm for 2SAT Claim: formula is unsatisfiable iff there is some variable x with a path from x to ¬x and a path from ¬x to x in derived graph. • Proof (⇐) • edges represent implication ⇒. By transitivity of ⇒, a path from x to ¬x means x ⇒ ¬x, and a path from ¬x to x means ¬x ⇒ x, so neither truth value for x can satisfy the formula.

Algorithm for 2SAT • Proof (⇒) • to construct a satisfying assignment (if no x with a path from x to ¬x and a path from ¬x to x): • pick unassigned literal x with no path from x to ¬x • assign it TRUE, as well as all nodes reachable from it; assign negations of these literals FALSE • well-defined: path from x to y and x to ¬y implies path from ¬y to ¬x and y to ¬x, implies path from x to ¬x • consistent: path x to y (assigned FALSE) implies path from ¬y (assigned TRUE) to ¬x, so x already assigned at that point

Algorithm for 2SAT • Algorithm: • build derived graph • for every pair x, ¬x check if there is a path from x to ¬x and from ¬x to x in the graph • Running time of algorithm (input length n): • O(n) to build graph • O(n) to perform each check • O(n) checks • running time O(n^2). 2SAT ∈ P.
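
The algorithm above can be sketched directly: build the derived graph, then run one pair of reachability checks per variable. The encoding of literals as signed integers (k for xk, -k for ¬xk) and the function names are assumptions of this sketch. It follows the quadratic approach from the slide rather than any faster method.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Clause = Tuple[int, int]  # literals as nonzero ints: k means x_k, -k means NOT x_k


def implication_graph(clauses: Iterable[Clause]) -> Dict[int, List[int]]:
    """Derived graph: clause (a OR b) yields edges NOT a -> b and NOT b -> a."""
    graph: Dict[int, List[int]] = defaultdict(list)
    for a, b in clauses:
        graph[-a].append(b)
        graph[-b].append(a)
    return graph


def reachable(graph: Dict[int, List[int]], source: int, target: int) -> bool:
    """Depth-first search: is there a path from source to target?"""
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def two_sat_satisfiable(clauses: List[Clause]) -> bool:
    """Unsatisfiable iff some x has paths x -> NOT x and NOT x -> x."""
    graph = implication_graph(clauses)
    variables = {abs(lit) for clause in clauses for lit in clause}
    return not any(
        reachable(graph, x, -x) and reachable(graph, -x, x) for x in variables
    )
```

On the running example, `two_sat_satisfiable([(1, -2), (5, 4), (-4, 3), (-2, -1)])` returns True (for instance, setting x2 and x4 to FALSE and x5 to TRUE satisfies every clause). The four clauses over x1 and x2 with all sign patterns are reported unsatisfiable.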

Another puzzle • Find an efficient algorithm to solve the following problem. • Input: sequence of triples of symbols e.g. (A, b, C), (E, D, b), (d, A, C), (c, b, a) • Goal: determine if it is possible to circle at least one symbol in each triple without circling upper and lower case of same symbol.

3SAT • This is a disguised version of the language 3SAT = {formulas in Conjunctive Normal Form with 3 literals per clause for which there exists a satisfying truth assignment} • don’t know if this problem is in P • much more on this later • for now, observe that it is in TIME(2^n)
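
The exponential upper bound comes from simply trying every truth assignment. A minimal brute-force sketch, assuming clauses are tuples of signed integers (k for xk, -k for ¬xk) and using the hypothetical name `brute_force_sat`, could look like this. It makes no attempt at efficiency.

```python
from itertools import product
from typing import List, Tuple


def brute_force_sat(clauses: List[Tuple[int, ...]]) -> bool:
    """Try all 2^m assignments to the m variables that occur in the formula."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # A clause is satisfied if at least one of its literals is TRUE.
        if all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return True
    return False
```

Each assignment is checked in time polynomial in the formula length, and there are 2^m assignments for m variables, which is where the exponential running time comes from.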

• 3SAT ∈ EXP; open whether it is in P. • 2SAT ∈ P. • Can we at least prove that P is different from EXP? • [Figure: nested regions P ⊆ EXP ⊆ decidable ⊆ RE ⊆ all languages]

Time Hierarchy Theorem Theorem: For every proper complexity function f(n) ≥ n: TIME(f(n)) ⊊ TIME((f(2n))^3). • Note: P ⊆ TIME(2^n) ⊊ TIME(2^{(2n)·3}) ⊆ EXP • Most natural functions (and 2^n in particular) are proper complexity functions.

Time Hierarchy Theorem Theorem: For every proper complexity function f(n) ≥ n: TIME(f(n)) ⊊ TIME((f(2n))^3). • Proof idea: • use diagonalization to construct a language that is not in TIME(f(n)). • constructed language comes with a TM that decides it and runs in time (f(2n))^3.
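
The separation of P from EXP follows by instantiating the theorem with f(n) = 2^n. One way to spell out the chain of containments, with the justification for each step, is the following (the intermediate class TIME(2^{n^2}) is just a convenient choice):

```latex
\begin{align*}
\mathrm{P} = \bigcup_{k \ge 1} \mathrm{TIME}(n^k)
  &\subseteq \mathrm{TIME}(2^n)
    && \text{since } n^k = O(2^n) \text{ for every fixed } k \\
  &\subsetneq \mathrm{TIME}\big((2^{2n})^3\big) = \mathrm{TIME}(2^{6n})
    && \text{Time Hierarchy Theorem with } f(n) = 2^n \\
  &\subseteq \mathrm{TIME}(2^{n^2}) \subseteq \mathrm{EXP}
    && \text{since } 6n \le n^2 \text{ for } n \ge 6 .
\end{align*}
```

Because one containment in the chain is strict, P is a proper subset of EXP.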