Fall 2008
The Chinese University of Hong Kong
CSC 3130: Automata theory and formal languages
Undecidable problems for CFGs and descriptive complexity
Andrej Bogdanov
http://www.cse.cuhk.edu.hk/~andrejb/csc3130
Decidable vs. undecidable
• Decidable: “DFA M accepts w”, “PDA M accepts w”, “DFA M accepts all inputs”
• Undecidable: “TM M accepts w”, “TM M halts on w”, “TM M accepts some input”, “TM M accepts all inputs”, “TM M and M′ accept the same inputs”
• ?: “PDA P accepts all inputs”, “CFG G is ambiguous”
• Other kinds of problems?
Computation is local
Computation tableau of M on input lotus:
state        tape    tableau row
q0           lotus   ›q0lotus
q6           ootus   ›oq6otus
q3           oktus   ›okq3tus
q0           okrus   ›okrq0us
q1           okras   ›okraq1s
qa (accept)  okra    ›okrqaa☐
• The changes between rows occur in a 2x3 window
Computation histories as strings
• If M halts on w, we can represent the computation tableau by a string t over the alphabet Γ ∪ Q ∪ {#, ›}
• The rows ›q0lotus, ›oq6otus, ›okq3tus, ›okrq0us, ›okraq1s, ›okrqaa☐ become the string ›q0lotus#›oq6otus#...#›okrqaa☐#
• M accepts w ⇔ qa occurs in the string t
• M rejects w ⇔ qa does not occur in t
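The encoding above can be sketched in a few lines of Python. This is only an illustration: the function names `encode_history` and `accepts` are made up for this sketch, and states are written as plain substrings such as "qa", which works here only because the example tape never contains the letter q.

```python
def encode_history(rows, left_end="›", sep="#"):
    """Flatten tableau rows into the string w1#w2#...#wk#."""
    return "".join(left_end + row + sep for row in rows)


def accepts(history, accept_state="qa"):
    """M accepts w exactly when the accepting state occurs in the history.

    Assumption of this sketch: a plain substring test is enough, because
    the tape alphabet of the example does not contain the letter q.
    """
    return accept_state in history


# Rows of the example tableau, written without the left-end marker.
rows = ["q0lotus", "oq6otus", "okq3tus", "okrq0us", "okraq1s", "okrqaa☐"]
t = encode_history(rows)
print(t)
print(accepts(t))
```

Dropping the last row gives a history without qa, which the same test reports as not accepting.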
Undecidable problems for PDAs
ALL_PDA = {〈P〉: P is a PDA that accepts all inputs}
• Theorem: The language ALL_PDA is undecidable.
• Proof: We will show that if ALL_PDA can be decided, so can A_TM.
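Undecidable problems for PDAs
• Suppose A decides ALL_PDA: on input 〈P〉, A accepts if P accepts all inputs and rejects if not.
• From 〈M〉, w we construct a PDA P such that
  P accepts all inputs if M rejects or loops on w
  P does not accept some input if M accepts w
• Running A on 〈P〉 then accepts if M rejects or loops on w and rejects if M accepts w. Swapping the two answers decides A_TM.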
Undecidability via computation histories
• P accepts all inputs if M rejects or loops on w
• P does not accept some input if M accepts w
• The input to P is a candidate computation history t of M on w. P rejects accepting histories such as ›q0lotus#›oq6otus#...#›okrqaa☐ and accepts every other string.
• M accepts w ⇒ there is an accepting history t ⇒ P rejects t
• M rejects or loops on w ⇒ there are no accepting histories ⇒ P accepts everything
Undecidability via computation histories
• Task: Design a PDA P that takes a candidate computation history t of M on w and rejects exactly the accepting histories
• Expect t of the form w1#w2#...#wk#, for example the rows ›q0lotus, ›oq6otus, ›okq3tus, ›okrq0us, ›okraq1s, ›okrqaa☐
• If w1 ≠ ›q0w, accept t.
• If t does not contain qa, accept t.
• If two consecutive blocks wi#wi+1 do not correspond to a proper transition of M, accept t.
Implementing P
On input t, nondeterministically make one of the following choices:
• Look in the first block w1 of t. If w1 ≠ ›q0w, accept t.
• Look for an appearance of qa. If t does not contain qa, accept t.
• Look for the beginning of the ith block of t. If the two consecutive blocks wi#wi+1 do not represent a valid transition of M, accept t.
Example of a valid transition: ›okq3tus # ›okrq0us
wi#wi+1 represents a valid transition if all 2x3 windows correspond to possible transitions of M.
Valid and invalid windows
(each window is written as top row / bottom row)
• q3 t u / k t q0: invalid window
• t t u / t t u: valid window
• c a t / c a p: invalid window
• t q3 u / t a q7: valid if δ(q3, u) = (q7, a, R)
• t t u / t t q3: valid window
• c a t / b a t: valid window
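To see which pairs of rows count as valid transitions, the sketch below takes a shortcut: instead of enumerating windows as the PDA does, it simulates one step of M and compares the result with the next row. The table `DELTA` is read off the example tableau and is an assumption of this sketch, as are the function names. It also checks the locality claim, that all cells changed by one step fit inside a window three cells wide.

```python
# Transitions inferred from the example tableau (an assumption):
# (state, symbol read) -> (new state, symbol written, head move)
DELTA = {
    ("q0", "l"): ("q6", "o", "R"),
    ("q6", "o"): ("q3", "k", "R"),
    ("q3", "t"): ("q0", "r", "R"),
    ("q0", "u"): ("q1", "a", "R"),
    ("q1", "s"): ("qa", "☐", "L"),
}


def step(config, delta):
    """Successor of a configuration given as a list of symbols.

    The state symbol sits immediately to the left of the cell being read.
    Returns None if no transition applies. Moves past either end of the
    written tape are not handled in this sketch.
    """
    i = next(k for k, s in enumerate(config) if s.startswith("q"))
    key = (config[i], config[i + 1])
    if key not in delta:
        return None
    new_state, write, move = delta[key]
    if move == "R":
        return config[:i] + [write, new_state] + config[i + 2:]
    # Left move: the state symbol hops over the cell to its left.
    return config[:i - 1] + [new_state, config[i - 1], write] + config[i + 2:]


def is_valid_step(row, next_row, delta):
    """True if next_row is exactly the successor of row."""
    return step(row, delta) == next_row


def changed_cells(row, next_row):
    """Positions at which two consecutive rows differ."""
    return [k for k in range(len(row)) if row[k] != next_row[k]]


row = ["›", "o", "k", "q3", "t", "u", "s"]
next_row = ["›", "o", "k", "r", "q0", "u", "s"]
print(is_valid_step(row, next_row, DELTA))
print(changed_cells(row, next_row))
```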
Implementing P
• To check that wi#wi+1 represent a valid transition of M, it is better to write t in boustrophedon: alternate rows are written in reverse
• Ordinary order: ›q0lotus#›oq6otus#...#›okrqaa☐#
• Boustrophedon: ›q0lotus#sutoq6o›#...#›okrqaa☐#
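A small sketch of the boustrophedon encoding, with each block stored as a list of symbols so that a state such as q6 stays one symbol when the block is reversed. The helper `pop_against` is a hypothetical illustration of why the reversal suits a stack: popping a pushed block returns its symbols last to first, which lines up column by column with the next block read in reverse.

```python
def boustrophedon(blocks):
    """Reverse every second block; each block is a list of symbols."""
    return [b if k % 2 == 0 else b[::-1] for k, b in enumerate(blocks)]


def to_string(blocks, sep="#"):
    """Join blocks into the string w1#w2#...#wk#."""
    return "".join("".join(b) + sep for b in blocks)


def pop_against(block, next_block_reversed):
    """Pair each symbol popped from the stack with the symbol read next."""
    stack = list(block)  # push the whole block, leftmost symbol first
    return [(stack.pop(), sym) for sym in next_block_reversed]


blocks = [
    ["›", "q0", "l", "o", "t", "u", "s"],
    ["›", "o", "q6", "o", "t", "u", "s"],
]
written = boustrophedon(blocks)
print(to_string(written))
print(pop_against(written[0], written[1]))
```

Every pair produced by `pop_against` holds the two symbols from the same column of consecutive rows.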
Implementing P
Checking that wi#wi+1 represent a valid transition of M
Example of a proper transition: wi = ›okq3tus, wi+1 = ›okrq0us, written as …#›okq3tus#suq0rko›#…
• Nondeterministically look for the beginning of a 2x3 window
• Remember the first row of the window in the state
• Use the stack to detect the beginning of the second row
• Remember the second row of the window in the state
• If the window is not valid, accept; otherwise reject.
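The Post Correspondence Problem
• Input: A set of tiles, each with a top string and a bottom string, like this (top/bottom): bab/cc, a/ab, baa/a, a/baba, bab/ε, c/ab
• Given an infinite supply of such tiles, can you arrange some of them in a row so that the top and the bottom spell the same string?
[Figure: example arrangements of tiles]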
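A minimal sketch of what a "match" means, together with a brute-force search over short tile sequences. The tile set in the example is a standard textbook instance, not one from these slides, and the names `is_match` and `find_match` are illustrative. Note that the search is bounded: finding nothing up to `max_tiles` says nothing about longer sequences, which is consistent with the problem being undecidable.

```python
from itertools import product


def is_match(tiles, seq):
    """True if the chosen tiles spell the same string on top and bottom."""
    top = "".join(tiles[i][0] for i in seq)
    bottom = "".join(tiles[i][1] for i in seq)
    return len(seq) > 0 and top == bottom


def find_match(tiles, max_tiles):
    """Try every sequence of at most max_tiles tiles (repeats allowed)."""
    for n in range(1, max_tiles + 1):
        for seq in product(range(len(tiles)), repeat=n):
            if is_match(tiles, seq):
                return list(seq)
    return None  # nothing found within the bound; longer matches may exist


# Illustrative instance, given as (top, bottom) pairs.
tiles = [("b", "ca"), ("a", "ab"), ("ca", "a"), ("abc", "c")]
print(is_match(tiles, [1, 0, 2, 1, 3]))
print(find_match(tiles, 5))
```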
Undecidability of PCP
PCP = {D: D is a collection of tiles that contains a top-bottom match}
• Theorem: The language PCP is undecidable.
• Proof: We will show that if PCP can be decided, so can A_TM.
Undecidability of PCP
• Idea: Matches represent accepting histories
• From 〈M〉, w we construct a collection of tiles T such that
  If M accepts w, then T can be matched
  If M rejects or loops on w, then T cannot be matched
• In a match, top and bottom both spell ›q0lotus#›oq6otus#›okq3t...#›qa☐☐☐☐
• Tiles (top/bottom): ε/›q0lotus#, ›q0l/›oq6, o/o, t/t, u/u, s/s, #/#, ›/›, oq6o/okq3, …
Some technicalities
• We will assume that
  Before accepting, TM M erases its tape
  One of the PCP tiles is marked as a starting tile
• These assumptions can be made without loss of generality (we will see why later)
Example tiles (top/bottom): bab/cc, a/ab, baa/a, a/baba, c/ab, with one tile marked s as the starting tile
Undecidability of PCP
• To decide A_TM, we construct these tiles for PCP: 〈M〉, w ↦ T (collection of tiles)
  If M accepts w, then T can be matched
  If M rejects or loops on w, then T cannot be matched
• Tiles (top/bottom):
  starting tile s: ε/›q0w#
  a1qia3/b1b2b3 for each valid window of this form
  a/a for all a in Γ ∪ {#, ›}
  “final” tiles: ☐#/#, #›qa/ε, ☐/ε
Undecidability of PCP
• Accepting computation history: ›q0lotus#›oq6otus#...#›oq1☐☐☐#›qa☐☐☐☐
• In a match, the top and the bottom both spell out this history
• Tiles (top/bottom): starting tile s: ε/›q0w#; a1qia3/b1b2b3; a/a; “final” tiles ☐#/#, #›qa/ε, ☐/ε
Undecidability of PCP
• If M rejects on input w, then qr appears on the bottom at some point, but it cannot be matched on top
• If M loops on w, then the matching keeps going forever
A technicality
• We assumed that one tile is marked as the starting tile
• We can remove this assumption by changing the tiles a bit
• Original tiles (top/bottom): s: a/baba, bab/cc, c/ab
• “starting tile” begins with *: *a*/*b*a*b*a
• “middle tiles”: b*a*b*/*c*c, c*/*a*b
• “ending tile” matches the last *: ε/*
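The star trick can be sketched as follows. The code follows the picture on the slide, where the starting tile appears only in its starred starting form; whether the starting tile may also be reused as a middle tile is left open here and would be a one-line change. The function names are made up for this sketch, and the ending tile ε/* is the reading of the slide assumed above.

```python
def star_after(s):
    """a b c -> a* b* c*  (used on tops)."""
    return "".join(ch + "*" for ch in s)


def star_before(s):
    """a b c -> *a *b *c  (used on bottoms)."""
    return "".join("*" + ch for ch in s)


def unmark_start(tiles, start=0):
    """Turn tiles with a marked starting tile into ordinary PCP tiles."""
    top, bottom = tiles[start]
    # Only this tile has a top beginning with *, so a match must start with it.
    out = [("*" + star_after(top), star_before(bottom))]
    out += [(star_after(t), star_before(b))
            for k, (t, b) in enumerate(tiles) if k != start]
    out.append(("", "*"))  # ending tile supplies the last * on the bottom
    return out


tiles = [("a", "baba"), ("bab", "cc"), ("c", "ab")]  # first tile is marked s
for tile in unmark_start(tiles):
    print(tile)
```

As a sanity check, a single marked tile ab/ab becomes *a*b*/*a*b plus the ending tile, and those two tiles in a row spell *a*b* on both top and bottom.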
Ambiguity of CFGs
AMB = {G: G is an ambiguous CFG}
• Theorem: The language AMB is undecidable.
• Proof: We will show that if AMB can be decided, so can PCP.
Ambiguity of CFGs
• Proof: From T (collection of tiles) we construct G (CFG) such that
  If T can be matched, then G is ambiguous
  If T cannot be matched, then G is unambiguous
• Step 1: Number the tiles (top/bottom): 1: bab/cc, 2: c/ab, 3: a/ab
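Ambiguity of CFGs
T (collection of tiles) ↦ G (CFG)
Tiles (top/bottom): 1: bab/cc, 2: c/ab, 3: a/ab
Terminals: a, b, c, 1, 2, 3
Variables: S, T, B
Productions:
S → T | B
T → babT1 | cT2 | aT3 | bab1 | c2 | a3
B → ccB1 | abB2 | abB3 | cc1 | ab2 | ab3
The recursion ends with T → bab1 | c2 | a3 and B → cc1 | ab2 | ab3 rather than with T → ε and B → ε: with ε-productions the empty string would have two derivations (through T and through B) no matter what the tiles are.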
Ambiguity of CFGs
• Each sequence of tiles gives two derivations
• If the tiles match, these two derive the same string
Example: the tile sequence 1, 2, 2 (bab/cc, c/ab, c/ab)
S ⇒ T ⇒ babT1 ⇒ babcT21 ⇒ babcc221
S ⇒ B ⇒ ccB1 ⇒ ccabB21 ⇒ ccabab221
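The two derivations can be computed directly from a tile sequence. In this sketch the function names are illustrative, tiles are stored as a dictionary from tile number to a (top, bottom) pair, and the second tile set is a standard textbook PCP instance rather than one from the slides. A nonempty sequence witnesses ambiguity exactly when the two derived strings coincide.

```python
def derived_strings(tiles, seq):
    """Strings derived through T (tops) and through B (bottoms).

    Both end with the tile numbers in reverse order, as produced by
    productions of the form T -> top T i and B -> bottom B i.
    """
    suffix = "".join(str(i) for i in reversed(seq))
    via_t = "".join(tiles[i][0] for i in seq) + suffix
    via_b = "".join(tiles[i][1] for i in seq) + suffix
    return via_t, via_b


def witnesses_ambiguity(tiles, seq):
    """True if this nonempty tile sequence yields one string two ways."""
    via_t, via_b = derived_strings(tiles, seq)
    return len(seq) > 0 and via_t == via_b


slide_tiles = {1: ("bab", "cc"), 2: ("c", "ab"), 3: ("a", "ab")}
print(derived_strings(slide_tiles, [1, 2, 2]))

# Illustrative instance that does have a match.
other_tiles = {1: ("b", "ca"), 2: ("a", "ab"), 3: ("ca", "a"), 4: ("abc", "c")}
print(witnesses_ambiguity(other_tiles, [2, 1, 3, 2, 4]))
```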
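Ambiguity of CFGs
T (collection of tiles) ↦ G (CFG)
✓ If T can be matched, then G is ambiguous
✓ If T cannot be matched, then G is unambiguous
• Argue by contradiction: suppose G is ambiguous. Write ai for the top and bi for the bottom of tile i.
• The trailing tile numbers determine every derivation that goes through T alone, and likewise through B alone. So the ambiguity must look like this: one parse tree starts with S → T and the other with S → B, and both derive the same string:
  through T: a_n1 a_n2 … a_ni ni … n2 n1
  through B: b_m1 b_m2 … b_mj mj … m2 m1
• Then n1...ni = m1…mj, and therefore a_n1 … a_ni = b_n1 … b_ni
• So there is a match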
Roulette
• In a game of roulette, you bet $1 on even or odd
• The outcome is a number between 1 and 36
• If you guessed correctly, you double your bet
• Otherwise, you lose
[Figure: observed outcomes 6 17 5 16 2 5 11 8 31 18 4 7 5 2 29 8 12 1]
Randomness
• If we write E for even, O for odd, what we saw is
  OEOEOEOEOEOEOEOEOEOE
• It seems the wheel is crooked. If it wasn't, we would expect something more like
  OOOEEOEOOEOEOOOEEEOE
• But both sequences have the same probability! Why does one appear less random than the other?
Turing Machines with output
• The goal of a Turing Machine with output is to write something on the output tape and go into state qhalt
[Figure: machine M with a work tape and an output tape; the output tape reads 0 0 1 1 0 0]
Descriptive complexity
• The descriptive complexity K(x) of x is the length of the shortest description of any Turing Machine that outputs x:
  K(x) = min { |〈M〉| : M outputs x }
• We will assume x is long
Andrey Kolmogorov (1903-1987)
Example of descriptive complexity
x = “OE...OE” = (OE)^n (n = 1,000,000,000)
Repeat for 2n steps:
  At odd step print O
  At even step print E
• Turing machine implementation:
  Write 2n in binary on work tape (≈ log2 n states)
  While work tape not equal to 0, (≈ 3 states)
    Subtract 1 from number on work tape (≈ 15 states)
    If number is odd, write O; if number is even, write E (≈ 2 states)
K(x) ≈ log2 n + 20
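The same effect can be seen with Python source text standing in for Turing machine descriptions. This is only an analogy and an assumption of the sketch: `description` is a made-up helper, and the length of a Python expression is not K(x). The point is that the description grows with the number of digits of n, while the string it describes grows with n itself.

```python
def description(n):
    """A short Python expression whose value is (OE)^n."""
    return f'"OE" * {n}'


# Evaluate a small instance to confirm the description produces x.
small = description(5)
print(small, "->", eval(small))

# For large n, compare lengths without building the huge string.
for n in (10, 10**3, 10**9):
    print(n, "description length:", len(description(n)), "length of x:", 2 * n)
```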
Bounds on descriptive complexity
• Theorem 1: For every x of length n, K(x) is at most O(n). (Sharper bound: n + O(1).)
• Proof: Let x = x1...xn and consider the following TM:
  Write x1 to output tape and move right
  Write x2 to output tape and move right
  ...
  Write xn to output tape and halt.
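Descriptive complexity and randomness
• Theorem 2: For 99% of strings of length n, K(x) ≥ n − 10.
Scale of K(x) for strings of length n, from 0 up to n + O(1):
• Around O(log n): “simple” strings, such as 111...1, OEOE...OE, 3.14159265, 1212321234321
• Between O(log n) and n − 10: “randomness-deficient” strings
• Between n − 10 and n + O(1): “random-looking” strings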
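Theorem 2 is stated without proof; it follows from a counting argument. The sketch below assumes that descriptions 〈M〉 are binary strings, which the slides do not state explicitly. There are fewer short descriptions than strings of length n, and each description outputs at most one string.

```latex
\[
\#\{\text{descriptions of length} < n-10\}
  \;\le\; \sum_{k=0}^{n-11} 2^k
  \;=\; 2^{\,n-10} - 1
  \;<\; \frac{2^n}{1024}.
\]
\[
\Pr_{x \in \{0,1\}^n}\bigl[\,K(x) < n-10\,\bigr]
  \;<\; \frac{1}{1024} \;<\; 1\%,
\qquad\text{so}\qquad
\Pr_{x \in \{0,1\}^n}\bigl[\,K(x) \ge n-10\,\bigr] \;>\; 99\%.
\]
```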
Evaluating randomness
• How do we know if the casino is crooked?
• Idea: Compute K(sequence). If it is much less than n, this indicates that the sequence is not random
[Figure: observed outcomes 8 6 12 17 5 16 2 5 31 11 8 14 31 18 13 11 4 5 2 12 29 8 12 1]
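As a rough, computable stand-in for this idea, one can compare how well a general-purpose compressor does on the two kinds of sequences. This is a swapped-in technique, not K(x): the compressed size from Python's `zlib` is at best a loose upper-bound proxy for descriptive complexity, and the sequence length and random seed below are arbitrary choices for this sketch.

```python
import random
import zlib


def compressed_size(outcomes):
    """Bytes needed by zlib at its highest compression level."""
    return len(zlib.compress(outcomes.encode("ascii"), 9))


n = 1000
alternating = "OE" * (n // 2)        # what the crooked wheel produced
rng = random.Random(0)               # fixed seed, arbitrary choice
coin_flips = "".join(rng.choice("OE") for _ in range(n))

print("alternating:", compressed_size(alternating))
print("coin flips: ", compressed_size(coin_flips))
```

The alternating sequence compresses to far fewer bytes than the coin-flip sequence of the same length, which matches the intuition that it is much less random.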
Computing descriptive complexity
• Theorem: It is not possible to compute K(x).
• Proof: Suppose it is. Fix n and consider this TM M:
  Output the first x of length n (in lexicographic order) such that K(x) ≥ n − 10
  Such an x exists by Theorem 2.
• Let x = output of M. Then K(x) ≥ n − 10, but K(x) ≤ |〈M〉| = log2 n + O(1), since M only needs to store n.
• So (when n is large) we get K(x) > K(x), impossible!