Notes on assembly verification in the aTAM Days 22, 24 and 25 of Comp Sci 480
Introduction • We are going to study algorithms that verify tile sets • E.g., write an algorithm that determines whether a particular assembly is uniquely produced by a tile set • Critical resource: running time • Important problems: design of self-assembly simulators
Warm up • Problem 0 (Assembly verification = AV) • Input: A temperature τ and an assembly A • Tile set inferred from the assembly A • Output: Yes if A is producible, and No otherwise • Design an algorithm that solves problem 0 as efficiently as possible • Is every assembly producible? • If yes, then running time would be O(1) • Not every assembly is producible
The algorithm Algorithm Greedy-Grow (A, τ) • Start with A' = A(0,0) // (0,0) always has the seed • While there is a site (x,y) with A(x,y) = t and A'(x,y) = empty such that t can be added to A' at (x,y), add it. • If A ≠ A', then output "A is not producible" • Else output "A is producible" • Running time: ??? • Naive: O(|A|^2) • For each tile addition, the perimeter of A' is searched • Better: O(|A|) • Don’t search the perimeter over and over again • Maintain a list of all sites at which a tile could be placed immediately • |A| tile additions • O(1) amount of work per tile addition
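The O(|A|) version of Greedy-Grow can be sketched in Python as follows. This is a minimal sketch, not the course's reference implementation. The representation is an assumption made for illustration: a tile type is a tuple of four glues (north, east, south, west), a glue is a (label, strength) pair or None, and an assembly is a dict from sites to tile types. The names `bond_strength`, `greedy_grow` and `is_producible` are hypothetical.

```python
from collections import deque

# Assumed representation (not from the slides):
#   tile type = (north, east, south, west) glues
#   glue      = (label, strength) pair, or None for "no glue"
#   assembly  = dict mapping (x, y) sites to tile types
N, E, S, W = 0, 1, 2, 3
OFFSETS = {N: (0, 1), E: (1, 0), S: (0, -1), W: (-1, 0)}
OPPOSITE = {N: S, E: W, S: N, W: E}


def bond_strength(tile, pos, placed):
    """Total glue strength binding `tile` at `pos` to the tiles in `placed`."""
    x, y = pos
    total = 0
    for side, (dx, dy) in OFFSETS.items():
        neighbor = placed.get((x + dx, y + dy))
        glue = tile[side]
        # Two abutting glues bind only if they are identical and not None.
        if neighbor is not None and glue is not None and glue == neighbor[OPPOSITE[side]]:
            total += glue[1]
    return total


def greedy_grow(assembly, tau, seed=(0, 0)):
    """Return A': the part of `assembly` that can be grown from the seed."""
    placed = {seed: assembly[seed]}
    frontier = deque([seed])  # placed sites whose neighbors still need a look
    while frontier:
        x, y = frontier.popleft()
        for dx, dy in OFFSETS.values():
            pos = (x + dx, y + dy)
            if pos in assembly and pos not in placed:
                if bond_strength(assembly[pos], pos, placed) >= tau:
                    placed[pos] = assembly[pos]
                    frontier.append(pos)
    return placed


def is_producible(assembly, tau):
    """A is producible exactly when Greedy-Grow rebuilds all of A."""
    return greedy_grow(assembly, tau) == assembly
```

Each placed site is taken off the queue once and only its four neighbors are examined, so the total work is proportional to |A| rather than to |A| perimeter searches.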
Easy and hard problems • Some problems are “easy”, some are “hard” • We’ll classify problems as “easy” if one can exhibit an algorithm that solves it with running time O(n^4), where n is the size of the input • A “hard” problem is such that, no matter how clever the programmer, any algorithm that solves it has running time Ω(2^n)
Easy problem example • Input: a list of numbers x0, x1, …, xn-1 and a number x • Output: Yes if x = xi for some i = 0, …, n-1 and No otherwise • Algorithm: linear search • Running time: O(n) in the worst and average case (O(1) in the best case, when x = x0)
Another easy problem • Input: a temperature τ and an assembly A • Tile set inferred from the assembly A • Output: Yes if A is producible and No otherwise • Algorithm: use the Greedy-Grow algorithm • Running time is O(n), where n = |A| (size of input)
Hard problem example • Notation: • x V y = “x or y” // x || y • x Λ y = “x and y” // x && y • ¬x = “not x” // !x • Input: • n boolean variables x0, …, xn-1 • m clauses Cj • Each clause is a disjunction (“or”) of three boolean literals • Boolean literal = a boolean variable, OR its negation • That is, Cj = (lp V lq V lr), where 0 ≤ p < q < r ≤ n-1 and lp = xp or ¬xp, lq = xq or ¬xq, lr = xr or ¬xr • A formula φ = C0 Λ C1 Λ ∙∙∙ Λ Cm-1 • Conjunction (“and”) of the clauses • Output: Yes if there is a way to assign boolean values to the variables x0, …, xn-1 so as to make φ true and No otherwise • Example: φ = (x0 V ¬x1 V ¬x2) Λ (x0 V x1 V x3) • Output: Yes (setting x0 to TRUE satisfies the formula) • This problem is probably very hard • Not proven!
3SAT • The previous problem is usually called “3SAT” • Like I said, it is probably a very difficult problem for a computer program to solve • Most likely: any algorithm you write that solves 3SAT (finds a satisfying assignment, or reports that one doesn’t exist) will run in time Ω(c^n), where c > 1 • Try it!
3SAT – What not to do • This doesn’t work (it is correct, but far too slow)… • For a given 3SAT formula φ… • Enumerate all possible True / False assignments for the n variables. • If some True / False assignment satisfies φ, then output Yes, otherwise output No • This algorithm has running time… • ? • O(2^n) assignments to check, so exponential in n
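To make the brute-force idea concrete, here is a minimal Python sketch. The encoding is an assumption made for illustration: a literal is a pair (i, negated) standing for xi or ¬xi, a clause is a tuple of three literals, and a formula is a list of clauses. The function names are hypothetical.

```python
from itertools import product


def satisfies(formula, assignment):
    """True if `assignment` (a sequence of booleans) makes every clause true.

    A literal (i, negated) is true when assignment[i] differs from `negated`.
    """
    return all(
        any(assignment[i] != negated for i, negated in clause)
        for clause in formula
    )


def brute_force_3sat(formula, n):
    """Try all 2^n assignments; return a satisfying one, or None."""
    for values in product([False, True], repeat=n):
        if satisfies(formula, values):
            return values
    return None


# phi = (x0 V ¬x1 V ¬x2) Λ (x0 V x1 V x3), the example from the slides
phi = [
    ((0, False), (1, True), (2, True)),
    ((0, False), (1, False), (3, False)),
]
result = brute_force_3sat(phi, 4)
print("satisfiable" if result is not None else "not satisfiable")
```

The loop body is cheap, but the loop itself may run 2^n times, which is what makes this approach impractical for large n.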
3SAT is useful • We will use 3SAT to prove that certain problems in self-assembly are hard
Hw5 • Problem 1: self-assembly of “Pacman” shapes • What defines a thin Nxk Pacman shape? • N must be odd • k < log N / (log log N - log log log N) • Row 0 is 2k points wide • Row ⌊N/2⌋ (the middle row) is k points wide • Row N-1 is 2k points wide • For 0 < i ≤ ⌊N/2⌋, row i can be no wider than row i-1 • For ⌊N/2⌋ < i ≤ N-1, row i-1 can be no wider than row i
Pacman example • Nxk Pacman shape, where N = 19, and k = 5 (not thin, but oh well) • [Figure: a Pacman shape. Annotations: the top and bottom rows must be 10 tiles wide, the middle row must be 5 tiles wide, and the seed tile is placed at the origin] • If T uniquely produces some Nxk thin Pacman shape, then |T| ≥ ??? • Note: for particular N,k values, there are many thin Pacman shapes.
3SAT warm up • Is the following formula “satisfiable”? • I.e., can you set the variables so that the formula is true? φ = (x0 V ¬x2 V ¬x4) Λ (¬x1 V x3 V x5) Λ (x2 V ¬x3 V x5) Λ (¬x4 V ¬x5 V x7) Λ (¬x2 V ¬x5 V ¬x6) Λ (¬x0 V x3 V ¬x7)
Two problems • Problem 1 (Unique assembly verification = UAV) • Input: A temperature τ and an assembly A • Tile set inferred from the assembly A • Output: YES if A is uniquely produced and NO otherwise • Problem 2 (Unique shape verification = USV) • Input: a tile set T, a temperature τ and a shape X • Output: YES if X is uniquely produced by T and NO otherwise • Design algorithms to solve these problems (one algorithm per problem) • What are the running time complexities? • Which one is easy? Which one is hard?
Algorithm for UAV Algorithm Unique-Assembly (A, T, τ) // Adleman et al., 2002 • Step 1: Let A' = Greedy-Grow(A, τ). If A ≠ A', then A is not producible. • Step 2: For all non-empty sites (x,y), test whether any tile t can be added at an adjacent empty site. If YES, then A is not terminal. • Step 3: For all non-empty sites (x,y), let A/(x,y) be the assembly A with the tile at (x,y) removed. Let A'' = Greedy-Grow(A/(x,y), τ). If a tile t ≠ A(x,y) can be added to A'' at (x,y), then A is not uniquely produced. • If A does not fail any of the above three tests, then A is uniquely produced and terminal.
Algorithm for UAV (continued) • Is the first step (the producibility check) necessary? YES! • [Figure: an assembly with tiles labeled 0 through 7 that is stable, but not producible, at temperature 2]
Running time complexity • What is the running time for the Unique-Assembly algorithm? • Step 1 (is producible?): O(|A|) • Step 2 (is terminal?): O(|A|*|T|) • Step 3 (checking locations): O(|A|^2) • |A| calls to Greedy-Grow • Running time: O(|A|*|T| + |A|^2)
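The three tests of Unique-Assembly can be sketched in Python as follows. This is a sketch under stated assumptions, not the algorithm exactly as published: tiles are (north, east, south, west) tuples of (label, strength) glues or None, an assembly is a dict from sites to tiles, the tile set T is inferred from the assembly, and the seed is never removed in the third test. All function names are hypothetical.

```python
# Assumed representation: tile = (north, east, south, west) glues,
# glue = (label, strength) or None, assembly = dict {(x, y): tile}.
N, E, S, W = 0, 1, 2, 3
OFFSETS = {N: (0, 1), E: (1, 0), S: (0, -1), W: (-1, 0)}
OPPOSITE = {N: S, E: W, S: N, W: E}


def neighbors(pos):
    """The four sites adjacent to `pos`, in the order N, E, S, W."""
    x, y = pos
    return [(x + dx, y + dy) for dx, dy in OFFSETS.values()]


def bond_strength(tile, pos, placed):
    """Total glue strength binding `tile` at `pos` to the tiles in `placed`."""
    total = 0
    for side, q in zip((N, E, S, W), neighbors(pos)):
        other, glue = placed.get(q), tile[side]
        if other is not None and glue is not None and glue == other[OPPOSITE[side]]:
            total += glue[1]
    return total


def greedy_grow(assembly, tau, seed=(0, 0)):
    """Return the part of `assembly` that can be grown from the seed."""
    placed = {seed: assembly[seed]}
    frontier = [q for q in neighbors(seed) if q in assembly]
    while frontier:
        pos = frontier.pop()
        if pos in placed or bond_strength(assembly[pos], pos, placed) < tau:
            continue
        placed[pos] = assembly[pos]
        frontier.extend(q for q in neighbors(pos) if q in assembly and q not in placed)
    return placed


def unique_assembly(assembly, tau, seed=(0, 0)):
    tile_types = set(assembly.values())  # tile set inferred from A
    # Step 1: is A producible?
    if greedy_grow(assembly, tau, seed) != assembly:
        return "not producible"
    # Step 2: is A terminal? Check every empty site next to a tile.
    border = {q for p in assembly for q in neighbors(p) if q not in assembly}
    if any(bond_strength(t, q, assembly) >= tau for q in border for t in tile_types):
        return "not terminal"
    # Step 3: remove each non-seed tile, regrow, and look for a competitor.
    for pos, tile in assembly.items():
        if pos == seed:
            continue  # assumption: the seed is fixed and is never removed
        reduced = {p: t for p, t in assembly.items() if p != pos}
        grown = greedy_grow(reduced, tau, seed)
        if any(t != tile and bond_strength(t, pos, grown) >= tau for t in tile_types):
            return "not uniquely produced"
    return "uniquely produced and terminal"
```

The third test is the expensive one: it calls `greedy_grow` once per tile, which is where the |A|^2 term in the running time comes from.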
A special case • If the temperature is 1, then we can do much better than O(|A|*|T| + |A|^2) • Doty gives a special algorithm for verifying unique production at temperature 1 • Running time: O(|A|*log |T|) • Proven in ~2012 • Details omitted • See: http://www.dna.caltech.edu/~ddoty/papers/phsa.pdf
An observation • The unique assembly verification algorithms also work in 3D • We’ll be reminded of this later…
A really difficult problem • How do we design an algorithm for USV? • The problem: • In USV, a tile set may produce exponentially many assemblies that all have the same shape • Can’t do a brute-force verification over all possible assemblies
A different approach • Don’t try to solve USV directly by designing an algorithm that can determine if a tile set uniquely produces a shape • Use the USV problem to solve another (probably) hard problem… • 3SAT!
Solve 3SAT with self-assembly • Let φ be a 3SAT formula with n variables and m clauses • Let’s design a tile set Tφ that “solves” the 3SAT formula φ • Basic idea (two phases): • Tφ will generate all possible True/False assignments to the n variables • For each True/False assignment of the n variables, Tφ will determine whether or not that assignment solves φ (the assembly in which this happens should be rectangular) • If φ is NOT solved by some True/False assignment, don’t place a tile in the corner (the resulting shape is NOT a rectangle)
Something from last time • Can we prove the following “easily”?: log log N < log N / (log log N – log log log N) • Easy, once you realize this: log(a / b) = log a - log b (and log x < x) • log N / (log log N – log log log N) = log N / log (log N / log log N) > log N / (log N / log log N) = log N * log log N / log N = log log N
More efficient algorithm for UAV? Algorithm Unique-Assembly (A, T, τ) // Adleman et al., 2002 • Step 1: Let A' = Greedy-Grow(A, τ). If A ≠ A', then A is not producible. • Step 2: For all non-empty sites (x,y), test whether any tile t can be added at an adjacent empty site. If YES, then A is not terminal. • Step 3 (modified): Let A' be an empty assembly. While there is a site (x,y) with A(x,y) = t and A'(x,y) = empty such that t can be added to A' at (x,y), add it. If at any point in this process, two tile types could be placed at (x,y), then A is not uniquely produced. • If A does not fail any of the above three tests, then A is uniquely produced and terminal.
More efficient algorithm for UAV? (continued) • Running time: ? • O(|A|) + O(|A|*|T|) + O(|A|) = O(|A|*|T|) • Does it work?
NO! Counter-example… • Suppose we have a tile set that can produce two assemblies like this… • [Figure: two assemblies built from tiles labeled S, 1, 2, 3, 4, 5, u, v, w, x and y] • The previously shown (efficient) algorithm for UAV would sometimes say that either of these assemblies is uniquely produced
Variable tiles – one for each of the n variables • [Figure: the tile set Tφ; glue labels in the figure include *, x0, x1, …, xn-1, C0, C1, …, Cm-1, 0xi, 1xi, OK, T, F, SAT, TL, BL and BR]
Clause tiles – one for each of the m clauses • [Figure: the tile set Tφ]
Assignment tiles – these tiles attach non-deterministically to “guess” a true/false assignment for the n variables (2^n possible assemblies) • [Figure: the tile set Tφ]
Computation tiles – these tiles initially bind via south and west. Only one of the tile types per row is created • [Figure: the tile set Tφ; annotations on the computation tiles: “If xi = 1 Cj true / Otherwise” and “If xi = 0 Cj true / Otherwise”]
Propagation tiles – once a clause is satisfied (made to be true), then we propagate the OK signal to the top of the assembly • [Figure: the tile set Tφ]
Status tiles – track the status of the whole formula being satisfied. Once a false clause is found, the formula cannot be made true after that. • [Figure: the tile set Tφ]
An example • Here’s an example formula: φ = (x0 V x1 V ¬x2) Λ (x0 V ¬x1 V x2) Λ (¬x0 V x1 V ¬x2) • Can φ be solved? • Can you assign values to the 3 variables that make all the clauses of φ true? • Sure! • x0 = true, x1 = true and x2 = false • Not every true/false assignment works, e.g., x0 = true, x1 = false and x2 = true • We can convert φ into a tile set Tφ and use the tile assembly model to solve φ
[Figure: the terminal assemblies of Tφ for the example formula, one for each of the 2^3 = 8 true/false assignments to x0, x1 and x2] • φ = (x0 V x1 V ¬x2) Λ (x0 V ¬x1 V x2) Λ (¬x0 V x1 V ¬x2) • All possible true/false assignments tested… • Each true/false assignment represented by a different terminal assembly
Tile complexity • If φ is a 3SAT formula with n variables and m clauses, then what is |Tφ| ?? • |Tφ| = O(m + n) • Time complexity to create Tφ: O(m + n)
What does this mean? • For each 3SAT formula φ, we can create a tile set Tφ that “solves” φ in a special way • I.e., either placing (or NOT placing) the upper-right corner tile depending on whether φ is solved by a particular True/False assignment • If Tφ does not uniquely produce a shape, then we can conclude that φ CAN be solved
Just pretend • Now pretend that there is an algorithm, say Unique-Shape, that solves USV in time O(|X|) • Input: a tile set T, a temperature τ and a shape X • Output: Yes if T uniquely produces a shape (also output the points in the shape in this case) and No otherwise (no output) • How do we use Unique-Shape in an algorithm, say Solve-3SAT, that can solve 3SAT with running time “O(n)”?
Solve 3SAT using USV Algorithm Solve-3SAT(φ) // φ is a 3SAT formula with n variables and m clauses • Create a tile set Tφ as previously discussed. • If Unique-Shape(Tφ) outputs No, then output Yes. • If Unique-Shape(Tφ) outputs Yes, then output Yes if the shape is a rectangle, and No if the shape is a rectangle with the top-right corner missing. • Running time of Solve-3SAT: O(m + n) + “O(|X|)” + “O(|X|)” + “O(1)” = O(m + n) + “O(1)” = O(m + n) + O(1) = O(n) • But this means 3SAT is an “easy” problem, because we have an algorithm that solves it in time O(n) • In other words, assuming USV is easy, so too is 3SAT. This contradicts the belief that 3SAT is hard! • Thus, USV is probably not easy (so an efficient Unique-Shape probably doesn’t exist).
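The decision logic of Solve-3SAT can be sketched in Python. No efficient Unique-Shape is known, so this sketch substitutes a hypothetical stand-in, `unique_shape_stub`, which takes φ directly instead of Tφ and simply enumerates all assignments in exponential time. It only mimics what the terminal assemblies of Tφ would look like: a full rectangle when the guessed assignment satisfies φ, and a rectangle with the corner missing otherwise. The literal encoding (i, negated) and all names are assumptions made for illustration.

```python
from itertools import product

RECTANGLE = "rectangle"
MISSING_CORNER = "rectangle with the top-right corner missing"


def satisfies(formula, assignment):
    """A literal (i, negated) is true when assignment[i] differs from `negated`."""
    return all(
        any(assignment[i] != negated for i, negated in clause)
        for clause in formula
    )


def unique_shape_stub(formula, n):
    """Hypothetical stand-in for Unique-Shape(T_phi); runs in exponential time.

    Collects the shape that each of the 2^n assignments would assemble into.
    Returns (True, shape) if there is exactly one shape, else (False, None).
    """
    shapes = {
        RECTANGLE if satisfies(formula, values) else MISSING_CORNER
        for values in product([False, True], repeat=n)
    }
    if len(shapes) == 1:
        return True, shapes.pop()
    return False, None


def solve_3sat(formula, n, unique_shape=unique_shape_stub):
    """Decide satisfiability using only the answer of a Unique-Shape oracle."""
    is_unique, shape = unique_shape(formula, n)
    if not is_unique:
        # Both shapes occur, so at least one assignment satisfies the formula.
        return True
    # A single shape: every assignment satisfies the formula, or none does.
    return shape == RECTANGLE
```

Note that `solve_3sat` itself does only constant work on top of the oracle call, which is the point of the argument: all of the difficulty of 3SAT is pushed into Unique-Shape.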
Summary • Assembly verification (2D/3D) • Easy • Unique assembly verification (2D/3D) • Easy • Unique shape verification (2D/3D) • Difficult