Parameterized Algorithms: Randomized Techniques. Bart M. P. Jansen. August 18th 2014, Będlewo.
Randomized computation • For some tasks, finding a randomized algorithm is much easier than finding a deterministic one • We consider algorithms that have access to a stream of uniformly random bits • So we do not consider randomly generated inputs • The actions of the algorithm depend on the values of the random bits • Different runs of the algorithm may give different outcomes for the same input
Monte Carlo algorithms • A Monte Carlo algorithm with false negatives and success probability $p \in (0,1)$ is an algorithm for a decision problem that • given a no-instance, always returns no, and • given a yes-instance, returns yes with probability at least $p$ • Since the algorithm is always correct on no-instances but may fail on yes-instances, it has one-sided error • If $p$ is a positive constant, we can repeat the algorithm a constant number of times • to ensure that the probability of failure is smaller than the probability that cosmic radiation causes a bit to flip in memory • (which would invalidate even a deterministic algorithm) • If $p$ is not a constant, we can also boost the success probability
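Independent repetitions increase success probability • Suppose we have a Monte Carlo algorithm with one-sided error and success probability $p$, which may depend on $k$ • For example, $p = 1/f(k)$ • If we repeat the algorithm $t$ times, the probability that all runs fail is at most $(1-p)^t \le (e^{-p})^t = e^{-pt}$, since $1 + x \le e^x$ • So the repeated algorithm is correct with probability at least $1 - e^{-pt}$ • Using $t = \lceil 1/p \rceil$ trials gives success probability at least $1 - 1/e$ • For example, $t = f(k)$ when $p = 1/f(k)$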
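The repetition bound can be turned into a small calculation. The sketch below is illustrative only: the names `trials_for_failure_bound` and `amplify` are hypothetical, and it assumes the bound $(1-p)^t \le e^{-pt}$ from the slide, so it picks the smallest $t$ with $e^{-pt} \le \delta$ for a target failure probability $\delta$.

```python
import math

def trials_for_failure_bound(p: float, delta: float) -> int:
    """Smallest t with exp(-p * t) <= delta, i.e. t >= ln(1/delta) / p.

    Assumes a one-sided Monte Carlo algorithm with success probability p.
    """
    if not (0 < p <= 1 and 0 < delta < 1):
        raise ValueError("need 0 < p <= 1 and 0 < delta < 1")
    return math.ceil(math.log(1 / delta) / p)

def amplify(run_once, trials: int) -> bool:
    """Answer yes iff some run answers yes.

    This is safe because a one-sided algorithm never says yes on a no-instance.
    """
    return any(run_once() for _ in range(trials))
```

For instance, with $p = 0.1$ and $\delta = 0.01$ the function asks for 47 runs, and $0.9^{47}$ is indeed below $0.01$.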
Color coding • Randomly assign colors to the input structure • If there is a solution and we are lucky with the coloring, every element of the solution has received a different color • Then find an algorithm to detect such colorful solutions • Solutions whose elements have pairwise different colors
The odds of getting lucky • Lemma. • Let $U$ be a set of size $n$, and let $X \subseteq U$ have size $k$ • Let $\chi \colon U \to [k]$ be a coloring of the elements of $U$, chosen uniformly at random • Each element of $U$ is colored with one of $k$ colors, uniformly and independently at random • The probability that the elements of $X$ are colored with pairwise distinct colors is at least $e^{-k}$ • Proof. • There are $k^n$ possible colorings • In $k! \cdot k^{n-k}$ of them, all colors on $X$ are distinct • So the probability is $k!/k^k > e^{-k}$ • We used $k! > (k/e)^k$
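As a sanity check on the lemma, the exact probability $k!/k^k$ can be compared with the bound $e^{-k}$ for small $k$. This is only a numeric illustration, not part of the proof; the function name `colorful_probability` is made up for this sketch.

```python
import math
from fractions import Fraction

def colorful_probability(k: int) -> Fraction:
    """Exact probability that k fixed elements receive pairwise distinct
    colors when each is colored uniformly at random with k colors."""
    return Fraction(math.factorial(k), k ** k)

# Compare the exact value with the lower bound e^{-k} for small k.
for k in range(1, 11):
    assert float(colorful_probability(k)) >= math.exp(-k)
```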
The Longest Path problem Input: An undirected graph $G$ and an integer $k$ Parameter: $k$ Question: Is there a simple path on $k$ vertices in $G$? • A solution is a $k$-path • The Longest Path problem is a restricted version of the problem of finding patterns in graphs
Color coding for Longest Path • Color the vertices of $G$ randomly with $k$ colors • We want to detect a colorful $k$-path if one exists • Use dynamic programming over subsets • For every subset of colors $S \subseteq [k]$ and vertex $v \in V(G)$, define $T[S,v]$ = true iff there is a colorful path whose colors are $S$ and that has $v$ as an endpoint
The dynamic programming table • For every subset of colors $S \subseteq [k]$ and vertex $v \in V(G)$, define $T[S,v]$ = true iff there is a colorful path whose colors are $S$ and that has $v$ as an endpoint • A colorful $k$-path exists iff $T[[k],v]$ = true for some $v \in V(G)$
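A recurrence to fill the table • If $S$ is a singleton set, containing some color $a$: $T[\{a\},v]$ = true if and only if $\chi(v) = a$ • If $|S| > 1$: $T[S,v] = \bigvee_{u \in N(v)} T[S \setminus \{\chi(v)\}, u]$ if $\chi(v) \in S$, and false otherwise • Fill the table in time $2^k n^{O(1)}$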
Randomized algorithm for Longest Path • Algorithm LongPath(Graph $G$, integer $k$) • repeat $e^k$ times: • Color the vertices of $G$ uniformly at random with $k$ colors • Fill the DP table $T$ • if $\exists v \in V(G)$ such that $T[[k],v]$ = true then return yes • return no • By standard DP techniques we can construct the path as well • For each cell, store a backlink to the earlier cell that determined its value
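The algorithm above can be sketched in a few lines of Python. This is a minimal illustration under assumptions that are not in the slides: the graph is a dict `adj` mapping each vertex to the set of its neighbours, colors are the integers `0..k-1`, color subsets are bitmasks, and the names `has_colorful_path` and `long_path` are hypothetical. Instead of a boolean table it stores, for each subset $S$, the set of vertices $v$ with $T[S,v]$ = true.

```python
import math
import random

def has_colorful_path(adj, color, k):
    """Return True iff some path uses each of the k colors exactly once."""
    full = (1 << k) - 1
    # ends[S] = vertices v for which T[S, v] is true
    ends = [set() for _ in range(full + 1)]
    # Numeric order suffices: S without one bit is always smaller than S.
    for S in range(1, full + 1):
        for v in adj:
            bit = 1 << color[v]
            if not S & bit:
                continue                     # chi(v) not in S: T[S, v] = false
            rest = S ^ bit                   # S \ {chi(v)}
            if rest == 0 or any(u in ends[rest] for u in adj[v]):
                ends[S].add(v)
    return bool(ends[full])

def long_path(adj, k, rng=random):
    """Monte Carlo test for a k-vertex simple path (assumes k >= 1).

    A yes answer is always correct; a no answer may be a false negative.
    """
    for _ in range(math.ceil(math.e ** k)):
        color = {v: rng.randrange(k) for v in adj}
        if has_colorful_path(adj, color, k):
            return True
    return False
```

The table has $2^k \cdot n$ cells and each cell inspects the neighbours of one vertex, which matches the $2^k n^{O(1)}$ bound for one coloring.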
Analysis for the Longest Path algorithm • Running time is $e^k \cdot 2^k \cdot n^{O(1)} = (2e)^k n^{O(1)}$ • By the get-lucky lemma, if there is a $k$-path, it becomes colorful with probability at least $e^{-k}$ • If the coloring produces a colorful $k$-path, the DP finds it • By the independent repetition lemma, $e^k$ repetitions give constant success probability • Theorem. There is a Monte Carlo algorithm for Longest Path with one-sided error that runs in time $(2e)^k n^{O(1)}$ and has constant success probability
Discussion of color coding • When doing dynamic programming, color coding effectively allows us to reduce the number of states from • keeping track of all vertices visited by the path, $n^k$, to • keeping track of all colors visited by the path, $2^k$ • The technique extends to finding size-$k$ occurrences of other “thin” patterns in graphs • A size-$k$ pattern graph of treewidth $t$ can be found in time $2^{O(k)} n^{O(t)}$, with constant probability
The Subgraph Isomorphism problem Input: A host graph $G$ and pattern graph $H$ (undirected) Parameter: $k := |V(H)|$ Question: Does $G$ have a subgraph isomorphic to $H$?
Background • The traditional color coding technique gives FPT algorithms for Longest Path • Even for Subgraph Isomorphism when the pattern graph has constant treewidth • If the pattern graph is unrestricted, we expect that no FPT algorithm exists for Subgraph Isomorphism • It generalizes the $k$-Clique problem • Canonical W[1]-complete problem used to establish parameterized intractability (more later) • If the host graph $G$ (and therefore the pattern graph $H$) has constant degree, there is a nice randomized FPT algorithm
Random 2-coloring of host graphs • Suppose $G$ is a host graph that contains a subgraph $\hat{H}$ isomorphic to a connected $k$-vertex pattern graph $H$ • Color the edges of $G$ uniformly and independently at random with colors red ($R$) and blue ($B$) • If all edges of $\hat{H}$ are colored red, and all other edges incident to $V(\hat{H})$ are colored blue, it is easy to identify $\hat{H}$ • The pattern occurs as a connected component of $G[R]$ • Isomorphism of two $k$-vertex graphs can be tested in time $k! \cdot k^{O(1)}$
Probability of isolating the pattern subgraph • Let $H$ be a $k$-vertex subgraph of graph $G$ • A 2-coloring of $E(G)$ isolates $H$ if the following holds: • All edges of $H$ are red • All other edges incident to $V(H)$ are blue • Observation. If the maximum degree of $G$ is $d$, the probability that a random 2-coloring of $E(G)$ isolates a fixed $k$-vertex subgraph $H$ is at least $2^{-dk}$ • There are at most $dk$ edges incident on $V(H)$ • Each such edge is colored correctly with probability $1/2$
Randomized algorithm for Subgraph Isomorphism • Algorithm SubIso(Host graph $G$, connected pattern graph $H$) • Let $d$ be the maximum degree of $G$ • Let $k$ be the number of vertices in $H$ • repeat $2^{dk}$ times: • Color the edges of $G$ uniformly at random with colors R, B • for each $k$-vertex connected component $C$ of $G[R]$: • if $C$ is isomorphic to $H$, then return yes • return no • Easy to extend the algorithm to disconnected patterns • Theorem. There is a Monte Carlo algorithm for Subgraph Isomorphism with one-sided error and constant success probability. For $k$-vertex pattern graphs in a host graph of maximum degree $d$, the running time is $2^{dk} \cdot k! \cdot n^{O(1)}$
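A compact sketch of this random separation algorithm follows. It is written under assumptions that are not in the slides: simple undirected graphs given as dicts from vertices to neighbour sets, a connected pattern, and a brute-force isomorphism test over all $k!$ bijections. All function names are hypothetical. The loop runs $2^{dk}$ times, so it is only practical for very small $d$ and $k$.

```python
import itertools
import random

def edge_set(adj):
    """Edges of a simple undirected graph as 2-element frozensets."""
    return {frozenset((u, v)) for u in adj for v in adj[u]}

def isomorphic(adj_a, adj_b):
    """Brute-force isomorphism test that tries every bijection."""
    if len(adj_a) != len(adj_b):
        return False
    edges_a, edges_b = edge_set(adj_a), edge_set(adj_b)
    if len(edges_a) != len(edges_b):
        return False
    va = list(adj_a)
    for image in itertools.permutations(adj_b):
        m = dict(zip(va, image))
        if all(frozenset(m[x] for x in e) in edges_b for e in edges_a):
            return True
    return False

def red_components(vertices, red_edges):
    """Yield each connected component of the red subgraph as an adjacency dict."""
    red_adj = {v: set() for v in vertices}
    for e in red_edges:
        u, v = tuple(e)
        red_adj[u].add(v)
        red_adj[v].add(u)
    seen = set()
    for s in vertices:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            for y in red_adj[stack.pop()] - comp:
                comp.add(y)
                stack.append(y)
        seen |= comp
        yield {v: red_adj[v] for v in comp}

def sub_iso(host, pattern, rng=random):
    """One-sided Monte Carlo test: True is always correct."""
    d = max((len(nb) for nb in host.values()), default=0)
    k = len(pattern)
    edges = list(edge_set(host))
    for _ in range(2 ** (d * k)):
        red = [e for e in edges if rng.random() < 0.5]   # the rest is blue
        for comp in red_components(host, red):
            if len(comp) == k and isomorphic(comp, pattern):
                return True
    return False
```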
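The $p$-Clustering problem Input: A graph $G$ and an integer $k$ Parameter: $k$ Question: Is there a set $A \subseteq \binom{V(G)}{2}$ of at most $k$ adjacencies such that $G \oplus A$ consists of $p$ disjoint cliques? (Here $G \oplus A$ is $G$ with every pair in $A$ flipped between edge and non-edge.) • Such a graph is called a $p$-cluster graph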
How to color • $p$-Clustering looks for a set of (non-)edges, instead of vertices • We solve the problem on general graphs • By randomly coloring the input, we again hope to highlight a solution with good probability, making it easier to find • We color the vertices of the graph
Proper colorings • A set of adjacencies $A \subseteq \binom{V(G)}{2}$ is properly colored by a coloring $\chi$ of the vertices if: • For all pairs $uv \in A$, the colors of $u$ and $v$ are different • As before, two crucial ingredients: • What is the probability that a random coloring has the desired property? • How to exploit that property algorithmically? • We assign colors to the vertices and hope to obtain a property for the (non-)edges in a solution • This allows us to save on colors
Probability of finding a proper coloring • Lemma. If the vertices of a simple graph $G$ with $k$ edges are colored independently and uniformly at random with $\lceil \sqrt{8k} \rceil$ colors, then the probability that $E(G)$ is properly colored is at least $2^{-\sqrt{k/2}}$ • Corollary. If a $p$-Clustering instance $(G,k)$ has a solution set $A$ of $k$ adjacencies, the probability that $A$ is properly colored by a random coloring with $\lceil \sqrt{8k} \rceil$ colors is at least $2^{-\sqrt{k/2}}$ • For constant success probability, $2^{\sqrt{k/2}}$ repetitions suffice
Detecting a properly colored solution (I) • Suppose $\chi \colon V(G) \to [q]$ properly colors a solution $A$ of $(G,k)$ • The graph $G \oplus A$ is a $p$-cluster graph • For $i \in [q]$, let $V_i$ be the vertices colored $i$ • As $A$ is properly colored, no (non-)edge of $A$ has both ends in $V_i$ • No changes are made to $G[V_i]$ by the solution • $G[V_i]$ is an induced subgraph of a $p$-cluster graph • For all $i \in [q]$, the graph $G[V_i]$ is an $\ell$-cluster graph for some $\ell \le p$ • $G[V_i]$ consists of at most $p$ cliques that are not broken by the solution • Observation. The $q$-coloring partitions $V(G)$ into at most $pq$ cliques that are unbroken by the solution
Detecting a properly colored solution (II) • For each of the at most $pq$ cliques into which $V(G)$ is partitioned, guess into which of the $p$ final clusters it belongs • For each guess, compute the cost of this solution • Count edges between subcliques in different clusters • Count non-edges between subcliques in the same cluster • Total of $p^{pq}$ guesses, polynomial cost computation for each • Running time is $p^{pq} n^{O(1)}$ to detect a properly colored solution, if one exists • Using dynamic programming (exercise), this running time can be improved
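The guessing step can be sketched as a brute-force search. This sketch makes several assumptions that are not in the slides: the symbols $p$ (number of clusters) and $k$ (budget) are used as above, a $p$-cluster graph is taken to have exactly $p$ cliques, the graph is a dict from vertices to neighbour sets, and the function name is hypothetical. It also rejects guesses that put two cliques of the same color into one cluster, since joining them would need an edit between equally colored vertices.

```python
import itertools

def properly_colored_solution_exists(adj, color, p, k):
    """Brute force over all assignments of color-class cliques to p clusters."""
    vertices = list(adj)
    cliques = []                                   # (color, vertex set) pairs
    for c in set(color.values()):
        cls = {v for v in vertices if color[v] == c}
        seen = set()
        for s in cls:
            if s in seen:
                continue
            comp, stack = {s}, [s]
            while stack:                           # component of G[V_c]
                for y in (adj[stack.pop()] & cls) - comp:
                    comp.add(y)
                    stack.append(y)
            seen |= comp
            # Each component of a color class must already be a clique.
            if any(len(adj[v] & comp) != len(comp) - 1 for v in comp):
                return False
            cliques.append((c, comp))
    for guess in itertools.product(range(p), repeat=len(cliques)):
        if set(guess) != set(range(p)):
            continue                               # assumption: exactly p clusters
        if len({(c, g) for (c, _), g in zip(cliques, guess)}) < len(cliques):
            continue                               # same color twice in one cluster
        cluster = {v: g for (_, comp), g in zip(cliques, guess) for v in comp}
        # Cost = edges between clusters + non-edges inside clusters.
        cost = sum((cluster[u] == cluster[v]) != (v in adj[u])
                   for u, v in itertools.combinations(vertices, 2))
        if cost <= k:
            return True
    return False
```

On the path 0-1-2 with $p = 1$, the only solution adds the pair {0, 2}, so the search succeeds for $k = 1$ exactly when vertices 0 and 2 receive different colors.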
Randomized algorithm for $p$-Clustering • Algorithm $p$-Cluster(graph $G$, integer $k$) • Define $q := \lceil \sqrt{8k} \rceil$ • repeat $\lceil 2^{\sqrt{k/2}} \rceil$ times: • Color the vertices of $G$ uniformly at random with $q$ colors • if there is a properly colored solution of size at most $k$ then • return yes • return no • Theorem. There is a Monte Carlo algorithm for $p$-Clustering with one-sided error and constant success probability that runs in time $2^{O(\sqrt{k})} n^{O(1)}$ for every fixed $p$
Why derandomize? • Truly random bits are very hard to come by • Usual approach is to track radioactive decay • Standard pseudo-random generators might work • When spending exponential time on an answer, we do not want to get it wrong • Luckily, we can replace most applications of randomization by deterministic constructions • Without significant increases in the running time
How to derandomize • Different applications require different pseudorandom objects • Main idea: instead of picking a random coloring $\chi \colon [n] \to [k]$, construct a family $\mathcal{F}$ of functions $f_i \colon [n] \to [k]$ • Ensure that at least one function in $\mathcal{F}$ has the property that we hope to achieve by the random choice • Instead of independent repetitions of the Monte Carlo algorithm, run it once for every coloring in $\mathcal{F}$ • If the success probability of the random coloring is $1/f(k)$, we can often construct such a family of size $O(f(k) \log n)$
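Splitting evenly • Consider an $\ell$-coloring $f \colon U \to [\ell]$ of a universe $U$ • A subset $S \subseteq U$ is split evenly by $f$ if the following holds: • For every $j, j' \in [\ell]$ the sizes $|f^{-1}(j) \cap S|$ and $|f^{-1}(j') \cap S|$ differ by at most one • All colors occur almost equally often within $S$ • If a set $S$ of size $|S| \le \ell$ is split evenly, then $S$ is colorful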
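The definition is easy to check directly. The sketch below assumes a coloring given as a dict `f` from universe elements to colors `0..num_colors-1`; the function names are made up for illustration.

```python
from collections import Counter

def splits_evenly(f, S, num_colors):
    """True iff every two color classes of f restricted to S differ in size by at most one."""
    counts = Counter(f[x] for x in S)
    sizes = [counts.get(j, 0) for j in range(num_colors)]
    return max(sizes) - min(sizes) <= 1

def is_colorful(f, S):
    """True iff the elements of S have pairwise distinct colors."""
    return len({f[x] for x in S}) == len(S)
```

If $|S| \le \ell$ and some color occurred twice in $S$, an even split would force every color to occur at least once, giving more than $|S|$ elements. This is why an evenly split set of size at most $\ell$ is colorful.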
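Splitters • For $n, k, \ell \ge 1$, an $(n,k,\ell)$-splitter $\mathcal{F}$ is a family of functions from $[n]$ to $[\ell]$ such that: • For every set $S \subseteq [n]$ of size $k$, there is a function $f \in \mathcal{F}$ that splits $S$ evenly • Theorem. For any $n, k \ge 1$ one can construct an $(n,k,k^2)$-splitter of size $k^{O(1)} \log n$ in time $k^{O(1)} n \log n$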
Perfect hash families derandomize Longest Path • The special case of an $(n,k,k)$-splitter is called an $(n,k)$-perfect hash family • Instead of trying $e^k$ random colorings in the Longest Path algorithm, try all colorings in a perfect hash family • If $X \subseteq V(G)$ is the vertex set of a $k$-path, then $|X| = k$, so some function $f$ in the family splits $X$ evenly • Since $\ell = k$, this causes $X$ to be colorful • The DP then finds a colorful path • Theorem. For any $n, k \ge 1$ one can construct an $(n,k)$-perfect hash family of size $e^k k^{O(\log k)} \log n$ in time $e^k k^{O(\log k)} n \log n$
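Universal sets • For $n, k \ge 1$, an $(n,k)$-universal set is a family $\mathcal{U}$ of subsets of $[n]$ such that for any $S \subseteq [n]$ of size $k$, all $2^k$ subsets of $S$ are contained in the family $\{A \cap S \mid A \in \mathcal{U}\}$ • Universal sets can be used to derandomize the random separation algorithm for Subgraph Isomorphism (exercise) • Theorem. For any $n, k \ge 1$ one can construct an $(n,k)$-universal set of size $2^k k^{O(\log k)} \log n$ in time $2^k k^{O(\log k)} n \log n$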
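Coloring families • For $n, k, q \ge 1$, an $(n,k,q)$-coloring family is a family $\mathcal{F}$ of functions from $[n]$ to $[q]$ with the following property: • For every graph $G$ on the vertex set $[n]$ with at most $k$ edges, there is a function $f \in \mathcal{F}$ that properly colors $E(G)$ • Coloring families can be used to derandomize the chromatic coding algorithm for $p$-Clustering • Instead of trying random colorings, try all colorings in an $(n,k,2\lceil \sqrt{k} \rceil)$-coloring family • Theorem. For any $n, k \ge 1$ one can construct an $(n,k,2\lceil \sqrt{k} \rceil)$-coloring family of size $2^{O(\sqrt{k} \log k)} \log n$ in time $2^{O(\sqrt{k} \log k)} n \log n$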
The Feedback Vertex Set problem Input: A graph $G$ and an integer $k$ Parameter: $k$ Question: Is there a set $X$ of at most $k$ vertices in $G$, such that each cycle contains a vertex of $X$?
Reduction rules for Feedback Vertex Set (R1) If there is a loop at vertex $v$, then delete $v$ and decrease $k$ by one (R2) If there is an edge of multiplicity larger than $2$, then reduce its multiplicity to $2$ (R3) If there is a vertex $v$ of degree at most $1$, then delete $v$ (R4) If there is a vertex $v$ of degree two, then delete $v$ and add an edge between $v$'s neighbors • If (R1-R4) cannot be applied anymore, then the minimum degree is at least $3$ • Observation. If $(G,k)$ is transformed into $(G',k')$, then: • $G$ has an fvs of size $\le k$ $\Leftrightarrow$ $G'$ has an fvs of size $\le k'$ • Any feedback vertex set in $G'$ is a feedback vertex set in $G$ when combined with the vertices deleted by (R1)
How randomization helps • We have already seen a deterministic algorithm for Feedback Vertex Set • There is a simple randomized Monte Carlo algorithm • In polynomial time, we can find a size-$k$ solution with probability at least $4^{-k}$, if one exists • Repeating this $4^k$ times gives an algorithm with running time $4^k n^{O(1)}$ and constant success probability • Key insight is a simple procedure to select a vertex that is contained in a solution with constant probability
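Feedback vertex sets in graphs of min. deg. $\ge 3$ • Lemma. Let $G$ be an $n$-vertex multigraph with minimum degree at least 3. For every feedback vertex set $X$ of $G$, more than half the edges of $G$ have at least one endpoint in $X$. • Proof. Consider the forest $H := G - X$ • We prove that $|E(G) \setminus E(H)| > |E(H)|$ • $|V(H)| > |E(H)|$ for any forest $H$ • It suffices to prove $|E(G) \setminus E(H)| > |V(H)|$ • Let $J$ be the edges with one end in $X$ and the other in $V(H)$ • Let $V_{\le 1}$, $V_2$ and $V_{\ge 3}$ be the vertices of $H$ with $H$-degrees $\le 1$, $2$ and $\ge 3$ • Every vertex of $V_{\le 1}$ contributes $\ge 2$ to $|J|$ • Every vertex of $V_2$ contributes $\ge 1$ to $|J|$ • $|V_{\ge 3}| < |V_{\le 1}|$ in any forest • Hence $|E(G) \setminus E(H)| \ge |J| \ge 2|V_{\le 1}| + |V_2| > |V_{\le 1}| + |V_2| + |V_{\ge 3}| = |V(H)|$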
Monte Carlo algorithm for Feedback Vertex Set • Theorem. There is a randomized polynomial-time algorithm that, given a Feedback Vertex Set instance $(G,k)$, • either reports a failure, or • finds a feedback vertex set in $G$ of size at most $k$. • If $G$ has an fvs of size $\le k$, it returns a solution with probability at least $4^{-k}$
Monte Carlo algorithm for Feedback Vertex Set • Algorithm fvs(Graph $G$, integer $k$) • Exhaustively apply (R1)-(R4) to obtain $(G',k')$ • Let $X_0$ be the vertices with loops removed by (R1) • if $k' < 0$ then failure • if $G'$ is a forest then return $X_0$ • Uniformly at random, pick an edge $e$ of $G'$ • Uniformly at random, pick an endpoint $v$ of $e$ • return $X_0 \cup \{v\} \cup$ fvs($G' - v$, $k' - 1$)
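The recursive algorithm, together with the reduction rules (R1)-(R4), can be sketched as follows. The representation is an assumption made for this sketch: a multigraph is a set of comparable vertices plus a mapping from sorted pairs `(u, v)` with `u <= v` to edge multiplicities, with `(v, v)` denoting a loop. All function names are hypothetical, and `None` stands for failure. After exhaustive reduction the minimum degree is at least 3, so the reduced graph is a forest exactly when it has no vertices left, which is the test used below.

```python
import random
from collections import Counter

def delete_vertex(vertices, edges, v):
    """Remove v and every edge incident to it (in place)."""
    vertices.discard(v)
    for e in [e for e in edges if v in e]:
        del edges[e]

def degree(edges, v):
    """Degree of v in the multigraph; a loop counts twice."""
    return sum(m * (2 if a == b else 1)
               for (a, b), m in edges.items() if v in (a, b))

def reduce_instance(vertices, edges, k):
    """Exhaustively apply (R1)-(R4); returns (vertices, edges, k, X0)."""
    vertices, edges, x0 = set(vertices), Counter(edges), set()
    progress = True
    while progress:
        progress = False
        for e in list(edges):
            if e[0] != e[1] and edges[e] > 2:      # (R2) cap multiplicity at 2
                edges[e] = 2
                progress = True
        for v in list(vertices):
            if edges[(v, v)] > 0:                  # (R1) loop: v is in any fvs
                delete_vertex(vertices, edges, v)
                x0.add(v)
                k -= 1
            elif degree(edges, v) <= 1:            # (R3) v lies on no cycle
                delete_vertex(vertices, edges, v)
            elif degree(edges, v) == 2:            # (R4) bypass v
                ends = [a if b == v else b
                        for (a, b), m in edges.items() if v in (a, b)
                        for _ in range(m)]
                delete_vertex(vertices, edges, v)
                edges[tuple(sorted(ends))] += 1    # may create a loop
            else:
                continue
            progress = True
    return vertices, edges, k, x0

def fvs(vertices, edges, k, rng=random):
    """Return a feedback vertex set of size <= k, or None on failure."""
    vertices, edges, k, x0 = reduce_instance(vertices, edges, k)
    if k < 0:
        return None
    if not vertices:                               # reduced graph is a forest
        return x0
    keys = list(edges)
    e = rng.choices(keys, weights=[edges[key] for key in keys])[0]
    v = rng.choice(e)                              # random endpoint of e
    delete_vertex(vertices, edges, v)
    rest = fvs(vertices, edges, k - 1, rng)
    return None if rest is None else x0 | {v} | rest
```

On a triangle with $k = 1$ the reduction rules alone already solve the instance: (R4) is applied twice, which creates a loop, and (R1) then puts the remaining vertex into $X_0$.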
Correctness (I) • The algorithm outputs a feedback vertex set or failure • Claim: If $G$ has a size-$\le k$ fvs, then the algorithm finds a solution with probability at least $4^{-k}$ • Proof by induction on $k$ • Assume $G$ has a size-$k$ feedback vertex set $X$ • By safety of (R1)-(R4), $G'$ has a size-$k'$ fvs $X'$ • We have $k' = k - |X_0|$ • Since loops are in any fvs, we have $X_0 \subseteq X$ • If $k' = 0$, then $X' = \emptyset$ so $G'$ is a forest • Algorithm outputs $X_0$, which is a valid solution • If $k' \ge 1$, we will use the induction hypothesis
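Correctness (II) • Case $k' \ge 1$: • Probability that random $e$ has an endpoint in $X'$ is $\ge 1/2$ • Probability that $v \in X'$ is $\ge 1/4$ • If $v \in X'$, then $G' - v$ has an fvs of size $\le k' - 1$ • Then, by induction, with probability $\ge 4^{-(k'-1)}$ recursion gives a size-$\le (k'-1)$ fvs $X^*$ of $G' - v$ • So $X^* \cup \{v\}$ is a size-$\le k'$ fvs of $G'$ • By the reduction rules, the output $X_0 \cup \{v\} \cup X^*$ is an fvs of $G$ • Size is at most $k$ • Probability of success is $\ge \frac{1}{4} \cdot 4^{-(k'-1)} = 4^{-k'} \ge 4^{-k}$ • Theorem. There is a Monte Carlo algorithm for Feedback Vertex Set with one-sided error and constant success probability that runs in time $4^k n^{O(1)}$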
Discussion • This simple, randomized algorithm is faster than the deterministic algorithm from the previous lecture • The method generalizes to $\mathcal{F}$-minor-free deletion problems: delete $k$ vertices from the graph to ensure the resulting graph contains no member from the fixed set $\mathcal{F}$ as a minor • Feedback Vertex Set is $\{K_3\}$-minor-free deletion
Summary • Several variations of color coding give efficient FPT algorithms • The general recipe is as follows: • Randomly color the input, such that if a solution exists, one is highlighted with probability $1/f(k)$ • Show that a highlighted solution can be found in a colored instance in time $g(k) \cdot n^{O(1)}$ • For most problems we obtained single-exponential algorithms • For $p$-Clustering we obtained a subexponential algorithm