UMass Lowell Computer Science 91.503 Analysis of Algorithms Prof. Karen Daniels Fall, 2008

UMass Lowell Computer Science 91.503Analysis of AlgorithmsProf. Karen DanielsFall, 2008 Lecture 7 Tuesday, 10/28/08 Approximation Algorithms

Approximation Algorithms Chapter 35

Motivation: Some Techniques for Treating NP-Complete Problems • Approximation Algorithms • Heuristic Upper or Lower Bounds • Greedy, Simulated Annealing, Genetic “Alg”, AI • Mathematical Programming • Linear Programming for part of problem • Integer Programming • Quadratic Programming... • Search Space Exploration: • Gradient Descent, Local Search, Pruning, Subdivision • Randomization, Derandomization • Leverage/Impose Problem Structure • Leverage Similar Problems

Basic Concepts Definitions

C = cost of algorithm’s solution C* = cost of optimal solution n = number of inputs = size of instance Definitions • Approximation Algorithm • produces “near-optimal” solution • Algorithm hasapproximation ratior(n) if: • Approximation Scheme • (1+e)-approximation algorithm for fixed e • e > 0 is an input • Polynomial-TimeApproximation Scheme • time is polynomial in n • Fully Polynomial-TimeApproximation Scheme • time is also polynomial in (1/e) • constant-factor decrease in e causes only constant-factor running-time increase min max source: 91.503 textbook Cormen et al.

Resources beyond textbook… • “Approximation Algorithms for NP-Hard Problems”, editor Dorit Hochbaum, PWS Publishing, Co., 1997. • “Approximation Algorithms” by Vazirani, 2001. • “Computers and Intractability” by Garey & Johnson, 1979. • UMass Lowell course taught this semester by Prof. Jie Wang

Overview • VERTEX-COVER • Polynomial-time 2-approximation algorithm • TSP • TSP with triangle inequality • Polynomial-time 2-approximation algorithm • TSP without triangle inequality • Negative result on polynomial-time r(n)-approximation algorithm • MAX-3-CNF Satisfiability • Randomized r(n)-approximation algorithm • SET-COVER • polynomial-time r(n)-approximation algorithm • r(n) is a logarithmic function of set size • SUBSET-SUM • Fully polynomial-time approximation scheme

Vertex Cover Polynomial-Time 2-Approximation Algorithm

B source: Garey & Johnson Vertex Cover A vertex coverof size 2 C Vertex Coverof an undirected graph G=(V,E) is a subset D F E NP-Complete

b d c d a f e g a f e g d a g a c a a f g Vertex Cover source: 91.503 textbook Cormen et al.

Any vertex cover must include >= 1 endpoint of each edge in A APPROX-VERTEX-COVER adds both endpoints of each edge of A Vertex Cover Theorem:APPROX-VERTEX-COVER is a polynomial-time 2-approximation algorithm. Proof: Observe: no 2 edges of A share any vertices in C due to removal of incident edges transitivity Algorithm runs in time polynomial in n.

Traveling Salesman TSP with triangle inequality Polynomial-time 2-approximation algorithm TSP without triangle inequality Negative result on polynomial-time r(n)-approximation algorithm

Hamiltonian Cycle 34.2 Hamiltonian Cycleof an undirected graph G=(V,E) is a simple cycle that contains each vertex in V. NP-Complete source: 91.503 textbook Cormen et al.

u v x w Traveling Salesman Problem (TSP) TSP Tourof a complete, undirected, weighted graph G=(V,E) is a Hamiltonian Cycle with a designated starting/ending vertex. TSP Decision Problem: cost function provides edge weights NP-Complete source: 91.503 textbook Cormen et al.

Invariant: Minimum weight spanning forest Becomes single tree at end Invariant: Minimum weight tree 2 A B 4 3 G 6 Spans all vertices at end 5 1 1 E 7 6 F 8 4 D C 2 Minimum Spanning Tree:Greedy Algorithms Time: O(|E|lg|E|) given fast FIND-SET, UNION • Produces minimum weight tree of edges that includes every vertex. Time: O(|E|lg|V|) = O(|E|lg|E|) slightly faster with fast priority queue • forUndirected, Connected, Weighted Graph • G=(V,E) source: 91.503 textbook Cormen et al.

u v 5 3 3 5 x w 4 Cost Function Satisfies Triangle Inequality TSP with Triangle Inequality “full walk” = abcbhbadefegeda optimal tour (not necessarily found by approximation algorithm) final approximate tour source: 91.503 textbook Cormen et al.

TSP with Triangle Inequality Theorem:APPROX-TSP-TOUR is a polynomial-time 2-approximation algorithm for TSP with triangle inequality. Proof: Algorithm runs in time polynomial in n = |V |. (since deleting 1 edge from a tour creates a spanning tree) (lists vertices when they are first visited and when returned to after visiting subtree) (since full walk traverses each edge of T twice) Now make W into a tour H using triangle inequality. (New inequality holds since H is formed by deleting duplicate vertices from W.)

TSP without Triangle Inequality Theorem:If , then for any constant polynomial-time approximation algorithm with ratio r for TSP without triangle inequality. Suppose there is one --- call it A Proof: (by contradiction) Showing how to use A to solve NP-complete Hamiltonian Cycle problem contradiction! Convert instance G of Hamiltonian Cycle into instance of TSP (in polynomial time): (G’=(V,E’) is complete graph on V) (assign integer cost to each edge in E’) For TSP problem (G’,c): G has Hamiltonian Cycle (G’,c) contains tour of cost |V| Tour of G’ must use some edge not in E G does not have Hamiltonian Cycle Cost of that tour of G’ Can use A on (G’,c)! A finds tour of cost at most r(length of optimal tour) G has Hamiltonian Cycle A finds tour of cost at most r|V| cost = |V| G does not have Hamiltonian Cycle A finds tour of cost > r |V| If , solving NP-complete hamiltonian cycle in polynomial time is a contradiction!

MAX-3-CNF Satisfiability 3-CNF Satisfiability Background Randomized Algorithms Randomized MAX-3-CNF SAT Approximation Algorithm

MAX-3-CNF Satisfiability • Background on Boolean Formula Satisfiability • Boolean Formula Satisfiability: Instance of language SAT is a boolean formula f consisting of: • n boolean variables: x1, x2, ... , xn • m boolean connectives: boolean function with 1 or 2 inputs and 1 output • e.g. AND, OR, NOT, implication, iff • parentheses • truth, satisfying assignments notions apply NP-Complete source: 91.503 textbook Cormen et al.

MAX-3-CNF Satisfiability (continued) • Background on 3-CNF-Satisfiability • Instance of language SAT is a boolean formula f consisting of: • literal: variable or its negation • CNF = conjunctive normal form • conjunction: AND of clauses • clause: OR of literal(s) • 3-CNF: each clause has exactly 3 distinct literals MAX-3-CNF Satisfiability: optimization version of 3-CNF-SAT - Maximization: satisfy as many clauses as possible - Input Restrictions: - exactly 3 literals/clause - no clause contains both variable and its negation (this assumption can be removed) NP-Complete source: 91.503 textbook Cormen et al.

Definition • Randomized Algorithmhasapproximation ratior(n) if, forexpected cost Cof solution produced by randomized algorithm: size of input = n source: 91.503 textbook Cormen et al.

Randomized Approximation Algorithm for MAX-3-CNF SAT Theorem: Given an instance of MAX-3-CNF satisfiability with n variables x1, x2,...,xn with m clauses, the randomized algorithm that independently sets each variable to 1 with probability 1/2 and to 0 with probability 1/2 is a randomized 8/7-approximation algorithm. Proof: source: 91.503 textbook Cormen et al.

Set Cover Greedy Approximation Algorithm polynomial-time r(n)-approximation algorithm r(n) is a logarithmic function of set size

Set Cover Problem NP-Complete source: 91.503 textbook Cormen et al.

Greedy Set Covering Algorithm Greedy: select set that covers the most uncovered elements source: 91.503 textbook Cormen et al.

paid only when x is covered for the first time assume x is covered for the first time by Si (spread cost evenly across all elements covered for first time by Si ) Number of elements covered for first time by Si Set Cover Theorem:GREEDY-SET-COVER is a polynomial-time r(n)-approximation algorithm for Proof: H(0)=0 Algorithm runs in time polynomial in n.

1 unit is charged at each stage of algorithm Set Cover (proof continued) Theorem:GREEDY-SET-COVER is a polynomial-time r(n)-approximation algorithm for Proof: (continued) Each x is in >= 1 S in C* Cost assigned to optimal cover:

And then conclude that: which will finish the proof Set Cover (proof continued) Theorem:GREEDY-SET-COVER is a polynomial-time r(n)-approximation algorithm for Proof: (continued) How does this relate to harmonic numbers?? We’ll show that:

Set Cover (proof continued) Proof of: For some set S: Number of elements of S remaining uncovered after S1…Si selected k = least index for which uk=0. Since, due to greedy nature of algorithm: since j <= ui-1 telescoping sum

Subset-Sum Exponential-Time Exact Algorithm Fully Polynomial-Time Approximation Scheme

Subset-Sum Problem positive integers Decision Problem: Optimization Problem seeks subset with largest sum <= t NP-Complete source: 91.503 textbook Cormen et al.

source: 91.503 textbook Cormen et al. Exponential-Time Exact Algorithm Identity: Li is sorted list of every element of Pi <= t MERGE-LISTS( L,L’ ) returns sorted list = merge of sorted L, L’ with duplicates removed.

Fully Polynomial-Time Approximation Scheme Theorem:APPROX-SUBSET-SUM is a fullypolynomial-time approximation scheme for subset-sum. see textbook Proof: source: 91.503 textbook Cormen et al.

UMass Lowell Computer Science 91.503 Analysis of Algorithms Prof. Karen Daniels Fall, 2008