On the Minimum Common Integer Partition Problem • Authors: Xin Chen, Lan Liu, Zheng Liu, Tao Jiang • Presenter: Lan Liu
Outline • Introduction • Problem definitions • Biological applications • Approximation of 2-MCIP • Approximation of k-MCIP • Conclusion and future work
Problem Definitions • P(n): given a positive integer n, a partition of n is a multiset of positive integers {n1, n2, …, nr} such that n1 + n2 + … + nr = n. Example: given n=4, {2,2} is a P(4); given n=3, {3} is a P(3). • IP(S): given a multiset S = {x1, …, xm}, an integer partition of S is a disjoint union P(x1) ⊎ … ⊎ P(xm), one partition for each element of S. Example: given S = {3,3,4}, {2,2,3,3} is an IP({3,3,4}). • Observation: ΣS = ΣIP(S)
Examples • CIP(S1, S2, …, Sk): given multisets S1, S2, …, Sk, a common integer partition is a multiset that is an integer partition of every Si. Example: given S = {3,3,4}, T = {2,2,6}, {2,2,3,3} is a CIP(S,T); {1,1,2,2,4} is also a CIP(S,T). • Observations: (1) ∃ CIP(S1,…,Sk) ⇔ ΣS1 = … = ΣSk (2) |CIP(S1,…,Sk)| ≥ |Si| for every i ∈ {1,…,k} • MCIP(S1, S2, …, Sk): a common integer partition with the minimum cardinality. Example: {2,2,3,3} is an MCIP(S,T). • Number of partitions: #P(100) = 190569292
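The figure #P(100) = 190569292 can be reproduced with the standard dynamic program for counting partitions. The sketch below assumes that #P(n) on the slide means the number of distinct partitions of n; the function name `count_partitions` is illustrative and not from the slides.

```python
def count_partitions(n):
    """Count the partitions of n (multisets of positive integers summing to n)."""
    # ways[t] = number of partitions of t using only the parts considered so far
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


if __name__ == "__main__":
    print(count_partitions(4))    # 5: {4}, {3,1}, {2,2}, {2,1,1}, {1,1,1,1}
    print(count_partitions(100))  # 190569292
```

The count grows quickly with n, which is why enumerating every integer partition is not a practical way to find an MCIP.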
MCIP is NP-hard • Subset sum ≤P MCIP • Subset sum problem: given a set of integers x1, x2, …, xn with X = Σi xi, ask whether there is a subset with sum X/2. • Reduction to the MCIP problem: - Let S = {X/2, X/2}, T = {x1, x2, …, xn}, and find MCIP(S,T). - If {x1, x2, …, xn} is an MCIP(S,T), the answer to the subset sum problem is "yes"; otherwise, the answer is "no".
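The reduction can be illustrated on small inputs. Since |MCIP(S,T)| ≥ |T| = n, the multiset T is itself an MCIP exactly when its elements split into two groups that each sum to X/2. The sketch below builds the instance and answers the subset sum side with a simple reachable-sums table; the function names are hypothetical and the inputs are assumed to be positive integers.

```python
def build_mcip_instance(xs):
    """Map a subset sum instance x1..xn to the pair (S, T) used in the reduction."""
    total = sum(xs)
    if total % 2:
        raise ValueError("odd total: no subset can sum to X/2")
    return [total // 2, total // 2], list(xs)


def has_half_sum_subset(xs):
    """Decide whether some subset of xs sums to X/2 (pseudo-polynomial check)."""
    total = sum(xs)
    if total % 2:
        return False
    reachable = {0}  # sums obtainable from the elements seen so far
    for x in xs:
        reachable |= {r + x for r in reachable}
    return total // 2 in reachable


if __name__ == "__main__":
    xs = [3, 1, 1, 2, 2, 1]
    print(build_mcip_instance(xs))       # ([5, 5], [3, 1, 1, 2, 2, 1])
    print(has_half_sum_subset(xs))       # True, e.g. 3 + 2 = 5
    print(has_half_sum_subset([1, 1, 4]))  # False
```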
Biological Applications (1) • Genetic distance between two genomes • The distance between two strings • Minimum Common Substring Partition. Example: the strings "a b c d e f g h i j k h" and "h i j k h e f g a b c d" share the blocks [a b c d], [e f g], [h i j k h].
Biological Applications (2) • MCIP is a special case of Minimum Common Substring Partition (MCSP) • MCSP(S,T): the strings S and T were shown as a figure • MCIP(S',T'): S' = {x1, x2, …, xm}, T' = {y1, y2, …, yn}
Outline • Introduction • Approximation of 2-MCIP • Positive results • Negative results • Approximation of k-MCIP • Conclusion and future work
Some basic facts • |MCIP(S1,S2,…,Sk)| ≥ max(|S1|,|S2|,…,|Sk|) • |MCIP(S,T)| ≤ m+n-1, where |S| = m and |T| = n
Algorithm Analysis • An example: S = {3,3,4}, T = {2,2,6} • |MCIP(S,T)| ≤ m+n-1 • |MCIP(S,T)| ≥ max(m,n) • Approximation ratio is 2, since m+n-1 < 2·max(m,n)
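The slides do not spell out Greedy_CIP, so the following is a minimal sketch of one natural reading that meets the m+n-1 bound: repeatedly take one element from each multiset, output the smaller value, and put any leftover back. Every round removes at least one original or leftover element and the last round removes two, so at most m+n-1 integers are produced. The name `greedy_cip` and the choice of which elements to pair are assumptions.

```python
def greedy_cip(s, t):
    """Return a common integer partition of multisets s and t (as a list).

    Assumes positive integers and equal sums; not necessarily minimum.
    """
    s, t = list(s), list(t)
    if any(x <= 0 for x in s + t) or sum(s) != sum(t):
        raise ValueError("need positive integers with equal sums")
    parts = []
    while s and t:
        x, y = s.pop(), t.pop()   # take one element from each side
        d = min(x, y)
        parts.append(d)           # the smaller value becomes a common part
        if x > d:
            s.append(x - d)       # leftover of the larger element goes back
        if y > d:
            t.append(y - d)
    return parts


if __name__ == "__main__":
    cip = greedy_cip([3, 3, 4], [2, 2, 6])
    print(sorted(cip))  # [1, 1, 2, 2, 4], size 5 = m + n - 1
```

On the slide's example this reading returns {1,1,2,2,4}, a valid CIP of size m+n-1 = 5, while the optimum {2,2,3,3} has size 4.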
Definitions for MRSP (1) • Related multisets: if ΣS = ΣT and S, T ≠ ∅, S and T are a pair of related multisets. • Basic related multisets: a pair of related multisets S and T is basic if there are no S' ⊂ S and T' ⊂ T such that S' and T' are related. • Examples: shown as figures on the original slide.
Definitions for MRSP (2) • Maximum Related Multiset Partition problem (MRSP): given S and T, partition them into the maximum number of pairs of related submultisets. • Observation: if S, T are a pair of basic related multisets, |MRSP| = 1.
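The two definitions can be checked by brute force on small multisets. The sketch below assumes positive integers, so any non-empty proper submultiset of S whose sum matches a non-empty proper submultiset of T gives a smaller related pair. The function names are illustrative and the enumeration is exponential in the input size.

```python
from itertools import combinations


def is_related(s, t):
    """S and T are related if both are non-empty and have equal sums."""
    return bool(s) and bool(t) and sum(s) == sum(t)


def proper_subset_sums(xs):
    """Sums of all non-empty proper submultisets of xs."""
    sums = set()
    for size in range(1, len(xs)):
        for combo in combinations(xs, size):
            sums.add(sum(combo))
    return sums


def is_basic_related(s, t):
    """Related, and no proper submultisets S' of S and T' of T are related."""
    if not is_related(s, t):
        return False
    return not (proper_subset_sums(s) & proper_subset_sums(t))


if __name__ == "__main__":
    print(is_related([3, 3, 4], [2, 2, 6]))        # True
    print(is_basic_related([3, 3, 4], [2, 2, 6]))  # False: {4} and {2,2} are related
    print(is_basic_related([3, 3], [2, 4]))        # True
```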
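MRSP ↔ 2-MCIP • CIP → RSP: view a CIP as a bipartite graph whose vertices are the m+n elements of S and T and whose edges are the integers of the CIP, each joining the element of S and the element of T that it is a part of. • For each connected component, #edges ≥ #vertices - 1 • Each component is related, so the components form a related multiset partition (RSP). • |CIP| ≥ m+n-|RSP| • |MCIP| ≥ m+n-|RSP| ≥ m+n-|MRSP|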
MRSP ↔ 2-MCIP • CIP ← RSP • For each pair of related submultisets (S', T'), we run Greedy_CIP(S', T'), which gives |CIP(S', T')| ≤ |S'|+|T'|-1 • |CIP| ≤ m+n-|RSP| • Taking the RSP to be an MRSP: |MCIP| ≤ |CIP| ≤ m+n-|MRSP|
MRSP ↔ 2-MCIP • |MRSP| = m+n-|MCIP| • If S, T are a pair of basic related multisets, |MCIP| = m+n-1, because |MRSP| = 1. • When m+n ≥ 5, |MCIP| = m+n-1 ≥ 4/5 (m+n). • A new way to solve MCIP: • Step 1. Find an MRSP. • Step 2. For each pair of basic related submultisets, run Greedy_CIP(S', T').
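The two steps can be tried end to end on tiny inputs. The sketch below is an exponential brute force, not the approximation algorithm of the talk: `max_related_partition` enumerates candidate pairs containing the first element of S and recurses on the rest, and `greedy_cip` is one assumed reading of Greedy_CIP (pair one element from each side, output the smaller value, return the leftover). All names are hypothetical and inputs are assumed to be positive integers.

```python
from itertools import combinations


def greedy_cip(s, t):
    """Common integer partition of s and t with at most len(s)+len(t)-1 parts."""
    s, t = list(s), list(t)
    if any(x <= 0 for x in s + t) or sum(s) != sum(t):
        raise ValueError("need positive integers with equal sums")
    parts = []
    while s and t:
        x, y = s.pop(), t.pop()
        d = min(x, y)
        parts.append(d)
        if x > d:
            s.append(x - d)
        if y > d:
            t.append(y - d)
    return parts


def max_related_partition(s, t):
    """Step 1: a maximum list of related pairs (S', T'), by exhaustive search."""
    s, t = list(s), list(t)
    if not s and not t:
        return []
    if not s or not t or sum(s) != sum(t):
        raise ValueError("S and T must be related (non-empty, equal sums)")
    best = [(s, t)]  # the whole instance is always one related pair
    for extra in range(len(s) - 1):
        for s_idx in combinations(range(1, len(s)), extra):
            s_part = [s[0]] + [s[i] for i in s_idx]
            s_rest = [s[i] for i in range(1, len(s)) if i not in s_idx]
            for size in range(1, len(t)):
                for t_idx in combinations(range(len(t)), size):
                    t_part = [t[i] for i in t_idx]
                    if sum(t_part) != sum(s_part):
                        continue
                    t_rest = [t[i] for i in range(len(t)) if i not in t_idx]
                    candidate = [(s_part, t_part)]
                    candidate += max_related_partition(s_rest, t_rest)
                    if len(candidate) > len(best):
                        best = candidate
    return best


def exact_mcip(s, t):
    """Step 2: run the greedy routine on every pair of the maximum partition."""
    parts = []
    for s_part, t_part in max_related_partition(s, t):
        parts.extend(greedy_cip(s_part, t_part))
    return parts


if __name__ == "__main__":
    print(sorted(exact_mcip([3, 3, 4], [2, 2, 6])))  # [2, 2, 3, 3]
```

For S = {3,3,4} and T = {2,2,6} the maximum partition has two pairs, ({3,3},{6}) and ({4},{2,2}), so the result has m+n-|MRSP| = 6-2 = 4 parts, matching the MCIP {2,2,3,3} from the earlier example.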
Approximate 2-MCIP • Algorithm intuition: mimic MRSP • Step 1. Find related submultisets • Step 2. Set packing • Step 3. Greedy_CIP
Set Packing Problem (1) • Set packing: given a collection S of subsets, find the largest number of mutually disjoint subsets from S.
Set Packing Problem (2) • Bad news: - It is NP-hard to find related submultisets of large size. - Set packing itself is NP-hard. • Good news: we can find the small related submultisets and approximate set packing efficiently.
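The slides do not say which set packing approximation is used. As a generic illustration only, the sketch below is the simple greedy heuristic that scans candidate sets from smallest to largest and keeps each one that is disjoint from everything kept so far. In the MCIP setting each candidate would stand for the elements of one small related pair; that mapping and the name `greedy_set_packing` are assumptions.

```python
def greedy_set_packing(sets):
    """Pick mutually disjoint sets greedily, smallest candidates first."""
    chosen = []
    used = set()  # elements already covered by a chosen set
    for candidate in sorted(sets, key=len):
        if used.isdisjoint(candidate):
            chosen.append(candidate)
            used |= set(candidate)
    return chosen


if __name__ == "__main__":
    collection = [{1, 2}, {2, 3}, {3, 4}, {5}]
    print(greedy_set_packing(collection))  # [{5}, {1, 2}, {3, 4}]
```

The heuristic returns a maximal packing, which need not be a maximum one.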
Approximate 2-MCIP • Main idea: use different strategies for submultisets of different sizes. • The approximation ratio is 5/4. • If there are no basic related submultisets with size smaller than 5, then 4/5 (m+n) ≤ |MCIP| ≤ m+n-1.
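The bound for the case without small basic related submultisets follows from the identity |MRSP| = m+n-|MCIP|. The derivation below is a sketch that assumes the size of a related pair (S', T') means |S'|+|T'|, and writes CIP_greedy for the output of a greedy routine with at most m+n-1 parts.

```latex
\begin{align*}
|\mathrm{MRSP}| &\le \frac{m+n}{5}
  && \text{(each related pair uses at least 5 of the } m+n \text{ elements)}\\
|\mathrm{MCIP}| &= m+n-|\mathrm{MRSP}| \;\ge\; \tfrac{4}{5}\,(m+n)\\
\frac{|\mathrm{CIP}_{\mathrm{greedy}}|}{|\mathrm{MCIP}|}
  &\le \frac{m+n-1}{\tfrac{4}{5}\,(m+n)} \;<\; \frac{5}{4}
\end{align*}
```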
Outline • Introduction • Approximation of 2-MCIP • Positive results • Negative results • Approximation of k-MCIP • Conclusion and future work
General framework • Linear reduction P1 ≤L P2 • OPT_P2(f(x)) ≤ a·OPT_P1(x) • |OPT_P1(x) - g(x,y)| ≤ b·|OPT_P2(f(x)) - y| • If P1 cannot be approximated within some constant ratio c, then P2 cannot be approximated within some constant ratio c'.
Maximum 3DM-3 • Problem definition: given a set D ⊆ X × Y × Z, where X, Y and Z are disjoint sets and each element occurs in at most three triples, find a matching with the maximum cardinality. • Known fact: there is a constant ratio within which Maximum 3DM-3 cannot be approximated. [Kann91]
L-reduction (1) • f: S = {4^i | i ∈ X ∪ Y ∪ Z}, T = {4^i1 + 4^i2 + 4^i3 | (i1,i2,i3) ∈ D} • OPT_MCIP ≤ 70·OPT_3DM
L-reduction (2) • g: • CIP → RSP: |OPT_RSP - SOL_RSP| ≤ |OPT_MCIP - SOL_MCIP| • RSP → 3DM: OPT_3DM ≈ OPT_RSP; each related submultiset includes at least one triple; |OPT_3DM - SOL_3DM| ≤ |OPT_RSP - SOL_RSP|
L-reduction (3) • There is a constant c s.t. Maximum 3DM-3 cannot be approximated within c. • There is an L-reduction s.t. • OPT_MCIP ≤ 70·OPT_3DM • |OPT_3DM - SOL_3DM| ≤ |OPT_MCIP - SOL_MCIP| • There is a constant c' s.t. 2-MCIP cannot be approximated within c', where c' < 5/4.
Outline • Introduction • Approximation of 2-MCIP • Approximation of k-MCIP • Conclusion and future work
Approximate k-MCIP • Run Greedy_CIP(S,T) sequentially on S1, S2, …, Sk. • |MCIP(S1,S2,…,Sk)| ≤ |S1|+|S2|+…+|Sk| • |MCIP(S1,S2,…,Sk)| ≥ max(|S1|,|S2|,…,|Sk|) • Approximation ratio is k, since |S1|+…+|Sk| ≤ k·max(|S1|,…,|Sk|) • We can get a 3k(k-1)/(3k-2)-approximation by removing the common elements.
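Running the greedy routine sequentially can be sketched as a fold: compute a CIP of S1 and S2, then a CIP of that result and S3, and so on. Any partition of a CIP of the earlier multisets is still a partition of each of them, so the final result is common to all k. The code assumes one reading of Greedy_CIP (pair one element from each side, output the smaller value, return the leftover); the names `greedy_cip` and `greedy_k_cip` are hypothetical.

```python
from functools import reduce


def greedy_cip(s, t):
    """Common integer partition of s and t with at most len(s)+len(t)-1 parts."""
    s, t = list(s), list(t)
    if any(x <= 0 for x in s + t) or sum(s) != sum(t):
        raise ValueError("need positive integers with equal sums")
    parts = []
    while s and t:
        x, y = s.pop(), t.pop()
        d = min(x, y)
        parts.append(d)
        if x > d:
            s.append(x - d)
        if y > d:
            t.append(y - d)
    return parts


def greedy_k_cip(multisets):
    """Fold greedy_cip over S1, S2, ..., Sk from left to right."""
    return reduce(greedy_cip, multisets)


if __name__ == "__main__":
    result = greedy_k_cip([[3, 3, 4], [2, 2, 6], [5, 5]])
    print(sorted(result))  # [1, 1, 1, 1, 2, 4]
    print(len(result))     # 6, within |S1| + |S2| + |S3| = 8
```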
Outline • Introduction • Approximation of 2-MCIP • Approximation of k-MCIP • Conclusion and future work