The Minimum Test Set Problem (MTS)
Leen Stougie, TU Eindhoven and CWI Amsterdam
Joint work with:
Koen de Bontridder - Siemens
Bjarni Halldorsson - Iceland University
Cor Hurkens - TU Eindhoven
Magnus Halldorsson - Iceland University
Ben Lageweg - Ortec
R. Ravi - CMU Pittsburgh
Jan Karel Lenstra - CWI
Jim Orlin - MIT Cambridge MA
Set of m items {1,2,...,m}
Collection of n tests {T1,T2,...,Tn}
Test Tj distinguishes the items that react positively (1) on Tj from the items that react negatively (0) on Tj
A test is given by the set of items that react positively
A test set is a subcollection of tests such that each pair of items is distinguished by at least one test in the subcollection
Find a test set of minimum cardinality
Potatoes and diseases
Potato varieties: V1, V2, V3, V4, V5, ...
Potato diseases: D1, D2, D3, D4
A test set is a set of varieties that discriminates between all diseases
Minimum test set {V1,V4}:
D1 has {1,1}
D2 has {1,0}
D3 has {0,0}
D4 has {0,1}
23 items (potato diseases), 68 tests (potato varieties)
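The test set definition can be checked mechanically: a subcollection is a test set exactly when all items receive distinct binary signatures. The sketch below is only an illustration. The function name `is_test_set` is hypothetical, and the sets V1 = {D1, D2} and V4 = {D1, D4} are read off the signatures on the slide, assuming the signature order is (V1, V4).

```python
def is_test_set(num_items, tests):
    """Return True if every pair of items 1..num_items is distinguished.

    tests: list of sets; each set holds the items reacting positively.
    Two items are distinguished by some test exactly when their
    0/1 signatures over the tests differ.
    """
    signatures = {
        tuple(item in test for test in tests)
        for item in range(1, num_items + 1)
    }
    return len(signatures) == num_items


# Diseases D1..D4 are items 1..4 (assumed encoding of the slide's example).
V1 = {1, 2}  # positive on D1 and D2
V4 = {1, 4}  # positive on D1 and D4

print(is_test_set(4, [V1, V4]))  # True
print(is_test_set(4, [V1]))      # False: D1/D2 and D3/D4 stay together
```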
Identification
Individuals (items) | Binary attributes (tests)
potato diseases | potato varieties
proteins | antibodies detecting presence of epitopes (short peptide sequences)
faults in product | fault detecting tests
diseases | physical and chemical tests
A test set gives each of a set of individuals (items) a unique binary signature
The Set Cover Problem (SCP)
Set of M elements {1,2,...,M}
Collection of N sets {S1,S2,...,SN}
Each set is a subset of the elements
Set Sj covers the elements it contains
A set cover is a subcollection of sets such that each element is covered by at least one set in the subcollection
Find a set cover of minimum cardinality
MTS and the Set Cover Problem (SCP)
MTS | SCP
pair of items i,j | element e(i,j)
m items | M = m(m-1)/2 elements
test T | set S containing all e(i,j) s.t. i in T and j not in T
n tests | n sets
test set Ti1,Ti2,...,Tik | set cover Si1,Si2,...,Sik
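The correspondence above can be sketched in a few lines. This is an illustrative sketch, not the authors' code: `mts_to_scp` is a hypothetical name, and because pairs are stored unordered, the condition "i in T and j not in T" is implemented as "exactly one of i, j is in T".

```python
from itertools import combinations


def mts_to_scp(num_items, tests):
    """Translate an MTS instance into a set cover instance.

    Elements are the unordered item pairs (i, j) with i < j;
    each test becomes the set of pairs it distinguishes.
    """
    elements = set(combinations(range(1, num_items + 1), 2))
    sets = [
        {(i, j) for (i, j) in elements if (i in test) != (j in test)}
        for test in tests
    ]
    return elements, sets


elements, sets = mts_to_scp(4, [{1, 2}, {1, 4}, {3}])
print(len(elements))                  # 6, i.e. m(m-1)/2 for m = 4
print(sets[0] | sets[1] == elements)  # True: the first two tests cover all pairs
```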
SCP is well studied; it models crew scheduling, workforce planning, class scheduling, etc.
SCP is NP-hard
Column generation methods solve practical SCPs
MTS can be solved as SCP
MTS is NP-hard (reduction from SCP)
MTS tends to give difficult instances of SCP
Three directions - Approximation algorithms - Exact optimization algorithms - Heuristics
Approximation algorithms (1)
Greedy algorithm: At each iteration, given a partial test set (the set of already selected tests), select the test that distinguishes the most not yet distinguished item pairs and add it to the partial test set. Stop when all item pairs are distinguished.
Lemma: Greedy has approximation ratio O(ln m)
Lemma: 2-phase Greedy has approximation ratio O(log k), for k the size of the largest test
Lemma: Greedy has approximation ratio 11/8 for k=2
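A minimal sketch of the greedy rule described above, assuming tests are given as sets of positively reacting items. The names `greedy_test_set` and `split_count` are hypothetical, ties are broken by lowest test index (the slide does not specify a tie-breaking rule), and the 2-phase variant is not shown.

```python
from itertools import combinations


def split_count(test, pairs):
    """Number of pairs in `pairs` that `test` distinguishes."""
    return sum((i in test) != (j in test) for i, j in pairs)


def greedy_test_set(num_items, tests):
    """Return indices of tests chosen greedily until all pairs are split."""
    undistinguished = set(combinations(range(1, num_items + 1), 2))
    chosen = []
    while undistinguished:
        # Pick the test that distinguishes the most remaining pairs.
        best = max(range(len(tests)),
                   key=lambda idx: split_count(tests[idx], undistinguished))
        if split_count(tests[best], undistinguished) == 0:
            raise ValueError("the given tests cannot distinguish all item pairs")
        chosen.append(best)
        test = tests[best]
        # Keep only the pairs the chosen test leaves together.
        undistinguished = {(i, j) for i, j in undistinguished
                           if (i in test) == (j in test)}
    return chosen


print(greedy_test_set(4, [{1, 2}, {1, 4}, {3}]))  # [0, 1]
```

Recomputing the gain of every test in each round keeps the sketch short; a practical implementation would more likely maintain the equivalence classes incrementally.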
A beautiful graph problem (1)
MTS2: Each test contains exactly 2 items
Item corresponds to a vertex of the graph; test {i,j} corresponds to edge {i,j} of the graph
Example: 7 items, 10 tests
A beautiful graph problem (2)
By the red edge, its two vertices are distinguished from all other vertices but not from one another
A beautiful graph problem (3)
By the path of two red edges, its three vertices are distinguished from all other vertices and also from one another
A beautiful graph problem (4)
The red paths form a test cover (1 isolated vertex is allowed)
Graph Problem: Given a graph, pack as many vertex disjoint paths of length 2 as possible
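The claims about single edges and paths of length 2 can be checked on a small instance. The sketch below uses a hypothetical 7-vertex labelling (a to g) and hypothetical edges; it is not the graph drawn on the slides. The helper name `undistinguished_pairs` is likewise an assumption.

```python
from itertools import combinations


def undistinguished_pairs(vertices, edges):
    """Return the vertex pairs that no selected edge (test) distinguishes."""
    signature = {v: tuple(v in e for e in edges) for v in vertices}
    return [(u, v) for u, v in combinations(vertices, 2)
            if signature[u] == signature[v]]


vertices = list("abcdefg")  # hypothetical labels for 7 items

# A single edge separates its endpoints from the rest, not from each other.
one_edge = [{"a", "b"}]
print(("a", "b") in undistinguished_pairs(vertices, one_edge))  # True

# Two vertex-disjoint paths of length 2 (a-b-c and d-e-f) plus the
# isolated vertex g: every vertex gets its own signature.
two_paths = [{"a", "b"}, {"b", "c"}, {"d", "e"}, {"e", "f"}]
print(undistinguished_pairs(vertices, two_paths))  # []
```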
Approximation algorithms (2)
• No polynomial time algorithm gives a solution guaranteed within o(log m) times optimal unless P=NP (was proved for SCP in [Raz&Safra 1997])
• No polynomial time algorithm gives a solution guaranteed within (1-b) ln m times optimal for any b>0 unless NP is contained in DTIME(m^{log log m}) (was proved for SCP in [Feige 1998])
• No polynomial time algorithm for the problem with at most 2 items per test (MTS2) gives a solution guaranteed within (1+b) times optimal for some constant b>0 unless P=NP (MTS2 is APX-hard)
Branch-and-Bound algorithms (1) Ingredients
The nodes of the search tree correspond to partial test sets together with sets of rejected tests
A partial test set defines an equivalence relation on the set of items
Definition: Given a partial test set, two items are equivalent if no test in the partial test set distinguishes them
A partial test set thus partitions the items into equivalence classes
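The equivalence classes induced by a partial test set can be computed by grouping items on their signatures. The following is a minimal sketch with a hypothetical function name, assuming tests are sets of positively reacting items; rejected tests and the search tree itself are not modelled.

```python
from collections import defaultdict


def equivalence_classes(num_items, partial_test_set):
    """Group items 1..num_items that no test in the partial set distinguishes."""
    classes = defaultdict(list)
    for item in range(1, num_items + 1):
        signature = tuple(item in test for test in partial_test_set)
        classes[signature].append(item)
    return sorted(classes.values())


print(equivalence_classes(4, []))                # [[1, 2, 3, 4]]
print(equivalence_classes(4, [{1, 2}]))          # [[1, 2], [3, 4]]
print(equivalence_classes(4, [{1, 2}, {1, 4}]))  # [[1], [2], [3], [4]]
```

A partial test set is a complete test set exactly when every class is a singleton.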
Branch-and-Bound (2) Quality criteria
Criterion 1: Separation criterion for a test T not in the partial test set
Criterion 2: Power criterion for a test T not in the partial test set
Criterion 3: Information criterion for a test T not in the partial test set
[criterion formulas missing in source]
Branch-and-Bound (3) Branching
2 different branching rules
Branch-and-Bound (4) Lower bounds
• Lower bound by ideal tests
• Lower bound by power, with F(m,n) the minimum power any set of n tests needs to discriminate any set of m items
• ..... 2 more lower bounds
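The bound formulas themselves are not preserved here. As a hedged illustration only, a standard counting argument gives a bound in the same spirit; whether it matches the slide's "ideal tests" bound exactly is an assumption. The symbols k (number of tests) and c_max (size of the largest equivalence class under a partial test set) are introduced for this sketch.

```latex
% k tests assign each item a signature in {0,1}^k, so at most 2^k items
% can receive pairwise distinct signatures.
\[
  2^{k} \ge m
  \quad\Longrightarrow\quad
  k \ge \lceil \log_2 m \rceil ,
  \qquad\text{e.g. } m = 23:\ \lceil \log_2 23 \rceil = 5 .
\]
% At a search-tree node, a test can at best halve a class, so at least
% \lceil \log_2 c_{\max} \rceil further tests are needed.
\[
  \text{additional tests} \;\ge\; \lceil \log_2 c_{\max} \rceil .
\]
```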
Heuristics
Halldorsson et al. applied heuristics to the proteomic test set problem
We have no experience with heuristics ourselves, but they are interesting to investigate in combination with real-life problems
Minimum Test Set in the future
• Find some more applications
• Improve Branch and Bound algorithms
• Apply homeopathic algorithms
• Introduce possibilities for test results other than 0 or 1
• Construct software