Computational Molecular Biology

Computational Molecular Biology Pooling Designs – Inhibitor Models

An Inhibitor Model • The sample space may contain some inhibitors • Inhibitor = anti-positive • (Positives + Inhibitor) = Negative • [Figure: a pool containing a positive (+) and an inhibitor (x) gives a negative outcome] My T. Thai mythai@cise.ufl.edu
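The outcome rule on this slide can be sketched in Python. This is a minimal illustration, not code from the lecture: the function name `pool_outcome` and the clone labels are hypothetical, and the sketch assumes that a single inhibitor in a pool cancels any number of positives.

```python
def pool_outcome(pool, positives, inhibitors):
    """Return True if the pool tests positive under the inhibitor model.

    Assumption: a pool is positive only if it holds at least one positive
    clone and no inhibitor at all.
    """
    pool = set(pool)
    has_positive = bool(pool & set(positives))
    has_inhibitor = bool(pool & set(inhibitors))
    return has_positive and not has_inhibitor


# Hypothetical clone labels, used only for illustration.
positives = {"c1", "c4"}
inhibitors = {"c7"}

print(pool_outcome({"c1", "c2"}, positives, inhibitors))        # positive clone, no inhibitor
print(pool_outcome({"c1", "c7"}, positives, inhibitors))        # inhibitor cancels the positive
print(pool_outcome({"c2", "c3"}, positives, inhibitors))        # only negatives
```

A negative outcome is therefore ambiguous: it may mean "no positives" or "positives masked by an inhibitor", which is what makes decoding harder than in classical group testing.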

An Example of Inhibitors

Inhibitor Model • Definition: • Given a sample with d positive clones and at most r inhibitors • Find a pooling design with the minimum number of tests that identifies all the positive clones (and design a decoding algorithm for that pooling design)

Inhibitors with Fault Tolerance Model • Definition: • Given n clones with at most d positive clones and at most r inhibitors, subject to at most e testing errors • Identify all positive items with as few tests as possible

Preliminaries

2-Stage Algorithm • What is AI? The set AI should contain all the inhibitors and no positives. Hence the set PN contains all positives (and some negatives) but no inhibitors.

2-Stage Algorithm • At this stage, the problem becomes the e-error-correcting problem.

Non-adaptive Solution (1 stage) • P contains all positives • N contains all negatives • O contains all inhibitors and no positives

Non-adaptive Solution

Generalization • A positive outcome may be due to the combined effect of several items • Items are molecules • The outcome depends on a complex: a subset of molecules • Example: complexes of eukaryotic DNA transcription and RNA translation

A Complex Model • Definition: • Given n items and a collection of at most d positive subsets • Identify all positive subsets with the minimum number of tests • Pool: set of subsets of items • Positive pool: contains a positive subset

What is a Hypergraph H? • H = (V, E) where: • V is a set of n vertices (items) • E is a set of m hyperedges Ej, where each Ej is a subset of V • Rank: r = max { |Ej| : Ej in E }

Group Testing in Hypergraph H • Definition: • Given H with at most d positive hyperedges • Identify all positive hyperedges with the minimum number of tests • Hyperedges = suspect subsets • Positive hyperedges = positive subsets • Positive pool: contains a positive hyperedge • Assume that Ei ⊄ Ej for all i ≠ j (no hyperedge contains another)

d(H)-disjunct Matrix • Definition: • M is a binary matrix with t rows and n columns • For any d + 1 edges E0, E1, …, Ed of H, there exists a row containing E0 but none of E1, …, Ed • Decoding Algorithm: • Remove every edge that is contained in a negative pool (such edges are negative) • The remaining edges are positive
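The decoding algorithm on this slide can be sketched as follows. This is an illustrative sketch under stated assumptions: each row of M is represented as the set of vertices it pools, the function names `run_tests` and `decode` are hypothetical, tests are assumed error-free, and the toy hypergraph is invented for the example.

```python
def run_tests(pools, positive_edges):
    """Simulate error-free outcomes: a pool is positive iff it contains a positive edge."""
    return [any(edge <= pool for edge in positive_edges) for pool in pools]


def decode(pools, outcomes, edges):
    """Drop every edge contained in a negative pool; the remaining edges are reported positive."""
    negative_pools = [pool for pool, positive in zip(pools, outcomes) if not positive]
    return [edge for edge in edges
            if not any(edge <= pool for pool in negative_pools)]


# Toy hypergraph on vertices 1..4 with three hyperedges (illustrative only).
edges = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]
# One pool per edge: each row contains one edge and no other edge,
# so this (trivial) design satisfies the d(H)-disjunct definition.
pools = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]

outcomes = run_tests(pools, positive_edges=[frozenset({2, 3})])
print(decode(pools, outcomes, edges))
```

The decoder only looks at negative pools, so its cost is proportional to the number of (edge, negative pool) pairs; a smarter design would use fewer pools than edges, which is what the construction later in these slides aims for.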

Construction Algorithms • Consider a finite field GF(q). Choose k, s, and q. • Step 1: for each v in V, associate v with a polynomial pv of degree k - 1 over GF(q)

A Proposed Algorithm
Step 2: Construct the s x m matrix A as follows:
    for x from 0 to s - 1 (rkd <= s < q)
        for each edge Ej in E
            A[x, Ej] = PEj(x) = {pv(x) | v in Ej}
The rows of A are indexed by x = 0, 1, …, s - 1 and the columns by the edges E1, E2, …, Em; the entry in row x and column Ej is PEj(x).

A Proposed Algorithm
Step 3: Construct the t x n matrix B from the s x m matrix A as follows:
    for x from 0 to s - 1
        for each PEj(x)
            for each vertex v in V
                if pv(x) in PEj(x), then B[(x, PEj(x)), v] = 1
                else B[(x, PEj(x)), v] = 0
The rows of B are indexed by the pairs (x, PEj(x)) taken from the entries of A, and the columns by the vertices v1, v2, …, vn; every entry of B is 0 or 1.
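Steps 1 to 3 can be sketched in Python for a prime q, where GF(q) is simply arithmetic modulo q. Several details below are assumptions made for illustration and are not stated on the slides: vertices are numbered 0..n-1, the polynomial of vertex v takes the k base-q digits of v as coefficients (so its degree is at most k - 1 and distinct vertices get distinct polynomials), and the names `vertex_poly`, `poly_eval`, and `build_matrices` are hypothetical.

```python
def vertex_poly(v, k, q):
    """Coefficients (lowest degree first) of the polynomial assigned to vertex v.

    Assumption: the k base-q digits of v are used as coefficients, so distinct
    vertices below q**k get distinct polynomials of degree at most k - 1.
    """
    coeffs = []
    for _ in range(k):
        coeffs.append(v % q)
        v //= q
    return coeffs


def poly_eval(coeffs, x, q):
    """Evaluate the polynomial at x modulo the prime q (Horner's rule)."""
    value = 0
    for c in reversed(coeffs):
        value = (value * x + c) % q
    return value


def build_matrices(n, edges, k, s, q):
    """Return (A, B) following Steps 1 to 3 for vertices 0..n-1 (q assumed prime)."""
    assert q ** k >= n, "need at least n distinct polynomials"
    assert s <= q, "rows of A are indexed by distinct elements of GF(q)"
    polys = [vertex_poly(v, k, q) for v in range(n)]
    # Step 2: A[x][j] is the set of values pv(x) over the vertices v of edge j.
    A = [[frozenset(poly_eval(polys[v], x, q) for v in edge) for edge in edges]
         for x in range(s)]
    # Step 3: one row of B per pair (x, entry of A in row x).
    B = {}
    for x in range(s):
        for entry in A[x]:
            B[(x, entry)] = [1 if poly_eval(polys[v], x, q) in entry else 0
                             for v in range(n)]
    return A, B


# Toy instance (illustrative): rank r = 2, d = 1, k = 2.
# s = 4 and q = 5 satisfy both rkd <= s < q and rd(k - 1) + 1 <= s <= q.
edges = [frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 5}), frozenset({3, 4})]
A, B = build_matrices(n=6, edges=edges, k=2, s=4, q=5)
for (x, entry), row in B.items():
    print(x, sorted(entry), row)
```

Storing B as a dictionary keyed by (x, entry) merges identical pairs automatically, so the number of tests t is the number of distinct pairs rather than s times m. For a prime power q that is not prime, the modular arithmetic here would have to be replaced by proper finite-field arithmetic.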

Analysis • Theorem: If rd(k - 1) + 1 ≤ s ≤ q, then B is d(H)-disjunct
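For small instances, the d(H)-disjunct property claimed by the theorem can be checked directly against its definition by brute force. The sketch below is an illustration, not part of the lecture: the function name `is_dH_disjunct` is hypothetical, each row of the matrix is represented as the set of vertices it contains, and the two toy designs are invented for the example.

```python
from itertools import combinations


def is_dH_disjunct(pools, edges, d):
    """Check: for every edge E0 and every d other edges, some pool
    contains E0 but none of the d other edges.

    If fewer than d other edges exist, all of them are used as the group.
    """
    for e0 in edges:
        others = [e for e in edges if e != e0]
        for group in combinations(others, min(d, len(others))):
            separated = any(
                e0 <= pool and not any(ej <= pool for ej in group)
                for pool in pools
            )
            if not separated:
                return False
    return True


# Toy hypergraph (illustrative only).
edges = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]

good = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]
bad = [frozenset({1, 2, 3}), frozenset({3, 4})]  # {1, 2} and {2, 3} always appear together

print(is_dH_disjunct(good, edges, d=1))
print(is_dH_disjunct(bad, edges, d=1))
```

The check enumerates every choice of E0 together with d other edges, so it is exponential in d and only suitable for sanity-checking small constructions.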

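Proof of d(H)-disjunct Matrix Construction • Matrix A has this property: • For any d + 1 columns C0, …, Cd, there exists a row at which the entry of C0 does not contain the entry of Cj for any j = 1…d • Proof: By contradiction. Assume that no such row exists. Then in each of the s ≥ rd(k - 1) + 1 rows the entry of C0 contains the entry of some Cj, so by the pigeonhole principle there exists a j (in 1…d) such that the entries of C0 contain the corresponding entries of Cj in at least r(k - 1) + 1 rows. Then PEj(x) is contained in PE0(x) for at least r(k - 1) + 1 distinct values of x. Since E0 has at most r vertices, for each v in Ej some u in E0 satisfies pu(x) = pv(x) for at least k values of x; two polynomials of degree at most k - 1 that agree at k points are identical, so (assuming distinct vertices are assigned distinct polynomials) v = u. This means that Ej is contained in E0, contradicting the assumption that no hyperedge contains another.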

Proof of d(H)-disjunct Matrix Construction (cont) • Prove that B is d(H)-disjunct • Proof: A has a row x such that the entry F in cell (x, E0) does not contain the entry in cell (x, Ej) for any j = 1…d. Every v in E0 has pv(x) in F, while each Ej has some vertex v with pv(x) not in F. Hence the row (x, F) of B contains E0 but does not contain Ej for any j = 1…d.
