Explore Boolean networks, attractor detection, and control techniques in biological networks. Learn about the mathematical models, attractors, and algorithms used. Discover the benefits and challenges of Boolean networks in studying genetic networks.
Kyushu University Mathematical Sciences Intensive Lecture: Comparison, Analysis, and Control of Biological Networks (4) Analysis and Control of Boolean Networks. Tatsuya Akutsu, Bioinformatics Center, Institute for Chemical Research, Kyoto University
Contents • Boolean Network • Attractor Detection/Enumeration • Algorithms for Singleton Attractor Detection/Enumeration • Control of Boolean Networks • Integer Linear Programming-based Approach
Boolean Network • Mathematical model of genetic networks • Node ⇔ gene • State of node: 1 (active) / 0 (inactive) • Regulation rules • Boolean functions (AND, OR, NOT, …) • Edge from y to x ⇔ y directly controls x • Synchronous update • Almost the same as digital circuits (with clocks) [Kauffman, The Origins of Order, 1993]
Example of Boolean Network • Boolean network: A' = B, B' = A and C, C' = not A • State transition table (INPUT: state at time t; OUTPUT: state at time t+1)
  A B C | A' B' C'
  0 0 0 | 0  0  1
  0 0 1 | 0  0  1
  0 1 0 | 1  0  1
  0 1 1 | 1  0  1
  1 0 0 | 0  0  0
  1 0 1 | 0  1  0
  1 1 0 | 1  0  0
  1 1 1 | 1  1  0
• Example of state transition: 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ 001 ⇒ 001 ⇒ …
Why Boolean Networks? • Criticism: BN is too simplified • Without simplification, theoretical analysis, inference, and control are difficult • though complex models can be used for simulation • May be useful for qualitative analyses • One of the simplest non-linear models • Negative results on BN suggest negative results on more general (non-linear) models • Almost the same as digital circuits • Theories and techniques from computer science can be utilized
Our Focus: Time Complexity • Many problems for BN are NP-hard • NP-hard means that there is no polynomial-time algorithm (unless P=NP) • Naïve methods take $O(2^n)$ time or more • But we want to do much better than that • Because we can then solve cases of • n=300 with an $O(1.1^n)$ time algorithm • n=600 with an $O(1.05^n)$ time algorithm • Important for coping with large-scale networks
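To make these bases concrete, here is a small sketch (not from the slides) that compares idealized step counts. It assumes the running time is exactly base^n with no hidden constant or polynomial factor, and the budget of 10^13 steps is a hypothetical figure chosen only for illustration.

```python
import math

def steps(base: float, n: int) -> float:
    """Idealized step count base**n (constants and polynomial factors ignored)."""
    return base ** n

def max_n(base: float, budget: float) -> int:
    """Largest n such that base**n stays within the given step budget."""
    return math.floor(math.log(budget) / math.log(base))

if __name__ == "__main__":
    for base, n in [(2.0, 40), (1.1, 300), (1.05, 600)]:
        print(f"{base}^{n} = {steps(base, n):.2e}")
    # Hypothetical budget of 10**13 elementary steps.
    for base in (2.0, 1.1, 1.05):
        print(base, max_n(base, 1e13))
```

Under this assumption, a budget that only reaches n in the low 40s for base 2 is enough for n=300 with base 1.1 and n=600 with base 1.05.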
Attractor (1) • Steady state • Different attractors ⇔ Different cell types • Example (same state transition table as the example network: A' = B, B' = A and C, C' = not A) • 011 ⇒ 101 ⇒ 010 ⇒ 101 ⇒ 010 ⇒ … • 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ 001 ⇒ 001 ⇒ …
Attractor (2) • State transition diagram of the example network • 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ 001 (attractor with period 1) • 011 ⇒ 101 ⇒ 010 ⇒ 101 (attractor with period 2)
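The attractors of the three-node example can be found by brute force. The sketch below hard-codes the rules A' = B, B' = A and C, C' = not A and follows every start state until a state repeats. The function names are illustrative, and the approach visits all 2^n states, so it only works for very small n.

```python
from itertools import product

def step(state):
    """Synchronous update for the example: A' = B, B' = A and C, C' = not A."""
    a, b, c = state
    return (b, a & c, 1 - a)

def find_attractors(update, n):
    """Follow each of the 2**n states until a state repeats; collect the cycles."""
    attractors = set()
    for start in product((0, 1), repeat=n):
        seen = []
        s = start
        while s not in seen:
            seen.append(s)
            s = update(s)
        cycle = seen[seen.index(s):]      # states on the cycle that was entered
        k = cycle.index(min(cycle))       # rotate to a canonical starting state
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors

if __name__ == "__main__":
    for cyc in sorted(find_attractors(step, 3), key=len):
        print(" => ".join("".join(map(str, s)) for s in cyc))
```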
N-K Model (Kauffman Network) • N: Number of nodes (we use n instead of N) • K: Indegree • Indegree = the number of input edges = the number of genes directly affecting node v • Each node has (maximum or average) indegree K • The Boolean function assigned to each node is selected at random [Figure: a node v with indegree 2 and a node v with indegree 3]
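A random network in the spirit of the N-K model can be generated as follows. This is only a sketch under simplifying assumptions that the slides do not state: every node gets exactly K distinct inputs chosen uniformly (self-loops allowed), and every truth-table entry is an independent fair coin flip. The names `random_nk_network` and `update` are illustrative.

```python
import random
from itertools import product

def random_nk_network(n, k, seed=None):
    """Return a list of (inputs, truth_table) pairs, one per node.

    inputs: k distinct node indices chosen uniformly at random.
    truth_table: dict mapping each k-bit input tuple to a random output bit.
    """
    rng = random.Random(seed)
    network = []
    for _ in range(n):
        inputs = tuple(rng.sample(range(n), k))
        table = {bits: rng.randint(0, 1) for bits in product((0, 1), repeat=k)}
        network.append((inputs, table))
    return network

def update(network, state):
    """Synchronous update: every node reads its inputs from the current state."""
    return tuple(table[tuple(state[j] for j in inputs)] for inputs, table in network)
```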
Distribution of Attractors in the N-K Model • Classical conjecture • The number of attractors is [formula missing in source] • Recent results suggest that this conjecture may not be true • Superpolynomial growth ($> n^{\gamma}$ for any $\gamma$) of the number of attractors (Samuelsson & Troein, PRL, 2003) • Superpolynomial growth of the average size of attractors (Drossel et al., PRL, 2005) • No conclusive result is known
Singleton Attractor (or Point Attractor) • Biological interpretation of attractors • Different attractors ⇔ Different cell types • Point attractor • Attractor with period 1 • Corresponding to a steady state • Definition: a state satisfying $v_i(t+1) = v_i(t)$ for all $i$ • Attractor Detection • Input: Boolean network • Output: a point attractor (if any)
Attractor Detection: Previous Works • Roughly $2^n$ time (up to a polynomial factor) is enough since there are $2^n$ global states • But this cannot be applied to large n • Several heuristics are known, but with no theoretical guarantee [Irons, Physica D, 2006], [Devloo et al., Bull. Math. Biol. 2003], … • Detection of a singleton attractor is NP-hard [Akutsu et al., GIW 1998] • We developed algorithms with average-case theoretical bounds [Zhang et al., EURASIP JBSB 2007] • We also developed [bound missing in source] time algorithms for AND-OR BNs [Tamura & Akutsu, FCT07, Trans. IEICE 2009] [Tamura & Akutsu, AB08, Math. in CS 2009] [Melkman, Tamura & Akutsu, 2010]
Singleton Attractor (= Attractor with Period 1) [Figure: two singleton attractors, each labeled "attractor"]
Indegree • Indegree = the number of input edges = the number of genes directly affecting node v • We use K to denote the maximum indegree [Figure: a node v with indegree 2 and a node v with indegree 3]
Simple Recursive Enumeration Algorithm (1) • Examine 0-1 assignments one by one, and backtrack as soon as a contradiction occurs [Zhang et al., EURASIP JBSB 2007]
Illustration of Recursive Algorithm [Figure: partial 0-1 assignments extended step by step until an assignment is output]
Simple Recursive Enumeration Algorithm (2) • Examine 0-1 assignments one by one, and backtrack as soon as a contradiction occurs • 0 • 00 X backtrack • 01 • 010 X backtrack • 011 X backtrack • 10 • … • Several variants depending on the ordering of nodes • Much better than the trivial $O(n 2^n)$ time
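The backtracking idea can be sketched as follows. This is a minimal illustration, not the implementation of Zhang et al.: nodes are assigned in index order, and after each extension every node whose own value and all inputs are already fixed is rechecked against its Boolean function (a real implementation would check only the newly determined nodes). The network representation, a list of `(inputs, function)` pairs, is an assumption made for this example.

```python
def enumerate_singleton_attractors(network):
    """network[i] = (inputs, f): node i is updated by f applied to the values of inputs.

    Returns all states v with v[i] == f_i(v[inputs]) for every node i.
    """
    n = len(network)
    found = []

    def consistent(partial):
        """Check every node whose own value and all inputs are already assigned."""
        m = len(partial)
        for i in range(m):
            inputs, f = network[i]
            if all(j < m for j in inputs):
                if f(*(partial[j] for j in inputs)) != partial[i]:
                    return False
        return True

    def extend(partial):
        if not consistent(partial):
            return                      # contradiction: backtrack
        if len(partial) == n:
            found.append(tuple(partial))
            return
        for bit in (0, 1):
            extend(partial + [bit])

    extend([])
    return found

# Example network from the slides: A' = B, B' = A and C, C' = not A.
example = [
    ((1,), lambda b: b),
    ((0, 2), lambda a, c: a & c),
    ((0,), lambda a: 1 - a),
]
```

Calling `enumerate_singleton_attractors(example)` explores prefixes such as 01 and 10 only until the rule A = B is violated, and never expands them further.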
Analysis of Average Case Time Complexity • Probability that $v_i(0) \neq v_i(1)$ is detected when the 0-1 assignment for the first m bits is examined: [formula missing in source] • Probability that a random assignment for m bits is consistent (with the definition of a singleton attractor): [formula missing in source] • Expected number of consistent 0-1 assignments for m bits: [formula missing in source] • By taking the maximum of the above over m in [1…n], we can estimate the complexity [Figure: nodes $v_1, \dots, v_{m-1}, v_m, v_{m+1}$ at t=0 and t=1, with indegree K]
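The formulas on this slide did not survive extraction, so the following is a hedged reconstruction rather than the slide's own derivation. It assumes that each node's K inputs are chosen uniformly at random, so all of them fall among the first m nodes with probability about (m/n)^K, and that a random Boolean function disagrees with the assigned value with probability 0.5. Under those assumptions the per-node detection probability is 0.5·(m/n)^K, a random m-bit prefix is consistent with probability (1 − 0.5·(m/n)^K)^m, and the expected number of consistent prefixes is 2^m·(1 − 0.5·(m/n)^K)^m. The sketch maximizes that quantity over m and reports the implied base.

```python
def expected_consistent(n, k, m):
    """Assumed estimate 2**m * (1 - 0.5 * (m / n)**k)**m of consistent m-bit prefixes."""
    return (2.0 * (1.0 - 0.5 * (m / n) ** k)) ** m

def estimated_base(n, k):
    """n-th root of the maximum over m in [1..n]: the base b in an O(b**n) estimate."""
    peak = max(expected_consistent(n, k, m) for m in range(1, n + 1))
    return peak ** (1.0 / n)

if __name__ == "__main__":
    for k in (2, 3, 4, 5):
        print(k, round(estimated_base(200, k), 3))
```

Under these assumptions the estimated base stays below 2 and grows with K.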
Computational Experiment • Running time increases exponentially, but the bases are less than 2 [Figure: Empirical Time Complexity]
Issues on Worst Case Time Complexity • Detection of a singleton attractor for BNs with indegree K reduces to (K+1)-SAT • $O(1.322^n)$ time for K=2 (randomized) • We developed an $O((1.322-\delta)^n)$ time algorithm for K=2 • The detection problem remains NP-hard even for K=2 • $O(1.587^n)$ time algorithm for BNs with AND/OR nodes (no constraint on K) [Melkman, Tamura & Akutsu, 2010]
Reduction from BN-ATTRACTOR to SAT • Detection of a singleton attractor with max. indegree K ⇒ (K+1)-SAT (Boolean SATisfiability problem) [Figure: nodes $v_j$, $v_k$, $v_i$]
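One way to realize such a reduction is sketched below; it may differ in detail from the construction on the slide, whose figure is not preserved. For every node and every pattern of its input values, a clause forbids the combination "inputs match this pattern and the node's value differs from the function output". Each clause mentions the K inputs plus the node itself, hence at most K+1 literals. The brute-force checker is a stand-in for a real SAT solver and is usable only on tiny instances.

```python
from itertools import product

def attractor_to_cnf(network):
    """network[i] = (inputs, f). Returns clauses as lists of (variable, required_value).

    For each node i and each input pattern a, the clause forbids the combination
    "inputs equal a and v_i differs from f(a)", so it has at most K+1 literals.
    """
    clauses = []
    for i, (inputs, f) in enumerate(network):
        for a in product((0, 1), repeat=len(inputs)):
            out = f(*a)
            # Literal (j, b) is true when v_j == b; negate the forbidden pattern.
            clause = [(j, 1 - bit) for j, bit in zip(inputs, a)]
            clause.append((i, out))
            clauses.append(clause)
    return clauses

def satisfying_assignments(clauses, n):
    """Brute-force stand-in for a SAT solver, for checking tiny instances only."""
    return [v for v in product((0, 1), repeat=n)
            if all(any(v[j] == b for j, b in clause) for clause in clauses)]
```

The satisfying assignments of the resulting formula are exactly the singleton attractors of the network.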
Basic Idea in $O(1.587^n)$ Time Algorithm • Consider recursive assignment of 0-1 values to nodes • (A) v=0 ⇒ u=0, v=1 ⇒ w=1 • (B) v=0 ⇒ u=0 and w=1 • Let f(k) be #(assignments) for a BN with k variables • By solving the resulting recurrence (like the Fibonacci numbers), f(k) is $O(1.4656^k)$ • However, the above procedure cannot be applied to all cases (e.g., not to bipartite networks) ⇒ combination with SAT is required ⇒ $O(1.587^n)$ time [Figure: cases (A) and (B) on nodes u, v, w; labels: "All nodes are OR", "NOT input"]
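The constant 1.4656 can be checked numerically. The recurrences below are a reading of the two cases rather than formulas preserved from the slide: in case (A) each branch fixes two variables, suggesting f(k) = 2 f(k−2), and in case (B) one branch fixes three variables and the other one, suggesting f(k) = f(k−1) + f(k−3). The helper name `growth_rate` is illustrative.

```python
def growth_rate(recurrence_offsets, lo=1.0, hi=2.0, iters=100):
    """Base x > 1 with sum(x**(-d) for d in offsets) == 1, found by bisection.

    For f(k) = sum of f(k - d) over the offsets d, f(k) grows like x**k.
    """
    def excess(x):
        return sum(x ** (-d) for d in recurrence_offsets) - 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:     # sum still too large: the base must be bigger
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

if __name__ == "__main__":
    print("case (A):", growth_rate((2, 2)))
    print("case (B):", growth_rate((1, 3)))
```

Case (B) is the worse of the two and gives the root of x³ = x² + 1, which matches the 1.4656 quoted on the slide.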
Attractor Detection: Previous Works (2) [Table comparing results for singleton attractors and cyclic attractors (recursive algorithm, average case); entries not recoverable]
BN-Control: Previous Works • Datta et al. defined a control problem for PBNs (a probabilistic extension of BNs) and proposed a dynamic programming-based method [Machine Learning, 52:169-191, 2003] • They also proposed various extensions • But their method must handle $2^n \times 2^n$ matrices • BN-Control (and also PBN-Control) is NP-hard • BN-Control can be solved in polynomial time if the network has a tree structure [Akutsu et al., JTB 2007] • Practical approach based on model checking/SAT [Langmead & Jha, APBC 2008, JBCB 2009] • Theoretical studies using the semi-tensor product [Cheng, 2009]
Definition of BN-Control • Input • BN • Internal nodes: $v_1, \dots, v_n$; external nodes: $u_1, \dots, u_m$ • Initial state: $v^0$; desired state: $v^M$ • Output • Sequence of states of external nodes: u(0), u(1), …, u(M) • such that $v(0) = v^0$ and $v(M) = v^M$ (leading to the desired state at time M) [Akutsu et al., J. Theo. Biol. 2007]
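Dynamic Programming for Control of BN • BN version of the algorithm by Datta et al. • DP table: $D[b_1, \dots, b_n, t]$ • Takes 1 if there is a control sequence leading from state $(b_1, \dots, b_n)$ at time t to the target state • Can be computed by [recurrence missing in source]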
Illustration of DP Algorithm [Figure: DP computation with $u_1 = 1$, $u_2 = 1$; D[1,1,1, 2] = 1, D[0,0,0, 2] = 0, D[0,1,1, 3] = 1] • But the size of the DP table is exponential
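A backward DP of this kind can be sketched as follows. The recurrence used here is an assumption, since the slide's formula is not preserved: a state b has D[b, t] = 1 exactly when some control u maps b to a state c with D[c, t+1] = 1, and D[b, M] = 1 only for the target state. The sketch returns controls u(0), …, u(M−1), and `toy_update` is a hypothetical two-node network invented for the example. The table holds up to 2^n states per time step, which is the exponential cost noted above.

```python
from itertools import product

def control_sequence(update, n, m, v0, target, horizon):
    """Find external inputs u(0..horizon-1) driving v0 to target at time `horizon`.

    update(v, u) -> next internal state; n internal nodes, m external nodes.
    Returns a list of control tuples, or None if the target is unreachable.
    """
    controls = list(product((0, 1), repeat=m))
    # reachable[t] plays the role of the DP table: states b with D[b, t] = 1.
    reachable = [set() for _ in range(horizon + 1)]
    reachable[horizon] = {tuple(target)}
    for t in range(horizon - 1, -1, -1):
        for b in product((0, 1), repeat=n):
            if any(update(b, u) in reachable[t + 1] for u in controls):
                reachable[t].add(b)
    if tuple(v0) not in reachable[0]:
        return None
    # Walk forward, always choosing a control that stays inside the table.
    seq, v = [], tuple(v0)
    for t in range(horizon):
        u = next(u for u in controls if update(v, u) in reachable[t + 1])
        seq.append(u)
        v = update(v, u)
    return seq

def toy_update(v, u):
    """Hypothetical 2-node network: v1' = u1, v2' = v1 and u1."""
    return (u[0], v[0] & u[0])
```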
Integer Programming • Linear Programming (LP) • Maximize (or minimize) a linear objective function under constraints given as linear inequalities • Integer Linear Programming (ILP) • LP + constraints that specified variables must take integer values • Several efficient solvers: CPLEX, Gurobi • Used for solving various NP-hard problems
ILP for Attractor Detection (1) • $x_i$: state of $v_i$
ILP for Attractor Detection (3) [Formulation shown as a figure; annotation: "dummy for using ILP"]
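The ILP formulation itself is not preserved in the extracted slides. As an illustration only, the sketch below uses a common textbook linearization of AND, OR, and NOT over 0/1 variables, which may differ from the encoding used in the lecture. Instead of calling a solver, it enumerates all 0/1 points and confirms that the feasible set of each group of linear constraints coincides with the truth table of the gate.

```python
from itertools import product

def and_constraints(y, x1, x2):
    """Linear inequalities that force y = x1 AND x2 when all variables are 0/1."""
    return y <= x1 and y <= x2 and y >= x1 + x2 - 1

def or_constraints(y, x1, x2):
    """Linear inequalities that force y = x1 OR x2 when all variables are 0/1."""
    return y >= x1 and y >= x2 and y <= x1 + x2

def not_constraint(y, x):
    """Linear equality y = 1 - x."""
    return y == 1 - x

def feasible(constraint, arity):
    """All 0/1 points satisfying the constraint: the ILP feasible set, by enumeration."""
    return {p for p in product((0, 1), repeat=arity) if constraint(*p)}
```

For singleton attractor detection, one such group of constraints per node, with the output variable identified with the node's own state variable $x_i$, would express the fixed-point condition.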
ILP Formalization for BN-Control [Formulation shown as a figure; annotation: "major changes from Attractor Detection"]
Summary • Boolean network • A discrete model of a genetic network • Similar to digital circuits • Attractor Detection/Enumeration • NP-hard • Much better than the naïve $O(2^n)$ bound for bounded-indegree cases • Identification of cyclic attractors is more difficult • Control of Boolean networks • NP-hard • Can be solved by a DP algorithm (but in exponential time) • Integer Linear Programming-based Approach • Simple • Flexible for modifications/extensions • Fast if indegree ≦ 2