Quiz #2 • What is pattern recognition? • Optimal classification error is: • The minimum classification error • The true error of the designed classifier • The estimated error of the designed classifier • None of the above
Quiz #2 • The classification error is: • The selection of the wrong feature vector • The probability of misclassification • A random variable • None of the above • A classifier is: • A function from the feature space to the space of all possible labels • A person who classifies • A random process • None of the above
Quiz #2 • The Bayes classifier is: • The classifier with the maximum designed error • The classifier designed by Bayes • The classifier with the minimal error • A classifier is: • A function from the feature space to the space of all possible labels • A person who classifies • A random process • None of the above
Quiz #2 • Describe the pattern recognition design cycle • The Bayes error is: • The smallest design error • <= 1/2 • > 1/2 • None of the above
Quiz #2 • Describe the goal(s) of Linear Discriminant Analysis • Predictors are • Specified by a feature vector • A real-valued function • A person making predictions • None of the above
Quiz #2 • Discuss the Boolean formalism in the context of gene regulatory networks • Define a gene regulatory network.
Probabilistic Boolean Networks • Share the appealing rule-based properties of Boolean networks. • Allow randomness to be modeled. • Explicitly represent probabilistic relationships between genes. • Robust in the face of uncertainty. • Dynamic behavior can be studied in the context of Markov chains. • Boolean networks are just special cases. • Allow quantification of the influence of genes on other genes.
PBN • PBN := ({BN1, …, BNk}, p1, …, pk, p, q) • 0 < p < 1: probability of switching context • 0 < pi < 1: probability of BNi being used • 0 < q < 1: probability of gene flipping • Context := which BN is used for the next transition ~ the regime in which the cell operates/functions • Gene flipping ~ mutation rate
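The definition above can be read as a simulation recipe. The following is a minimal sketch of one transition, under two assumptions that the slide does not spell out: a context switch redraws the network according to p1, …, pk, and when at least one gene flips, the flipped state replaces the Boolean update for that step. The function name `pbn_step` and the two toy networks are hypothetical.

```python
import random

def pbn_step(state, context, networks, selection_probs, p_switch, q_flip, rng=random):
    """One transition of a context-sensitive PBN (illustrative sketch).

    state           -- tuple of 0/1 gene values
    context         -- index of the Boolean network currently in use
    networks        -- list of functions mapping a state tuple to the next state tuple
    selection_probs -- p1..pk, used when the context is redrawn
    p_switch        -- p, probability of switching context
    q_flip          -- q, per-gene probability of flipping
    """
    # Assumption: on a switch, the new context is drawn according to p1..pk.
    if rng.random() < p_switch:
        context = rng.choices(range(len(networks)), weights=selection_probs)[0]
    # Each gene flips independently with probability q.
    flips = [rng.random() < q_flip for _ in state]
    if any(flips):
        # Assumption: a perturbation replaces the Boolean update for this step.
        next_state = tuple(x ^ int(f) for x, f in zip(state, flips))
    else:
        next_state = tuple(networks[context](state))
    return next_state, context

# Two hypothetical 2-gene Boolean networks.
swap = lambda s: (s[1], s[0])
reset = lambda s: (0, 0)

state, context = (0, 1), 0
for _ in range(5):
    state, context = pbn_step(state, context, [swap, reset], [0.5, 0.5], 0.1, 0.01)
```

Setting `p_switch` and `q_flip` to 0 recovers an ordinary Boolean network, which matches the remark that Boolean networks are special cases.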
Probabilistic Boolean Networks vs. Boolean Networks [Figure: two diagrams, labeled "Boolean Networks" and "Probabilistic Boolean Networks", each showing inputs x1, x2, x3, …, xn feeding xi]
Context Switching [Figure: two context networks over the genes X1, X2, X3, with arrows labeled by the probabilities p1, p2, p, and q]
PBN • Different combinations of functions determine different Boolean Networks • The model can be seen as a Markov Chain, where the transition is deterministic once the Boolean Network to use has been chosen (randomly) • Properties of the Boolean Networks determine properties of the Markov Chain • Transition probabilities • Stationary distributions • Steady-states (long-run behavior) • (We are not so interested in transients)
Probabilistic Boolean Networks • Influence: determinative power of the variables (genes) • Intervention: changing the behavior of some genes so that the network transitions to desired states • External control: PBNs that depend on an external (control) variable; of interest for treatment strategies
Attractors in PBNs • Attractors in the Boolean Networks should correspond to cellular types (Kauffman) • PBNs are formed by a family of Boolean Networks • Steady-state analysis of the PBN may be meaningful for classification based on gene-expression data • Relationships between the steady-state distribution and the attractors of the Boolean Networks allow for structural analysis of the network
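To make the notion of attractors and basins concrete, here is a small sketch that enumerates them for a single deterministic Boolean network by following every state until it revisits a state. The function name `attractors_and_basins` and the toy update rule are hypothetical, and exhaustive enumeration is only practical for small n.

```python
from itertools import product

def attractors_and_basins(update, n):
    """Map every state of an n-gene Boolean network to its attractor.

    update -- function mapping a state tuple to the next state tuple
    Returns a dict: state -> attractor, where an attractor is a tuple of
    the states on the cycle, rotated to start at its smallest state.
    """
    basin_of = {}
    for start in product((0, 1), repeat=n):
        path, seen = [], {}
        s = start
        # Follow the trajectory until it hits a known state or closes a cycle.
        while s not in seen and s not in basin_of:
            seen[s] = len(path)
            path.append(s)
            s = tuple(update(s))
        if s in basin_of:
            attractor = basin_of[s]
        else:
            cycle = path[seen[s]:]
            k = cycle.index(min(cycle))
            attractor = tuple(cycle[k:] + cycle[:k])
        for t in path:
            basin_of[t] = attractor
    return basin_of

# Hypothetical 2-gene network: both genes become x1 AND x2.
and_update = lambda s: (s[0] & s[1], s[0] & s[1])
basins = attractors_and_basins(and_update, 2)
attractors = set(basins.values())
```

In this toy network the fixed points (0, 0) and (1, 1) are the only attractors, and the basin of (0, 0) also contains the transient states (0, 1) and (1, 0).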
Dynamics of PBNs with perturbations [Figure: timeline while the same Boolean Network is being used. Labels: in a basin; in the attractor; change of function or perturbation; the system reaches the attractor; next change of function or perturbation]
Steady-state analysis • Steady-state: the state probability distribution of the network in the long run • In the long run, the system is expected to stay in the attractors of the Boolean Networks • From the same initial point the system can transition to two different regions (attractors) depending on the Boolean function being used
BN with perturbation • A state of the BNp is a vector s = [x1, …, xn] ∈ {0,1}^n • p ∈ (0,1]: models random gene mutations: • At each time point, each gene has probability p of flipping its value, therefore: • The Markov Chain is ergodic and aperiodic, so a Steady-State Distribution (SSD) exists
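As a sketch of why perturbation yields a steady-state distribution, the code below builds the transition matrix of a small BNp and approximates its SSD by power iteration. It assumes that genes flip independently with probability p and that the Boolean function is applied only when no gene flips; under that assumption every state can reach every other state in one step when 0 < p < 1. The function names and the toy network are hypothetical.

```python
from itertools import product

def bnp_transition_matrix(update, n, p):
    """Transition matrix of a Boolean network with perturbation (sketch)."""
    states = list(product((0, 1), repeat=n))
    index = {s: i for i, s in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for s in states:
        row = P[index[s]]
        # No gene flips: the Boolean function decides the next state.
        row[index[tuple(update(s))]] += (1 - p) ** n
        # Exactly the h genes that differ between s and t flip.
        for t in states:
            h = sum(a != b for a, b in zip(s, t))
            if h > 0:
                row[index[t]] += p ** h * (1 - p) ** (n - h)
    return states, P

def steady_state(P, iterations=100_000, tol=1e-13):
    """Approximate the stationary distribution by power iteration."""
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(iterations):
        nxt = [sum(pi[i] * P[i][j] for i in range(m)) for j in range(m)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical 2-gene network: both genes become x1 AND x2.
and_update = lambda s: (s[0] & s[1], s[0] & s[1])
states, P = bnp_transition_matrix(and_update, 2, 0.1)
ssd = dict(zip(states, steady_state(P)))
```

In this toy example most of the long-run mass sits on the two attractors (0, 0) and (1, 1), with more on (0, 0) because its basin is larger.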
Control Policy • A control policy Πg = {µg(t)}, t > 0, based on gene g, is a sequence of decision rules µg(t) : S → {0,1}, one for each step t, where: • S: collection of all states s of the network • 0/1: not flipping/flipping the control gene • MFPT control policy: • Stationary for each candidate gene
Mean First Passage Time (MFPT) Algorithm* • For therapeutic interventions, states are divided into 2 sets: • D: desirable • U: undesirable • KU: a vector containing the MFPTs from each state in D to U • KD: a vector containing the MFPTs from each state in U to D • γ: tuning parameter • Intuition behind the algorithm: • Given the control gene g, if a desirable state s reaches U on average faster than ŝg, it is reasonable to apply control and start the next network transition from ŝg • More formal statement of the algorithm: • For all states s in U (ŝg: the state s with the control gene g flipped): • If KD(s) - KD(ŝg) > γ then • µg(s) = 1 • else • µg(s) = 0 • γ is set to a higher value when the ratio of the cost of control to the cost of an undesirable state is higher, so that control is applied less often
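The decision rule can be sketched in code as follows. States are integer indices into a transition matrix P, and `flip` is a hypothetical helper that returns the index of ŝg. The rule for states in U is the one stated on the slide; the rule for states in D is an assumption derived from the stated intuition (control is applied when ŝg reaches U more than γ steps later than s). MFPTs are obtained here by fixed-point iteration, which is one of several possible ways to compute them.

```python
def mean_first_passage_times(P, target, sweeps=100_000, tol=1e-12):
    """Expected number of steps to first reach `target` from each state.

    States already in `target` get 0. Iterates k = 1 + P_CC k on the
    states outside the target until the values stop changing.
    """
    m = len(P)
    outside = [i for i in range(m) if i not in target]
    k = [0.0] * m
    for _ in range(sweeps):
        new = k[:]
        for i in outside:
            new[i] = 1.0 + sum(P[i][j] * k[j] for j in outside)
        delta = max(abs(a - b) for a, b in zip(new, k))
        k = new
        if delta < tol:
            break
    return k

def mfpt_policy(P, undesirable, flip, gamma):
    """Stationary 0/1 decision rule for one control gene (sketch)."""
    m = len(P)
    U = set(undesirable)
    D = set(range(m)) - U
    K_D = mean_first_passage_times(P, D)  # time to reach D
    K_U = mean_first_passage_times(P, U)  # time to reach U
    policy = [0] * m
    for s in range(m):
        t = flip(s)
        if s in U:
            gain = K_D[s] - K_D[t]  # rule stated on the slide
        else:
            gain = K_U[t] - K_U[s]  # assumption, from the intuition
        policy[s] = 1 if gain > gamma else 0
    return policy

# Hypothetical 4-state chain; state index = 2*x1 + x2, x1 = 1 is undesirable.
P = [[0.90, 0.05, 0.05, 0.00],
     [0.10, 0.10, 0.40, 0.40],
     [0.50, 0.00, 0.50, 0.00],
     [0.00, 0.05, 0.05, 0.90]]
policy = mfpt_policy(P, {2, 3}, lambda s: s ^ 1, gamma=1.0)
```

Raising `gamma` makes the rule more conservative: with a large enough value no state is controlled.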
Methodology • Inducing the control policy of the reduced network on the original network • Genes that cannot be deleted: • Xi: partitions states into D and U • Xj: control gene • The MFPT control policy for the original network is designed and has 2^n decision rules • The MFPT control policy of the reduced network is designed and has 2^(n-1) decision rules, since the number of genes is n-1 in the reduced network • The control policy designed on the reduced network induces another control policy for the original network • The original and induced control policies of the original network were compared
Simulations • 100 BN0.1 • n = number of genes = 7 • γ: tuning parameter in the MFPT algorithm • γ takes 200 equally spaced values in the range 0 to 6 • The same values of γ were used for designing control policies in the original and reduced networks • Hamming distance is used for measuring the difference between the original MFPT CP and the induced CP of the original network
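The comparison can be sketched as below. The slides do not say how the reduced policy is induced on the original state space; this sketch assumes the simplest choice, in which every original state inherits the decision of the reduced state obtained by dropping the deleted gene. The function names and the toy policies are hypothetical.

```python
from itertools import product

def induce_policy(reduced_policy, n, deleted):
    """Lift a policy on n-1 genes to n genes (assumed induction rule).

    reduced_policy -- dict: state tuple of length n-1 -> 0/1
    deleted        -- position (0-based) of the gene removed by the reduction
    """
    induced = {}
    for s in product((0, 1), repeat=n):
        reduced_state = s[:deleted] + s[deleted + 1:]
        induced[s] = reduced_policy[reduced_state]
    return induced

def hamming_distance(policy_a, policy_b):
    """Number of states on which two policies disagree."""
    return sum(policy_a[s] != policy_b[s] for s in policy_a)

# Hypothetical 2-gene example with the second gene deleted.
original = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
reduced = {(0,): 0, (1,): 1}
induced = induce_policy(reduced, n=2, deleted=1)
distance = hamming_distance(original, induced)
```

For n = 7 the two policies have 2^7 = 128 entries each, so the distance lies between 0 and 128.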
Results – Inducing CP [Figure: average Hamming distance for 100 BN0.1, 200 different values of γ]
SSD Shift • Comparing shifts in the Steady-State Distributions of the original and reduced networks • Xi: partitions states into D and U • Xj: control gene • Genes that cannot be deleted: • An MFPT control policy with 2^n decision rules is designed for the original network, using γ • γ: MFPTs of D in the original network divided by 100 • An MFPT control policy with 2^(n-1) decision rules is designed for the reduced network, using the same γ • The shifts in SSD of the original and reduced networks were compared [Figure: stationary policy applied on the WNT5a network using pirin as a control gene; panels: original network, applying stationary control policy] R. Pal, A. Datta and E. R. Dougherty, “Optimal Infinite Horizon Control for Probabilistic Boolean Networks”, IEEE Transactions on Signal Processing, Vol. 54, no. 6, 2375-2387, 2006.
SSD Shift • SSD shift for one BN0.1, comparing the total mass in U and D of the original network with the same measure in the reduced network • The 4th gene from the right is deleted, resulting in the following reduced network
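A minimal sketch of how an SSD shift can be measured for one network: apply a stationary 0/1 policy by starting the transition from the flipped state wherever the policy says 1, then compare the steady-state mass in U before and after. The 4-state matrix, the policy, and all function names are hypothetical illustrations, not the WNT5a network.

```python
def steady_state(P, iterations=100_000, tol=1e-13):
    """Approximate the stationary distribution by power iteration."""
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(iterations):
        nxt = [sum(pi[i] * P[i][j] for i in range(m)) for j in range(m)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

def controlled_matrix(P, policy, flip):
    """Row s is taken from the flipped state when policy[s] == 1."""
    return [list(P[flip(s)]) if policy[s] else list(P[s]) for s in range(len(P))]

def mass_in(pi, states):
    """Total steady-state probability of a set of states."""
    return sum(pi[s] for s in states)

# Hypothetical 4-state chain; state index = 2*x1 + x2, U = {x1 = 1}.
P = [[0.90, 0.05, 0.05, 0.00],
     [0.10, 0.10, 0.40, 0.40],
     [0.50, 0.00, 0.50, 0.00],
     [0.00, 0.05, 0.05, 0.90]]
U = {2, 3}
policy = [0, 1, 0, 1]        # flip the control gene x2 in states 1 and 3
flip = lambda s: s ^ 1       # x2 is the lowest bit of the index

before = mass_in(steady_state(P), U)
after = mass_in(steady_state(controlled_matrix(P, policy, flip)), U)
shift = before - after       # positive when control moves mass out of U
```

Running the same measurement on a reduced network with its own policy gives the second quantity needed for the comparison described above.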
[Figure: diagram with the stages: formulate the question; organizing and cleaning data; interpretation of results; normalize data; analyze data]
http://gsp.tamu.edu/Publications http://gsp.tamu.edu/Publications/books