
Reasoning about Uncertainty in Biological Systems

Reasoning about Uncertainty in Biological Systems. Andrei Doncescu, LAAS-CNRS. Aix-en-Provence, 18 September. Structural Bioinformatics.



Presentation Transcript


  1. Reasoning about Uncertainty in Biological Systems • Andrei Doncescu, LAAS-CNRS • Aix-en-Provence, 18 September

  2. Structural Bioinformatics • Cells buzz with activity: they take in nutrients and convert them to energy for a number of purposes, reproduce themselves, and are constantly called upon to synthesize protein molecules • Gene: a segment of DNA that is programmed for the production of a specific protein • Gene expression: the cell produces the protein encoded by a particular gene • Genome: the entire set of genetic instructions for a given organism • Nucleotide: the fundamental unit of DNA and RNA • Protein: a molecule consisting of up to thousands of amino acids • Amino acid: one of a class of 20 different molecules (composed of C, H, N, O, S) that can bond to one another

  3. Structure and Modeling of Metabolic Pathways • DNA: Genome • RNA: Transcriptomics • Proteins: Proteomics • Metabolites: Metabolomics/Fluxomics

  4. Systemic approach: reconciliation of the 3 levels of observation (3M: macro, micro, molecular) • MACROSCOPIC LEVEL (tool: bioreactor): metrology; kinetics; stoichiometry, media; classification of populations; physico-mechanical and physico-chemical environment; hydrodynamics, transfers; mixing power, macro- and micromixing, reactivity, coupled systems; expert systems, supervision; scale-up and scale-down; CFD • MICROSCOPIC LEVEL (microorganism: a production facility): biological kinetics; implementation; metabolic flux, fluxome; metabolic network; in vivo and ex vivo enzymology, stock flux, energy/matter; thermodynamics; in vivo and ex vivo NMR; structured modelling and metabolic descriptors • MOLECULAR LEVEL (information flux, signal): biochips; DNA, proteins, bioinformatics; networks of genes, of proteins, of metabolites; metabolome, transcriptome, proteome

  5. Scientific Reasoning [Diagram labels: Hypothesis, Generation, Deduction, Abduction, Prediction, Observation, Verification]

  6. Reasoning about biological systems • Construction of a system model • The task of forming a model to explain a given set of experimental results is called model identification. • This is a form of inductive inference. For example, if the levels of the metabolites in glycolysis are observed over a series of time steps, and the reactions of glycolysis are inferred from these data, this is model identification. • Simulation of the system behavior based on the constructed model • This is a form of deductive inference. For example, a dynamic model of glycolysis might tell you how the level of pyruvate in a cell varies over time as the amount of glucose increases. If the deductive predictions of a model are inconsistent with observed behaviour, then the model is falsified. • A model is a simplified description of a complex entity or process and consists of: • A set of system constraints in terms of state variables • And/or their time derivatives
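The deductive step described on this slide can be sketched with a toy model. The following is only an illustration under stated assumptions: the two state variables, the first-order kinetics, the rate constants `k_glyc` and `k_use`, and the helper names `simulate` and `is_falsified` are all hypothetical and do not come from the presentation. The only biochemical fact used is that one glucose molecule yields two pyruvate molecules in glycolysis.

```python
def simulate(glucose0, k_glyc=0.5, k_use=0.2, dt=0.01, t_end=10.0):
    """Euler integration of a toy two-variable model.

    State variables: glucose and pyruvate levels.
    Time derivatives (hypothetical first-order kinetics):
        d(glucose)/dt  = -k_glyc * glucose
        d(pyruvate)/dt = 2 * k_glyc * glucose - k_use * pyruvate
    Returns a list of (time, glucose, pyruvate) tuples.
    """
    glucose, pyruvate = glucose0, 0.0
    trajectory = [(0.0, glucose, pyruvate)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        d_glucose = -k_glyc * glucose
        d_pyruvate = 2.0 * k_glyc * glucose - k_use * pyruvate
        glucose += dt * d_glucose
        pyruvate += dt * d_pyruvate
        trajectory.append((i * dt, glucose, pyruvate))
    return trajectory


def is_falsified(predicted, observed, tolerance):
    """A model is rejected if any prediction misses its observation by more than tolerance."""
    return any(abs(p - o) > tolerance for p, o in zip(predicted, observed))
```

Comparing the simulated pyruvate curve against measurements with `is_falsified` mirrors the falsification criterion stated above; a real study would use measured kinetics and a proper ODE solver.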

  7. Representation of Biological Systems • Directed graphs (for example, decision trees, cluster analysis) • Matrix models (for example, linear systems, Markov processes) • Dynamical systems • Cellular automata

  8. The Problem [Diagram labels: M, G; activation, inhibition] • The development of molecular biology produces a huge quantity of data • Interactions between molecules affect cell behavior • Mathematical models are used to extract the emergent laws of these combinatorial interactions • Difficulties: • Interactions are non-linear • Model parameters are difficult to measure

  9. Our approach [Diagram labels: Measures (3 levels of analysis); Relevant Information; Fuzzy Logic; Hierarchical Classification; Inductive Logic Programming; Classification Machine; Hypotheses or « Classes »; Biological Knowledge; Biological Rules]

  10. Time Series • Time series analysis is often associated with the discovery of patterns such as: • Increasing trends • Decreasing trends • Frequency of sequences, repeating sequences • Prediction of future values, called forecasting in the time series context
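The pattern types listed on this slide can be illustrated with a very small sketch. It labels each step of a series as increasing, decreasing or constant, and adds a deliberately naive forecast by linear extrapolation. The function names `label_trends` and `forecast_next`, and the threshold `eps`, are hypothetical choices for illustration, not part of the presentation.

```python
def label_trends(values, eps=1e-9):
    """Label each consecutive pair of samples by the sign of its difference."""
    labels = []
    for prev, curr in zip(values, values[1:]):
        diff = curr - prev
        if diff > eps:
            labels.append("increasing")
        elif diff < -eps:
            labels.append("decreasing")
        else:
            labels.append("constant")
    return labels


def forecast_next(values):
    """Naive forecast: extend the straight line through the last two samples."""
    return values[-1] + (values[-1] - values[-2])
```

For example, `label_trends([1, 2, 2, 1])` gives one label per step of the series; real fermentation data would need smoothing before differencing.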

  11. Batch Fermentation [Figure: CENPK 133-7D ("CFM", glucose 15 g/l); glucose, biomass, ethanol and glycerol concentrations (g/l) versus time (h); marked phase: fermentative metabolism]

  12. Batch Fermentation [Figure: CENPK 133-7D ("CFM", glucose 15 g/l); glucose, biomass, ethanol and glycerol concentrations (g/l) versus time (h); marked phases: fermentative metabolism, diauxie]

  13. Batch Fermentation [Figure: CENPK 133-7D ("CFM", glucose 15 g/l); glucose, biomass, ethanol and glycerol concentrations (g/l) versus time (h); marked phases: fermentative metabolism, diauxie, oxidative metabolism] • µmax = 0.45 h⁻¹ • YS/X = 0.37 g·(g glucose)⁻¹

  14. Formalization of our problem: CProgol4.4 • We have 4 potential states for the bioreactor (e1, e2, e3, e4) • We add a specific state e5 corresponding to a stationary state • The predicate to learn with our ILP machine is: to_state(Ei,Et,P1,P2,T) • We want to obtain a causal relationship between the transitions of the system and the values of the differential, or the wavelet coefficients, of the curve

  15. Formalization of our problem • Solution: add a predicate derive(P1,P,T) • It expresses the fact that, for the curve of the parameter P at time T, the value of the differential is P1
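As an illustration of how such background facts might be produced from measured curves, the sketch below estimates the differential by central finite differences and prints one fact per interior time point. This is an assumption about the preprocessing, not the authors' pipeline: the helper name `derive_facts`, the finite-difference scheme and the three-decimal formatting are hypothetical. The argument order follows the definition on this slide (differential value, parameter, time).

```python
def derive_facts(param, times, values):
    """Build derive(P1,P,T) facts, with P1 estimated by a central difference."""
    facts = []
    for i in range(1, len(times) - 1):
        # Slope between the two neighbours of sample i.
        slope = (values[i + 1] - values[i - 1]) / (times[i + 1] - times[i - 1])
        facts.append(f"derive({slope:.3f},{param},{times[i]}).")
    return facts
```

The resulting strings could be written to a background-knowledge file for an ILP system; the end points are skipped because a central difference is undefined there.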

  16. Results • We obtain many rules; the following one can be explained by biochemical experts: • to_state(E,E,A,B,C,T) :- derive(p1,A,T), derive(p2,B,T), derive(p3,C,T), positive(p1,T), positive(p2,T), positive(p3,T).

  17. Visualisation of system evolution [Figure: pH, CO2 and X versus time; class membership versus time for the classes fermentative, diauxie, oxidative and end of batch] • This rule indicates that there is no evolution of the metabolic state (the bioreactor remains in the same state) when the parameters have an increasing slope and no maximum or minimum is encountered • Instead of simply giving classification results, we get logical rules establishing a causal relationship between different parameters of the bio-machinery.

  18. Data Processing : Regularity Analysis

  19. Acid • Amino acid consumption • How can a singularity be characterized?

  20. Which tool for on-line analysis? • Multifractal analysis studies functions whose pointwise regularity varies from one point to another • Differentiability, continuity • Hölder exponent

  21. Lipschitz Regularity • A signal is considered regular if it can be approximated by a polynomial • Lipschitz regularity measures the error of this polynomial approximation

  22. Analysis of singularities • The Taylor expansion of f at x0 • Using wavelet analysis, the dominant behavior is given by the term:

  23. Characterization of the Lipschitz exponent • Definition • A function $f$ is Lipschitz of order $\alpha \ge 0$ at a point $x_0$ if there exist $K > 0$ and a polynomial $p$ of degree $m = \lfloor \alpha \rfloor$ such that, for all $x$: $|f(x) - p(x)| \le K\,|x - x_0|^{\alpha}$

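  24. Fourier Condition • Theorem: a function $f$ is bounded and uniformly Lipschitz $\alpha$ on $\mathbb{R}$ if: $\int_{-\infty}^{+\infty} |\hat{f}(\omega)|\,(1 + |\omega|^{\alpha})\,d\omega < +\infty$ • Global regularity condition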

  25. Hölder Regularity • The Hölder exponent measures the remainder of a Taylor expansion • It characterizes the local scaling properties • It measures the local regularity/differentiability • It is linked to the decay rate of the Fourier and wavelet coefficients

  26. Hölder Regularity • Measures the local differentiability: • α ≥ 1: f(t) is continuous and differentiable • 0 < α < 1: f(t) is continuous but not differentiable • -1 < α ≤ 0: f(t) is discontinuous and non-differentiable • α ≤ -1: f(t) is no longer locally integrable
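The four ranges on this slide translate directly into a lookup, which can be handy when an estimated exponent has to be turned into a qualitative label. The function name `describe_regularity` is hypothetical; the thresholds are exactly those listed above.

```python
def describe_regularity(alpha):
    """Map a Hölder exponent to the qualitative classes listed on the slide."""
    if alpha >= 1:
        return "continuous and differentiable"
    if alpha > 0:
        return "continuous but not differentiable"
    if alpha > -1:
        return "discontinuous and non-differentiable"
    return "no longer locally integrable"
```

How the exponent itself is estimated (for instance from the decay of wavelet coefficients across scales) is a separate step that this sketch does not cover.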

  27. Characterization of the Lipschitz exponent by CWT • Theorem • If f is Lipschitz α at x0, with α ≤ n, then: • f(x) is Lipschitz α at x0, with 0 ≤ α ≤ n, if:

  28. Wavelets • Efficient for non-stationary signals • Good localization in time and frequency • The wavelet transform is defined as an integral operator that transforms a finite-energy signal f(x) ∈ L²(R) using a set of functions ψ_{a,b}: • WT(f, ψ_{a,b}) = < f | ψ_{a,b} > • where < · | · > is the inner product

  29. Morlet Wavelets Elementary Function : The wavelet coefficients are numbers :
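Since the formulas on this slide were not preserved, the sketch below uses the textbook complex Morlet wavelet, ψ(t) = π^(-1/4) · exp(iω0·t) · exp(-t²/2), and approximates one wavelet coefficient as a discrete inner product between the signal and the scaled, shifted wavelet. The value ω0 = 5, the 1/√a normalisation and the function names `morlet` and `wavelet_coefficient` are assumptions for illustration, not necessarily the definitions used in the talk.

```python
import cmath
import math


def morlet(t, omega0=5.0):
    """Textbook complex Morlet wavelet (admissibility correction term omitted)."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * t) * math.exp(-0.5 * t * t)


def wavelet_coefficient(signal, dt, a, b, omega0=5.0):
    """Riemann-sum approximation of <f | psi_{a,b}> at scale a and position b.

    signal: samples f(k * dt); the result is a complex number.
    """
    total = 0j
    for k, value in enumerate(signal):
        u = (k * dt - b) / a
        total += value * morlet(u, omega0).conjugate()
    return total * dt / math.sqrt(a)
```

For a pure sinusoid of angular frequency ω, the modulus of the coefficient peaks near the scale a = ω0/ω, which is the time-frequency localization the previous slide refers to. A production analysis would use an FFT-based implementation rather than this direct sum.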

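  30. Combining time and frequency: Short-time Fourier Transform • Time: < s(.) , δ(. - t) > • Frequency: < s(.) , δ(. - f) > • Time-frequency: < s(.) , g_{t,f}(.) > = Q(t,f) = < s(.) , T_t F_f g_0(.) >, with T_t the shift in time and F_f the shift in frequency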

  31. Combining time and frequency: Wavelet Transform • D_a (dilation) and T_t (shift in time) map Ψ0(u) to Ψ0((u - t)/a) • < s(.) , T_t D_a Ψ0 > = O(t, f = f0/a) [Figure axes: time, frequency]

  32. The maximum modulus of the wavelet transform (MMWT) is equivalent to the Canny edge detector.

  33. Detection of singularities (Hölder exponent < 0) • Temporal segmentation • Computation of the correlation between the signal used to control the fermentation and the other signals • Comparison of the sign of the correlation before and after each singularity • Goal: distinguish biological phenomena from biophysical phenomena (fed-batch)
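The comparison step described on this slide can be sketched as follows: split two signals at the index of a detected singularity, compute the Pearson correlation on each side, and report whether its sign flips. The function names and the convention of returning 0.0 for a constant segment are hypothetical; how the sign change is interpreted biologically is not specified here and is left to the analyst.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0
    return cov / (sx * sy)


def correlation_sign_change(control, other, split):
    """Correlation before and after index `split`, and whether its sign flipped."""
    before = pearson(control[:split], other[:split])
    after = pearson(control[split:], other[split:])
    return before, after, before * after < 0
```

For instance, with `control = [1, 2, 3, 4, 5, 6]`, `other = [1, 2, 3, 3, 2, 1]` and `split = 3`, the two signals move together before the split and in opposite directions after it, so the sign flips.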

  34. Fed-batch process for biomass production

  35. Oxidation • Spontaneous oscillations of the yeast

  36. Fuzzy Logic : Clustering and Aggregation

  37. Our approach [Diagram labels: Measures (3 levels of analysis); Relevant Information; Fuzzy Logic; Hierarchical Classification; Inductive Logic Programming; Classification Machine; Hypotheses or « Classes »; Biological Knowledge; Biological Rules]

  38. Fuzzy Logic • A logic is defined semantically, using truth tables or Boolean algebra, and syntactically, via a proof method • Fuzzy logic is based on real numbers • It deals with vagueness, e.g. for formalising common natural language

  39. LAMDA (Learning Algorithm for Multivariate Data Analysis) • Object X = (x1, x2, …, xn) • Marginal Adequacy Degree (DAM, Degré d'Adéquation Marginal) of each xi for the class Cj • Logical aggregation operators • Global Adequacy Degree (DAG, Degré d'Adéquation Global) for the class Cj: mCj(X)

  40. DAM = Membership function • Parametrized membership function • Its solution is given by a similar membership function • Membership is defined as a function of the distance d(x) between a given object and a standard member

  41. LAMDA • Generalization of a binomial law from {0,1} to [0,1] • $DAM_{ij}(x_i) = \rho_{ij}^{\,a(x_i,c_{ij})}\,(1 - \rho_{ij})^{\,1 - a(x_i,c_{ij})}$ • $a(x_i,c_{ij}) = 1 - {}$distance between $x_i$ and $c_{ij}$ • $\rho_{ij}$ depends on the statistical properties of the class
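A minimal sketch of this marginal adequacy degree, reading the formula on this slide as the binomial form ρ^a · (1 - ρ)^(1 - a). The symbol ρ, the use of an absolute difference as the distance, and the function name `marginal_adequacy` are assumptions for illustration; inputs are taken to be normalised to [0, 1] and ρ to lie strictly between 0 and 1.

```python
def marginal_adequacy(x, center, rho):
    """Binomial-type membership of descriptor value x in a class.

    x, center: values normalised to [0, 1] (assumption).
    rho: class parameter with 0 < rho < 1 (assumption).
    """
    a = 1.0 - abs(x - center)  # a = 1 - distance between x and the class centre
    return rho ** a * (1.0 - rho) ** (1.0 - a)
```

When the object sits exactly on the class centre, a = 1 and the membership equals ρ; with ρ = 0.5 the membership is 0.5 whatever the distance, so the class carries no information about that descriptor.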

  42. Aggregation Operators • Cognitive independence

  43. Definition • An aggregation operator is simply a function that assigns a real number y to any n-tuple (x1, x2, …, xn) of real numbers: y = Aggreg(x1, x2, …, xn) • We require an aggregation operator to satisfy: • Identity when unary: Aggreg(x) = x • Boundary conditions: Aggreg(0,…,0) = 0 and Aggreg(1,…,1) = 1 • Non-decreasing: Aggreg(x1,…,xn) ≤ Aggreg(y1,…,yn) if (x1,…,xn) ≤ (y1,…,yn)

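  44. T-norm • A t-norm is a function * : [0,1]² → [0,1] such that for all x, y, z ∈ [0,1]: • Commutativity: x * y = y * x • Associativity: (x * y) * z = x * (y * z) • Monotonicity: x * y ≤ x * z if y ≤ z • Identity: x * 1 = x • Examples: • Łukasiewicz t-norm: max(0, x + y - 1) • Gödel t-norm: min(x, y) • Product t-norm: x · y • T-norms generalize intersection to fuzzy sets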
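The three t-norms named on this slide have standard closed forms, and the four axioms can be checked numerically on a grid. The sketch below does this; the checker `satisfies_t_norm_axioms`, the grid and the tolerance are hypothetical illustration choices, and a grid check is of course evidence rather than a proof.

```python
def lukasiewicz(x, y):
    """Łukasiewicz t-norm."""
    return max(0.0, x + y - 1.0)


def godel(x, y):
    """Gödel (minimum) t-norm."""
    return min(x, y)


def product(x, y):
    """Product t-norm."""
    return x * y


def satisfies_t_norm_axioms(t, grid, tol=1e-12):
    """Check identity, commutativity, associativity and monotonicity on a grid."""
    for x in grid:
        if abs(t(x, 1.0) - x) > tol:  # identity element 1
            return False
        for y in grid:
            if abs(t(x, y) - t(y, x)) > tol:  # commutativity
                return False
            for z in grid:
                if abs(t(t(x, y), z) - t(x, t(y, z))) > tol:  # associativity
                    return False
                if y <= z and t(x, y) > t(x, z) + tol:  # monotonicity
                    return False
    return True
```

With `grid = [i / 10 for i in range(11)]` all three functions pass, whereas the arithmetic mean fails the identity axiom, which is one way to see that mean operators form a different family.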

  45. Mean Operator • A mean operator is a function * : [0,1]² → [0,1] such that: • Example: • Median • Bisymmetrical

  46. Reinforcement • One characteristic of many types of human information processing is what Yager and Rybalov call full reinforcement. • A collection of high scores reinforce each other to give a resulting score more affirmative than any of the individual scores alone; conversely, a collection of low scores tend to reinforce each other to give a resulting score more "disfirmative" than any of the individual scores. • Good modeling of human behavior • Refines the information related to the real world

  47. Completely Reinforced Operators 3 • (Silvert 1979, Yager & Rybalov 1998) Completely reinforced and symmetrical sum: If then If then

  48. Remark • T-norms are negatively reinforced, but they are not positively reinforced • T-conorms are positively reinforced, but they are not negatively reinforced • A combination of t-norms and t-conorms is not completely reinforced • Mean operators are, by definition, neither positively nor negatively reinforced
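These remarks can be checked numerically on a pair of high scores and a pair of low scores. The sketch below uses the product t-norm, its dual t-conorm (the probabilistic sum), the arithmetic mean, and the symmetric sum xy / (xy + (1 - x)(1 - y)) as an example of a fully reinforced operator. Choosing these particular operators is an assumption for illustration; the formulas on the "Completely Reinforced Operators" slide were not preserved in the transcript.

```python
def product_t_norm(x, y):
    return x * y


def probabilistic_sum(x, y):
    """T-conorm dual to the product t-norm."""
    return x + y - x * y


def arithmetic_mean(x, y):
    return (x + y) / 2.0


def symmetric_sum(x, y):
    """Product-based symmetric sum; undefined for the pairs (0, 1) and (1, 0)."""
    numerator = x * y
    return numerator / (numerator + (1.0 - x) * (1.0 - y))


def reinforcement(op, high=0.8, low=0.2):
    """Return (positive, negative): does op exceed the common high score / fall below the common low score?"""
    return op(high, high) > high, op(low, low) < low
```

Calling `reinforcement` on each operator reproduces the pattern stated above: the t-norm only pushes scores down, the t-conorm only pushes them up, the mean does neither, and the symmetric sum does both.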

  49. Mean 3 • Approach: Mean Operator Generatrix Function: positive and increasing

  50. A new mean: Mean 3 • Commutativity: M3(x,y) = M3(y,x) • Monotonicity: M3(x,y) ≤ M3(z,t) if x ≤ z and y ≤ t • Idempotence: M3(x,…,x) = x • Self-identity: M3[B, <MPI(B)>] = M3(B) • The first three conditions can be deduced easily from the properties of the product of n-square functions
