Detecting active subnetworks in metabolic interaction graphs with missing data PI: Fritz Roth Student: Luke Hunter
Introduction • What is systems biology? • Enumerate components • Enumerate relationships • Study with a model (computational/mathematical) • Analyze, predict, and interpret results • Why is systems biology important? • Increased amount of data • High throughput technology • Fast sharing of information (internet) • Understand emergent properties
Project Goal and Applications • Goal • Find regions of metabolism that are altered by a drug, environment, or disease • Applications • Determine what causes a disease phenotype • Study the effect of drugs on metabolism • Study action and signaling pathways • Determine how biological modules communicate • Remove human bias from “pathway” assignment
Step 1: Enumerate Components (1) • Determine differential expression of metabolites • Use mass spectrometry • Use t-test to obtain p-values • Null hypothesis is that control and disease patient metabolite concentrations are identical (Sabatine, M., et al., 2005)
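Step 1: Enumerate Components (2) • The inverse cumulative normal distribution function converts p-values to z-scores: $z_i = \Phi^{-1}(1 - p_i)$, where $\Phi$ is the standard normal cumulative distribution function and $p_i$ is the p-value of metabolite $i$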
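A minimal sketch of the p-value to z-score conversion, assuming the convention z = Φ⁻¹(1 − p) used by Ideker et al. (2002), so that small p-values map to large positive z-scores. The metabolite names and p-values are hypothetical placeholders, not data from the study.

```python
from statistics import NormalDist


def p_to_z(p: float) -> float:
    """Map a p-value to a z-score with the inverse standard normal CDF."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Assumed convention: z = Phi^{-1}(1 - p), so small p gives large z.
    return NormalDist().inv_cdf(1.0 - p)


# Hypothetical p-values from per-metabolite t-tests.
p_values = {"glucose": 0.001, "lactate": 0.5, "citrate": 0.9}
z_scores = {name: p_to_z(p) for name, p in p_values.items()}

for name, z in z_scores.items():
    print(f"{name}: p={p_values[name]:.3f} -> z={z:.3f}")
```

A p-value of exactly 0 or 1 has no finite z-score, so the sketch rejects those inputs instead of clipping them.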
Step 2: Enumerate Relationships (1) • What is a graph? • A graph is an organizational structure made up of nodes and edges: • Represent metabolism as graph • Nodes are metabolites (with z-scores) • Edges are reactions connecting those metabolites
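One possible way to hold such a graph in memory is sketched below: a dictionary of z-scores for the nodes and an adjacency dictionary for the edges. The class name, the three metabolites, and their z-scores are hypothetical; storing `None` for an unmeasured metabolite is an assumption made here to reflect the missing-data setting in the title.

```python
from collections import defaultdict


class MetaboliteGraph:
    """Undirected graph: nodes are metabolites, edges are reactions."""

    def __init__(self):
        self.z = {}                        # metabolite -> z-score, or None
        self.neighbors = defaultdict(set)  # metabolite -> adjacent metabolites

    def add_metabolite(self, name, z=None):
        # z=None marks a metabolite that was not measured (assumption).
        self.z[name] = z
        self.neighbors[name]  # create an empty adjacency entry

    def add_reaction(self, a, b):
        # A reaction connects two metabolites; add any that are missing.
        for name in (a, b):
            if name not in self.z:
                self.add_metabolite(name)
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)


# Hypothetical fragment of glycolysis with made-up z-scores.
g = MetaboliteGraph()
g.add_metabolite("glucose", z=2.8)
g.add_metabolite("glucose-6-phosphate", z=1.9)
g.add_reaction("glucose", "glucose-6-phosphate")
g.add_reaction("glucose-6-phosphate", "fructose-6-phosphate")  # unmeasured

measured = sorted(m for m, z in g.z.items() if z is not None)
print(measured)
```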
Step 2: Enumerate Relationships (2) • Kyoto Encyclopedia of Genes and Genomes (KEGG) • Compounds • Glycans • Reactions
Step 3: Modeling (1): Scoring Functions • What is an active subnetwork? • Scoring functions: • Naïve • Ideker et al. (2002) • Whitlock (2005) • Geometric Mean
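Two of the listed scoring functions have standard published forms, sketched here: the aggregate z-score of Ideker et al. (2002), which is the sum of the z-scores divided by the square root of the subnetwork size, and the weighted Z-method of Whitlock (2005). The slide's own formulas did not survive, so the naïve and geometric-mean variants are not reconstructed, and the example z-scores and weights are hypothetical.

```python
import math


def ideker_score(z_scores):
    """Aggregate z-score: sum(z_i) / sqrt(k) for a subnetwork of k nodes."""
    if not z_scores:
        raise ValueError("need at least one z-score")
    return sum(z_scores) / math.sqrt(len(z_scores))


def weighted_z(z_scores, weights):
    """Weighted Z-method: sum(w_i * z_i) / sqrt(sum(w_i ** 2))."""
    if not z_scores or len(z_scores) != len(weights):
        raise ValueError("z_scores and weights must be non-empty and equal length")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    return numerator / math.sqrt(sum(w * w for w in weights))


# Hypothetical subnetwork of four metabolites.
zs = [2.1, 1.4, 0.3, 1.9]
ws = [1.0, 1.0, 0.5, 2.0]  # hypothetical weights, e.g. measurement confidence
print(ideker_score(zs))
print(weighted_z(zs, ws))
```

With equal weights the weighted Z-method reduces to the Ideker aggregate score, which makes the two easy to compare on the same subnetwork.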
Step 3: Modeling (2): Simulated Annealing (SA) • Ideker et al. (2002)
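The search can be sketched roughly along the lines of Ideker et al. (2002): every node is either active or inactive, one random node is toggled per iteration, and the move is kept if the best connected active component scores higher, or otherwise with probability exp(Δ/T). This is a simplified illustration, not the project's implementation. The geometric cooling schedule, the parameter defaults, and the toy graph are assumptions, and every node is assumed to have a z-score (missing data is not handled here).

```python
import math
import random


def components(active, neighbors):
    """Yield connected components of the subgraph induced by `active`."""
    seen = set()
    for start in active:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in neighbors.get(node, ()):
                if nxt in active and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        yield comp


def best_component(active, neighbors, z):
    """Return (score, component) for the highest-scoring active component."""
    best_score, best_comp = float("-inf"), set()
    for comp in components(active, neighbors):
        score = sum(z[n] for n in comp) / math.sqrt(len(comp))
        if score > best_score:
            best_score, best_comp = score, comp
    return best_score, best_comp


def anneal(neighbors, z, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    nodes = sorted(z)
    active = {n for n in nodes if rng.random() < 0.5}  # random start state
    score, comp = best_component(active, neighbors, z)
    top_score, top_comp = score, set(comp)
    for i in range(steps):
        temp = t_start * (t_end / t_start) ** (i / steps)  # geometric cooling
        node = rng.choice(nodes)
        active ^= {node}  # toggle the node's state
        new_score, new_comp = best_component(active, neighbors, z)
        accept = new_score >= score or \
            rng.random() < math.exp((new_score - score) / temp)
        if accept:
            score, comp = new_score, new_comp
            if score > top_score:
                top_score, top_comp = score, set(comp)
        else:
            active ^= {node}  # undo the toggle
    return top_score, top_comp


# Hypothetical chain a-b-c-d-e with made-up z-scores.
neighbors = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
             "d": {"c", "e"}, "e": {"d"}}
z = {"a": 3.0, "b": 3.0, "c": -2.0, "d": 0.1, "e": -3.0}
top_score, top_comp = anneal(neighbors, z)
print(sorted(top_comp), round(top_score, 3))
```

Tracking the best component seen so far, not just the final state, keeps a good subnetwork from being lost to a late downhill move.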
Step 3: Modeling (3) • Show example: presentation data\0 • [Figure: corrected score vs. iteration number]
Step 4: Predictions (1) • Simulated annealing predictions for glucose data: nodes3d-win32
Step 4: Predictions (2) • Predictions are in agreement with the literature
Acknowledgements • Dr. Fritz Roth & Dr. Gabriel Berriz • Dr. Jocelyn Spragg & Deborah Milstein • NSF REU Program • Everyone else
References • Papers • [1] Ahloulay M, Schmitt F, Dehaux M, Bankir L. 1999. Vasopressin and urinary concentrating activity in diabetes mellitus. Diabetes Metab. 25(3):213-22. • [2] Pappa KI, Vlachos G, Theodora M, Roubelaki M, Angelidou K, Antsaklis A. 2007. Intermediate metabolism in association with the amino acid profile during the third trimester of normal pregnancy and diet-controlled gestational diabetes. Am J Obstet Gynecol. 196(1):65.e1-5. • [3] Ma XR, Zhou CF, Wang SQ, Wang WQ, Liu YX, Wang SX, Wang FF, Zhang JH, Li YY. 2007. Effects of Ganoderma lucidum spores on mitochondrial calcium ion and cytochrome C in epididymal cells of type 2 diabetes rats. Zhonghua Nan Ke Xue. 13(5):400-2. • [4] Ideker T, Ozier O, Schwikowski B, Siegel AF. 2002. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics. 18:S233-S240. • [5] Whitlock M. 2005. Combining probability from independent tests: the weighted Z-method is superior to Fisher’s approach. J Evol Biol. 18:1368-1373. • [6] Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, Katayama T, Araki M, Hirakawa M. 2006. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 34:D354-357. • [7] Sabatine M, et al. 2005. Metabolomic identification of novel biomarkers of myocardial ischemia. Circulation. 112:3868-3875. • Books • [8] Palsson BØ. Systems Biology: Properties of Reconstructed Networks. • Websites • [9] http://www.aber.ac.uk/compsci/Research/bio/robotsci/tech/ml/models.shtml • [10] http://search.cpan.org/~lbrocard/GraphViz-2.02/lib/GraphViz.pm • [11] http://brainmaps.org/index.php?p=desktop-apps-nodes3d
Step 3: Modeling (2) • Corrected z-score (background distributions) • [Figures: average z_agg vs. subset size; std. dev. of z_agg vs. subset size]
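The correction can be sketched as in Ideker et al. (2002): for a subnetwork of size k, estimate the mean and standard deviation of the aggregate score over randomly drawn node sets of the same size, then report (z_agg − mean) / std. dev. The sample count, function names, and the synthetic z-scores are assumptions for illustration.

```python
import math
import random
from statistics import mean, stdev


def aggregate(z_scores):
    """Aggregate z-score of a node set: sum(z_i) / sqrt(k)."""
    return sum(z_scores) / math.sqrt(len(z_scores))


def background(all_z, k, n_samples=1000, seed=0):
    """Estimate mean and std. dev. of z_agg over random node sets of size k."""
    rng = random.Random(seed)
    samples = [aggregate(rng.sample(all_z, k)) for _ in range(n_samples)]
    return mean(samples), stdev(samples)


def corrected_score(subset_z, all_z, n_samples=1000, seed=0):
    """Calibrate a subnetwork score against the size-matched background."""
    mu, sigma = background(all_z, len(subset_z), n_samples, seed)
    return (aggregate(subset_z) - mu) / sigma


# Synthetic z-scores for 50 metabolites, evenly spread over [-2.45, 2.45].
all_z = [((i * 37) % 50 - 24.5) / 10 for i in range(50)]
top3 = sorted(all_z)[-3:]
print(corrected_score(top3, all_z))
```

Because the background depends only on k, the mean and standard deviation can be computed once per subset size and cached, instead of being resampled for every candidate subnetwork.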
Step 4: Analyze Results • Exp #3: Does a biological signal exist? (glucose data) • Compare high scores of scrambled vs. non-scrambled data • This is a very low p-value • We reject the null hypothesis • [Figure panels: Scrambled, Non-scrambled]
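One way to run such a scrambled-versus-real comparison is an empirical permutation test, sketched below: shuffle the z-scores across the nodes, rescore, and count how often the scrambled score reaches the observed one. The slides do not say which test produced the reported p-value, so this procedure is an assumption. The function `score_fn` is a hypothetical stand-in for whatever returns the high score of a search run, and the data are synthetic.

```python
import random


def empirical_p(observed, score_fn, z, n_scrambles=1000, seed=0):
    """Fraction of scrambled data sets scoring at least `observed`."""
    rng = random.Random(seed)
    names = list(z)
    values = [z[name] for name in names]
    hits = 0
    for _ in range(n_scrambles):
        rng.shuffle(values)  # reassign z-scores to nodes at random
        scrambled = dict(zip(names, values))
        if score_fn(scrambled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_scrambles + 1)


# Synthetic example: ten nodes, with the two largest z-scores on "m0" and "m1".
z = {f"m{i}": float(9 - i) for i in range(10)}


def score_fn(scores):
    # Hypothetical score: sum over one fixed adjacent pair of metabolites.
    return scores["m0"] + scores["m1"]


p = empirical_p(score_fn(z), score_fn, z)
print(p)
```

The smallest p-value this estimate can report is 1 / (n_scrambles + 1), so a very low p-value needs a correspondingly large number of scrambles.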