
Integration of Expression Data and Genotype Data: Application of Chronic Fatigue Syndrome Data

Explore the integration of gene expression data and genotype data for identifying causal genes of Chronic Fatigue Syndrome. Use likelihood-based model selection and biological interpretation for insights.


Presentation Transcript


  1. Integration of Expression Data and Genotype Data: Application of Chronic Fatigue Syndrome Data • EunJee Lee¹, Seoae Cho¹, Taesung Park² • ¹ Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea • ² Department of Statistics, Seoul National University, Seoul, Korea

  2. Contents • INTRODUCTION • Needs for Integration of Data • METHOD • Integration of Gene expression data and Genotype data • Likelihood based model selection • Test for identifying causal genes of disease • Biological Interpretation • RESULT • Application to Chronic Fatigue Syndrome data • SUMMARY AND CONCLUSION

  3. Introduction

  4. Introduction • [Diagram: the central dogma, with the data type and the new technology for each level] • DNA: genotype data (SNP polymorphism), from SNP data analysis • mRNA: gene expression data, from DNA microarray analysis • Protein: protein expression data • Phenotype: disease

  5. Introduction • Analysis of the association between gene expression data and disease in the Chronic Fatigue Syndrome data • One-way ANOVA, CyberT, SAM test • Result • No significant result • Limitation • Chronic Fatigue Syndrome is a complex disease • One type of data may represent only partial information about a disease • Combining both types of data would therefore be useful

  6. Introduction • [Diagram: DNA, mRNA, Protein, Phenotype] • Integration of expression and genotype data • Questions about the causality of gene expression levels • Need to identify the causal relationships

  7. Methods

  8. Integration of Expression and Genotype Data • STEP 1: Causality model selection • Causal Model • Reactive Model • Independent Model • STEP 2: Test • Logistic regression (causal model) • Two-way ANOVA (reactive model) • ANOVA and logistic regression (independent model) • STEP 3: Biological interpretation • Gene Ontology enrichment • Pathway enrichment

  9. STEP 1: Model Selection • Models for causality (Schadt et al. 2005) • Causal Model: SNP → mRNA → Disease • Reactive Model: SNP → Disease → mRNA • Independent Model: SNP → mRNA and SNP → Disease

  10. STEP 1: Model Selection • Likelihood-based Causality Model Selection (LCMS) (Schadt et al. 2005) • A likelihood-based test that uses conditional correlation measures to determine which relationship among traits is best supported by the data. • The likelihood associated with each model is constructed and maximized with respect to the model parameters, and the model with the smallest AIC value is identified as the one best supported by the data.
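
The selection rule on this slide can be sketched in a few lines. This is a minimal illustration, assuming each candidate model has already been fitted and that its maximized log-likelihood and parameter count are known. The function names and all numbers are hypothetical; they are not taken from the presentation or from Schadt et al.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L_hat)."""
    return 2 * n_params - 2 * log_likelihood


def select_model(fits: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the model with the smallest AIC.

    `fits` maps a model name to (maximized log-likelihood, parameter count).
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Made-up values for illustration only; not results from the CFS data.
example_fits = {
    "causal": (-120.4, 5),
    "reactive": (-118.9, 5),
    "independent": (-117.2, 6),
}
best = select_model(example_fits)
print(best)
```

In an analysis of this kind the rule would be applied once per SNP and gene pair, so each pair receives its own best-supported model.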

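  11. STEP 1: Model Selection • Causal Model • Notation (symbols lost in extraction): SNP genotype, mRNA level, disease • Joint probability • Likelihood [formulas lost in extraction]

  12. STEP 1: Model Selection • Reactive Model • Notation (symbols lost in extraction): SNP genotype, mRNA level, disease • Joint probability • Likelihood [formulas lost in extraction]

  13. STEP 1: Model Selection • Independent Model • Notation (symbols lost in extraction): SNP genotype, mRNA level, disease • Joint probability • Likelihood [formulas lost in extraction]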
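
The formulas for slides 11 to 13 did not survive in the transcript. As a hedged sketch only, the three graph structures suggest the factorizations below. The symbols G (SNP genotype), E (mRNA level), and D (disease) are assumed notation, and the independent model is written in its simplest conditional-independence form, which may be less general than the parameterization in Schadt et al. (2005).

```latex
\begin{align*}
\text{Causal:}      \quad & P(G, E, D) = P(G)\, P(E \mid G)\, P(D \mid E) \\
\text{Reactive:}    \quad & P(G, E, D) = P(G)\, P(D \mid G)\, P(E \mid D) \\
\text{Independent:} \quad & P(G, E, D) = P(G)\, P(E \mid G)\, P(D \mid G) \\[4pt]
\text{Likelihood:}  \quad & L(\theta) = \prod_{i=1}^{n} P(G_i, E_i, D_i \mid \theta),
\qquad \mathrm{AIC} = 2k - 2 \ln L(\hat{\theta})
\end{align*}
```

Here n is the number of subjects, k is the number of free parameters in the model, and \hat{\theta} is the maximum-likelihood estimate.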

  14. STEP 2: Identify causal genes for a disease • [Diagram: SNP, mRNA, Disease] • Causal Model • Logistic regression • Model terms (symbols lost in extraction): the probability of getting the disease; the genotype of one SNP; the gene expression value from the DNA microarray; the interaction effect between the SNP and gene expression
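
The model equation is missing from the transcript. A logistic regression with the four terms listed on the slide would commonly be written as follows; the symbols are assumed for illustration and are not the authors' own notation.

```latex
\[
\log\frac{p_i}{1 - p_i}
  = \beta_0 + \beta_1 G_i + \beta_2 E_i + \beta_3 \,(G_i \times E_i)
\]
```

Here p_i is the probability that subject i has the disease, G_i is the coded genotype of one SNP, E_i is the expression value of one gene, and \beta_3 captures the SNP by expression interaction.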

  15. STEP 2: Identify causal genes for a disease • [Diagram: SNP, mRNA, Disease] • Reactive Model • Two-way ANOVA • Model terms (symbols lost in extraction): the expression level; the effect of the SNP genotype; the effect of the disease group; the interaction effect
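
The equation for this slide is also missing. A standard two-way ANOVA with the terms listed on the slide can be sketched as below; the symbols are assumed, not recovered from the original.

```latex
\[
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad \varepsilon_{ijk} \sim N(0, \sigma^2)
\]
```

Here y_{ijk} is the expression level of subject k with SNP genotype i in disease group j, \alpha_i is the genotype effect, \beta_j is the disease-group effect, and (\alpha\beta)_{ij} is their interaction.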

  16. STEP 2: Identify causal genes for a disease • [Diagram: SNP, mRNA, Disease] • Independent Model • Each test uses one type of data • Logistic regression • Model terms (symbols lost in extraction): the SNP genotype; the frequency of getting the disease • Detects SNPs in linkage with disease loci • One-way ANOVA • Model terms (symbols lost in extraction): the gene expression level; the SNP genotype • Identifies SNPs regulating gene expression levels
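
To make the one-way ANOVA on this slide concrete, the sketch below computes the F statistic for one gene's expression values grouped by SNP genotype. It uses only the standard library and stops at the F statistic, since turning it into a p-value needs the F distribution. The function name and the toy values are hypothetical.

```python
from typing import Sequence


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """F statistic for a one-way ANOVA over the given groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Toy expression values for genotypes AA, AG, GG (made up for illustration).
expression_by_genotype = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(expression_by_genotype)
print(f_stat)
```

A large F statistic indicates that mean expression differs across genotypes, which is the pattern expected for a SNP that regulates the gene.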

  17. STEP 3 Biological Interpretation • Enrichment Study of Gene Ontology • Enrichment Study of Pathway
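
Enrichment studies of this kind are commonly based on a hypergeometric tail probability: given a list of selected genes, how surprising is its overlap with one GO term or pathway? The sketch below shows that calculation under this assumption. Whether the authors used exactly this test is not stated in the slides, and the function name and counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe: int, n_in_term: int,
                       n_selected: int, n_overlap: int) -> float:
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_universe genes, of which n_in_term are annotated to the term."""
    upper = min(n_in_term, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_universe - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy counts: 10 genes overall, 3 in the term, 4 selected, 2 of those in the term.
p = enrichment_p_value(n_universe=10, n_in_term=3, n_selected=4, n_overlap=2)
print(p)
```

Because many terms and pathways are tested at once, the resulting p-values would normally go through a multiple-testing correction before interpretation.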

  18. Application to Chronic Fatigue Syndrome Data

  19. Application to Chronic Fatigue Syndrome Data • Expression data for the Chronic Fatigue Syndrome data set • mRNA levels in mononuclear cells • Expression levels of 20,160 genes

  20. Application to Chronic Fatigue Syndrome Data • Pre-processing of gene expression data • Filtering • Quantile normalization • Significance level • FDR control using the Benjamini and Hochberg method (Benjamini, Y. and Hochberg, Y. 1995) for multiple testing correction • 5% FDR
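
Two of the steps on this slide are easy to show in code: quantile normalization across arrays and the Benjamini-Hochberg adjustment at a 5% FDR. The sketch below is a plain standard-library illustration with hypothetical function names and toy values. It breaks ties by input order, which is a simplification, and it is not the authors' actual pipeline.

```python
from typing import List, Sequence


def quantile_normalize(columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Give every array (column) the same distribution: the value at each
    rank is replaced by the mean of the values at that rank across arrays."""
    n_genes = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[r] for col in sorted_cols) / len(columns)
                  for r in range(n_genes)]
    result = []
    for col in columns:
        order = sorted(range(n_genes), key=lambda i: col[i])
        normalized = [0.0] * n_genes
        for rank, gene in enumerate(order):
            normalized[gene] = rank_means[rank]
        result.append(normalized)
    return result


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy data for illustration only.
arrays = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q <= 0.05 for q in q_values]
print(arrays, q_values, significant)
```

With these toy p-values only the first test passes the 5% FDR threshold, although three of the four raw p-values are below 0.05.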

  21. Application to Chronic Fatigue Syndrome Data • CFS vs. Nonfatigued Groups • STEP 1. Model Selection • STEP 2. Test for Identifying Causal Genes • STEP 3. Biological Interpretation • CFS-MDDm vs. Nonfatigued Groups • STEP 1. Model Selection • STEP 2. Test for Identifying Causal Genes • STEP 3. Biological Interpretation

  22. Application to CFS and Nonfatigued Groups • STEP 1. Model Selection • [Table: results for Logistic Regression, Two-way ANOVA, and Independent Test; values lost in extraction]

  23. Application to CFS and Nonfatigued Groups • STEP 2. Test for identifying key driver genes • rs258750 in NR3C1 gene • The independent model gives the significant results.

  24. Application to CFS and Nonfatigued Groups • [Diagram: SNP, mRNA, Disease] • Gene expression level and genotype variation • rs258750 regulates the expression level of 166 genes. • [Figure panels: rs258750 AA; rs258750 AG/GG]

  25. Application to CFS and Nonfatigued Groups • Evidence of neuroendocrine regulation of immunity (Webster and Tonelli, Annu. Rev. Immunol. 2002)

  26. Application to CFS and Nonfatigued Groups • STEP 2. Test for identifying key driver genes • rs2918419 in NR3C1 gene • The independent model is selected in most cases

  27. Application to CFS and Nonfatigued Groups • [Diagram: SNP, mRNA, Disease, with one relationship marked "?"] • SNPs other than rs258750 in NR3C1 gene • Gene expression level and genotype variation: no significant results • Genotype variation and disease: six SNPs in the NR3C1 gene are significant

  28. Application to CFS and Nonfatigued Groups • [Diagram: Chr 5, NR3C1: rs258750, rs6188, rs852977, rs860458, rs2918419, rs1866388; label: CFS] • Glucocorticoid receptor • Regulates glucocorticoid levels in blood • The level of glucocorticoid in the hypothalamic-pituitary-adrenal (HPA) axis has a significant effect on fatigue (Chaudhuri and Behan, The Lancet, 2004)

  29. Application to Chronic Fatigue Syndrome Data • CFS vs. Nonfatigued Groups • STEP 1. Model Selection • STEP 2. Test for Identifying Causal Genes • STEP 3. Biological Interpretation • CFS-MDDm vs. Nonfatigued Groups • STEP 1. Model Selection • STEP 2. Test for Identifying Causal Genes • STEP 3. Biological Interpretation

  30. Application to CFS-MDDm and Nonfatigued Groups • STEP 1. Model Selection • [Table: results for ANOVA, Logistic Regression, Two-way ANOVA, and Logistic Regression; values lost in extraction]

  31. Application to CFS-MDDm and Nonfatigued Groups • STEP 2. Test for identifying key driver genes • rs933271 in COMT gene • Independent Model

  32. Application to CFS-MDDm and Nonfatigued Groups • [Diagram: SNP, mRNA, Disease, with one relationship marked "?"] • rs933271 and rs5993882 in COMT gene • Independent Model • Genotype variation and gene expression level • No significant result • Genotype variation and disease

  33. Application to CFS-MDDm and Nonfatigued Groups • STEP 2. Test for identifying key driver genes • rs6188 in NR3C1 gene • Genes under the reactive model have many significant results • 234

  34. Application to CFS-MDDm and Nonfatigued Groups • STEP 3. Biological Interpretation • Gene Ontology enrichment study of the test results (GOstats)

  35. Application to CFS-MDDm and Nonfatigued Groups • Pathway enrichment study (from BioCarta) • Agrin in Postsynaptic Differentiation

  36. Application to CFS-MDDm and Nonfatigued Groups • Eicosanoid metabolism • Eicosapentaenoic acid-rich essential fatty acid supplementation in chronic fatigue syndrome associated with symptom remission and structural brain changes. Int J Clin Pract. 2004 Mar;58(3):297-9. • The use of eicosapentaenoic acid in the treatment of chronic fatigue syndrome. Prostaglandins Leukot Essent Fatty Acids. 2004 Apr;70(4):399-401. Review. • Determination of fatty acid levels in erythrocyte membranes of patients with chronic fatigue syndrome. Nutr Neurosci. 2003 Dec;6(6):389-92. • Eicosanoids and essential fatty acid modulation in chronic disease and the chronic fatigue syndrome. Med Hypotheses. 1994 Jul;43(1):31-42. Review.

  37. Application to CFS-MDDm and Nonfatigued Groups • Actin regulation • Dysregulated expression of tumor necrosis factor in chronic fatigue syndrome: interrelations with cellular sources and patterns of soluble immune mediator expression. Clin Infect Dis. 1994 Jan;18 Suppl 1:S147-53.

  38. Application to CFS-MDDm and Nonfatigued Groups • Other significant pathways (in BioCarta) • Biosynthesis of cysteine in mammals • Biosynthesis of threonine and methionine • Inactivation of Gsk3 by AKT causes accumulation of β-catenin in alveolar macrophages • Basic mechanisms of SUMOylation • Catabolic pathways for methionine, isoleucine, threonine and valine • ALK in cardiac myocytes • Overview of telomerase RNA component gene hTerc transcriptional regulation • Biosynthesis of neurotransmitters

  39. Application to CFS-MDDm and Nonfatigued Groups • [Diagram: Chr 22, COMT: rs933271, rs5993882; Chr 5, NR3C1: rs6188, rs852977; labels: Nonfatigued, CFS-MDDm]

  40. Summary and Conclusion

  41. Summary and Conclusion • CFS and CFS-MDDm each show causal relationships, with different pathways provoking the disease

  42. CFS and Nonfatigued Groups • [Diagram: Chr 5, NR3C1: rs258750, rs6188, rs852977, rs860458, rs2918419, rs1866388; label: CFS]

  43. CFS-MDDm and Nonfatigued Groups • [Diagram: Chr 22, COMT: rs933271, rs5993882; Chr 5, NR3C1: rs6188, rs852977; labels: Nonfatigued, CFS-MDDm]

  44. Summary and Conclusion • Advantage • Uses the causal relationships among gene expression levels, genetic variation, and disease to identify causal genes of a disease • Limitation • Complicated causal models are not yet handled • Future Analysis • More complicated causal models, such as feedback models • Develop more sophisticated methods for other possible models of causality • Extend the integration method to include protein data

  45. Reference • Schadt, E.E. et al. Integrating Genotype and Gene Expression Data to Identify Key Drivers of Complex Traits. Nat. Genet. 37, 2005. • Adler, R.H. Chronic fatigue syndrome (CFS). Swiss Med. Wkly. 2004;134:268-276. • Tusher, V., Tibshirani, R., Chu, G. Significance analysis of microarrays applied to the ionizing radiation response. PNAS 2001;98:5116-5121. • Baldi, P., Long, A.D. A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001;17:509-519. • Webster, J.I., Tonelli, L. Neuroendocrine regulation of immunity. Annu. Rev. Immunol. 2002;20:125-63.

  46. Reference • Kandel, Schwartz and Jessell. Principles of Neural Science, fourth edition. • http://gostat.wehi.edu.au/ • http://snubi.org • http://www.biocarta.com/ • Benjamini, Y. and Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society, Series B, 57(1), 289-300. • Chaudhuri, A., Behan, P.O. Fatigue in neurological disorders. The Lancet 2004;363:978-988.

  47. Thank you
