
Identifying Causal Genes and Dysregulated Pathways in Complex Diseases

Yoo-Ah Kim, NIH / NLM / NCBI. Identifying Causal Genes and Dysregulated Pathways in Complex Diseases. Nov. 6th, 2010. Complex diseases are associated with the effects of multiple genes, as opposed to single-gene diseases.


Presentation Transcript


  1. Identifying Causal Genes and Dysregulated Pathways in Complex Diseases • Yoo-Ah Kim, NIH / NLM / NCBI • Nov. 6th, 2010

  2. Complex Diseases • Associated with the effects of multiple genes, as opposed to single-gene diseases • The combination of genomic alterations may vary strongly among patients, yet the alterations often dysregulate the same components and thus lead to the same disease phenotype • Difficult to study and treat • Examples: cancer, heart disease, diabetes

  3. Copy Number Variations (CNVs) • Two copies of each gene are generally assumed to be present in a genome • Genomic regions may be deleted or duplicated, causing CNVs • Some CNVs are associated with susceptibility or resistance to diseases such as cancer [Figure: copy number variations in 158 glioblastoma patients]

  4. Identifying Genomic Causes in Complex Diseases • Identify genotypic causes in individual patients as well as dysregulated pathways • Systems biology approach • Genome-wide search • Graph-theoretic algorithms: circuit flow, set cover • Data: 158 glioblastoma multiforme patients

  5. Glioblastoma multiforme (GBM) • The most common and most aggressive type of primary brain tumor in humans

  6. Expression as Quantitative Trait • Genotype: copy number variations • Phenotype: gene expression

  7. eQTL (expression Quantitative Trait Loci) Analysis • We assume that the genetic variation is the cause and the expression change is the effect, but we do not know the molecular pathways behind the relation [Figure: putative causal gene/locus linked to putative target gene]

  8. Method Outline • A. Target gene selection: gene expression • B. eQTL: find associations between expression and copy number • C. Circuit flow algorithm: molecular interactions (TF-DNA, phosphorylation events, protein-protein), candidate causal genes • D. Causal gene selection: weighted multiset cover [Figure: panels A to D; A: target genes g1..gm by cases; B: tag loci s1..sn by cases; C: circuit with + and - terminals linking target gene gm and tag SNP sn; D: causal genes by cases]

  9. Target Gene Selection • Select a representative set of disease genes • Filter differentially expressed genes for each case • Multi-set cover [Figure: gene expression matrix, genes (Gene 1, Gene 2, Gene 3, ...) by samples (controls, disease cases)]
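The slide does not say how differential expression is decided for each case. As a minimal sketch, assuming a z-score of each case against the control samples, the per-case filter could look like the following. The function name, the input layout, and the cutoff of 2.0 are all hypothetical and not taken from the presentation.

```python
from statistics import mean, stdev


def differentially_expressed(controls, cases, z_cutoff=2.0):
    """Return {case_id: set of genes whose expression deviates from controls}.

    controls: {gene: [expression in each control sample]}
    cases:    {case_id: {gene: expression in that disease case}}
    z_cutoff: hypothetical threshold on |z|; not taken from the slides.
    """
    result = {}
    for case_id, profile in cases.items():
        selected = set()
        for gene, value in profile.items():
            ref = controls.get(gene, [])
            if len(ref) < 2:
                continue  # need at least two controls to estimate the spread
            sd = stdev(ref)
            if sd == 0:
                continue  # no variation in controls, z-score is undefined
            if abs((value - mean(ref)) / sd) >= z_cutoff:
                selected.add(gene)
        result[case_id] = selected
    return result
```

The per-case gene sets produced this way are the kind of input a multi-set cover step would then reduce to a representative set of target genes.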

  10. eQTL • Associations between the expression of target genes and copy number variations of genomic loci • Linear regression for every pair of tag locus and target gene [Figure: tag loci by cases and target genes by cases matrices]
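The pairwise regression on this slide can be sketched in a few lines. This is an illustration only: it fits a simple least-squares line of expression on copy number for every (tag locus, target gene) pair and keeps pairs whose absolute correlation passes a threshold. The threshold `min_abs_r` and the function names are assumptions; the slides do not state the significance criterion that was actually used.

```python
from math import sqrt


def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept; also returns Pearson r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0, my, 0.0  # a constant series carries no association
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)


def eqtl_associations(copy_number, expression, min_abs_r=0.8):
    """Test every (tag locus, target gene) pair across the same ordered cases.

    copy_number: {locus: [copy number per case]}
    expression:  {gene:  [expression per case]}
    min_abs_r:   hypothetical cutoff on |r|, standing in for a p-value test.
    """
    hits = []
    for locus, cn in copy_number.items():
        for gene, expr in expression.items():
            slope, _, r = linear_fit(cn, expr)
            if abs(r) >= min_abs_r:
                hits.append((locus, gene, slope, r))
    return hits
```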

  11. Finding Candidate Causal Genes [Figure: target genes and genotypic variations]

  12. Finding Candidate Causal Genes [Figure: genotypic variations map to candidate genes C1 to C5; their link to the target genes is unknown (?)]

  13. Finding Candidate Causal Genes • Interaction network: protein-protein interactions, phosphorylation events, transcription factor interactions [Figure: candidate genes C1 to C5 connected to target gene D through the interaction network]

  14. Finding Candidate Causal Genes • Current flow • Resistance(u, v) is set to be inversely proportional to (|corr(expr(u), expr(D))| + |corr(expr(v), expr(D))|) / 2, where D is the target gene [Figure: current flows from + to - through edge (u, v) of the interaction network between target gene D and candidate genes C1 to C5]
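The resistance rule on this slide translates directly into code. The sketch below assumes a proportionality constant of 1 and adds a small floor `eps` so that an edge whose endpoints are both uncorrelated with the target gets a very large but finite resistance; both choices are assumptions, as are the function names.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long series (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def resistance(expr, u, v, target, eps=1e-6):
    """Resistance of edge (u, v) for a given target gene.

    expr: {gene: [expression per case]}
    Inversely proportional to the mean absolute correlation of the two
    endpoints with the target gene (constant of proportionality assumed 1).
    """
    avg = (abs(pearson(expr[u], expr[target])) +
           abs(pearson(expr[v], expr[target]))) / 2
    return 1.0 / max(avg, eps)  # eps avoids division by zero
```

Edges whose endpoints track the target gene's expression closely get low resistance, so current preferentially flows along them.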

  15. Finding Candidate Causal Genes • Current flow • Compute the amount of current entering each causal gene by solving a system of linear equations [Figure: current flows from + to - through the interaction network between target gene D and candidate genes C1 to C5]
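One way to set up such a linear system is Kirchhoff's current law on the interaction network. The sketch below is an illustration under stated assumptions, not the presenter's implementation: the target gene is held at voltage 1, every candidate causal gene is grounded at voltage 0, and each edge carries a conductance (the reciprocal of its resistance). The function names and the input format are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: a node is cut off from source and sinks")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def current_into_sinks(conductance, source, sinks):
    """conductance: {(u, v): g} for undirected edges, with g = 1 / resistance.

    Returns {sink: current entering it} with the source at 1 V, sinks at 0 V.
    """
    adj = {}
    for (u, v), g in conductance.items():
        adj.setdefault(u, {})[v] = g
        adj.setdefault(v, {})[u] = g
    fixed = {source: 1.0, **{s: 0.0 for s in sinks}}
    free = [n for n in adj if n not in fixed]
    idx = {n: i for i, n in enumerate(free)}
    A = [[0.0] * len(free) for _ in free]
    b = [0.0] * len(free)
    for n in free:  # net current at every interior node must be zero
        i = idx[n]
        for m, g in adj[n].items():
            A[i][i] += g
            if m in fixed:
                b[i] += g * fixed[m]
            else:
                A[i][idx[m]] -= g
    volt = dict(fixed)
    if free:
        volt.update(zip(free, solve_linear(A, b)))
    return {s: sum(g * (volt[m] - volt[s]) for m, g in adj.get(s, {}).items())
            for s in sinks}
```

For a toy network with edges D-a (conductance 1), a-C1 (conductance 2) and a-C2 (conductance 1), node a settles at 0.25 V, so C1 receives a current of 0.5 and C2 receives 0.25.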

  16. Method Outline (repeat of slide 8)

  17. Final Causal Gene Selection • A putative causal gene explains a disease case if • its corresponding tag locus has a copy number alteration, and • its affected target genes (i.e., genes sending a significant amount of current to the causal gene) are differentially expressed in the disease case [Figure: causal genes by cases]

  18. Final Causal Gene Selection (repeat of slide 17)

  19. Final Causal Gene Selection (repeat of slide 17; the figure adds the label WEIGHT)
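The "explains" rule can be written as a small predicate over cases. The slides do not say whether all affected target genes or only some must be differentially expressed, so the sketch below exposes that as a hypothetical parameter `min_targets` (default 1). The function name and input layout are likewise assumptions.

```python
def explained_cases(tag_locus, affected_targets, cn_altered, diff_expr,
                    min_targets=1):
    """Return the set of disease cases explained by one putative causal gene.

    tag_locus:        tag locus corresponding to the causal gene
    affected_targets: set of target genes sending significant current to it
    cn_altered:       {case_id: set of loci with a copy number alteration}
    diff_expr:        {case_id: set of differentially expressed genes}
    min_targets:      hypothetical minimum number of affected targets that
                      must be differentially expressed in the case
    """
    explained = set()
    for case_id, altered_loci in cn_altered.items():
        if tag_locus not in altered_loci:
            continue  # condition 1 fails: no copy number alteration here
        hits = affected_targets & diff_expr.get(case_id, set())
        if len(hits) >= min_targets:  # condition 2: affected targets are DE
            explained.add(case_id)
    return explained
```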

  20. Final Causal Gene Selection • Find a smallest set of genes covering (almost) all cases at least k’ times ⇒ minimum weighted multi-set cover
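Minimum weighted multi-set cover is NP-hard, and the slides do not say how it was solved. A common heuristic is the greedy rule sketched below, which repeatedly picks the gene with the best ratio of newly satisfied coverage to cost. It is an approximation and is not guaranteed to return a minimum set. The per-gene `cost`, the parameter `min_fraction` (modelling "almost all cases") and the function name are assumptions.

```python
def greedy_multiset_cover(covers, cost, cases, k=1, min_fraction=1.0):
    """Greedy heuristic for weighted multi-set cover.

    covers:       {gene: set of cases the gene explains}
    cost:         {gene: positive cost of selecting the gene}
    cases:        list of all disease cases
    k:            each case should be covered by at least k selected genes
    min_fraction: stop once this fraction of cases is covered k times
    """
    if not cases:
        return []
    need = {c: k for c in cases}  # remaining coverage demand per case
    remaining = dict(covers)
    chosen = []

    def gain(gene):
        return sum(1 for c in remaining[gene] if need.get(c, 0) > 0)

    def fraction_done():
        return sum(1 for c in cases if need[c] == 0) / len(cases)

    while remaining and fraction_done() < min_fraction:
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(remaining), key=lambda g: gain(g) / cost[g])
        if gain(best) == 0:
            break  # no remaining gene helps any unsatisfied case
        for c in remaining.pop(best):
            if need.get(c, 0) > 0:
                need[c] -= 1
        chosen.append(best)
    return chosen
```

Each gene is selected at most once, so covering a case k times requires k distinct genes that explain it.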

  21. Dysregulated Pathways • Causal path between a target and a causal gene: a maximum current path [Figure: path from target gene D to candidate genes C1 to C5]
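The slide does not define "maximum current path" precisely. One plausible reading, used in the sketch below as an assumption, is the path whose weakest edge carries the most current (a widest-path search). Edge currents are taken as given, directed from higher to lower voltage; the function name and input format are hypothetical.

```python
import heapq


def max_current_path(edge_current, source, sink):
    """Find the path from source to sink that maximizes the bottleneck current.

    edge_current: {(u, v): positive current flowing from u to v}
    Returns (path as list of nodes, bottleneck current), or (None, 0.0)
    if the sink is unreachable.
    """
    out = {}
    for (u, v), i in edge_current.items():
        out.setdefault(u, []).append((v, i))
    best = {source: float("inf")}  # best bottleneck found so far per node
    prev = {}
    heap = [(-best[source], source)]  # max-heap via negated bottleneck
    while heap:
        neg_b, u = heapq.heappop(heap)
        b = -neg_b
        if b < best.get(u, 0.0):
            continue  # stale heap entry
        if u == sink:
            break
        for v, i in out.get(u, []):
            nb = min(b, i)
            if nb > best.get(v, 0.0):
                best[v] = nb
                prev[v] = u
                heapq.heappush(heap, (-nb, v))
    if sink not in best:
        return None, 0.0
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[sink]
```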

  22. Selected Causal Genes

  23. Results • 701 candidate causal genes from the circuit flow algorithm (step C) • 128 causal genes from set cover (step D)

  24. Causal Genes • The selected causal gene set includes many known cancer-implicated genes • Functional analysis using DAVID [Slide footer: BSOSC Review, November 2008]

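  25. PTEN as causal gene [Figure legend: fold change scale (-, 0, +); node types: TF, kinase, causal genes; edge types: TF-DNA, protein-protein]

  26. EGFR as causal and target gene [Figure: causal EGFR and target EGFR; legend: fold change scale (-, 0, +); node types: TF, kinase, causal genes; edge types: phosphorylation, TF-DNA, protein-protein]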

  27. Conclusion • A novel computational method to simultaneously identify causal genes and dysregulated pathways • Circuit flow algorithm • Multi-set cover • Augmenting eQTL evidence with interaction information resulted in a very powerful approach that uncovers potential causal genes as well as intermediate nodes on molecular pathways • Our method can be applied to any disease system where genetic variations play a fundamental causal role

  28. Acknowledgements • Teresa M. Przytycka • Stefan Wuchty • Other group members • Dong Yeon Cho • Yang Huang • Damian Wojtowicz • Jie Zheng

  29. Method Outline (repeat of slide 8)

  30. EGFR as causal and target gene • Causal paths [Figure: causal EGFR and target EGFR; legend: fold change scale (-, 0, +); node types: TF, kinase, causal genes; edge types: phosphorylation, TF-DNA, protein-protein]

  31. PTEN as causal gene • Causal paths [Figure legend: fold change scale (-, 0, +); node types: TF, kinase, causal genes; edge types: TF-DNA, protein-protein]

  32. Our Method • Integrate several types of data • Gene expression • Copy number variations • Molecular interactions

  33. Methods and Results • Method • Model the expression change of disease genes as a function of genomic alterations • Translate the propagation of information from a potential causal gene to a disease gene as the flow of electric current through a network of molecular interactions • Multi-set cover: select the most prominent genes • Results • Validated our approach by testing the enrichment of selected causal genes with known GBM/glioma-related genes [Figure: circuit with + and - terminals linking disease gene gm and tag SNP sn through causal genes]
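The slides do not name the statistical test used for the enrichment check. A standard choice for this kind of overlap question is the one-sided hypergeometric test, sketched below as an assumption. All argument names are hypothetical, and no values from the study are filled in.

```python
from math import comb


def enrichment_p_value(universe, known, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe: number of genes considered in total
    known:    number of those already known to be disease related
    selected: number of genes selected as causal
    overlap:  number of selected genes that are also known disease genes
    """
    upper = min(known, selected)
    tail = sum(comb(known, i) * comb(universe - known, selected - i)
               for i in range(overlap, upper + 1))
    return tail / comb(universe, selected)
```

A small p-value would indicate that the selected set contains more known disease genes than expected from drawing the same number of genes at random.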
