Network-based data-integration

Presentation Transcript


  1. Network-based data-integration. Kathleen Marchal

  2. Overview
     • Data collection
       1. Array-based
       2. Next-gen sequencing: RNA-Seq analysis, ChIP-seq, bulked segregant analysis
     • Network inference
       1. Sequence-based data analysis: MotifSuite, ModuleDigger, Crossed
       2. Network reconstruction: Lemone, Distiller, Comodo, Bayesian network reconstruction
     • Network-based data-integration
       1. Network-based analysis of unstructured gene lists
       2. Network-based gene prioritization
       3. Network-based eQTL analysis
       4. Network-based subtyping
     http://bioinformatics.psb.ugent.be/DBN/

  3. Network inference
     • Experimental interaction data: protein-protein, transcriptional, signaling, etc.
     • Functional data: gene expression (E. coli, S. Typhimurium, human), phylogenetic profiles (binary presence/absence vectors), domain interactions
     • Integrated network
     Cloots et al., Curr. Opin. Microbiol. 2011

  4. Network-based data-integration: using reconstructed networks to interpret in-house data
     • Pathfinding
     • Kernel-based strategies
     • Applications
       • Mode of action determination
       • Network-based eQTL analysis for trait selection
       • GWAS
       • Mechanistic insights

  5. Network-based similarity
     • Adapt based on the biological question
     • Figure panels: LOCUS; distances between nodes; NETWORK CLUSTERING; distances between causal genes and effects; PATHFINDING
     Verbeke et al. 2013; De Maeyer et al. 2013; Dougali et al., in prep
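
Slide 5 lists distances between nodes as one form of network-based similarity. The sketch below is a minimal illustration, assuming an unweighted, undirected network and hop count as the distance; the toy network and the function name `bfs_distances` are hypothetical and not taken from the slides.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Return hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:  # first visit gives the shortest hop count
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


# Hypothetical toy interaction network stored as adjacency sets.
toy_network = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

distances_from_a = bfs_distances(toy_network, "A")
print(distances_from_a)
```

A weighted network would need a different algorithm (for example Dijkstra's); hop count is only the simplest choice.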

  6. Network-based interpretation of gene lists (PheNetic)
     • Causes: KO strains
     • Effects: differentially expressed genes
     • Mechanism linking causes to effects: ?
     De Maeyer et al., MBS 2013

  7. Network-based interpretation of gene lists De Maeyer et al., MBS 2013

  8. Network-based gene prioritization
     • Source nodes: candidate genes in the QTL identified by pooled genotyping (BSA)
     • Target nodes: genes differentially expressed with respect to the inferior parent, from pooled expression analysis (RNA-seq)
     Pullido et al., in prep
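
The source/target setup on slide 8 can be illustrated by ranking candidate genes on their network proximity to the differentially expressed genes. The slides do not say which proximity measure is used. The sketch below substitutes a random walk with restart, a common generic choice, and every gene name, edge and parameter value in it is a hypothetical assumption.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (a, b) edges."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def random_walk_with_restart(adjacency, seeds, restart=0.3, iterations=200):
    """Approximate the stationary visiting probability of a walker that
    jumps back to the seed nodes with probability `restart` at each step."""
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in adjacency}
    prob = dict(start)
    for _ in range(iterations):
        updated = {n: restart * start[n] for n in adjacency}
        for node, neighbors in adjacency.items():
            # Spread the remaining probability evenly over the neighbors.
            share = (1.0 - restart) * prob[node] / len(neighbors)
            for neighbor in neighbors:
                updated[neighbor] += share
        prob = updated
    return prob


# Hypothetical network: G1..G3 are QTL candidates, T1/T2 are DE genes.
edges = [("G1", "H"), ("H", "T1"), ("H", "T2"),
         ("G2", "M"), ("M", "N"), ("N", "T2"), ("G3", "G2")]
network = build_adjacency(edges)

scores = random_walk_with_restart(network, seeds={"T1", "T2"})
candidates = ["G3", "G1", "G2"]
ranked = sorted(candidates, key=lambda gene: scores[gene], reverse=True)
print(ranked)
```

Candidates that sit close to many target genes collect more probability mass and rank higher; a candidate far from all targets ranks last.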

  9. Network-based eQTL analysis: eQTL association mapping in sexually reproducing organisms
     • Data: genotyped individuals (genomic loci) and expression profiles of the individuals (genes)
     • Goal:
       • Exploit natural variation amongst individuals
       • Use the molecular phenotype to identify the MOA of the ‘observed phenotype’
     • Classical association analysis does not work in clonal species!
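
For the classical setting on slide 9, association between one locus and one gene can be sketched as a correlation between genotype and expression across individuals. This is a generic illustration with made-up numbers, assuming a biallelic locus coded 0/1; it is not the analysis used in the talk.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for six individuals.
genotype_at_locus = [0, 0, 0, 1, 1, 1]            # allele carried at one locus
expression_gene_a = [1.0, 1.2, 0.9, 3.1, 2.8, 3.0]  # tracks the genotype
expression_gene_b = [2.0, 1.0, 3.0, 2.0, 1.0, 3.0]  # unrelated to the genotype

r_gene_a = pearson(genotype_at_locus, expression_gene_a)
r_gene_b = pearson(genotype_at_locus, expression_gene_b)
print(round(r_gene_a, 3), round(r_gene_b, 3))
```

A real analysis would repeat this over all locus-gene pairs and correct for multiple testing, and it relies on recombination shuffling loci between individuals, which is why it breaks down in clonal systems.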

  10. Network-based eQTL analysis: clonal systems
      • Evolution experiment
      • Asexual reproduction (quasi clonal)
      • Bacteria, cancer, eukaryotic pathogens

  11. Network-based eQTL analysis: clonal systems. eQTL analysis to study clonal evolution
      • Experiment 1: unevolved strain, evolved community 1, winning clone 1; pooled sequencing; eQTL analysis
      • Experiment 2: unevolved strain, evolved community 2, winning clone 2; pooled sequencing; eQTL analysis

  12. Network-based eQTL analysis
      • 3 evolved communities: PATIENT 1, PATIENT 2, PATIENT 3
      • Goal: identify causal mutations underlying the phenotype (distinguishing driver from passenger mutations)
      • Figure legend: differentially expressed gene; mutated gene
      • A different eQTL analysis is essential

  13. Network-based eQTL analysis: genotyped individuals (genomic loci), physical interaction network, expression profiles of individuals (genes)

  14. Network-based eQTL analysis: how to best integrate the data?
      1. Distinguish driver from passenger mutations in the genomic space
      2. Identify mode of action by linking the drivers to the expression phenotype
      3. Subtyping: find groups of strains with different MOA
      Figure labels: genotype space; expression phenotype

  15. Network-based eQTL analysis • Distinguish driver from passenger mutations • Clustering in the genomic space • HYPOTHESIS : • Functional relatedness in the network space • Maximal coverage in the patient space => Problem exacerbated in absence of known ‘subtypes’

  16. Network-based eQTL analysis 2. Identify mode of action by linking the drivers to the expression phenotype

  17. Network-based eQTL analysis 3. Subtyping: find the groups of patients for which the clustering in the genomic space best correlates with the expression behavior. Subtyping is based on differences in the association between genotype and expression phenotype, which does not necessarily correlate with a clinical difference.

  18. The all in one solution… [Figure: network with node types Patient (p), Gene (g), CNV (q), Mutation (m)]

  19. The all in one solution… Represent all data in a matrix

  20. The all in one solution… Similarity matrix after kernel calculation
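
Slides 19 and 20 go from a data matrix to a similarity matrix through a kernel, without naming the kernel. As a stand-in, the sketch below uses a normalized linear (cosine) kernel on binary patient features. The patients, the feature labels and the choice of kernel are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical binary data matrix: each patient maps to the set of
# features (mutations, CNVs, expression changes) that are "on".
patient_features = {
    "patient_1": {"mut:GENE_A", "cnv:SEGMENT_1", "expr:GENE_X"},
    "patient_2": {"mut:GENE_A", "cnv:SEGMENT_1", "expr:GENE_Y"},
    "patient_3": {"mut:GENE_B"},
}


def cosine_kernel(features_a, features_b):
    """Normalized linear kernel for binary feature sets: shared features
    divided by the geometric mean of the two set sizes."""
    shared = len(features_a & features_b)
    return shared / sqrt(len(features_a) * len(features_b))


patients = sorted(patient_features)
similarity = {
    a: {b: cosine_kernel(patient_features[a], patient_features[b])
        for b in patients}
    for a in patients
}

for a in patients:
    print(a, [round(similarity[a][b], 2) for b in patients])
```

The resulting matrix is symmetric with ones on the diagonal. A graph-based kernel would additionally let similarity flow through network neighbors rather than only through identical features.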

  21. The all in one solution… 1. Distinguish driver from passenger mutations in the genomic space 2. Identify mode of action by linking the drivers to the expression phenotype 3. Subtyping: find groups of strains with different MOA Did it work?

  22. Proof-of-concept: breast cancer dataset
      • 463 patients
      • 1674 differentially expressed genes (normals vs. tumors) selected from 20,000 genes
      • CNVs differential between tumors and matched normals
        • Amplifications and deletions
        • CNV segments determined using GISTIC 2
        • 141 significant segments found
      • Somatic mutations (SNPs)
        • Filtered using biologically relevant criteria

  23. De novo subtyping • Recapitulates existing subtypes • PAM50 based on expression • Here subtyping based on ‘all’ information

  24. Driver mutations & MOA Which information types contributed to the patient classification? Interest score for each information type

  25. Driver mutations & MOA [Figure labels: P53, PIK3CA]

  26. Driver mutations & MOA [Figure labels: P53, PIK3CA; subtypes BASAL, HER2, LUMA, LUMB; amplified region containing ERBB2]

  27. Driver mutations & MOA [Figure labels: P53, PIK3CA; subtypes BASAL, HER2, LUMA, LUMB]

  28. Driver mutations & MOA [Figure labels: P53, PIK3CA]

  29. Driver mutations & MOA: CDH1, MAP2K4, MAP3K1, PTEN, MLL3, PIK3CA
      • Set of mutually exclusive mutations that cover most of the LUMA patients
      • Map on the p38-JNK pathway
      • Do not cover all patients!
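
The two properties named on slide 29, coverage and mutual exclusivity, can be quantified for a gene set as sketched below. The gene set is the one on the slide, but the patient mutation profiles are invented for illustration, and the two measures are simple assumed definitions rather than the scoring used in the talk.

```python
def coverage_and_exclusivity(mutations_by_patient, gene_set):
    """Coverage: fraction of patients with at least one mutated gene from
    the set. Exclusivity: fraction of covered patients with exactly one."""
    covered = 0
    exclusive = 0
    for mutated_genes in mutations_by_patient.values():
        hits = len(mutated_genes & gene_set)
        if hits >= 1:
            covered += 1
        if hits == 1:
            exclusive += 1
    coverage = covered / len(mutations_by_patient)
    exclusivity = exclusive / covered if covered else 0.0
    return coverage, exclusivity


# Gene set from the slide; patient profiles below are hypothetical.
gene_set = {"CDH1", "MAP2K4", "MAP3K1", "PTEN", "MLL3", "PIK3CA"}
mutations_by_patient = {
    "patient_1": {"CDH1"},
    "patient_2": {"MAP2K4"},
    "patient_3": {"MAP3K1", "OTHER_GENE"},
    "patient_4": {"PTEN", "PIK3CA"},  # two hits: covered but not exclusive
    "patient_5": {"OTHER_GENE"},      # no hit: not covered
}

coverage, exclusivity = coverage_and_exclusivity(mutations_by_patient, gene_set)
print(coverage, exclusivity)
```

High coverage together with high exclusivity is the pattern the slide describes; the uncovered patient mirrors the "do not cover all patients" remark.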

  30. P38-JNK1 pathway [Figure legend: in mutation group]

  31. Driver mutations & MOA: CDH1, MAP2K4, MAP3K1, PTEN, MLL3, PIK3CA. Somatic mutations in the p38-JNK pathway are mutually exclusive with CNVs in LUMA

  32. P38-JNK1 pathway [Figure legend: pink = in CNV group; in mutation group]

  33. Novel subtypes

  34. Acknowledgements
      • UGENT/KUL: Carolina Fierro, Jimmy Van den Eynden, Yan Wu, Aminael Sanchez, Dries De Maeyer, Sergio Pullido
      • Ghent University/INTEC: Jan Fostier, Piet De Meester, Lieven Verbeke
      • Ghent University/PSB: Evangelia Dougali, Yves Van de Peer
      • KUL/Computer science: Luc De Raedt, Siegfried Nijssens, Joris Renkens, Tan Levan
      http://bioinformatics.psb.ugent.be/DBN/