Network-based data-integration Kathleen Marchal
Overview
Data collection
1. Array based
2. Next-gen sequencing: RNA-Seq analysis, ChIP-seq, bulked segregant analysis
Network inference
1. Sequence-based data analysis: MotifSuite, ModuleDigger, Crossed
2. Network reconstruction: Lemone, Distiller, Comodo, Bayesian network reconstruction
Network-based data integration
1. Network-based analysis of unstructured gene lists
2. Network-based gene prioritization
3. Network-based eQTL analysis
4. Network-based subtyping
http://bioinformatics.psb.ugent.be/DBN/
Network inference
• Experimental interaction data: protein-protein, transcriptional, signaling, etc.
• Functional data: gene expression, phylogenetic profiles (binary presence/absence matrix), domain interactions
• Both data types are combined into an integrated network
Figure labels: E. coli, S. Typhimurium, human
Cloots et al., Curr. Opin. Microbiol. 2011
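Phylogenetic profiles appear on this slide as rows of 0s and 1s. A minimal sketch of how such profiles could be turned into functional links, assuming Jaccard similarity and a cut-off of 0.5 (neither is stated in the slides; the gene names and profile values are hypothetical):

```python
from itertools import combinations


def jaccard(profile_a, profile_b):
    """Jaccard similarity between two binary presence/absence profiles."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


def profile_edges(profiles, threshold=0.5):
    """Return (gene1, gene2, score) for every pair scoring at or above threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        score = jaccard(profiles[g1], profiles[g2])
        if score >= threshold:
            edges.append((g1, g2, score))
    return edges


# Hypothetical profiles: one row per gene, one column per genome.
profiles = {
    "geneA": [1, 0, 0, 1, 1],
    "geneB": [1, 0, 0, 1, 0],
    "geneC": [0, 1, 1, 0, 0],
}

edges = profile_edges(profiles)
for g1, g2, score in edges:
    print(g1, g2, round(score, 3))
```

Genes that co-occur across genomes end up connected; such edges would be one layer of the integrated network.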
Network-based data integration
Using reconstructed networks to interpret in-house data
• Strategies: pathfinding, kernel-based strategies
• Applications
  • Mode of action determination
  • Network-based eQTL analysis for trait selection
  • GWAS
  • Mechanistic insights
Network-based similarity
• Adapt the similarity measure to the biological question
• Network clustering: distances between nodes
• Pathfinding: distances between causal genes and effects
Figure label: LOCUS
Verbeke et al. 2013; De Maeyer et al. 2013; Dougali et al., in prep
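"Distances between nodes" can be made concrete with an all-pairs distance matrix. A minimal sketch, assuming an unweighted, undirected network and hop count as the distance (the slides do not specify the measure; the Floyd-Warshall algorithm and the toy network are illustrative choices):

```python
INF = float("inf")


def all_pairs_distances(nodes, edges):
    """Floyd-Warshall hop distances on an unweighted, undirected network."""
    dist = {u: {v: (0 if u == v else INF) for v in nodes} for u in nodes}
    for u, v in edges:
        dist[u][v] = 1
        dist[v][u] = 1
    for k in nodes:
        for i in nodes:
            for j in nodes:
                through_k = dist[i][k] + dist[k][j]
                if through_k < dist[i][j]:
                    dist[i][j] = through_k
    return dist


# Hypothetical toy network; node "e" is disconnected on purpose.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]

dist = all_pairs_distances(nodes, edges)
print(dist["a"]["d"])  # hops between a and d
print(dist["a"]["e"])  # inf: no path exists
```

A clustering method would take this matrix as input; pathfinding would restrict attention to the rows of causal genes and the columns of effect genes.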
Network-based interpretation of gene lists: Phenetic
• Causes: KO strains
• Effects: differentially expressed genes
• Mechanism linking causes to effects: ?
De Maeyer et al., MBS 2013
Network-based interpretation of gene lists De Maeyer et al., MBS 2013
Network-based gene prioritization
• Source nodes: genes in the QTL identified by pooled genotyping (BSA); which gene is causal?
• Target nodes: genes differentially expressed with respect to the inferior parent, from pooled expression analysis (RNA-seq)
Pullido et al., in prep
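The source/target set-up can be sketched as a ranking problem: score each candidate gene in the QTL by how close it lies to the differentially expressed genes in the network. This is a minimal sketch, not the method of Pullido et al.; the score (sum of 1/(1 + hops) over reachable targets), the gene names, and the network are all assumptions:

```python
from collections import deque


def hops_from(adjacency, start):
    """Breadth-first search: hop count from start to every reachable node."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


def prioritize(adjacency, candidates, targets):
    """Rank candidate (source) genes by proximity to target genes."""
    scores = {}
    for gene in candidates:
        hops = hops_from(adjacency, gene)
        # Closer targets contribute more; unreachable targets contribute nothing.
        scores[gene] = sum(1.0 / (1 + hops[t]) for t in targets if t in hops)
    ranking = sorted(candidates, key=lambda g: (-scores[g], g))
    return ranking, scores


# Hypothetical network and gene lists.
edge_list = [
    ("qtl_gene1", "t1"), ("qtl_gene1", "t2"),
    ("qtl_gene2", "m"), ("m", "t1"),
    ("qtl_gene3", "z"),
]
adjacency = {}
for u, v in edge_list:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

candidates = ["qtl_gene1", "qtl_gene2", "qtl_gene3"]  # source nodes
targets = ["t1", "t2"]                                # target nodes

ranking, scores = prioritize(adjacency, candidates, targets)
print(ranking)
```

In this toy case the candidate directly connected to both targets ranks first and the candidate with no path to any target ranks last.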
Network-based eQTL analysis
eQTL association mapping in sexually reproducing organisms
• Data: genotyped individuals (genomic loci) and expression profiles of individuals (genes)
• Goal:
  • Exploit natural variation among individuals
  • Use the molecular phenotype to identify the mode of action (MOA) of the observed phenotype
• Classical association analysis does not work in clonal species!
Network-based eQTL analysis: clonal systems
• Evolution experiment
• Asexual reproduction (quasi-clonal): bacteria, cancer, eukaryotic pathogens
Network-based eQTL analysis: clonal systems
Figure: unevolved strain, evolved communities 1 and 2, and winning clones 1 and 2; each evolved community is analyzed by pooled sequencing and eQTL analysis
eQTL analysis to study clonal evolution
Network-based eQTL analysis
• Patients 1, 2 and 3 correspond to 3 evolved communities
• Goal: identify causal mutations underlying the phenotype (distinguishing driver from passenger mutations)
• A different kind of eQTL analysis is essential
Figure legend: differentially expressed gene, mutated gene
Network-based eQTL analysis
Figure: genotyped individuals (genomic loci) and expression profiles of individuals (genes), connected through a physical interaction network
Network-based eQTL analysis
1. Distinguish driver from passenger mutations in the genomic space
2. Identify mode of action by linking the drivers to the expression phenotype
3. Subtyping: find groups of strains with different MOA
Figure labels: genotype space, expression phenotype
How to best integrate the data?
Network-based eQTL analysis
• Distinguish driver from passenger mutations
  • Clustering in the genomic space
• Hypothesis:
  • Functional relatedness in the network space
  • Maximal coverage in the patient space
=> Problem exacerbated in the absence of known ‘subtypes’
Network-based eQTL analysis 2. Identify mode of action by linking the drivers to the expression phenotype
Network-based eQTL analysis
3. Subtyping: find the groups of patients for which the clustering in the genomic space best correlates with the expression behavior
Subtyping is based on differences in the genotype-expression phenotype association, which does not necessarily correlate with a clinical difference
The all in one solution…
Figure: network with four node types: patient (p), gene (g), CNV (q), mutation (m)
The all in one solution… Represent all data in a matrix
The all in one solution… Similarity matrix after kernel calculation
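The slides do not say which kernel is used. As a stand-in, the sketch below uses random walk with restart, a common network-based similarity, on a small graph with the four node types of the deck (patient p, gene g, CNV q, mutation m). The graph, the restart probability, and the symmetrization step are assumptions for illustration only:

```python
def random_walk_with_restart(adjacency, seed, restart=0.3, iterations=100):
    """Visiting probabilities of a walker that keeps restarting at `seed`."""
    nodes = sorted(adjacency)
    prob = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iterations):
        nxt = {n: (restart if n == seed else 0.0) for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            # Spread the non-restarting mass evenly over the neighbours.
            share = (1 - restart) * prob[n] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        prob = nxt
    return prob


# Hypothetical graph: patients (p), mutations (m), CNV (q), genes (g).
edge_list = [
    ("p1", "m1"), ("m1", "g1"),
    ("p2", "m2"), ("m2", "g1"),
    ("p3", "q1"), ("q1", "g2"),
    ("g1", "g2"),
]
adjacency = {}
for u, v in edge_list:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

patients = ["p1", "p2", "p3"]
walks = {p: random_walk_with_restart(adjacency, p) for p in patients}

# Symmetrize so the patient-by-patient matrix can serve as a similarity matrix.
similarity = {
    a: {b: (walks[a][b] + walks[b][a]) / 2 for b in patients}
    for a in patients
}
print(similarity["p1"]["p2"] > similarity["p1"]["p3"])
```

Patients p1 and p2 carry different mutations in the same gene and come out more similar to each other than to p3, whose CNV affects a neighbouring gene.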
The all in one solution…
1. Distinguish driver from passenger mutations in the genomic space
2. Identify mode of action by linking the drivers to the expression phenotype
3. Subtyping: find groups of strains with different MOA
Did it work?
Proof-of-concept: breast cancer
Dataset
• 463 patients
• 1674 differentially expressed genes (normals vs tumors) selected from 20,000 genes
• CNVs differential between tumors and matched normals
  • Amplifications and deletions
  • CNV segments determined using GISTIC 2
  • 141 significant segments found
• Somatic mutations (SNPs)
  • Filtered using biologically relevant criteria
De novo subtyping
• Recapitulates existing subtypes
• PAM50: based on expression
• Here: subtyping based on ‘all’ information
Driver mutations & MOA Which information types contributed to the patient classification? Interest score for each information type
Driver mutations & MOA
Figure labels: P53, PIK3CA
Driver mutations & MOA
Figure labels: P53, PIK3CA; subtypes BASAL, HER2, LUMA, LUMB; amplified region containing ERBB2
Driver mutations & MOA
Figure labels: P53, PIK3CA; subtypes BASAL, HER2, LUMA, LUMB
Driver mutations & MOA
Genes: CDH1, MAP2K4, MAP3K1, PTEN, MLL3, PIK3CA
• Set of mutually exclusive mutations that cover most of the LUMA patients
• Map on the p38-JNK pathway
• Do not cover all patients!
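Coverage and mutual exclusivity of a gene set can each be expressed as a simple fraction. A minimal sketch with made-up patients (the gene names come from the slide; the patient-to-mutation assignments and the placeholder GENE_X are hypothetical, and the two scores are illustrative definitions rather than the ones used in the study):

```python
def coverage_and_exclusivity(mutations, gene_set):
    """mutations maps each patient to the set of genes mutated in that patient.

    coverage    = fraction of patients with at least one mutated gene in gene_set
    exclusivity = fraction of covered patients with exactly one such gene
    """
    hits = {p: len(genes & gene_set) for p, genes in mutations.items()}
    covered = [p for p, n in hits.items() if n >= 1]
    exclusive = [p for p, n in hits.items() if n == 1]
    coverage = len(covered) / len(mutations)
    exclusivity = len(exclusive) / len(covered) if covered else 0.0
    return coverage, exclusivity


gene_set = {"CDH1", "MAP2K4", "MAP3K1", "PTEN", "MLL3", "PIK3CA"}

# Hypothetical patients, not taken from the breast cancer dataset.
mutations = {
    "patient1": {"PIK3CA"},
    "patient2": {"MAP3K1"},
    "patient3": {"CDH1", "PTEN"},  # two hits: breaks mutual exclusivity
    "patient4": {"GENE_X"},        # placeholder gene outside the set: not covered
}

coverage, exclusivity = coverage_and_exclusivity(mutations, gene_set)
print(coverage, round(exclusivity, 3))
```

A gene set with high coverage and high exclusivity matches the pattern described on the slide: most patients are hit, and each by only one gene of the set.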
P38-JNK1 pathway In mutation group
Driver mutations & MOA
Genes: CDH1, MAP2K4, MAP3K1, PTEN, MLL3, PIK3CA
• Somatic mutations in the p38-JNK pathway are mutually exclusive with CNVs in LUMA
P38-JNK1 pathway Pink= in CNV group In mutation group
Acknowledgements
UGENT/KUL
• Carolina Fierro
• Jimmy Van den Eynden
• Yan Wu
• Aminael Sanchez
• Dries De Maeyer
• Sergio Pullido
Ghent University/INTEC
• Jan Fostier
• Piet De Meester
• Lieven Verbeke
Ghent University/PSB
• Evangelia Dougali
• Yves Van de Peer
KUL/Computer science
• Luc De Raedt
• Siegfried Nijssens
• Joris Renkens
• Tan Levan
http://bioinformatics.psb.ugent.be/DBN/