Canadian Bioinformatics Workshops www.bioinformatics.ca
Module 4: Analyzing gene list function and associations • Quaid Morris • Interpreting gene lists from -omics studies, July 15-16, 2010 • http://morrislab.med.utoronto.ca
Overview • Extending gene lists using functional associations • Sources of functional association • GeneMANIA
Extending Gene Lists • Given a gene list, find other similar genes • The gene list defines the query and the “function” of interest • Query: complex or pathway components; Result: additional members • Query: kinases; Result: other kinases and related genes • Query: genes affected in an RNAi screen; Result: other genes that may affect the phenotype
Network-Based Gene Function Prediction • Genes of similar sequence often have similar function • An unknown gene that is similar to a known gene is likely to have a similar function (annotation transfer) • Guilt-by-association principle • Many other similarity measures exist for genes (e.g. co-localization) • Reference: Fraser AG, Marcotte EM. A probabilistic view of gene function. Nat Genet. 2004 Jun;36(6):559-64
Functional association networks to predict gene function • Microarray expression data (Eisen et al, PNAS 1998) give a co-expression network • [Figure: co-expression network with node labels CDC3, CLB4, CDC16 and UNK1 (cell cycle) and RPT1, RPN3, RPT6 and UNK2 (protein degradation)] • Reference: Fraser AG, Marcotte EM. A probabilistic view of gene function. Nat Genet. 2004 Jun;36(6):559-64
Predicting Gene Function Using a Network • Question: is gene X involved in cell cycle regulation? • Labelled examples: positives (+) CDC3, CLB4, CDC16; negatives (-) RPT1, RPN3, RPT6; unlabelled (?) UNK1, UNK2, UNK3 • Network: e.g. co-expression • A classification algorithm (kNN, SVM, label propagation) assigns each unlabelled gene a discriminant value: UNK1 0.9, UNK2 0.1, UNK3 0.05 • Discriminant value: a value you can a) use to rank the genes according to certainty and b) threshold to classify genes
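The ranking and thresholding of discriminant values can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the talk: the gene names come from the slide, but the edge weights, the `gba_score` helper and the -1 to +1 score scale are assumptions made for the example (the slide's own values 0.9, 0.1 and 0.05 come from an unspecified classifier). The score used here is the simplest guilt-by-association rule: the weighted mean label of a gene's directly connected labelled neighbours.

```python
# Toy co-expression network: (gene_a, gene_b) -> association weight.
# Gene names follow the slide; every weight is invented for illustration.
EDGES = {
    ("CDC3", "CLB4"): 0.9,
    ("CDC3", "CDC16"): 0.8,
    ("CLB4", "UNK1"): 0.7,
    ("CDC16", "UNK1"): 0.6,
    ("RPT1", "RPN3"): 0.9,
    ("RPN3", "RPT6"): 0.8,
    ("RPT1", "UNK2"): 0.7,
    ("RPT6", "UNK2"): 0.5,
    ("CDC16", "UNK2"): 0.1,
}

# +1 = known cell cycle gene (positive), -1 = negative example.
LABELS = {"CDC3": 1, "CLB4": 1, "CDC16": 1, "RPT1": -1, "RPN3": -1, "RPT6": -1}


def neighbours(gene):
    """Yield (neighbour, weight) pairs for a gene in the undirected network."""
    for (a, b), weight in EDGES.items():
        if a == gene:
            yield b, weight
        elif b == gene:
            yield a, weight


def gba_score(gene):
    """Weighted mean label of the gene's labelled neighbours, in [-1, 1]."""
    total = 0.0
    weight_sum = 0.0
    for other, weight in neighbours(gene):
        if other in LABELS:
            total += weight * LABELS[other]
            weight_sum += weight
    # A gene with no labelled neighbours gets a neutral score of 0.
    return total / weight_sum if weight_sum else 0.0


def rank_and_classify(genes, threshold=0.0):
    """Rank genes by score (highest first) and flag those above the threshold."""
    scored = sorted(((gba_score(g), g) for g in genes), reverse=True)
    return [(gene, score, score > threshold) for score, gene in scored]


if __name__ == "__main__":
    for gene, score, predicted in rank_and_classify(["UNK1", "UNK2", "UNK3"]):
        call = "cell cycle" if predicted else "not called"
        print(f"{gene}\t{score:+.2f}\t{call}")
```

In this toy network UNK1 touches only positive genes and ranks first, UNK3 has no edges and stays neutral, and UNK2 is pulled towards the negative examples. Moving `threshold` trades sensitivity against specificity without changing the ranking.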
Label propagation vs guilt-by-association • [Figure: discriminant values, on a scale from -1 to +1, assigned to CDC48, MCA1, CPR3 and TDH2 by the label propagation algorithm and by guilt-by-association]
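The contrast between the two approaches can be illustrated with a small sketch. The four gene names are taken from the slide, but the network topology, the edge weights and the labels below are assumptions, since the original figure is not recoverable. The label propagation shown is one common formulation (each score is repeatedly set to a balance between the gene's own label and its neighbours' scores); it is not claimed to be the exact algorithm behind the figure.

```python
# Hypothetical network: CDC48 - MCA1 - CPR3 form a chain, TDH2 is isolated.
GENES = ["CDC48", "MCA1", "CPR3", "TDH2"]
EDGES = {("CDC48", "MCA1"): 1.0, ("MCA1", "CPR3"): 1.0}
# Assumed labels: CDC48 is a positive example, TDH2 a negative one.
LABELS = {"CDC48": 1.0, "TDH2": -1.0}


def build_adjacency(genes, edges):
    """Turn an undirected edge list into a gene -> {neighbour: weight} map."""
    adj = {g: {} for g in genes}
    for (a, b), weight in edges.items():
        adj[a][b] = weight
        adj[b][a] = weight
    return adj


def direct_neighbour_score(adj, labels, gene):
    """Guilt-by-association: weighted mean label of directly linked labelled genes."""
    labelled = [(w, labels[n]) for n, w in adj[gene].items() if n in labels]
    total = sum(w for w, _ in labelled)
    return sum(w * y for w, y in labelled) / total if total else 0.0


def label_propagation(adj, labels, iterations=200):
    """Iterate f_i = (y_i + sum_j w_ij * f_j) / (1 + sum_j w_ij).

    Unlabelled genes start from y_i = 0. The fixed point minimises
    sum_i (f_i - y_i)^2 + sum_{i<j} w_ij * (f_i - f_j)^2.
    """
    y = {g: float(labels.get(g, 0.0)) for g in adj}
    f = dict(y)
    for _ in range(iterations):
        f = {
            g: (y[g] + sum(w * f[n] for n, w in adj[g].items()))
            / (1.0 + sum(adj[g].values()))
            for g in adj
        }
    return f


if __name__ == "__main__":
    adj = build_adjacency(GENES, EDGES)
    propagated = label_propagation(adj, LABELS)
    for gene in ("MCA1", "CPR3"):
        direct = direct_neighbour_score(adj, LABELS, gene)
        print(f"{gene}: direct={direct:+.3f} propagated={propagated[gene]:+.3f}")
```

In this example CPR3 has no labelled neighbour, so the direct rule leaves it at 0, whereas label propagation gives it a small positive score because it is two steps away from CDC48. That indirect evidence is the practical difference between the two approaches.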
Types of functional associations • Molecular interactions (i.e. physical interactions) • Regulatory interactions (e.g. ChIP-chip binding) • Genetic interactions (e.g. synthetic lethality) • Similarity relationships: co-expression; protein sequence (e.g. BLAST -log(E-value)); domain architecture; phylogenetic profiles; gene neighborhood**; gene fusion**; … • ** most useful for bacterial genes
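As an illustration of how one of these similarity relationships becomes a network, the sketch below builds co-expression edges from expression profiles using the Pearson correlation. The expression values, the `min_r` cutoff of 0.8 and the function names are all assumptions chosen for the example; real pipelines differ in normalisation and in how they sparsify the network.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A flat profile has no defined correlation; treat it as 0.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def coexpression_network(profiles, min_r=0.8):
    """Link every gene pair whose correlation is at least min_r."""
    edges = {}
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r >= min_r:
            edges[(a, b)] = r
    return edges


# Invented expression values across four hypothetical conditions.
PROFILES = {
    "CDC3": [1.0, 2.0, 3.0, 4.0],
    "CLB4": [2.0, 4.0, 6.0, 8.5],
    "RPT1": [4.0, 3.0, 2.0, 1.0],
}

if __name__ == "__main__":
    for (a, b), r in coexpression_network(PROFILES).items():
        print(f"{a} - {b}: r = {r:.3f}")
```

Only the CDC3 and CLB4 profiles rise together, so they form the single edge; RPT1 is anti-correlated with both and stays unconnected. The other association types in the list would supply their own pairwise scores in the same edge format.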
Problem: genes have multiple functions • A gene's “function” could refer to its: biological process; biochemical/molecular function; subcellular/cellular localization; regulatory targets; temporal expression pattern; phenotypic effect of deletion • Some networks may be better for some types of gene function than others
Query-specific weights for multifaceted functional queries • The combined network is a weighted sum of the individual networks: w1 × co-complexed (Jeong et al, 2002) + w2 × genetic (Tong et al, 2001) + w3 × co-expression • Example queries: cell cycle (CDC27, CDC23, APC11, UNK1) and DNA repair (RAD54, XRS2, MRE11, UNK2) • References: Pavlidis et al, 2002; Lanckriet et al, 2004; Mostafavi et al, 2008 • The GeneMANIA project
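The idea of query-specific weights can be sketched as follows. This is a deliberately simple heuristic and not the weighting procedure of Mostafavi et al, 2008: each network is weighted by how much more strongly it links the query genes to each other than to the background genes. The gene names follow the slide; the edges, the function names and the scoring rule are assumptions for illustration.

```python
from itertools import combinations


def mean_weight(network, pairs):
    """Mean edge weight over the given gene pairs (missing edges count as 0)."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    total = sum(network.get((a, b), network.get((b, a), 0.0)) for a, b in pairs)
    return total / len(pairs)


def query_specific_weights(networks, query, background):
    """Weight each network by (within-query mean) - (query-to-background mean)."""
    raw = {}
    for name, network in networks.items():
        within = mean_weight(network, combinations(query, 2))
        across = mean_weight(network, ((q, g) for q in query for g in background))
        raw[name] = max(0.0, within - across)
    total = sum(raw.values())
    if total == 0.0:
        # No network is informative for this query: fall back to equal weights.
        return {name: 1.0 / len(networks) for name in networks}
    return {name: value / total for name, value in raw.items()}


def combine(networks, weights):
    """Weighted sum of the networks, keyed by alphabetically sorted gene pair."""
    combined = {}
    for name, network in networks.items():
        for (a, b), w in network.items():
            key = tuple(sorted((a, b)))
            combined[key] = combined.get(key, 0.0) + weights[name] * w
    return combined


# Invented edges: the co-complexed network links the cell cycle genes,
# the genetic interaction network links the DNA repair genes.
NETWORKS = {
    "co-complexed": {
        ("CDC27", "CDC23"): 1.0, ("CDC23", "APC11"): 1.0,
        ("APC11", "CDC27"): 1.0, ("MRE11", "XRS2"): 1.0,
    },
    "genetic": {
        ("RAD54", "XRS2"): 1.0, ("MRE11", "RAD54"): 1.0,
        ("MRE11", "XRS2"): 1.0, ("CDC27", "RAD54"): 1.0,
    },
}
CELL_CYCLE = ["CDC27", "CDC23", "APC11"]
DNA_REPAIR = ["RAD54", "XRS2", "MRE11"]

if __name__ == "__main__":
    print(query_specific_weights(NETWORKS, CELL_CYCLE, DNA_REPAIR))
    print(query_specific_weights(NETWORKS, DNA_REPAIR, CELL_CYCLE))
```

With these toy edges the cell cycle query puts all of its weight on the co-complexed network, while the DNA repair query favours the genetic interaction network. The combined network returned by `combine` can then be passed to any network-based classifier.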
GeneMANIA in the MouseFunc contest • “Test” benchmark: predicting held-out genes • One of GeneMANIA’s two entries had the best area under the ROC curve in every category • Sara Mostafavi
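For readers unfamiliar with the metric, the area under the ROC curve for a set of held-out genes can be computed directly from the discriminant values: it equals the fraction of (positive, negative) gene pairs in which the positive gene receives the higher score, with ties counted as half. The sketch below is a generic illustration with invented scores, not the MouseFunc evaluation code.

```python
def roc_auc(scores, is_positive):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: discriminant values for the held-out genes.
    is_positive: matching booleans, True if the gene truly has the function.
    """
    positives = [s for s, t in zip(scores, is_positive) if t]
    negatives = [s for s, t in zip(scores, is_positive) if not t]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative gene")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Invented discriminant values for four hypothetical held-out genes.
    held_out_scores = [0.9, 0.1, 0.8, 0.3]
    truly_positive = [True, True, False, False]
    print(roc_auc(held_out_scores, truly_positive))
```

An AUC of 1.0 means every true positive outranks every negative, and 0.5 is what random scores achieve on average. The quadratic loop is fine for small benchmarks; a rank-based formula is preferable for large gene sets.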
GeneMANIA performance on yeast • [Figure: methods compared by error and running time (axis labels: “More error”, “Slower”)] • Methods shown: GeneMANIA on 15 networks; GeneMANIA label propagation on bioPIXIE*; probabilistic graph search* on bioPIXIE*; GeneMANIA on 5 networks; TSS** on 5 networks • * Myers et al, 2005 • ** Tsuda et al, 2005 • Source: Mostafavi et al, 2008
GeneMANIA Prediction Server http://www.genemania.org or http://qa.genemania.org
Other prediction servers • STRING (http://string-db.org/) • Funcoup (http://funcoup.sbc.su.se/) • FunctionalNet (http://www.functionalnet.org) • bioPIXIE (http://pixie.princeton.edu) • MouseNet (http://mousenet.princeton.edu/)
Chemogenomics • STITCH: Chemical-Protein Interactions • http://stitch.embl.de/
What Have We Learned? • Network-based gene function prediction • The guilt-by-association principle is used to predict gene function from functional association networks • Many types of functional associations exist, and they can be combined intelligently to optimize prediction accuracy • Convenient software is available: GeneMANIA • Emerging area: chemical genomics for gene function prediction