Babelomics is a web resource for the functional interpretation of genome-scale experiments. It provides tools for the analysis and interpretation of genome-scale experimental output, including functional annotation, differential distribution analysis, and gene set enrichment analysis.
Babelomics: Functional interpretation of genome-scale experiments. Barcelona, 28 November 2007. Ignacio Medina (imedina@cipf.es), David Montaner (dmontaner@cipf.es). http://bioinfo.cipf.es Bioinformatics Department, CENTRO DE INVESTIGACION PRINCIPE FELIPE (VALENCIA)
Babelomics: A systems biology web resource for the functional interpretation of genome-scale experiments. http://babelomics.bioinfo.cipf.es
Genome-scale experiment output → Functional interpretation
Babelomics imported databases
Sources: ENSEMBL (www.ensembl.org), GO, KEGG, InterPro, Transcription Factors, cisRED, Bioentities, Literature, Gene expression
Identifiers (Homo sapiens): HGNC symbol, EMBL acc, UniProt/Swiss-Prot, UniProtKB/TrEMBL, Ensembl IDs, RefSeq, EntrezGene, Affymetrix, Agilent, PDB, Protein Id, IPI….
Ensembl ID
Organisms: Homo sapiens, Mus musculus, Rattus norvegicus, Gallus gallus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Arabidopsis thaliana
Babelomics tools
FatiGO: finds differential distributions of Gene Ontology terms between two groups of genes.
FatiGOplus: an extension of FatiGO for InterPro motifs, pathways, Swiss-Prot keywords, transcription factors (TF), gene expression in tissues, bioentities from scientific literature and cis-regulatory elements (cisRED).
Tissues Mining Tool: compares reference values of gene expression in tissues to your results.
MARMITE: finds differential distributions of bioentities extracted from PubMed between two groups of genes.
FatiScan: detects significant functions with Gene Ontology, InterPro motifs, Swiss-Prot keywords and KEGG pathways in lists of genes ordered according to different characteristics.
MarmiteScan: uses chemical and disease-related information to detect related blocks of genes in a gene list with associated values.
GSEA: detects blocks of functionally related genes with significant coordinated over- or under-expression using Gene Set Enrichment Analysis.
FatiGO input
Organism
Gene List 1, Gene List 2: text files with a column of identifiers
Databases: Biological process, Molecular function, Cellular component, KEGG pathways, Biocarta pathways (new), InterPro motifs, Swiss-Prot keywords, Bioentities from literature (Marmite), Gene expression (TMT), Transcription factor binding sites, Cis-regulatory elements (cisRED), miRNAs (new)
E-mail (emailme@cipf.es) and your project name
Testing the distribution of functional terms among two groups of genes (remember, we have to test hundreds of GOs)
Are these two groups of genes carrying out different biological roles?
Group A: Biosynthesis 60%, Sporulation 20%
Group B: Biosynthesis 20%, Sporulation 20%

                   A    B
Biosynthesis       6    2
No biosynthesis    4    8

Genes in group A are significantly associated with biosynthesis, but not with sporulation.
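A 2x2 table of this kind can be tested with Fisher's exact test. The slide does not name the test FatiGO applies, so the following is only a minimal sketch under that assumption; the function name `fisher_exact_two_sided` is hypothetical, and the counts are the ones from the table above.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: genes with / without the term. Columns: group A / group B.
    """
    row1 = a + b              # genes annotated with the term
    col1 = a + c              # genes in group A
    n = a + b + c + d         # all genes

    def weight(k):
        # Unnormalised hypergeometric weight of the table whose
        # top-left cell is k, with all margins held fixed.
        return comb(row1, k) * comb(n - row1, col1 - k)

    k_min = max(0, col1 - (n - row1))
    k_max = min(row1, col1)
    observed = weight(a)
    # Sum the weights of every table at least as unlikely as the observed one.
    # Integer weights are compared, so ties are handled exactly.
    extreme = sum(weight(k) for k in range(k_min, k_max + 1)
                  if weight(k) <= observed)
    return extreme / comb(n, col1)

# Biosynthesis: 6 genes in A, 2 in B; no biosynthesis: 4 in A, 8 in B.
p_value = fisher_exact_two_sided(6, 2, 4, 8)
print(f"p = {p_value:.4f}")
```

In practice the same test would be repeated for every functional term, which is why the resulting p-values need a multiple-testing correction.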
FatiGO results: for each functional block, the output shows whether gene group 1 or gene group 2 is enriched, together with percentages, p-values and corrected p-values.
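Because hundreds of terms are tested, the raw p-values have to be corrected. The slides do not say which correction FatiGO reports; as an illustration only, this sketch assumes the Benjamini-Hochberg false discovery rate procedure. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that the adjusted values
    # stay monotone and never exceed 1.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four functional terms.
raw = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(raw)
print(corrected)
```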
FatiScan input
Organism
Gene list ordered according to the experimental value
Databases: Biological process, Molecular function, Cellular component, KEGG pathways, InterPro motifs, Swiss-Prot keywords, Transcription factors, Cis-regulatory elements
Testing along the ordered list
• Index ranking genes according to some biological aspect under study (list of genes, from + to -).
• Database that stores gene class membership information (annotation labels A, B and C).
• FatiScan searches over the whole ordered list, trying to find runs of functionally related genes.
Figure: a block of genes enriched in annotation A, a block of genes enriched in annotation B, and annotation C homogeneously distributed along the list.
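The search along the ordered list can be sketched as a series of cut points: at each cut, the proportion of annotated genes above the cut is compared with the proportion below it. The slides do not give the exact partitioning or test, so the number of partitions, the use of Fisher's exact test, and all names here (`scan_partitions`, the gene identifiers) are assumptions for illustration.

```python
from math import comb

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def weight(k):
        return comb(row1, k) * comb(n - row1, col1 - k)

    k_min, k_max = max(0, col1 - (n - row1)), min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(k) for k in range(k_min, k_max + 1)
                  if weight(k) <= observed)
    return extreme / comb(n, col1)

def scan_partitions(ranked_genes, annotated, n_partitions=4):
    """For each cut of the ranked list, compare annotation rates above and below.

    Returns tuples (cut, fraction annotated above, fraction annotated below, p).
    """
    n = len(ranked_genes)
    flags = [gene in annotated for gene in ranked_genes]
    results = []
    for step in range(1, n_partitions):
        cut = n * step // n_partitions
        if cut == 0 or cut == n:
            continue  # a one-sided split cannot be tested
        top_hits = sum(flags[:cut])
        bottom_hits = sum(flags[cut:])
        p = fisher_two_sided(top_hits, bottom_hits,
                             cut - top_hits, (n - cut) - bottom_hits)
        results.append((cut, top_hits / cut, bottom_hits / (n - cut), p))
    return results

# Hypothetical ranked list (most positive value first) and annotation set.
ranked = [f"g{i}" for i in range(1, 13)]
annotation_a = {"g1", "g2", "g3", "g5"}
results = scan_partitions(ranked, annotation_a)
for cut, top, bottom, p in results:
    print(f"cut={cut:2d}  top={top:.0%}  bottom={bottom:.0%}  p={p:.4f}")
```

An annotation concentrated at one end of the list gives a small p-value at some cut, while a homogeneously distributed annotation does not.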
FatiScan results. Figure: % of genes with the specific GO annotation for each partition of the list of genes (ordered from + to -), for annotations A, B and C.
Functional interpretation. Figure: % of genes with the specific GO annotation for each partition, along the expression level axis (+ to -) between A and B. GO terms over-represented among genes over-expressed in A; GO terms over-represented among genes over-expressed in B.
FatiScan example: Tumor vs. Control. All genes in the array are ordered by t (from + t to - t), where t ~ Tumor mean expression - Control mean expression. Proliferation is more associated with the genes at the top of the list, that is, with the genes that show higher expression in Tumors.
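The ordered list used in this example can be built by scoring each gene with a t statistic and sorting. The slide only states that t is roughly the difference between Tumor and Control mean expression; the Welch form of the statistic, the function names and the expression values below are assumptions made for this sketch.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(tumor, control):
    """Welch t statistic: positive when expression is higher in tumors."""
    standard_error = sqrt(variance(tumor) / len(tumor)
                          + variance(control) / len(control))
    return (mean(tumor) - mean(control)) / standard_error

def rank_genes(expression):
    """Return (gene, t) pairs sorted from the most positive t to the most negative."""
    scores = {gene: welch_t(tumor, control)
              for gene, (tumor, control) in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical expression values: gene -> (tumor samples, control samples).
expression = {
    "geneA": ([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),
    "geneB": ([1.0, 2.0, 3.0], [5.0, 6.0, 7.0]),
    "geneC": ([3.0, 4.0, 5.0], [3.0, 4.0, 5.5]),
}
ranking = rank_genes(expression)
for gene, t in ranking:
    print(f"{gene}: t = {t:+.3f}")
```

The resulting ranking is the kind of ordered gene list that FatiScan takes as input. Each group needs at least two samples with some variation, otherwise the standard error is undefined or zero.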