Using pathway information to understand genomics results
Chris Evelo, BiGCaT Bioinformatics
the European Nutrigenomics Organisation

Understanding Array data
Typical procedure
• Annotate the reporters with something useful (UniProt!)
• Sort based on fold change
• Search for your favorite genes/proteins
• Throw away 95% of the array

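The typical procedure above can be sketched in a few lines. This is a minimal illustration only: the reporter names, protein identifiers, and intensities are hypothetical, and using a log2 ratio of treated over control as the fold change is an assumption, not something the slide specifies.

```python
import math

# Hypothetical reporter-to-protein annotation (placeholder identifiers).
annotation = {"rep_001": "PROT_A", "rep_002": "PROT_B", "rep_003": "PROT_C", "rep_004": None}

# Hypothetical mean intensities per reporter: (control, treated).
intensities = {
    "rep_001": (100.0, 400.0),
    "rep_002": (250.0, 240.0),
    "rep_003": (80.0, 10.0),
    "rep_004": (500.0, 510.0),
}

def log2_fold_change(control, treated):
    """Log2 ratio of treated over control intensity (assumed definition)."""
    return math.log2(treated / control)

def rank_by_fold_change(intensities, annotation):
    """Annotate each reporter and sort by absolute log2 fold change, largest first."""
    rows = [
        (reporter, annotation.get(reporter), log2_fold_change(control, treated))
        for reporter, (control, treated) in intensities.items()
    ]
    rows.sort(key=lambda row: abs(row[2]), reverse=True)
    return rows

def keep_top_fraction(rows, fraction=0.05):
    """Keep only the top fraction of the ranked list (at least one row)."""
    count = max(1, math.ceil(len(rows) * fraction))
    return rows[:count]

ranked = rank_by_fold_change(intensities, annotation)
kept = keep_top_fraction(ranked)
```

With the default fraction, everything outside the top 5% is discarded, which is exactly the loss of information the slide points at.
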
Understanding Array data
“Advanced” procedures
• Gene clustering or principal component analysis
• Get groups of genes with parallel expression patterns
• Useful for diagnosis
• Not adding much to understanding (unless combined)

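To illustrate what "groups of genes with parallel expression patterns" means, here is a minimal sketch that groups genes by Pearson correlation of their expression profiles. The greedy grouping rule, the 0.9 threshold, and the gene names and values are all assumptions made for illustration; it is not one of the clustering methods used in the talk.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def group_by_pattern(profiles, threshold=0.9):
    """Greedy grouping: a gene joins the first group whose seed gene it correlates with."""
    groups = []
    for gene in sorted(profiles):
        for group in groups:
            if pearson(profiles[gene], profiles[group[0]]) >= threshold:
                group.append(gene)
                break
        else:
            groups.append([gene])  # no matching group: this gene seeds a new one
    return groups

# Hypothetical expression profiles over four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.0, 4.0, 2.2],
}
groups = group_by_pattern(profiles)
```

The result says which genes move together, but not why, which is the slide's point about clustering adding little understanding unless combined with pathway information.
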
Functional Mapping Annotation/coupling

Best known: GenMAPP
• Full content of GO database
• Textbook-like local MAPPs
• Geneboxes with active backpages, coupled to online databases
• Visualize anything numerical (fold changes on arrays, p-values, present calls, proteomics results)

GenMAPP: Full GO content

GenMAPP: Textbook-like maps
Extensive backpages present, with links to online databases

GenMAPP: visualize anything numerical
Example: proteomics results (2D gels with GC-MS identification). Fasting/feeding study shows regulation of glycolysis (data from Johan Renes, UM).
Other useful things:
- p-values, present calls
- presence in clusters
- presence in QTLs

MAPPfinder
• Ranks MAPPs where relatively many changes occur
• Useful to find unexpected pathways
• Statistics hardly developed (many dependencies to overcome)
• Next example from heart failure study (Schroen et al., Circ Res 2004; 95: 506-514)

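Ranking pathways "where relatively many changes occur" can be sketched as an over-representation score. The example below uses a z-score built from the hypergeometric mean and variance, a common choice for this kind of analysis. Whether it matches MAPPfinder's exact computation is an assumption, and the pathway names and gene identifiers are hypothetical.

```python
import math

def overrepresentation_z(r, n, R, N):
    """z-score for r changed genes among n pathway genes,
    given R changed genes among N measured genes."""
    if N <= 1:
        return 0.0
    p = R / N
    expected = n * p
    variance = n * p * (1 - p) * (1 - (n - 1) / (N - 1))
    if variance == 0:
        return 0.0
    return (r - expected) / math.sqrt(variance)

def rank_pathways(pathways, changed, measured):
    """Score every pathway and sort by z-score, highest first."""
    N = len(measured)
    R = len(changed & measured)
    scores = []
    for name, genes in pathways.items():
        on_array = genes & measured      # only genes that were actually measured
        n = len(on_array)
        if n == 0:
            continue
        r = len(on_array & changed)
        scores.append((name, r, n, overrepresentation_z(r, n, R, N)))
    scores.sort(key=lambda score: score[3], reverse=True)
    return scores

# Hypothetical data: 20 measured genes, 4 of them changed.
measured = {f"g{i}" for i in range(1, 21)}
changed = {"g1", "g2", "g3", "g4"}
pathways = {
    "pathway_x": {"g1", "g2", "g3", "g10"},
    "pathway_y": {"g4", "g11", "g12", "g13", "g14"},
}
ranking = rank_pathways(pathways, changed, measured)
```

This score treats pathways as independent. Genes shared between pathways and nested GO terms break that assumption, which is one of the dependencies the slide says are still to be overcome.
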
GenMAPP: Full GO content

Scientists know GenMAPP
Advantages:
• Easy to use
• Reasonable visualization
• Some pathway statistics
• Interesting content
Disadvantages:
• Small academic initiative, uncertain lifespan
• No info on reactions, metabolites, location
• No visualization of change (e.g. time course)
• Content could be better!

Data sources 1
GenMAPP local MAPPs: Created by a single postdoc (Kam Dahlquist).

Data sources 2
KEGG: Older pathway database (Kyoto, Japan), organized at the Enzyme Commission (EC) number level.
Problems for automatic annotation: no absolute match between EC numbers and Swiss-Prot (SP) IDs.
Contributed and converted MAPPs.
See example.

Data sources 3
Gene Ontology Database: Simple tree-structured database with a lot of biological content (biologists know and like it).
Automatic annotation possible, even for ESTs.
See structure in GenMAPP 1 (or use GO browser).

Data sources 4
Alternative programs like GeneGo:
Based on expert knowledge (20 Russian biochemists).
Allows pathway connection (explain).
Primitive views of multiple conditions.
See example results.

Fatty acid oxidation - II

Data sources 5
Reactome: Curated reactions database (with n-dimensional interconnections) from EBI and others.
Still lacks views and export options.

Proposed workflow
Started as a NuGO and IOP gut health initiative. Waiting for expert response. (Add another map.) Evaluated some commercial tools (Pathway Assist).
• Combine and forward existing maps to limited group of experts
• Think of best way to store pathway information
• Text mining from key genes/metabolites
• Forward improved maps to limited group of experts
• Develop storage format plus tools
• Collect back page info
• Forward new draft to a larger group of experts within NuGO
• Develop/adapt entry tools plus converters
• Test resulting maps
• Make maps available

Storing pathway data
• Rachel van Haaften (BiGCaT/NuGO) and Marjan van Erk (TNO/NuGO) will test this and give user feedback.
• Rachel van Haaften (BiGCaT/NuGO) and Marjan van Erk (TNO/NuGO) will visit EBI early 2005 to learn how to do this.
• GMML (GenMAPP Markup Language) is a superset of BioPAX 1. BioPAX could contain graphical views (GMML 2 = BioPAX 2). But how do we make that happen? This step has not been taken care of yet…
• BiGCaT TU/e students created a GenMAPP 2 - GMML converter with help from Lynn Ferrante (GenMapp.org).
• Philippe Rocca and Imre Vastrik (EBI/Reactome) will define a way to get Reactome views and export them to GenMAPP 2.
[Diagram labels: Expert data; NuGO/EBI; Reactome; BioPAX; GMML; BiGCaT/GenMAPP; EBI; Current GenMAPP; BioPAX Plus/GMML 2; GenMAPP 2]

Reactome to GenMAPP
Current status: Export via MS Access shows only some content (reaction numbers) in GenMAPP.
Example

Views on reactions 1
Reaction databases are built from interconnected reactions (pathways).
Some of these reactions may connect to other known pathways.
Combined pathways may form new pathways that we didn’t know about.

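The idea that interconnected reactions can link known pathways into new ones can be sketched as a graph search. In this minimal example, reactions are linked whenever a product of one is a substrate of another, and a breadth-first search looks for a route between two metabolites. The reaction names, pathway labels, and metabolites are hypothetical, and this is not how Reactome stores or queries its data.

```python
from collections import deque

# Hypothetical reactions: name -> (pathway label, substrates, products).
reactions = {
    "R1": ("pathway A", {"m1"}, {"m2"}),
    "R2": ("pathway A", {"m2"}, {"m3"}),
    "R3": ("pathway B", {"m3"}, {"m4"}),
    "R4": ("pathway B", {"m4"}, {"m5"}),
}

def find_route(reactions, start, goal):
    """Breadth-first search from one metabolite to another.
    Returns the list of reaction names used, or None if no route exists."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        metabolite, route = queue.popleft()
        if metabolite == goal:
            return route
        for name in sorted(reactions):
            _, substrates, products = reactions[name]
            if metabolite in substrates:
                for product in sorted(products):
                    if product not in seen:
                        seen.add(product)
                        queue.append((product, route + [name]))
    return None

def pathways_on_route(reactions, route):
    """Ordered list of the distinct pathway labels a route passes through."""
    labels = []
    for name in route:
        label = reactions[name][0]
        if label not in labels:
            labels.append(label)
    return labels

route = find_route(reactions, "m1", "m5")
crossed = pathways_on_route(reactions, route)
```

A route that crosses more than one pathway label is a candidate combined pathway of the kind described above.
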
Views on reactions 2
Unknown pathways may not be connected in the database yet. But they may:
• Show co-regulation
• Also contain regulatory elements
(We can’t analyze those from scratch…)

Challenges
Biological concepts and available tools do not yet meet!
Biologists:
• know what they want
• don’t know how to do it or what problems are involved.
Computer science people:
• know how it should be done
• but not what should be done.
BMT students have shown that they can work at this interface.