Systems Biology for Drug Discovery

Systems Biology for Drug Discovery. Building and using protein interaction networks: industry perspective. Andrej Bugrim, GeneGo, Inc. Topics. Annotation process and collecting network content for industrial-type applications


Presentation Transcript


  1. Systems Biology for Drug Discovery. Building and using protein interaction networks: industry perspective. Andrej Bugrim, GeneGo, Inc.

  2. Topics • Annotation process and collecting network content for industrial-type applications • Biological and disease ontologies – how to improve and use them in functional analysis • Tools: utilizing network data in pharmaceutical R&D

  3. Multi-level understanding of human biology • Level of phenotype • Level of cell process/network • Level of protein • Relations shown: causative relations, mechanistic relations

  4. Disease-centered knowledge base in MetaMiner (Oncology example)
     • GG annotation team: disease group, network group, specialty group, chemistry group
     • Causative disease associations: DNA, RNA, protein levels
     • Protein-protein, protein-DNA, protein-RNA interactions
     • Biomarkers
     • Ligand-receptor interactions: drugs, leads, hits
     • General BC schema; causative BC models; BC-perturbed cell processes
     • Compare: other cancers chosen by Consortium

  5. Content

  6. Three interaction domains in MetaCore
     • 1,600 drugs with targets • 4,100 endogenous metabolites • >21,000 ligand-receptor interactions • 850 GPCRs and other membrane receptors • 110 nuclear hormone receptors
     • Diagram labels: ligands (metabolites, peptides, xenobiotics); membrane receptors; signal transduction (G proteins, secondary messengers, kinases, phosphatases); transcription factors; core effect: metabolic pathways; metabolites
     • Counts: 172K manually curated physical signaling interactions; 538 canonical maps; 42,000 13-step canonical signal transduction pathways; 924 human transcription factors; 6,000 target genes; 11,300 metabolic reactions; 116 fine metabolic maps; 4,100 endogenous metabolites

  7. MetaBase Content Overview
     • Database: chemical compounds 580,000; drugs 8,590; chemical reactions 35,600; metabolic networks 251
     • Network: proteins + genes 13,402; transcription factors 924; chemical compounds 26,000; drugs 2,740; endogenous compounds 4,100; proteins linked to drugs 2,711; reactions 5,330; small-molecule ligands for human receptors 3,510; blockers for ion channels 629; PubMed journals 3,100; PubMed articles 81,400; total number of interactions 177,000
     • Content: GeneGo regulatory networks 120; GeneGo disease networks 88; maps 538; regulatory maps 325; metabolic maps 116; traditional metabolic maps (EC) 97; diseases 4,920

  8. MetaBase content by type (database) • Chemical compounds: 580,000 • Metabolic reactions: 35,600 • Human proteins: 14,570 • Genes: 137,500 total (human: 38,700)

  9. Manually curated interactions (172,787)
     • Logical relations: 1,934 (1%)
     • Signalling interactions: 137,297 (79%)
     • Protein-protein: 87,675 (51%)
     • Small molecule-protein: 42,383 (26%)
     • Metabolic reactions: 35,490 (21%)
     • Y2H "Interactome": 2,370 (1%)
     • With virus proteins: 335 (0%)
     • ChIP-chip: 980 (1%)
     • With microRNA: 1,620 (1%)
     • Network interactions: all interactions are taken from articles indexed in PubMed (3,100 journals; 81,400 articles)

  10. Type of interactions in network

  11. Distribution of interactions by mechanism

  12. Network objects Total number of nodes: 40,229

  13. Proteins: distribution by tissue & localization

  14. Molecular functions in Database

  15. Endogenous compounds (4,100 total) • 3,070 endogenous compounds involved in metabolic reactions: 6,819 reactions with endogenous compounds only • 751 endogenous ligands for 498 receptors, with 2,455 interactions • 4,000 (98%) of endogenous compounds are in the network • 15,962 network interactions with endogenous metabolites • 3,600 compounds with structures and brutto-formulas (the other 700 are “generic”: they contain acyl-, alkyl- and other variable groups)

  16. Network and pathway statistics in GeneGo (diagram labels: Enzyme1, reaction1, metabolite, reaction2, Enzyme2) • >40,000 nodes • ~177,000 edges • Average node degree: 3.77 • 241 million shortest pathways • Average shortest pathway length: 5.3811 • 42,000 13-step canonical signal transduction pathways • 200 canonical metabolic pathways: major metabolic fluxes like glycolysis or TCA • 72,000 pathways on metabolic maps: pathways analogous to KEGG (KEGG has 42,500)
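
As an illustration of how statistics such as the average node degree and average shortest pathway length on slide 16 can be computed, here is a minimal sketch on a toy directed graph. The node names and edges are hypothetical, and the slide does not say how GeneGo defines degree or which node pairs it averages over, so the definitions below (2·|E|/|V| and the mean BFS distance over reachable ordered pairs) are assumptions.

```python
from collections import deque


def average_degree(nodes, edges):
    """Mean number of edge endpoints per node: 2 * |E| / |V| (assumed definition)."""
    return 2 * len(edges) / len(nodes)


def average_shortest_path_length(nodes, edges):
    """Mean BFS distance over all ordered pairs (u, v) where v is reachable from u."""
    adjacency = {node: [] for node in nodes}
    for source, target in edges:
        adjacency[source].append(target)

    total_distance = 0
    reachable_pairs = 0
    for start in nodes:
        distance = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in distance:
                    distance[neighbor] = distance[current] + 1
                    queue.append(neighbor)
        total_distance += sum(distance.values())
        reachable_pairs += len(distance) - 1  # exclude the start node itself
    return total_distance / reachable_pairs if reachable_pairs else 0.0


# Hypothetical four-node signaling chain, not taken from MetaBase.
toy_nodes = ["receptor", "kinase", "tf", "gene"]
toy_edges = [("receptor", "kinase"), ("kinase", "tf"), ("tf", "gene")]

print(average_degree(toy_nodes, toy_edges))
print(average_shortest_path_length(toy_nodes, toy_edges))
```

An all-pairs BFS like this is fine for a toy graph; a network with tens of thousands of nodes would likely need a graph library or sampling.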

  17. Pathways in regulatory network • Start: TMR (transmembrane receptor) • TF (transcription factor) • End: target genes

  18. Ontologies

  19. Knowledge base (ontologies): mixed ontologies
     • By genre: Drama, Action, Romance, Horror, Foreign
     • By director: Lynch, Tarantino, Leone, Stone, Antonioni
     • By actor: Pitt, Nicholson, Depp, Redford, Damon
     • By year: 2007, 2006, 2005, 2004, 2003
     • Biological ontologies shown: molecular pathway, cellular process, disease, metabolic process
     • How do you compare “action” movies vs. Tarantino movies vs. 2003 movies?
     • These are incomparable because they belong to different categories

  20. Multiple ontologies in MetaDiscovery Platform: multi-dimensional knowledge base on human biology

  21. Enrichment in GO and GeneGo processes
     • GO processes: resolution is a list of proteins; no connections between proteins; no signaling/effect within process
     • GeneGo process networks: resolution is interactions between proteins; connections between all proteins in folder; clear signaling path and effect within process
     • Data: 4 samples from 4 patients; disease/norm from same patients; Affy U133A arrays
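
The slides do not state which statistic is used to score enrichment of a gene list in a GO or GeneGo process. A hypergeometric tail probability is one common choice, so the sketch below assumes it; the function name and the numbers in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(population, category_size, sample_size, overlap):
    """P(X >= overlap) when sample_size genes are drawn without replacement
    from a population in which category_size genes belong to the process."""
    upper = min(category_size, sample_size)
    tail = sum(
        comb(category_size, k) * comb(population - category_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(population, sample_size)


# Hypothetical numbers: 10 genes on the array, 4 annotated to the process,
# 3 differentially expressed, 2 of those annotated to the process.
print(enrichment_p_value(population=10, category_size=4, sample_size=3, overlap=2))
```

A small p-value means the overlap is larger than expected by chance. The same gene list can be scored against either a GO process (a plain gene list) or a process network (the proteins it contains), which is what makes the two ontologies comparable in this kind of test.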

  22. Inflammation
     • Genes from GO process “Inflammatory response”: 231 (in networks 152; not in networks 79)
     • Genes from GO process “Immune response”: 446 (in networks 247; not in networks 199)
     • Genes from GO processes “Inflammatory response” and “Immune response” combined: 613 (in networks 345; not in networks 268)
     • Genes in 15 process networks: 1,642
     • Genes added to networks: 1,297

  23. Diseases
     • 4,881 diseases, based on MeSH
     • 38,709 human genes total
     • Human genes linked to diseases: 6,318
     • Diseases linked to genes: 1,630
     • Human genes not linked to diseases: 32,391
     • Diseases with no gene links: 3,251
     • 21,264 unique articles, indexed in PubMed

  24. Disease tree – Neoplasms by Site

  25. Drug toxicity tree • 38 drug-induced pathological processes • Folders from MeSH • Folders created at GeneGo based on reviews

  26. Gene-disease connections in public domain and GeneGo
     • OMIM: only genetic info (mutations, SNPs); no expression; no protein activity or localization
     • GENE: only citations with disease names; low trust
     • MeSH: only a hierarchical disease tree
     • The public domain does not have structured information about disease connectivity (by clinical classification) or causative relations with genes and proteins
     • GeneGo: hierarchical structure; disease classification with 4,888 diseases; 6,429 genes associated with diseases; 33,792 cited articles

  27. Content. Cancer maps and networks. Breast cancer: general scheme

  28. Angiogenesis in tumor growth

  29. Unique genes: fine metabolic differences between rodents (mouse, rat) and human
     • Unique genes and orthologs catalyze one reaction: 141 mouse genes, 74 rat genes
     • Diagram note: there are no human orthologs for Protein A
     • Unique genes catalyze unique reactions: 9 mouse genes, 2 rat genes
     • Orthologs catalyze different reactions: 1 mouse gene, 1 rat gene

  30. Tools

  31. Data analysis workflow in MetaDiscovery suite
     • Custom interactions data: Y2H, pull-down, co-expression, annotation
     • Custom maps, networks, pathways
     • Diagram labels: molecular bio data; ISIS DB; MetaLink; PathwayEditor; MapEditor; structures (sdf, MOL); metabolites; HTS, HCS; SBML, BioPax; MetaCore/MetaDrug platform
     • P-value scoring
     • Ontologies: GO processes, GeneGo processes, canonical pathways, metabolic networks, diseases, toxicities
     • Cross-experiment comparison • Time series • Multi-patient cohorts • Multiple logical operations • Complete report
     • Signature networks • Diseases • Drug response
     • Network alignment • Multiple algorithms • Sub-network queries
     • Modeling software: CellDesigner, Virtual Cell
     • Med. chemistry: indications, toxicities, off-site effects
     • Biology: biomarkers, pathway-based targets

  32. MetaCore™ Platform • Network building tools • Statistics for pathways, processes, networks • Pathway editor • Visualization tools • Data: microarrays, SAGE, proteomics, siRNA, metabolites, custom interactions • Logical operations module • Curated interactions from the literature • Oracle-based database

  33. Networks of protein interactions • Dynamic; built “on-the-fly” • Exploratory tool • Build new pathways for genes of interest Pathways Integration • Interactive, static maps • 550 maps • Signaling, regulation, metabolism, diseases • Backbone of formalized “state of art” in the field

  34. Choose direction and checkpoints within network building page From – histamine through – histamine H1 receptor to – Actin
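
A minimal sketch of what building a network "from, through, to" could look like: find a directed path from the start node to the checkpoint, then from the checkpoint to the end node, and join the two. This is an assumption about the idea, not a description of MetaCore's network-building algorithms. The node names `histamine`, `histamine H1 receptor` and `Actin` come from the slide; every other node and every edge in the toy graph is hypothetical.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the node list from start to goal, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in adjacency.get(current, []):
            if neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None


def path_via(adjacency, start, checkpoint, goal):
    """Join a shortest start->checkpoint path with a shortest checkpoint->goal path."""
    first_leg = shortest_path(adjacency, start, checkpoint)
    second_leg = shortest_path(adjacency, checkpoint, goal)
    if first_leg is None or second_leg is None:
        return None
    return first_leg + second_leg[1:]  # drop the duplicated checkpoint


# Hypothetical toy network; only the three named endpoints come from the slide.
toy_network = {
    "histamine": ["histamine H1 receptor", "other receptor"],
    "histamine H1 receptor": ["intermediate 1"],
    "other receptor": ["Actin"],
    "intermediate 1": ["Actin"],
}

print(shortest_path(toy_network, "histamine", "Actin"))
print(path_via(toy_network, "histamine", "histamine H1 receptor", "Actin"))
```

In the toy graph the unconstrained shortest path bypasses the H1 receptor, while the checkpoint forces the route through it. Joining two independently found legs can revisit a node in larger graphs, so a real implementation would need to handle that case.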

  35. False discovery rate filter • Threshold: 0.01 (Apply) • Non-significant bars become semi-transparent
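
The slide does not say which false discovery rate procedure the filter applies. The sketch below assumes the widely used Benjamini-Hochberg step-up procedure and uses the slide's threshold of 0.01 as the default; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, threshold=0.01):
    """Return one boolean per p-value: True if it passes the FDR threshold
    under the Benjamini-Hochberg step-up procedure."""
    count = len(p_values)
    order = sorted(range(count), key=lambda i: p_values[i])

    # Find the largest rank whose p-value is below its rank-scaled threshold.
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / count * threshold:
            cutoff_rank = rank

    # Everything ranked at or below the cutoff is declared significant.
    significant = [False] * count
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[index] = True
    return significant


# Hypothetical enrichment p-values for four ontology terms.
print(benjamini_hochberg([0.001, 0.2, 0.004, 0.03]))
```

Terms marked `False` would correspond to the bars drawn semi-transparent in the histogram.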

  36. New customization modules
     • MapEditor: custom maps synchronized with MC/MD database. Draw pathway maps from scratch; transform gene lists into networks into pathway maps; edit MetaCore’s canonical maps; view and score your maps within the context of canonical maps; map experimental data on custom maps
     • MetaLink: overlaying custom interactions. Import custom interactions (Y2H, co-expression, pull-down, etc.); visualize using GeneGo network building algorithms; score “unknown” proteins (high IP potential) based on relevance to “benchmark” networks built from MetaCore interactions
     • PathwayEditor: annotation technology transfer, at the database level. Custom annotation of interactions, compounds, diseases, metabolism in the framework of the internal annotation system at GeneGo; use the annotation forms, workflows and QC system developed at GeneGo; novel objects are imported and integrated with pre-existing data in MetaCore

  37. Adding Localizations Additional Localizations can be added

  38. Your NEW map is now an interactive part of MetaCore Users can visualize their experimental data on the new map

  39. Mapping interaction sets on networks Resulting Direct Interactions network Pink interactions are from the uploaded links file Mouse over an interaction to see the uploaded weight value Blue interactions are in both the links file and the MetaCore database

  40. Algorithms

  41. Old and new ways to analyze data
     • Current way of analysis: all significance calculations are done before mapping onto the network. Full data tables; statistical procedures, thresholds of fold and p-value, either in MC or 3rd party tools; sets of genes; connect them on the network one way or another (too many choices, no clear way to choose)
     • New way of analysis: significance calculations follow the mapping onto the network. Full data tables; apply to global network; statistical procedures in MC based on concurrent analysis of expression profiles and connectivity; sets of network modules

  42. Samples are analyzed in pathway’s expression space

  43. Network signatures for compound effects (panels: Mestranol, Phenobarbital, Tamoxifen, Phenobarbital)

  44. Finding topologically significant nodes (diagram nodes: A, B, C) • B is topologically significant: 4 out of 6 nodes regulated by B are differentially expressed, more than a random share • C is not topologically significant: only 1 out of 6 nodes regulated by C is differentially expressed, which could be due to a random event • In reality the algorithm also considers nodes beyond first-degree neighbors • Legend: differentially expressed genes; non-differentially expressed genes
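
The "more than a random share" argument on slide 44 can be made concrete with a simple tail probability. The sketch below assumes a binomial model restricted to first-degree neighbors and a hypothetical background rate of 10% differentially expressed genes; the slide states that the actual algorithm also looks beyond first-degree neighbors, so this is only an illustration of the reasoning, not the GeneGo method.

```python
from math import comb


def neighbor_significance(n_neighbors, n_de_neighbors, background_rate):
    """P(X >= n_de_neighbors) for X ~ Binomial(n_neighbors, background_rate):
    the chance of seeing at least this many differentially expressed
    neighbors if each neighbor were hit independently at the background rate."""
    return sum(
        comb(n_neighbors, k)
        * background_rate ** k
        * (1 - background_rate) ** (n_neighbors - k)
        for k in range(n_de_neighbors, n_neighbors + 1)
    )


BACKGROUND_RATE = 0.1  # assumed share of differentially expressed genes overall

p_node_b = neighbor_significance(6, 4, BACKGROUND_RATE)  # 4 of 6 neighbors hit
p_node_c = neighbor_significance(6, 1, BACKGROUND_RATE)  # 1 of 6 neighbors hit
print(p_node_b, p_node_c)
```

Under these assumptions node B's neighborhood is very unlikely to arise by chance (about 0.0013), whereas node C's is close to a coin flip (about 0.47), which matches the slide's qualitative conclusion.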

  45. Why is JAK1 significant in this dataset? Regulation via JAK1; feedback loops • JAK1 provides an essential network conduit between PLAUR and many differentially expressed targets of STAT1 • Topological significance helps to find important links in pathways that do not come up on HT screens

  46. Regulation of lipid metabolism • Topologically significant nodes revealed by the new algorithm • Differentially expressed genes identified by microarray and confirmed by proteomic screen

  47. Putting it all together: network activity inference • Identifying causal relations between putative input and output signals • Tracking effects of molecular perturbation through activation/inhibition cascades • Diagram labels: predicted input; scoring intermediary nodes; experimental data; experimental data: terminate cascade; predicted target; experimental data: start cascade; inferred activity

  48. Work in progress
     • Finding patterns of significance (based on one experiment): significant neighborhoods; significant receptors (by underlying cascade); significant transcription factors (by upstream cascade); significant interaction types (by distribution of expression at terminals)
     • Finding common and different pathway modules (based on multiple samples): looking for “differential pathways”, modules that distinguish one group of samples from another; finding common motifs in a group of pathway modules
     • Inferring patterns of network activity: identifying causal relations between putative input and output signals; tracking effects of molecular perturbation through activation/inhibition cascades
     • Looking into mutual gene-process information and Bayesian inference of significance: if gene G occurs only in process P, its up-/down-regulation is significant evidence for inferring P’s status; if gene G occurs in many other processes in addition to P, its up-/down-regulation is not significant evidence for inferring P’s status
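
The gene-process argument in the last item of slide 48 can be sketched with a very simple weighting: a gene's regulation counts as evidence for a process in inverse proportion to the number of processes the gene belongs to. This corresponds to the posterior P(process | gene) under a uniform prior over the gene's processes, which is an assumption made here for illustration; the slides describe this as work in progress and give no formula. Gene and process names are hypothetical.

```python
def process_evidence_weights(process_genes):
    """Map each (gene, process) pair to 1 / (number of processes containing the gene).

    A weight of 1.0 means the gene occurs in that process only, so its
    regulation points unambiguously at that process.
    """
    membership = {}
    for process, genes in process_genes.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(process)
    return {
        (gene, process): 1 / len(processes)
        for gene, processes in membership.items()
        for process in processes
    }


# Hypothetical annotation: G1 is specific to P, G2 is shared by three processes.
toy_annotation = {"P": {"G1", "G2"}, "Q": {"G2"}, "R": {"G2"}}
weights = process_evidence_weights(toy_annotation)
print(weights[("G1", "P")], weights[("G2", "P")])
```

Here a change in G1 carries full weight for process P, while a change in G2 is split three ways and says much less about P on its own.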

  49. Future products

  50. MetaMiner Consortiums for 2007 • Oncology (breast cancer, 4 other cancers) • Metabolic diseases (diabetes II, obesity, metabolic syndrome) • CNS and neurodegenerative diseases • Immunological and autoimmune diseases
