
Identifying functional subnetworks in large-scale datasets

Benno Schwikowski, Institut Pasteur – Systems Biology Group, http://systemsbiology.fr


Presentation Transcript


  1. Identifying functional subnetworks in large-scale datasets
     Benno Schwikowski, Institut Pasteur – Systems Biology Group
     http://systemsbiology.fr

  2. The three levels of this talk
     • Discovery of pathways active in HepC infection
     • Cytoscape plug-ins
     • Cytoscape platform

  3. Hepatitis C infection
     • One person out of 30 is infected
     • No vaccine exists
     • Liver fibrosis and cirrhosis develop in 20% of chronic infections
     • Frequently requires liver transplants

  4. Studying HepC infection mRNA changes
     • 50% of transplant livers become re-infected with Hepatitis C
     • Study expression of 7000 genes in re-infected livers after transplantation
     • 1-24 months post-transplant
     • Samples in 3-6 month intervals
     • 28 biopsies from 11 patients
     • Mixture of hepatocytes, hepatic stellate cells, Kupffer cells, various types of blood cells
     • Compare against pre-transplant reference pool

  5. Result of mRNA expression analysis
     • Most genes (5968 of 7000) were significantly under- or overexpressed in one or more experiments
     • High patient-to-patient variation

  6. Our approach
     • Construct seed network among known molecular players
     • Expand seed network to include differentially expressed genes
     • Identify putative pathways by the Active Modules approach

  7. Seed network
     Types of interactions: protein-protein, protein-DNA, phosphorylation, activation, repression, covalent bond, methylation

  8. InteractionFetcher plug-in
     Purpose
     • Dynamically retrieves remote information for selected nodes
     • From SQL database
     • Requests data via XML-RPC protocol
     Currently implemented types
     • Protein/gene synonyms
     • Orthologs
     • Sequences (DNA, protein, DNA upstream)
     • Gene, protein,
     • Interactions/associations
     Options
     • Cross-species queries
     • Ortholog information from Homologene
     • Inferred interactions (interologs)
     • Interactive links to source Web pages
     100% open-source (client and server)
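To make the XML-RPC step on the InteractionFetcher slide concrete, here is a minimal sketch of what such a request body could look like. The method name `getInteractions`, the option keys, and the argument layout are hypothetical; the talk does not document the actual server API, so nothing below should be read as the plug-in's real interface. The sketch only serializes and decodes a call locally with Python's standard `xmlrpc.client` module and never contacts a server.

```python
import xmlrpc.client

# Hypothetical method name and argument layout (assumption, not the real API).
METHOD = "getInteractions"
selected_nodes = ["YDR216W", "YIL056W"]
options = {"cross_species": True, "include_interologs": True}

# Serialize the call the way an XML-RPC client would place it in the HTTP body.
payload = xmlrpc.client.dumps((selected_nodes, options), methodname=METHOD)

# Decode it again, as a server would, to confirm the round trip.
params, method = xmlrpc.client.loads(payload)
print(method)
print(params)
```

A real client would typically wrap this in `xmlrpc.client.ServerProxy(url)`, with the URL of whatever server is actually deployed.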

  9. 2. Expand seed network
     Purpose
     • Bring significantly up-/downregulated genes “into the picture”
     Approach
     • Add interactions with differentially expressed genes (“in silico pull-down”)
     • Use BIND, HPRD databases
     • Only human-curated interactions
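The expansion step can be sketched as a simple filter over a curated interaction list. This is an illustrative reading, assuming a one-step expansion in which an interaction is added only when it links a seed node to a differentially expressed gene; the function name and the placeholder gene identifiers are made up for the example and do not come from the talk.

```python
def expand_seed_network(seed_nodes, curated_interactions, differential_genes):
    """Return (nodes, added_edges) after a one-step 'in silico pull-down'."""
    seed = set(seed_nodes)
    de = set(differential_genes)
    added_edges = []
    for a, b in curated_interactions:
        # Keep an edge only if it links a seed node to a differentially expressed gene.
        if (a in seed and b in de) or (b in seed and a in de):
            added_edges.append((a, b))
    nodes = seed | {n for edge in added_edges for n in edge}
    return nodes, added_edges


# Toy data with placeholder identifiers (not real genes or interactions).
seed = {"SEED1", "SEED2"}
interactions = [
    ("SEED1", "GENE_A"),
    ("SEED2", "GENE_B"),
    ("GENE_A", "GENE_C"),  # no seed endpoint, so not added
    ("SEED1", "GENE_D"),   # GENE_D is not differentially expressed
]
de_genes = {"GENE_A", "GENE_B", "GENE_C"}

nodes, edges = expand_seed_network(seed, interactions, de_genes)
print(sorted(nodes))
print(edges)
```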

  10. Network after InteractionFetcher expansion

  11. Identifying putative pathways: why clustering can be problematic
      • Many clustering methods are not model-based → significance of clusters is unclear
      • Any given cluster may not be supported by all experiments – noise problem
      • Clusters tend to contain unrelated genes with vaguely similar profiles

  12. The three levels of this talk
      • Discovery of pathways active in HepC infection
      • Cytoscape plug-ins
      • Cytoscape platform

  13. How can the clustering issues be addressed? The ActiveModules plug-in
      • Define “up-/downregulated” on the basis of a well-defined statistical model
      • Also derive clusters from some of the input experiments
      • Use additional evidence to focus on “plausible” clusters → protein interactions

  14. Interaction networks
      Schwikowski, Uetz, Fields, Nature Biotechnology (2000)

  15. Modular organization of interaction networks

  16. A lot of interaction data is becoming available
      Databases on...
      • Protein-protein interactions
      • Protein-DNA interactions
      • Genetic interactions
      • Metabolic pathways
      • Cell signaling pathways, similarity relationships, literature-based relationships

  17. Multi-criteria detection of modules
      1. Interaction network between genes/proteins
      2. Differential gene/protein abundances/activities (figure: matrix of genes × experiments)

  18. Scoring a module candidate
      • $P_z = 1 - \Phi(z_{A(j)})$
      • Rank adjustment: binomial summation
      • $r_{A(j)} = \Phi^{-1}(1 - p_{A(j)})$
      • $m$ = total number of conditions
      • $j$ = size of subset of conditions
      Figure labels: Perturbations/conditions; Final score
      Ideker, Ozier, Schwikowski, Siegel (2002): Bioinformatics 18, S233-240
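The scoring slide only shows fragments of the formulas, so the following is a hedged sketch of how they might fit together, not the authors' implementation. Two pieces are assumptions that do not appear on the slide: the per-condition module score is taken as the sum of per-gene z-scores divided by $\sqrt{k}$ for a module of $k$ genes, and the "binomial summation" is taken as the tail probability $p_{A(j)} = \sum_{h=j}^{m} \binom{m}{h} P_z^h (1 - P_z)^{m-h}$. The function names are made up for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()
_EPS = 1e-12  # keeps inv_cdf arguments strictly inside (0, 1)


def aggregate_z(p_values):
    """Combine per-gene p-values of one condition into one module z-score (assumed form)."""
    z = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    return sum(z) / sqrt(len(z))


def rank_adjusted_scores(z_per_condition):
    """Return r_A(j) for j = 1..m, given the module z-scores of m conditions."""
    m = len(z_per_condition)
    ranked = sorted(z_per_condition, reverse=True)  # z_A(1) >= ... >= z_A(m)
    scores = []
    for j, z in enumerate(ranked, start=1):
        p_z = 1.0 - _STD_NORMAL.cdf(z)
        # Binomial tail: chance that at least j of m conditions reach this z-score.
        p_a = sum(comb(m, h) * p_z**h * (1.0 - p_z) ** (m - h) for h in range(j, m + 1))
        q = min(max(1.0 - p_a, _EPS), 1.0 - _EPS)
        scores.append(_STD_NORMAL.inv_cdf(q))
    return scores


# Toy per-gene p-values for one candidate module (3 genes) under 3 conditions.
p_by_condition = [
    [0.01, 0.03, 0.20],
    [0.40, 0.55, 0.30],
    [0.02, 0.01, 0.05],
]
z_scores = [aggregate_z(p) for p in p_by_condition]
r_scores = rank_adjusted_scores(z_scores)
print(max(r_scores))  # best score over all subset sizes j
```

Taking the maximum over $j$ lets a module score well even when only a subset of the conditions supports it, which matches the concern raised earlier about clusters not being supported by all experiments.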

  19. Pathways in Rosetta’s compendium (300 conditions)

  20. The three levels of this talk
      • Discovery of pathways active in HepC infection
      • Cytoscape plug-ins
      • Cytoscape platform

  21. Active Modules plug-in applied to HCV re-infection data
      • Iterative application results in four significant highly overlapping subnetworks
      • Repeat analysis only retaining “late-active” re-infection experiments
      • Eliminates pathways activated by transplant operation
      • Cutoff: 8 months

  22. Which observations can we make locally?
      Network after InteractionFetcher expansion
      Bold: differentially regulated subnetwork
      Red/Green: late-active subnetwork

  23. Cytotalk plug-in
      • Analysis, using the Cytotalk plug-in and R, of overrepresentation of genes in Gene Ontology classes
      • Cytotalk enables interactive communication with
        • C/C++ programs
        • Java processes
        • Python
        • UNIX shell scripts
        • R, R scripts
      • Can be run on same machine or any other Internet-connected machine
      • Can function as Cytoscape plug-in
      • 100% open-source
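The slide does not say which statistical test was run in R for the Gene Ontology overrepresentation analysis. A common choice is the one-sided hypergeometric test, sketched here in plain Python as a stand-in, not as the analysis actually used in the talk. The function name and all counts in the example are made up.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn from `population` genes,
    `annotated` of which belong to the Gene Ontology class of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Made-up counts: 7000 measured genes, 100 in the GO class,
# a 50-gene subnetwork containing 5 genes of that class.
p_value = hypergeom_enrichment_p(population=7000, annotated=100, sample=50, hits=5)
print(p_value)
```

In practice the p-values for many GO classes would also need a multiple-testing correction, which is omitted here.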

  24. The three levels of this talk
      • Discovery of pathways active in HepC infection
      • Cytoscape plug-ins
      • Cytoscape platform

  25. Some Network Visualization Tools
      • Pajek - Slovenia
      • Osprey - SLRI, Toronto
      • VisANT - BU
      • Biolayout - EBI
      • GraphViz
      • PowerPoint
      • Others
      • Cytoscape (only open-source biology)

  26. Cytoscape

  27. Cytoscape Basic Concepts
      • Objects visualized as nodes
      • Relationships visualized as edges
      • Attributes (name, sequence, source,...)
      • Mapping attributes → drawing customizable through visual mapper

  28. Cytoscape file formats
      Sample interaction file
          YDR216W pd YIL056W
          YDR216W pd YKR042W
          YDR216W pd YGL096W
          YDR216W pd YDR077W
          [...]
      Sample expression file
          GENE   DESC  exp0.sig  exp1.sig  exp0.sig  exp1.sig
          GENE0  G0    0.0       0.0       23.2      11.5
          GENE1  G1    0.0       0.0       34.6      5.2
          GENE2  G2    0.0       0.0       10.0      28.0
          GENE3  G3    0.0       0.0       1.64      4.77
          [...]
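The sample interaction file is whitespace-delimited, with a source node, an interaction type, and one or more target nodes per line. A minimal parser for that layout could look like the sketch below; `parse_sif` is an illustrative name, not a Cytoscape function, and real files may need extra handling (for example tab-delimited names that contain spaces).

```python
def parse_sif(text):
    """Parse simple interaction lines into (source, interaction, target) triples."""
    edges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank lines and lines that only name an isolated node
        source, interaction, *targets = fields
        edges.extend((source, interaction, target) for target in targets)
    return edges


sample = """\
YDR216W pd YIL056W
YDR216W pd YKR042W
YDR216W pd YGL096W
YDR216W pd YDR077W
"""

edges = parse_sif(sample)
for edge in edges:
    print(edge)
```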

  29. Cytoscape
      • Display
        • gene & protein expression
        • protein interactions (physical and non-physical)
        • protein classifications
      • Analysis plug-in modules
      • http://www.cytoscape.org/
      • Java: platform independent + web-start
      • 100% open-source

  30. Visual Styles: display gene expression as clear text

  31. Visual Styles: map expression values to node colors using a continuous mapper

  32. Visual Styles: expression data mapped to node colors
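The idea behind a continuous mapper can be illustrated with linear interpolation between two colors. This is only a sketch of the concept: the function name, the green-to-red default, and the clamping behavior are assumptions for the example and do not describe Cytoscape's actual visual mapper code.

```python
def continuous_color(value, low, high, low_rgb=(0, 255, 0), high_rgb=(255, 0, 0)):
    """Linearly interpolate an RGB color for `value`, clamped to [low, high]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    clamped = min(max(value, low), high)
    t = (clamped - low) / (high - low)  # 0.0 at low, 1.0 at high
    return tuple(round(a + t * (b - a)) for a, b in zip(low_rgb, high_rgb))


# Toy expression values mapped onto a green (low) to red (high) scale.
for ratio in (-2.0, -1.0, 0.5, 2.0, 5.0):
    print(ratio, continuous_color(ratio, low=-2.0, high=2.0))
```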

  33. Multidimensional attributes
      Cytoscape, pre-release plug-in
      Data from Ideker et al., Science (2001)

  34. Layout
      • 16 algorithms available through plug-ins
      • Zooming, hide/show, alignment

  35. yFiles Circular

  37. Cytoscape Core – Differences to most other approaches
      • Emphasis on data analysis & integration
      • No built-in semantics (added by plug-ins)
      • Very simple concepts
      • Human-readable input formats
      • Extensibility

  38. Cytoscape extensibility
      • Core: 100% open source Java
      • Plug-in API
      • Plug-ins are independently licensed
      • “Just need to do the biology”
      • Template code samples

  39. Biomodules plug-in
      Prinz S, Avila-Campillo I, Aldridge C, Srinivasan A, Dimitrov K, Siegel AF, and Galitski T. Genome Res. 2004 14: 380-390

  40. Cytoscape Plugins
      • Modules in Complex Networks: Iliana Avila-Campillo, Tim Galitski
      • Discovering Regulatory and Signaling Circuits in Molecular Interaction Networks: Trey Ideker, Owen Ozier, Benno Schwikowski, Andrew Siegel
      • Data Integration in Juvenile Diabetes Research: Marta Janer, Paul Shannon
      • A network motif sampler: David Reiss, Benno Schwikowski

  41. Cytoscape Core Features
      • Visualize and lay out networks
      • Display network data using visual styles
      • Easily organize multiple networks
      • Bird’s eye view navigation of large networks
      • Supports SIF and GML, molecular profiling formats, node/edge attributes
      • Functional annotation from GO + KEGG
      • Metanode support (hierarchical groupings)
      • Extensible through plugins (20 developed)

  42. Baliga et al., Genome Research, June 2004

  43. Collaborators: HCV
      Institute for Systems Biology, Seattle, WA
      • David Reiss
      • Iliana Avila-Campillo
      • Vesteinn Thorsson
      • Tim Galitski

  45. Collaborators: Cytoscape
      • ISB: Leroy Hood, Rowan Christmas
      • UCSD: Trey Ideker, Chris Workman
      • Memorial Sloan-Kettering Cancer Center: Chris Sander, Gary Bader, Ethan Cerami
      • Pasteur: Melissa Cline, Andrea Splendiani, Tero Aittokallio
      • Agilent Technologies
      • Unilever PLC
      Long-term funding from NIH and participating institutions

  46. Shannon, P., et al. (2003). Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res 13, 2498-504.

  47. Collaborators: Active Networks
      • Trey Ideker
      • Owen Ozier
      • Andrew Siegel
      • Richard Karp

  48. Levels of Biological Information: DNA, mRNA, Protein, Pathways, Networks, Cells, Tissues, Organs, Individuals, Populations, Ecologies
