
Topic Outline

Network Biology Data: Biological, Conceptual and Computational Issues around Network, System, and Pathway Data. The Abstract and the Concrete. Topic outline: lessons from the Genome Program and abstract ideas for transforming data into information when looking at systems data.

lazar

Presentation Transcript


  1. Network Biology Data: Biological, Conceptual and Computational Issues around Network, System, and Pathway Data. The Abstract and the Concrete

  2. Topic Outline • Lessons from the Genome Program and abstract ideas for transforming data into information when looking at systems data • Two examples of concrete tools (ready for use): WebGestalt (for large sets of genes) and Ingenuity (for networks) • A concrete thing: the Bioinformatics Resource Center (under development) • Other tools under development

  3. Human Genome Project (HGP): Past Lessons and Future Directions in Data… The genome-encoded “parts list” as data integrator: common data elements of genes and gene products (transcripts and proteins), enabling integration and comparison of data in NEW ways… Data types: genome data; individualized genotype data within populations; phenotype and system data. GeneKeyDB and related work serve as an integrative foundation that can help merge with other data.

  4. The HGP highlighted some ways to succeed or fail with large data sets. • Are the lessons learned applicable to systems biology data sets (expression, proteomics, genetics)? Yes. • But are some new approaches needed to understand SYSTEM data? Yes. (Diagram label: Genome Data)

  5. Biggest lesson: a biodata item has 2 questions attached to it (Mayr): How? Why? The HGP showed the importance of the “why” questions in thinking about and organizing data. (Diagram labels: a datum; genome data; other genotype, phenotype, system data)

  6. HGP results and future issues for new data… Genotype + Environment + DEVELOPMENT ==> Phenotype. 1) Astounding results: the importance of network thinking in development and physiology for data to explain phenotype (e.g. PAX6). 2) Some relevance from HGP data approaches, but… new bioinformatics tools are needed for network data and thinking…

  7. Δ data in cellular signaling networks; Δ data in regulatory networks; Δ data in protein coding

  8. A way of thinking about data… Bioinformatics: finding the difference (in genotypic or environmental data) that makes the difference (in phenotypic data). Many of the differences that make an interesting difference lie NOT in protein coding but in complex networks. (Diagram labels: Δ data in cellular signaling networks; Δ data in regulatory networks; Δ data in protein coding)

  9. What is a “Network” way of viewing data… A biological network can be expressed and manipulated in terms of “graph theory.” Combinatorial algorithms are needed to analyze graphs. • Nodes (vertices) may be: genes; gene products; hormones and signals; metabolites; publications; functional sequence elements • Edges (lines) may be: undirected vs. directed; weighted vs. unweighted • Networks could be: co-expression networks; gene regulatory networks; cell-cell communication and signal transduction networks; phylogenetic relationships among genes, species, and networks (orthology, paralogy, etc.; trees, clades, etc.); Gene Ontology or other directed acyclic graphs. References, e.g.: Alon U. 2003. Science 301: 1866; Barabasi AL. 2003. Linked. Plume Books; Barabasi AL, Oltvai ZN. 2004. Nat. Rev. Genetics 5: 101

  10. What is a “Network” way of viewing data… A biological network can be expressed and manipulated in terms of “graph theory.” Combinatorial algorithms are needed to analyze graphs. • Nodes (vertices) may be: genes; gene products; hormones and signals; metabolites; publications; functional sequence elements • Edges (lines) may be: undirected vs. directed; weighted vs. unweighted; experimental correlation (can be undirected) vs. mechanistic and directed. Tightly connected modules might be found… These might be loosely analogous to a protein sequence module that is conserved, duplicated, and diverged. Similarity might be seen across different tissues, species, etc. References, e.g.: Alon U. 2003. Science 301: 1866; Barabasi AL. 2003. Linked. Plume Books; Barabasi AL, Oltvai ZN. 2004. Nat. Rev. Genetics 5: 101
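To make the graph vocabulary of the two slides above concrete, here is a minimal sketch in Python. It assumes a weighted, undirected network given as an edge list, drops edges below a weight threshold, and reports connected components as a very crude stand-in for "tightly connected modules". The gene names, the weights, and the threshold are hypothetical, and real analyses typically rely on stronger combinatorial methods (clique finding, clustering) than this.

```python
def build_graph(weighted_edges, min_weight=0.0):
    """Build an undirected adjacency map, keeping edges with weight >= min_weight.

    Every node named in the edge list is kept, even if all its edges are dropped.
    """
    graph = {}
    for a, b, weight in weighted_edges:
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if weight >= min_weight:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_modules(graph):
    """Return the connected components of the graph as a list of node sets."""
    seen = set()
    modules = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # iterative depth-first search
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        modules.append(component)
    return modules


# Hypothetical co-expression edges: (node, node, correlation-like weight).
edges = [
    ("g1", "g2", 1.7),
    ("g2", "g3", 1.2),
    ("g3", "g4", 0.9),
    ("g5", "g6", 1.5),
]
modules = connected_modules(build_graph(edges, min_weight=1.0))
```

With the threshold at 1.0 the weak g3-g4 edge is dropped, so g4 ends up in a module of its own. A directed network would store only one direction per edge instead of adding both.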

  11. Collaborative, Integrative, and Comparative Bioinformatics: need to collaborate, integrate, and COMPARE to find differences in biological NETWORKS. (Diagram labels) Existing knowledge; large molecular data sets (microarray data, proteome, etc.); genetic data; phenotype data; WebQTL (Williams et al., UTHSC); MuTrack; GeneKeyDB; genotype and phenotype data sets; data storage and collaborative bioinformatics; integrative bioinformatics: gene-centered data integration (via GeneKeyDB, BioFoundation); data visualization and stats; comparative bioinformatics and data mining: comparative, Boolean, and other operations on gene sets and networks (WebGestalt and Ingenuity are two examples); network analysis (CS, stats, bio); graph algorithms; sequence and network modularity; network modules: duplicated, diverged, converged; comparative cladistic phylogenetic analysis.

  12. WebGestalt: Web-based Gene Set Analysis Toolkit. http://bioinfo.vanderbilt.edu/webgestalt (Bing Zhang)

  13. Can upload gene sets based on: 1) IDs (e.g. Affymetrix, LocusLink, protein IDs from chip, proteome, etc.) 2) Genome location, or… 3) Gene Ontology (common biological process, molecular function, cellular location)

  14. Manipulate data as a set of genes or gene products. RNA expression, proteome, genomics, statistical genetics, etc. all produce lists of genes that may function in a network.

  15. 1 of 3 things to do: Boolean operations on multiple sets, or retrieving orthologs.
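As an illustration of what Boolean operations on gene sets amount to, the sketch below uses plain Python sets. It is not WebGestalt's implementation; the gene identifiers, the three input lists, and the ortholog table are all hypothetical placeholders.

```python
# Hypothetical gene lists from three kinds of experiment.
microarray_hits = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
proteome_hits = {"GENE_C", "GENE_D", "GENE_E"}
qtl_interval = {"GENE_D", "GENE_E", "GENE_F"}

in_both = microarray_hits & proteome_hits                  # AND (intersection)
in_any = microarray_hits | proteome_hits | qtl_interval    # OR (union)
array_only = microarray_hits - proteome_hits               # AND NOT (difference)
in_all = microarray_hits & proteome_hits & qtl_interval    # supported by all three


def map_orthologs(genes, ortholog_table):
    """Translate genes through a lookup table, skipping genes with no known ortholog."""
    return {ortholog_table[g] for g in genes if g in ortholog_table}


# Hypothetical mouse-to-human ortholog table.
mouse_to_human = {"Gene_c": "GENE_C", "Gene_d": "GENE_D"}
human_equivalents = map_orthologs({"Gene_c", "Gene_d", "Gene_x"}, mouse_to_human)
```

Mapping a list through an ortholog table first makes sets from different species comparable with the same Boolean operators.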

  16. 2 of 3 things to do: retrieve data and other IDs.

  17. 3 of 3 things to do: find “unusual” properties across the set.

  18. e.g. Which GO categories (biological processes, molecular functions, and cellular locations) are in the set? Are there any that seem to occur more often than expected…
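One common way to ask whether a category occurs "more often than expected" is a one-sided hypergeometric test. The sketch below assumes that approach for illustration; the function name and all the counts are hypothetical, and it applies no correction for testing many categories at once.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, observed):
    """P(X >= observed) for a hypergeometric draw.

    population: number of genes in the reference set
    annotated:  reference genes carrying the category of interest
    sample:     number of genes in the uploaded set
    observed:   genes in the uploaded set carrying the category
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10,000 reference genes, 200 in the category,
# 100 genes uploaded, 8 of them in the category.
expected = 100 * 200 / 10_000  # 2 genes expected by chance
p_value = hypergeom_enrichment_p(10_000, 200, 100, 8)
```

Here 8 observed genes against 2 expected gives a small tail probability, which is the sense in which a category would be called over-represented in the set.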

  19. Co-occurrence of genes and publications (GRIF)

  20. Protein Domains in set

  21. Chromosome locations in set…

  22. Pathways in set (1)

  23. Pathways in set (2)

  24. Ingenuity • A commercial tool for manipulating graphs (networks). VU license: http://bioinfo.vanderbilt.edu/wiki/Ingenuity • (Also some open source tools: Cytoscape, GeNetViz, etc.)

  25. Pathways (3): use of the commercial tool Ingenuity by Dr. N. Deanne and Dr. Beauchamp

  26. Bioinformatics Resource Center • Developing a Bioinformatics Resource Center (BRC) that will consist of: • Training infrastructure and applied workshops: support faculty using existing tools and databases (caBIG, custom statistical packages, NCBI genomics, imaging, molecular structure resources) • Collaborative IT: establish accessible databases in shared cores and support faculty using these resources… • Integrative IT: web sites that integrate information from disparate data sets • Comparative IT: systems biology, comparing data across multiple platforms to identify new patterns (tissues and cells, molecular pathways, model organisms, toxins, etc.) (taken from the VUMC Strategic Plan).

  27. Other systems… • Construction projects that can be further shaped by your needs… • CollabCore and Lab Blogs • Genepedia • GeneKeyDB, BioFoundation • Extensions to WebGestalt • TFCAT, GeneCAT, CladeCAT, PAZAR

  28. Acknowledgments: Bing Zhang, Stefan Kirov, Leslie Galloway, Barbara Jackson, Betty Lou Alspaugh, Oakley Crawford, Suzanne Baktash, Xinxia Peng, Harold Shanafield, Sam Wang, Adam Tebbe, Shawn Ericson, Jeff Horner. A few collaborators… Bonnie LaFleur, Shawn Levy, Phil Dexheimer, Michael Langston (CS collaborator), Wyeth Wasserman, Dan Goldowitz and the TMGC, Rob Williams et al. (WebQTL, etc.), Erich Baker, Dan Beauchamp, Natasha Deanne, Chad Johnson
