
Bioinformatics at NASA or Yes Virginia, NASA does do biology!



Presentation Transcript


  1. Bioinformatics at NASA, or: Yes Virginia, NASA does do biology! Maryland. Michael New, Astrobiology Discipline Scientist

  2. Bioinformatics at NASA? • Bioinformatics is used at NASA in several ways: • Fundamental Space Biology: How do organisms, including humans, adapt to the space environment? • Planetary Protection: What is the nature of the community of micro-organisms living in spacecraft assembly areas and on spacecraft? • Astrobiology: What can the genomes of life on Earth tell us about the origin, evolution, distribution and future of life on Earth and the potential for life elsewhere? Bioinformatics Technology Forum

  3. Fundamental Space Biology • How are molecular signals, pathways, and products in humans and model organisms (e.g., mice) altered by exposure to microgravity and space radiation factors? • How is drug metabolism affected by space-related effects? • Are there critical stages in development that are affected by altered gravity? • Why does the virulence of pathogens appear to increase in space?

  4. Small Sats + On-board Expression Measurements + Bioinformatics • 30 cm x 10 cm x 10 cm • How to make inferences?

  5. Making good inferences is the key • [Diagram labels: Experimental Data; Analysis Algorithm; New Knowledge; Background Knowledge; Previous Results] • Andrew Pohorille, Jeff Shrager and Steve Racunas. NASA Center for Astrobioinformatics, Karl Schweighofer

  6. An example: the Jnk Pathway • Hypothesis: Jnk1 activates c-Jun? • [Pathway diagram labels: External Stimuli: TGF-, TPA; Kinases: Jnk1, Jnk2; Transcription Factors: JunD, c-Jun, . . .; mRNA Transcripts: p19, IL-11, p53, nur77]

  7. Expression studies are inconclusive • p value: probability that the posterior of H, p(D|H), is just spurious (i.e., the same posterior is likely with random D when ¬H)

  8. Background knowledge makes a difference!
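Slides 7 and 8 argue that expression data alone can be inconclusive and that background knowledge changes the conclusion. One common way to express that idea is Bayes' rule, where background knowledge enters as a prior on the hypothesis. The sketch below is only an illustration: the function name and all numbers are hypothetical, and it is not the analysis system described in the talk.

```python
def posterior(prior_h, p_data_given_h, p_data_given_not_h):
    """Bayes' rule for a binary hypothesis H given observed data D."""
    # Total probability of the data under H and under not-H.
    evidence = p_data_given_h * prior_h + p_data_given_not_h * (1.0 - prior_h)
    return p_data_given_h * prior_h / evidence


# Hypothetical likelihoods: the data only weakly favour H.
likelihood_h = 0.6
likelihood_not_h = 0.4

# No background knowledge: a flat prior leaves the question open.
flat = posterior(0.5, likelihood_h, likelihood_not_h)

# Hypothetical background knowledge that makes H plausible beforehand.
informed = posterior(0.9, likelihood_h, likelihood_not_h)

print(f"flat prior:     {flat:.3f}")
print(f"informed prior: {informed:.3f}")
```

With the same weak data, the informed prior yields a much more decisive posterior than the flat one, which is the qualitative point of slide 8.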

  9. Need a system for evaluating biological models

  10. Planetary Protection • What organisms are present in and on spacecraft? • How can we assess the “bioburden” of spacecraft? • How can we ensure that no Terran life hitchhikes to a clement spot on another planet? • How can we assess the safety of returned samples?

  11. Assessing “crud” • What is the diversity of low-biomass samples taken from a spacecraft assembly clean room? • Comparing two new techniques: Affymetrix’s PhyloChip and 454 sequencing.

  12. Third Generation PhyloChip • Additional advancements • Smaller feature size -> no increase in chip cost. • Smaller sample volumes: decreased cost in reagents. • Improved analysis • More sophisticated fragmentation method • Refined analysis software • Improved validation approach. • Relatively inexpensive and suitable for repeated assays; less robust quantitation.

  13. 454 Sequencing: The Sogin Survey Method • In a single run, 454 technology can generate up to 200,000 independent sequence reads of ~100 bases each. • Comprehensively samples short variable rRNA regions. • First report on deep-sea diversity estimates 10-100 times more species than previously suspected (Sogin et al., PNAS 2006). • A few species are common; the vast majority are rare. • This method is easily adapted to spacecraft bioburden inventory. • Gives some estimate of quantity as well as phylogeny (454 Inc). • Method is expensive and requires large amounts of DNA. More suitable for infrequent assays of pooled samples.

  14. Family-level Comparisons • Overall, both methods showed high agreement of detection at the family level, but only when data from all temperature gradients were compiled. • Venn diagram counts: 31 (PhyloChip only), 65 (both methods), 22 (454 only). • 454 V6 Pyrosequencing: families detected: 87; detected exclusively by 454: 22. • G2 PhyloChip: families detected: 96; detected exclusively on PhyloChip: 31.
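The numbers on this slide can be cross-checked with simple arithmetic. The sketch below assumes that the three bare figures (31, 65, 22) are the regions of a two-set Venn diagram: families seen only on the PhyloChip, families seen by both methods, and families seen only by 454 pyrosequencing. That reading is an inference from the totals, not something the slide states explicitly; the Jaccard index is added here only as one possible summary of agreement.

```python
# Assumed Venn regions (interpretation inferred from the slide totals).
phylochip_only = 31
shared = 65
pyroseq_only = 22

# Totals implied by those regions.
phylochip_total = shared + phylochip_only
pyroseq_total = shared + pyroseq_only

# All families detected by at least one method.
union = phylochip_only + shared + pyroseq_only

# Jaccard index: shared families as a fraction of all detected families.
jaccard = shared / union

print(f"PhyloChip families: {phylochip_total}")
print(f"454 families:       {pyroseq_total}")
print(f"Detected by either: {union}")
print(f"Jaccard agreement:  {jaccard:.2f}")
```

Under this reading the implied totals match the 96 PhyloChip families and 87 pyrosequencing families reported on the slide.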

  15. Astrobiology: Life in a Universal Context • How does life begin and evolve? • What do the rock record and genomes tell us? • Does life exist elsewhere in the Universe? • Life as we know it? • “Weird” life? • How can either be detected? • What is the future for life on Earth and beyond?

  16. Three case studies • Development of a new tool to assess horizontal gene transfer (HGT): Peter Gogarten and Olga Zhaxybayeva. • Use of standard tools to look for independent “leaps to land”: Zoe Cardon, Louise Lewis, and Harry Frank. • Resurrecting ancient proteins: Steve Benner, et al.

  17. How can we assess the degree of HGT present on the early Earth? • A quartet is the smallest unit of phylogenetic information. • Each quartet can have three unrooted tree topologies. • Support for different quartet topologies can be summarized for all gene families.
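To make the quartet idea concrete, the sketch below lists the three unrooted topologies for four taxa and enumerates every quartet embedded in a larger taxon set. The function names and taxon labels are hypothetical, and this is not the authors' quartet decomposition software; it only illustrates the combinatorics.

```python
from itertools import combinations


def quartet_topologies(a, b, c, d):
    """Return the three unrooted topologies for four taxa.

    Each topology is written as the pair of sister groups separated
    by the internal branch, e.g. ((a, b), (c, d)) means ab|cd.
    """
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


def embedded_quartets(taxa):
    """Return every four-taxon subset of the given taxa."""
    return list(combinations(taxa, 4))


# Hypothetical taxon labels.
taxa = ["A", "B", "C", "D", "E"]

for quartet in embedded_quartets(taxa):
    for left, right in quartet_topologies(*quartet):
        print(f"{''.join(left)}|{''.join(right)}")
```

In a real analysis, each gene family would contribute support values for the three topologies of every quartet it covers, and those values would then be summarized across families.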

  18. Why use embedded quartets? • No assumption that all genes in a genome have the same phylogenetic history. • The total number of quartets is much smaller than the number of tree topologies, which makes it possible to evaluate all quartets. • Gene families present in only a few of the analyzed genomes can be included in the analyses. • Phylogenetic signal can be divided into the plurality consensus and the conflicting signal. • Allows us to partition the analyzed genomes according to some scenario (e.g., grouping by ecology) and retrieve gene families that support or conflict with it.
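The claim that quartets are far fewer than tree topologies can be checked directly. For n taxa there are C(n, 4) quartets, while the number of unrooted, strictly bifurcating trees is the double factorial (2n-5)!!. The sketch below compares the two; the function names are hypothetical, and n = 11 is used only because the talk's example analyzes 11 genomes.

```python
from math import comb


def num_quartets(n):
    """Number of four-taxon subsets of n taxa."""
    return comb(n, 4)


def num_unrooted_binary_trees(n):
    """Number of unrooted bifurcating tree topologies, (2n-5)!!, for n >= 3."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5.
    for k in range(3, 2 * n - 5 + 1, 2):
        count *= k
    return count


n = 11
print(f"quartets for {n} taxa:      {num_quartets(n):,}")
print(f"tree topologies for {n} taxa: {num_unrooted_binary_trees(n):,}")
```

For 11 taxa this gives 330 quartets against more than 34 million topologies, so exhaustively scoring every quartet is practical where exhaustively scoring every tree is not.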

  19. Example: Cyanobacteria & their Genes • Analyzed gene families in 11 sequenced cyanobacterial genomes using the newly developed quartet decomposition method. • Cyanobacterial genomes reveal a complex evolutionary history, which cannot be represented by a single strictly bifurcating tree for all genes or even most genes. • Across short phylogenetic distances, all types of genes appear to be equally affected by transfer. Across large phylogenetic distances, genes encoding metabolic functions are more frequently transferred, and genes involved in transcription and translation are less frequently transferred. • Olga Zhaxybayeva, J. Peter Gogarten, Robert L. Charlebois, W. Ford Doolittle and R. Thane Papke: "Phylogenetic Analyses Of Cyanobacterial Genomes: Quantification Of Horizontal Gene Transfer Events", Genome Research, 2006, 16:1099-1108.

  20. What traits were needed for “leap to land”? • Green Plants: 5 Major Green Algal Classes (sensu Mattox and Stewart, 1984; a recent revision divides Charophyceae into 6 classes); terrestrial green plants. • Numerous independent habitat transitions provide statistical power for detecting traits correlated with successful leaps from water to land. • [Figure labels: Chlorophyceae, Trebouxiophyceae, Ulvophyceae, Charophyceae, Prasinophyceae, with candidate transitions marked “?”. The famous leap to land (Embryophytes): N=1. Leaps of eukaryotic green algae from aquatic or marine habitats to land: N=?]

  21. Bioinformatics used to: • Infer evolutionary relationships among known aquatic and recently isolated desert algae using data from nucleotide sequences (large data sets, multiple genes) to estimate diversity and describe new species. • Estimate the number of transitions from aquatic to terrestrial habitats (Bayesian methods). To date, we estimate at least 40 evolutionarily independent transitions! • Test the correlation of source habitat type with traits that occur in our desert and related aquatic algae, using comparative statistical methods that take into account evolutionary relationships among taxa. • Lewis and Lewis 2005, Systematic Biology, 54: 936-947; Gray et al. 2007, Plant Cell and Environment, 30:1240-1255; Cardon et al. 2008, Bioscience, 58:114-122; Lewis, unpublished.

  22. Moving from single cells to multicellular animals • This seems hard to do from the perspective of molecular biology: • Change the goal of life from replicating cells as fast as possible (what bacteria do) to replicating cells under control, and then not at all (what you do). • The fossil record makes the transition seem sudden (but the fossil record may be missing many things). • We are not certain that the transition is not driven by planetary change, such as the emergence of abundant oxygen in the atmosphere. • Understanding how this transition took place on Earth helps NASA infer how likely it is to have taken place elsewhere, a key part of the Drake equation to estimate the likelihood of intelligent life elsewhere in the cosmos.

  23. Since fossils are no help, turn to genomes • Exhaustive matching supported models for protein sequence evolution • New tools to score amino acid replacements • Tools to extend the model that scores replacements • Tools to exploit homoplasy, compensatory covariation, and other non-Markovian behaviors in the evolution of real proteins diverging under functional constraints • Gonnet, G. H., Cohen, M. A., Benner, S. A. (1992) Exhaustive matching of the entire protein sequence database. Science 256, 1443-1445 • Sequencing of a choanoflagellate provides an outgroup, an animal diverging just before multicellularity emerges • King, N. et al. (2008) The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans. Nature 451, 783-788 • [Figure labels: Multicellularity emerges. What happened here in the genome?]

  24. So what happened? Many things • Steroid receptors emerged, together with oxygen-dependent proteins that make steroid hormones; key at many places in metazoan biology • Protein tyrosine-phosphorylating kinases emerged from serine kinases • Protein tyrosine phosphatases emerged (from an unknown source) • Kinase substrates emerged that were phosphorylated on tyrosines • SH2 domains that bind to phosphotyrosine emerged (unknown source) • And not just one example. Lots of them with correlated evolution. • JAK/STAT: JAK is a two-domain kinase. The domains are duplicates of a single domain; the duplication occurred in this episode. STAT is a family of substrates for JAK, also arising by duplication at the same time as the JAK domains duplicated.

  25. How do we know that the ancestral proteins were doing phosphorylation, being phosphorylated etc. at that time? • Bring the experimental method to bear on historical hypotheses, using biotech to resurrect genes and proteins having the inferred ancestral sequence and studying their behavior in the lab. • Consider the SH2 domains, which bind to phosphotyrosine, a new function emerging together with multicellularity. The SH2 domains are a large family having various binding specificities. Resurrection shows that the ancestral proteins bind as well, and shows their specificity. (Benner, et al., unpublished) • [Figure labels: Binds (Gln or Tyr)-Asn-Tyr; Binds (Ile or Val)-Asn-(Val or Pro); outgroup]
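Resurrecting a protein first requires inferring its ancestral sequence from living descendants. The talk does not say which inference method was used, so the sketch below swaps in a deliberately simple stand-in: the bottom-up pass of Fitch parsimony for a single alignment column. The tree, species names and residues are all hypothetical, and real reconstructions typically rely on probabilistic models rather than this toy rule.

```python
def fitch_root_states(tree, leaf_states):
    """Bottom-up Fitch pass for one alignment column.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    leaf_states: dict mapping leaf name -> observed residue.
    Returns the set of most-parsimonious candidate residues at the root.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}
    left, right = (fitch_root_states(child, leaf_states) for child in tree)
    common = left & right
    # Keep the intersection if the children agree on something,
    # otherwise fall back to the union (one substitution implied).
    return common if common else left | right


# Hypothetical four-species tree and residues at one site.
tree = (("sp1", "sp2"), ("sp3", "sp4"))
site = {"sp1": "Y", "sp2": "Y", "sp3": "F", "sp4": "Y"}

print(fitch_root_states(tree, site))
```

Repeating the inference column by column gives a candidate ancestral sequence, which could then be synthesized and assayed in the lab as the slide describes.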

  26. Acknowledgements • Andrew Pohorille (NASA ARC) • Jeff Shrager (Stanford) • Stephen Racunas (Stanford) • Karl Schweighofer (SETI Inst) • Catharine Conley (NASA ARC) • Mitch Sogin (MBL) • Kasthuri Venkataswaran (JPL) • Gary Andersen (LBL) • J. Peter Gogarten (U Conn) • Olga Zhaxybayeva (Dalhousie) • Zoe Cardon (MBL) • Louise Lewis (U Conn) • Frank Lewis (U Conn) • Steve Benner (FFAME) • Jason Raymond (UC Merced) • Rob Knight (CUB) • Eric Gaucher (GA Tech)

  27. Questions? Comments? Brickbats?
