
Share, Search, Merge, Check, Design: e.g. 3D & Sequence alignment

Presentation Transcript


  1. gggatttagctcagttgggagagcgccagactgaa gat ttg gag gtcctgtgttcgatccacagaattcgcacca Share, Search, Merge, Check, Design: e.g. 3D & Sequence alignment

  2. Harvard-MIT GtL Center Goals: (1) Protein Complexes: Mass Spectrometry, multi-species time-series & crosslinking. (2) Regulatory Networks: RNA array quantitation. (3) Microbial Communities, Biofilms: Polonies*, tagged-strain competition, single-cell activities. (4) Computational Modeling: Metabolic Optimization & 4D Cell modeling* (Workshop B*)

  3. CO2 100 ppmv increase http://jan.ucc.nau.edu/~doetqp-p/courses/env470/Lectures/lec41/Lec41.htm

  4. Energy & CO2 Fluxes. 4×10^13 kW of sunlight hits earth per year. We consume 2 kW per person × 6×10^9 people ≈ 10^10 kW. CO2 >370 ppm = 730×10^15 g globally, increase ~3×10^15 g/yr. Ocean productivity = ~100×10^15 g/yr. Autotrophs: 10^25 Prochlorococcus cells globally (10^8 per liter). Undone by Cyanophages & Heterotrophs: 2×10^28 SAR11 cells in the oceans; Pseudomonas & Caulobacter in a variety of soils & aquatic environments. http://www.gsfc.nasa.gov/gsfc/service/gallery/fact_sheets/earthsci/terra/earths_energy_balance.htm http://clear.eawag.ch/models/optionenE.html Morris et al. Nature 2002 Dec 19-26;420(6917):806-10. http://hosting.uaa.alaska.edu/mhines/biol468/pages/carbon.html
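
The arithmetic on this slide can be checked in a few lines. This is only a sketch that reuses the slide's own figures and units; the variable names are illustrative and not from the presentation.

```python
# Figures quoted on the slide, in the units given there.
SUNLIGHT_KW = 4e13             # sunlight reaching Earth
PER_PERSON_KW = 2.0            # average consumption per person
POPULATION = 6e9               # people
CO2_STOCK_G = 730e15           # atmospheric CO2 at >370 ppm
CO2_INCREASE_G_PER_YR = 3e15   # annual increase

# Human consumption: 2 kW x 6e9 people = 1.2e10 kW, i.e. roughly 10^10 kW.
human_use_kw = PER_PERSON_KW * POPULATION

# Share of incoming sunlight that this consumption represents.
share_of_sunlight = human_use_kw / SUNLIGHT_KW

# Relative growth of the atmospheric CO2 stock per year.
co2_growth_per_year = CO2_INCREASE_G_PER_YR / CO2_STOCK_G

print(f"human use: {human_use_kw:.1e} kW")
print(f"share of sunlight: {share_of_sunlight:.2%}")
print(f"CO2 stock growth: {co2_growth_per_year:.2%} per year")
```

On these numbers, human consumption is a small fraction of a percent of incoming sunlight, and the CO2 stock grows by under half a percent per year.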

  5. Harvard-MIT DOE GtL Center. C. Ting. Collaborating PIs: Chisholm, Polz, Church, Kolter, Ausubel, Lory, Laub, Kucherlapati

  6. Biosystems Integrating Measures & Models Environment Metabolites RNAi Insertions SNPs DNA Proteins RNA Replication rate interactions Microbes Cancer & stem cells Darwinian optima In vitro replication Small multicellular organisms

  7. Comparison of predicted with observed protein properties (abundance, localization, postsynthetic modifications), E. coli. Link et al. 1997 Electrophoresis 18:1259-313

  8. In vivo crosslinking DNA-binding proteins

  9. Multidimensional peptide measures (Optionally protein separation steps) 3rd 2nd

  10. Prochlorococcus Proteogenomic Map. Numbers on top are in base pairs. 1700 ORFs are predicted. The proteomic model is based on mass spectrometry of peptides at 24 h time points. The difference map indicates new peptide regions. The 6 colors represent ORFs in the 6 reading frames. (Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)

  11. Circadian time-series (Prochlorococcus) RNA & protein quantitation. [Figure: scatter plots with axes labelled RNA (3 AM); R^2 = .992, R^2 = .635; linear regression R^2 = .1] (Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)

  12. Goals 1 & 2: RNAs & Proteins. Next steps: (1) Detect a higher fraction of peptides (currently ~80% proteins, 87% peptides max, 19% average). (2) Comparison of two Prochlorococcus isolates (1700 vs 2500 genes, high vs low light adapted). (3) Move from two time points to smooth series.

  13. Biosystems Integrating Measures & Models Environment Metabolites RNAi Insertions SNPs DNA Proteins RNA Replication rate interactions Microbes Cancer & stem cells Darwinian optima In vitro replication Small multicellular organisms

  14. Why do we model cells? • Tests of understanding • Program minimal cells (100 kbp) • Nanobiotechnology: new polymers • Manage complex systems, e.g. stem cells & ocean ecology

  15. Suboptimality of mutants: integrating growth rate & flux data. Minimization of Metabolic Adjustment (MoMA) for the analysis of non-optimal metabolic phenotypes. Daniel Segre, Dennis Vitkup

  16. MoMA/FBA REFERENCES - Haemophilus influenzae metabolism (Schilling and Palsson, J. Theor. Biol. 2000) - Escherichia coli metabolic network and gene deletions (Edwards and Palsson, PNAS 2000, BMC Bioinf. 2000) - Helicobacter pylori (Edwards, Schilling, Covert, Church, Palsson, J. Bact. 2002) - Escherichia coli MOMA (Segre, Vitkup, & Church, PNAS 2003)

  17. Fluxes include transport & a growth flux. [Diagram labels: V_trans (membrane), V_syn, V_deg, V_growth, X_i.] X_i = const. ⇒ Σ v_j = 0. Growth: c_1 X_1 + c_2 X_2 + ... + c_m X_m → Biomass

  18. Biomass Composition. [Figure: coefficient in growth reaction plotted per metabolite; labelled metabolites: ATP, GLY, LEU, ACCOA, NADH, FAD, SUCCOA, COA]

  19. Flux Balance Analysis core: find max{Growth} using simplex. Null(S) = {v : Sv = 0}
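
The optimization stated on this slide, maximizing growth over the fluxes v that satisfy Sv = 0, can be sketched on a toy network. Everything here is an illustrative assumption rather than material from the presentation: the four-reaction network, its bounds, and the helper name `fba`. The sketch relies on SciPy's `linprog` with its default solver instead of a hand-written simplex.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the slides).
# Columns: uptake (-> A), high-yield route (A -> B),
#          low-yield route (A -> 0.5 B), growth (B -> biomass).
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # steady state for metabolite A
    [0.0,  1.0,  0.5, -1.0],   # steady state for metabolite B
])
wild_type_bounds = [(0, 10), (0, 4), (0, None), (0, None)]
GROWTH = 3  # index of the growth flux


def fba(S, bounds, growth_index):
    """Maximize the growth flux subject to S v = 0 and flux bounds."""
    c = np.zeros(S.shape[1])
    c[growth_index] = -1.0  # linprog minimizes, so negate the objective
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


v_wt = fba(S, wild_type_bounds, GROWTH)

# Knock out the high-yield route by pinning its flux to zero.
knockout_bounds = [(0, 10), (0, 0), (0, None), (0, None)]
v_ko = fba(S, knockout_bounds, GROWTH)

print("wild-type fluxes:", np.round(v_wt, 3))
print("knockout fluxes: ", np.round(v_ko, 3))
```

In this toy case the wild type saturates the high-yield route and sends the remaining uptake through the low-yield route; the knockout optimum reroutes all uptake through the low-yield route and grows more slowly.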

  20. Can we use flux analysis to say something about suboptimal states ?

  21. Flux ratios at each branch point yield the optimal polymer composition for replication. x, y are two of the 100s of flux dimensions.

  22. Projection can leave the mutant feasible space, so quadratic programming (QP) is used to find the nearest point

  23. 12C/13C flux ratio data

  24. Flux data, C009-limited. [Figure: three scatter plots of predicted vs. experimental fluxes, points numbered 1-18. WT (LP): r=0.91, p=8e-8. Δpyk (LP) and Δpyk (QP) panels: r=0.56, p=7e-3 and r=-0.06, p=6e-1.]

  25. Flux data (MOMA & FBA)

  26. Competitive growth data on minimal media. [Table residue: selection effect categories "negative" and "small"; χ² p-values 4×10^-3 and 1×10^-5.] Novel redundancies. Position effects.

  27. Replication rate of a whole-genome set of mutants Badarinarayana, et al. (2001) Nature Biotech.19: 1060

  28. Replication rate challenge met: multiple homologous domains. [Figure: domain maps with probes, labelled "Selective disadvantage in minimal media": lysC (domains 1, 2): 10.4; thrA (domains 1, 2, 3): 1.1, 6.7; metL (domains 1, 2, 3): 1.8, 1.8]

  29. Multiple mutations per gene Correlation between two selection experiments Badarinarayana, et al. (2001) Nature Biotech.19: 1060

  30. Goals 3 & 4: Populations and models. Next steps: • Generate MOMA models for autotrophs • Comparison of models for multiple Prochlorococcus & Pseudomonas genomes • Insertion & point mutant competitions for hard-to-grow species (e.g. Prochlorococcus, 24 hr doubling).

  31. Harvard-MIT GtL Center Goals: (1) Protein Complexes: Mass Spectrometry, multi-species time-series & crosslinking. (2) Regulatory Networks: RNA array quantitation. (3) Microbial Communities, Biofilms: Polonies*, tagged-strain competition, single-cell activities. (4) Computational Modeling: Metabolic Optimization & 4D Cell modeling

  32. Biosystems Integrating Measures & Models MOMA Darwinian (sub)optima Polonies (CD44 & cancer) Arrays&Mass-spec (circadian & cell cycle) Environment Metabolites DNA Proteins RNA interactions Microbes Cancer & stem cells In vitro replication multicellular organisms

  33. GtL Workshop B: Experimental Technology Development and Integration. Tue at 2 PM. Co-Chairs: George Church, Harvard Medical School; Ham Smith, Institute for Biological Energy Alternatives. As we attempt to understand, protect, and/or engineer environmental microbial communities, we need to ask what sorts of data would most benefit our models and how to obtain these cost-effectively. For this session, let us answer what small (or large) technological steps we are taking toward these specific challenges: (1) microscopic methods capable of tracing the chain of a small genome, (2) quantitation of “all” peptide states (either in single cells or populations), (3) sequencing at Mbp per $, and (4) automated designed genome engineering. The framework for the discussions will be the following questions: · What are the most useful technologies for our tasks/goals now and for the future? What are the major technological gaps that will need to be addressed to reach the GTL goals? To what extent will the technologies be developed by others? · How can technologies best be used to complement each other and strengthen the resulting research/insights? How do we promote synergistic interactions among the practitioners? Presentations by Joachim Frank (Wadsworth Center, New York State Department of Health) on cryo-electron microscopy; Bob Hettich or Greg Hurst (ORNL) and Dick Smith (PNNL) on mass spectrometry; Hoi-Ying Holman (Berkeley Lab) on FTIR imaging; Steve Colson (PNNL) on optical imaging. We would like to invite you to bring one viewgraph to share with the participants on your views about technologies needed to meet these challenges.

  34. Biosystems Integrating Measures & Models Environment Metabolites RNAi Insertions SNPs DNA Proteins RNA Replication rate interactions Microbes Cancer & stem cells Darwinian optima In vitro replication Small multicellular organisms

  35. Improving Models & Measures Why model? “Killer Applications”: Share, Search, Merge, Check, Design

  36. Why improve measurements? Human genomes: (6 billion)^2 = 10^19 bp. Immune & cancer genome changes: >10^10 bp per time point. RNA ends & splicing: in situ 10^12 bits/mm^3. Biodiversity: environmental & lab evolution. Compact storage: 10^5 now to 10^17 bits/mm^3 eventually. & How? ($1K per genome, 10^8-10^13 bits/$) • The issue is not speed, but integration. • Cost per 99.99% bp: including reagents, personnel, equipment/5 yr, overhead/sq. m • Sub-mm scale: 1 µm = femtoliter (10^-15) • Instruments $2-50K per CPU

  37. Projected costs determine when biosystems data overdetermination is feasible. In 1984, pre-HGP (φX, pBR322, etc.): 0.1 bp/$, which would have been $30B per human genome. In 2002 (de novo full vs. resequencing), ABI/Perlegen/Lynx: $300M vs. $3M, 10^3 bp/$ (4 log improvement). Other data I/O (e.g. video): 10^13 bits/$
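
The cost figures on this slide follow from dividing genome size by throughput per dollar. The sketch below assumes a genome of 3×10^9 bp, which is the size implied by the slide's $30B figure at 0.1 bp/$; the function and variable names are illustrative.

```python
import math

GENOME_BP = 3e9  # assumed genome size, implied by $30B at 0.1 bp/$


def cost_per_genome(bp_per_dollar: float) -> float:
    """Dollars needed to sequence GENOME_BP at the given throughput per dollar."""
    return GENOME_BP / bp_per_dollar


cost_1984 = cost_per_genome(0.1)        # pre-HGP rate
cost_2002_reseq = cost_per_genome(1e3)  # 2002 resequencing rate

# Improvement between the two rates, in orders of magnitude.
log_improvement = math.log10(1e3 / 0.1)

print(f"1984: ${cost_1984:,.0f}")
print(f"2002 resequencing: ${cost_2002_reseq:,.0f}")
print(f"improvement: {log_improvement:.0f} logs")
```

The two rates reproduce the slide's $30B and $3M figures, and their ratio is the quoted 4 log improvement.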

  38. Why single molecules? Integration from cells/genomes/RNAs to data Geometric constraints : Who’s “in cis” on a molecule, complex, or cell. e.g. DNA Haplotypes & RNA splice-forms

  39. Polymerase colonies (Polonies) along a DNA or RNA molecule

  40. Polymerase colony (polony) PCR in a gel. [Figure labels: single molecule from library; primers A, A', B; primer is extended by polymerase; primer A has 5’ immobilizing Acrydite; 1st round of PCR.] Mitra & Church Nucleic Acids Res. 27: e34

  41. Sequence polonies by sequential, fluorescent single-base extensions: • Hybridize universal primer • Add red (Cy3) dTTP; wash • Add green (FITC) dCTP • Wash; scan. [Figure labels: primers B, B’; strand ends 3’, 5’; bases C G A T C G C G T . . .]

  42. $1K per diploid human sequence. Input: buccal cells, blood, or forensic samples. Output: prioritized list of deviant bps (e.g. non-conservative). Raw data rate: 16 pixels/bp, 1 Mpixel per 6 sec/CPU = 24 CPU days. Amortization: 5 yr for camera/CPU/transport @ $50K total = $200 per 10^11 bp. Overhead: $200/sq ft/yr × 40 sq ft (400 cu ft) = $40. Reagents: 20 µm per (5 µm) polony and 40 bp reads means 10000 cm^2 area; 800 ml of fluor dNTP at $100/mg = $40; 5 ml PCR reactions = $200. Disposables: 500 slides = $50. Electricity: 2 kW × 24 hr × 24 days × $0.13/kWh = $150. Labor for repair: 10% of instrument cost = $10. Labor for operation: slide PCR, slide dips, scans, etc. = $20. R&D: initially NIH grants (roughly 10%).
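
Two of the slide's figures can be reproduced from its own inputs, and the listed line items can be summed. The sketch rests on two assumptions: the polony spacing is read as 20 µm on a square grid, and the total of 10^11 bp is taken from the amortization line. The dollar amounts are copied from the slide, not derived.

```python
READ_LENGTH_BP = 40
TOTAL_BP = 1e11            # throughput quoted in the amortization line
POLONY_PITCH_CM = 20e-4    # assumed 20 micrometre spacing, in cm

# Gel area if each polony occupies one pitch x pitch square.
polonies = TOTAL_BP / READ_LENGTH_BP
area_cm2 = polonies * POLONY_PITCH_CM ** 2

# Electricity: 2 kW for 24 h/day over 24 days at $0.13 per kWh.
electricity_usd = 2 * 24 * 24 * 0.13

# Dollar line items as listed on the slide.
line_items_usd = {
    "amortization": 200,
    "overhead": 40,
    "fluor dNTP": 40,
    "PCR reactions": 200,
    "disposables": 50,
    "electricity": 150,
    "repair labor": 10,
    "operation labor": 20,
}
total_usd = sum(line_items_usd.values())

print(f"polonies: {polonies:.2e}")
print(f"gel area: {area_cm2:,.0f} cm^2")
print(f"electricity: ${electricity_usd:.2f}")
print(f"listed items total: ${total_usd}")
```

Under these assumptions the area matches the slide's 10000 cm^2, the electricity figure rounds to $150, and the listed items sum to under the $1K target before the R&D share.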

  43. Inexpensive, off-the-shelf equipment Automated slide fluidics $4K MJR in situ Cycler $10K Microarray Scanner $26K+

  44. Human Haplotype: CFTR gene, 45 kbp. Rob Mitra, Vincent Butty, Jay Shendure, Ben Williams

  45. Quantitative removal of Fluorophores Rob Mitra

  46. Sequencing multiple polonies. Template ST30: 3' TCACGAGT. Base added: (C) A G T (C) (A) G (T) C (A) (G) T C A. Template 3' TCACGAGT pairs with extended strand AGTGCTCA. Rob Mitra
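
The read-out on this slide can be simulated in a few lines: bases are offered one at a time in a fixed cycle, and a polony gives a signal only when the offered base complements the next template position. This is a simplified sketch, not the authors' software. The function name and the choice to report a count per addition (so that a run of identical bases would extend in one step) are assumptions.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_extension(template_3to5, additions):
    """Simulate base-by-base extension along a template written 3'->5'.

    Returns (signals, strand): signals[i] is the number of bases
    incorporated when additions[i] was offered, and strand is the
    extended strand written 5'->3'.
    """
    needed = [COMPLEMENT[base] for base in template_3to5]
    position = 0
    signals = []
    for base in additions:
        incorporated = 0
        # Extend while the offered base matches the next required base.
        while position < len(needed) and needed[position] == base:
            incorporated += 1
            position += 1
        signals.append(incorporated)
    return signals, "".join(needed[:position])


# Template and addition order as given on the slide (C, A, G, T cycle).
signals, strand = sequence_by_extension("TCACGAGT", "CAGTCAGTCAGTCA")
print(signals)
print(strand)
```

Additions with a zero signal correspond to the bases shown in parentheses on the slide, and the extended strand is AGTGCTCA.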

  47. Multiple Image Alignment • Metric based on optimal coincidence of high-intensity noise pixels over a matrix of local offsets (0.4 pixel precision). Shendure
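
The coincidence metric can be illustrated with an integer-offset search: score each candidate offset by how many bright pixels of one image land on bright pixels of the other, then keep the best. This is a sketch under stated assumptions, not the method behind the slide's 0.4 pixel precision. It handles whole-pixel shifts only, treats each image as a set of bright-pixel coordinates, and the name `best_offset` is hypothetical.

```python
from itertools import product


def best_offset(reference, moving, max_shift=3):
    """Return the integer (d_row, d_col) that, added to every pixel in
    `moving`, maximizes the number of pixels coinciding with `reference`."""
    ref = set(reference)

    def score(shift):
        d_row, d_col = shift
        return sum((r + d_row, c + d_col) in ref for r, c in moving)

    # Matrix of candidate offsets, searched exhaustively.
    shifts = product(range(-max_shift, max_shift + 1), repeat=2)
    return max(shifts, key=score)


# Bright-pixel coordinates of a reference scan, and the same pixels
# displaced by (+2, -1) in a second scan.
reference = {(1, 1), (3, 4), (5, 2), (7, 7), (8, 3)}
moving = {(r + 2, c - 1) for r, c in reference}

print(best_offset(reference, moving))  # (-2, 1)
```

A sub-pixel version would need interpolated intensities rather than coordinate sets; the exhaustive search over a small offset window is kept here for clarity.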

  48. Polony exclusion principle & single-pixel sequences. Mitra & Shendure
