Harvard MIT DOE GtL Center. 7-Feb-2005 4:10-4:40 PM. Collaborating PIs: Chisholm, Polz, Church, Kolter, Ausubel, Lory. arep.med.harvard.edu [Image labels: C. Ting; 2-20 μm; 0.6 μm]
Molecular Systems Biology. Access is free of charge. Transcriptomics, Proteomics, Metabolomics, Functional genomics, Structural genomics, Computational biology, Theoretical biology, Mathematical biology, Synthetic biology. www.nature.com/msb/
Harvard MIT DOE Center Projects (arep.med.harvard.edu)
  Poster#                 Topic                        Goal#
  2. Leptos, et al.       Proteomics                   1
  121. Nguyen, et al.     Mass spectrometry XML        1
  122. Nguyen, et al.     Gene Regulation              2
  67. Thompson, et al.    Vibrio diversity             3
  68. Martiny, et al.     Prochlorococcus diversity    3
  77. Sullivan, et al.    Cyanophage diversity         1,3
  3. Zhang, et al.        Single cell sequencing       1-4
  1. Church, et al.       Metabolic fluxes             4
  Prochlorococcus: Photosynthesis, circadian & cell cycles
  Escherichia: Synthetic genomes/proteomes
  Vibrio: 4X faster replication than E. coli
  Caulobacter: Asymmetric cell & chromosome structure
  Pseudomonas: Biofilms
Prochlorococcus: 40ºN - 40ºS. Ocean chl a (Aug 1997 to Sept 2000). Provided by the SeaWiFS Project, NASA
Energy & CO2 Sequestration
  Humans consume 2 kW per person = 10^10 kW. Sunlight hits the earth at 40,000 times that rate (70% ocean).
  CO2 370 ppm = 730 x10^15 g globally, increase ~3 x10^15 /yr.
  Ocean productivity = ~100 x10^15 g CO2/yr, due to autotrophs: 10^26 Prochlorococcus cells globally (10^8 per liter).
  Sequestration v. respiration v. use: heterotrophs (Pelagibacter), phages, predators (Maxillopoda, Malacostraca, herring).
  [Image scale labels: 0.1 mm; 0.1 m; 6 cm]
  http://www.gsfc.nasa.gov/gsfc/service/gallery/fact_sheets/earthsci/terra/earths_energy_balance.htm
  http://clear.eawag.ch/models/optionenE.html
  http://en.wikipedia.org/wiki/Copepod
  Morris et al. Nature 2002 Dec 19-26;420(6917):806-10.
  http://hosting.uaa.alaska.edu/mhines/biol468/pages/carbon.html
  http://www.aeiveos.com/~bradbury/Papers/PhotosyntheticEfficiency.html
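The energy and carbon figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below reads the slide's numbers in scientific notation (10^10 kW, 10^15 g, 10^26 cells, 10^8 per liter) and derives only quantities implied by them; the variable names are illustrative and the derived values are rough consistency checks, not measurements.

```python
# Back-of-envelope check using only the round numbers quoted on the slide.
per_person_kw = 2.0          # human consumption per person
human_total_kw = 1e10        # total human consumption
sunlight_multiple = 40_000   # sunlight relative to human consumption
ocean_fraction = 0.70        # share of sunlight falling on ocean

implied_population = human_total_kw / per_person_kw
sunlight_kw = human_total_kw * sunlight_multiple
sunlight_on_ocean_kw = sunlight_kw * ocean_fraction

co2_pool_g = 730e15                  # atmospheric pool quoted on the slide
co2_increase_g_per_yr = 3e15         # annual increase
ocean_productivity_g_per_yr = 100e15

annual_increase_fraction = co2_increase_g_per_yr / co2_pool_g
productivity_vs_increase = ocean_productivity_g_per_yr / co2_increase_g_per_yr

cells_global = 1e26
cells_per_liter = 1e8
implied_volume_liters = cells_global / cells_per_liter

print(f"implied population: {implied_population:.1e}")
print(f"sunlight on ocean: {sunlight_on_ocean_kw:.1e} kW")
print(f"annual CO2 increase: {annual_increase_fraction:.2%} of the pool")
print(f"ocean productivity / annual increase: {productivity_vs_increase:.0f}x")
print(f"water volume implied by cell counts: {implied_volume_liters:.1e} L")
```

On these inputs, ocean productivity is roughly 30 times the annual atmospheric increase, which is the point of the sequestration v. respiration comparison.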
Diel (circadian) cycle. Light output for sun-box: 14 hr light / 10 hr dark, 230 μE at peak. Zinser, Lindell, Chisholm, Leptos, Jaffe, Lin, et al.
Diel Expression: All genes. [Plot: normalized expression vs. time (hours), with dark and light periods marked] Zinser et al. unpubl.
Light regulated Prochlorococcus metabolism. [Pathway figure. Metabolites: Central Carbon Metabolism, α-Glc-1P, ADP-Glc, α-1,4-glucosyl-glucan, glycogen. Genes: glgA, glgB, glgC, glgX, glgP] Zinser et al. unpubl.
Oxygenic Photosynthesis: Core Reaction Center Proteins
  [Figure: H2O is split to O2; electrons (e-) pass through PSII and PSI to NADPH]
  psbA = D1; D2; Fd = Ferredoxin; Pc = Plastocyanin; HLIP = High Light Induced Protein
Photosynthetic Genes in Phage
  Podovirus P-SSP7 (46 kb): HLIP, D1
  Myovirus P-SSM2 (255 kb): PC, HLIPs, Fd, D1
  Myovirus P-SSM4 (181 kb): HLIPs, D1, D2
  [Figure scale labels: 12kb, 24kb, ~500 bp, 6.4kb, 2.8kb]
  Lindell, Sullivan, Chisholm et al. 2004
RNA Responses to Phage. [Plot series: MED4 host psbA; MED4-0682 (60 aa Conserved URF); Phage SSP7 psbA] Lindell, Sullivan, Zinser, Chisholm
Synthetic - homologous recombination testing of DNA motifs. RNA Ratio (motif- to wild type) for each flanking gene: 1.3, 2.4 (1.3 in ΔargR), 1.1, 1.3, 0.7, 2.5, 0.2, 1.4, 1.4, 3.5. Bulyk, McGuire, Masuda, Church Genome Res. 14:201–208
Synthetic Genomes & Proteomes. Why?
  • Test or engineer cis-DNA/RNA-elements
  • Access to any protein (complex) including post-transcriptional modifications
  • Affinity agents for the above.
  • Protein design, vaccines, solubility screens
  • Utility of molecular biology DNA -- RNA -- Protein in vitro "kits" (e.g. PCR -- T7 -- Roche)
  • Toward these goals design a chassis:
  • 115 kbp genome. 150 genes.
  • Nearly all 3D structures known.
  • Comprehensive functional data.
(PURE) translation utility
  Removing tRNA-synthetases, translational release-factors, RNases & proteases allows:
  • Selection of scFvs [antibodies] specific for HBV DNA polymerase using ribosome display. Lee et al. 2004 J Immunol Methods 284:147
  • Programming peptidomimetic syntheses by translating genetic codes designed de novo. Forster et al. 2003 PNAS 100:6353
  • High level cell-free expression & specific labeling of integral membrane proteins. Klammt et al. 2004 Eur J Biochem 271:568
  • Cell-free translation reconstituted with purified components. Shimizu et al. 2001 Nat Biotechnol 19:751-5
  Also: membrane incompatible expression & diverse amino-acids (>21)
in vitro genetic codes
  [Figure: tRNAs carrying yU, mS, eU with anticodons UUG, UGG, CAG paired to the mRNA 5' ... AUG AAC ACC GUU GAA ... 3' (fM N T V E); codon table by second base showing the yU, mS, eU assignments]
  80% average yield per unnatural coupling.
  eU = 2-amino-4-pentenoic acid
  yU = 2-amino-4-pentynoic acid
  mS = O-methylserine
  gS = O-GlcNAc-serine
  bK = biotinyl-lysine
  Forster, et al. (2003) PNAS 100:6353; Zhang et al. (2004) Science 303:371
Oligos for 150 & 776 synthetic genes (for E. coli minigenome & M. mobile whole genome respectively). Forster & Church
Up to 760K Oligos/Chip: 18 Mbp for $700 raw (6-18K genes)
  <1K    Oxamer: Electrolytic acid/base
  8K     Atactic/Xeotron/Invitrogen: Photo-Generated Acid. Sheng, Zhou, Gulari, Gao (U. Houston)
  24K    Agilent: Ink-jet standard reagents
  48K    Febit
  100K   Metrigen
  380K   Nimblegen: Photolabile 5' protection. Nuwaysir, Smith, Albert
  Tian, Gong, Church
Improve DNA Synthesis Cost
  Synthesis on chips in pools is 5000X less expensive per oligonucleotide, but amounts are low (1e6 molecules rather than the usual 1e12) and bimolecular kinetics slow with the square of the concentration decrease.
  Solution: amplify the oligos, then release them.
  10 + 50 + 10 => ss-70-mer (chip) => ds-90-mer => ds-50-mer, using 20-mer PCR primers with restriction sites at the 50-mer junctions.
  Tian, Gong, Sheng, Zhou, Gulari, Gao, Church Nature 2004
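The two quantitative claims on this slide can be worked through briefly. The sketch below assumes simple second-order kinetics (rate proportional to the product of two concentrations, same reaction volume) and assumes that each 20-mer primer overlaps a 10-base flank and adds 10 further bases; that reading is inferred from the 70 => 90 => 50 lengths and is not stated on the slide.

```python
# Kinetic penalty of low chip yields, assuming rate ~ [A] * [B] in equal volume.
chip_molecules = 1e6           # per oligo, from the slide
conventional_molecules = 1e12  # per oligo, from the slide

concentration_ratio = chip_molecules / conventional_molecules
bimolecular_rate_ratio = concentration_ratio ** 2

# Construct lengths: 10-base flank + 50-base payload + 10-base flank.
flank, payload, primer = 10, 50, 20
chip_oligo_len = flank + payload + flank                  # ss-70-mer on the chip
# Assumption: each primer extends 10 bases beyond the flank it anneals to.
amplified_len = chip_oligo_len + 2 * (primer - flank)     # ds-90-mer after PCR
released_len = payload                                    # ds-50-mer after digestion

print(f"concentration is {concentration_ratio:.0e} of conventional")
print(f"bimolecular rate is {bimolecular_rate_ratio:.0e} of conventional")
print(chip_oligo_len, amplified_len, released_len)
```

A million-fold drop in concentration costs roughly twelve orders of magnitude in hybridization rate under this model, which is why amplification before assembly matters.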
Improve DNA Synthesis Accuracy via mismatch selection. Other mismatch methods: MutS (&H,L). Tian & Church
Computer Aided Design Polymerase Assembly Multiplexing (CAD-PAM)
  50 75 125 225 425 825 … 100*2^(n-1)
  Moving forward:
  1. Tandem, inverted and dispersed repeats (hierarchical assembly, size-selection and/or scaffolding)
  2. Reduce mutations (goal <1e-6 errors) to reduce # of intermediates
  3. 15kb to 5Mb by homologous recombination (Nick Reppas)
  4. Phage integrase site-specific recombination, also for counters.
  Stemmer et al. 1995. Gene 164:49-53; Mullis 1986 CSHSQB.
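The listed lengths 50, 75, 125, 225, 425, 825 are reproduced by joining two fragments of equal length with a 25-base overlap at each round. The sketch below assumes that overlap; it is inferred from the numbers and is not stated on the slide, and the function names are illustrative.

```python
def assembly_lengths(start=50, overlap=25, rounds=6):
    """Product length after each round of pairwise overlap assembly."""
    lengths = [start]
    for _ in range(rounds - 1):
        # Two fragments of equal length share `overlap` bases when joined.
        lengths.append(2 * lengths[-1] - overlap)
    return lengths


def rounds_to_reach(target_bp, start=50, overlap=25):
    """Smallest number of entries in the series whose last length >= target_bp."""
    length, n = start, 1
    while length < target_bp:
        length = 2 * length - overlap
        n += 1
    return n


print(assembly_lengths())
print(rounds_to_reach(15_000))
```

Because each round nearly doubles the product, reaching the 15 kb scale mentioned in item 3 takes only a handful of further rounds, provided error rates stay low enough for the intermediates to be usable.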
All 30S-Ribosomal-protein DNAs (codon re-optimized). [Gel labels: 1.7 kb; 0.3 kb; Atactic <4K chip; s19 0.3kb; Nimblegen 95K chip] Tian, Gong, Sheng, Zhou, Gulari, Gao, Church
Improving synthesis accuracy
  Method                     Bp/error   Ref.
  Chip assembly (PAM)        160        1
  Hybridization-selection    1,400      1
  MutS-gel-shift             10,000     2
  MutHLS cleavage            30,000     3  (10X better than PCR)
  1. Tian, Church, et al. 2004 Nature 432:1050
  2. Carr, Jacobson, et al. 2004 NAR 32:e162
  3. Smith & Modrich 1997 PNAS 94:6847
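To see what these bp/error figures mean for a finished construct, one can estimate the fraction of molecules with no errors at all. The sketch below assumes errors are independent and uniformly distributed along the sequence, which is a simplification; the 1,000 bp construct length is an arbitrary illustration, not a figure from the slide.

```python
# Bp/error values as listed on the slide.
BP_PER_ERROR = {
    "Chip assembly (PAM)": 160,
    "Hybridization-selection": 1_400,
    "MutS-gel-shift": 10_000,
    "MutHLS cleavage": 30_000,
}


def error_free_fraction(length_bp, bp_per_error):
    """Probability that a construct of length_bp carries zero errors,
    assuming an independent per-base error rate of 1 / bp_per_error."""
    return (1.0 - 1.0 / bp_per_error) ** length_bp


construct_len = 1_000  # hypothetical gene-sized construct
for method, bpe in BP_PER_ERROR.items():
    frac = error_free_fraction(construct_len, bpe)
    print(f"{method:26s} {frac:.4f}")
```

Under this model almost no 1 kb constructs from raw chip assembly are error-free, while most are after MutHLS cleavage, which is why the selection step determines how many clones must be screened.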
Extreme mRNA makeover for protein expression in vitro. RS-2, 4, 5, 6, 9, 10, 12, 13, 15, 16, 17, and 21 detectable initially. RS-1, 3, 7, 8, 11, 14, 18, 19, 20 initially weak or undetectable. Solution: Iteratively resynthesize all mRNAs with less mRNA structure. Western blot based on His-tags. Tian & Church
Safe Synthetic Biology
  Church, G.M. (2004) A synthetic biohazard non-proliferation proposal. http://arep.med.harvard.edu/SBP/Church_Biohazard04c.doc
  1. Monitor oligo synthesis via expansion of Controlled substances, Select Agents, &/or Recombinant DNA
  2. Computational tools are available; very small number of reagent, instrument & synthetic gene suppliers at present.
  3. System modeling checks for synthetic biology projects
  4. Multi-auxotroph, novel genetic code for the host genome, prevents functional transfer of DNA to other cells.
Photosynthetic bacterial genomes (for population genetics & proteomics)
  [Figure: phylogenetic tree; scale bar 0.01; node support values 66-100]
  Clade labels: high light adapted Prochlorococcus; low light adapted Prochlorococcus; Marine Synechococcus
  Taxa, in figure order: MED4, TATL1a, SAR6, ENATL1, ENATL3, MIT9302, MIT9312, MIT9201, GP2, MIT9202, MIT9215, TATL1b, MIT9107, NATL2A, Pac 1, ENATL7, ENATL4, SS120, MIT9211, MIT9303, MIT9313, SAR139, WH8112, WH8102, SAR100, WH8101, WH8012, WH7805, SAR7, Synechococcus PCC6307
Environmental population genomics (of a ribotype cluster). Thompson, Polz, et al. (2005) Science
  • Monthly samples
  • Isolate Vibrios
  • Identify population as cluster of barcode genes
  • Quantification: population is continuously present
  • Additional marker gene: highly diverse
  • Genomes: almost each genome different in typical sample
Sequencing single cells. Biome studies focus on single cells because many organisms are hard to grow in the lab, each cell carries multiple DNAs & RNAs, and cells exchange genome subsets. (Complementary to Biome shotgun and/or 100 kb BACs.) Many input molecules required to sequence one molecule, vs. one molecule sufficient to sequence via many copies of it.
Amplifying DNA from single cells: Prochlorococcus & Escherichia. Zhang, Martiny, Chisholm, Church, unpub. [Plot labels: no template control; φ29 real-time amplification; Affymetrix quantitation of independent amplifications]
Polony Bead Sequencing Pipeline
  • In vitro libraries via paired tag manipulation
  • Bead polonies via emulsion PCR [Dre03]
  • Enrichment of amplified beads
  • Monolayered immobilization in acrylamide
  • FISSEQ or "wobble" sequencing
  • Epifluorescence Scope with Integrated Flow Cell
  • SOFTWARE: Images → Tag Sequences; Tag Sequences → Genome
  Mitra, Shendure, Porreca, Rosenbaum, Church unpub.
Read length needs for population surveys Paired tags are separated by 1000 +/- 100 bases
Polony Fluorescent In Situ Sequencing Libraries. [Figure labels: 1 to 100kb Genomic; 2x20bp after MmeI (BceAI, AcuI); L, R, M, M; Sequencing primers; Selector bead; PCR bead; emulsion (Dressman et al PNAS 2003)] Greg Porreca, Abraham Rosenbaum
Cleavable dNTP-Fluorophore (& terminators). Reduce or photo-cleave. Mitra, RD, Shendure, J, Olejnik, J, Olejnik, EK, and Church, GM (2003) Fluorescent in situ Sequencing on Polymerase Colonies. Analyt. Biochem. 320:55-65
Polony-FISSeq: up to 2 billion beads/slide. Cy5 primer (666nm); Cy3 dNTP (570nm). Self Organizing Monolayer. Jay Shendure
High accuracy special case: homopolymers (e.g. AAA, CC, etc.)
  • Use "compressed" tags, ACG = ACCG = ACCCG
  • Quantitate incorporation
  • Reversible terminators
  • FRET between adjacent 3' bases
  • Wobble primers, CTAGCGAGCTAGNNNNNNNNA
  All five of these work.
  • Maintenance of amplification fidelity using linear amplification from initial genomic fragment
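The "compressed" tag idea in the first bullet can be illustrated in a few lines: every homopolymer run is collapsed to a single base, so reads that differ only in run length compare as equal. This is a minimal sketch of that idea; the function name is illustrative, and the same compression would have to be applied to the reference before matching.

```python
from itertools import groupby


def compress_homopolymers(seq):
    """Collapse each run of identical bases to one base (ACCCG -> ACG)."""
    return "".join(base for base, _run in groupby(seq))


for tag in ("ACG", "ACCG", "ACCCG"):
    print(tag, "->", compress_homopolymers(tag))
```

The trade-off is that compressed tags carry less information per read, so more distinct positions in a genome can collapse to the same tag.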
Polony FISSeq Stats
  • # of bases sequenced (total): 23,703,953
  • # bases sequenced (unique): 73
  • Avg fold coverage: 324,711 X
  • Pixels used per bead (analysis): ~3.6
  • Read Length per primer: 14-15 bp
  • Insertions: 0.5%
  • Deletions: 0.7%
  • Substitutions (raw): 4e-5
  • Throughput: 360,000 bp/min
  • Current capillary sequencing: 1400 bp/min
  • (600X speed/cost ratio, ~$5K/1X)
  • (This may omit: PCR, homopolymer, context errors)
  Shendure
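The coverage figure follows directly from the two base counts, and the throughput figures give a raw speed ratio. The sketch below uses only the slide's numbers; note that the slide's 600X is a speed/cost ratio, and since no cost inputs are given here, only the raw speed ratio is computed.

```python
total_bases = 23_703_953   # bases sequenced (total)
unique_bases = 73          # bases sequenced (unique)

# Fold coverage, truncated to a whole number as on the slide.
avg_fold_coverage = total_bases // unique_bases

polony_bp_per_min = 360_000
capillary_bp_per_min = 1_400
raw_speed_ratio = polony_bp_per_min / capillary_bp_per_min

print(f"average fold coverage: {avg_fold_coverage:,} X")
print(f"raw speed ratio vs capillary: {raw_speed_ratio:.0f}X")
```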
Wobble vs Simple primer sequencing
  1 vs 2.5 bp read/cycle of 4 bases
  10 vs 14-200 bp reads
  3e-3 vs 4e-5 non-homopolymer errors
  3e-3 vs 1e-1 homopolymer errors
  40 minutes per base tested = 60 hr per 20 cycles (20 hr, if 4 colors)
Harvard MIT DOE Center Projects (arep.med.harvard.edu)
  Poster#                 Topic                        Goal#
  1. Church, et al.       Metabolic fluxes             4
  2. Leptos, et al.       Proteomics                   1
  68. Martiny, et al.     Prochlorococcus diversity    3
  121. Nguyen, et al.     Mass spectrometry XML        1
  122. Nguyen, et al.     Gene Regulation              2
  77. Sullivan, et al.    Cyanophages                  1,3
  67. Thompson, et al.    Vibrio diversity             3
  3. Zhang, et al.        Single cell sequencing       1-4
  Prochlorococcus: Photosynthesis, circadian & cell cycles
  Escherichia: Synthetic genomes/proteomes
  Vibrio: 4X faster replication than E. coli
  Caulobacter: Asymmetric cell & chromosome structure
  Pseudomonas: Biofilms