
Microbial Functional Genomics, Genomic Technologies, And Their Applications

Jizhong (Joe) Zhou, Zhouj@ornl.gov, 865-576-7544. Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.


Presentation Transcript


  1. Microbial Functional Genomics, Genomic Technologies, And Their Applications. Jizhong (Joe) Zhou, Zhouj@ornl.gov, 865-576-7544. Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA

  2. [Overview diagram labels: Microbial Functional Genomics; Genomic Technology; Community & Ecosystem Genomics; Gene Expression Patterns; Oligonucleotide Arrays; Functional Gene Arrays; Whole Genome Microarrays; Community Genome Arrays; Protein Array; Microbial Ecology & Extremophiles; Producing Magnetic Nanoparticles; Microbial Community Diversity & Mechanisms; Uranium Reduction]

  3. Challenges in functional genomics • Defining gene functions: 30-60% of open reading frames have no known function. • Regulatory networks: differences in gene number cannot explain phenotypic differences, suggesting that regulation is the key.

  4. Microbial Functional Genomics: Integrating Gene Expression Profiling, Bioinformatics, Mutagenesis and Proteomics [Diagram labels: Genome Sequence; Mutagenesis (plasmid pDS31: aac1, Gmr, sacB); Bioinformatics (structure-based function prediction); Transcriptomics (DNA microarrays); Proteomics (2-D gels, mass spectrometry, phage display)]

  5. Whole genome microarrays available at ORNL • Rhodopseudomonas palustris: photosynthetic bacterium (MGP, GTL) • Geobacter metallireducens: metal-reducing bacterium (GTL) • Shewanella oneidensis MR-1: metal-reducing bacterium (MGP, GTL) • Nitrosomonas europaea: ammonium-oxidizing bacterium (MGP) • Desulfovibrio vulgaris: sulfate-reducing bacterium (GTL, NABIR) • Deinococcus radiodurans R1: radiation-resistant bacterium (GTL) • Methanococcus maripaludis (GTL)

  6. Two primary uses of microarrays for functional analysis • Hypothesis-generating (exploratory): gene expression profiling under different conditions, e.g., radiation responses in Deinococcus radiodurans. • Hypothesis-driven: e.g., mutant characterization in Shewanella oneidensis MR-1.

  7. Deinococcus radiodurans R1 Genome: 3.3 Mb* • Chromosome I: 2.65 Mbp • Chromosome II: 412.3 Kbp • Megaplasmid: 177.5 Kbp • Plasmid: 45.7 Kbp • G+C content: 66.6% • Number of ORFs: 3,195 • Mean ORF size: 937 bp • Coding: 91% • Similar to known proteins: 52.2% • Conserved hypothetical: 16% • Hypothetical: 31.5% • rRNA operons: 9 *D. radiodurans R1 genome sequence and annotation courtesy of TIGR
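The stated genome size can be cross-checked against the four replicon sizes listed on this slide. The following is a minimal Python sketch of that arithmetic; the variable names are illustrative and not part of the presentation.

```python
# Replicon sizes of D. radiodurans R1 as listed on the slide, in kbp.
replicons_kbp = {
    "Chromosome I": 2650.0,   # 2.65 Mbp
    "Chromosome II": 412.3,
    "Megaplasmid": 177.5,
    "Plasmid": 45.7,
}

# Sum the replicons and convert to Mbp for comparison with the
# genome size quoted in the slide title.
total_kbp = sum(replicons_kbp.values())
total_mbp = total_kbp / 1000.0
print(f"Total genome size: {total_mbp:.2f} Mbp")
```

Rounded to one decimal place, the sum agrees with the 3.3 Mb given in the slide title.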

  8. Radiation Resistance of D. radiodurans R1 [Figures: radiation survival curve for D. radiodurans R1 and E. coli; gel with marker (M) and control (CK) lanes and samples at 0, 1.5, 3, 5, 9 and 24 hours post irradiation, marker bands at 23.1, 9.4, 6.6 and 4.4] • The majority of E. coli cells are dead at ~500 Gy. • D. radiodurans exhibits a shoulder of resistance up to ~5000 Gy, with no loss of viability. • Very little is known about the DNA repair pathways that enable D. radiodurans to resist ionizing and UV irradiation.

  9. Deinococcus Cells Can Survive Acute γ-radiation Due to Their Ability to Repair Direct Damage and Remove Free Radicals • Direct damage (20%) • Indirect damage due to free radicals (80%) [Diagram: γ-radiation damages cells directly through γ-photons (20%) and indirectly through irradiation-induced free radicals (80%). Effects: DNA damage, impaired replication, arrested cell division, mRNA degradation, protein degradation and impaired cellular functions; cells grow slowly or die. Early events after irradiation: DNA damage repair, re-initiation of DNA synthesis. Late events after irradiation: minimizing free radical levels.]

  10. Gene Expression Profiling: Experimental Design • Recovery of D. radiodurans (wild-type strain R1) from acute radiation (exposure dose = 15,000 Gy of γ-radiation)
      Cell sample                 Recovery time (hours) at 32°C
      Control (non-irradiated)    –
      1 (irradiated)              0
      2 (irradiated)              0.5
      3 (irradiated)              1.5
      4 (irradiated)              3
      5 (irradiated)              5
      6 (irradiated)              9
      7 (irradiated)              12
      8 (irradiated)              16
      9 (irradiated)              24
      • 3 biological replicates (different mRNAs) • 4 technical replicates • Total replicates: 12 • Collaboration with Mike Daly

  11. Hierarchical Clustering Analysis of Expression Profile Patterns • More than 800 genes are induced at 1.5 h after irradiation. • More genes are up-regulated than down-regulated. • More than 40% of the functionally unknown genes change significantly upon irradiation. • recA-like expression profile: DNA replication, DNA repair, recombination, cell wall metabolism, cellular transport, uncharacterized proteins • Induced genes (early to mid phases): glyoxylate shunt, superoxide dismutase, stress response, proteases, nucleases • Repressed genes (early to mid phases): TCA cycle, genes involved in de novo synthesis of amino acids and nucleotides

  12. Discovery of a Novel ATP-dependent DNA Ligase (DR0100) • A novel ATP-dependent DNA ligase was highly expressed, with a recA-like expression profile. • It shares consensus motifs with ligases from eukaryotes.
      GI        Name         Start  Motif I       Gap  Motif III  Gap  Motif IIIa    Gap  Motif IV  End
      6459863   DNLJ_DR2069  123    FTGELKIDGLSV  44   LEVRGEVYL  44   KAILYAVGKRDG  50   ADGTVLK   300
      2506362   DNLJ_ECOLI   110    WCCELKLDGLAV  46   LEVRGEVFL  44   TFFCYGVGVLEG  51   IDGVVIK   290
      1352290   DNL1_MOUSE   561    FTCEYKYDGQRA  41   FILDTEAVA  31   CLYAFDLIYLNG  51   CEGLMVK   723
      1706482   DNL4_HUMAN   201    FYIETKLDGERM  46   CILDGEMMA  28   CYCVFDVLMVNN  51   EEGIMVK   365
      1706481   DNL3_HUMAN   416    MFSEIKYDGERV  40   MILDSEVLL  27   CLFVFDCIYFND  51   LEGLVLK   573
      11498455  AF0849       91     VVLEEKMNGYNV  40   YMLCCEAVG  16   EFFLFDVREGKT  46   REGVVFK   232
      15894039  CAC0752      38     CVLEEKVDGANC  49   YVMYGEWLY  12   YFMEFDIFDKKE  50   RENLEIR   188
      6460914   DRB0100      35     VVVTEKLDGENT  37   WRFCGENVY  12   YFYLFSVWDDLN  42   MEGYVVR   165
      consensus/100%                hh...KhsG.th       h.h.sE.hh       .hh.ashh...t       .-sh.h+
      secondary str (1DGS): EEEEE EEE EEEEEEEE EEEE EEEEE
      Liu et al. 2003. PNAS, 100: 4191-4196

  13. Highly coordinated regulation • Energy pathway switching: less energy is produced. • Minimizing energy demands: shutdown of de novo biosynthetic pathways. • Energy pathway switching: fewer free radicals are produced. • Increased activity of the genes involved in removing free radicals. • Shutdown of de novo biosynthetic pathways to minimize energy requirements. • Increased activity of proteases and nucleases to provide amino acids and nucleotides for protein, DNA and RNA synthesis. [Diagram labels: Free radicals; Energy; Biosynthetic precursors]

  14. Shewanella oneidensis MR-1 • Sites: mine waste, Black Sea, Oneida Lake, Green Bay, Panama Basin, Mississippi Delta, North Sea • Redox interfaces: O2, NO3-, NO2-, Mn(IV), Mn(III), Fe(III), fumarate, DMSO, TMAO, S0, S2O3(2-), U(VI), Cr(VI), Tc, As, Se, I • Habitats: lake & marine sediments, deep sea, oil brine, spoiled food • Formate, lactate, pyruvate, amino acids, H2 S • With this kind of versatility, what will it really do?

  15. DOE Shewanella Federation • Participants: ORNL ESD Microbial Functional Genomics Group; TIGR (John Heidelberg); Center for Microbial Ecology, MSU (J. Tiedje, J. Cole, J. Klappenbach); USC, JPL (K. Nealson); UCB (J. Keasling); ANL (C. Giometti); BCM (T. Palzkill); ISB (E. Kolker); PNNL (J. Fredrickson, D. Smith); ORNL LSD, CASD (F. Larimer, B. Hettich); UCSD (B. Palsson); LBL (Adam Arkin); Woods Hole (M. Riley) • Activities: sequencing and annotation; microarrays, LIMS database; physiology, genetics; 2-D PAGE; metabolomics; phage display; pathway construction and modeling; physiology, MS proteomics; modeling; bioinformatics, MS

  16. Large Genomes To Life Project: $38M for 5 years • Rapid Deduction of Stress Response Pathways in Metal/Radionuclide Reducing Bacteria • Stress responses in: Desulfovibrio vulgaris, Shewanella oneidensis, Geobacter metallireducens • Participants: national laboratories, universities, private organizations [Diagram labels: U Washington; U Missouri; UC Berkeley (consultant)]

  17. Summary of microarray analysis for Shewanella • Responses to 11 different electron acceptors • Mutant characterization with chemostats • Low-pH and high-pH stress • Heat shock, cold shock • Oxidative stress (e.g., H2O2) (Ting Li) • High salt • Carbon starvation • Metal stress: strontium, chromium • Hypothetical proteins • Many mutants

  18. Defining Gene Function through Deletion Mutagenesis: ~80 deletion mutants • Global regulators: etrA, narQ, fur, crp, arcA, envZ • cAMP-binding regulators: cAMP1, cAMP2, cAMP3 • Adenylate cyclases: cya1, cya2, cya3 • Outer membrane proteins and cytochromes: mtrC, mtrA, omcA • Sigma factors: rpoH, rpoE • Stress response: oxyR, bolA, dps, ompR, cpxR • Double mutants: etrA-fur, etrA-crp, cpxR-cpxA, ompR-envZ • PAS domain (old annotation): 0834, 0906, 1761, 4254, 4326, 4917 • Hypothetical proteins: 1377, 3584 • Transcription factors: 220 genes, 78 within a single operon • Cytochrome genes: 42 genes

  19. Computational Prediction of the Function of the SO1328 Gene Product (LysR) [Structure figure labels: N-terminal DNA-binding domain; C-terminal domain] • It was annotated as a LysR family protein. • It is induced 5- to 7-fold by H2O2 treatment. • It shares ~34% sequence homology with E. coli OxyR. • Its 3D structure is similar to that of OxyR in E. coli.

  20. Growth phenotype of the LysR deletion mutant (SO1328) [Figure panels: WT at 0 µM and 2,000 µM H2O2; mutant at 0 µM and 2,000 µM H2O2] • Less growth was obtained when the WT cells were treated with 2,000 µM H2O2. • Wild-type cells were sensitive to H2O2. • No differences were seen between treatment and control for the mutant cells. • The LysR mutant is not sensitive to H2O2. • In E. coli, the OxyR mutant is more sensitive to H2O2.

  21. Microarray analysis of the LysR mutant in response to H2O2 stress • Key genes (e.g., dps, katG) known to be involved in oxidative stress were not affected by H2O2 in the mutant. • Since the OxyR mutant is more resistant to H2O2, the genes involved in oxidative stress would be expected to be highly expressed, but they are not. This suggests that novel mechanisms and pathways may exist. • The OxyR-dps double mutant is also resistant to H2O2, suggesting that the oxidative responses in MR-1 are very complicated.

  22. Proteomics: Tools for studying proteomics • 2-Dimensional gel electrophoresis • Mass spectrometry • Phage display • Yeast two-hybrid system • Protein arrays • Structural determination: X-ray, NMR

  23. Using phage display to study protein-protein interactions and regulation [Diagram labels: Gateway cloning vector; phage display] • First key step: cloning all genes into a universal vector. • The cloning systems were optimized. • All primers were synthesized. • 3,853 genes were cloned. • 50 clones were sequenced; no errors were found.

  24. Expression of Shewanella proteins from the pDEST17 vector [Gel: size markers at 175, 83, 62, 48, 33 and 25 kDa; GST lanes; protein lanes NarQ, ArcA, Fur, EtrA with band labels 70.2, 34.2, 20.5 and 32.4 kDa; n = no insert control, i = expression induced with 0.5 mM IPTG] • Global regulatory genes are well expressed in E. coli

  25. Identification of binding motifs of ArcA by gel shift assays [Figure labels: gltA, aceA, aceB, icd, sucAB, sdhCAB, sucCD] • Consistent with E. coli: icd, gltA-sdhCAB, sucABCD • Different from E. coli: aceBA, potentially regulating the glyoxylate shunt pathway. • Shewanella ArcA can also interact with promoters of other TCA cycle related genes (not found in E. coli): SO0970 (fumarate reductase flavoprotein subunit precursor), SO1538 (isocitrate dehydrogenase), SO2222 (fumarate hydratase)

  26. Using promoter microarrays to study protein-DNA interactions and understand regulatory networks [Workflow diagram labels: 1: in vitro/in vivo pull-down, qPCR amplification; 2: direct binding; nonspecific competitors: 1. BSA/milk, 2. random DNA; verification by EMSA, RT-PCR or cDNA microarray]

  27. Challenges in protein arrays • Antibodies are commonly used as probes in protein arrays. • Two big challenges: • Loss of activity: the antibody can lose activity because its active binding site may bind chemically to the slide surface and thus become unavailable to the antigen. • Cross-reactivity: specificity is also a big issue for antibody protein arrays.

  28. Development of novel chemistry for protein array fabrication [Figure labels: thin film coating; glass substrate] • Proteins are affixed on the slide by: • Entrapment by the porous structure of the polymer • Electrostatic interaction • But not by covalent bonding • Langmuir 20 (2004), 8877-8885. Proteomics, in revision

  29. Proteins spotted on different slides [Figure labels: 2-fold decrease; slide types: nanofilm-coated, SuperAldehyde, poly-lysine, SuperAmine] • Nanofilm-coated slide: • More sensitive • Less background noise

  30. Antibody arrays • Very good specificity of the antibody-antigen reactions was obtained. • A patent was filed and licensed to a company. • Nominated by ORNL for an R&D 100 Award.

  31. Detection of Single Base Pair Differences • Short oligos (<25 bp) without end modification, typically $20/oligo. • More than 5-fold difference in signal intensity between perfect-match (PM) and mismatch (MM) probes. • A single mismatch can be clearly differentiated.
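The PM/MM comparison on this slide amounts to a fold-change test on paired probe signals. Below is a minimal Python sketch under stated assumptions: the function names, the background-subtracted intensities and the use of the 5-fold figure as a decision threshold are all illustrative and are not taken from the presentation.

```python
def pm_mm_ratio(pm_signal: float, mm_signal: float) -> float:
    """Return the PM/MM signal-intensity ratio for one probe pair."""
    if mm_signal <= 0:
        raise ValueError("mismatch signal must be positive")
    return pm_signal / mm_signal


def is_discriminated(pm_signal: float, mm_signal: float,
                     min_fold: float = 5.0) -> bool:
    """True when the perfect-match probe exceeds the mismatch probe
    by more than `min_fold` (assumed threshold, after the slide's
    'more than 5-fold' figure)."""
    return pm_mm_ratio(pm_signal, mm_signal) > min_fold


# Hypothetical background-subtracted intensities (arbitrary units).
example_pairs = [(12000.0, 1500.0), (8000.0, 2500.0)]
calls = [is_discriminated(pm, mm) for pm, mm in example_pairs]
print(calls)
```

In this made-up example the first pair has an 8-fold ratio and passes, while the second has a 3.2-fold ratio and does not.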

  32. Arbitrary cutoff for network identification • Example: correlation matrix of 5 genes • Only 3 interactions are left when Rc = 0.7. • 7 interactions are left when Rc = 0.4. • Main challenges: • All existing methods define the cutoff arbitrarily. • The identified clusters or modules are ambiguous.
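The dependence of the inferred network on the correlation cutoff Rc can be illustrated with a small Python sketch. The five expression profiles below are hypothetical, so the edge counts differ from the slide's example; using the absolute value of the Pearson correlation is also an assumption, since the slide does not say how negative correlations are treated.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def interactions(profiles, cutoff):
    """Return gene pairs whose absolute correlation is at least `cutoff`."""
    return [
        (g1, g2)
        for g1, g2 in combinations(sorted(profiles), 2)
        if abs(pearson(profiles[g1], profiles[g2])) >= cutoff
    ]


# Hypothetical expression profiles for 5 genes across 6 conditions.
profiles = {
    "geneA": [1, 2, 3, 4, 5, 6],
    "geneB": [2, 4, 6, 8, 10, 12],
    "geneC": [6, 5, 4, 3, 2, 1],
    "geneD": [1, 3, 2, 5, 4, 6],
    "geneE": [4, 1, 3, 2, 6, 5],
}

for rc in (0.7, 0.4):
    print(f"Rc = {rc}: {len(interactions(profiles, rc))} interactions kept")
```

Lowering the cutoff can only add edges, so the network at Rc = 0.4 always contains the network at Rc = 0.7; which of the two is "right" is exactly the ambiguity the slide points out.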

  33. Novel approach for network identification: Random Matrix Theory and Level Statistics • Random properties: Wigner-Dyson distribution (cutoff < 0.7) • Nonrandom properties: Poisson distribution (cutoff > 0.7) • Main advantages: • Supported by universal laws • Automatic cutoff • Reliable, sensitive, robust
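For reference, the two nearest-neighbor spacing distributions named on this slide have standard closed forms in random matrix theory. The expressions below are the textbook forms (the Wigner surmise for the Gaussian orthogonal ensemble in the Wigner-Dyson case), offered as an assumption about what the slide showed rather than a quotation from it; the symbol s, introduced here, is the spacing between adjacent eigenvalues of the correlation matrix after normalization to unit mean spacing.

```latex
P_{\text{Poisson}}(s) = e^{-s},
\qquad
P_{\text{Wigner-Dyson}}(s) = \frac{\pi}{2}\, s\, e^{-\pi s^{2}/4}
```

The Wigner-Dyson form vanishes as s approaches 0 (level repulsion), whereas the Poisson form is largest there, which is what lets the transition between the two serve as an objective cutoff.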

  34. Identification of 27 Modules from Yeast Cell Cycle Expression Data

  35. Experimental validation of some hypothetical proteins • Cycloheximide inhibits protein synthesis by blocking peptidyl transferase. • The mutants are more sensitive to this drug, suggesting that they have defective ribosomes. • Thus the genes appear to be involved in ribosome biogenesis.

  36. Functional identification of a hypothetical protein in Shewanella • In the Shewanella heat shock data, SO2017 is grouped with heat shock proteins: 1. dnaK 2. htpG 3. groEL 4. groES 5. lon 6. dnaJ 7. SO2017 • Experimental validation of SO2017 [Figure panels: DSP10 at 30°C; SO2017 at 30°C; DSP10 at 42°C; SO2017 at 42°C] • The SO2017 mutant is sensitive to heat shock. • This gene is indeed involved in the heat shock response, suggesting that the prediction is correct.

  37. Pioneering advances in microarray-based technologies to address challenges in microbial community genomics • Challenges: • Specificity: Environmental sequence divergences. • Sensitivity: Low biomass. • Quantification: • Existence of contaminants: Humic materials, organic contaminants, metals and radionuclides. • Solutions • Developing different types of microarrays and novel chemistry to address different levels of specificity. • Developing novel signal amplification strategy to increase sensitivity • Optimizing microarray protocols for reliable quantification.

  38. Summary of 50-mer-based FGAs for environmental studies • Oligonucleotide probe size: 50 bp • Nitrogen cycling: 302 • Sulfate reduction: 204 • Carbon cycling: 566 • Phosphorus utilization: 79 • Organic contaminant degradation: 770 • Metal resistance and oxidation: 85 • Total: 2,006 probes • All probes have < 88% similarity • Tiquia et al. 2004. BioTechniques 36: 664-675. Rhee et al. 2004. AEM 70: 4303-4317
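The stated total can be checked against the per-category probe counts listed on this slide. A minimal Python sketch of that check follows; the dictionary keys simply restate the slide's category names.

```python
# Probe counts per functional category, as listed on the slide.
probe_counts = {
    "Nitrogen cycling": 302,
    "Sulfate reduction": 204,
    "Carbon cycling": 566,
    "Phosphorus utilization": 79,
    "Organic contaminant degradation": 770,
    "Metal resistance and oxidation": 85,
}

# Sum the categories and compare with the total quoted on the slide.
total_probes = sum(probe_counts.values())
print(f"Total probes: {total_probes:,}")
```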

  39. Specificity of 50-mer microarrays [Figure labels: nirK, nirS, nifH, pmoA, amoA, dsrAB] • Specific hybridization was obtained with probes  85% similarity • 5 nirS genes were mixed together • Only the corresponding genes hybridized • 6 types of genes were mixed together • Only the corresponding genes hybridized

  40. Sensitivity [Figure: hybridization panels 1-8 for genomic DNA (25 ng, 50 ng, 500 ng gDNA) and cells (3.0 × 10^6, 1.3 × 10^7, 1.6 × 10^9)] • Detection limit: • 50 ng pure DNA in the presence of non-target templates • 10^7 cells

  41. Quantification and validation [Figure: microarray hybridization and real-time PCR plotted against Log (Cell Number [N]), r² = 0.98, for cell numbers from 3.0 × 10^6 to 1.6 × 10^9] • Quantification: • Good linear relationship • Quantitative • The microarray result is consistent with real-time PCR

  42. Novel amplification approach for increasing hybridization sensitivity [Gel: marker lanes (M) and lanes A1, B1 through A8, B8; label: 10 fg] • As little as 10 fg (2 cells) can be detected • Amplification is quantitative for the majority of the genes • Submitted to PNAS
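The equivalence "10 fg (2 cells)" can be sanity-checked from the mass of one bacterial genome. The Python sketch below rests on two assumptions that are not in the presentation: an average molar mass of about 650 g/mol per base pair of double-stranded DNA, and a hypothetical genome size of 5 Mbp, since the slide does not name the organism used.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
AVG_BP_MOLAR_MASS = 650.0         # g/mol per base pair (assumed average)


def genome_mass_fg(genome_size_bp: float) -> float:
    """Approximate mass of one double-stranded genome in femtograms."""
    grams = genome_size_bp * AVG_BP_MOLAR_MASS / AVOGADRO
    return grams * 1e15           # 1 g = 1e15 fg


# Hypothetical 5 Mbp genome, chosen only for illustration.
mass_per_genome = genome_mass_fg(5e6)
genomes_in_10_fg = 10.0 / mass_per_genome
print(f"{mass_per_genome:.1f} fg per genome, "
      f"{genomes_in_10_fg:.1f} genome copies in 10 fg")
```

Under these assumptions one genome weighs a little over 5 fg, so 10 fg corresponds to roughly two genome copies, in line with the slide's figure.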

  43. NABIR Field Research Center Samples • 6 samples were taken to assess the effects of contaminants on microbial community structure • 2 L groundwater • Genes analyzed: 16S rRNA, nirS, nirK, dsrAB, amoA
      Sample    pH    Nitrate  Uranium  Nickel  TOC
      FW-300*   6.1   1.200    0.001    0.005   30
      FW-003    6.0   1060     0.01     0.015   100
      FW-005    3.9   175.0    6.40     5.00    70
      FW-010    3.5   42000    0.17     18.0    175
      FW-015    3.4   8300     7.70     8.80    65
      TPB-16    6.3   30.00    1.10     ND      65
      [Site map: Areas 1, 2 and 3 and the S-3 Ponds cap (contaminant source); wells 003, 005, 010, 015 and 16; labels "least contaminated", "less contaminated" and "most contaminated"; scale bars 30 m and 275 m; north arrow]

  44. Groundwater samples with very low biomass • 2 L groundwater from six different sites. • Cell counts: 1-5 × 10^5/ml • DNA was isolated; 1/20 of the DNA was manipulated and used for hybridization. • Good hybridization was obtained with the DNA manipulated with the new method. • No hybridization was obtained if the DNA was not manipulated.
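The numbers on this slide imply how little starting material went into each hybridization. The Python sketch below works through that arithmetic; it assumes complete DNA recovery from every cell, which is an idealization not claimed in the presentation, and the variable names are illustrative.

```python
VOLUME_ML = 2 * 1000              # 2 L of groundwater, in ml
CELLS_PER_ML = (100_000, 500_000) # 1-5 x 10^5 cells/ml, from the slide
FRACTION_DENOMINATOR = 20         # 1/20 of the isolated DNA was used

# Total cells in each sample (low and high end of the stated range).
total_cells = tuple(c * VOLUME_ML for c in CELLS_PER_ML)

# Cell equivalents of DNA per hybridization, assuming 100% recovery.
cell_equivalents = tuple(t // FRACTION_DENOMINATOR for t in total_cells)

print(f"Cells per sample: {total_cells[0]:.1e} to {total_cells[1]:.1e}")
print(f"Cell equivalents used: {cell_equivalents[0]:.1e} "
      f"to {cell_equivalents[1]:.1e}")
```

Under that assumption each hybridization started from DNA equivalent to about 10^7 to 5 × 10^7 cells.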

  45. Differences in functional genes among samples from the NABIR Field Research Center [Figure panels: FW300 (reference site); FW010 (highly contaminated site)] • Clear differences were observed between contaminated and noncontaminated sites. • E.g., some genes are present at the noncontaminated site but not at the contaminated sites.

  46. Overall diversity among different samples • Overall diversity correlates with contaminant level. • The proportion of overlapping genes between samples was consistent with the contaminant level and geochemistry. • A significant portion (5-20%) of all detected genes was unique to each sample, even though the sampling sites are very close together. Thus, important microbial populations appear to be highly heterogeneous in this groundwater system.

  47. CommOligo: a new oligo probe design program for community analysis • Useful for both whole genome microarrays and community arrays • Able to design group-specific probes • Better performance than other programs

  48. Probes Designed for a Second Generation FGA • Nitrogen cycling: 5,089 • Carbon cycling: 9,198 • Sulfate reduction: 1,006 • Phosphorus utilization: 438 • Organic contaminant degradation: 5,359 • Metal resistance and oxidation: 2,303 • Total: 23,408 genes • 23,000 probes designed • Will be very useful for community and ecological studies

  49. Community Genomics Grand challenges • Extremely high diversity, 5000 species/g soil • 99% of the microbial species are uncultured

  50. Whole community sequencing [Figure: phylogenetic tree (scale bar 0.05, bootstrap values at nodes) placing clones 010A-A01, 010A-A04, 010A-A05, 010A-C01, 010A-D01, 010A-E08, 010A-F09, 010A-F11, 010A-F12, 010B-A01, 010B-B09, 010B-B11, 010B-E10, 010B-G08, 010D-A06, 010D-B06, 010D-C08, 010D-C09, 010D-D06 and 010D-G08 among Ralstonia eutropha, Azoarcus eutrophus, Ralstonia NI1, Azoarcus FL05, Acidovorax 3DHB1, Rhodoferax antarcticus, Aquaspirillum autotrophicum, Pseudomonas marginalis, Pseudomonas stutzeri, Rhizobium gallicum and uncultured clones (3, HC-32, S015, GOUTA12, LAH1)] • Sample from the NABIR Field Research Center at ORNL • Sequenced by the DOE Joint Genome Institute • 20 species based on 16S rRNA
