About the cellulases distribution. Renaud Berlemont, UCI Adam Martiny Lab. About the GHx classification. CAZYdb Glycoside Hydrolases, … Structure – Sequence Alignments: Families (>100) / Clans (14) « Convergence – Divergence ». Some statements.
About the cellulases distribution Renaud Berlemont, UCI Adam Martiny Lab.
About the GHx classification • CAZYdb Glycoside Hydrolases, … • Structure – Sequence Alignments: Families (>100) / Clans (14) • « Convergence – Divergence »
Some statements • Biochemically confirmed « cellulases » = CMCases • Many cellulases are active on other substrates (e.g. xylan) • Many « cellulases » are non-cellulolytic !? • CMCases ≠ Cellulases
Cellulose production: • GH8 (Romling, 2002) – Biofilm / Interaction (w. plant) • GH5 (Berlemont, 2009) – Biofilm • GH6 (Delbrassine, in prep) – Cell differentiation • GH6 (Tunicate, animal) • GH9 (KORrigan, plant)
Some statements • Biochemically confirmed « cellulases » = CMCases • Many cellulases are active on other substrates (e.g. xylan) • Many « cellulases » are non-cellulolytic • CMCases ≠ Cellulases • Best studied cellulose degraders all belong to the Firmicutes group (e.g. Clostridium) • ~20 genomes of cellulose degraders have been completely sequenced
Question 2: How are extracellular enzyme genes distributed among microbial taxa? • Hypothesis 2a: Some extracellular enzymes are broadly distributed across taxa while others are constrained to a small number of taxa. • Hypothesis 2b: The occurrence of different extracellular enzyme genes among taxa will be correlated. Some genes will show patterns of over-dispersion while others will show co-occurrence.
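Hypothesis 2b can be made concrete with a presence/absence correlation between two gene families across genomes. The sketch below uses the phi coefficient, which is one possible choice and not necessarily the statistic used in this study; the family names and the presence/absence vectors are made up for illustration.

```python
from math import sqrt

def phi_coefficient(a, b):
    """Phi (Pearson) correlation between two presence/absence vectors."""
    n11 = sum(1 for x, y in zip(a, b) if x and y)          # both present
    n10 = sum(1 for x, y in zip(a, b) if x and not y)      # only first present
    n01 = sum(1 for x, y in zip(a, b) if not x and y)      # only second present
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)  # both absent
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # a family present (or absent) everywhere: undefined, reported as 0
    return (n11 * n00 - n10 * n01) / denom

# Hypothetical presence (1) / absence (0) of three families in six genomes.
presence = {
    "GH_a": [1, 1, 1, 0, 0, 0],
    "GH_b": [1, 1, 1, 0, 0, 0],  # always found together with GH_a
    "GH_c": [0, 0, 0, 1, 1, 1],  # never found together with GH_a
}

co_occurring = phi_coefficient(presence["GH_a"], presence["GH_b"])
over_dispersed = phi_coefficient(presence["GH_a"], presence["GH_c"])
```

Values close to +1 point to co-occurrence, values close to -1 to over-dispersion, and values near 0 to no association.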
pSEED - FigFams • Sequenced genomes (patricbrc db: 4089), in order to analyze as many sequenced genomes as possible
pSEED - FigFams « FIGfams are sets of protein sequences that are similar along the full length of the proteins. Proteins are thought of as implementing one or more abstract functional roles, and all of the members of a single FIGfam are believed to implement precisely the same set of functional roles ». « Unambiguous coherent annotation system » … 3.2.1.4 : 1,4-beta-D-endoglucanase, 1,4-beta-D-glucan-4-glucanohydrolase, beta-1,4-endoglucan hydrolase, beta-1,4-endoglucanase, endoglucanase,
Methodology • CAZYdb: E.C. 3.2.1.4, GHx families • GHx sequences from Pfam (pro. + euk.) and InterPro (pro.): PfGHx.FASTA, IprGHx.FASTA • Home-made script: sequence → pSEED PEG IDs → FigFam IDs • Result: GH families ↔ FigFam IDs • Several FigFam IDs correspond to one GHx family because signal peptides and accessory domains are not conserved …
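Because several FigFam IDs can correspond to one GHx family, the per-genome count has to be collapsed onto GH families. A minimal sketch of that step follows; the lookup table and the FigFam identifiers are hypothetical placeholders, not real pSEED IDs, and this is not the home-made script itself.

```python
from collections import Counter

# Hypothetical lookup: several FigFam IDs point to the same GH family.
FIGFAM_TO_GH = {
    "FIG_X1": "GH5",
    "FIG_X2": "GH5",
    "FIG_X3": "GH6",
}

def count_gh_families(figfam_ids):
    """Count GH families in one genome from its FigFam annotations.

    Annotations that are not in the lookup table are ignored.
    """
    return Counter(FIGFAM_TO_GH[f] for f in figfam_ids if f in FIGFAM_TO_GH)

# One genome annotated with three GH-related FigFams and one unrelated FigFam.
genome_annotations = ["FIG_X1", "FIG_X2", "FIG_X3", "FIG_other"]
gh_counts = count_gh_families(genome_annotations)
```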
Methodology • FigFam IDs + genome annotations (pSEED) → GHx distribution • GHx occurrence in sequenced genomes, per bacterial group (occurrence / list); same for CBM2 … • Statistics, alignment
A huge data-set • A: Actinobacteria • B: Aquificae • C: Bactero./Chlorobi • D: Chlam./Verruco. • E: Chloroflexi • F: Chrysiogenetes • G: Cyanobacteria • H: Deferribacteres • I: Deinoco./Thermus • J: Dictyoglomi • K: Elusimicrobia • L: Fibrob./Acidobact. • M: Firmicutes • N: Fusobacteria • O: Nitrospirae • P: Gemmatimonadetes • Q: Planctomycetes • R: Proteobacteria • S: Spirochaetes • T: Synergistetes • U: Tenericutes • V: Thermodesulfobact. • W: Thermotogae • Huge bias: A + C + M + R = 88% of the sequenced genomes…
Average Gene Content (AGC) [figure: AGC per bacterial group, same groups A to W as on the data-set slide] • Life style (auto- vs. heterotrophic) • Host association … • “HKG” • Multi-function …
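The slides do not define AGC explicitly. The sketch below assumes it is the mean number of GHx genes per genome within a bacterial group; both that definition and the counts are assumptions made for illustration.

```python
from collections import defaultdict

def average_gene_content(records):
    """Mean GHx gene count per genome, grouped by phylum.

    records: iterable of (phylum, n_ghx_genes_in_one_genome) pairs.
    """
    totals = defaultdict(int)   # summed gene counts per phylum
    genomes = defaultdict(int)  # number of genomes per phylum
    for phylum, n_ghx in records:
        totals[phylum] += n_ghx
        genomes[phylum] += 1
    return {p: totals[p] / genomes[p] for p in totals}

# Hypothetical per-genome counts (not real data).
records = [("Firmicutes", 4), ("Firmicutes", 2), ("Tenericutes", 0)]
agc = average_gene_content(records)
```

Averaging per group rather than summing keeps the heavily sequenced groups (A, C, M, R) from dominating the comparison.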
GHx distribution in Genomes: Life Style • Autotrophic: Aquificae, Cyanobacteria, Chrysiogenetes, Nitrospirae • Host-associated: Chlam./Verruco., Elusimicrobia, Fibrob./Acidobact.*, Fusobacteria, Spirochaetes, Tenericutes
GHx distribution in Genomes: GHx functions, « house keeping » • GH6: endoglucanase; cellobiohydrolase • GH18: … endo-β-N-acetylglucosaminidase … • Q: Planctomycetes • U: Tenericutes (Mycoplasma)
GHx distribution in Genomes: GHx functions, GHx families « specialization » • GH6: endoglucanase; cellobiohydrolase • GH5: chitosanase; β-mannosidase; cellulase; glucan β-1,3-glucosidase; licheninase; glucan endo-1,6-β-glucosidase; mannan endo-β-1,4-mannosidase; endo-β-1,4-xylanase; cellulose β-1,4-cellobiosidase; β-1,3-mannanase; xyloglucan-specific endo-β-1,4-glucanase; mannan transglycosylase; endo-β-1,6-galactanase; endoglycoceramidase • How can we know whether an enzyme from GH5 is a cellulase?
Complex architectures • GH5: chitosanase (EC 3.2.1.132); β-mannosidase (EC 3.2.1.25); cellulase (EC 3.2.1.4); glucan β-1,3-glucosidase (EC 3.2.1.58); licheninase (EC 3.2.1.73); glucan endo-1,6-β-glucosidase (EC 3.2.1.75); mannan endo-β-1,4-mannosidase (EC 3.2.1.78); endo-β-1,4-xylanase (EC 3.2.1.8); cellulose β-1,4-cellobiosidase (EC 3.2.1.91); β-1,3-mannanase (EC 3.2.1.-); xyloglucan-specific endo-β-1,4-glucanase (EC 3.2.1.151); mannan transglycosylase (EC 2.4.1.-); endo-β-1,6-galactanase (EC 3.2.1.164); endoglycoceramidase (EC 3.2.1.123) • GH6: endoglucanase (EC 3.2.1.4); cellobiohydrolase (EC 3.2.1.91) • GH5 = multifunction • GH6 = « cellulase » • Free cellulases from GH6 are associated with cellulose production in actinomycetes!?
Is there an efficient combination of enzymes? • Some genes are abundant (GH5, 10, 16, 18, 19) • Are these genes really involved in PCW breakdown? • Multi-domain • Why are Fibrobacteria so efficient?
Is there an efficient combination of enzymes? The keys to success in Fibrobacteria
Things to remember… • Huge dataset • Distribution of GHx amongst taxa • Not all GHx are equivalent • Multifunction, house keeping and specialized GHx families • Not all taxa are equivalent • Life style, metabolism • Future: « Multi-domain »
What’s next • Looking at the GHx distribution in subgroups (e.g. Proteobacteria, Firmicutes, …) • Detailed table of the GHx distribution amongst (sub)-taxa
Potential publication? • What is the phylogenetic distribution of GHx’s and CBM-GHx’s? • Catabolism regulation analysis in Actinobacteria, CebR (GHx vs. CBM-GHx): • Presence/absence of regulatory sequences upstream of the GHx-coding sequences • Environmental factors: “life style”, “metabolism”, … • Gene gain/loss: 16S rRNA vs. presence/absence of GHx’s
Does the cellulose degradation potential vary among environments?
Some case studies … GHx distribution in metagenomes, % of CBM-linked GHx • Warnecke 2007: Spirochaetes, Fibrobacter, Bacteroidetes, … • Hess 2011: Bacteroidetes, Fibrobacteria, Clostridia, …
…Vs. Our study Using the SSU…
…Vs. Our study • Reno 2012 (probably): Actinobacteria, Alphaproteobacteria, Bacteroidetes, … • Warnecke 2007: Spirochaetes, Fibrobacter, Bacteroidetes, … • Hess 2011: Bacteroidetes, Fibrobacteria, Clostridia, …
Metagenomes Clustering • 16S rRNA • GHx? • Environment selects for different populations (with different GHx)
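Clustering metagenomes by their GHx content starts from a pairwise dissimilarity between GHx count profiles. The slides do not say which metric was used; the sketch below swaps in Bray-Curtis dissimilarity as a common choice, and the sample names and counts are invented.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count profiles (dicts)."""
    keys = set(a) | set(b)
    num = sum(abs(a.get(k, 0) - b.get(k, 0)) for k in keys)
    den = sum(a.get(k, 0) + b.get(k, 0) for k in keys)
    return num / den if den else 0.0  # two empty profiles: treated as identical

# Hypothetical GHx counts in three metagenomes (not real data).
profiles = {
    "sample_1": {"GH5": 10, "GH6": 5},
    "sample_2": {"GH5": 10, "GH6": 5},
    "sample_3": {"GH9": 15},
}

names = sorted(profiles)
# One entry per unordered pair of samples.
distances = {
    (p, q): bray_curtis(profiles[p], profiles[q])
    for p in names for q in names if p < q
}
```

The same function applied to 16S rRNA taxon counts would give the taxonomic counterpart, so the two clusterings can be compared sample by sample.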
Things to remember… • Different recipes for efficient PCW breakdown • Depending on the ecosystem • Leaf litter (LL) ≠ Cow rumen (CR) • Bacterial content • GH content • Depending on the ecosystem, bacteria display different strategies to access plant polymers • [GH6, GH8, GH9]LL > [GH6, GH8, GH9]CR • [CBM-GHx]LL > [CBM-GHx]CR
What’s next • Leaf Litter Metagenome • 22 samples almost ready to be sequenced (TruSeq™ DNA, Illumina) (first year) • samples to be prepared (second year) • Compare: [GHx/16S rRNA in sequenced genomes] vs. [GHx/16S rRNA in Leaf Litter] • Compare different treatments, metagenomes
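The planned comparison relies on normalizing GHx hits by 16S rRNA hits so that data sets of different sizes become comparable. A minimal sketch, with made-up hit counts and a hypothetical helper name `ghx_per_16s`:

```python
def ghx_per_16s(ghx_hits, ssu_hits):
    """Number of GHx hits per 16S rRNA hit in one data set."""
    if ssu_hits <= 0:
        raise ValueError("need at least one 16S rRNA hit to normalize")
    return ghx_hits / ssu_hits

# Hypothetical hit counts (not real data).
counts = {
    "sequenced_genomes": {"ghx": 50, "ssu": 200},
    "leaf_litter": {"ghx": 20, "ssu": 200},
}

ratios = {name: ghx_per_16s(c["ghx"], c["ssu"]) for name, c in counts.items()}
```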
Nitrogen fertilization Nemergut, 2008, The effects of chronic nitrogen fertilization on alpine tundra soil microbial communities: implications for carbon and nitrogen cycling.
24 samples • TruSeq™ DNA (Illumina) • 22 samples ready to be sequenced
Complex architectures [figure: multi-domain proteins on the plant cell wall, combining CBM2, Cel5 and Xyl8 domains] • Number of FigFam IDs corresponding to a 2-domain protein • Number of FigFam IDs ≠ number of genes
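The gap between the number of domain-level hits and the number of genes can be shown on a toy set of multi-domain proteins. The protein names and architectures below are hypothetical and only reuse the domain labels from the slide.

```python
# Hypothetical proteins and their domain architectures.
proteins = {
    "prot_1": ["CBM2", "Cel5"],  # 2-domain protein
    "prot_2": ["Cel5", "Xyl8"],  # 2-domain protein
    "prot_3": ["Cel5"],          # single-domain protein
}

n_genes = len(proteins)                                     # one gene per protein
n_domain_hits = sum(len(d) for d in proteins.values())      # one hit per domain
multi_domain = sorted(p for p, d in proteins.items() if len(d) > 1)
```

Counting hits per domain yields 5 here while only 3 genes exist, which is why domain-level counts cannot be read directly as gene counts.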
Metagenomes Clustering • 16S rRNA • GHx • Environment selects for different GHx potential