
Phylogenetic diversity and proposed ecological roles of rare members of the soil biosphere

Mostafa S. Elshahed, Oklahoma State University


Presentation Transcript


  1. Phylogenetic diversity and proposed ecological roles of rare members of the soil biosphere Mostafa S. Elshahed Oklahoma State University

  2. Mostafa Elshahed, January 23rd, 1997

  3. Phylogenetic diversity and proposed ecological roles of rare members of the soil biosphere Mostafa S. Elshahed Oklahoma State University

  4. A yet-unexplored rare biosphere • Microbial communities often exhibit a species distribution pattern in which the majority of microbial species are present in low abundance. • Sampling effort in highly diverse environments usually covers only a small fraction of the estimated number of species. • Low-abundance, rarely sampled species have been called the rare biosphere. [Figure: curve constructed from the Schloss and Handelsman dataset of Alaskan soil (PLoS Comput. Biol. 2006, 2:e92)]

  5. What’s in the rare biosphere? • How do we define and access the rare biosphere? • What is the level of phylogenetic diversity within members of the rare biosphere? • What are the evolutionary relationships between rare and abundant members of the community? • What are the community dynamics between rare and abundant members of the community? • What is the ecological role (if any) of members of the rare biosphere? • What is the effect of the rare biosphere on species richness estimates?

  6. The rare soil biosphere • Soil is an extremely valuable ecosystem for economic sustainability as well as for global elemental cycling. • Highly diverse: species richness estimates range between 2,000 and 55,000. • One of the most intensively sampled ecosystems: 77,103 sequences in RDP, almost all of which originated from small clone libraries. • The current collection of 16S sequences in the databases could be regarded as a global survey of abundant species in soils.

  7. Kessler farm soil (KFS) • Undisturbed tallgrass prairie soil in McClain County, Oklahoma. • A 16S clone library was constructed using primer pair 27F-1391R, yielding 13,001 near full-length non-chimeric clones. • Sequences were grouped into 3,747 operational taxonomic units at a 3% sequence divergence cutoff (OTU0.03). • The dataset is 8 times the size of the largest near full-length soil clone library available, and 1/4 the size of the largest pyrosequencing dataset generated from soil.

  8. KFS community composition • 13,001 clones, 3,747 OTUs • 34 different bacterial phyla • Major phyla fairly typical of soil. [Figure panels: groups with abundances > 1%; groups with abundances < 1%]

  9. Defining “rare” [Figure: KFS rarefaction curves at various taxonomic cutoffs (3%, 6%, 8%, 10%, 15%, 20%)] • Subjective process. • Sampling effort did not reach saturation. • OTUs labeled as rare in the KFS dataset represent OTUs with a low probability of being encountered in average-sized clone libraries.

  10. How to define the rare biosphere? Table footnotes: * Calculated using the formula $p = 1 - (1 - x)^{y}$, where p is the probability of detecting, in a small dataset of size y, a species that has relative abundance x in the large dataset. ** Determined using qPCR.
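
The footnote formula on this slide can be evaluated directly. The following is a minimal sketch, assuming x is a relative abundance between 0 and 1 and y is a number of clones; the function name and the library sizes in the loop are illustrative choices, not values from the presentation.

```python
def detection_probability(x: float, y: int) -> float:
    """p = 1 - (1 - x)^y: chance of seeing a species of relative
    abundance x at least once in a library of y clones."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be a relative abundance in [0, 1]")
    if y < 0:
        raise ValueError("y must be a non-negative library size")
    return 1.0 - (1.0 - x) ** y


# A species seen once among 13,001 clones (the KFS library size).
x_singleton = 1 / 13001

# Hypothetical smaller library sizes, chosen only for illustration.
for library_size in (100, 500, 1000):
    p = detection_probability(x_singleton, library_size)
    print(f"library of {library_size} clones: p = {p:.4f}")
```

Under this formula, a low p for typical library sizes is what marks an OTU as "rare" in the sense used on the previous slide.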

  11. Rare members of KFS community 14 phyla are represented by ≤ 5 clones, 4 phyla represented by 1 clone

  12. Novel phylum-level diversity in Kessler Farm Soil • 5 novel candidate phyla (KFS1-KFS5). • Future availability of sequences could add three new phyla. • All novel phyla were represented by fewer than five clones.

  13. Novel subphylum-level lineages in KFS rare biosphere • Novel subphylum-level lineages were detected in all major soil phyla. • With the exception of two lineages in -Proteobacteria, all clones belonging to these novel lineages were present in low abundance.

  14. Rare biosphere harbors lineages not commonly associated with soil • Within the rare KFS biosphere, multiple lineages belonged to known phyla not commonly associated with soil. • Examples include: - Phyla Chlorobia, Caldithrix, Elusimicrobia, and candidate phylum BRC-1 - Clones affiliated with the genus Salinibacter within the Bacteroidetes - Clostridiales-affiliated clones - Clones belonging to the SUP05 lineage within the γ-Proteobacteria • These lineages obligately require specific environmental conditions (strict anaerobic conditions, high salt, high temperature) that are not usually prevalent in soil ecosystems.

  15. Novelty of rare vs. abundant species in KFS dataset • Rare species (n≤5) are more than 7.5% different from their closest relatives in the database. • Abundant species (n≥50) are 0.85-5.9% different from their closest relatives in the database. • Some exceptions.

  16. Uniqueness of rare members of the soil biosphere • What is the phylogenetic relationship between rare and abundant species in our dataset? • To answer this question, we determined the percentage of rare taxa at different taxonomic cutoffs • Species 3% • Genus 6% • Family 8% • Order 10% • Class 15% • Phylum 20%-25% • If the rare species are unique, this percentage should not decrease as the cutoff increases • If they are closely related to other more abundant taxa, the percentage should drop sharply as the cutoff increases • The magnitude of the drop in the % of rare taxa is indicative of the relative contribution of the above 2 scenarios to the total number of rare taxa in our dataset.
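
The test described on this slide can be sketched in a few lines. This is an illustration only: it assumes each clone already carries an OTU label at each cutoff, and the labels, the function name, and the toy data are hypothetical, not taken from the KFS dataset.

```python
from collections import Counter


def percent_rare_clones(otu_labels, max_size=5):
    """Percentage of clones that fall in OTUs with at most max_size clones."""
    sizes = Counter(otu_labels)
    rare = sum(1 for label in otu_labels if sizes[label] <= max_size)
    return 100.0 * rare / len(otu_labels)


# Toy labels for the same 10 clones at two cutoffs (hypothetical data).
# At the wider cutoff, clones "b" and "e" merge into abundant groups,
# while clone "c" stays alone, i.e. it is unique.
labels_by_cutoff = {
    "3% (species)": ["a", "a", "a", "b", "c", "d", "d", "d", "d", "e"],
    "6% (genus)":   ["A", "A", "A", "A", "C", "D", "D", "D", "D", "D"],
}

for cutoff, labels in labels_by_cutoff.items():
    pct = percent_rare_clones(labels, max_size=1)
    print(f"{cutoff}: {pct:.1f}% of clones are rare (n=1)")
```

A sharp drop between cutoffs indicates rare clones with abundant close relatives; the clones that remain rare at wide cutoffs are the unique ones.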

  17. Proportion of unique clones within rare members of the KFS bacterial community [Figure: percentage of rare clones at each taxonomic cutoff, for n=1 and n≤5] • While a fraction of the rare species have close relatives within the more abundant members of the community, a fraction represents unique, evolutionarily distinct lineages. • Rare clones at the putative genus cutoff represent 50.1-66.1% of the rare clones at the putative species level. • Rare clones at the putative class cutoff represent 7.9-16.3% of the rare clones at OTU0.03 (Figure 4b).

  18. • Members of the rare biosphere represent 18.1-37.1% of the KFS dataset. • Members of the rare biosphere are on average more novel than more abundant members of the community. • Members of the rare biosphere either: - Have close relatives within the more abundant members of the KFS community - Belong to phyla not commonly associated with soil - Belong to unique, phylogenetically distinct lineages with no close sequence similarity to more abundant members of KFS. • We reason that recognizing these novelty and uniqueness patterns is key to understanding the origins, dynamics, and ecological roles of various members of the soil’s rare biosphere.

  19. Proposed origins, dynamics, and ecological roles of rare members of the soil biosphere • Non-unique, non-novel members of the rare biosphere act as a back-up system and readily respond to seasonal variations in soil temperature, pH, light exposure, and nutrient levels. • Unique clones belonging to well-described lineages that are not prevalent in soil respond to more drastic disturbances that could occur in the ecosystem. • Unique clones belonging to novel lineages have an old, evolutionarily distinct origin; the ecological role of this group is not clear: - Remnants of evolution with exceptional survival ability. - Perform a yet-unknown ecological role in the ecosystem.

  20. Species richness in KFS: estimation methods • Non-parametric (Chao, ACE): no assumption of distribution. Advantages: computationally easy. Disadvantages: limited diagnostic criteria, arbitrary cutoff point, bias. • Parametric (ML models). Advantages: unbiased, assessment of the model fit, use of the maximum amount of frequency data. Disadvantages: several models to choose from, different models do not produce similar estimates, computationally difficult. • Rarefaction curves: fitting the curve to estimate the asymptote. Advantages: not sensitive to sample size. Disadvantages: precision problems.

  21. Parametric models • Approximate the frequency distribution of captured species, then project the given distribution to estimate the number of unobserved species. • Problems with previous applications of parametric models: • Assumed an a priori distribution (lognormal). • Did not use maximum likelihood estimation of model parameters. • Did not test the goodness of fit or provide a standard error of the estimate.

  22. Parametric models, cont. • As Hong S.-H. et al. (2005) and Jeon et al. (2006) suggested, since there is no reason to assume a priori that a certain model will provide the best fit to the observed frequency data, several models were tested and compared. The model of choice is the one that • Provides the best fit (using χ² goodness of fit) • Has the least standard error • Includes the maximum number of frequency data (highest truncation point) • Models tested have an underlying Poisson (random) sampling distribution and differ in the mixing distribution function • Negative binomial (gamma-mixed Poisson) • Inverse Gaussian-mixed Poisson • Lognormal-mixed Poisson • Pareto-mixed Poisson • Mixture of 2 exponentials-mixed Poisson • Poisson

  23. Our dataset • Using our 13,001 clones, the species richness was estimated using parametric models as discussed above. • The species richness for our dataset was estimated to be 15,009 species. • The mixture of 2 exponentials-mixed Poisson seems to be the model that best describes our frequency distribution. • Different models gave different estimates of species richness.

  24. Rarefaction curve fitting The species richness is estimated by fitting an equation to the curve and estimating the asymptote.

  25. Rarefaction curve fitting • Equations used to fit the curve • Michaelis-Menten (MM) • Exponential • Both the Michaelis-Menten and the exponential curves are forced through the origin, which affects their fit. To improve the fit, an intercept is added. • MM with intercept • Exponential with intercept • With bigger datasets, the curvatures at the beginning and the end of the rarefaction curve are not the same, so a single MM equation is not a good fit. A double MM equation, with one term for the beginning and one for the end of the rarefaction curve, should solve this problem. • Double MM equation
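
The equations themselves are not preserved in this transcript. As a hedged illustration of the simplest case only, the sketch below assumes the standard two-parameter Michaelis-Menten form S(n) = S_max · n / (K + n), where S_max is the asymptote (the richness estimate). The symbol names, the synthetic data points, and the solver (a log-scale grid search with a closed-form S_max for each K) are all assumptions made for this example; this is not the curve-fitting software used in the presentation.

```python
import math


def mm(n, s_max, k):
    """Michaelis-Menten curve: OTUs expected after sampling n clones."""
    return s_max * n / (k + n)


def best_s_max(ns, ss, k):
    """For fixed k the model is linear in s_max, so solve it in closed form."""
    f = [n / (k + n) for n in ns]
    return sum(s * fi for s, fi in zip(ss, f)) / sum(fi * fi for fi in f)


def sse(ns, ss, k):
    """Sum of squared residuals at the best s_max for this k."""
    s_max = best_s_max(ns, ss, k)
    return sum((s - mm(n, s_max, k)) ** 2 for n, s in zip(ns, ss))


def fit_mm(ns, ss, k_lo=1.0, k_hi=1e8, points=41, rounds=30):
    """Least-squares fit by repeatedly narrowing a log-spaced grid over k."""
    lo, hi = math.log(k_lo), math.log(k_hi)
    for _ in range(rounds):
        grid = [lo + (hi - lo) * i / (points - 1) for i in range(points)]
        errs = [sse(ns, ss, math.exp(g)) for g in grid]
        j = errs.index(min(errs))
        lo, hi = grid[max(j - 1, 0)], grid[min(j + 1, points - 1)]
    k = math.exp((lo + hi) / 2)
    return best_s_max(ns, ss, k), k


# Synthetic, noise-free rarefaction points (hypothetical parameters).
clones = [100, 500, 1000, 3000, 13001]
otus = [mm(n, 5000.0, 8000.0) for n in clones]

s_max_hat, k_hat = fit_mm(clones, otus)
print(f"estimated asymptote: {s_max_hat:.1f}, half-saturation: {k_hat:.1f}")
```

The intercept and double-MM variants add parameters to the same idea, and a general nonlinear least-squares routine would normally replace the grid search.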

  26. Rarefaction curve fitting • Materials and methods • Analytic Rarefaction software was used to construct the rarefaction curves. • Once the rarefaction curve is available, the data are fitted using a nonlinear least-squares method. Software is available online. • 5 different equations are used to fit the curve. • The MM and exponential equations have 2 parameters • The intercept equations have 3 parameters • The double MM equation has 4 parameters • For each equation, one of the parameters is the asymptote, i.e. the estimated species richness. • The curve-fitting software gives each parameter and its SE as well as the residuals (differences between the observed and fitted data). • The best model has the least SE and the smallest residuals.
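
For readers who want to reproduce a rarefaction curve without dedicated software, the sketch below uses the standard analytic (Hurlbert) rarefaction formula, E[S_n] = Σ_i [1 − C(N − N_i, n) / C(N, n)]. That the Analytic Rarefaction program uses exactly this formulation is an assumption here, and the OTU counts are toy values.

```python
from math import comb


def rarefied_richness(otu_counts, n):
    """Expected number of OTUs in a random subsample of n clones,
    drawn without replacement from a library with the given OTU sizes."""
    total = sum(otu_counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the library size")
    denom = comb(total, n)
    # comb(total - c, n) is 0 when the subsample cannot miss the OTU.
    return sum(1 - comb(total - c, n) / denom for c in otu_counts)


# Toy library: one abundant OTU, a few moderate ones, many singletons.
toy_counts = [40, 10, 5, 5] + [1] * 20

for n in (10, 20, 40, 80):
    print(f"n = {n}: expected OTUs = {rarefied_richness(toy_counts, n):.2f}")
```

The (n, expected OTUs) pairs produced this way are the points to which the fitting equations are then applied.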

  27. Rarefaction curve fitting • Double MM was the model that gave the best fit (least residuals)

  28. Non parametric estimators 2 estimators are the most common: • Chao. Uses the number of species observed in the dataset as well as number of singletons (OTUs that occurred once) and doubletons. • Abundance-coverage estimator (ACE). Divides the data into abundant and rare species usually at a cutoff of 10.
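
Both estimators can be computed from the list of OTU sizes alone. The sketch below uses the commonly published forms of Chao1 and ACE; the slide does not give the exact variants used, so the formulas, the fallback for zero doubletons, and the toy counts are assumptions.

```python
from collections import Counter


def chao1(otu_counts):
    """Chao1: observed OTUs plus a correction from singletons and doubletons."""
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 == 0:
        # Bias-corrected form, used here to avoid dividing by zero.
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 * f1 / (2.0 * f2)


def ace(otu_counts, rare_cutoff=10):
    """ACE: OTUs with at most rare_cutoff clones are treated as rare."""
    counts = [c for c in otu_counts if c > 0]
    rare = [c for c in counts if c <= rare_cutoff]
    s_abund = len(counts) - len(rare)
    s_rare, n_rare = len(rare), sum(rare)
    if n_rare == 0:
        return float(s_abund)
    freq = Counter(rare)          # freq[i] = number of OTUs with i clones
    f1 = freq[1]
    coverage = 1.0 - f1 / n_rare  # sample coverage of the rare group
    if coverage == 0:
        raise ValueError("all rare OTUs are singletons; ACE is undefined")
    spread = sum(i * (i - 1) * fi for i, fi in freq.items())
    gamma_sq = max(s_rare / coverage * spread / (n_rare * (n_rare - 1)) - 1.0, 0.0)
    return s_abund + s_rare / coverage + f1 / coverage * gamma_sq


toy_counts = [1, 1, 1, 2, 2, 5]  # hypothetical OTU sizes
print("Chao1:", chao1(toy_counts))
print("ACE:  ", round(ace(toy_counts), 3))
```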

  29. Species richness estimates in KFS • Different methods predict different estimates of species richness. • The Chao estimate is the lower bound. • The highest estimate was found with the parametric model. • Since the parametric models had the most controls, we expect the parametric estimate to be the most accurate. Using a 13,001-clone library, we estimate 15,009 ± 722 species to be present in soil at the time of sampling.

  30. Is this the true species richness? • As the sample size increases, the species richness estimate also increases. • Is our sample size (13,001 clones) enough to predict the true richness? • We randomly sampled our dataset to construct 100-, 500-, 1000-, and 3000-clone libraries. We treated each of them as a separate dataset and estimated species richness for each dataset by all the above methods.
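
The subsampling experiment can be mimicked on any list of OTU sizes. The sketch below is an illustration under stated assumptions: the community is a made-up toy, sampling is without replacement with a fixed seed, and only Chao1 is recomputed (the presentation applied all of its estimators).

```python
import random
from collections import Counter


def subsample_library(otu_counts, size, seed=0):
    """Draw `size` clones at random (without replacement) and
    return the OTU sizes of the resulting smaller library."""
    clones = [otu for otu, c in enumerate(otu_counts) for _ in range(c)]
    picked = random.Random(seed).sample(clones, size)
    return sorted(Counter(picked).values(), reverse=True)


def chao1(otu_counts):
    """Chao1 richness estimate (bias-corrected form if no doubletons)."""
    s_obs = len(otu_counts)
    f1 = sum(1 for c in otu_counts if c == 1)
    f2 = sum(1 for c in otu_counts if c == 2)
    if f2 == 0:
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 * f1 / (2.0 * f2)


# Hypothetical community of 600 clones: few abundant OTUs, many rare ones.
toy_counts = [200, 100, 50] + [5] * 20 + [1] * 150

for size in (100, 300, 600):
    sub = subsample_library(toy_counts, size)
    print(f"{size} clones: {len(sub)} OTUs observed, Chao1 = {chao1(sub):.0f}")
```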

  31. Species richness with different sample size datasets • Regardless of the method used, the estimate increased with sample size

  32. Towards a sample size-unbiased estimate of species richness • A plot of the species richness estimate (SRest) against the actual clone library size (CLact) could be used. • However, the asymptote of SRest occurs at clone library sizes that are orders of magnitude larger than the actual sizes used to plot the curve, making asymptote determination (and hence true SR determination) grossly inaccurate. • The theoretical clone library size required to encounter the absolute majority of species richness (CLth) is much larger than CLact, so a CLth vs. SRest plot is better suited for SR determination.

  33. CLth required to observe the absolute majority of SRest at different CLact • When CLth = CLact, the effort required to observe the absolute majority of species is met. • A further increase in CLact will not increase the CLth required to observe the absolute majority of SR. • SRest will not increase upon further sampling and therefore represents a sample size-unbiased estimate of species richness. • This CLth value was determined to be $6.3 \times 10^{6}$.

  34. Sample size-unbiased estimate of species richness • Curve fitting of the CLth vs. SRest plot suggested 17,230 as a sample size-unbiased estimate of species richness. • This value is 15% higher than the SRest determined using the 13,001-clone dataset.

  35. Species richness estimates, conclusions • Species richness estimates increase with dataset size. Reported estimates are a fraction of the “true” richness. • We propose an approach that provides a sample size-unbiased estimate of species richness. • The approach suggested a species richness value of 17,230, compared with estimates of 345 to 15,009 from libraries of 100 to 13,001 clones.

  36. Comparative diversity between different phyla in soil • All previous studies treated soil as a single dataset. • Soil has a fairly stable composition. • We compared the diversity between the phyla that each made up more than 3% of our dataset: Actinobacteria, Acidobacteria, Chloroflexi, Verrucomicrobia, Bacteroidetes, Planctomycetes, and the α-, β-, and δ-proteobacteria.

  37. Comparative diversity indices 1. Single indices • Shannon index is the most common • Has been used for both macro- and micro-communities. • Disadvantage: highly sensitive to sample size
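
For reference, the Shannon index is H' = −Σ p_i ln p_i, where p_i is the fraction of clones in OTU i. A minimal sketch follows; the natural logarithm is assumed (the slide does not state the base), and the two toy communities are hypothetical.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over non-empty OTUs."""
    total = sum(otu_counts)
    return -sum((c / total) * math.log(c / total) for c in otu_counts if c > 0)


even = [25, 25, 25, 25]    # four equally abundant OTUs
skewed = [85, 5, 5, 5]     # same richness, one dominant OTU

print(f"even:   H' = {shannon_index(even):.3f}")
print(f"skewed: H' = {shannon_index(skewed):.3f}")
```

Because rare OTUs are the ones most likely to be missed in a small library, H' computed from a small sample tends to underestimate the value for the full community, which is the sample-size sensitivity noted on the slide.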

  38. Shannon’s index for the major phyla in soil

  39. Comparative diversity indices, cont. 2. Rarefaction curves [Figure: rarefaction curves for two communities, A and B; x-axis: number of clones sampled; y-axis: number of OTUs observed] • Can be used to rank communities: the community whose rarefaction curve lies above is the community with higher diversity. Less sensitive to sample size than single indices. • Disadvantages • If the 2 rarefaction curves cross, the communities cannot be ranked with regard to diversity. • Even if they do not cross at the current sample size, there is no guarantee that they will not cross as the sample size increases.

  40. Rarefaction curves of the 9 major phyla • Rarefaction ordering, from least to most diverse: Verrucomicrobia, Beta-proteobacteria, Bacteroidetes, Delta-proteobacteria, Acidobacteria, Alpha-proteobacteria, Chloroflexi, Actinobacteria, Planctomycetes

  41. Comparative diversity indices, cont. 3. Diversity profiling • A potential solution to the problems with single diversity indices is offered by the use of parametric families of diversity indices. • There are 3 different groups of diversity orderings, each of which can be represented by more than one method. • We compared the 9 major phyla using every method of diversity profiling (a total of 12 methods in 3 major groups) to come up with a ranking of diversity.

  42. Comparative diversity indices, cont. • x: inconclusive (profiles crossed) • In diversity profiling we make a decision about any 2 phyla only if they are comparable (their profiles do not cross) by at least 2 groups of methods. • Diversity profile ranking: Planctomycetes > Actinobacteria > Acidobacteria > Chloroflexi > α-proteobacteria > δ-proteobacteria > β-proteobacteria > Bacteroidetes > Verrucomicrobia
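
The slides do not name the 12 profiling methods. As one hedged example of a parametric family of diversity indices, the sketch below uses the Rényi entropy, H_α = ln(Σ p_i^α) / (1 − α), which equals ln(richness) at α = 0 and the Shannon index as α approaches 1. The choice of family, the α values, and the toy communities are assumptions for illustration; the comparison rule mirrors the "profiles must not cross" criterion described above.

```python
import math


def renyi_entropy(otu_counts, alpha):
    """Rényi entropy of order alpha; alpha = 1 is the Shannon limit."""
    total = sum(otu_counts)
    p = [c / total for c in otu_counts if c > 0]
    if abs(alpha - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p)
    return math.log(sum(pi ** alpha for pi in p)) / (1.0 - alpha)


def compare_profiles(a, b, alphas=(0, 0.5, 1, 2, 4, 8), tol=1e-9):
    """Rank two communities only if their profiles do not cross."""
    diffs = [renyi_entropy(a, t) - renyi_entropy(b, t) for t in alphas]
    if all(abs(d) <= tol for d in diffs):
        return "equal"
    if all(d >= -tol for d in diffs):
        return "first"
    if all(d <= tol for d in diffs):
        return "second"
    return "inconclusive"  # profiles cross


# Toy communities (hypothetical OTU sizes).
even = [25, 25, 25, 25]
skewed = [70, 10, 10, 10]
rich_but_dominated = [55, 9, 9, 9, 9, 9]
moderate = [40, 30, 20, 10]

print(compare_profiles(even, skewed))
print(compare_profiles(moderate, rich_but_dominated))
```

In the second comparison one community has more OTUs while the other is more even, so the profiles cross and no ranking is made, which corresponds to an "x" entry on this slide.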

  43. Summary of diversity rankings

      Phylum                 Rarefaction ranking   Shannon ranking   Diversity profiling   Diversity
      Planctomycetes         9                     8                 9                     High
      Actinobacteria         8                     9                 8                     High
      Acidobacteria          5                     7                 7                     Moderate
      Chloroflexi            7                     6                 6                     Moderate
      Alpha-proteobacteria   6                     5                 5                     Moderate
      Delta-proteobacteria   4                     4                 4                     Moderate
      Beta-proteobacteria    2                     2                 3                     Low
      Bacteroidetes          3                     3                 2                     Low
      Verrucomicrobia        1                     1                 1                     Low

  44. Ecological implications of differential diversity [Figure: trees for Planctomycetes and Verrucomicrobia; S. Giovannoni, Nature 430:515-516 (2004)] • More diverse phyla have a high OTU/clone ratio at different taxonomic cutoffs. • More diverse phyla have more basal branches; less diverse phyla have more peripheral branches. • Evolutionary sweeps purge branches with a similar niche or role in ecosystem functioning. Basal branches that survive are essential for ecosystem functioning. • Members of microdiverse clusters arise from neutral mutation, occupy the same niche, and fulfill similar services to the ecosystem. • More diverse phyla, with more basal branches, are more important to ecosystem functioning than phyla with lower diversity. • Just a theory.

  45. Summary • A near-complete 16S rRNA gene clone library (13,001 clones) was constructed from an undisturbed tallgrass prairie soil. • To our knowledge, this is the largest full-length 16S rRNA gene clone library from a single PCR reaction. • Phylogenetic analysis identified 34 phyla and 3,747 species within the dataset. • The large sample size allowed the discovery of 5 new candidate phyla. • The rare biosphere in Kessler farm soil is phylogenetically diverse, harbors novel lineages at all taxonomic levels, and is more novel than abundant clones in KFS. • The rare biosphere is a mixture of both unique species and species closely related to abundant soil microorganisms.

  46. Summary, cont. • The distribution of species in KFS soil follows a mixture of 2 exponentials-mixed Poisson. • Parametric species estimates suggested a species richness of 15,009 at the time of sampling. A sample size-unbiased approach suggested 17,230 species. • Differential diversity studies were conducted within the community as opposed to between communities. Some of the methods used for differential diversity are new to microbial ecology (diversity profiling). • Of the nine major phyla in Kessler farm soil, Planctomycetes had the highest diversity and the highest percentage of rare species; Verrucomicrobia had the lowest diversity.

  47. Acknowledgments • Noha Youssef, James Davis • Lee Krumholz, Anne Spain, Cody Sheik • Bruce Roe, Fares Najar, Leonid Sukharnikov • David Bruce, Kerrie Barry • Vanessa Bailey • OSU administration for keeping us homeless for 8 months

  48. Current and Future plans I. Explore the diversity, dynamics, and ecological roles of the rare biosphere in multiple anaerobic habitats. • Combine pyrosequencing and capillary sequencing to identify extremely rare microorganisms. • Double, triple, or quadruple the number of known bacterial phyla. • More accurate estimates of species richness. • Global patterns of differential diversity between various bacterial phyla. II. Quantification, visualization, and metagenomics of rare (and abundant) candidate phyla.
