
Multivariate statistics for use in ecological studies

Kevin Wilcox, ECOL 600 – Community Ecology, Spring 2014.


Presentation Transcript


  1. Multivariate statistics for use in ecological studies Kevin Wilcox ECOL 600 – Community Ecology Spring 2014

  2. Useful web resources • Vegan tutorial: http://cc.oulu.fi/~jarioksa/opetus/metodi/vegantutor.pdf • The little book of R for multivariate analyses: http://little-book-of-r-for-multivariate-analysis.readthedocs.org/en/latest/src/multivariateanalysis.html#means-and-variances-per-group • Ordination Methods by Michael Palmer: http://ordination.okstate.edu/overview.htm#Nonmetric_Multidimensional_Scaling • Community analyses lectures by Jari Oksanen: http://cc.oulu.fi/~jarioksa/opetus/metodi/

  3. Univariate statistics to measure community dynamics • Richness (R or S; either local or regional) • Shannon index (H’; Shannon & Weaver 1949) • Incorporates both richness and relative abundances into a single metric • Emphasizes richness • Simpson’s index (D or λ; Simpson 1949) • Emphasizes evenness • Pielou’s evenness index (J’)
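
The four indices on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than the lecture's own code: the function name `diversity_indices` and the example abundances are assumptions, Simpson's index is computed in its dominance form (the sum of squared relative abundances), and natural logarithms are used for H’.

```python
import math

def diversity_indices(abundances):
    """Return richness, Shannon H', Simpson D and Pielou J' for one plot.

    Assumes at least one species has a positive abundance.
    """
    counts = [a for a in abundances if a > 0]      # species absent from the plot are ignored
    total = sum(counts)
    props = [c / total for c in counts]            # relative abundances p_i
    richness = len(counts)                         # S
    shannon = -sum(p * math.log(p) for p in props) # H' = -sum(p_i * ln p_i)
    simpson = sum(p * p for p in props)            # D = sum(p_i^2), dominance form
    pielou = shannon / math.log(richness) if richness > 1 else 0.0  # J' = H' / ln S
    return {"S": richness, "H": shannon, "D": simpson, "J": pielou}

# Hypothetical plots with the same richness but different evenness.
even = diversity_indices([25, 25, 25, 25])
uneven = diversity_indices([97, 1, 1, 1])
print(even)
print(uneven)
```

Both plots have the same richness, but the uneven plot has a lower H’ and J’ and a higher D, which is the kind of contrast these indices are meant to capture.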

  4. Univariate indices (cont.) No information about individual species responses • Species A and B are dominant in low light • All species do OK with moderate light • Species E and F are dominant in high light

  5. You could look at each species individually • Lacks clarity with many species • When looking solely at individual responses, you lose information about entire community dynamics [Figure: abundance of each species plotted against light] OR…

  6. You could use multivariate statistics • 3 parts: • Dissimilarity matrices • Ordinations • Statistical tests of differences between or among communities [Figure: ordination plots, axis label MDS1]

  7. You could use multivariate statistics • 3 parts: • Dissimilarity metrics and matrices • Ordination • Statistical tests of differences between or among communities • Software: • R • SAS • SPSS • PRIMER with PERMANOVA+ • PC-ORD

  8. Dissimilarity metrics are the building blocks used in many multivariate statistics • Visual representation (ordination) • Statistical tests • Think carefully about which type of matrix or dissimilarity metric you should use [Figure annotation: P < 0.05]

  9. Dissimilarity matrices… brace yourself • A dissimilarity matrix is simply a table that compares all local communities (plots). The higher the number, the more dissimilar the communities are
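
As a rough sketch of what such a table looks like, the snippet below builds a dissimilarity matrix for three hypothetical plots. The Bray-Curtis metric is used here only as an example, and the helper names and abundances are illustrative assumptions.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two plots (lists of species abundances)."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator if denominator else 0.0

def dissimilarity_matrix(plots, metric):
    """Square table comparing every plot with every other plot."""
    n = len(plots)
    return [[metric(plots[i], plots[j]) for j in range(n)] for i in range(n)]

# Hypothetical abundances of three species in three plots.
plots = [[10, 5, 0], [8, 6, 1], [0, 0, 12]]
D = dissimilarity_matrix(plots, bray_curtis)
for row in D:
    print(["%.2f" % value for value in row])
```

The diagonal is 0 (each plot compared with itself), the matrix is symmetric, and the two plots with no species in common get the largest value.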


  11. Types of dissimilarity metrics • Euclidean distance (ED) • Operates in species space • Meaning that each species (or dependent variable) gets its own orthogonal axis in multidimensional space. • Because the differences are squared, single large differences become very important when determining dissimilarities • Dissimilarities between pairs of plots with no shared species are not necessarily the same • This is why ED is usually used for environmental data and not for abundance data

  12. Types of dissimilarity metrics • Manhattan-type distances • Bray-Curtis (abundance data) • Jaccard (presence-absence) • Use sums of absolute differences instead of squared terms, making them less sensitive to single large differences • Reach a maximum dissimilarity of 1 when there are no shared species between communities
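
The contrast between Euclidean and Manhattan-type metrics can be checked with a small sketch. The plots below are hypothetical; in both pairs the two plots share no species, so Bray-Curtis and Jaccard should both give 1 while the Euclidean distances differ.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def bray_curtis(x, y):
    denominator = sum(a + b for a, b in zip(x, y))
    return sum(abs(a - b) for a, b in zip(x, y)) / denominator if denominator else 0.0

def jaccard(x, y):
    """Jaccard dissimilarity on presence-absence: 1 - shared / (species present in either plot)."""
    present_x = {i for i, a in enumerate(x) if a > 0}
    present_y = {i for i, b in enumerate(y) if b > 0}
    union = present_x | present_y
    return 1 - len(present_x & present_y) / len(union) if union else 0.0

# Two hypothetical pairs of plots; neither pair shares any species.
pair_1 = ([10, 0, 0, 0], [0, 10, 0, 0])
pair_2 = ([100, 0, 0, 0], [0, 0, 1, 0])

for name, metric in [("Euclidean", euclidean), ("Bray-Curtis", bray_curtis), ("Jaccard", jaccard)]:
    print(name, round(metric(*pair_1), 3), round(metric(*pair_2), 3))
```

Euclidean distance rates the second pair as far more dissimilar than the first purely because of the large abundance, whereas the other two metrics treat both pairs as maximally dissimilar.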

  13. Dissimilarity metrics are used to look at differences between communities • Euclidean distances are good for looking at many types of environmental data but are not well suited to species abundances. Knapp et al. in prep

  14. Ordinations • Basically, ordinations plot the communities based on all response variables (e.g. species responses) and then squish this into 2 or 3 dimensions. • Example 1: 2 species, 2 axes. [Figure: Plots 1 to 3 positioned on axes Species A and Species B]

  15. Ordinations • Ordinations plot the communities based on the response variables and then squish this into 2 or 3 dimensions. • Example 2: 3 species, 3 axes • Etc. up to n response variables • We can’t visualize this well beyond 3 axes, but it happens [Figure: Plots 1 to 4 positioned on axes Species A, Species B and Species C]

  16. Ordinations • Example 3: 3 species, 2 axes. [Figure: Plots 1 to 4 shown in 3-D species space (axes Species A, Species B, Species C) and again in a 2-D ordination (Axis 1, Axis 2) with species vectors Sp.A, Sp.B and Sp.C]

  17. Ordinations • Analyzes communities based on all response variables and then reduces the number of dimensions • 2 species, 2 axes • 3 species, 3 axes • Etc. up to n species, n axes • Need to squash n dimensions into 2. • Ordination rotates the axes to minimize the distance of plots from the primary axes and maximize the variance those axes explain
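
The axis rotation described above can be illustrated for the simplest case of two species, where the direction of maximum variance has a closed form. This is a sketch in the spirit of PCA, not a general ordination routine; the function name and the plot abundances are assumptions.

```python
import math

def rotate_to_principal_axes(points):
    """Rotate 2-D points (e.g. abundances of two species) onto their principal axes."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Sample variances and covariance of the two species.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # Angle of the direction with maximum variance (closed form for a 2x2 covariance matrix).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    # Coordinates of each plot on the rotated axes.
    scores = [(x * c + y * s, -x * s + y * c) for x, y in centered]
    var1 = sum(a * a for a, _ in scores) / (n - 1)
    var2 = sum(b * b for _, b in scores) / (n - 1)
    return theta, scores, var1, var2

# Hypothetical abundances of Species A and Species B in four plots.
plots = [(2, 1), (4, 3), (6, 4), (8, 7)]
theta, scores, var1, var2 = rotate_to_principal_axes(plots)
print(f"axis 1 explains {var1 / (var1 + var2):.1%} of the variance")
```

The rotation keeps the total variance unchanged but concentrates as much of it as possible on the first axis, which is why dropping the remaining axes loses relatively little information.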

  18. Constrained vs unconstrained ordinations • Constrained ordination makes the data fit onto measured variables • This is limiting because you can only relate species differences to things you measured • However, this is beneficial if you are interested in only a couple of environmental variables • Unconstrained ordination tries to represent the variability of the data even if there are no variables to explain the variation • For example, if different temperatures in two areas caused altered communities but temperature was not included in the model, you would still be able to detect differences in community structure • Better for exploratory analyses

  19. Types of unconstrained ordinations • Principal components analysis (PCA) • Uses Euclidean distances to map plots onto the 2 or 3 axes that explain the majority of variation • Use with environmental data • Be sure to standardize response variables if they are on different scales • Principal coordinates analysis (PCO; Gower 1966) • Acts like PCA but uses a dissimilarity matrix instead of pulling straight from the data. • This is more like plotting a close-fitting trendline instead of the actual data. • Fits the line by maximizing a linear correlation – this can be problematic • Is sometimes called metric multidimensional scaling (MDS)… not to be confused with NMDS

  20. Types of unconstrained ordinations • Non-metric multidimensional scaling (NMDS) - PRIMER calls this MDS!!! Ugh. • Very complicated. In the past, the drawback of this technique was the large amount of computing power necessary… this is no longer an issue. • Preserves the rank order of relationships while plotting more similar local communities closer together in 2D or 3D space – this solves the linear problem • Axes aren’t constrained by distances (e.g. Euclidean), so this method is more flexible.

  21. Unconstrained ordinations • Non-metric multidimensional scaling (NMDS) • Stress = mismatch between rank orders of distances in data and in ordination • Excellent – stress < 0.05 • Good – stress < 0.1 • Acceptable – 0.1 < stress < 0.2 • On the edge – 0.2 < stress < 0.3 • Unacceptable – stress > 0.3 • To cope with high stress… increase the number of dimensions of your ordination, if possible
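
To make the idea of stress concrete, here is a hedged sketch of Kruskal's stress-1 for a given set of plot pairs. It assumes you already have the original dissimilarities and the corresponding distances in the ordination, ignores ties in the dissimilarities, and uses hypothetical function names; NMDS software additionally moves the points iteratively to minimize this value, which is not shown.

```python
import math

def monotone_fit(values):
    """Pool-adjacent-violators: the closest non-decreasing sequence to `values`."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the current one.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def stress(dissimilarities, ordination_distances):
    """Kruskal's stress-1 for matching lists of pairwise values."""
    order = sorted(range(len(dissimilarities)), key=lambda k: dissimilarities[k])
    d = [ordination_distances[k] for k in order]   # ordination distances in rank order of the data
    d_hat = monotone_fit(d)                        # best monotone approximation
    numerator = sum((a - b) ** 2 for a, b in zip(d, d_hat))
    denominator = sum(a * a for a in d)
    return math.sqrt(numerator / denominator)

# Hypothetical values for three plot pairs.
print(stress([0.1, 0.5, 0.9], [1.0, 2.0, 3.0]))  # rank order preserved
print(stress([0.1, 0.5, 0.9], [1.0, 3.0, 2.0]))  # rank order violated
```

Stress is zero whenever the ordination distances increase in the same order as the original dissimilarities, regardless of their actual values, which is what "non-metric" refers to.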

  22. Types of constrained ordinations • Constrained analysis of proximities (CAP) • You can plug in any dissimilarity matrix into this • Performs linear mapping • Redundancy analysis (RDA) • Constrained version of PCA • Constrained correspondence analysis (CCA) • Based on Chi-squared distances • Weighted linear mapping

  23. Incorporating environmental data into ordination • Can overlay vectors of environmental data on top of community data • Vectors supply information about the direction and strength of environmental variables • Easy to interpret the effects of many variables • However, it assumes all relationships are linear. This might not be the case… Oksanen 2013

  24. Incorporating environmental data into ordination • Can overlay surfaces of environmental data on top of community data • Surfaces provide more detailed information about how communities exist within abiotic variables • More difficult to interpret with more than a couple variables • Using treatments is a special case for this Oksanen 2013

  25. Ordination by itself is not a robust statistical test • Although ordination is great for visualizing your data, we need to back it up. • One way is to calculate confidence ellipses around the centroid • Another way is to use resemblance-based permutation methods • They give P values… For discussion how to do this in R, see: http://stats.stackexchange.com/questions/34017/confidence-intervals-around-a-centroid-with-modified-gower-similarity

  26. Resemblance-based permutation methods • One benefit of these techniques is that they compare n-dimensional data instead of ordination data squished into 2 or 3 dimensions • Many assumptions of regular MANOVAs are violated by ecological community data (see Clarke 1993), which spurred the creation of new methods for analyzing multivariate data • Three commonly used methods: • Permutational MANOVA (or PERMANOVA) • Analysis of similarities (ANOSIM) • Mantel’s test • One assumption of all three of these tests is equal variance among treatments… • This is a problem but we’ll come back to this

  27. ANOSIM – Clarke 1993 • Ranks dissimilarities among local communities from 1 to the number of comparisons made • Then compares the mean dissimilarity rank of plot pairs between groups with the mean dissimilarity rank of plot pairs within a group to give the R statistic; in the formula, an indicator equals 1 if plots i and j are in the same group and 0 if they are in different groups • Compares the observed R with R values from random permutations to get a p-value (Originally from Clarke 1993 and reviewed in Anderson and Walsh 2013)

  28. ANOSIM – Clarke 1993 Essentially, during each permutation, plot labels in the dissimilarity matrix are shuffled and an R value is calculated. Over many permutations, this creates a null distribution for R against which the original R can be compared. The p-value is obtained from where the original R falls on that distribution. [Figure: density of permuted R values, with the actual R marked]
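
The ranking and label-shuffling steps can be sketched as follows. This is a minimal illustration under stated assumptions: the R statistic is written as (mean between-group rank minus mean within-group rank) divided by M/2, where M is the number of plot pairs, tied dissimilarities receive average ranks, and the function names, seed and example data are hypothetical.

```python
import random

def anosim_r(dist, groups):
    """ANOSIM R from a square dissimilarity matrix and one group label per plot."""
    n = len(groups)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    order = sorted(pairs, key=lambda p: dist[p[0]][p[1]])
    ranks = {}
    k = 0
    while k < len(order):
        m = k
        # Extend the run of tied dissimilarities and give them their average rank.
        while m + 1 < len(order) and dist[order[m + 1][0]][order[m + 1][1]] == dist[order[k][0]][order[k][1]]:
            m += 1
        for p in order[k:m + 1]:
            ranks[p] = (k + m) / 2 + 1
        k = m + 1
    within = [ranks[p] for p in pairs if groups[p[0]] == groups[p[1]]]
    between = [ranks[p] for p in pairs if groups[p[0]] != groups[p[1]]]
    return (sum(between) / len(between) - sum(within) / len(within)) / (len(pairs) / 2)

def anosim(dist, groups, permutations=999, seed=0):
    """Return the observed R and a permutation p-value."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    labels = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(labels)                       # shuffle plot labels
        if anosim_r(dist, labels) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical example: six plots along one gradient, two well-separated groups.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]
groups = ["Y", "Y", "Y", "Z", "Z", "Z"]
r_value, p_value = anosim(dist, groups)
print(r_value, p_value)
```

With only six plots there are few distinct label arrangements, so the p-value cannot get very small even though R is at its maximum of 1; more replicates are needed for a sharper test.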

  29. Mantel test • Doesn’t use ranks • To compare groups, it uses one dissimilarity matrix and one model matrix that designates contrasts, and compares within and among groups • p value is calculated as the proportion of z(0,1) (within-group dissimilarities) that is lower than or equal to z(1,0) (between-group dissimilarities) (See Legendre & Legendre 2012 for more detail) [Figure: example groups Y and Z]
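
A common way to run a Mantel test with a model matrix is sketched below. Note the assumptions: this uses the standardized Mantel statistic (the Pearson correlation between the two matrices' upper triangles) rather than the z notation on the slide, the model matrix is coded 0 for same-group pairs and 1 for different-group pairs, and all names and data are hypothetical.

```python
import math
import random

def upper_triangle(matrix):
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(dist, model, permutations=999, seed=0):
    """Standardized Mantel statistic and one-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(dist)
    model_values = upper_triangle(model)
    observed = pearson(upper_triangle(dist), model_values)
    index = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(index)
        # Permute rows and columns of the dissimilarity matrix together.
        permuted = [[dist[index[i]][index[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), model_values) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical example: six plots, groups Y and Z.
positions = [0, 1, 2, 10, 11, 12]
groups = ["Y", "Y", "Y", "Z", "Z", "Z"]
dist = [[abs(a - b) for b in positions] for a in positions]
model = [[0 if g == h else 1 for h in groups] for g in groups]
r_mantel, p_mantel = mantel(dist, model)
print(r_mantel, p_mantel)
```

Because the raw dissimilarities are used instead of their ranks, a few very large values can influence this statistic more than they would influence ANOSIM.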

  30. PERMANOVA • Calculates a pseudo-F statistic • Pseudo-F is identical to a normal F statistic if there is only one response variable • This pseudo-F is calculated using the original data and compared with a distribution of pseudo-F statistics from many random permutations. This step is the same as in ANOSIM. (See Anderson 2001, 2005 for more detail) [Figure: density of permuted pseudo-F values, with the observed pseudo-F marked]
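
The pseudo-F calculation for a one-factor design can be sketched directly from a dissimilarity matrix, using sums of squared dissimilarities in place of sums of squared deviations. The function names and data below are illustrative assumptions; the example uses a single response variable with Euclidean distances so that the result can be compared with an ordinary F statistic.

```python
import random

def pseudo_f(dist, groups):
    """One-factor pseudo-F from a square dissimilarity matrix and group labels."""
    n = len(groups)
    levels = sorted(set(groups))
    # Total sum of squares: squared dissimilarities over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for level in levels:
        members = [i for i in range(n) if groups[i] == level]
        squared = sum(dist[i][j] ** 2 for a, i in enumerate(members) for j in members[a + 1:])
        ss_within += squared / len(members)
    ss_among = ss_total - ss_within
    a = len(levels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permanova(dist, groups, permutations=999, seed=0):
    """Observed pseudo-F and permutation p-value obtained by shuffling group labels."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    labels = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(labels)
        if pseudo_f(dist, labels) >= observed - 1e-9:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical single response variable measured in six plots.
values = [1, 2, 3, 5, 6, 7]
groups = ["a", "a", "a", "b", "b", "b"]
dist = [[abs(x - y) for y in values] for x in values]
f_value, p_value = permanova(dist, groups)
print(f_value, p_value)
```

For these data a classical one-way ANOVA gives F = 24 (among-group SS 24 on 1 df, within-group SS 4 on 4 df), and the pseudo-F matches it, consistent with the slide's point about a single response variable.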

  31. Choosing a method • A major assumption in all three methods is equal variance among groups • This is often violated in real-world communities • In fact, this change in variance (i.e. dispersion or convergence among replicates or beta diversity) is often of interest to ecologists • So… how do we deal with this? Anderson and Walsh 2013

  32. Choosing a method Anderson and Walsh 2013

  33. PERMDISP • Permutational analysis of multivariate dispersions (Anderson 2004) • Compares multivariate dispersion among groups • Uses any distance or dissimilarity measure you feed into it • 2 main reasons to use this: • To look for violations of assumptions in tests of centroid location (although, as we discussed above, this may not be as big of a deal as once thought) • Variance among local communities within a treatment may be of ecological interest (for more info about using community dissimilarity methods to estimate beta diversity, see Legendre & De Cáceres 2013) [Figure credits: Anderson 2004; Chase 2007]
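
The core idea, comparing distances from each plot to its own group centroid, can be sketched for the simple Euclidean case. This is a simplification and not the PERMDISP program itself: it assumes raw coordinates and Euclidean distance (the real method works from any dissimilarity via principal coordinates), it permutes the distances-to-centroid among groups, and all names and data are hypothetical.

```python
import math
import random

def distance_to_own_centroid(points, groups):
    """Euclidean distance from each plot to the centroid of its own group."""
    centroids = {}
    for level in set(groups):
        members = [p for p, g in zip(points, groups) if g == level]
        centroids[level] = [sum(axis) / len(members) for axis in zip(*members)]
    return [math.dist(p, centroids[g]) for p, g in zip(points, groups)]

def anova_f(values, groups):
    """Ordinary one-way ANOVA F statistic."""
    levels = sorted(set(groups))
    n = len(values)
    grand_mean = sum(values) / n
    ss_among = ss_within = 0.0
    for level in levels:
        subset = [v for v, g in zip(values, groups) if g == level]
        mean = sum(subset) / len(subset)
        ss_among += len(subset) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in subset)
    return (ss_among / (len(levels) - 1)) / (ss_within / (n - len(levels)))

def dispersion_test(points, groups, permutations=999, seed=0):
    """F statistic on distances-to-centroid with a permutation p-value."""
    rng = random.Random(seed)
    distances = distance_to_own_centroid(points, groups)
    observed = anova_f(distances, groups)
    shuffled = list(distances)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anova_f(shuffled, groups) >= observed - 1e-9:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical plots in two dimensions: a tight group and a dispersed group.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 4), (4, 0), (5, 5)]
groups = ["tight"] * 4 + ["loose"] * 4
f_disp, p_disp = dispersion_test(points, groups)
print(f_disp, p_disp)
```

A small p-value here indicates that the groups differ in how spread out their plots are, which is a separate question from whether their centroids differ.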

  34. SIMPER • Similarity percentages of component species or functional groups • Bray-Curtis dissimilarity matrix is implicit in a SIMPER analysis • Can force it to use a Euclidean distance matrix in PRIMER • I have not seen evidence for or against this practice…. • Use this to find out which variables are responsible for observed shifts in multivariate space Knapp et al. in prep
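
A rough sketch of the SIMPER calculation for two groups is shown below. It assumes the Bray-Curtis form: each species' contribution to a between-group plot pair is its absolute abundance difference divided by the summed abundances of both plots, averaged over all between-group pairs. The function name, species names and abundances are hypothetical.

```python
def simper(group_a, group_b, species):
    """Average contribution of each species to Bray-Curtis dissimilarity between two groups.

    Returns (species, mean contribution, fraction of overall dissimilarity), largest first.
    Assumes no pair of plots is empty in both members.
    """
    totals = [0.0] * len(species)
    n_pairs = 0
    for x in group_a:
        for y in group_b:
            denominator = sum(a + b for a, b in zip(x, y))
            for k, (a, b) in enumerate(zip(x, y)):
                totals[k] += abs(a - b) / denominator   # species k's share of this pair's dissimilarity
            n_pairs += 1
    mean_contribution = [t / n_pairs for t in totals]
    overall = sum(mean_contribution)                     # mean between-group Bray-Curtis dissimilarity
    table = [(name, c, c / overall) for name, c in zip(species, mean_contribution)]
    return sorted(table, key=lambda row: -row[1])

# Hypothetical abundances (rows = plots, columns = species) for two treatments.
species = ["Sp.A", "Sp.B", "Sp.C"]
control = [[10, 5, 1], [12, 4, 1]]
treated = [[1, 5, 10], [2, 6, 12]]
result = simper(control, treated, species)
for name, contribution, fraction in result:
    print(f"{name}: {contribution:.3f} ({fraction:.1%})")
```

Species whose abundances barely change between groups end up at the bottom of the table, while those driving the shift in multivariate space rise to the top.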

  35. References • Anderson, Marti J., and Daniel C. I. Walsh. "PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: What null hypothesis are you testing?" Ecological Monographs 83.4 (2013): 557-574. • Anderson, Marti J. "PERMDISP: a FORTRAN computer program for permutational analysis of multivariate dispersions (for any two-factor ANOVA design) using permutation tests." Department of Statistics, University of Auckland, New Zealand (2004). • Anderson, Marti J. "Permutational multivariate analysis of variance." Department of Statistics, University of Auckland, Auckland (2005). • Chase, Jonathan M. "Drought mediates the importance of stochastic community assembly." Proceedings of the National Academy of Sciences 104.44 (2007): 17430-17434. • Clarke, K. R. "Non-parametric multivariate analyses of changes in community structure." Australian Journal of Ecology 18.1 (1993): 117-143. • Gower, John C. "Some distance properties of latent root and vector methods used in multivariate analysis." Biometrika 53.3-4 (1966): 325-338. • Legendre, Pierre, and Miquel De Cáceres. "Beta diversity as the variance of community data: dissimilarity coefficients and partitioning." Ecology Letters 16.8 (2013): 951-963. • Legendre, Pierre, and Louis Legendre. Numerical ecology. Vol. 20. Elsevier, 2012. • Oksanen, Jari. "Multivariate analysis of ecological communities in R: vegan tutorial." R package version (2011): 2-0. • Shannon, Claude E., and Warren Weaver. "The mathematical theory of communication." (1949). • Simpson, Edward H. "Measurement of diversity." Nature (1949).

  36. Interactions between climate and plant community structure alter ecosystem sensitivity and thus ecosystem function

  37. Sensitivity = absolute change in productivity per unit change in precipitation • (1) Direct impacts of precipitation regimes are based on ecosystem sensitivity [Diagram nodes: Precipitation regimes; Ecosystem sensitivity; Ecosystem function and services. Source: IPCC 2007]

  38. • (1) Direct impacts of precipitation regimes are based on ecosystem sensitivity • (2) Precipitation regimes may alter ecosystem sensitivity through changes in soil moisture dynamics [Diagram nodes: Precipitation regimes; Soil moisture dynamics; Ecosystem sensitivity; Ecosystem function and services]

  39. • (1) Direct impacts of precipitation regimes are based on ecosystem sensitivity • (2) Precipitation regimes may alter ecosystem sensitivity through changes in soil moisture dynamics • (3) Individual species responses to long-term climate regime shifts are a potential mechanism that may structure communities [Diagram nodes: Climate regimes; Soil moisture dynamics; Ecosystem sensitivity; Species responses; Community composition; Ecosystem function and services]

  40. • (1) Direct impacts of precipitation regimes are based on ecosystem sensitivity • (2) Precipitation regimes may alter ecosystem sensitivity through changes in soil moisture dynamics • (3) Individual species responses to long-term climate regime shifts are a potential mechanism that may structure communities • (4) Community composition can directly affect ecosystem services through dominance or diversity effects, or indirectly by altering ecosystem sensitivity to precipitation regimes [Diagram nodes: Climate regimes; Soil moisture dynamics; Ecosystem sensitivity; Species responses; Community composition; Ecosystem function and services]

  41. Overarching question… • Do interactions between precipitation drivers, plant community structure, and ecosystem sensitivity alter effects of precipitation regimes on ecosystem function?

  42. Shifts in ecosystem function across space and time [Figures: Sala et al. 1988; Huxman et al. 2004]

  43. A. Changes in overall soil moisture cause a change in the intercept of the Productivity – Precipitation relationship B. Different drought sensitivities of component species within a community control slope and intercept of the relationship by altering ecosystem responses in dry years C. Growth limitations (e.g. growth rate maximums, co-limitation by other resources such as N) of component species in wet years determine slope and intercept. [Figure: ecosystem function vs. precipitation, with dry years and wet years marked and effects A, B and C labelled]

  44. Experimental designs • ANPP data from 2 long-term data sets and linked precipitation data • Irrigation transect – relieves water limitation throughout the growing season • 1991-2011 • Uplands vs Lowlands – Annually burned, ungrazed watershed. • 1984 – 2011 • Looked at slopes between growing season rainfall and ANPP to assess sensitivity in control and manipulated plots.
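
Since sensitivity is assessed as the slope of ANPP against growing season rainfall, a least-squares fit per treatment is the basic calculation. The sketch below uses made-up numbers purely for illustration (they are not data from the study), and the function name is an assumption.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical values: growing season precipitation (mm) and ANPP (g/m2).
precipitation = [300, 450, 600, 750]
anpp_control = [250, 320, 400, 470]
anpp_irrigated = [480, 500, 530, 545]

control_slope, control_intercept = fit_line(precipitation, anpp_control)
irrigated_slope, irrigated_intercept = fit_line(precipitation, anpp_irrigated)
print(f"control sensitivity:   {control_slope:.3f} g/m2 per mm")
print(f"irrigated sensitivity: {irrigated_slope:.3f} g/m2 per mm")
```

In this invented example the irrigated plots have a higher intercept and a shallower slope, which is the pattern one would read as reduced sensitivity to precipitation.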

  45. I) Reduced soil water capacity: Chronic reduction of soil water availability should cause increased sensitivity due to a reduction in the overall productivity of the system (A; i.e. lowered intercept), while the slope and intercept are altered by the resident plant community. The capacity for growth in wet years (C) should be similar due to unchanged growth potential and lack of limiting nutrients, but the negative response to drought should be increased (B) due to reduction of soil water stores to buffer against drought. II) Increased soil water availability: Increased soil water availability should decrease sensitivity by increasing overall productivity of the system (A; i.e. increased intercept), while limitations on cumulative growth rates of the extant plant community should reduce productivity response in wet years (C), thus reducing sensitivity of the system to precipitation inputs (i.e. slope). [Figure: ecosystem function vs. precipitation for each scenario, with effects A, B and C labelled]

  46. Ecosystem response to altered precip regimes / soil conditions [Figure: ecosystem function vs. precipitation; one difference marked significant (*)]

  47. Sensitivity shifts? [Figure: ANPP (g/m2) vs. growing season precipitation (mm); panels: Irrigation, Soil depth; significance labels shown: **, n.s., n.s.]

  48. How is community structure modifying these relationships? Smith 2009

  49. Uplands vs lowlands

  50. Uplands vs lowlands • Predictions based on abiotic forcings • Water limitation is not an important factor in wet years • Reduction in dry years because of limited soil water storage to buffer plants during periods of drought • Predictions when incorporating biotic forcings • Growth rate limitations of extant species limit production in wet years • Decreased drought sensitivity of extant species limits production loss in dry years
