Correspondence Analysis: Simple (CA) and Detrended (DCA) • Vamsi, Sundus, Shawnalee
What is Correspondence Analysis? • AKA Reciprocal Averaging (RA). • Basically: an ordination technique that involves repeatedly calculating weighted averages. • Historically popular mainly in France (due to Benzécri).
What is Detrended Correspondence Analysis? • Designed specifically to solve certain problems found when using CA on ecological data, based on an “empirical desire to reshape data closer to the models visualized by ecologists.” • Popular mainly in the ecological community.
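Weighted Means • A weighted mean arises when some of the values in the data are repeated, so each distinct value is weighted by how often it occurs. • Consider the two cases: • Arithmetic mean: every observation counts once. • Weighted mean: e.g. the value 1 is found 10 times, so it carries a weight of 10.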
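The two means can be written out as follows. This is a generic statement of the standard definitions, not a reconstruction of the original slide's numbers; the symbols $x_i$, $x_j$, and $w_j$ are notation assumed here, with $w_j$ the number of times the distinct value $x_j$ occurs.

```latex
% Arithmetic mean of n observations x_1, ..., x_n
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i

% Weighted mean of k distinct values x_1, ..., x_k with weights (counts) w_1, ..., w_k
\bar{x}_w = \frac{\sum_{j=1}^{k} w_j \, x_j}{\sum_{j=1}^{k} w_j}
```

When every weight equals 1, the weighted mean reduces to the arithmetic mean.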
Application of Weighted Means • Let’s say we had some hypothetical data as follows:
Application of Weighted Means • To find the average lifetime of the species, weight each year by the count observed in that year and compute the weighted mean, which gives the mean year.
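A minimal sketch of that calculation in Python. The counts below are hypothetical (the table from the slides is not reproduced here), and `weighted_mean` is a helper name introduced only for illustration.

```python
# Hypothetical counts of one species per year (NOT the data from the slides).
counts_by_year = {1: 10, 2: 6, 3: 3, 4: 1}


def weighted_mean(values_to_weights):
    """Return sum(value * weight) / sum(weight) for a {value: weight} mapping."""
    total_weight = sum(values_to_weights.values())
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in values_to_weights.items()) / total_weight


# Each year is weighted by the number of individuals counted in that year.
mean_year = weighted_mean(counts_by_year)
print(mean_year)
```

Here the years are the values and the counts are the weights, so years with many individuals pull the mean towards themselves.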
Applying the CA Algorithm to find the “mean species” in a 3-species case • In practice, most ecologists would be observing multiple species at the same time and hence have count data for these multi-species groups, such as the following: [Table: counts per year for species A, B, and C, each with its own mean year]
Step 1 • Start with an arbitrary weighting. It’s perfectly acceptable to run from 0.0 to 100.0 in whatever increments are needed. • In our case, we’ll use (0, 50, 100) for (A, B, C). • Use this formula for the nth species rank:
Step 2 • Use the starter weights (which are essentially arbitrary) to compute a weighting for each of the years: the weighted mean of the species weights, weighted by the counts in that year.
Step 3 • We can now calculate a new weighting for each species using these new year weightings. • Calculate for A, then similarly for B and C. [Table: old weightings for species vs. new calculated weightings for species]
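Steps 2 and 3 are the two "reciprocal" halves of the averaging and can be written compactly. The notation is assumed here rather than taken from the slides: $n_{ij}$ is the count of species $i$ in year $j$, $s_i$ is the current weighting of species $i$, and $y_j$ is the weighting of year $j$.

```latex
% Step 2: weighting of year j = weighted mean of species weights, weighted by counts
y_j = \frac{\sum_{i} n_{ij} \, s_i}{\sum_{i} n_{ij}}

% Step 3: new weighting of species i = weighted mean of year weights, weighted by counts
s_i^{\text{new}} = \frac{\sum_{j} n_{ij} \, y_j}{\sum_{j} n_{ij}}
```

Because each step is an average, the new species weightings always lie inside the range of the old ones, which is why a rescaling step is needed afterwards.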
Step 4 • These new weightings for each species aren’t that useful as they stand, so we need to rescale them back to the range 0 to 100, instead of the current 19.1 to 78.5. • To do this, simply use a straightforward rescaling method.
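One plausible rescaling, sketched below, is a linear min-max stretch. This is an assumption, since the slides do not spell out the method. Only the extremes 19.1 and 78.5 come from the text; the middle value and the assignment of values to species A, B, and C are made up for illustration.

```python
def rescale(weights, new_min=0.0, new_max=100.0):
    """Linearly stretch a {name: weight} mapping so it spans new_min..new_max."""
    lo = min(weights.values())
    hi = max(weights.values())
    if hi == lo:
        raise ValueError("cannot rescale identical weights")
    span = hi - lo
    return {
        name: new_min + (w - lo) / span * (new_max - new_min)
        for name, w in weights.items()
    }


# 19.1 and 78.5 are the extremes quoted in the text; 50.0 for B is hypothetical,
# as is the assumption that A holds the minimum and C the maximum.
new_weights = {"A": 19.1, "B": 50.0, "C": 78.5}
rescaled = rescale(new_weights)
print(rescaled)
```

The smallest weighting maps to 0, the largest to 100, and everything else keeps its relative position in between.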
Step 4 cont. • So, after computing the rescaled values, we find the following:
Step 5 • This completes one cycle of the CA. • “Weightings for each year are recalculated using the new, rescaled weightings for the species.” • Eventually a stable pattern will emerge. • This usually takes 10-20 iterations.
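Putting the five steps together, the whole cycle can be sketched in a few lines of Python. The count table is hypothetical (it is not the data from the slides), the function and variable names are introduced here for illustration, and the stopping rule (largest change below a tolerance) is an assumption standing in for "a stable pattern emerges".

```python
# Hypothetical counts: rows are years, columns are species A, B, C.
counts = {
    "year1": {"A": 10, "B": 2, "C": 0},
    "year2": {"A": 6, "B": 5, "C": 1},
    "year3": {"A": 2, "B": 6, "C": 4},
    "year4": {"A": 0, "B": 3, "C": 9},
}


def weighted_mean(pairs):
    """Weighted mean of an iterable of (value, weight) pairs."""
    pairs = list(pairs)
    total = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total


def rescale(scores):
    """Stretch scores linearly so that they span 0..100 (Step 4)."""
    lo, hi = min(scores.values()), max(scores.values())
    return {k: (v - lo) / (hi - lo) * 100.0 for k, v in scores.items()}


def reciprocal_averaging(counts, species_weights, max_iter=100, tol=1e-6):
    """Repeat Steps 2-4 until the species weights stop changing."""
    for iteration in range(1, max_iter + 1):
        # Step 2: each year's weight is the count-weighted mean of species weights.
        year_weights = {
            year: weighted_mean((species_weights[s], n) for s, n in row.items())
            for year, row in counts.items()
        }
        # Step 3: each species' weight is the count-weighted mean of year weights.
        new_weights = {
            s: weighted_mean((year_weights[y], counts[y][s]) for y in counts)
            for s in species_weights
        }
        # Step 4: rescale back to 0..100.
        new_weights = rescale(new_weights)
        # Step 5: stop once the largest change is negligible.
        change = max(abs(new_weights[s] - species_weights[s]) for s in new_weights)
        species_weights = new_weights
        if change < tol:
            break
    return species_weights, year_weights, iteration


# Step 1: arbitrary starting weights (0, 50, 100) for (A, B, C).
final_species, final_years, n_iter = reciprocal_averaging(
    counts, {"A": 0.0, "B": 50.0, "C": 100.0}
)
print(final_species, n_iter)
```

The returned year weights are those computed in the last cycle, from the species weights of the cycle before. At convergence the difference is below the tolerance.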
Correspondence Analysis • That was CA utilized in a simplistic example.
Detrended Correspondence Analysis • This technique is not purely mathematical. • It’s a series of rules used to reshape data to make it friendlier for analysis. • Once again, it is primarily used for ecological data, but can be extended to anything (the data simply can’t contain negative values). • The technique is used to overcome the arch effect (the horseshoe effect).
Arch Effect (Horseshoe Effect) • Found in data whenever “PCA or other distance conserving ordination techniques are applied to data which follow a continuous gradient, along which there is a progressive turnover of dominant variables.” • An example is ecological succession. • When the data are ordinated by a distance-conserving technique and the first two axes are plotted against each other, the points form an arch shape.
Steps of DCA • Two major stages: • Ordination by CA (as before). • Then remove the arch effect by brute force.
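The brute-force stage can be illustrated with a simplified sketch: cut the first axis into segments and, within each segment, subtract the mean second-axis score. This is only the basic idea under stated assumptions. DECORANA's actual procedure is more involved, and the arch-shaped scores, the function name, and the choice of five segments below are all hypothetical.

```python
def detrend_by_segments(axis1, axis2, n_segments=5):
    """Subtract from each axis-2 score the mean axis-2 score of its axis-1 segment."""
    lo, hi = min(axis1), max(axis1)
    width = (hi - lo) / n_segments
    if width == 0:
        raise ValueError("axis 1 scores must not all be identical")

    def segment_of(x):
        # The maximum score would fall just past the last segment, so clamp it.
        return min(int((x - lo) / width), n_segments - 1)

    sums = [0.0] * n_segments
    sizes = [0] * n_segments
    for x, y in zip(axis1, axis2):
        k = segment_of(x)
        sums[k] += y
        sizes[k] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, sizes)]
    return [y - means[segment_of(x)] for x, y in zip(axis1, axis2)]


# Hypothetical CA scores forming an arch: axis 2 is a parabola in axis 1.
axis1 = [float(x) for x in range(0, 101, 5)]
axis2 = [100.0 - (x - 50.0) ** 2 / 25.0 for x in axis1]

detrended = detrend_by_segments(axis1, axis2)
print(max(axis2) - min(axis2), max(detrended) - min(detrended))
```

After detrending, the second-axis scores average to zero within every segment, so the systematic arch is flattened. What remains is the within-segment variation.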
Notice • Detrending causes a loss of information, specifically in the second CA axis (the Y-axis in this case).
Software • According to Shaw, the standard software packages are all based on the same source code, DECORANA, and are accessed through some front end to it. • However, DCA can also be done in R, using the decorana function in the vegan package.
Basics in R • decorana(veg, iweigh=0, iresc=4, ira=0, mk=26, short=0, before=NULL, after=NULL) • veg = data matrix • iweigh = downweighting of rare species (0 = off). Both CA and DCA are extremely sensitive to rare species, so turning this on decreases the importance of rare species. • iresc = number of rescaling cycles. • ira = type of analysis (0 = detrended, 1 = simple CA, i.e. basic reciprocal averaging).
Shaw gives no further information to extend this, so we leave it for a later time.