Geology 659 - Quantitative Methods. Eigenvalues/Eigenvectors & Discriminant Analysis. tom.h.wilson, wilson@geo.wvu.edu. Department of Geology and Geography, West Virginia University, Morgantown, WV.
Eigenvalue and eigenvector problems
The eigenvalue/eigenvector problems of concern to us in statistical analysis are associated with matrices of correlation coefficients. Consider the 4 x 4 matrix on page 147. The matrix is symmetric. The diagonal elements, with value 1, represent the correlation of a sample with itself, while the remaining elements represent correlations of one sample with another.
The plots represent different states of correlation between two variables: high correlation and low correlation. The eigenvectors define the directions of maximum and minimum variance.
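To make the link between correlation and eigenvectors concrete, here is a minimal Python sketch. NumPy is an assumption (the course itself uses Minitab), and the two values of r are illustrative rather than taken from Davis. For two variables with correlation r, the correlation matrix is [[1, r], [r, 1]], and its eigenvalues are 1 - r and 1 + r.

```python
import numpy as np

def correlation_eigen(r):
    """Eigenvalues and eigenvectors of the 2 x 2 correlation matrix for correlation r."""
    corr = np.array([[1.0, r],
                     [r, 1.0]])
    # eigh is intended for symmetric matrices; it returns the eigenvalues in
    # ascending order and the matching unit eigenvectors as columns.
    values, vectors = np.linalg.eigh(corr)
    return values, vectors

for r in (0.9, 0.1):  # illustrative "high" and "low" correlation
    values, vectors = correlation_eigen(r)
    print(f"r = {r}: minimum variance {values[0]:.2f}, maximum variance {values[1]:.2f}")
    print("  direction of maximum variance:", np.round(vectors[:, 1], 3))
```

With high correlation the two eigenvalues differ strongly, which corresponds to an elongated cloud of points. With low correlation they are nearly equal, which corresponds to a roughly circular cloud.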
Discriminant Analysis
The example in the text illustrates grain size and sorting variations associated with two samples. One sample is taken from an offshore environment and the other from a beach environment. The plot suggests that neither grain size nor sorting uniquely differentiates the offshore sands from the beach sands.
The result of discriminant analysis is a linear combination of the sorting and grain size characteristics that helps differentiate between the clusters of offshore sands and beach sands appearing in the scatter plot.
As discussed by Davis, it is possible to derive a discriminant function along which the difference between the locations of the beach and offshore sand clusters, as defined by sorting and grain size, is a maximum. The discriminant function yields a “score” for each observation, that is, for each sorting/grain size pair. Here i is the observation and j (1 or 2) is the variable (grain size or sorting). The score is calculated as a linear combination of the two variables for each observation.
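Written out, a linear discriminant score of the kind described here takes the following general form. The symbols R_i, \lambda_j and x_{ij} are assumed notation for illustration and are not necessarily the ones used by Davis.

```latex
R_i = \sum_{j=1}^{2} \lambda_j \, x_{ij} = \lambda_1 x_{i1} + \lambda_2 x_{i2}
```

Here x_{ij} is the value of variable j for observation i, and \lambda_j is the discriminant coefficient (weight) applied to variable j.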
The cutoff score, R0, is the discriminant score calculated at the midpoint between the two groups, that is, using the average of the group averages for sorting and grain size.
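The calculation of the weights and the cutoff score can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it uses one common formulation of the two-group linear discriminant (weights obtained from the pooled within-group covariance matrix and the difference of the group means), NumPy stands in for Minitab, the function name is hypothetical, and the data are made up for illustration and are not Davis's values.

```python
import numpy as np

def linear_discriminant(group_a, group_b):
    """Return (weights, cutoff) for a two-group linear discriminant function.

    group_a, group_b: arrays of shape (n_observations, n_variables).
    """
    mean_a = group_a.mean(axis=0)
    mean_b = group_b.mean(axis=0)
    n_a, n_b = len(group_a), len(group_b)
    # Pooled within-group covariance matrix.
    pooled = ((n_a - 1) * np.cov(group_a, rowvar=False)
              + (n_b - 1) * np.cov(group_b, rowvar=False)) / (n_a + n_b - 2)
    # Weights: solve pooled @ weights = (mean_a - mean_b).
    weights = np.linalg.solve(pooled, mean_a - mean_b)
    # Cutoff score: the score of the average of the two group averages.
    cutoff = weights @ ((mean_a + mean_b) / 2.0)
    return weights, cutoff

# Made-up (grain size, sorting) pairs for illustration only; NOT Davis's data.
beach = np.array([[0.33, 1.08], [0.34, 1.10], [0.32, 1.12],
                  [0.35, 1.09], [0.33, 1.13]])
offshore = np.array([[0.34, 1.20], [0.35, 1.18], [0.33, 1.22],
                     [0.36, 1.21], [0.34, 1.17]])

weights, cutoff = linear_discriminant(beach, offshore)
beach_scores = beach @ weights        # one score per beach observation
offshore_scores = offshore @ weights  # one score per offshore observation
print("weights:", weights)
print("cutoff score:", cutoff)
print("mean beach score:", beach_scores.mean())
print("mean offshore score:", offshore_scores.mean())
```

Because the score is linear in the variables, the cutoff lies exactly halfway between the mean scores of the two groups.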
The cutoff score, R0, helps differentiate the two groups and may suggest that some observations in the offshore sands, for example, are actually beach sands. As you can see below, the discriminant scores suggest that three observations classified as offshore sands have characteristics similar to those of beach sands. There are also three observations classified as beach sands that have characteristics more like those of the offshore sands.
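The comparison against the cutoff can be sketched in a few lines of Python. The function name, the scores, the labels and the cutoff value are all made up for illustration and are not the values from Davis. Which group falls on the high side of the cutoff depends on the sign convention of the discriminant function, so it is passed in explicitly.

```python
def flag_misfits(scores, labels, cutoff, high_group, low_group):
    """Return indices of observations whose score lies on the other group's side of the cutoff."""
    misfits = []
    for index, (score, label) in enumerate(zip(scores, labels)):
        # Assign each observation to a group purely from its score.
        assigned = high_group if score > cutoff else low_group
        if assigned != label:
            misfits.append(index)
    return misfits

# Made-up scores and original classifications for illustration only.
scores = [2.1, 1.8, 0.9, 1.6, 0.7, 0.4, 1.3, 0.5]
labels = ["beach", "beach", "beach", "beach",
          "offshore", "offshore", "offshore", "offshore"]
cutoff = 1.0

print(flag_misfits(scores, labels, cutoff, high_group="beach", low_group="offshore"))
# prints [2, 6]
```

In this made-up example, observation 2 was classified as beach but scores like an offshore sand, and observation 6 was classified as offshore but scores like a beach sand.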
Discriminant analysis allows you to maximize the difference between clusters in the multidimensional space defined by the measured variables. It provides a one-dimensional measure of cluster separation along the discriminant score axis.
The data provided by Davis consist of three columns: 1) group (A (beach) or B (offshore)), 2) median grain size, 3) sorting coefficient. Using the original classifications, the two groups plot as shown below.
In today’s lab we’ll show you how to use the statistical analysis package Minitab and work through examples in multiple linear regression and discriminant analysis using data sets from Davis (2002). Note that Minitab has abundant help files that will provide answers to many of your questions.
Help files for Multiple Regression and Discriminant Analysis are included in today’s handout. The simple exercises begun in today’s class should be handed in on Thursday.