D/RS 1013 Discriminant Analysis
Discriminant Analysis Overview • multivariate extension of the one-way ANOVA • looks at differences between 2 or more groups • goal is to discriminate between groups • considers several predictor variables simultaneously
Discriminant - Overview • provides a way to describe differences between groups in simple terms • removes the redundancy among large numbers of variables by combining them into a smaller number of discriminant functions • can classify cases into groups when their group membership is unknown
Overview (cont.) • tests the significance of differences between two or more groups • examines several predictor variables simultaneously • constructs a linear combination of these variables, forming a single composite variable called a discriminant function • basically MANOVA flipped upside down
Discriminant parallels with MANOVA and Regression • discriminant analysis works in the opposite direction from MANOVA: it predicts group membership from a set of scores • like a regression equation, the discriminant function is a weighted sum of the predictors: • D_i = d_1 z_1 + d_2 z_2 + ... + d_p z_p
Discriminant functions • D_i = d_1 z_1 + d_2 z_2 + ... + d_p z_p • where D_i = scores on the discriminant function • d_1 ... d_p = discriminant function weighting coefficients for each of the p predictor variables • z_1 ... z_p = standardized scores on the original p variables
Unstandardized Functions • even more like a regression equation • D_i = a + d_1 x_1 + d_2 x_2 + ... + d_p x_p • a = the discriminant function constant • d_1 ... d_p = discriminant function weighting coefficients for each of the p predictor variables • x_1 ... x_p = raw scores on the original p variables
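The two forms of the function can be sketched in a few lines of Python. This is only an illustration: the weights, the constant, and the case scores below are made up, and the helper names `standardize` and `discriminant_score` are hypothetical.

```python
from statistics import mean, stdev

def standardize(values):
    """Convert raw scores on one predictor to z-scores (sample mean and SD)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def discriminant_score(weights, scores, constant=0.0):
    """D = a + d_1*x_1 + ... + d_p*x_p; use constant=0 for the standardized form."""
    if len(weights) != len(scores):
        raise ValueError("need exactly one weight per predictor")
    return constant + sum(d * x for d, x in zip(weights, scores))

# Standardized form: hypothetical weights applied to one case's z-scores.
d_std = [0.6, -0.3, 0.8]
z_case = [1.0, 0.5, -0.25]
D_std = discriminant_score(d_std, z_case)

# Unstandardized form: hypothetical weights and constant applied to raw scores.
d_raw = [0.5, -0.25, 2.0]
x_case = [12.0, 8.0, 3.0]
D_raw = discriminant_score(d_raw, x_case, constant=-4.0)
```

The only difference between the two calls is the constant and whether the inputs are z-scores or raw scores, which is why the unstandardized form looks so much like a regression equation.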
Forming discriminant functions • the discriminant function is formed to maximize the F value associated with D • F = between-groups variance on D / within-groups variance on D • this provides the function with the greatest discriminating power
Functions beyond the first • the first function is one of many possible linear combinations of the p original predictor variables • the number of useful functions is p (number of original variables) or k - 1 (k = number of groups being considered), whichever is smaller • each later function maximizes the remaining separation between groups and is orthogonal to the preceding functions
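To make the idea of maximizing F concrete, here is a rough sketch with two predictors and two groups. The data are invented, and the brute-force search over weight directions is a stand-in chosen for readability; statistical packages normally obtain the weights by solving an eigenvalue problem rather than by searching.

```python
import math

def f_ratio(groups):
    """One-way ANOVA F: between-groups mean square / within-groups mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def scores_on_d(weights, cases):
    """Composite score D for every case under a candidate set of weights."""
    return [sum(d * x for d, x in zip(weights, case)) for case in cases]

# Invented data: (x1, x2) for four cases in each of two groups.
group_a = [(2, 3), (3, 5), (4, 4), (5, 6)]
group_b = [(4, 2), (5, 3), (6, 2), (7, 4)]

def f_for(weights):
    return f_ratio([scores_on_d(weights, group_a), scores_on_d(weights, group_b)])

# F does not depend on the length of the weight vector, so search directions only.
candidates = [(math.cos(math.radians(t)), math.sin(math.radians(t)))
              for t in range(180)]
best_f, best_weights = max((f_for(w), w) for w in candidates)

f_x1_only = f_for((1, 0))  # F using the first predictor alone
f_x2_only = f_for((0, 1))  # F using the second predictor alone
```

With these numbers the best composite separates the groups far better than either predictor does alone, which is the sense in which the function has the greatest discriminating power.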
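The rule for the number of useful functions is simple enough to write down directly. The function name below is hypothetical.

```python
def n_discriminant_functions(p, k):
    """Number of useful functions: the smaller of p predictors and k - 1 groups."""
    if p < 1 or k < 2:
        raise ValueError("need at least one predictor and two groups")
    return min(p, k - 1)

# Four predictors and three groups give two functions;
# with only two groups there is always a single function.
three_groups = n_discriminant_functions(p=4, k=3)
two_groups = n_discriminant_functions(p=5, k=2)
```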
First discriminant function (3 groups): separates group 1 from groups 2 & 3
Second function (3 groups): separates group 3 from groups 1 & 2
Both functions together: orthogonal = uncorrelated
Confusion Matrix • assign cases to groups based on their discriminant function scores • assignments compared with actual group memberships • confusion matrix gives both overall accuracy of classification and the relative frequencies of various types of misclassification
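As a small illustration of the bookkeeping, the sketch below tabulates actual against assigned group and reads overall accuracy off the diagonal. The labels are invented and the helper names are hypothetical.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows = actual group, columns = assigned group, in the order of `labels`."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

def overall_accuracy(matrix):
    """Proportion of cases on the diagonal (assigned to their actual group)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Invented example: eight cases, two groups.
actual = list("AAAABBBB")
assigned = list("AAABBBBA")
matrix = confusion_matrix(actual, assigned, labels=["A", "B"])
accuracy = overall_accuracy(matrix)
```

The off-diagonal cells show the types of misclassification: here one A case was assigned to B and one B case was assigned to A.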
Confusion matrix: example • our proportion correct is (43 + 39)/100 = .82 • by chance alone we would expect .50 correct if we divided our group assignments evenly between A & B, since half of each group would be classified correctly by chance • can take prior probabilities into account, if known
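The arithmetic on this slide can be checked as follows. The treatment of priors is an assumption: it uses the proportional chance criterion (the sum of squared prior probabilities), which is one common choice but not necessarily the one intended here, and the unequal priors are invented.

```python
# Counts from the example: 43 and 39 correct classifications out of 100 cases.
correct_a, correct_b, n_cases = 43, 39, 100
proportion_correct = (correct_a + correct_b) / n_cases

def proportional_chance(priors):
    """Expected proportion correct by chance when cases are assigned
    to groups at random in proportion to the prior probabilities."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("prior probabilities must sum to 1")
    return sum(p ** 2 for p in priors)

chance_equal = proportional_chance([0.5, 0.5])    # equal priors
chance_unequal = proportional_chance([0.7, 0.3])  # hypothetical unequal priors
```

With equal priors the chance rate is .50, matching the slide; with unequal priors the chance baseline rises, so the same .82 would be a somewhat less impressive improvement.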
Cross Validation • hold back some of the data to test the model that emerges • gives a good idea of the predictive accuracy we can expect in another sample • results based on small samples and several variables are unlikely to replicate across samples
Classification Functions • weights and constants are used to calculate scores for each case • each case gets as many scores as there are groups • assign each case to the group for which it has the highest classification function score
Assumptions • assumes that the predictors follow a multivariate normal distribution • the test is robust with respect to normality: in practice, lack of normality doesn't make much of a difference, especially with large n and a moderate number of predictors
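A holdout check might look like the sketch below. Everything in it is assumed for illustration: the D scores are invented, the split is stratified so both groups appear in the training data, and the "model" is a deliberately simple stand-in (a cutoff halfway between the two training-group means on D) rather than a full discriminant analysis.

```python
import random
from statistics import mean

def stratified_holdout(cases, holdout_fraction=0.3, seed=0):
    """Hold back a fraction of each group; return (training, holdout)."""
    rng = random.Random(seed)
    training, holdout = [], []
    for group in sorted({g for _, g in cases}):
        members = [c for c in cases if c[1] == group]
        rng.shuffle(members)
        n_hold = max(1, round(len(members) * holdout_fraction))
        holdout += members[:n_hold]
        training += members[n_hold:]
    return training, holdout

def fit_cutoff(training):
    """Stand-in model: cutoff midway between the group means on D."""
    mean_a = mean(d for d, g in training if g == "A")
    mean_b = mean(d for d, g in training if g == "B")
    return (mean_a + mean_b) / 2, ("A" if mean_a < mean_b else "B")

def classify(d, cutoff, low_group):
    high_group = "B" if low_group == "A" else "A"
    return low_group if d < cutoff else high_group

# Invented (D score, actual group) pairs: ten cases per group.
cases = [(-2.5 + 0.3 * i, "A") for i in range(10)] + \
        [(-0.4 + 0.3 * i, "B") for i in range(10)]

training, holdout = stratified_holdout(cases)
cutoff, low_group = fit_cutoff(training)           # fitted on training data only
hits = sum(classify(d, cutoff, low_group) == g for d, g in holdout)
holdout_accuracy = hits / len(holdout)             # accuracy on unseen cases
```

The key design point is that the cutoff is estimated without ever seeing the holdout cases, so `holdout_accuracy` approximates what we could expect in a new sample.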
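The assignment rule can be sketched as below. The constants and weights are hypothetical placeholders for three groups and two predictors; in practice they would come from the analysis output.

```python
# Hypothetical classification function coefficients, one function per group.
classification_functions = {
    "A": {"constant": -10.0, "weights": [1.5, 0.5]},
    "B": {"constant": -14.0, "weights": [1.0, 1.5]},
    "C": {"constant": -20.0, "weights": [2.5, 1.0]},
}

def classification_scores(case, functions):
    """One score per group: constant + sum of weight * raw score."""
    return {
        group: f["constant"] + sum(w * x for w, x in zip(f["weights"], case))
        for group, f in functions.items()
    }

def assign_group(case, functions):
    """Assign the case to the group with the highest classification score."""
    scores = classification_scores(case, functions)
    return max(scores, key=scores.get)

# Three invented cases, each with raw scores on the two predictors.
group_1 = assign_group([4.0, 2.0], classification_functions)
group_2 = assign_group([3.0, 6.0], classification_functions)
group_3 = assign_group([10.0, 3.0], classification_functions)
```

Each case receives three scores, one per group, and only their ranking matters for the assignment.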