
Multivariate Analysis of Variance

SPSS Multivariate Analysis

Wondwossen

Presentation Transcript


  1. Chapter 14: Multivariate Analysis of Variance (MANOVA) (Field, 2005)

  2. What can you do with a MANOVA? Until now we have measured only a single dependent variable, which is why ANOVA is called 'univariate'. ANOVA can have one or more independent variables. A MANOVA is an ANOVA for several dependent variables, which is why it is called 'multivariate'. Like ANOVA, MANOVA can have one or more independent variables.

  3. Why MANOVA? If you conduct separate ANOVAs, any relationship between the dependent variables is ignored, although they may well be correlated. MANOVA can tell us whether groups differ along a combination of dimensions.

  4. Advantage of having multiple dependent variables. Example: distinguishing the three groups 'drivers', 'drunk drivers', and 'no drivers' by a single dependent variable, as in ANOVA (number of pedestrians killed), or by multiple dependent variables, as in MANOVA (number of pedestrians killed, number of lampposts hit, number of cars they crash into). MANOVA is more powerful in distinguishing these groups since it has information on a variety of dependent variables.

  5. How many dependent variables? Do not add every dependent variable you can think of; include only reasonable ones that are theoretically and empirically motivated. If you want to explore some novel dependent variables, you might run separate analyses for the theoretically motivated ones and for the exploratory ones.

  6. Controversies: MANOVA is a 2-stage test. 1. Overall test. There are 4 possible ways of assessing the overall effect of MANOVA: - Pillai-Bartlett trace (V) - Hotelling's T² - Wilks's lambda (Λ) - Roy's largest root. 2. Separate tests for the various group differences. There are two main ways of following up on the group differences: - univariate ANOVAs - discriminant analysis.

  7. The power of MANOVA. MANOVA can have greater power than ANOVA to detect differences between groups. However, the relationship is complex: the power of MANOVA depends on a combination of the correlation between the dependent variables and the effect size. For large effects, MANOVA has greater power if the variables are different (even negatively correlated) and if the group differences are in the same direction for those variables. For 2 variables, one of which has a large effect and one a small effect, power will increase if the variables are highly correlated.

  8. The example throughout the chapter. We want to assess the effects of cognitive behaviour therapy (CBT) on obsessive compulsive disorder (OCD). CBT will be compared with behaviour therapy (BT) and with no treatment (NT) as a control condition. Since OCD manifests itself both behaviourally (obsessive actions) and cognitively (obsessive thoughts), both will be measured. Note that the two dependent variables are theoretically motivated!

  9. The data from OCD.sav

  10. The theory of MANOVA. To understand what is going on in a MANOVA we have to understand (a little bit about) matrices. A matrix is a collection of numbers arranged in columns and rows. The values within a matrix are called 'components' or 'elements'. Each row holds the data from 1 subject; each column holds the data from 1 variable. Examples:
      2x3 matrix: $\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$
      5x4 matrix: $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 1 & 3 & 5 \\ 6 & 7 & 2 & 8 \\ 0 & 5 & 2 & 8 \end{pmatrix}$

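  11. More on matrices. A square matrix is one with an equal number of rows and columns, e.g.:
      $\begin{pmatrix} 5 & 3 & 5 & 7 & 8 \\ 4 & 2 & 1 & 0 & 5 \\ 1 & 3 & 9 & 7 & 4 \\ 1 & 3 & 5 & 8 & 0 \\ 9 & 6 & 3 & 7 & 2 \end{pmatrix}$
      The numbers on the diagonal (5, 2, 9, 8, 2; red on the original slide) are 'diagonal components', the others are 'off-diagonal components'. An identity matrix is a square matrix in which the diagonal components are 1 and the off-diagonal components are 0:
      $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$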

  12. More on matrices. A matrix with data from only 1 person is called a 'row vector'. It can be thought of as a single person's scores on five different variables: $(5 \;\; 3 \;\; 5 \;\; 7 \;\; 8)$. A matrix with only one column is called a 'column vector'. It can be thought of as five participants' scores on a single variable: $(8 \;\; 5 \;\; 7 \;\; 8 \;\; 2)^{T}$.
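
A minimal Python sketch of the matrix vocabulary used so far. It assumes that the 25 numbers of the square-matrix example are read row by row, and it stores the matrix as a plain list of rows; the variable names are illustrative only.

```python
# The 5x5 square matrix from the slides, stored as a list of rows
# (assumption: the 25 numbers are read row by row).
square = [
    [5, 3, 5, 7, 8],
    [4, 2, 1, 0, 5],
    [1, 3, 9, 7, 4],
    [1, 3, 5, 8, 0],
    [9, 6, 3, 7, 2],
]

# A row vector: one subject's scores on all five variables.
row_vector = square[0]

# A column vector: all five subjects' scores on the first variable.
column_vector = [row[0] for row in square]

# Diagonal components: the row index equals the column index.
diagonal = [square[i][i] for i in range(len(square))]

# Identity matrix of the same size: 1 on the diagonal, 0 elsewhere.
size = len(square)
identity = [[1 if i == j else 0 for j in range(size)] for i in range(size)]

print(row_vector, column_vector, diagonal)
```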

  13. Important matrices and their functions. MANOVA uses matrices that contain information about the variance accounted for by the independent variables, for each dependent variable. For each variance portion (hypothesis or model, error, and total) there is a sum of squares and cross-products matrix. These matrices are used like the simple sums of squares (SSM, SSR, SST) to derive the test statistics.

  14. What are 'cross-products'? In the 'sum of squares and cross-products matrix', a cross-product is the value for the total combined error between two variables. Cross-products represent the total correlation between two variables in an unstandardized way. It is through these cross-products that MANOVA accounts for any correlation between the dependent variables.

  15. Calculating MANOVA by hand (using the OCD data). First approach: two univariate ANOVAs. Univariate ANOVA for DV1 (actions): for the first dependent variable, 'number of compulsive actions', we have to determine 3 variance portions: the total, model, and residual sums of squares, SST(Action), SSM(Action), and SSR(Action).

  16. Calculating SST, SSM, and SSR for Action. Using the grand variance of 'Actions' and the total number of scores N = 30: $SS_T = s^2_{grand}(N-1) = 2.1195 \times (30-1) = 61.47$

  17. Calculating SST, SSM, and SSR for Action. SSM: take the difference between each group mean and the grand mean, square it, multiply it by the number of subjects in the group, and sum over the groups: $SS_M = 10(4.9-4.53)^2 + 10(3.7-4.53)^2 + 10(5-4.53)^2 = 10(0.37)^2 + 10(-0.83)^2 + 10(0.47)^2 = 1.37 + 6.89 + 2.21 = 10.47$

  18. Calculating SST, SSM, and SSR for Action. SSR: take the variance within each group, multiply it by the number of scores minus 1, and add the results up: $SS_R = s^2_{CBT}(n_{CBT}-1) + s^2_{BT}(n_{BT}-1) + s^2_{NT}(n_{NT}-1) = 1.433(10-1) + 3.122(10-1) + 1.111(10-1) = (1.433 \times 9) + (3.122 \times 9) + (1.111 \times 9) = 12.9 + 28.1 + 10.0 = 51.00$. Check: $SS_T - SS_M = SS_R$, i.e. $61.47 - 10.47 = 51.00$.
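
The three sums of squares for 'Actions' can be re-computed from the summary statistics quoted on the slides. The following is a sketch only: the variable names are illustrative, equal group sizes of 10 are assumed, and small discrepancies from the slide values come from the rounded means and variances.

```python
# Summary statistics for 'Actions' as quoted on the slides.
n_per_group = 10
group_means = {"CBT": 4.9, "BT": 3.7, "NT": 5.0}
group_vars = {"CBT": 1.433, "BT": 3.122, "NT": 1.111}
grand_mean = 4.53
grand_var = 2.1195
n_total = n_per_group * len(group_means)

# SST: grand variance times (N - 1).
sst = grand_var * (n_total - 1)

# SSM: n times the squared deviation of each group mean from the grand mean.
ssm = sum(n_per_group * (m - grand_mean) ** 2 for m in group_means.values())

# SSR: each within-group variance times (n - 1).
ssr = sum(v * (n_per_group - 1) for v in group_vars.values())

print(f"SST={sst:.2f}  SSM={ssm:.2f}  SSR={ssr:.2f}")
```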

  19. Calculating Mean Sum of Squares: MST, MSM, and MSR for Action. We divide SSM and SSR by their degrees of freedom to obtain the mean sums of squares: df(SSM) = k-1 = 3-1 = 2, so MSM = 10.47/2 = 5.235; df(SSR) = k(n-1) = 3(10-1) = 3 x 9 = 27, so MSR = 51.00/27 = 1.889. Last, we divide MSM by MSR to obtain the F-ratio: $F = MS_M / MS_R = 5.235 / 1.889 = 2.771$. The F-value has to be compared against the critical value Fcrit(2, 27) = 3.355. Since the F-value of our model is smaller than the critical value, the effect is not significant: we cannot conclude that the 3 therapies differ in their effect on compulsive actions.
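
As a quick check of the arithmetic on this slide, a short Python sketch that starts from the rounded sums of squares. The critical value 3.355 is taken from the slide rather than computed, and the variable names are illustrative.

```python
# Sums of squares for 'Actions' (rounded values from the slides).
ssm, ssr = 10.47, 51.00
k, n = 3, 10  # number of groups, subjects per group

df_model = k - 1        # 2
df_resid = k * (n - 1)  # 27

msm = ssm / df_model
msr = ssr / df_resid
f_ratio = msm / msr

f_crit = 3.355  # critical F(2, 27) as quoted on the slide
significant = f_ratio > f_crit

print(f"MSM={msm:.3f}  MSR={msr:.3f}  F={f_ratio:.3f}  significant={significant}")
```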

  20. Univariate ANOVA for DV2 (thoughts): for the second dependent variable, 'number of compulsive thoughts', we also have to determine the 3 variance portions: the total, model, and residual sums of squares, SST(Thought), SSM(Thought), and SSR(Thought).

  21. Calculating SST, SSM, and SSR for Thought. Using the grand variance of 'Thoughts' and the total number of scores N = 30: $SS_T = s^2_{grand}(N-1) = 4.8780 \times (30-1) = 141.47$

  22. Calculating SST, SSM, and SSR for Thought. SSM: take the difference between each group mean and the grand mean, square it, multiply it by the number of subjects in the group, and sum over the groups: $SS_M = 10(13.40-14.53)^2 + 10(15.2-14.53)^2 + 10(15-14.53)^2 = 10(-1.13)^2 + 10(0.67)^2 + 10(0.47)^2 = 12.77 + 4.49 + 2.21 = 19.47$

  23. Calculating SST, SSM, and SSR for Thought. SSR: take the variance within each group, multiply it by the number of scores minus 1, and add the results up: $SS_R = s^2_{CBT}(n_{CBT}-1) + s^2_{BT}(n_{BT}-1) + s^2_{NT}(n_{NT}-1) = 3.6(10-1) + 4.4(10-1) + 5.56(10-1) = (3.6 \times 9) + (4.4 \times 9) + (5.56 \times 9) = 32.4 + 39.6 + 50.0 = 122.00$. Check: $SS_T - SS_M = SS_R$, i.e. $141.47 - 19.47 = 122.00$.

  24. Calculating Mean Sum of Squares: MST, MSM, and MSR for Thought. We divide SSM and SSR by their degrees of freedom (2 and 27, as before) to obtain the mean sums of squares: MSM = 19.47/2 = 9.735 and MSR = 122.00/27 = 4.519. Last, we divide MSM by MSR to obtain the F-ratio: $F = MS_M / MS_R = 9.735 / 4.519 = 2.154$. The F-value has to be compared against the critical value Fcrit(2, 27) = 3.355. Since the F-value of our model is smaller than the critical value, the effect is not significant: we cannot conclude that the 3 therapies differ in their effect on compulsive thoughts.
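
The whole univariate calculation for 'Thoughts' can be wrapped in one small function. This is a sketch under the assumption of equal group sizes; the function name `one_way_anova_from_summary` is made up for illustration and does not come from the slides or from any library.

```python
def one_way_anova_from_summary(group_means, group_vars, grand_mean, n_per_group):
    """Return (SSM, SSR, F) for a one-way ANOVA with equal group sizes."""
    k = len(group_means)
    # Model sum of squares: group means around the grand mean.
    ssm = sum(n_per_group * (m - grand_mean) ** 2 for m in group_means)
    # Residual sum of squares: within-group variances times (n - 1).
    ssr = sum(v * (n_per_group - 1) for v in group_vars)
    df_model = k - 1
    df_resid = k * (n_per_group - 1)
    f_ratio = (ssm / df_model) / (ssr / df_resid)
    return ssm, ssr, f_ratio


# Summary statistics for 'Thoughts' (CBT, BT, NT) as quoted on the slides.
ssm, ssr, f_ratio = one_way_anova_from_summary(
    group_means=[13.4, 15.2, 15.0],
    group_vars=[3.6, 4.4, 5.56],
    grand_mean=14.53,
    n_per_group=10,
)
print(f"SSM={ssm:.2f}  SSR={ssr:.2f}  F={f_ratio:.3f}")
```

Because the inputs are rounded, the result agrees with the slide values only to about two decimal places.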

  25. The relationships between the 2 DVs. MANOVA takes the relationship between the DVs into consideration by calculating their cross-products. More specifically, there are 3 cross-products, which relate to the 3 sums of squares that we calculated in the univariate ANOVAs: - total cross-product CPT - model cross-product CPM - residual cross-product CPR.

  26. Calculating the total cross-product, CPT. The total cross-product between our 2 DVs 'action' and 'thought' is calculated by the following equation: $CP_T = \sum_{i} \left(x_{i(Actions)} - \bar{X}_{grand(Actions)}\right)\left(x_{i(Thoughts)} - \bar{X}_{grand(Thoughts)}\right)$. For all subjects, we add up the products of the deviations of the individual action and thought scores from their respective grand means.

  27. Total cross-product CPT. The total cross-product tells us how the two dependent variables, DV1 and DV2, are related overall. The total cross-product is CPT = 5.47.

  28. Calculating the model cross-product, CPM. The model cross-product tells us how the relationship between our 2 DVs 'action' and 'thought' is affected by our experimental manipulation: $CP_M = \sum_{groups} n \left(\bar{x}_{group(Actions)} - \bar{X}_{grand(Actions)}\right)\left(\bar{x}_{group(Thoughts)} - \bar{X}_{grand(Thoughts)}\right)$. For all 3 experimental groups, we add up the products of the deviations of the group means for action and thought from their respective grand means, each multiplied by the group size n.

  29. Model cross-product CPM. The model cross-product is CPM = -7.53.
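
The value of CPM can be reproduced from the group means and grand means quoted in the two univariate ANOVAs. A short Python sketch, with illustrative variable names and an assumed group size of 10:

```python
n = 10  # subjects per group

# (mean Actions, mean Thoughts) for each group, from the earlier slides.
group_means = {
    "CBT": (4.9, 13.4),
    "BT": (3.7, 15.2),
    "NT": (5.0, 15.0),
}
grand_actions, grand_thoughts = 4.53, 14.53

# For each group: n * (deviation on Actions) * (deviation on Thoughts).
cp_model = sum(
    n * (a - grand_actions) * (t - grand_thoughts)
    for a, t in group_means.values()
)
print(f"CPM = {cp_model:.2f}")
```

The negative sign arises because the CBT and BT groups deviate from the grand means in opposite directions on the two variables.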

  30. Calculating the residual cross-product, CPR. The residual cross-product tells us how the relationship between our 2 DVs 'action' and 'thought' is affected by individual differences (errors) in the model: $CP_R = \sum_{i} \left(x_{i(Actions)} - \bar{x}_{group(Actions)}\right)\left(x_{i(Thoughts)} - \bar{x}_{group(Thoughts)}\right)$. For all subjects, we add up the products of the deviations of the individual action and thought scores from their respective group means. An easier way to calculate CPR is to subtract CPM from CPT: $CP_R = CP_T - CP_M = 5.47 - (-7.53) = 13$.

  31. Residual cross-product CPR. The CPR is similar to the CPT, except that the group means rather than the grand means are subtracted from the individual scores. The residual cross-product is CPR = 13.
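
The shortcut CPR = CPT - CPM works because the total cross-product splits exactly into a model part and a residual part. The sketch below illustrates this on a small made-up data set (the raw OCD scores are not reproduced in this transcript, so the numbers here are hypothetical and will not give 5.47, -7.53, or 13).

```python
def mean(values):
    return sum(values) / len(values)


# Hypothetical toy data: (actions, thoughts) pairs for three groups.
groups = {
    "g1": [(1, 4), (3, 6), (2, 5)],
    "g2": [(4, 3), (6, 2), (5, 4)],
    "g3": [(2, 7), (3, 9), (4, 8)],
}

all_scores = [pair for scores in groups.values() for pair in scores]
grand_a = mean([a for a, _ in all_scores])
grand_t = mean([t for _, t in all_scores])

# Total cross-product: deviations from the grand means.
cp_total = sum((a - grand_a) * (t - grand_t) for a, t in all_scores)

cp_model = 0.0
cp_resid = 0.0
for scores in groups.values():
    mean_a = mean([a for a, _ in scores])
    mean_t = mean([t for _, t in scores])
    # Model part: group means around the grand means, weighted by group size.
    cp_model += len(scores) * (mean_a - grand_a) * (mean_t - grand_t)
    # Residual part: individual scores around their own group means.
    cp_resid += sum((a - mean_a) * (t - mean_t) for a, t in scores)

print(cp_total, cp_model, cp_resid)
```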

  32. The sum of squares and cross-products (SSCP) matrices. We shall now represent the total, residual, and model sums of squares and their respective cross-products in matrices. These combined matrices are called sum of squares and cross-products (SSCP) matrices. There are 3 of them: T: total SSCP; E: error SSCP; H: model SSCP ('H' for 'hypothesis'). Since we have 2 DVs, all SSCP matrices (T, E, and H) will be 2x2 matrices.
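
A sketch of how the three SSCP matrices can be assembled from the values calculated so far. It assumes the usual layout, with the sums of squares for Actions and Thoughts on the diagonal and the cross-product in both off-diagonal cells; the matrices are stored as nested lists.

```python
# Sums of squares and cross-products from the previous slides.
sst_a, ssm_a, ssr_a = 61.47, 10.47, 51.00     # Actions
sst_t, ssm_t, ssr_t = 141.47, 19.47, 122.00   # Thoughts
cp_t, cp_m, cp_r = 5.47, -7.53, 13.00         # cross-products

# Assumed layout: [[SS(Actions), CP], [CP, SS(Thoughts)]].
T = [[sst_a, cp_t], [cp_t, sst_t]]   # total SSCP
H = [[ssm_a, cp_m], [cp_m, ssm_t]]   # model (hypothesis) SSCP
E = [[ssr_a, cp_r], [cp_r, ssr_t]]   # error (residual) SSCP

# Just as SST = SSM + SSR, the matrices satisfy T = H + E element by element.
H_plus_E = [[H[i][j] + E[i][j] for j in range(2)] for i in range(2)]
print(H_plus_E)
```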

  33. Principle of the MANOVA test statistic. In univariate ANOVA, we divide MSM by MSR in order to obtain the F-value. In multivariate ANOVA, we would then have to divide H by E. Problem: H and E are matrices, and matrices cannot be readily divided. Solution: the equivalent of division for matrices is multiplication by the inverse, hence H is multiplied by the inverse of E, called $E^{-1}$. The product is $HE^{-1}$.
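
For 2x2 matrices the product of H with the inverse of E can be worked out with a few lines of plain Python. This sketch assumes H and E are laid out with the sums of squares on the diagonal and the cross-products off the diagonal; the helper names `inverse_2x2` and `matmul_2x2` are illustrative, and in practice a linear algebra library would normally be used instead.

```python
# Model and error SSCP matrices built from the slide values
# (assumed layout: [[SS(Actions), CP], [CP, SS(Thoughts)]]).
H = [[10.47, -7.53], [-7.53, 19.47]]
E = [[51.00, 13.00], [13.00, 122.00]]


def inverse_2x2(m):
    """Invert a 2x2 matrix using the closed-form adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular and cannot be inverted")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(x, y):
    """Multiply two 2x2 matrices."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


HE_inv = matmul_2x2(H, inverse_2x2(E))
for row in HE_inv:
    print([round(value, 4) for value in row])
```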

  34. Discriminant function variates. Representing the DVs by underlying dimensions is like working backwards in a regression, namely deriving the underlying independent variables from a set of dependent variables. These linear combinations of the DVs are called variates, latent variables, or factors. Knowing these linear variates, we can predict which group (here, therapy group) a person belongs to. Since the variates are used to discriminate between groups of people, they are called discriminant function variates.

  35. Discriminant function variates. How do we find the discriminant function variates? By maximization: the first discriminant function (V1) is the linear combination of dependent variables that maximizes the differences between groups. Hence the ratio of systematic to unsystematic variance (SSM/SSR) will be maximized for V1. For subsequent variates (V2, etc.), this ratio will be smaller. Practically, we obtain the maximum possible value of the F-ratio when we look at V1.

  36. Discriminant function variates. The variate V1 can be described by a linear regression equation, where the 2 DVs are the predictors and V1 is the predicted value: $Y = b_0 + b_1 X_1 + b_2 X_2$, $V_1 = b_0 + b_1 DV_1 + b_2 DV_2$, $V_1 = b_0 + b_1 \, Actions + b_2 \, Thoughts$. In linear regression, the b-values are the weights of the predictors. In discriminant function analysis they are obtained from the eigenvectors of the $HE^{-1}$ matrix. $b_0$ can be ignored since we are only interested in the discriminant function and not in the constant. Note that by looking for the underlying factors of the dependent variables, they become predictors (like independent variables). This is because we want to find out what dimension(s) underlie them.

  37. Discriminant function variates. How many variates are there? The number of variates is the smaller of p and (k-1), where p is the number of DVs and k is the number of levels of the independent variable. In our case both yield 2 (p = 2 and k-1 = 3-1 = 2), so we will find 2 variates. The b-values of the 2 variates are derived from the eigenvectors of the matrix $HE^{-1}$. There will be 2 eigenvectors, each with its own eigenvalue, one for each variate.

  38. Discriminant function variates. An eigenvector is a vector of a matrix which is unchanged by transformations of that matrix to a diagonal matrix, i.e., one with only diagonal elements. By changing $HE^{-1}$ into a diagonal matrix we reduce the number of elements we have to consider for testing significance while preserving the ratio of systematic to unsystematic variance. We won't calculate the eigenvectors ourselves but just adopt them from the book (Field, 2005, p. 589): eigenvector 1 = $(0.603, \; -0.335)^{T}$ for variate 1, and eigenvector 2 = $(0.425, \; 0.339)^{T}$ for variate 2. These are the b's for variates 1 and 2 in the regression equation.

  39. The regression equations for the 2 variates. $V_1 = b_0 + b_1 \, Actions + b_2 \, Thoughts$ ($b_0$ can be omitted since it plays no role in the discriminant function). Variate $V_1 = 0.603 \, Actions - 0.335 \, Thoughts$. Variate $V_2 = 0.425 \, Actions + 0.339 \, Thoughts$.

  40. The discriminant function. The equations can be used to calculate a score for each subject on each variate. Example: subject 1 in the CBT group had 5 obsessive actions and 14 obsessive thoughts. His scores for variates 1 and 2 are: $V_1 = (0.603 \times 5) - (0.335 \times 14) = -1.675$ and $V_2 = (0.425 \times 5) + (0.339 \times 14) = 6.871$.
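
The same calculation as a short Python sketch. The weights are the b-values quoted from Field (2005); the dictionary layout and variable names are illustrative only.

```python
# b-values (weights for Actions, Thoughts) of the two variates, from the slides.
weights = {
    "V1": (0.603, -0.335),
    "V2": (0.425, 0.339),
}

# Subject 1 in the CBT group.
actions, thoughts = 5, 14

# Each variate score is a weighted sum of the two dependent variables.
scores = {
    name: b_actions * actions + b_thoughts * thoughts
    for name, (b_actions, b_thoughts) in weights.items()
}
print({name: round(value, 3) for name, value in scores.items()})
```

Note that the weight for Thoughts in V1 is negative, which is why the first score comes out below zero.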

  41. Eigenvalues as F-ratios. The eigenvalues of $HE^{-1}$ (0.335 and 0.073) are the conceptual analogue of the F-ratio in univariate ANOVA. Once we have them, they have to be compared against the value that would result by chance alone. There are 4 ways in which this can be done: - Pillai-Bartlett trace (V) - Hotelling's T² - Wilks's lambda (Λ) - Roy's largest root. All those values are variations on a common theme, the ratio of explained to unexplained variance, and correspond more or less to the F-ratio (SSM/SSR) in univariate ANOVA.
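
The two eigenvalues can be reproduced from the rounded sums of squares and cross-products. The sketch below assumes the SSCP matrices have the sums of squares on the diagonal and the cross-products off the diagonal, and it uses the closed-form solution of the characteristic equation of a 2x2 matrix; a linear algebra library would be the usual choice for larger problems.

```python
import math

# Model and error SSCP matrices (assumed layout, values from the slides).
H = [[10.47, -7.53], [-7.53, 19.47]]
E = [[51.00, 13.00], [13.00, 122.00]]

# Inverse of E via the 2x2 adjugate formula.
det_E = E[0][0] * E[1][1] - E[0][1] * E[1][0]
E_inv = [
    [E[1][1] / det_E, -E[0][1] / det_E],
    [-E[1][0] / det_E, E[0][0] / det_E],
]

# M = H * E^-1
M = [
    [sum(H[i][k] * E_inv[k][j] for k in range(2)) for j in range(2)]
    for i in range(2)
]

# Eigenvalues of a 2x2 matrix: roots of x^2 - trace * x + det = 0.
trace = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
root = math.sqrt(trace ** 2 - 4 * det)
eig1 = (trace + root) / 2
eig2 = (trace - root) / 2

print(f"eigenvalue 1 = {eig1:.3f}, eigenvalue 2 = {eig2:.3f}")
```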

  42. Pillai-(Bartlett) trace (V). $V = \sum_{i=1}^{s} \frac{\lambda_i}{1+\lambda_i} = \frac{0.335}{1+0.335} + \frac{0.073}{1+0.073} = 0.319$, where $\lambda_i$ is the eigenvalue of the i-th discriminant variate and s is the number of variates (here, 0.335 and 0.073 are the eigenvalues of the two variates). Pillai's trace is thus the sum of the proportions of explained variance on the discriminant functions. It directly corresponds to SSM/SST.
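
A minimal Python check of Pillai's trace, using the two rounded eigenvalues from the slides (the variable names are illustrative):

```python
# Eigenvalues of the two discriminant variates, as quoted on the slides.
eigenvalues = [0.335, 0.073]

# Each term is the proportion of explained variance on one variate.
proportions = [lam / (1 + lam) for lam in eigenvalues]
pillai_trace = sum(proportions)

print(f"V = {pillai_trace:.3f}")
```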

  43. Hotelling's T². Hotelling's T² is simply the sum of the eigenvalues for each variate: $T = \sum_{i=1}^{s} \lambda_i = 0.335 + 0.073 = 0.408$. Here, we sum SSM/SSR for each of the variates, so the statistic compares directly to the F-ratio in ANOVA.
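
A short Python sketch of this sum. As a cross-check it also uses the fact that the sum of a matrix's eigenvalues equals its trace, computing the trace of the product of H and the inverse of E from the rounded slide values; the SSCP layout (sums of squares on the diagonal, cross-products off the diagonal) is an assumption.

```python
# Eigenvalues of the two discriminant variates, as quoted on the slides.
eigenvalues = [0.335, 0.073]
hotelling_trace = sum(eigenvalues)

# Cross-check: the sum of the eigenvalues equals the trace of H * E^-1.
H = [[10.47, -7.53], [-7.53, 19.47]]
E = [[51.00, 13.00], [13.00, 122.00]]
det_E = E[0][0] * E[1][1] - E[0][1] * E[1][0]

# Only the two diagonal cells of H * E^-1 are needed for the trace.
m00 = (H[0][0] * E[1][1] - H[0][1] * E[1][0]) / det_E
m11 = (-H[1][0] * E[0][1] + H[1][1] * E[0][0]) / det_E
trace_HE_inv = m00 + m11

print(f"T = {hotelling_trace:.3f}, trace = {trace_HE_inv:.3f}")
```

The two numbers differ slightly in the third decimal place because the eigenvalues on the slide are rounded.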
