The Multiple Correlation Coefficient



  1. The Multiple Correlation Coefficient

  2. Definition. Suppose the vector (y, x1, …, xp)′ has a (p + 1)-variate Normal distribution with mean vector μ and covariance matrix Σ. We are interested in whether the variable y is independent of the vector x = (x1, …, xp)′. The multiple correlation coefficient is the maximum correlation between y and a linear combination a′x of the components of x.
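
In standard partitioned notation (a reconstruction; the slide's original symbols may differ), the setup and the definition can be written as:

```latex
% Partitioned (p+1)-variate Normal setup and the multiple correlation coefficient
\begin{pmatrix} y \\ \mathbf{x} \end{pmatrix}
\sim N_{p+1}\!\left(
  \begin{pmatrix} \mu_y \\ \boldsymbol{\mu}_x \end{pmatrix},\;
  \begin{pmatrix}
    \sigma_{yy} & \boldsymbol{\sigma}_{xy}' \\
    \boldsymbol{\sigma}_{xy} & \Sigma_{xx}
  \end{pmatrix}\right),
\qquad
\rho_{y\cdot x} = \max_{\mathbf{a}} \operatorname{corr}\bigl(y,\ \mathbf{a}'\mathbf{x}\bigr).
```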

  3. Derivation. For any coefficient vector a, the pair (y, a′x)′ has a bivariate Normal distribution with mean vector (μy, a′μx)′ and covariance matrix with entries σyy, a′σxy and a′Σxx a. We are again asking whether the variable y is independent of x. The multiple correlation coefficient is the maximum, over a, of the correlation between y and the linear combination a′x.

  4. The multiple correlation coefficient is the maximum correlation between y and a′x. The correlation between y and a′x is a′σxy / √(σyy · a′Σxx a). Thus we want to choose a to maximize this correlation, or equivalently to maximize its square.
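
Written out (again in reconstructed notation), the quantity being maximized is:

```latex
% Correlation between y and a'x, and the equivalent squared objective
\operatorname{corr}\bigl(y,\ \mathbf{a}'\mathbf{x}\bigr)
  = \frac{\mathbf{a}'\boldsymbol{\sigma}_{xy}}
         {\sqrt{\sigma_{yy}\,\mathbf{a}'\Sigma_{xx}\mathbf{a}}}
\qquad\Longleftrightarrow\qquad
\text{maximize}\;
  \frac{(\mathbf{a}'\boldsymbol{\sigma}_{xy})^{2}}
       {\sigma_{yy}\,\mathbf{a}'\Sigma_{xx}\mathbf{a}}.
```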

  5. Note: the maximum is attained at a = k Σxx⁻¹σxy for any constant k ≠ 0, giving the squared multiple correlation σxy′Σxx⁻¹σxy / σyy.

  6. The multiple correlation coefficient is independent of the value of k.
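
A small numerical sketch of this invariance (the covariance matrix and values below are made up for illustration): the correlation between y and (k·a)′x is the same for every k > 0, and at a = Σxx⁻¹σxy it equals the multiple correlation.

```python
import numpy as np

# Hypothetical covariance matrix of (y, x1, x2); values are illustrative only.
Sigma = np.array([[4.0, 1.2, 0.8],
                  [1.2, 2.0, 0.5],
                  [0.8, 0.5, 1.5]])
s_yy = Sigma[0, 0]        # Var(y)
s_xy = Sigma[1:, 0]       # Cov(x, y)
S_xx = Sigma[1:, 1:]      # Var(x)

a_opt = np.linalg.solve(S_xx, s_xy)                          # optimal a, up to the constant k
rho = np.sqrt(s_xy @ np.linalg.solve(S_xx, s_xy) / s_yy)     # multiple correlation

def corr_y_ax(a):
    """Correlation between y and a'x implied by Sigma."""
    return (a @ s_xy) / np.sqrt(s_yy * (a @ S_xx @ a))

# Rescaling a by any k > 0 leaves the correlation unchanged.
for k in (0.5, 1.0, 10.0):
    assert np.isclose(corr_y_ax(k * a_opt), rho)
print("multiple correlation:", rho)
```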

  7. The sample multiple correlation coefficient. We are interested in whether the variable y is independent of the vector x. Replacing σyy, σxy and Σxx by their sample counterparts syy, sxy and Sxx, the sample multiple correlation coefficient is R = √(sxy′ Sxx⁻¹ sxy / syy).
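
A sketch of the sample version in code (the data below are randomly generated placeholders, not the slides' data):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))                              # placeholder predictors x1..xp
y = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=n)  # placeholder response

S = np.cov(np.column_stack([y, X]), rowvar=False)        # sample covariance of (y, x)
s_yy, s_xy, S_xx = S[0, 0], S[1:, 0], S[1:, 1:]

R2 = s_xy @ np.linalg.solve(S_xx, s_xy) / s_yy           # squared sample multiple correlation
R = np.sqrt(R2)
print("sample multiple correlation R =", R)
```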

  8. Testing for independence between y and x. The test statistic is F = (R²/p) / ((1 − R²)/(n − p − 1)). If independence is true, the test statistic F has an F-distribution with ν1 = p degrees of freedom in the numerator and ν2 = n − p − 1 degrees of freedom in the denominator. The test rejects independence if F > Fα(p, n − p − 1).
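
A sketch of the test in code, using ν2 = n − p − 1 as above and scipy for the critical value (the function name and the example numbers are illustrative, not the slide's):

```python
from scipy import stats

def multiple_correlation_F_test(R2, n, p, alpha=0.05):
    """F test of independence between y and the p-vector x, based on R^2."""
    F = (R2 / p) / ((1.0 - R2) / (n - p - 1))
    crit = stats.f.ppf(1.0 - alpha, p, n - p - 1)   # reject independence if F > crit
    p_value = stats.f.sf(F, p, n - p - 1)
    return F, crit, p_value

print(multiple_correlation_F_test(R2=0.45, n=50, p=3))
```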

  9. Other tests for independence. Test for zero correlation (independence between two variables). The test statistic is t = r√(n − 2) / √(1 − r²). If independence is true, t has a t-distribution with ν = n − 2 degrees of freedom. The test rejects independence if |t| > tα/2(n − 2).
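
In code (a sketch; the values of r and n are illustrative):

```python
import numpy as np
from scipy import stats

def zero_correlation_t_test(r, n, alpha=0.05):
    """t test of H0: rho = 0 based on the sample correlation r."""
    t = r * np.sqrt(n - 2) / np.sqrt(1.0 - r**2)
    crit = stats.t.ppf(1.0 - alpha / 2, n - 2)      # reject if |t| > t_{alpha/2}(n - 2)
    return t, crit, abs(t) > crit

print(zero_correlation_t_test(r=0.35, n=40))
```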

  10. Test for non-zero correlation (H0: ρ = ρ0). The test statistic is z = √(n − 3) [½ ln((1 + r)/(1 − r)) − ½ ln((1 + ρ0)/(1 − ρ0))], based on Fisher's z-transformation. If H0 is true, the test statistic z has approximately a standard Normal distribution. We then reject H0 if |z| > zα/2.
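
A sketch using Fisher's z-transformation with the usual √(n − 3) scaling (an assumption consistent with the approximately Normal statistic described above):

```python
import numpy as np
from scipy import stats

def fisher_z_test(r, rho0, n, alpha=0.05):
    """Approximate z test of H0: rho = rho0 via Fisher's z-transformation."""
    z = (np.arctanh(r) - np.arctanh(rho0)) * np.sqrt(n - 3)   # arctanh(r) = 0.5*ln((1+r)/(1-r))
    crit = stats.norm.ppf(1.0 - alpha / 2)                    # reject if |z| > z_{alpha/2}
    return z, crit, abs(z) > crit

print(fisher_z_test(r=0.35, rho0=0.5, n=40))
```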

  11. Test for zero partial correlation (conditional independence between two variables given a set of p independent variables). Let r_ij·x denote the partial correlation between yi and yj given x1, …, xp. The test statistic is t = r_ij·x √(n − p − 2) / √(1 − r²_ij·x). If conditional independence is true, t has a t-distribution with ν = n − p − 2 degrees of freedom. The test rejects independence if |t| > tα/2(n − p − 2).
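
A sketch of computing the partial correlation from the inverse of the full correlation matrix and applying the t test (the 4×4 correlation matrix below is hypothetical):

```python
import numpy as np
from scipy import stats

def partial_correlation(R, i, j):
    """Partial correlation of variables i and j given all remaining variables,
    read off the inverse of the full correlation (or covariance) matrix."""
    P = np.linalg.inv(R)
    return -P[i, j] / np.sqrt(P[i, i] * P[j, j])

def zero_partial_correlation_t_test(r_part, n, p, alpha=0.05):
    """t test of zero partial correlation given p conditioning variables."""
    t = r_part * np.sqrt(n - p - 2) / np.sqrt(1.0 - r_part**2)
    crit = stats.t.ppf(1.0 - alpha / 2, n - p - 2)
    return t, crit, abs(t) > crit

# Hypothetical correlation matrix of (y_i, y_j, x_1, x_2)
R = np.array([[1.0, 0.5, 0.4, 0.3],
              [0.5, 1.0, 0.2, 0.6],
              [0.4, 0.2, 1.0, 0.1],
              [0.3, 0.6, 0.1, 1.0]])
r_part = partial_correlation(R, 0, 1)     # correlation of y_i, y_j given x_1, x_2
print(zero_partial_correlation_t_test(r_part, n=40, p=2))
```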

  12. Test for non-zero partial correlation (H0: ρ_ij·x = ρ0). The test statistic is z = √(n − p − 3) [½ ln((1 + r_ij·x)/(1 − r_ij·x)) − ½ ln((1 + ρ0)/(1 − ρ0))]. If H0 is true, the test statistic z has approximately a standard Normal distribution. We then reject H0 if |z| > zα/2.
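
A companion sketch for the non-zero case, assuming the Fisher transform with √(n − p − 3) scaling when p variables are conditioned on:

```python
import numpy as np
from scipy import stats

def partial_fisher_z_test(r_part, rho0, n, p, alpha=0.05):
    """Approximate z test of H0: partial correlation = rho0 (p conditioning variables)."""
    z = (np.arctanh(r_part) - np.arctanh(rho0)) * np.sqrt(n - p - 3)
    crit = stats.norm.ppf(1.0 - alpha / 2)    # reject if |z| > z_{alpha/2}
    return z, crit, abs(z) > crit

print(partial_fisher_z_test(r_part=0.30, rho0=0.1, n=40, p=2))
```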

  13. Canonical Correlation Analysis

  14. The problem. Quite often one has collected data on several variables. The variables are grouped into two (or more) sets, and the researcher is interested in whether one set of variables is independent of the other set. In addition, if the two sets of variates are found to be dependent, it is then important to describe and understand the nature of this dependence. The appropriate statistical procedure in this case is called Canonical Correlation Analysis.

  15. Canonical correlation: an example. In the following study the researcher was interested in whether specific instructions on how to relax when taking tests and how to increase motivation would affect performance on three standardized achievement tests: Reading, Language and Mathematics.

  16. A group of 65 third- and fourth-grade students were rated, after the instruction and immediately prior to taking the Scholastic Achievement tests, on how relaxed they were (X1) and how motivated they were (X2). In addition, data were collected on the three achievement tests: Reading (Y1), Language (Y2) and Mathematics (Y3). The data were tabulated on the next slide.

  17. Definition (canonical variates and canonical correlations). Suppose x = (x1, …, xp)′ and y = (y1, …, yq)′ have a joint multivariate Normal distribution with covariance blocks Σxx, Σxy and Σyy. Let U1 = a1′x and V1 = b1′y be chosen so that U1 and V1 achieve the maximum correlation φ1. Then U1 and V1 are called the first pair of canonical variates, and φ1 is called the first canonical correlation coefficient.

  18. Derivation (1st pair of canonical variates and canonical correlation). For U1 = a′x and V1 = b′y, the vector (U1, V1)′ has covariance matrix with diagonal entries a′Σxx a and b′Σyy b and off-diagonal entry a′Σxy b.

  19. Derivation (continued). Hence the correlation between U1 and V1 is a′Σxy b / √(a′Σxx a · b′Σyy b).

  20. Thus we want to choose a and b so that this correlation is at a maximum, or equivalently so that its square, (a′Σxy b)² / (a′Σxx a · b′Σyy b), is at a maximum. Let this squared correlation be denoted φ²(a, b).

  21. Computing the derivatives of φ²(a, b) with respect to a and b, and setting them equal to zero, gives a pair of equations linking a and b.

  22. Thus Σxx⁻¹Σxy Σyy⁻¹Σyx a = k a. This shows that a is an eigenvector of Σxx⁻¹Σxy Σyy⁻¹Σyx; k is the largest eigenvalue of this matrix, and a is the eigenvector associated with the largest eigenvalue.

  23. Also Σyy⁻¹Σyx Σxx⁻¹Σxy b = k b, so b is the eigenvector of Σyy⁻¹Σyx Σxx⁻¹Σxy associated with the same largest eigenvalue k, and b is proportional to Σyy⁻¹Σyx a.

  24. Summary: the first pair of canonical variates, U1 = a1′x and V1 = b1′y, is found by finding a1 and b1, the eigenvectors of the matrices Σxx⁻¹Σxy Σyy⁻¹Σyx and Σyy⁻¹Σyx Σxx⁻¹Σxy associated with the largest eigenvalue (the same for both matrices). The largest eigenvalue of the two matrices is the square of the first canonical correlation coefficient φ1.
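
A sketch of this summary in code (the covariance blocks S_xx, S_xy, S_yy would come from the population matrix or its sample estimate; the function name is illustrative):

```python
import numpy as np
from scipy.linalg import eig

def first_canonical_pair(S_xx, S_xy, S_yy):
    """First pair of canonical coefficient vectors and the first canonical correlation,
    via the eigenvalue problem described above."""
    M = np.linalg.solve(S_xx, S_xy) @ np.linalg.solve(S_yy, S_xy.T)  # Sxx^-1 Sxy Syy^-1 Syx
    vals, vecs = eig(M)
    k = np.argmax(vals.real)
    a1 = vecs[:, k].real                       # eigenvector for the largest eigenvalue
    b1 = np.linalg.solve(S_yy, S_xy.T @ a1)    # corresponding coefficients for y (up to scale)
    phi1 = np.sqrt(vals.real[k])               # first canonical correlation
    return a1, b1, phi1
```

In practice a1 and b1 are then rescaled so that U1 and V1 have unit variance; the correlation itself is unaffected by that rescaling.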

  25. The remaining canonical variates and canonical correlation coefficients. The second pair of canonical variates, U2 = a2′x and V2 = b2′y, is found by choosing a2 and b2 so that: 1. (U2, V2) are independent of (U1, V1); 2. the correlation between U2 and V2 is maximized. The correlation, φ2, between U2 and V2 is called the second canonical correlation coefficient.

  26. The ith pair of canonical variates, Ui = ai′x and Vi = bi′y, is found by choosing ai and bi so that: 1. (Ui, Vi) are independent of (U1, V1), …, (Ui−1, Vi−1); 2. the correlation between Ui and Vi is maximized. The correlation, φi, between Ui and Vi is called the ith canonical correlation coefficient.

  27. Derivation (2nd pair of canonical variates and canonical correlation). For U2 = a′x and V2 = b′y, the vector (U1, V1, U2, V2)′ has a covariance matrix whose entries are again quadratic forms in Σxx, Σxy and Σyy.

  28. Maximizing the correlation between U2 and V2 is then equivalent to maximizing a′Σxy b subject to unit-variance constraints on U2 and V2 and zero covariance with (U1, V1). We use the Lagrange multiplier technique.

  29. Differentiating the Lagrangian with respect to a and b, and applying the restrictions, gives the stationarity equations for the second pair.

  30. These equations can be used to show that a2 and b2 are eigenvectors of the matrices Σxx⁻¹Σxy Σyy⁻¹Σyx and Σyy⁻¹Σyx Σxx⁻¹Σxy associated with the 2nd largest eigenvalue (the same for both matrices). The 2nd largest eigenvalue of the two matrices is the square of the 2nd canonical correlation coefficient φ2.

  31. Continuing, the coefficients ai and bi for the ith pair of canonical variates are eigenvectors of the matrices Σxx⁻¹Σxy Σyy⁻¹Σyx and Σyy⁻¹Σyx Σxx⁻¹Σxy associated with the ith largest eigenvalue (the same for both matrices). The ith largest eigenvalue of the two matrices is the square of the ith canonical correlation coefficient φi.
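
Continuing the sketch above, all pairs can be obtained at once; an equivalent route (a choice of ours, not the slide's) is the singular value decomposition of the whitened cross-covariance Sxx^(−1/2) Sxy Syy^(−1/2):

```python
import numpy as np
from scipy.linalg import sqrtm, svd

def canonical_correlations(S_xx, S_xy, S_yy):
    """All canonical correlations and coefficient vectors from covariance blocks,
    via SVD of Sxx^{-1/2} Sxy Syy^{-1/2} (equivalent to the eigenvalue problem above)."""
    Sxx_inv_half = np.linalg.inv(sqrtm(S_xx).real)
    Syy_inv_half = np.linalg.inv(sqrtm(S_yy).real)
    U, phis, Vt = svd(Sxx_inv_half @ S_xy @ Syy_inv_half)
    A = Sxx_inv_half @ U        # columns a_1, a_2, ... (x coefficients)
    B = Syy_inv_half @ Vt.T     # columns b_1, b_2, ... (y coefficients)
    return A, B, phis           # phis: canonical correlations, largest first
```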

  32. Example. Variables: relaxation score (X1), motivation score (X2); Reading (Y1), Language (Y2) and Mathematics (Y3).

  33. Summary Statistics

  34. Canonical correlation statistics

  35. continued

  36. Summary: U1 = 0.197 Relax + 0.979 Mot, V1 = 0.504 Read + 0.900 Lang + 0.565 Math, φ1 = .592; U2 = 0.980 Relax + 0.203 Mot, V2 = 0.391 Math − 0.361 Read − 0.354 Lang, φ2 = .159.
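
For reference, a present-day way to produce this kind of summary from raw scores is scikit-learn's CCA, shown here on random placeholder data of the same shape, since the actual 65-student data set is not part of this transcript; scikit-learn also normalizes the weights differently, so coefficients match the classical ones only up to scale.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X = rng.normal(size=(65, 2))   # placeholder columns: Relax, Mot
Y = rng.normal(size=(65, 3))   # placeholder columns: Read, Lang, Math

cca = CCA(n_components=2).fit(X, Y)
U, V = cca.transform(X, Y)     # canonical variate scores U_i, V_i
phis = [abs(np.corrcoef(U[:, i], V[:, i])[0, 1]) for i in range(2)]
print("canonical correlations:", phis)
print("x weights:\n", cca.x_weights_)
print("y weights:\n", cca.y_weights_)
```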
