Stats 346.3 Stats 848.3 Multivariate Data Analysis
Instructor: W.H. Laverty
Office: 235 McLean Hall
Phone: 966-6096
Lectures: M W F 12:30 pm - 1:20 pm, Arts 104
Evaluation: Assignments, Term tests - 40%; Final Examination - 60%
Each test and the Final Exam are open book. Students are allowed to bring in notes, texts, formula sheets, and calculators (including laptop computers).
Dates for midterm tests:
• Friday, February 08
• Friday, March 22
Text: Multivariate Statistical Methods, Donald Morrison.
Not required - I will give a list of other useful texts that will be in the library.
Bibliography
• Cooley, W.W. and Lohnes, P.R. (1962), Multivariate Procedures for the Behavioural Sciences, Wiley, New York.
• Fienberg, S. (1980), Analysis of Cross-Classified Data, MIT Press, Cambridge, Mass.
• Fingleton, B. (1984), Models for Category Counts, Cambridge University Press.
• Johnson, R.A. and Wichern, D.W., Applied Multivariate Statistical Analysis, Prentice Hall.
• Morrison, D.F. (1976), Multivariate Statistical Methods, McGraw-Hill, New York.
• Seal, H. (1968), Multivariate Statistical Analysis for Biologists, Methuen, London.
• Agresti, A. (1990), Categorical Data Analysis, Wiley, New York.
The lectures will be given in PowerPoint. They are posted on the Stats 346 web page.
Multivariate Data
• We have collected data for each case in the sample or population on not just one variable but on several variables: X1, X2, …, Xp.
• This is the typical situation; very rarely do you collect data on a single variable.
• The variables may be
• Discrete (Categorical)
• Continuous (Numerical)
• The variables may be
• Dependent (Response variables)
• Independent (Predictor variables)
A chart illustrating statistical procedures, organized by the type of the dependent and independent variables:

Dependent variables: Categorical
• Categorical independent variables: Multiway Frequency Analysis (Log-Linear Model)
• Continuous independent variables: Discriminant Analysis
• Continuous & categorical independent variables: Discriminant Analysis

Dependent variables: Continuous
• Categorical independent variables: ANOVA (single dep. var.), MANOVA (multiple dep. vars.)
• Continuous independent variables: Multiple Regression (single dep. var.), Multivariate Multiple Regression (multiple dep. vars.)
• Continuous & categorical independent variables: ANACOVA (single dep. var.), MANACOVA (multiple dep. vars.)

Dependent variables: Continuous & Categorical
• Categorical, Continuous, or Continuous & Categorical independent variables: ??
Multivariate Techniques
Multivariate techniques can be classified as follows:
• Techniques that are direct analogues of univariate procedures.
• These are univariate techniques that are then generalized to the multivariate situation.
• e.g. the two independent sample t test, generalized to Hotelling's T² test (see the sketch below)
• ANOVA (Analysis of Variance), generalized to MANOVA (Multivariate Analysis of Variance)
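As a concrete illustration of the first kind of generalization, here is a minimal sketch of the two-sample Hotelling's T² statistic in Python, assuming NumPy and SciPy are available. The function name and the simulated data are illustrative only and are not part of the course notes.

```python
import numpy as np
from scipy import stats

def hotelling_two_sample(X1, X2):
    """Two-sample Hotelling T-squared test; rows are cases, columns are variables.

    Hypothetical helper for illustration, not part of the course material.
    """
    n1, p = X1.shape
    n2 = X2.shape[0]
    d = X1.mean(axis=0) - X2.mean(axis=0)                 # difference of mean vectors
    S_pooled = ((n1 - 1) * np.cov(X1, rowvar=False) +
                (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    T2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(S_pooled, d)
    # Convert T^2 to an F statistic with (p, n1 + n2 - p - 1) degrees of freedom
    F = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * T2
    p_value = stats.f.sf(F, p, n1 + n2 - p - 1)
    return T2, F, p_value

rng = np.random.default_rng(0)
X1 = rng.normal(size=(20, 3))              # group 1: 20 cases, 3 variables
X2 = rng.normal(loc=0.5, size=(25, 3))     # group 2: mean shifted by 0.5
print(hotelling_two_sample(X1, X2))
```

When p = 1 the statistic reduces to the square of the usual two-sample t statistic, which is the sense in which Hotelling's T² generalizes the t test.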
Techniques that are purely multivariate procedures.
• Correlation, Partial Correlation, Multiple Correlation, Canonical Correlation
• Principal Component Analysis, Factor Analysis
• These are techniques for studying the complicated correlation structure amongst a collection of variables (a principal component sketch is given below).
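To make this group more concrete, here is a minimal sketch of principal component analysis via the eigendecomposition of the sample covariance matrix, using NumPy only; the function name and simulated data are illustrative assumptions, not material from the notes.

```python
import numpy as np

def principal_components(X):
    """Principal component analysis via the eigendecomposition of the covariance matrix."""
    Xc = X - X.mean(axis=0)                  # centre each variable
    S = np.cov(Xc, rowvar=False)             # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)     # eigh because S is symmetric
    order = np.argsort(eigvals)[::-1]        # order components by variance explained
    return eigvals[order], eigvecs[:, order], Xc @ eigvecs[:, order]

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 4))   # correlated variables
variances, loadings, scores = principal_components(X)
print(variances / variances.sum())           # proportion of variance per component
```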
Techniques for which a univariate procedure could exist, but which become much more interesting in the multivariate setting.
• Cluster Analysis and Classification
• Here we try to identify subpopulations from the data.
• Discriminant Analysis
• In Discriminant Analysis, we attempt to use a collection of variables to identify the unknown population to which a case belongs.
An Example: A survey was given to 132 students
• Male = 35
• Female = 97
They rated, on a Likert scale (1 to 5), their agreement with each of 40 statements. All statements are related to the Meaning of Life.
Cluster Analysis of n = 132 university students using responses from Meaning of Life questionnaire (40 questions)
Discriminant Analysis of n = 132 university students into the three identified populations
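A hedged sketch of how such an analysis might be carried out in Python with scikit-learn is shown below. The simulated 132 × 40 data matrix merely stands in for the questionnaire responses, and the three clusters found by k-means play the role of the identified populations; none of these choices come from the course notes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
# Simulated responses: 132 cases, 40 variables (standing in for the questionnaire data)
X = rng.normal(size=(132, 40))

# Cluster analysis: look for subpopulations without using any group labels
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Discriminant analysis: use the variables to classify cases into the identified populations
lda = LinearDiscriminantAnalysis().fit(X, clusters)
print(lda.predict(X[:5]))    # predicted population for the first five students
```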
A Review of Linear Algebra With some Additions
Matrix Algebra
Definition
An n × m matrix, A = (aij), is a rectangular array of elements with n rows and m columns (aij is the element in row i and column j); its dimensions are n × m.
Definition
A vector, v, of dimension n is an n × 1 matrix: a rectangular array of elements arranged in a single column. Vectors will be taken to be column vectors (they may also be written as row vectors).
A vector, v, of dimension n can be thought of as a point in n-dimensional space.
[Figure: a vector of dimension 3 plotted as a point in 3-dimensional space, with coordinate axes v1, v2, v3]
Matrix Operations
Addition
Let A = (aij) and B = (bij) denote two n × m matrices. Then the sum, A + B, is the matrix A + B = (aij + bij). The dimensions of A and B are both required to be n × m.
Scalar Multiplication
Let A = (aij) denote an n × m matrix and let c be any scalar. Then cA is the matrix cA = (caij).
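A small NumPy illustration of the two operations just defined; the particular matrices and scalar are arbitrary examples.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])      # a 2 x 3 matrix
B = np.array([[6.0, 5.0, 4.0],
              [3.0, 2.0, 1.0]])      # same dimensions as A
c = 2.5

print(A + B)    # elementwise sum (aij + bij)
print(c * A)    # scalar multiple  (c * aij)
```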
Addition for vectors
[Figure: vector addition illustrated geometrically in 3-dimensional space (axes v1, v2, v3)]
Scalar Multiplication for vectors
[Figure: scalar multiplication of a vector illustrated in 3-dimensional space (axes v1, v2, v3)]
Matrix multiplication
Let A = (aij) denote an n × m matrix and B = (bjl) denote an m × k matrix. Then the n × k matrix C = (cil), where
cil = ai1b1l + ai2b2l + … + aimbml (the sum over j of aijbjl),
is called the product of A and B and is denoted by A∙B.
In the case that A = (aij) is an n × m matrix and B = v = (vj) is an m × 1 vector, then w = A∙v = (wi), where
wi = ai1v1 + ai2v2 + … + aimvm (the sum over j of aijvj),
is an n × 1 vector.
[Figure: the vector v and its image w = A∙v shown in 3-dimensional space (axes v1, v2, v3 and w1, w2, w3)]
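The following Python sketch checks the product formula cil = sum over j of aijbjl directly against NumPy's built-in matrix product, and also forms the matrix-vector product w = A∙v; the matrices and vector are arbitrary examples.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])           # n x m = 3 x 2
B = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0]])      # m x k = 2 x 3
v = np.array([2.0, -1.0])            # an m x 1 vector

# cil = sum_j aij * bjl, computed explicitly and via the @ operator
C_explicit = np.array([[sum(A[i, j] * B[j, l] for j in range(A.shape[1]))
                        for l in range(B.shape[1])]
                       for i in range(A.shape[0])])
print(np.allclose(C_explicit, A @ B))   # True

print(A @ v)                            # the n x 1 vector w with wi = sum_j aij * vj
```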
Definition
An n × n identity matrix, I, is the square matrix with 1's along the main diagonal and 0's everywhere else.
Note:
• AI = A
• IA = A
Definition (The inverse of an n × n matrix)
Let A denote an n × n matrix and let B denote an n × n matrix such that AB = BA = I. If such a matrix B exists, then A is called invertible; B is called the inverse of A and is denoted by A⁻¹.
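A quick NumPy check of the definition: compute A⁻¹ numerically and verify that AA⁻¹ = A⁻¹A = I. The 2 × 2 matrix is an arbitrary invertible example.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])            # an invertible 2 x 2 matrix
A_inv = np.linalg.inv(A)

I = np.eye(2)
print(np.allclose(A @ A_inv, I))      # True: A A^{-1} = I
print(np.allclose(A_inv @ A, I))      # True: A^{-1} A = I
```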
The Woodbury Theorem
(A + BCD)⁻¹ = A⁻¹ - A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹DA⁻¹,
where the indicated inverses exist.
Proof: Let
H = A⁻¹ - A⁻¹B(C⁻¹ + DA⁻¹B)⁻¹DA⁻¹.
Then all we need to show is that H(A + BCD) = (A + BCD)H = I.
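The identity can also be checked numerically. The sketch below compares the Woodbury formula with a direct inverse for randomly generated, purely illustrative matrices of compatible dimensions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 5, 2
A = np.diag(rng.uniform(1.0, 2.0, size=n))     # invertible n x n matrix
B = rng.normal(size=(n, k))
C = np.diag(rng.uniform(1.0, 2.0, size=k))     # invertible k x k matrix
D = rng.normal(size=(k, n))

A_inv, C_inv = np.linalg.inv(A), np.linalg.inv(C)
woodbury = A_inv - A_inv @ B @ np.linalg.inv(C_inv + D @ A_inv @ B) @ D @ A_inv
direct = np.linalg.inv(A + B @ C @ D)
print(np.allclose(woodbury, direct))           # True
```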
The Woodbury theorem can be used to find the inverse of some patterned matrices.
Example: Find the inverse of an n × n patterned matrix by writing it in the form A + BCD, where A is easy to invert, and then applying the Woodbury theorem.
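The particular patterned matrix from the slide is not reproduced here. As an assumed illustration, the sketch below applies the Woodbury theorem to the common "equal diagonal, equal off-diagonal" pattern, taking the A in the theorem to be (b - a)I, B = 1 (an n × 1 vector of ones), C = (a), and D = 1ᵀ; this gives the closed form shown in the comments, which the code verifies against a direct numerical inverse.

```python
import numpy as np

n, a, b = 4, 0.3, 1.0                      # assumed pattern: b on the diagonal, a elsewhere
one = np.ones((n, 1))
A = (b - a) * np.eye(n) + a * (one @ one.T)

# Woodbury with (b - a)I as the "A", B = 1, C = (a), D = 1^T gives:
# A^{-1} = (1/(b - a)) I - a / ((b - a)(b + (n - 1)a)) * 1 1^T
A_inv_closed = np.eye(n) / (b - a) - a / ((b - a) * (b + (n - 1) * a)) * (one @ one.T)

print(np.allclose(A_inv_closed, np.linalg.inv(A)))   # True
```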