ANCOVA
Group #4, AMS 572 – Data Analysis
Professor: Wei Zhu
Team 4: Lin Wang (Lana), Xian Lin (Ben), Zhide Mo (Jeff), Miao Zhang, Juan E. Mojica, Yuan Bian, Ruofeng Wen, Hemal Khandwala, Lei Lei, Xiaochen Li (Joe)
What is ANCOVA
• ANCOVA: Analysis of Covariance
• ANCOVA: a merge of ANOVA (Analysis of Variance) and Linear Regression
ANOVA
• Described by R. A. Fisher to assist in the analysis of data from agricultural experiments.
• Compares the means of any number of experimental conditions without any increase in Type I error (H0 is rejected when it is true).
ANOVA: a way of determining whether the average scores of groups differ significantly.
• Psychology: assess the average effect of different experimental conditions on subjects in terms of a particular dependent variable.
Ronald Aylmer Fisher (Feb. 17, 1890 – July 29, 1962)
• An English statistician, evolutionary biologist, and geneticist.
• Contributions: Analysis of Variance (ANOVA), maximum likelihood, F-distribution, etc.
Linear Regression
• Developed and applied in different areas from ANOVA
• Developed in biology and psychology
• The term "regression" was coined by Francis Galton in the nineteenth century to describe a biological phenomenon
Francis Galton studied the heights of parents and their adult children
• Conclusion: short parents' children are usually shorter than average, but still taller than their parents.
• Illustration: 5'4'' 5'6'' 5'8'' < 5'9'' (average height)
• Regression toward the Mean
Regression: applied to data obtained from correlational or non-experimental research
• Regression analysis helps us understand how changing one independent variable changes the value of the dependent variable.
Francis Galton (Feb. 16, 1822 – Jan. 17, 1911)
• English anthropologist, eugenicist, and statistician.
• Contributions:
• widely promoted regression toward the mean
• created the statistical concept of correlation
• a pioneer in eugenics, coined the term in 1883
• the first to apply statistical methods to the study of human differences
What is ANCOVA
• A statistical technique that combines regression and ANOVA (analysis of variance).
• Originally developed by R. A. Fisher to increase the precision of experimental analysis.
• Applied most frequently in quasi-experimental research.
• Involves variables that cannot be controlled directly.
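One-Way Layout Experiment
• Factor A with $a$ levels
• Sample of size $n_i$ from level $i$, $i = 1, \dots, a$
• Balanced design, if $n_1 = \dots = n_a = n$
This is a linear model to represent $Y_{ij}$
• $Y_{ij} = \mu_i + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma^2)$
• $\mu_i = \mu + \alpha_i$, where $\mu$ is the grand mean
ESTIMATORS
• $\hat{\mu} = \bar{Y}_{\cdot\cdot}$ (grand mean)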
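What is SSA?
• $SSA = \sum_{i=1}^{a} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2$, the factor A sum of squares
• $MSA = SSA / (a - 1)$, the factor A mean square, with $a - 1$ d.f.
What is SST?
• $SST = \sum_{i=1}^{a} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2$, the total sum of squares
• ANOVA identity: $SST = SSA + SSE$, where $SSE = \sum_{i=1}^{a} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2$ is the error sum of squares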
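The ANOVA identity can be checked numerically. The sketch below uses made-up data for three levels of factor A (the numbers are an assumption, chosen only for illustration) and computes SSA, SSE, and SST from their definitions.

```python
# Hypothetical data: one list of observations per level of factor A.
groups = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0],
    [1.0, 2.0, 3.0],
]

def mean(values):
    return sum(values) / len(values)

all_values = [y for g in groups for y in g]
grand_mean = mean(all_values)
group_means = [mean(g) for g in groups]

# SSA: factor A (between-group) sum of squares
ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# SSE: error (within-group) sum of squares
sse = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
# SST: total sum of squares
sst = sum((y - grand_mean) ** 2 for y in all_values)

a = len(groups)
msa = ssa / (a - 1)  # factor A mean square, a - 1 d.f.

print(ssa, sse, sst)  # SST should equal SSA + SSE up to rounding
```

For an unbalanced layout like this one, the weights $n_i$ in SSA matter; dropping them would break the identity.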
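Model of ANOVA
$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
• $Y_{ij}$: data, the jth observation of the ith group
• $\mu$: grand mean of Y
• $\alpha_i$: effect of the ith group (we focus on whether $\alpha_i = 0$, $i = 1, \dots, a$)
• $\varepsilon_{ij}$: error, $N(0, \sigma^2)$
Model of Linear Regression
$Y_{ij} = \beta_0 + \beta X_{ij} + \varepsilon_{ij}$
• $Y_{ij}$: data, the (ij)th observation
• $X_{ij}$: predictor
• $\beta$ and $\beta_0$: slope and intercept (we focus on the estimate)
• $\varepsilon_{ij}$: error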
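ANCOVA is ANOVA merged with Linear Regression
$Y_{ij} = \mu + \alpha_i + \beta (X_{ij} - \bar{X}_{\cdot\cdot}) + \varepsilon_{ij}$
• $\alpha_i$: effect of the ith group (we still focus on whether $\alpha_i = 0$, $i = 1, \dots, a$)
• $X_{ij}$: known covariate (What is this guy doing here?)
How to perform ANCOVA
$Y_{ij} - \beta (X_{ij} - \bar{X}_{\cdot\cdot}) = \mu + \alpha_i + \varepsilon_{ij}$ (This is just the ANOVA model!)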
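How do we get $\hat{\beta}$, then?
• Within each group, consider $\alpha_i$ a constant, and notice that we only need the estimate of the slope $\beta$, not of the intercept.
How do we get $\hat{\beta}$, then? (2)
• Within each group, do least squares: $\hat{\beta}_i = \frac{\sum_j (X_{ij} - \bar{X}_{i\cdot})(Y_{ij} - \bar{Y}_{i\cdot})}{\sum_j (X_{ij} - \bar{X}_{i\cdot})^2}$
• Assume that $\beta_1 = \dots = \beta_a = \beta$
How do we get $\hat{\beta}$, then? (3)
• We use the pooled estimate of $\beta$: $\hat{\beta} = \frac{\sum_i \sum_j (X_{ij} - \bar{X}_{i\cdot})(Y_{ij} - \bar{Y}_{i\cdot})}{\sum_i \sum_j (X_{ij} - \bar{X}_{i\cdot})^2}$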
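As a rough illustration of the pooling step, the sketch below computes a least-squares slope in each group and then a pooled slope. It assumes the common pooled form, in which within-group cross-products and within-group squared deviations are summed over groups before dividing; the data and the helper names (`s_xy`, `s_xx`) are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def s_xy(x, y):
    """Sum of cross-products of deviations from the group means."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

def s_xx(x):
    """Sum of squared deviations of the covariate from its group mean."""
    mx = mean(x)
    return sum((xi - mx) ** 2 for xi in x)

# Hypothetical data: (covariate values, response values) for each group.
groups = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),            # within-group slope 2
    ([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]),  # within-group slope 1
]

# Least-squares slope inside each group (the intercept is not needed).
group_slopes = [s_xy(x, y) / s_xx(x) for x, y in groups]

# Pooled slope: sum numerators and denominators across groups, then divide.
pooled_slope = sum(s_xy(x, y) for x, y in groups) / sum(s_xx(x) for x, _ in groups)

print(group_slopes, pooled_slope)
```

The pooled value is a weighted average of the group slopes, with weights proportional to each group's covariate spread, so it always lies between the smallest and largest group slope.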
ANCOVA begins:
1. In each group, find the slope estimate via linear regression.
2. Pool them together.
3. Get rid of the covariate.
4. Do ANOVA on the model.
5. Go home and have dinner.
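The steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `ancova_by_slide_steps` and the data are hypothetical, the covariate is centered at its overall mean, and one error degree of freedom is subtracted for the estimated slope. The between-group sum of squares obtained this way is not, in general, identical to the adjusted treatment sum of squares from a full-versus-reduced model comparison, so the F value here should be read only as an illustration of the procedure.

```python
def mean(values):
    return sum(values) / len(values)

def ancova_by_slide_steps(groups):
    """groups: list of (x, y) pairs, one pair of lists per level of the factor."""
    a = len(groups)
    n_total = sum(len(x) for x, _ in groups)

    # Steps 1-2: within-group least squares, pooled into one slope estimate.
    sxy = sxx = 0.0
    for x, y in groups:
        mx, my = mean(x), mean(y)
        sxy += sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        sxx += sum((xi - mx) ** 2 for xi in x)
    beta = sxy / sxx

    # Step 3: get rid of the covariate (centered at its overall mean).
    x_bar = mean([xi for x, _ in groups for xi in x])
    adjusted = [[yi - beta * (xi - x_bar) for xi, yi in zip(x, y)]
                for x, y in groups]

    # Step 4: one-way ANOVA on the adjusted responses.
    grand = mean([v for g in adjusted for v in g])
    ssa = sum(len(g) * (mean(g) - grand) ** 2 for g in adjusted)
    sse = sum((v - mean(g)) ** 2 for g in adjusted for v in g)
    df_a = a - 1
    df_error = n_total - a - 1  # assumption: one d.f. spent on the slope
    f_stat = (ssa / df_a) / (sse / df_error)
    return {"beta": beta, "ssa": ssa, "sse": sse,
            "df_a": df_a, "df_error": df_error, "f": f_stat}

# Hypothetical data for three groups.
data = [
    ([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8]),
    ([2.0, 3.0, 4.0, 5.0], [8.0, 10.1, 11.9, 14.0]),
    ([1.0, 3.0, 5.0, 7.0], [1.2, 4.8, 9.1, 13.0]),
]
result = ancova_by_slide_steps(data)
print(result)
```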
ANCOVA, ANOVA and Regression
[Diagram: General Linear Model, covering Regression and ANOVA / ANCOVA]
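Simple Linear Regression
$Y = \beta_0 + \beta_1 X + \varepsilon$
• $Y$: response variable
• $X$: predictor
• $\beta_0$: intercept
• $\beta_1$: slope
• $\varepsilon$: error
• All of them are scalars!
ANOVA: Dummy Variable Regression
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
• $Y_i$: outcome of the ith unit
• $X_i$: categorical variable (binary)
• $\beta_0$: coefficient for the intercept
• $\beta_1$: coefficient for the slope
• $\varepsilon_i$: residual for the ith unit
• More about $X_i$: $X_i = 1$ if unit i is in the treatment group, $X_i = 0$ if unit i is in the control group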
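A small numerical check shows why a two-group ANOVA can be written as a regression on a 0/1 dummy variable: the fitted intercept is the control-group mean and the fitted slope is the difference between the group means. The data below are made up for illustration.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical outcomes for the two groups.
control = [10.0, 12.0, 11.0]
treatment = [15.0, 14.0, 16.0]

# Dummy coding: 0 for control units, 1 for treatment units.
x = [0.0] * len(control) + [1.0] * len(treatment)
y = control + treatment

# Ordinary least squares for a single predictor.
x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

print(intercept, mean(control))                  # intercept equals the control mean
print(slope, mean(treatment) - mean(control))    # slope equals the mean difference
```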
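Two-way ANOVA
$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$
• $Y_{ijk}$: response variable
• $\mu$: overall mean response
• $\alpha_i$: effect due to the ith level of factor A
• $\beta_j$: effect due to the jth level of factor B
• $(\alpha\beta)_{ij}$: effect due to any interaction between the ith level of A and the jth level of B
• $\varepsilon_{ijk}$: residual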
General Linear Model
$Y_i = \beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip} + \varepsilon_i$
• $Y_i$: the ith response variable
• $X_{i1}, \dots, X_{ip}$: categorical variables or continuous variables
• $\varepsilon_i$: random error
• The above formula can be simply denoted as: $Y = X\beta + \varepsilon$
• What can this X be?
• Before we see an example of X, we have learned that the General Linear Model covers (1) Simple Linear Regression; (2) Multiple Linear Regression; (3) ANOVA; (4) 2-way/n-way ANOVA.
X: Interaction Between Random Variables
• X in the GLM might be expanded as $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$
• Here $X_3$ could be the INTERACTION between $X_1$ and $X_2$, that is, $X_3 = X_1 X_2$
• Did you see the trick?
• Next, let us see what assumptions must be satisfied before using ANCOVA.
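To make the role of the interaction column concrete, the sketch below builds design-matrix rows of the form (1, X1, X2, X1·X2) and evaluates the linear predictor Xβ. Here X1 is taken to be a 0/1 group indicator and X2 a continuous covariate; the coefficient values and the helper names `design_row` and `predict` are hypothetical.

```python
def design_row(x1, x2):
    """One row of the design matrix: intercept, X1, X2, and X3 = X1 * X2."""
    return [1.0, x1, x2, x1 * x2]

def predict(row, beta):
    """Linear predictor: the dot product of a design row with beta."""
    return sum(b * v for b, v in zip(beta, row))

# Hypothetical coefficients (beta0, beta1, beta2, beta3).
beta = [1.0, 2.0, 0.5, -1.0]

# X1 is a 0/1 group indicator, X2 is a continuous covariate.
observations = [(0.0, 1.5), (1.0, 1.5), (0.0, 3.0), (1.0, 3.0)]
X = [design_row(x1, x2) for x1, x2 in observations]
y_hat = [predict(row, beta) for row in X]

# Slope of X2 inside each group: beta2 when X1 = 0, beta2 + beta3 when X1 = 1.
slope_group0 = beta[2]
slope_group1 = beta[2] + beta[3]
print(y_hat, slope_group0, slope_group1)
```

A nonzero coefficient on the interaction column means the covariate slope differs between groups, which is exactly what the homogeneity-of-regression assumption rules out.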
Test the Three Assumptions
• Before using ANCOVA…
1. Test the homogeneity of variance.
2. Test the homogeneity of regression: whether H0: $\beta_1 = \beta_2 = \dots = \beta_a$.
3. Test whether there is a linear relationship between the dependent variable and the covariate.
2. Test Whether H0: $\beta_1 = \dots = \beta_a$ (2)
(1) Define the Sum of Squares of Errors within Groups, SSEG.
• SSEG is calculated based on the separate slope estimates $\hat{\beta}_i$.
• SSEG is generated by the random error $\varepsilon_{ij}$.
2. Test Whether H0: $\beta_1 = \dots = \beta_a$ (3)
(2) SSE is generated by
• random error
• differences between distinct $\beta_i$
We can calculate SSE based on a common $\hat{\beta}$.
(3) Let SSB = SSE – SSEG, the Sum of Squares between Groups.
• SSB is constituted by the differences between the different $\beta_i$.
2. Test Whether H0: $\beta_1 = \dots = \beta_a$ (4)
• MSB: Mean Square between Groups
• MSEG: Mean Square within Groups
• Do an F test on MSB and MSEG to see whether we can reject H0: F = MSB / MSEG
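The F statistic for homogeneity of regression can be sketched as follows. The degrees of freedom are an assumption, since they did not survive in the slide text: a − 1 for SSB and N − 2a for SSEG (one intercept and one slope fitted per group). The function name and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def within_sums(x, y):
    """Within-group sums of squares and cross-products: (Sxx, Sxy, Syy)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    return sxx, sxy, syy

def homogeneity_of_regression(groups):
    a = len(groups)
    n_total = sum(len(x) for x, _ in groups)
    sums = [within_sums(x, y) for x, y in groups]

    # SSEG: residual sum of squares with a separate slope in every group.
    sseg = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)

    # SSE: residual sum of squares with one common (pooled) slope.
    sxx_w = sum(s[0] for s in sums)
    sxy_w = sum(s[1] for s in sums)
    syy_w = sum(s[2] for s in sums)
    sse = syy_w - sxy_w ** 2 / sxx_w

    ssb = sse - sseg
    df_b = a - 1            # assumed d.f. for SSB
    df_w = n_total - 2 * a  # assumed d.f. for SSEG
    f_stat = (ssb / df_b) / (sseg / df_w)
    return {"sseg": sseg, "sse": sse, "ssb": ssb, "f": f_stat}

# Hypothetical data: the third group has a visibly different slope.
data = [
    ([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8]),
    ([2.0, 3.0, 4.0, 5.0], [8.0, 10.1, 11.9, 14.0]),
    ([1.0, 3.0, 5.0, 7.0], [1.2, 3.1, 4.9, 7.2]),
]
check = homogeneity_of_regression(data)
print(check)
```

A large F relative to the F distribution with the degrees of freedom above would be evidence against a common slope.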
3. Test Linear Relationship (1)
• Assumption 3: test for a linear relationship between the dependent variable and the covariate.
• H0: $\beta = 0$
• How to do it? An F test on SSR (Sum of Squares of Regression) and SSE.
3. Test Linear Relationship (2)
• How to calculate SSR and MSR?
• SSR is obtained by summing the squared differences between each fitted value and the corresponding mean.
• MSR: Mean Square Regression
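Because the slide formulas for SSR and MSR are not recoverable from the text, the sketch below falls back on the standard within-group decomposition as an assumption: SSR is the part of the within-group variation of Y explained by the pooled slope (1 d.f.), and SSE is the remainder (N − a − 1 d.f.). The data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical data: (covariate values, response values) for each group.
groups = [
    ([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8]),
    ([2.0, 3.0, 4.0, 5.0], [8.0, 10.1, 11.9, 14.0]),
]

a = len(groups)
n_total = sum(len(x) for x, _ in groups)

# Pooled within-group sums of squares and cross-products.
sxx_w = sxy_w = syy_w = 0.0
for x, y in groups:
    mx, my = mean(x), mean(y)
    sxx_w += sum((xi - mx) ** 2 for xi in x)
    sxy_w += sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy_w += sum((yi - my) ** 2 for yi in y)

# SSR: variation explained by the pooled slope (assumed 1 d.f.).
ssr = sxy_w ** 2 / sxx_w
msr = ssr / 1

# SSE: what is left within groups (assumed N - a - 1 d.f.).
sse = syy_w - ssr
mse = sse / (n_total - a - 1)

f_stat = msr / mse
print(ssr, sse, f_stat)
```

With H0: β = 0, a large F suggests the covariate does carry a linear relationship with the dependent variable, which is what ANCOVA relies on.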