Analysis of Variance

Analysis of Variance: Comparisons among multiple populations

More than two populations
• Pairwise hypotheses, discussed in Chapter 8: H0: μA=μB, H0: μC=μD, H0: μB=μC
• Joint hypothesis: H0: μA=μB=μC=μD

Basic assumptions and the hypothesis testing logic
• The observed data are normally distributed with the same (although unknown) variance σ².
• Derive two estimators for σ².
• The first is always valid, whether the hypothesis H0: μA=μB=μC=μD is true or not.
• The second is usually greater than the real parameter σ² when H0: μA=μB=μC=μD is not true.
• Compare these two estimators (2nd/1st) through the sample.
• If the ratio is too large, then reject H0.
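The logic above can be sketched numerically. This is a minimal illustration, assuming equal group sizes; the function name `variance_estimators` and the toy data are made up for the example. The first estimator averages the per-group sample variances, and the second is n times the sample variance of the group means.

```python
from statistics import mean, variance

def variance_estimators(groups):
    """Return (first, second) estimates of sigma^2 for equal-size groups."""
    n = len(groups[0])  # observations per group (assumed equal)
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    # First estimator: average of the per-group sample variances.
    # It does not depend on whether the group means are equal.
    first = mean(variance(g) for g in groups)
    # Second estimator: n times the sample variance of the group means.
    # It tends to exceed sigma^2 when the population means differ.
    second = n * variance([mean(g) for g in groups])
    return first, second

# Hypothetical data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
first, second = variance_estimators(groups)
ratio = second / first  # a large ratio is evidence against H0
print(first, second, ratio)
```

With this toy data the group means (2, 3, 6) are far apart relative to the spread inside each group, so the ratio comes out well above 1.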

ANOVA testing
• ANOVA: ANalysis Of VAriance
• Test the hypothesis under consideration by analyzing the variance σ²
• Search for the proper estimators
• Decomposition of variance
• …

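E.g., decomposition of Syy: Syy = SSr + SSm, i.e. $\sum_i (Y_i-\bar{Y})^2 = \sum_i (Y_i-\hat{Y}_i)^2 + \sum_i (\hat{Y}_i-\bar{Y})^2$
[Figure: fitted regression line of Y against X, marking the three deviations for an observation Yi]
• $(Y_i-\hat{Y}_i)$: the smaller this deviation, the closer the regression line is to the true values and the more accurate the prediction.
• $(Y_i-\bar{Y})$: the deviation of Y itself; it is fixed once the sample is given.
• $(\hat{Y}_i-\bar{Y})$: the larger this deviation, the less likely it is that Ŷ is a horizontal line, because for a horizontal line this sum of squared differences would be very small.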

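Review χ² distribution
• X1, X2, X3, …, Xn are independent random variables with $X_i \sim N(\mu_i, \sigma^2)$
• Standardize: $(X_i-\mu_i)/\sigma \sim N(0,1)$
• Sum the squares: $\sum_{i=1}^{n} \left(\frac{X_i-\mu_i}{\sigma}\right)^2 \sim \chi^2_n$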
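A quick simulation can make the χ² review concrete. This is a sketch under assumed values (n = 5, a common mean of 10 and σ = 2, all chosen arbitrarily): the sum of n squared standardized normal variables should have mean close to n and variance close to 2n, which are the mean and variance of a χ² distribution with n degrees of freedom.

```python
import random
from statistics import mean, variance

def chi_square_draw(n, mu, sigma, rng):
    """One draw of sum(((X_i - mu) / sigma)^2) for n independent N(mu, sigma^2) values."""
    return sum(((rng.gauss(mu, sigma) - mu) / sigma) ** 2 for _ in range(n))

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5                   # degrees of freedom (illustrative choice)
draws = [chi_square_draw(n, mu=10.0, sigma=2.0, rng=rng) for _ in range(20000)]

sample_mean = mean(draws)     # should be close to n
sample_var = variance(draws)  # should be close to 2 * n
print(sample_mean, sample_var)
```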

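One-way ANOVA approach (i)
• Notation: $X_{ij}$ is the jth observation in the ith group, i = 1, …, m (# of groups), j = 1, …, n (# of obs. in each group; equal group size n), with $X_{ij} \sim N(\mu_i, \sigma^2)$, where $\mu_i$ is the ith group mean
• $\sum_{i=1}^{m}\sum_{j=1}^{n} (X_{ij}-\mu_i)^2/\sigma^2 \sim \chi^2_{nm}$; total d.f. = # of groups × obs. in each group
• Replace the group mean $\mu_i$ by the group sample mean $X_{i.} = \sum_{j=1}^{n} X_{ij}/n$: $\sum_{i=1}^{m}\sum_{j=1}^{n} (X_{ij}-X_{i.})^2/\sigma^2 \sim \chi^2_{nm-m}$; total d.f. = # of groups × (obs. in each group − 1)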

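One-way ANOVA approach (ii)
• By definition, $SS_w = \sum_{i=1}^{m}\sum_{j=1}^{n} (X_{ij}-X_{i.})^2$, called the "within samples sum of squares"
• $SS_w/\sigma^2 \sim \chi^2_{nm-m}$; ∵ $E[\chi^2_{nm-m}] = nm-m$, ∴ $E[SS_w/(nm-m)] = \sigma^2$
• Group sample mean: $X_{i.} = \sum_{j=1}^{n} X_{ij}/n$; total sample mean: $X_{..} = \sum_{i=1}^{m}\sum_{j=1}^{n} X_{ij}/(nm)$
• By definition, $SS_b = n\sum_{i=1}^{m} (X_{i.}-X_{..})^2$, called the "between samples sum of squares"
• ∵ $Var(X_{i.}) = Var(X_{ij})/n = \sigma^2/n$, ∴ if all $\mu_i = \mu$, then $n\sum_{i=1}^{m} (X_{i.}-\mu)^2/\sigma^2 \sim \chi^2_m$; i.e., assume all $X_{i.}$ population means are equal to μ in order to replace μ by the total sample mean $X_{..}$
• With the total mean replaced by the total sample mean $X_{..}$: $SS_b/\sigma^2 \sim \chi^2_{m-1}$, d.f. = # of groups − 1, so under H0 $E[SS_b/(m-1)] = \sigma^2$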

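One-way ANOVA approach (iii)
• Under H0 (m groups, n observations per group), $TS = \frac{SS_b/(m-1)}{SS_w/(nm-m)} \sim F_{m-1,\,nm-m}$
• Reject H0 when $TS > F_{\alpha,\,m-1,\,nm-m}$, i.e., the numerator is sufficiently large, while the denominator is smaller
• Or reject H0 when p-value $= P(F_{m-1,\,nm-m} > TS) < \alpha$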
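The decision rule can be written as a small helper. This is a sketch, assuming the test statistic is the ratio of SSb/(m−1) to SSw/(nm−m) for m groups of n observations each; the function names and the input sums of squares are illustrative, and the critical value 5.14 is the tabulated upper 5% point of the F distribution with (2, 6) degrees of freedom.

```python
def f_statistic(ssb, ssw, m, n):
    """TS = [SSb / (m - 1)] / [SSw / (nm - m)] for m groups of n observations."""
    msb = ssb / (m - 1)      # between-samples mean square
    msw = ssw / (n * m - m)  # within-samples mean square
    return msb / msw

def reject_h0(ts, critical_value):
    """Reject H0 when the test statistic exceeds the F critical value."""
    return ts > critical_value

# Hypothetical sums of squares for m = 3 groups with n = 3 observations each.
ts = f_statistic(ssb=26.0, ssw=6.0, m=3, n=3)
# 5.14 is the tabulated 5% critical value of F with (2, 6) degrees of freedom.
decision = reject_h0(ts, critical_value=5.14)
print(ts, decision)
```

The critical value is passed in rather than computed so that the sketch needs only the standard library; in practice it would come from an F table or a statistics package.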

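Decomposition of Var(Xij) (i)
[Figure: observations Xij from group A, group B and group C plotted around their group means (A group mean, B group mean XB., C group mean) and the total mean X..; the total deviation (Xij − X..) is split into the between deviation (Xi. − X..) and the within deviation (Xij − Xi.)]
• If the group differences are small, the deviation from the center should be caused mostly by the within-group randomness.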

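Decomposition of Var(Xij) (ii)
• Usually, define SST as the total sum of squares: $SS_T = \sum_{i=1}^{m}\sum_{j=1}^{n} (X_{ij}-X_{..})^2$
• $SS_T = \sum_{i=1}^{m}\sum_{j=1}^{n} (X_{ij}-X_{i.})^2 + n\sum_{i=1}^{m} (X_{i.}-X_{..})^2 = SS_w + SS_b$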
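The decomposition of the total sum of squares into between and within parts can be checked numerically. This is a minimal sketch with made-up data; `sums_of_squares` is a hypothetical helper name, not something defined in the course material.

```python
def sums_of_squares(groups):
    """Return (SST, SSb, SSw) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand = sum(all_values) / len(all_values)   # total sample mean X..
    means = [sum(g) / len(g) for g in groups]   # group sample means Xi.
    # Total: every observation against the total mean.
    sst = sum((x - grand) ** 2 for x in all_values)
    # Within: every observation against its own group mean.
    ssw = sum((x - gm) ** 2 for g, gm in zip(groups, means) for x in g)
    # Between: group means against the total mean, weighted by group size.
    ssb = sum(len(g) * (gm - grand) ** 2 for g, gm in zip(groups, means))
    return sst, ssb, ssw

# Hypothetical data: three groups of three observations each.
groups = [[3, 5, 4], [6, 8, 7], [9, 10, 11]]
sst, ssb, ssw = sums_of_squares(groups)
print(sst, ssb, ssw)
```

For this data the total sum of squares is 60, of which 54 is between groups and 6 is within groups.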

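If H0 is not accepted?
• $X_{i.} \sim N(\mu_i, \sigma^2/n)$
• Set $\mu_. = \sum_{i=1}^{m} \mu_i/m$ & $Y_i = X_{i.} - \mu_i + \mu_.$, ∴ $Y_i \sim N(\mu_., \sigma^2/n)$
• ∵ $X_{i.} = Y_i + \mu_i - \mu_.$ and $X_{..} = Y_.$, we have $X_{i.} - X_{..} = (Y_i - Y_.) + (\mu_i - \mu_.)$
• ∴ $E[(X_{i.}-X_{..})^2] = E[(Y_i-Y_.)^2] + (\mu_i-\mu_.)^2$; the cross term vanishes because $E[Y_i - Y_.] = E[Y_i]-E[Y_.] = \mu_.-\mu_. = 0$
• Since the $Y_i$ share a common mean, $E[\sum_{i=1}^{m} (Y_i-Y_.)^2] = (m-1)\sigma^2/n$, so $E[SS_b/(m-1)] = \sigma^2 + n\sum_{i=1}^{m} (\mu_i-\mu_.)^2/(m-1) \ge \sigma^2$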
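When the group means differ, the between-samples estimator overshoots σ². The simulation below is a sketch of that effect under assumed values (three groups with population means 0, 1, 2, n = 4 and σ = 1, all chosen for illustration). It compares the simulated average of SSb/(m−1) with σ² + nΣ(μi−μ.)²/(m−1), the standard expression for its expectation.

```python
import random

def msb_draw(mus, n, sigma, rng):
    """One simulated value of SSb / (m - 1) for groups with population means `mus`."""
    m = len(mus)
    # Group sample means Xi., each from n independent N(mu_i, sigma^2) draws.
    group_means = [sum(rng.gauss(mu, sigma) for _ in range(n)) / n for mu in mus]
    grand = sum(group_means) / m  # equals X.. because group sizes are equal
    ssb = n * sum((gm - grand) ** 2 for gm in group_means)
    return ssb / (m - 1)

rng = random.Random(1)                    # fixed seed for repeatability
mus, n, sigma = [0.0, 1.0, 2.0], 4, 1.0   # illustrative values; H0 is false here
reps = 20000
simulated = sum(msb_draw(mus, n, sigma, rng) for _ in range(reps)) / reps

mu_bar = sum(mus) / len(mus)
expected = sigma ** 2 + n * sum((mu - mu_bar) ** 2 for mu in mus) / (len(mus) - 1)
print(simulated, expected)
```

Here the expected value is 5, five times σ², which is why a large ratio of the second estimator to the first points away from H0.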

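ANOVA table
Source of variation    Sum of squares    d.f.      Mean square            F
Between samples        SSb               m − 1     MSb = SSb/(m − 1)      F = MSb/MSw
Within samples         SSw               nm − m    MSw = SSw/(nm − m)
Total                  SST               nm − 1
∴ SST = SSb + SSw
If p-value < α, reject H0.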

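The meaning of ANOVA table
SST = SSb + SSw. The larger SSb is, the smaller SSw will be; so, with the number of groups m unchanged, MSb becomes larger and MSw smaller, the F value becomes larger, and we are more likely to reject H0: the group means do not differ. In other words, the difference between the observed variable Xij and the total mean X.. is mostly caused by the differences among the group means Xi. .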

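Unbalanced case: unequal sample size within the groups
• Different group size $n_i$: $X_{ij}$, i = 1, …, m, j = 1, …, $n_i$
• Unconditional estimator of σ² (valid whether or not H0 is true): $SS_w/(\sum_{i=1}^{m} n_i - m)$, where $SS_w = \sum_{i=1}^{m}\sum_{j=1}^{n_i} (X_{ij}-X_{i.})^2$
• Conditional estimator of σ² (valid only when H0 is true): $SS_b/(m-1)$, where $SS_b = \sum_{i=1}^{m} n_i (X_{i.}-X_{..})^2$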

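Unbalanced F-test for ANOVA
• Under H0, $TS = \frac{SS_b/(m-1)}{SS_w/(\sum_i n_i - m)} \sim F_{m-1,\,\sum_i n_i - m}$; reject H0 when TS is too large
• A balanced design is preferable to an unbalanced one because the test is then less sensitive to slight departures from the assumption of equal population variances.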
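For unequal group sizes the same ratio applies with each group weighted by its own size. The sketch below assumes the usual unbalanced formulas, SSb = Σ ni (Xi. − X..)² with m − 1 degrees of freedom and SSw with Σni − m degrees of freedom; the function name and the data are invented for illustration.

```python
def unbalanced_f(groups):
    """Return (TS, df_between, df_within) for groups of possibly unequal size."""
    m = len(groups)
    total_n = sum(len(g) for g in groups)
    grand = sum(x for g in groups for x in g) / total_n  # total sample mean X..
    means = [sum(g) / len(g) for g in groups]            # group sample means Xi.
    # Within-samples sum of squares: deviations from each group's own mean.
    ssw = sum((x - gm) ** 2 for g, gm in zip(groups, means) for x in g)
    # Between-samples sum of squares: each group weighted by its size n_i.
    ssb = sum(len(g) * (gm - grand) ** 2 for g, gm in zip(groups, means))
    df_between, df_within = m - 1, total_n - m
    return (ssb / df_between) / (ssw / df_within), df_between, df_within

# Hypothetical data with group sizes 2, 3 and 4.
groups = [[1.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0, 12.0]]
ts, df_between, df_within = unbalanced_f(groups)
print(ts, df_between, df_within)
```

The resulting statistic would then be compared with the F critical value for (df_between, df_within) degrees of freedom.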

Two classification factors
• Each observation is classified by two factors: a row factor and a column factor.

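Two-way ANOVA approach (i)
• Only one observation within each cell: $X_{ij}$, i = 1, …, m (m types of the row factor), j = 1, …, n (n types of the column factor)
• Review of the one-way model: $E[X_{ij}] = \mu_i = \mu + \alpha_i$, where $\alpha_i = \mu_i - \mu$ is the deviation from the total μ, so $\sum_{i=1}^{m} \alpha_i = 0$ (∵ $\mu = \sum_{i=1}^{m} \mu_i/m$)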

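Two-way ANOVA approach (ii)
• Suppose an additive model for the cell mean, composed of $a_i$ and $b_j$: $E[X_{ij}] = \mu_{ij} = a_i + b_j$ (the cell mean, for a cell of size k or other)
• Average row factor: $a_. = \sum_{i=1}^{m} a_i/m$; average column factor: $b_. = \sum_{j=1}^{n} b_j/n$
• The ith row mean: $\mu_{i.} = \sum_{j=1}^{n} \mu_{ij}/n = a_i + b_.$, i.e., the ith row mean = the average column factor + the specific ith row factor
• The jth column mean: $\mu_{.j} = \sum_{i=1}^{m} \mu_{ij}/m = a_. + b_j$
• The total mean: $\mu = \mu_{..} = a_. + b_.$
• Deviations from the average row factor and the average column factor: $\alpha_i = a_i - a_. = \mu_{i.} - \mu$ and $\beta_j = b_j - b_. = \mu_{.j} - \mu$, so $\sum_{i=1}^{m} \alpha_i = 0$ & $\sum_{j=1}^{n} \beta_j = 0$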

Two-way ANOVA approach (iii)

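Two-way ANOVA approach (iv)
• $E[X_{ij}] = \mu + \alpha_i + \beta_j$, i.e., the expected value of a specific ij cell can be decomposed into: total mean + the ith deviation from the average row factor (the ith row deviation from the total mean) + the jth deviation from the average column factor (the jth column deviation from the total mean)
• The assumed two-way ANOVA model: $X_{ij} \sim N(\mu + \alpha_i + \beta_j, \sigma^2)$ with $\sum_{i=1}^{m} \alpha_i = 0$ & $\sum_{j=1}^{n} \beta_j = 0$
• Use the unbiased estimators $\hat{\mu} = X_{..}$, $\hat{\alpha}_i = X_{i.} - X_{..}$, $\hat{\beta}_j = X_{.j} - X_{..}$ to test the hypotheses of interest, H0: all $\alpha_i = 0$ and H0: all $\beta_j = 0$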
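The additive decomposition of a cell mean into a total mean plus row and column deviations can be illustrated with a few lines of code. This sketch assumes the additive form in which each cell mean is a row factor a_i plus a column factor b_j; the function name and the factor values are hypothetical.

```python
def additive_decomposition(a, b):
    """Split additive cell means a_i + b_j into (mu, alpha, beta)."""
    a_bar = sum(a) / len(a)            # average row factor
    b_bar = sum(b) / len(b)            # average column factor
    mu = a_bar + b_bar                 # total mean
    alpha = [ai - a_bar for ai in a]   # row deviations, which sum to zero
    beta = [bj - b_bar for bj in b]    # column deviations, which sum to zero
    return mu, alpha, beta

# Hypothetical row factors (3 rows) and column factors (2 columns).
a = [1.0, 2.0, 6.0]
b = [10.0, 20.0]
mu, alpha, beta = additive_decomposition(a, b)

# Every cell mean a_i + b_j is recovered as mu + alpha_i + beta_j.
rebuilt = [[mu + ai + bj for bj in beta] for ai in alpha]
print(mu, alpha, beta, rebuilt)
```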

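Two-way ANOVA approach (v)
• $\sum_{i=1}^{m}\sum_{j=1}^{n} (X_{ij}-\mu-\alpha_i-\beta_j)^2/\sigma^2 \sim \chi^2_{nm}$
• Apply each unbiased estimator ($\hat{\mu} = X_{..}$, $\hat{\alpha}_i = X_{i.} - X_{..}$, $\hat{\beta}_j = X_{.j} - X_{..}$): $\sum_{i=1}^{m}\sum_{j=1}^{n} (X_{ij} - X_{i.} - X_{.j} + X_{..})^2/\sigma^2 \sim \chi^2_{(n-1)(m-1)}$
• Degrees of freedom: estimating the $\beta_j$ reduces n−1 d.f., estimating the $\alpha_i$ reduces m−1 d.f., and estimating μ reduces 1 d.f., so nm − (n−1) − (m−1) − 1 = (n−1)(m−1)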

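Two-way ANOVA approach (vi)
• If H0: all $\alpha_i = 0$ is true, then $X_{i.} \sim N(\mu, \sigma^2/n)$, so $n\sum_{i=1}^{m} (X_{i.}-\mu)^2/\sigma^2 \sim \chi^2_m$ and $n\sum_{i=1}^{m} (X_{i.}-X_{..})^2/\sigma^2 \sim \chi^2_{m-1}$
• Define the row sum of squares $SS_{row} = n\sum_{i=1}^{m} (X_{i.}-X_{..})^2$; under H0, $E[SS_{row}/(m-1)] = \sigma^2$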

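Two-way ANOVA table
Source    Sum of squares                                d.f.            Test statistic
Row       SSrow = n Σi (Xi. − X..)²                     m − 1           [SSrow/(m − 1)] / [SSe/((n − 1)(m − 1))]
Column    SScol = m Σj (X.j − X..)²                     n − 1           [SScol/(n − 1)] / [SSe/((n − 1)(m − 1))]
Error     SSe = Σi Σj (Xij − Xi. − X.j + X..)²          (n − 1)(m − 1)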
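The entries of the two-way table can be computed directly from an m × n array with one observation per cell. This is a sketch assuming the standard formulas for the row, column and error sums of squares in the additive model without interaction; the function name, the dictionary keys and the data are illustrative only.

```python
def two_way_anova(table):
    """table[i][j] is the single observation in row i, column j (m rows, n columns)."""
    m, n = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (m * n)                         # X..
    row_means = [sum(row) / n for row in table]                              # Xi.
    col_means = [sum(table[i][j] for i in range(m)) / m for j in range(n)]   # X.j
    ss_row = n * sum((r - grand) ** 2 for r in row_means)
    ss_col = m * sum((c - grand) ** 2 for c in col_means)
    # Error sum of squares: what remains after removing row and column effects.
    ss_err = sum((table[i][j] - row_means[i] - col_means[j] + grand) ** 2
                 for i in range(m) for j in range(n))
    df_err = (m - 1) * (n - 1)
    ms_err = ss_err / df_err
    return {
        "ss_row": ss_row, "ss_col": ss_col, "ss_err": ss_err,
        "f_row": (ss_row / (m - 1)) / ms_err,  # tests H0: no row effect
        "f_col": (ss_col / (n - 1)) / ms_err,  # tests H0: no column effect
    }

# Hypothetical 3 x 3 data, one observation per cell.
table = [[1, 2, 4], [3, 5, 6], [6, 7, 11]]
result = two_way_anova(table)
print(result)
```

Each F value would be compared with the critical value for its own numerator degrees of freedom (m − 1 for rows, n − 1 for columns) and (n − 1)(m − 1) denominator degrees of freedom.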

Two-way ANOVA with interaction (i)

Two-way ANOVA with interaction (ii)

Two-way ANOVA with interaction (iii)

Two-way ANOVA with interaction (iv)

Two-way ANOVA with interaction (v)

Two-way ANOVA with interaction (vi)

Homework #3 • Problems 5, 15, 19, 20, 25
