
Analysis of Variance

Analysis of variance: comparisons among more than two populations. Pairwise hypotheses such as H0: μA = μB, H0: μC = μD, and H0: μB = μC are discussed in Chapter 8; the hypothesis considered here is H0: μA = μB = μC = μD. Covers the basic assumptions and the hypothesis testing logic.

zarek

Presentation Transcript


  1. Analysis of Variance: comparisons among multiple populations

  2. More than two populations. Pairwise hypotheses such as H0: μA = μB, H0: μC = μD, and H0: μB = μC are discussed in Chapter 8. The hypothesis considered here is H0: μA = μB = μC = μD.

  3. Basic assumptions and the hypothesis testing logic
     • The observed data are normally distributed with a common (unknown) variance σ².
     • Derive two estimators of σ².
     • The first is valid whether or not H0: μA = μB = μC = μD is true.
     • The second tends to exceed the true σ² when H0 is not true.
     • Compare the two estimators through the sample ratio (second / first).
     • If the ratio is too large, reject H0.
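The logic of the two estimators can be illustrated with a small simulation. This is only a sketch under assumed settings: the group means, σ = 1, group size n = 10, and the function names are all hypothetical choices made for illustration, not values from the slides.

```python
import random
from statistics import fmean

def two_variance_estimates(groups):
    """Return (within, between) estimates of sigma^2 for equal-size groups."""
    m = len(groups)        # number of groups
    n = len(groups[0])     # observations per group (balanced design assumed)
    group_means = [fmean(g) for g in groups]
    grand_mean = fmean(group_means)
    # First estimator: pooled within-group variance, valid with or without H0.
    within = sum((x - gm) ** 2
                 for g, gm in zip(groups, group_means) for x in g) / (m * (n - 1))
    # Second estimator: based on the spread of the group means, valid only under H0.
    between = n * sum((gm - grand_mean) ** 2 for gm in group_means) / (m - 1)
    return within, between

def simulate(means, sigma=1.0, n=10, reps=2000, seed=0):
    """Average both estimators over many simulated normal samples."""
    rng = random.Random(seed)
    w_sum = b_sum = 0.0
    for _ in range(reps):
        groups = [[rng.gauss(mu, sigma) for _ in range(n)] for mu in means]
        w, b = two_variance_estimates(groups)
        w_sum += w
        b_sum += b
    return w_sum / reps, b_sum / reps

w0, b0 = simulate([5, 5, 5, 5])   # H0 true: both averages should be near sigma^2 = 1
w1, b1 = simulate([5, 6, 7, 8])   # H0 false: the second estimator is inflated
print(w0, b0, w1, b1)
```

When the group means differ, only the second average moves away from σ², which is why a large ratio (second / first) is taken as evidence against H0.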

  4. ANOVA testing
     • ANalysis Of VAriance
     • Test the hypothesis of interest by analyzing the variance σ²
     • Find suitable estimators
     • Decomposition of variance
     • …

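  5. E.g., decomposition of Syy in regression: Syy = SSr + SSm
     [Figure: scatter of Y against X with the fitted regression line and the horizontal line at the sample mean Ȳ, marking three deviations for a point Yi.]
     • (Yi − Ŷi): the smaller this deviation, the closer the regression line is to the actual values and the more accurate the prediction.
     • (Yi − Ȳ): the deviation of Y, which is fixed once the sample is given.
     • (Ŷi − Ȳ): the larger this deviation, the less likely it is that the fitted line Ŷ is horizontal, because for a horizontal line the sum of these squared differences would be very small.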

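  6. Review: χ² distribution
     X1, X2, X3, …, Xn are independent normal random variables with mean μ and variance σ².
     $(X_i-\mu)/\sigma \sim N(0,1)$, so $\sum_{i=1}^{n}(X_i-\mu)^2/\sigma^2 \sim \chi^2_n$.
     $\sqrt{n}\,(\bar{X}-\mu)/\sigma \sim N(0,1)$, and $\sum_{i=1}^{n}(X_i-\bar{X})^2/\sigma^2 \sim \chi^2_{n-1}$.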
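A χ² variable with k degrees of freedom has mean k, so the loss of one degree of freedom when the true mean is replaced by the sample mean can be checked numerically. The sketch below assumes independent normal observations; the values μ = 3, σ = 2, n = 5 and the function name are hypothetical choices for illustration.

```python
import random
from statistics import fmean

def mean_sum_of_squares(n, mu=3.0, sigma=2.0, reps=20000, seed=1):
    """Average the standardized sum of squares, centered at mu and at the sample mean."""
    rng = random.Random(seed)
    known = centered = 0.0
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(xs)
        known += sum(((x - mu) / sigma) ** 2 for x in xs)       # chi-square, n d.f.
        centered += sum(((x - xbar) / sigma) ** 2 for x in xs)  # chi-square, n - 1 d.f.
    return known / reps, centered / reps

avg_known, avg_centered = mean_sum_of_squares(n=5)
print(avg_known, avg_centered)  # should be close to 5 and 4, respectively
```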

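  7. One-way ANOVA approach (i)
     Notation: m = # of groups, n = # of obs. in each group (equal group size), μi = the ith group mean, and $X_{ij} \sim N(\mu_i, \sigma^2)$ for i = 1, …, m and j = 1, …, n.
     $\sum_{i=1}^{m}\sum_{j=1}^{n}(X_{ij}-\mu_i)^2/\sigma^2 \sim \chi^2_{nm}$, with total d.f. = # of groups × obs. in each group.
     Replacing each group mean μi by the group sample mean $X_{i.} = \sum_{j} X_{ij}/n$: $\sum_{i=1}^{m}\sum_{j=1}^{n}(X_{ij}-X_{i.})^2/\sigma^2 \sim \chi^2_{nm-m}$, with total d.f. = # of groups × (obs. in each group − 1).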

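  8. One-way ANOVA approach (ii)
     By definition, $SS_W = \sum_{i=1}^{m}\sum_{j=1}^{n}(X_{ij}-X_{i.})^2$, called the “within samples sum of squares”, and $SS_W/\sigma^2 \sim \chi^2_{nm-m}$.
     ∵ $E[SS_W/\sigma^2] = nm-m$ ∴ $E[SS_W/(nm-m)] = \sigma^2$, whether or not H0 is true.
     Let $X_{i.}$ be the group sample mean and $X_{..}$ the total sample mean. ∵ Var(Xi.) = Var(Xij)/n = σ²/n ∴ if all the Xi. have the same population mean μ (i.e., H0 is assumed, so that μ can later be replaced by the total sample mean X..), then $n\sum_{i=1}^{m}(X_{i.}-\mu)^2/\sigma^2 \sim \chi^2_{m}$.
     Replacing the total mean μ by the total sample mean X..: by definition, $SS_b = n\sum_{i=1}^{m}(X_{i.}-X_{..})^2$, called the “between samples sum of squares”. Under H0, $SS_b/\sigma^2 \sim \chi^2_{m-1}$ (d.f. = # of groups − 1), so $E[SS_b/(m-1)] = \sigma^2$.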

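  9. One-way ANOVA approach (iii)
     Test statistic: $TS = \dfrac{SS_b/(m-1)}{SS_W/(nm-m)}$, which has an F distribution with (m − 1, nm − m) d.f. when H0 is true.
     Reject H0 when $TS > F_{m-1,\,nm-m,\,\alpha}$, i.e., when the numerator is sufficiently large while the denominator is smaller.
     Or reject H0 when p-value $= P(F_{m-1,\,nm-m} > TS) < \alpha$.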
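The test statistic for the balanced case can be computed directly from the between and within sums of squares. The following is a minimal sketch using only the standard library; the function name and the toy data are hypothetical and chosen so that the arithmetic is easy to check by hand.

```python
from statistics import fmean

def one_way_anova_balanced(groups):
    """Compute SSb, SSw, and TS = [SSb/(m-1)] / [SSw/(nm-m)] for equal-size groups."""
    m = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    group_means = [fmean(g) for g in groups]
    grand_mean = fmean(group_means)  # equals the overall mean when sizes are equal
    ss_w = sum((x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g)
    ss_b = n * sum((gm - grand_mean) ** 2 for gm in group_means)
    df_b, df_w = m - 1, n * m - m
    ts = (ss_b / df_b) / (ss_w / df_w)
    return {"SSb": ss_b, "SSw": ss_w, "df_b": df_b, "df_w": df_w, "TS": ts}

toy = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]  # hypothetical data: 3 groups of 3
result = one_way_anova_balanced(toy)
print(result)
```

For this toy data the group means are 2, 3, and 7, giving SSb = 42 and SSw = 6, so TS = 21. The tabulated 5% critical value of F with (2, 6) d.f. is about 5.14, so H0 would be rejected at α = 0.05.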

  10. Decomposition of Var(Xij) (i)
     [Figure: observations Xij from groups A, B, and C, with the A, B, and C group means (e.g., XB.) and the total mean X.. marked. The total deviation Xij − X.. is split into the between deviation Xi. − X.. and the within deviation Xij − Xi.]
     If the group differences are small, the deviation from the center should be caused mainly by within-group randomness.

  11. Decomposition of Var(Xij) (ii)
     Usually, SST is defined as the total sum of squares: $SS_T = \sum_{i=1}^{m}\sum_{j=1}^{n}(X_{ij}-X_{..})^2 = SS_b + SS_W$.
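One way to see why the total sum of squares splits into a between part and a within part is to add and subtract the group sample mean inside the square. The derivation below is a sketch for the balanced case (m groups of size n), writing $X_{i.}$ for the ith group sample mean and $X_{..}$ for the total sample mean:

```latex
\begin{aligned}
SS_T &= \sum_{i=1}^{m}\sum_{j=1}^{n}\bigl[(X_{ij}-X_{i.}) + (X_{i.}-X_{..})\bigr]^2 \\
     &= \underbrace{\sum_{i=1}^{m}\sum_{j=1}^{n}(X_{ij}-X_{i.})^2}_{SS_W}
      + \underbrace{n\sum_{i=1}^{m}(X_{i.}-X_{..})^2}_{SS_b}
      + 2\sum_{i=1}^{m}(X_{i.}-X_{..})\sum_{j=1}^{n}(X_{ij}-X_{i.}) \\
     &= SS_W + SS_b ,
\end{aligned}
```

The cross term vanishes because $\sum_{j=1}^{n}(X_{ij}-X_{i.}) = 0$ for every group i, by the definition of the group sample mean.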

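  12. What if H0 is not true?
     $X_{i.} \sim N(\mu_i, \sigma^2/n)$. Set $\mu_. = \sum_{i}\mu_i/m$ & $Y_i = X_{i.} - \mu_i + \mu_.$, so that $Y_i \sim N(\mu_., \sigma^2/n)$.
     ∵ $X_{i.} = Y_i + \mu_i - \mu_.$ and $X_{..} = Y_.$ ∴ $E\bigl[\sum_i (X_{i.}-X_{..})^2\bigr] = E\bigl[\sum_i (Y_i-Y_.)^2\bigr] + \sum_i(\mu_i-\mu_.)^2 + 2\sum_i(\mu_i-\mu_.)\,E[Y_i-Y_.]$.
     The cross term vanishes because $E[Y_i-Y_.] = E[Y_i]-E[Y_.] = \mu_.-\mu_. = 0$, and $E\bigl[\sum_i (Y_i-Y_.)^2\bigr] = (m-1)\sigma^2/n$.
     Hence, with $SS_b = n\sum_i (X_{i.}-X_{..})^2$, $E[SS_b/(m-1)] = \sigma^2 + n\sum_i(\mu_i-\mu_.)^2/(m-1) \ge \sigma^2$.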

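  13. ANOVA table
     Source of variation | Sum of squares | d.f. | Mean square | F
     Between samples | SSb = nΣi(Xi. − X..)² | m − 1 | MSb = SSb/(m − 1) | TS = MSb/MSw
     Within samples | SSw = ΣiΣj(Xij − Xi.)² | nm − m | MSw = SSw/(nm − m) |
     Total | SST = ΣiΣj(Xij − X..)² | nm − 1 | |
     ∴ SST = SSb + SSw. If p-value < α, reject H0.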

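  14. The meaning of the ANOVA table: SST = SSb + SSw. For a given SST, the larger SSb is, the smaller SSw will be. With the number of groups m unchanged, MSb then becomes larger and MSw smaller, so the F value becomes larger and it is more likely that H0 (no difference among the group means) is rejected. In other words, most of the difference between the observed variable Xij and the total mean X.. is due to differences among the group means Xi.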

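  15. Unbalanced case: unequal sample sizes within the groups
     Different group sizes ni, i = 1, …, m.
     $SS_W/(\sum_i n_i - m)$, with $SS_W = \sum_{i=1}^{m}\sum_{j=1}^{n_i}(X_{ij}-X_{i.})^2$: unconditional estimator of σ² (valid whether or not H0 is true).
     $SS_b/(m-1)$, with $SS_b = \sum_{i=1}^{m} n_i\,(X_{i.}-X_{..})^2$: conditional estimator of σ² (valid only when H0 is true).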

  16. Unbalanced F-test for ANOVA. A balanced design is preferable to an unbalanced one because the test is then less sensitive to slight departures from the assumption of equal population variances.
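For unequal group sizes, the same ratio can be formed with each group weighted by its own size ni and with within d.f. equal to Σni − m. The sketch below assumes that standard form of the unbalanced statistic; the function name and toy data are hypothetical.

```python
from statistics import fmean

def one_way_anova_unbalanced(groups):
    """F statistic for groups of possibly different sizes n_i."""
    m = len(groups)
    total_n = sum(len(g) for g in groups)
    group_means = [fmean(g) for g in groups]
    # With unequal sizes the total mean must be taken over all observations.
    grand_mean = sum(x for g in groups for x in g) / total_n
    ss_w = sum((x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g)
    ss_b = sum(len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means))
    df_b, df_w = m - 1, total_n - m
    ts = (ss_b / df_b) / (ss_w / df_w)
    return {"SSb": ss_b, "SSw": ss_w, "df_b": df_b, "df_w": df_w, "TS": ts}

toy_unbalanced = [[1, 2, 3], [4, 6], [8, 9, 10, 13]]  # hypothetical data, sizes 3, 2, 4
out = one_way_anova_unbalanced(toy_unbalanced)
print(out)
```

Here SSw = 18 with 6 d.f., and SSb can be cross-checked against the total sum of squares, since SST = SSb + SSw still holds in the unbalanced case.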

  17. Two classification factors: a row factor and a column factor.

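  18. Two-way ANOVA approach (i)
     Only one observation within each cell: Xij, where the row factor has m types (i = 1, …, m) and the column factor has n types (j = 1, …, n).
     Review of the one-way model: $E[X_{ij}] = \mu_i = \mu + \alpha_i$, where $\alpha_i = \mu_i - \mu$ is the deviation from the total mean μ, and $\sum_i \alpha_i = 0$ (∵ $\mu = \sum_i \mu_i/m$).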

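  19. Two-way ANOVA approach (ii)
     Suppose an additive model for the cell mean, composed of ai and bj: $\mu_{ij} = E[X_{ij}] = a_i + b_j$ (the cell mean, for a cell of size k or other).
     The ith row mean: $\mu_{i.} = \sum_j \mu_{ij}/n = a_i + b_.$, i.e., the ith row mean = the average column factor + the specific ith row factor.
     The jth column mean: $\mu_{.j} = \sum_i \mu_{ij}/m = a_. + b_j$.
     The total mean: $\mu_{..} = a_. + b_.$, where $a_. = \sum_i a_i/m$ is the average row factor and $b_. = \sum_j b_j/n$ is the average column factor.
     Deviations from the average row factor and column factor: $\alpha_i = a_i - a_. = \mu_{i.} - \mu_{..}$ & $\beta_j = b_j - b_. = \mu_{.j} - \mu_{..}$.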

  20. Two-way ANOVA approach (iii)

  21. Two-way ANOVA approach (iv)
     I.e., the expected value of a specific cell (i, j) can be decomposed into: total mean + the ith deviation from the average row factor (the ith row's deviation from the total mean) + the jth deviation from the average column factor (the jth column's deviation from the total mean).
     The assumed two-way ANOVA model: $E[X_{ij}] = \mu + \alpha_i + \beta_j$, with $\sum_i \alpha_i = 0$ & $\sum_j \beta_j = 0$.
     Use the unbiased estimators of μ, αi, and βj to test the hypotheses of interest.

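  22. Two-way ANOVA approach (v)
     $\sum_{i=1}^{m}\sum_{j=1}^{n}(X_{ij}-\mu-\alpha_i-\beta_j)^2/\sigma^2 \sim \chi^2_{nm}$.
     Apply each unbiased estimator: $\hat\beta_j = X_{.j}-X_{..}$ (reduces n − 1 d.f.), $\hat\alpha_i = X_{i.}-X_{..}$ (reduces m − 1 d.f.), $\hat\mu = X_{..}$ (reduces 1 d.f.).
     Then $\sum_{i=1}^{m}\sum_{j=1}^{n}(X_{ij}-X_{i.}-X_{.j}+X_{..})^2/\sigma^2$ is χ² with nm − 1 − (m − 1) − (n − 1) = (n − 1)(m − 1) d.f.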

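  23. Two-way ANOVA approach (vi)
     If H0: all αi = 0 is true, then $X_{i.} \sim N(\mu, \sigma^2/n)$, so $n\sum_{i=1}^{m}(X_{i.}-\mu)^2/\sigma^2 \sim \chi^2_{m}$ and, replacing μ by X.., $n\sum_{i=1}^{m}(X_{i.}-X_{..})^2/\sigma^2 \sim \chi^2_{m-1}$.
     Define the row sum of squares $SS_r = n\sum_{i=1}^{m}(X_{i.}-X_{..})^2$.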

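  24. Two-way ANOVA table
     Source | Sum of squares | d.f. | F
     Row | SSr = nΣi(Xi. − X..)² | m − 1 | [SSr/(m − 1)] / [SSe/((n − 1)(m − 1))]
     Column | SSc = mΣj(X.j − X..)² | n − 1 | [SSc/(n − 1)] / [SSe/((n − 1)(m − 1))]
     Error | SSe = ΣiΣj(Xij − Xi. − X.j + X..)² | (n − 1)(m − 1) |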
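The entries of a two-way table with one observation per cell can be computed as follows. This is a sketch assuming the usual row, column, and error sums of squares for the additive model without interaction; the function name, dictionary keys, and toy data are hypothetical.

```python
from statistics import fmean

def two_way_anova_no_replication(x):
    """Row, column, and error sums of squares for an m x n table, one obs. per cell."""
    m, n = len(x), len(x[0])
    row_means = [fmean(row) for row in x]
    col_means = [fmean([x[i][j] for i in range(m)]) for j in range(n)]
    grand = fmean(row_means)  # all rows have n entries, so this is the total mean
    ss_r = n * sum((rm - grand) ** 2 for rm in row_means)
    ss_c = m * sum((cm - grand) ** 2 for cm in col_means)
    ss_e = sum((x[i][j] - row_means[i] - col_means[j] + grand) ** 2
               for i in range(m) for j in range(n))
    df_e = (m - 1) * (n - 1)
    return {
        "SSr": ss_r, "SSc": ss_c, "SSe": ss_e,
        "F_row": (ss_r / (m - 1)) / (ss_e / df_e),  # tests: no row effect
        "F_col": (ss_c / (n - 1)) / (ss_e / df_e),  # tests: no column effect
    }

table = [[1, 2, 4], [3, 5, 6], [5, 8, 11]]  # hypothetical 3 x 3 data
res = two_way_anova_no_replication(table)
print(res)
```

Each F value would be compared with the corresponding F critical value, with (m − 1, (n − 1)(m − 1)) d.f. for rows and (n − 1, (n − 1)(m − 1)) d.f. for columns. Note that a perfectly additive table gives SSe = 0, in which case the ratios are undefined.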

  25. Two-way ANOVA with interaction (i)

  26. Two-way ANOVA with interaction (ii)

  27. Two-way ANOVA with interaction (iii)

  28. Two-way ANOVA with interaction (iv)

  29. Two-way ANOVA with interaction (v)

  30. Two-way ANOVA with interaction (vi)

  31. Homework #3 • Problems 5, 15, 19, 20, 25
