
Analysis of Variance

Learn about analysis of variance (ANOVA), a method for comparing the levels (means) of several independent random samples and for testing hypotheses about them. This statistical method is frequently used in the evaluation of biological experiments.

marvaj

Presentation Transcript


  1. Analysis of Variance. doc. Ing. Zlata Sojková, CSc.

  2. In practice it is often necessary to compare several independent random samples in terms of their level. For $m > 2$ samples we are interested in the hypothesis $H_0: \mu_1 = \mu_2 = \dots = \mu_m = \mu$ against $H_1: \mu_i \neq \mu$ for at least one $i$ ($i = 1, 2, \dots, m$), where $\mu_i$, $i = 1, 2, \dots, m$, are the mean values of normally distributed populations with equal variance $\sigma^2$, i.e. $N(\mu_i, \sigma^2)$.
     • This hypothesis is verified with an important statistical method called analysis of variance, abbreviated ANOVA (or AV).

  3. In practice, AV is used to examine the impact of one or more factors (treatments) on a statistical attribute.
     • Factors are labeled A, B, …; in AV they are regarded as qualitative attributes with different variations, the levels of the factor.
     • The result is a quantitative statistical attribute, denoted Y.
     • AV is frequently used in the evaluation of biological experiments.
     • The simplest case is AV with a single factor, called one-factor analysis of variance.

  4. The level of a factor refers to:
     • a certain amount of a quantitative factor, e.g. the amount of pure nutrients in manure, or different income groups of households;
     • a certain kind of a qualitative factor, e.g. different varieties of the same crop, or methods of placing products in stores.
     AV is a generalization of Student's t-test for independent samples. AV also examines the impact of qualitative factors on a resulting quantitative attribute, so it analyzes the relationships between attributes.

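  5. Scheme of a single-factor experiment ("balanced design"). Rows are the levels of factor A, columns are the repetitions.

     | Level of A | 1 | 2 | … | j | … | n | Row sum $Y_{i.}$ | Row average $\bar{y}_{i.}$ |
     |---|---|---|---|---|---|---|---|---|
     | 1 | $y_{11}$ | $y_{12}$ | … | $y_{1j}$ | … | $y_{1n}$ | $Y_{1.}$ | $\bar{y}_{1.}$ |
     | 2 | $y_{21}$ | $y_{22}$ | … | $y_{2j}$ | … | $y_{2n}$ | $Y_{2.}$ | $\bar{y}_{2.}$ |
     | … | … | … | … | … | … | … | … | … |
     | i | $y_{i1}$ | $y_{i2}$ | … | $y_{ij}$ | … | $y_{in}$ | $Y_{i.}$ | $\bar{y}_{i.}$ |
     | … | … | … | … | … | … | … | … | … |
     | m | $y_{m1}$ | $y_{m2}$ | … | $y_{mj}$ | … | $y_{mn}$ | $Y_{m.}$ | $\bar{y}_{m.}$ |
     | | | | | | | | Total sum $Y_{..}$ | Overall average $\bar{y}_{..}$ |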

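  6. Row sum: $Y_{i.} = \sum_{j=1}^{n} y_{ij}$. Total sum: $Y_{..} = \sum_{i=1}^{m} \sum_{j=1}^{n} y_{ij}$. Row average: $\bar{y}_{i.} = Y_{i.} / n$. Overall average: $\bar{y}_{..} = Y_{..} / (m n)$.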
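
As a small illustration of the row sum, total sum, row average and overall average, the following Python sketch computes them for a balanced layout. The data values and variable names are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical balanced layout: m = 3 factor levels (rows), n = 4 repetitions (columns).
y = [
    [20.0, 22.0, 19.0, 23.0],
    [25.0, 27.0, 26.0, 26.0],
    [18.0, 17.0, 20.0, 21.0],
]

m = len(y)       # number of levels of factor A
n = len(y[0])    # number of repetitions per level (balanced design)

row_sums = [sum(row) for row in y]        # row sums
row_means = [s / n for s in row_sums]     # row averages
total_sum = sum(row_sums)                 # total sum
overall_mean = total_sum / (m * n)        # overall average

print("row sums:", row_sums)
print("row averages:", row_means)
print("total sum:", total_sum, "overall average:", overall_mean)
```

With these made-up numbers the row averages are 21, 26 and 19, and the overall average is 22.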

  7. Model for the resulting observed value: $y_{ij} = \mu + \alpha_i + e_{ij}$, where $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$.
     • $\mu$: expected value common to all levels of the factor and all observed values
     • $\alpha_i$: impact of the $i$-th level of factor A
     • $e_{ij}$: random error; every measurement is affected by error, i.e. by the impact of random factors

  8. Equivalently, $y_{ij} = \mu_i + e_{ij}$ with $\mu_i = \mu + \alpha_i$. Then we can formulate the null hypothesis $H_0: \alpha_1 = \alpha_2 = \dots = \alpha_m = 0$ (the effects of all levels of factor A are zero, i.e. insignificant) against the alternative hypothesis $H_1: \alpha_i \neq 0$ for at least one $i$ ($i = 1, 2, \dots, m$) (the effect $\alpha_i$ of at least one level of the factor is significant, i.e. significantly different from zero).

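  9. Estimates of the parameters are the sample characteristics: $\hat{\mu} = \bar{y}_{..}$, $\hat{\alpha}_i = \bar{y}_{i.} - \bar{y}_{..}$, $\hat{e}_{ij} = y_{ij} - \bar{y}_{i.}$. The model can then be rewritten as: $y_{ij} = \bar{y}_{..} + (\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})$.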

  10. [Figure: comparison of two experiments with three levels of the factor (1, 2, 3).]

  11. Principle of ANOVA. The essence of the analysis of variance lies in the decomposition of the total variability of the investigated attribute: $S_c = S_1 + S_r$.
     • $S_c$: total variability
     • $S_1$: variability between the levels of the factor, caused by the action of factor A ("variability between groups")
     • $S_r$: random, residual variability ("variability within groups")
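
The transcript names the three components $S_c$, $S_1$ and $S_r$ without their formulas. Assuming the standard sum-of-squares definitions for the balanced design with $m$ levels and $n$ repetitions (an assumption, since the formulas did not survive in the transcript), the decomposition can be written out as:

```latex
\begin{aligned}
S_c &= \sum_{i=1}^{m}\sum_{j=1}^{n} \left(y_{ij} - \bar{y}_{..}\right)^2, \\
S_1 &= n \sum_{i=1}^{m} \left(\bar{y}_{i.} - \bar{y}_{..}\right)^2, \\
S_r &= \sum_{i=1}^{m}\sum_{j=1}^{n} \left(y_{ij} - \bar{y}_{i.}\right)^2, \\
S_c &= S_1 + S_r .
\end{aligned}
```

The last line follows by writing $y_{ij} - \bar{y}_{..} = (\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})$ and squaring: the cross term vanishes because the deviations from the row average sum to zero within each row.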

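  12. ANOVA table for the balanced one-factor design:

     | Variability | (1) Sum of squares (SS) | (2) Degrees of freedom | (3) Mean square (MS) = (1)/(2) | (4) F critical |
     |---|---|---|---|---|
     | Variability between groups | $S_1$ | $m - 1$ | $s_1^2$ | |
     | Variability within groups | $S_r$ | $m n - m$ | $s_r^2$ | |
     | Total variability | $S_c$ | $N - 1 = m n - 1$ | | |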

  13. The test statistic for one-factor ANOVA can be written as $F = s_1^2 / s_r^2$. The F value is compared with the appropriate table value of the F-distribution, $F_\alpha$, with $(m - 1)$ and $(m n - m)$ degrees of freedom.

  14. Decision about the test result:
     • If $F_{calc} > F_\alpha(m - 1, N - m)$, we reject $H_0$. In that case the effect of at least one level of the factor is significant, so the average level of the indicator at that level differs significantly from the others; at least one effect $\alpha_i$ is statistically significantly different from zero.
     • If $F_{calc} \le F_\alpha$, we do not reject $H_0$.
     [Figure: F-distribution with the acceptance region and the rejection region of $H_0$, separated at $F_\alpha$.]
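
The test statistic and the decision rule can be sketched in plain Python for a balanced design. This is a minimal illustration, not the author's procedure: the data and the function name `balanced_one_way_anova` are hypothetical, the sums of squares use the standard definitions, and the critical value 4.26 is the commonly tabulated $F_{0.05}$ for 2 and 9 degrees of freedom, entered by hand as one would read it from an F table.

```python
def balanced_one_way_anova(y):
    """Return (F, df_between, df_within) for m rows (levels) of n repetitions each."""
    m, n = len(y), len(y[0])
    overall_mean = sum(sum(row) for row in y) / (m * n)
    row_means = [sum(row) / n for row in y]

    # Sum of squares between groups and within groups (standard definitions).
    s1 = n * sum((rm - overall_mean) ** 2 for rm in row_means)
    sr = sum((v - rm) ** 2 for row, rm in zip(y, row_means) for v in row)

    df_between, df_within = m - 1, m * n - m
    ms_between = s1 / df_between   # mean square between groups
    ms_within = sr / df_within     # mean square within groups
    return ms_between / ms_within, df_between, df_within


# Hypothetical data: 3 levels of the factor, 4 repetitions each.
y = [
    [20.0, 22.0, 19.0, 23.0],
    [25.0, 27.0, 26.0, 26.0],
    [18.0, 17.0, 20.0, 21.0],
]

f_calc, df1, df2 = balanced_one_way_anova(y)
f_crit = 4.26  # table value for alpha = 0.05 with (2, 9) degrees of freedom
reject_h0 = f_calc > f_crit
print(f"F = {f_calc:.2f} with ({df1}, {df2}) df, reject H0: {reject_h0}")
```

Keeping the critical value as an explicit input mirrors the slide, where the computed F is compared against a table value.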

  15. If the null hypothesis is rejected:
     • We have found only that the effect of the factor on the examined attribute is significant.
     • It is also necessary to identify which levels of the factor differ significantly; tests of contrasts are used for this purpose.
     • Tests of contrasts: Duncan test, Scheffé test, Tukey test and others.

  16. Conditions for using AV:
     • The samples come from normally distributed populations; a moderate violation of this assumption does not have a significant effect on the results of AV.
     • Statistical independence of the random errors $e_{ij}$.
     • Identical residual variances $\sigma_1^2 = \sigma_2^2 = \dots = \sigma^2$, i.e. $D(e_{ij}) = \sigma^2$ for all $i = 1, 2, \dots, m$, $j = 1, 2, \dots, n$; this assumption is more serious and can be verified by the Cochran or Bartlett test.

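  17. Scheme of a single-factor experiment ("unbalanced design"), with a different number of repetitions $n_i$ at each level of factor A:

     | Level of A | 1 | 2 | … | j | … | Number of repetitions | Row sum $Y_{i.}$ | Row average $\bar{y}_{i.}$ |
     |---|---|---|---|---|---|---|---|---|
     | 1 | $y_{11}$ | $y_{12}$ | … | $y_{1j}$ | … | $n_1$ | $Y_{1.}$ | $\bar{y}_{1.}$ |
     | 2 | $y_{21}$ | $y_{22}$ | … | $y_{2j}$ | … | $n_2$ | $Y_{2.}$ | $\bar{y}_{2.}$ |
     | … | … | … | … | … | … | … | … | … |
     | i | $y_{i1}$ | $y_{i2}$ | … | $y_{ij}$ | … | $n_i$ | $Y_{i.}$ | $\bar{y}_{i.}$ |
     | … | … | … | … | … | … | … | … | … |
     | m | $y_{m1}$ | $y_{m2}$ | … | $y_{mj}$ | … | $n_m$ | $Y_{m.}$ | $\bar{y}_{m.}$ |
     | | | | | | | | Total sum $Y_{..}$ | Overall average $\bar{y}_{..}$ |

     where $N = \sum_{i=1}^{m} n_i$ is the total number of observations.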

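  18. ANOVA table for the unbalanced one-factor design:

     | Variability | (1) Sum of squares (SS) | (2) Degrees of freedom | (3) Mean square (MS) = (1)/(2) | (4) F critical |
     |---|---|---|---|---|
     | Variability between groups | $S_1$ | $m - 1$ | $s_1^2$ | |
     | Variability within groups | $S_r$ | $N - m$ | $s_r^2$ | |
     | Total variability | $S_c$ | $N - 1$ | | |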
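
For the unbalanced design the only change is that each level contributes with its own number of repetitions $n_i$ and the within-group degrees of freedom become $N - m$. The sketch below assumes the standard weighted sums of squares; the function name `unbalanced_one_way_anova` and the data are hypothetical.

```python
def unbalanced_one_way_anova(groups):
    """One-factor ANOVA table entries for groups with different numbers of repetitions."""
    m = len(groups)
    big_n = sum(len(g) for g in groups)                  # N = sum of n_i
    overall_mean = sum(sum(g) for g in groups) / big_n
    means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares weights each level by its n_i.
    s1 = sum(len(g) * (mu - overall_mean) ** 2 for g, mu in zip(groups, means))
    sr = sum((v - mu) ** 2 for g, mu in zip(groups, means) for v in g)

    df1, dfr = m - 1, big_n - m
    return {
        "S1": s1, "Sr": sr, "Sc": s1 + sr,
        "df1": df1, "dfr": dfr,
        "F": (s1 / df1) / (sr / dfr),
    }


# Hypothetical data: three levels with 2, 3 and 4 repetitions.
groups = [[2.0, 4.0], [5.0, 7.0, 6.0], [11.0, 13.0, 12.0, 12.0]]
table = unbalanced_one_way_anova(groups)
print(table)
```

The resulting F is then compared with the table value for $(m - 1)$ and $(N - m)$ degrees of freedom, exactly as in the balanced case.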

  19. Two-factor analysis of variance with one observation in each subclass (TAV):
     • Consider the effect of factor A, which we investigate at $m$ levels, $i = 1, 2, \dots, m$.
     • Then consider the effect of factor B, which is observed at $n$ levels, $j = 1, 2, \dots, n$.
     • For every $i$-th level of factor A and $j$-th level of factor B we have only one observation (repetition) $y_{ij}$.
     • ⇒ We are verifying two null hypotheses.

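  20. Scheme for a two-factor experiment with one observation in each subclass (TAV). Rows are the $m$ levels of factor A, columns are the $n$ levels of factor B.

     | A \ B | 1 | 2 | … | j | … | n | Row sum $Y_{i.}$ | Row average $\bar{y}_{i.}$ |
     |---|---|---|---|---|---|---|---|---|
     | 1 | $y_{11}$ | $y_{12}$ | … | $y_{1j}$ | … | $y_{1n}$ | $Y_{1.}$ | $\bar{y}_{1.}$ |
     | 2 | $y_{21}$ | $y_{22}$ | … | $y_{2j}$ | … | $y_{2n}$ | $Y_{2.}$ | $\bar{y}_{2.}$ |
     | … | … | … | … | … | … | … | … | … |
     | i | $y_{i1}$ | $y_{i2}$ | … | $y_{ij}$ | … | $y_{in}$ | $Y_{i.}$ | $\bar{y}_{i.}$ |
     | … | … | … | … | … | … | … | … | … |
     | m | $y_{m1}$ | $y_{m2}$ | … | $y_{mj}$ | … | $y_{mn}$ | $Y_{m.}$ | $\bar{y}_{m.}$ |
     | Column sum | $Y_{.1}$ | $Y_{.2}$ | … | $Y_{.j}$ | … | $Y_{.n}$ | $Y_{..}$ | |
     | Column average | $\bar{y}_{.1}$ | $\bar{y}_{.2}$ | … | $\bar{y}_{.j}$ | … | $\bar{y}_{.n}$ | | Overall average $\bar{y}_{..}$ |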

  21. We can write the model for the examined attribute as follows: $y_{ij} = \mu + \alpha_i + \beta_j + e_{ij}$. We are verifying the validity of two null hypotheses. Hypothesis for factor A: $H_{01}: \alpha_1 = \alpha_2 = \dots = \alpha_m = 0$, i.e. all effects of the levels of factor A are equal to zero, thus insignificant, against the alternative hypothesis $H_{11}: \alpha_i \neq 0$ for at least one $i$ ($i = 1, 2, \dots, m$), i.e. the effect $\alpha_i$ of at least one level of factor A is significant, significantly different from zero.

  22. Hypothesis for factor B: $H_{02}: \beta_1 = \beta_2 = \dots = \beta_n = 0$, i.e. all effects of the levels of factor B are equal to zero, thus insignificant, against the alternative hypothesis $H_{12}: \beta_j \neq 0$ for at least one $j$ ($j = 1, 2, \dots, n$), i.e. the effect $\beta_j$ of at least one level of factor B is significant, significantly different from zero.

  23. ANOVA table for TAV:

     | Variability | (1) Sum of squares (SS) | (2) Degrees of freedom | (3) Mean square (MS) = (1)/(2) | (4) F critical |
     |---|---|---|---|---|
     | Variability between rows | $S_1$ | $m - 1$ | $s_1^2$ | |
     | Variability between columns | $S_2$ | $n - 1$ | $s_2^2$ | |
     | Residual variability | $S_r$ | $(m - 1)(n - 1)$ | $s_r^2$ | |
     | Total variability | $S_c$ | $m n - 1$ | | |

  24. Decomposition of the total variability: $S_c = S_1 + S_2 + S_r$
     • $S_c$: total variability
     • $S_1$: variability between rows, effect of factor A
     • $S_2$: variability between columns, effect of factor B
     • $S_r$: residual variability
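
The two-factor decomposition can be sketched in plain Python as follows. The sums of squares and the two F ratios (each factor's mean square divided by the residual mean square) are the standard ones and are assumed here, because the transcript does not preserve the formulas; the function name `two_factor_anova` and the data are hypothetical.

```python
def two_factor_anova(y):
    """Two-factor ANOVA with one observation per cell (rows: factor A, columns: factor B)."""
    m, n = len(y), len(y[0])
    overall_mean = sum(sum(row) for row in y) / (m * n)
    row_means = [sum(row) / n for row in y]
    col_means = [sum(col) / m for col in zip(*y)]

    s1 = n * sum((r - overall_mean) ** 2 for r in row_means)   # between rows (factor A)
    s2 = m * sum((c - overall_mean) ** 2 for c in col_means)   # between columns (factor B)
    sc = sum((v - overall_mean) ** 2 for row in y for v in row)  # total
    sr = sc - s1 - s2                                          # residual

    ms_r = sr / ((m - 1) * (n - 1))
    return {
        "S1": s1, "S2": s2, "Sr": sr, "Sc": sc,
        "F_A": (s1 / (m - 1)) / ms_r,
        "F_B": (s2 / (n - 1)) / ms_r,
    }


# Hypothetical 3 x 3 layout with one observation in each subclass.
y = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 6.0, 9.0],
]
result = two_factor_anova(y)
print(result)
```

Each F ratio is compared with its own table value: $(m - 1)$ and $(m - 1)(n - 1)$ degrees of freedom for factor A, $(n - 1)$ and $(m - 1)(n - 1)$ for factor B.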

  25. Investigating the relationships between statistical attributes:
     • Investigating the relationship between qualitative attributes, e.g. A and B, is called measurement of association.
     • Investigating the relationship between quantitative attributes is done by regression and correlation analysis.

  26. Investigating the association:
     • It is based on association tables (pivot tables).
     • To test for the existence of a significant relationship between qualitative attributes we use the $\chi^2$ test of independence. $H_0$: the two attributes A and B are independent. $H_1$: attributes A and B are dependent.
     • Attribute A has $m$ levels (variations); attribute B has $k$ levels (variations).

  27. Hypotheses formulation:
     • Dependence of the attributes shows up as differences in the frequencies.
     • E.g. we examine whether the choice of package size is affected by the size of the family.
     • $H_0$: the choice of package size does not depend on the number of family members.
     • $H_1$: the choice of package size is affected by the size of the family.
     • The procedure lies in comparing the empirical frequencies with the theoretical ones (what the frequencies would be if attributes A and B were independent).

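  28. Marginal frequencies $(a_i)$ and $(b_j)$; simultaneous frequencies (frequencies of the second order) $(a_i b_j)$. Rows give the package size, columns the size of the family.

     | Package size | 1-2 ($b_1$) | 3-4 ($b_2$) | 5 and more ($b_3$) | Total |
     |---|---|---|---|---|
     | up to 100 g ($a_1$) | 25 ($a_1 b_1$) | 37 ($a_1 b_2$) | 8 | 70 |
     | 100-150 g ($a_2$) | 10 | 62 | 53 | 125 |
     | 250 g and more ($a_3$) | 5 | 41 | 59 ($a_3 b_3$) | 105 |
     | Total | 40 | 140 | 120 | 300 |

     The total count of respondents is $n = 300$.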

  29. Determination of the theoretical frequencies. It is based on the theorem on the independence of random events A and B: $P(A \cap B) = P(A) \cdot P(B)$. Thus, if attributes A and B are independent, then $P(a_i b_j) = P(a_i) \cdot P(b_j)$. Estimating the probabilities by relative frequencies gives $\frac{(a_i b_j)_o}{n} = \frac{(a_i)}{n} \cdot \frac{(b_j)}{n} \Rightarrow (a_i b_j)_o = \frac{(a_i) \cdot (b_j)}{n}$, the theoretical frequencies.

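  30. Calculation of the theoretical frequencies, e.g. $(a_1 b_1)_o = 70 \cdot 40 / 300 = 9.33$. Each cell shows the observed frequency with the theoretical frequency in parentheses; rows give the package size, columns the family size.

     | Package size | 1-2 ($b_1$) | 3-4 ($b_2$) | 5 and more ($b_3$) | Total |
     |---|---|---|---|---|
     | up to 100 g ($a_1$) | 25 (9.33) | 37 (32.67) | 8 (28.00) | 70 |
     | 100-150 g ($a_2$) | 10 (16.67) | 62 (58.33) | 53 (50.00) | 125 |
     | 250 g and more ($a_3$) | 5 (14.00) | 41 (49.00) | 59 (42.00) | 105 |
     | Total | 40 | 140 | 120 | 300 |

     The total count of respondents is $n = 300$.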
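
The theoretical frequencies can be reproduced with a few lines of Python: each one is the row total times the column total divided by the number of respondents. The observed counts are those of the package size and family size table; the variable names are illustrative.

```python
# Observed frequencies: rows are package sizes (a1..a3), columns are family sizes (b1..b3).
observed = [
    [25, 37, 8],
    [10, 62, 53],
    [5, 41, 59],
]

row_totals = [sum(row) for row in observed]          # marginal frequencies (a_i)
col_totals = [sum(col) for col in zip(*observed)]    # marginal frequencies (b_j)
n = sum(row_totals)                                  # total count of respondents

# Theoretical frequency under independence: (a_i) * (b_j) / n
expected = [[a * b / n for b in col_totals] for a in row_totals]

for row in expected:
    print([round(v, 2) for v in row])
```

The first printed row starts with 9.33, matching the worked value $70 \cdot 40 / 300$.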

  31. Calculation of the test criterion and decision: if $\chi^2_{calculated} > \chi^2_\alpha$ for significance level $\alpha$ and $(m - 1)(k - 1)$ degrees of freedom, $H_0$ is rejected ⇒ attributes A and B are dependent. In our case this means that the number of family members significantly affects the choice of package size. Further, we should measure the strength of the dependence.
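
The decision for the package size example can be checked numerically. The sketch below assumes the usual Pearson statistic, the sum over all cells of (observed minus theoretical) squared divided by theoretical, which the transcript does not spell out. The critical value 9.488 is the commonly tabulated $\chi^2_{0.05}$ for 4 degrees of freedom, entered by hand; the variable names are illustrative.

```python
# Observed frequencies from the package size (rows) by family size (columns) table.
observed = [
    [25, 37, 8],
    [10, 62, 53],
    [5, 41, 59],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square statistic (assumed form): sum of (O - E)^2 / E over all cells.
chi2 = 0.0
for i, a in enumerate(row_totals):
    for j, b in enumerate(col_totals):
        e = a * b / n                      # theoretical frequency of cell (i, j)
        chi2 += (observed[i][j] - e) ** 2 / e

df = (len(row_totals) - 1) * (len(col_totals) - 1)   # (m - 1)(k - 1)
chi2_crit = 9.488   # table value for alpha = 0.05 and 4 degrees of freedom
reject_h0 = chi2 > chi2_crit

print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
```

The computed statistic is far above the table value, which agrees with the conclusion that family size affects the choice of package size.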
