ANOVA Siti Nor Jannah bt Ahmad Siti Shahida bt Kamel Zamriyah bt Abu Samah
ANOVA : Definition • A statistical method for making simultaneous comparisons between two or more means. • ANOVA is a general technique that can be used to test the hypothesis that the means among two or more groups are equal, under the assumption that the sampled populations are normally distributed. • Analysis of variance can be used to test differences among several means for significance without increasing the Type I error rate.
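To illustrate why several pairwise tests would inflate the Type I error rate, here is a minimal sketch. It assumes the pairwise tests are independent, which is a simplification the slides do not make, and the function name `familywise_error` is illustrative only.

```python
from math import comb


def familywise_error(k: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across all pairwise tests of k groups.

    Assumption: the comb(k, 2) pairwise tests are independent, which is
    only an approximation for real data.
    """
    n_tests = comb(k, 2)
    return 1 - (1 - alpha) ** n_tests


# Number of groups, number of pairwise tests, familywise error rate.
for k in (2, 3, 4, 5):
    print(k, comb(k, 2), round(familywise_error(k), 3))
```

Under that assumption, three groups already push the chance of a false positive well above the nominal 5%, which is the motivation for a single overall F test.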
Different types of ANOVA • To begin, let us consider the effect of temperature on a passive component such as a resistor. • We select three different temperatures and observe their effect on the resistors. • This experiment can be conducted by measuring all the participating resistors before placing n resistors each in three different ovens. • Each oven is heated to a selected temperature. Then we measure the resistors again after, say, 24 hours and analyze the responses, which are the differences between before and after being subjected to the temperatures. • The temperature is called a factor. • The different temperature settings are called levels. In this example there are three levels or settings of the factor Temperature.
Different types of ANOVA
What is a factor? A factor is an independent treatment variable whose settings (values) are controlled and varied by the experimenter. The intensity setting of a factor is the level. Levels may be quantitative numbers or, in many cases, simply "present" or "not present" ("0" or "1").
The 1-way ANOVA: In the experiment, there is only one factor, temperature, and the analysis of variance that we will be using to analyze the effect of temperature is called a one-way or one-factor ANOVA.
The 2-way or 3-way ANOVA: We could have opted to also study the effect of positions in the oven. In this case there would be two factors, temperature and oven position. Here we speak of a two-way or two-factor ANOVA. Furthermore, we may be interested in a third factor, the effect of time. Now we deal with a three-way or three-factor ANOVA.
When You Use The ANOVA • You may use ANOVA whenever you have 2 or more independent groups • You must use ANOVA whenever you have 3 or more independent groups.
When You Use The ANOVA One-way ANOVA • 1 factor, e.g. smoking status (never, former, current) Two-way ANOVA • 2 factors, e.g. gender and smoking status Three-way ANOVA • 3 factors, e.g. gender, smoking, and beer consumption
How to interpret ANOVA results The P value answers this question: If all the populations really have the same mean (the treatments are ineffective), what is the chance that random sampling would result in means as far apart (or more so) as observed in this experiment? • If the overall P value is large, the data do not give you any reason to conclude that the means differ. Even if the population means were equal, you would not be surprised to find sample means this far apart just by chance. You just don't have compelling evidence that they differ.
How to interpret ANOVA results • If the overall P value is small, then it is unlikely that the differences you observed are due to random sampling. You can reject the idea that all the populations have identical means. • This doesn't mean that every mean differs from every other mean, only that at least one differs from the rest.
How to interpret ANOVA results • F(2,27) = 8.80, p < .05 • F = test statistic • 2,27 = degrees of freedom • 2 = df between groups • 27 = df within groups • 8.80 = obtained value of F • p < .05 = if the null hypothesis is true, the probability of obtaining an F this large or larger is less than 5% (it is not the probability that the null hypothesis is true) • Reject the null hypothesis • Some of the group means differ significantly from each other.
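The reported result can be checked with a short sketch. It relies on a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2, as they do here; the helper name `f_upper_tail_df1_2` is hypothetical.

```python
def f_upper_tail_df1_2(f_value: float, df_within: float) -> float:
    """P(F >= f_value) for an F distribution with 2 numerator df.

    Valid only for numerator df = 2, where the tail probability
    reduces to (1 + 2F/df_within) ** (-df_within / 2).
    """
    return (1 + 2 * f_value / df_within) ** (-df_within / 2)


p_value = f_upper_tail_df1_2(8.80, 27)
print(f"F(2,27) = 8.80, p = {p_value:.4f}")
print("reject H0 at the 5% level" if p_value < 0.05 else "do not reject H0")
```

For other numerator degrees of freedom a statistics library or an F table would be needed.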
One Way Analysis of Variance • Example • An apple juice manufacturer is planning to develop a new product, a liquid concentrate. • The marketing manager has to decide how to market the new product. • Three strategies are considered: • Emphasize convenience of using the product. • Emphasize the quality of the product. • Emphasize the product's low price.
One Way Analysis of Variance • Example continued • An experiment was conducted as follows: • In three cities an advertisement campaign was launched. • In each city only one of the three characteristics (convenience, quality, and price) was emphasized. • The weekly sales were recorded for twenty weeks following the beginning of the campaigns.
One Way Analysis of Variance [Table: weekly sales in each of the three cities, one column per city; values not preserved]
Terminology • In the context of this problem… • Response variable – weekly sales • Responses – actual sale values • Experimental unit – weeks in the three cities when we record sales figures • Factor – the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy. • Factor levels – the population (treatment) names. In this problem the factor levels are the marketing strategies.
One Way Analysis of Variance • Solution • The data are interval • The problem objective is to compare sales in three cities. • We hypothesize that the three population means are equal
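Defining the Hypotheses • Solution • $H_0: \mu_1 = \mu_2 = \mu_3$ • $H_1$: At least two means differ • To build the statistic needed to test the hypotheses, use the following notation: $k$ is the number of treatments, $n_j$ is the size of sample $j$, $\bar{x}_j$ is the mean of sample $j$, and $\bar{\bar{x}}$ is the grand mean of all observations.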
The rationale behind the test statistic – I • If the null hypothesis is true, we would expect all the sample means to be close to one another (and as a result, close to the grand mean). • If the alternative hypothesis is true, at least some of the sample means would differ. • Thus, we measure variability between sample means.
Variability between sample means • The variability between the sample means is measured as the sum of squared distances between each mean and the grand mean. • This sum is called the • Sum of Squares for Treatments • SST In our example treatments are represented by the different advertising strategies.
Sum of squares for treatments (SST) $SST = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2$, where there are $k$ treatments, $\bar{x}_j$ is the mean of sample $j$, $n_j$ is the size of sample $j$, and $\bar{\bar{x}}$ is the grand mean. Note: When the sample means are close to one another, their distance from the grand mean is small, leading to a small SST. Thus, a large SST indicates large variation between sample means, which supports $H_1$.
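Sum of squares for treatments (SST) • Solution – continued • The grand mean is calculated by $\bar{\bar{x}} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2 + \dots + n_k\bar{x}_k}{n_1 + n_2 + \dots + n_k} = 613.07$ • Calculate $SST = 20(577.55 - 613.07)^2 + 20(653.00 - 613.07)^2 + 20(608.65 - 613.07)^2 = 57{,}512.23$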
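The SST arithmetic above can be reproduced with a few lines. This is a sketch using the three sample means and the sample size of 20 weeks per city from the slide; the variable names are illustrative.

```python
# Sample means of weekly sales for convenience, quality, and price (from the slide).
sample_means = [577.55, 653.00, 608.65]
sample_sizes = [20, 20, 20]

# Grand mean: sample means weighted by their sample sizes.
n = sum(sample_sizes)
grand_mean = sum(nj * m for nj, m in zip(sample_sizes, sample_means)) / n

# SST: size-weighted squared distances of each sample mean from the grand mean.
sst = sum(nj * (m - grand_mean) ** 2 for nj, m in zip(sample_sizes, sample_means))

print(f"grand mean = {grand_mean:.2f}")
print(f"SST = {sst:,.2f}")
```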
The rationale behind test statistic – II • Large variability within the samples weakens the “ability” of the sample means to represent their corresponding population means. • Therefore, even though sample means may markedly differ from one another, SST must be judged relative to the “within samples variability”.
Within samples variability • The variability within samples is measured by adding all the squared distances between observations and their sample means. This sum is called the Sum of Squares for Error SSE In our example this is the sum of all squared differences between sales in city j and the sample mean of city j (over all the three cities).
Sum of squares for errors (SSE) • Solution – continued • Calculate $SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 = (20 - 1)(10{,}774.44) + (20 - 1)(7{,}238.61) + (20 - 1)(8{,}670.24) = 506{,}983.50$
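The same calculation as a sketch in code, using the sample variances quoted on the slide. The result may differ from the slide's 506,983.50 by about one unit; a plausible reason (an assumption, not stated in the source) is that the slide's total was computed from unrounded variances.

```python
# Sample variances of weekly sales in the three cities (from the slide).
sample_variances = [10774.44, 7238.61, 8670.24]
sample_sizes = [20, 20, 20]

# SSE: each sample variance multiplied back by its degrees of freedom (n_j - 1).
sse = sum((nj - 1) * s2 for nj, s2 in zip(sample_sizes, sample_variances))

print(f"SSE = {sse:,.2f}")
```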
The mean sum of squares To perform the test we need to calculate the mean squares as follows: • Calculation of MST (Mean Square for Treatments): $MST = \dfrac{SST}{k - 1}$ • Calculation of MSE (Mean Square for Error): $MSE = \dfrac{SSE}{n - k}$
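Applied to the running example, the mean squares and their ratio work out as in this sketch. It takes SST and SSE as quoted on the earlier slides, with k = 3 treatments and n = 60 observations.

```python
sst = 57512.23   # sum of squares for treatments (from the slide)
sse = 506983.50  # sum of squares for error (from the slide)
k = 3            # number of treatments (marketing strategies)
n = 60           # total observations: 3 cities x 20 weeks

mst = sst / (k - 1)  # mean square for treatments
mse = sse / (n - k)  # mean square for error
f_stat = mst / mse   # F statistic

print(f"MST = {mst:,.2f}, MSE = {mse:,.2f}, F = {f_stat:.2f}")
```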
Calculation of the test statistic $F = \dfrac{MST}{MSE}$ with the following degrees of freedom: $\nu_1 = k - 1$ and $\nu_2 = n - k$. Required conditions: 1. The populations tested are normally distributed. 2. The variances of all the populations tested are equal.
The F test rejection region And finally the hypothesis test: $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ $H_1$: At least two means differ Test statistic: $F = \dfrac{MST}{MSE}$ Rejection region: $F > F_{\alpha,\,k-1,\,n-k}$
The F test $H_0: \mu_1 = \mu_2 = \mu_3$ $H_1$: At least two means differ Test statistic: $F = MST/MSE = 3.23$ Rejection region: $F > F_{.05,\,2,\,57} \approx 3.15$ Since 3.23 > 3.15, there is sufficient evidence to reject $H_0$ in favor of $H_1$ and argue that at least one of the mean sales differs from the others.
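The decision can be double-checked numerically. The sketch below uses closed forms that apply only because the numerator degrees of freedom are 2 in this example; the helper names are hypothetical, and for other designs an F table or a statistics library would be needed.

```python
def f_critical_df1_2(alpha: float, df_within: float) -> float:
    """Critical value F(alpha; 2, df_within); valid only for numerator df = 2."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)


def f_upper_tail_df1_2(f_value: float, df_within: float) -> float:
    """P(F >= f_value) with 2 numerator df; valid only for numerator df = 2."""
    return (1 + 2 * f_value / df_within) ** (-df_within / 2)


f_stat = 3.23        # test statistic from the slide
df_within = 60 - 3   # n - k

f_crit = f_critical_df1_2(0.05, df_within)
p_value = f_upper_tail_df1_2(f_stat, df_within)

print(f"critical value = {f_crit:.3f}, p-value = {p_value:.4f}")
print("reject H0" if f_stat > f_crit else "do not reject H0")
```

The test statistic only narrowly exceeds the critical value, so the evidence against the null hypothesis is present but not overwhelming.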
single factor ANOVA SS(Total) = SST + SSE
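The identity SS(Total) = SST + SSE can be verified on a small data set. The numbers below are made up purely for illustration and are not the weekly sales from the example.

```python
# Hypothetical observations for three treatments (illustrative values only).
groups = {
    "convenience": [5.0, 7.0, 6.0],
    "quality": [8.0, 9.0, 10.0],
    "price": [4.0, 6.0, 5.0],
}

all_obs = [x for obs in groups.values() for x in obs]
grand_mean = sum(all_obs) / len(all_obs)

# Total variation: every observation against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_obs)

# Between-treatments variation: sample means against the grand mean.
sst = sum(len(obs) * (sum(obs) / len(obs) - grand_mean) ** 2 for obs in groups.values())

# Within-treatments variation: observations against their own sample mean.
sse = sum((x - sum(obs) / len(obs)) ** 2 for obs in groups.values() for x in obs)

print(f"SS(Total) = {ss_total:.2f}, SST + SSE = {sst + sse:.2f}")
```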
Models of Fixed and Random Effects • Fixed effects • If all possible levels of a factor are included in our analysis we have a fixed effect ANOVA. • The conclusion of a fixed effect ANOVA applies only to the levels studied. • Random effects • If the levels included in our analysis represent a random sample of all the possible levels, we have a random-effect ANOVA. • The conclusion of the random-effect ANOVA applies to all the levels (not only those studied).
Models of Fixed and Random Effects • In some ANOVA models the test statistic of the fixed-effects case may differ from the test statistic of the random-effects case. • Fixed and random effects: examples • Fixed effects: the advertisement example. All the levels of the marketing strategies were included. • Random effects: to determine if there is a difference in the production rate of 50 machines, four machines are randomly selected and their production is recorded.
Two-Factor Analysis of Variance - • Example • Suppose in the Example, two factors are to be examined: • The effects of the marketing strategy on sales. • Emphasis on convenience • Emphasis on quality • Emphasis on price • The effects of the selected media on sales. • Advertise on TV • Advertise in newspapers
Attempting one-way ANOVA • Solution • We may attempt to analyze combinations of levels, one from each factor, using one-way ANOVA. • The treatments will be: • Treatment 1: Emphasize convenience and advertise on TV • Treatment 2: Emphasize convenience and advertise in newspapers • … • Treatment 6: Emphasize price and advertise in newspapers
Attempting one-way ANOVA • Solution • The hypotheses tested are: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6$ $H_1$: At least two means differ.
Attempting one-way ANOVA • Solution • In each one of six cities sales are recorded for ten weeks. • In each city a different combination of marketing emphasis and media usage is employed.
City      Emphasis      Media
City 1    Convenience   TV
City 2    Convenience   Newspaper
City 3    Quality       TV
City 4    Quality       Newspaper
City 5    Price         TV
City 6    Price         Newspaper
Attempting one-way ANOVA • Solution • The p-value = .0452. • We conclude that there is evidence that differences exist in the mean weekly sales among the six cities.
Interesting questions – no answers • These results raise some questions: • Are the differences in sales caused by the different marketing strategies? • Are the differences in sales caused by the different media used for advertising? • Are there combinations of marketing strategy and media that interact to affect the weekly sales?
Two-way ANOVA (two factors) • The current experimental design cannot provide answers to these questions. • A new experimental design is needed.
Two-way ANOVA (two factors) Are there differences in the mean sales caused by different marketing strategies?
Factor A: Marketing strategy (columns) Factor B: Advertising media (rows)
              Convenience     Quality         Price
TV            City 1 sales    City 3 sales    City 5 sales
Newspapers    City 2 sales    City 4 sales    City 6 sales
Two-way ANOVA (two factors) Test whether mean sales of "Convenience", "Quality", and "Price" significantly differ from one another. $H_0: \mu_{Conv.} = \mu_{Quality} = \mu_{Price}$ $H_1$: At least two means differ Calculations are based on the sum of squares for factor A, SS(A).
Two-way ANOVA (two factors) Are there differences in the mean sales caused by different advertising media?
Factor A: Marketing strategy (columns) Factor B: Advertising media (rows)
              Convenience     Quality         Price
TV            City 1 sales    City 3 sales    City 5 sales
Newspapers    City 2 sales    City 4 sales    City 6 sales
Two-way ANOVA (two factors) Test whether mean sales of "TV" and "Newspapers" significantly differ from one another. $H_0: \mu_{TV} = \mu_{Newspapers}$ $H_1$: The means differ Calculations are based on the sum of squares for factor B, SS(B).
Two-way ANOVA (two factors) Are there differences in the mean sales caused by interaction between marketing strategy and advertising medium?
Factor A: Marketing strategy (columns) Factor B: Advertising media (rows)
              Convenience     Quality         Price
TV            City 1 sales    City 3 sales    City 5 sales
Newspapers    City 2 sales    City 4 sales    City 6 sales
Two-way ANOVA (two factors) Test whether mean sales of certain cells differ from the level expected. Calculations are based on the sum of squares for interaction, SS(AB).
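F tests for the Two-way ANOVA • Test for the difference between the levels of the main factors A and B: $F = \dfrac{MS(A)}{MSE}$ and $F = \dfrac{MS(B)}{MSE}$, where $MS(A) = \dfrac{SS(A)}{a-1}$, $MS(B) = \dfrac{SS(B)}{b-1}$, and $MSE = \dfrac{SSE}{n-ab}$. Rejection regions: $F > F_{\alpha,\,a-1,\,n-ab}$ for factor A and $F > F_{\alpha,\,b-1,\,n-ab}$ for factor B. • Test for interaction between factors A and B: $F = \dfrac{MS(AB)}{MSE}$, where $MS(AB) = \dfrac{SS(AB)}{(a-1)(b-1)}$. Rejection region: $F > F_{\alpha,\,(a-1)(b-1),\,n-ab}$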
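The three F ratios can be computed directly from raw data, as in the sketch below. It assumes a balanced design (the same number of replicates, more than one, in every cell) and uses the standard sums-of-squares formulas for that case; the slides do not give these formulas explicitly. The function name `two_way_anova` and the data are hypothetical.

```python
from itertools import product
from statistics import fmean


def two_way_anova(cells: dict) -> dict:
    """F ratios for a balanced two-way ANOVA with replication.

    cells[(i, j)] holds the observations for level i of factor A and
    level j of factor B. Assumes every cell has the same size r > 1.
    """
    a_levels = sorted({i for i, _ in cells})
    b_levels = sorted({j for _, j in cells})
    a, b = len(a_levels), len(b_levels)
    r = len(next(iter(cells.values())))
    n = a * b * r

    grand = fmean(x for obs in cells.values() for x in obs)
    a_mean = {i: fmean(x for j in b_levels for x in cells[(i, j)]) for i in a_levels}
    b_mean = {j: fmean(x for i in a_levels for x in cells[(i, j)]) for j in b_levels}
    cell_mean = {key: fmean(obs) for key, obs in cells.items()}

    ss_a = r * b * sum((a_mean[i] - grand) ** 2 for i in a_levels)
    ss_b = r * a * sum((b_mean[j] - grand) ** 2 for j in b_levels)
    ss_ab = r * sum(
        (cell_mean[(i, j)] - a_mean[i] - b_mean[j] + grand) ** 2
        for i, j in product(a_levels, b_levels)
    )
    sse = sum((x - cell_mean[key]) ** 2 for key, obs in cells.items() for x in obs)

    mse = sse / (n - a * b)
    return {
        "F_A": (ss_a / (a - 1)) / mse,
        "F_B": (ss_b / (b - 1)) / mse,
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / mse,
    }


# Illustrative data only: 3 strategies x 2 media, 2 observations per cell.
sales = {
    ("convenience", "tv"): [10, 12], ("convenience", "paper"): [8, 10],
    ("quality", "tv"): [14, 16], ("quality", "paper"): [13, 15],
    ("price", "tv"): [9, 11], ("price", "paper"): [12, 14],
}
result = two_way_anova(sales)
print(result)
```

Each ratio would then be compared with the critical value for its own numerator degrees of freedom and n − ab denominator degrees of freedom.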
Required conditions: • The response distribution is normal. • The treatment variances are equal. • The samples are independent.
F tests for the Two-way ANOVA • Example – continued • Test of the difference in mean sales between the three marketing strategies (Factor A) $H_0: \mu_{conv.} = \mu_{quality} = \mu_{price}$ $H_1$: At least two mean sales are different $F = MS(A)/MSE = MS(\text{Marketing strategy})/MSE = 5.33$ $F_{critical} = F_{\alpha,\,a-1,\,n-ab} = F_{.05,\,3-1,\,60-(3)(2)} = 3.17$ (p-value = .0077) • At the 5% significance level there is evidence to infer that differences in weekly sales exist among the marketing strategies.
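The critical value and p-value quoted for factor A can be checked with the sketch below. It again uses closed forms that hold only for 2 numerator degrees of freedom (a − 1 = 2 here), and the helper names are hypothetical.

```python
def f_critical_df1_2(alpha: float, df_error: float) -> float:
    """Critical value F(alpha; 2, df_error); valid only for numerator df = 2."""
    return (df_error / 2) * (alpha ** (-2 / df_error) - 1)


def f_upper_tail_df1_2(f_value: float, df_error: float) -> float:
    """P(F >= f_value) with 2 numerator df; valid only for numerator df = 2."""
    return (1 + 2 * f_value / df_error) ** (-df_error / 2)


a, b, n = 3, 2, 60    # strategies, media, total observations
df_error = n - a * b  # 54
f_stat = 5.33         # MS(A)/MSE from the slide

f_crit = f_critical_df1_2(0.05, df_error)
p_value = f_upper_tail_df1_2(f_stat, df_error)

print(f"F critical = {f_crit:.2f}, p-value = {p_value:.4f}")
```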