Week 10 Chapter 10 - Hypothesis Testing III : The Analysis of Variance (ANOVA) & Chapter 11 – Hypothesis Testing IV: Chi Square
Chapter 10 Hypothesis Testing III : The Analysis of Variance (ANOVA)
In This Presentation • The basic logic of analysis of variance (ANOVA) • A sample problem applying ANOVA • The Five Step Model • Limitations of ANOVA • Post hoc techniques
Chapter 8 (one sample): Population = Penn State University; Group 1 = all education majors; random sample (RS) of 100 education majors
Chapter 9 (two samples): Population = Pennsylvania; Group 1 = all males in population (RS of 100 males); Group 2 = all females in population (RS of 100 females)
In this chapter (three or more samples): Population = Pennsylvania; Group 1 = all Protestants in population (RS of 100 Protestants); Group 2 = all Catholics in population (RS of 100 Catholics); Group 3 = all Jews in population (RS of 100 Jews)
Basic Logic • ANOVA can be used in situations where the researcher is interested in the differences in sample means across three or more categories • Examples: • How do Protestants, Catholics and Jews vary in terms of number of children? • How do Republicans, Democrats, and Independents vary in terms of income? • How do older, middle-aged, and younger people vary in terms of frequency of church attendance?
Basic Logic • ANOVA is used when: • The independent variable has more than two categories • The dependent variable is measured at the interval or ratio level
Basic Logic • ANOVA can be thought of as an extension of the t test to more than two groups • The t test can only be used when the independent variable has exactly two categories • ANOVA asks “are the differences between the samples large enough to reject the null hypothesis and justify the conclusion that the populations represented by the samples are different?” (p. 243) • The Ho is that the population means are all the same: • Ho: μ1 = μ2 = μ3 = … = μk
Basic Logic • If the Ho is true, the sample means should be about the same value, with little difference between them • If the Ho is false, there should be substantial differences between categories combined with relatively little difference within categories (the sample standard deviations should be low in value) • In other words, if the Ho is false, there will be big differences between the sample means combined with small sample standard deviations
Basic Logic • The larger the differences between the sample means, the more likely the Ho is false – especially when there is little difference within categories • When we reject the Ho, we are saying there are differences between the populations represented by the samples
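To make the “between versus within” idea concrete, here is a minimal Python sketch (the group scores are made up for illustration, not taken from the textbook) that computes the between-group and within-group sums of squares and the F ratio directly from their definitions:

```python
import numpy as np

# Hypothetical scores for three groups (illustrative values only).
groups = [
    np.array([10.0, 12.0, 11.0, 13.0]),
    np.array([14.0, 15.0, 13.0, 16.0]),
    np.array([ 9.0, 10.0,  8.0, 11.0]),
]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k = len(groups)               # number of categories
n_total = all_scores.size     # total number of cases

# Between-group sum of squares: how far each group mean sits from the grand mean.
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: how much scores vary around their own group mean.
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)

dfb = k - 1                   # degrees of freedom between
dfw = n_total - k             # degrees of freedom within
f_ratio = (ssb / dfb) / (ssw / dfw)
print(f"F = {f_ratio:.2f} with df = ({dfb}, {dfw})")
```

A large F means the group means differ a lot relative to the spread inside each group, which is exactly the situation in which we reject the Ho.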
Example 1 We have administered the support for capital punishment scale to a sample of 20 people who are equally divided across five religious categories
Example 1 • Hypothesis Test of ANOVA • Step 1: Make assumptions and meet test requirements • Independent random samples • Interval-ratio level of measurement • Normally distributed populations • Equal population variances
Example 1 • Step 2: State the null hypothesis • Ho: μ1 = μ2 = μ3 = μ4 = μ5 • H1: at least one of the population means is different • Step 3: Select the sampling distribution and establish the critical region • Sampling distribution = F distribution • Alpha = 0.05 • dfw = 15, dfb = 4 • F(critical) = 3.06
Example 1 • Step 4: Compute test statistic • F = 2.57 • Step 5: Make a decision and interpret the results • F(critical) = 3.06 • F(obtained) = 2.57 • The test statistic does not fall in the critical region, so fail to reject the null hypothesis – support for capital punishment does not differ across the populations of religious affiliations
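For comparison, the same five-step decision can be sketched in Python with SciPy. The scores below are placeholders (the textbook's 20 capital-punishment scores are not reproduced here), so only the structure of the test, not the numbers, matches Example 1:

```python
from scipy import stats

# Placeholder scores for five religious categories, four cases each
# (the textbook's actual scores and category labels are not reproduced here).
samples = [
    [7, 6, 8, 5],
    [6, 7, 5, 6],
    [4, 5, 3, 4],
    [5, 4, 6, 5],
    [6, 5, 7, 6],
]

# Step 4: F(obtained) and its p-value.
f_obtained, p_value = stats.f_oneway(*samples)

# Step 3: F(critical) for alpha = 0.05 with dfb = 4 and dfw = 15.
f_critical = stats.f.ppf(1 - 0.05, dfn=4, dfd=15)   # about 3.06

# Step 5: compare the obtained statistic with the critical value.
print(f"F(obtained) = {f_obtained:.2f}, F(critical) = {f_critical:.2f}")
print("Reject Ho" if f_obtained > f_critical else "Fail to reject Ho")
```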
Limitations of ANOVA • Requires interval-ratio level measurement of the dependent variable and roughly equal numbers of cases in the categories of the independent variable • Statistically significant differences are not necessarily important • The alternative (research) hypothesis is not specific – it only asserts that at least one of the population means differs from the others • Use post hoc techniques to locate the specific differences (one such technique is sketched below)
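One common post hoc technique is Tukey's HSD, which compares every pair of group means after a significant ANOVA. A minimal sketch, again with placeholder data rather than the textbook's scores, and assuming SciPy 1.8 or newer (where `tukey_hsd` is available):

```python
from scipy import stats

# Placeholder scores for three groups (illustrative only).
group_a = [10, 12, 11, 13]
group_b = [14, 15, 13, 16]
group_c = [9, 10, 8, 11]

result = stats.tukey_hsd(group_a, group_b, group_c)

# result.pvalue is a matrix of pairwise p-values: entry [i, j] tests
# whether the means of group i and group j differ significantly.
print(result.pvalue)
```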
USING SPSS • On the top menu, click on “Analyze” • Select “Compare Means” • Select “One Way ANOVA”
ANOVA in SPSS • Analyze / Compare means / One-way ANOVA
Chapter 11 Hypothesis Testing IV: Chi Square
In This Presentation Bivariate (crosstabulation) tables The basic logic of Chi Square The terminology used with bivariate tables The computation of Chi Square with an example problem The five step model Limitations of Chi Square
The Bivariate Table Bivariate tables: display the scores of cases on two different variables at the same time
The Bivariate Table Note the two dimensions: rows and columns. What is the independent variable? What is the dependent variable? Where are the row and column marginals? Where is the total number of cases (N)?
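As a concrete illustration (the numbers here are invented, not the textbook's table), a bivariate table can be built in Python with pandas; `margins=True` adds the row marginals, column marginals, and N. By the usual convention the independent variable goes in the columns and the dependent variable in the rows:

```python
import pandas as pd

# Invented data for 100 hypothetical graduates (not the textbook's cases).
data = pd.DataFrame({
    "accredited": ["yes"] * 60 + ["no"] * 40,            # independent variable
    "employed":   ["yes"] * 40 + ["no"] * 20              # dependent variable
                + ["yes"] * 15 + ["no"] * 25,
})

table = pd.crosstab(
    data["employed"],      # rows: dependent variable
    data["accredited"],    # columns: independent variable
    margins=True,          # adds row/column marginals and the total N ("All")
)
print(table)
```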
Chi Square • Chi Square can be used: • with variables that are measured at any level (nominal, ordinal, interval or ratio) • with variables that have many categories or scores • when we don’t know the shape of the population or sampling distribution
Basic Logic Independence: “Two variables are independent if the classification of a case into a particular category of one variable has no effect on the probability that the case will fall into any particular category of the second variable” (p. 274)
Basic Logic • Chi Square as a test of statistical significance is a test for independence
Basic Logic Chi Square is a test of significance based on bivariate, crosstabulation tables (also called crosstabs) We are looking for significant differences between the actual cell frequencies OBSERVED in a table (fo) and those that would be EXPECTED by random chance or if cell frequencies were independent (fe)
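For reference, the standard formulas behind this comparison (the usual textbook definitions, written here in LaTeX notation):

```latex
% Expected cell frequency and the obtained chi-square statistic
f_e = \frac{(\text{row marginal})(\text{column marginal})}{N}
\qquad
\chi^2(\text{obtained}) = \sum \frac{(f_o - f_e)^2}{f_e}
```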
Example • RQ: Is the probability of securing employment in the field of social work dependent on the accreditation status of the program? • NULL HYP: The probability of securing employment in the field of social work is NOT dependent on the accreditation status of the program. (The variables are independent) • HYP: The probability of securing employment in the field of social work is dependent on the accreditation status of the program. (The variables are dependent)
Computation of Chi Square Example
Computation of Chi Square • Expected frequency (fe) for the top-left cell (and for every other cell): fe = (row marginal × column marginal) / N
Computation of Chi Square Example (continued)
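A minimal Python sketch of the same computation. The 2 x 2 table below is hypothetical (the textbook's accreditation/employment counts are not reproduced here), so the obtained χ2 will not equal the 10.78 reported in Step 4; only the procedure matches:

```python
import numpy as np
from scipy import stats

# Hypothetical observed frequencies (rows: employed yes/no, columns: accredited yes/no).
observed = np.array([
    [40, 15],
    [20, 25],
])

# Expected frequencies: (row marginal * column marginal) / N for each cell.
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
n = observed.sum()
expected = row_totals @ col_totals / n

# Obtained chi square from the textbook formula.
chi2_by_hand = ((observed - expected) ** 2 / expected).sum()

# The same result from SciPy (correction=False matches the textbook formula
# rather than applying Yates' continuity correction).
chi2, p, dof, exp = stats.chi2_contingency(observed, correction=False)
print(f"chi-square (by hand) = {chi2_by_hand:.2f}, via SciPy = {chi2:.2f}, df = {dof}")
```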
Step 1: Make Assumptions and Meet Test Requirements • Independent random samples • Level of Measurement is nominal • Note the minimal assumptions. In particular, note that no assumption is made about the shape of the sampling distribution. The chi square test is nonparametric, or distribution-free
Step 2: State the Null Hypothesis • Ho: The variables are independent • Another way to state the Ho, more consistently with previous tests: • H0: fo = fe • H1: The variables are dependent • Another way to state the H1: • H1: fo ≠ fe
Step 3: Select the Sampling Distribution and Establish the Critical Region Sampling Distribution = Chi Square, χ2 Alpha = 0.05 df = (r-1)(c-1) = (2-1)(2-1)= 1 χ2 (critical) = 3.841
Step 4: Calculate the Test Statistic χ2 (obtained) = 10.78
Step 5: Make a Decision and Interpret the Results of the Test χ2 (critical) = 3.841 χ2 (obtained) = 10.78 The test statistic falls in the critical region, so reject Ho There is a significant relationship between employment status and accreditation status in the population from which the sample was drawn
Interpreting Chi Square The chi square test tells us only if the variables are independent or not It does not tell us the pattern or nature of the relationship To investigate the pattern, compute percentages within each column and compare across the columns
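A quick way to see the pattern once a table is significant is to percentage each column and compare across columns. A short sketch, reusing the hypothetical table from the earlier example:

```python
import numpy as np

observed = np.array([
    [40, 15],   # employed in social work: yes
    [20, 25],   # employed in social work: no
])

# Divide every cell by its column total and convert to percentages;
# the pattern appears when the columns are compared side by side.
column_percentages = observed / observed.sum(axis=0) * 100
print(np.round(column_percentages, 1))
```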
Computation of Chi Square Are the homicide rate and volume of gun sales related for a sample of 25 cities? (Problem 11.4, p. 295) The bivariate table shows the relationship between homicide rate (columns) and gun sales (rows) This 2 x 2 table has 4 cells
Step 1: Make Assumptions and Meet Test Requirements • Independent random samples • Level of Measurement is nominal • Note the minimal assumptions. In particular, note that no assumption is made about the shape of the sampling distribution. The chi square test is nonparametric, or distribution-free
Step 2: State the Null Hypothesis • Ho: The variables are independent • Another way to state the Ho, more consistently with previous tests: • Ho: fo = fe • H1: The variables are dependent • Another way to state the H1: • H1: fo ≠ fe
Step 3: Select the Sampling Distribution and Establish the Critical Region Sampling Distribution = χ2 Alpha = 0.05 df = (r-1)(c-1) = (2-1)(2-1)=1 χ2 (critical) = 3.841
Step 4: Calculate the Test Statistic χ2 (obtained) = 2.00
Step 5: Make a Decision and Interpret the Results of the Test χ2 (critical) = 3.841 χ2 (obtained) = 2.00 The test statistic does not fall in the critical region, so we fail to reject the Ho There is no significant relationship between homicide rate and gun sales in the population from which the sample was drawn
Interpreting Chi Square • Cities low on homicide rate were high in gun sales, and cities high in homicide rate were low in gun sales • As homicide rates increase, gun sales decrease • Although this relationship is not statistically significant, it does show a clear pattern in the sample
Limitations of Chi Square • Difficult to interpret when variables have many categories • Best when variables have four or fewer categories • With small sample sizes, we cannot assume that the Chi Square sampling distribution will be accurate • Small sample: a high percentage of cells have expected frequencies of 5 or less • Like all tests of hypotheses, Chi Square is sensitive to sample size • As N increases, the obtained Chi Square increases (see the sketch below) • With large samples, trivial relationships may be significant • It is important to remember that statistical significance is not the same as substantive significance
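A small sketch of the sample-size point: when the cell proportions stay exactly the same but every count is multiplied by 10, the obtained chi square is also multiplied by 10, so a weak pattern can become “significant” purely through a larger N (the table values are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical 2 x 2 table with a weak pattern.
observed = np.array([
    [12, 8],
    [8, 12],
])

for multiplier in (1, 10):
    # Same proportions, larger N: the obtained chi square grows with the multiplier.
    chi2, p, dof, _ = stats.chi2_contingency(observed * multiplier, correction=False)
    print(f"N = {observed.sum() * multiplier}: chi-square = {chi2:.2f}, p = {p:.4f}")
```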
Chi Square in SPSS • Step 4: computing the test statistic in SPSS
Chi Square in SPSS • Step 5: making a decision and interpreting the results of the test • The SPSS output reports the obtained χ2 value and its significance
Chi Square in SPSS The nominal symmetric measures indicate both the strength and significance of the relationship between the row and column variables of a crosstabulation.
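Among the nominal symmetric measures SPSS reports is Cramér's V. For comparison, Cramér's V can also be computed in Python (assuming SciPy 1.7 or newer, which provides `scipy.stats.contingency.association`; the table below is hypothetical):

```python
import numpy as np
from scipy.stats.contingency import association

# Hypothetical observed frequencies from a 2 x 2 crosstab.
observed = np.array([
    [40, 15],
    [20, 25],
])

# Cramér's V: a 0-to-1 measure of the strength of association for a crosstab.
v = association(observed, method="cramer")
print(f"Cramér's V = {v:.3f}")
```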