PSY 307 – Statistics for the Behavioral Sciences

Chapter 19 – Chi-Square Test for Qualitative Data • Chapter 21 – Deciding Which Test to Use

Chi-Square (χ²) Test • For qualitative data. • Tests whether observed frequencies closely match hypothesized expected frequencies. • Expected frequencies can be derived from chance probabilities or from other values based on theory.

Two Tests • One-way (one variable) chi-square: • Tests observed frequencies against a null hypothesis of equal or specified proportions. • Two-way (two variable) chi-square: • Tests observed frequencies against the frequencies expected across all cells if the two cross-classified variables were independent. • Another way of saying this is that it tests for a relationship (interaction) between the two variables.

Frequencies • Observed frequencies – the obtained frequency for each category in a study. • Expected frequencies – the hypothesized frequency for each category given a true null hypothesis.

Calculating Chi-Square (χ²) • Determine the expected frequencies. • Are the differences between the expected and the observed frequencies large enough to qualify as a rare outcome? • Calculate the χ² ratio. • Compare against the χ² table with the appropriate degrees of freedom.

Blood Type Example • H0: P_O = .44, P_A = .41, P_B = .10, P_AB = .05 • H1: H0 is false

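Calculating χ² • χ² = Σ (f_o − f_e)² / f_e, where f_o is the observed frequency and f_e the expected frequency for each category. • df = categories (c) − 1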
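The calculation can be sketched in a few lines of Python. The hypothesized proportions come from the blood type slide. The observed counts are an assumption made for illustration (the slides do not list the sample data); they were picked so that the statistic lands near the 11.24 quoted on the table slide. The function name `one_way_chi_square` is also hypothetical.

```python
# Hypothesized proportions from the null hypothesis on the blood type slide.
expected_props = {"O": 0.44, "A": 0.41, "B": 0.10, "AB": 0.05}

# ASSUMPTION: illustrative observed counts for a sample of 100 people.
# The slides do not give the observed data.
observed = {"O": 38, "A": 38, "B": 20, "AB": 4}


def one_way_chi_square(observed, expected_props):
    """Return (chi-square, df) for a one-way test."""
    n = sum(observed.values())
    chi2 = 0.0
    for category, f_o in observed.items():
        f_e = n * expected_props[category]   # expected frequency under H0
        chi2 += (f_o - f_e) ** 2 / f_e       # squared difference, scaled by f_e
    df = len(observed) - 1                   # categories minus one
    return chi2, df


chi2, df = one_way_chi_square(observed, expected_props)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

Most of the total here comes from type B, where the assumed count is twice the expected frequency.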

Chi-Square Distribution

Chi-Square Table • Look up the critical value for our df (c − 1 = 3) and significance level (e.g., p < .05). • Is 11.24 greater than 7.81? Yes, so reject the null hypothesis. • Conclude that blood types in the sample are not distributed as in the general population.
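The decision rule is a single comparison. The sketch below hard-codes only the two .05-level critical values that these slides quote (7.81 for df = 3 and 5.99 for df = 2); the dictionary and the function name `decide` are hypothetical, and a full chi-square table would be needed for any other df.

```python
# Critical values at the .05 level, limited to the two quoted in these slides.
CRITICAL_05 = {2: 5.99, 3: 7.81}


def decide(chi2, df, table=CRITICAL_05):
    """Reject H0 when the obtained chi-square exceeds the critical value."""
    critical = table[df]  # raises KeyError for a df not listed above
    return "reject H0" if chi2 > critical else "retain H0"


decision = decide(11.24, 3)  # blood type example
print(decision)
```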

About χ² • Because differences from expected values are squared, the value of χ² cannot be negative. • Because differences are squared, the χ² test is nondirectional. • A significant χ² is not necessarily due to big differences; small ones can add up.
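These properties are easy to check numerically. The counts below are made up for illustration, and `chi_square` is a hypothetical helper that takes observed and expected frequencies directly.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((f_o - f_e) ** 2 / f_e for f_o, f_e in zip(observed, expected))


# Nondirectional: swapping which category is over and which is under
# the expected frequency leaves the statistic unchanged.
chi_a = chi_square([60, 40], [50, 50])
chi_b = chi_square([40, 60], [50, 50])

# Small differences add up: 20 categories, each off by only 12 from an
# expected frequency of 100, alternating above and below.
observed_many = [112 if i % 2 == 0 else 88 for i in range(20)]
total = chi_square(observed_many, [100] * 20)

print(chi_a, chi_b, round(total, 2))
```

Each of the 20 categories contributes only 1.44, but the contributions accumulate because squaring removes the sign.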

Two-Way χ² • When observations are cross-classified according to two variables, a two-way test is used. • The two-way test examines the relationship between two variables. • It is a test of independence between them. • Null hypothesis: independence. • Alternative hypothesis: H0 is false.

Returned Letter Example H0: Type of neighborhood and return rate of lost letters are independent. H1: H0 is false.

Calculating Expected Frequencies • For each cell, f_e = (row total × column total) / grand total.

Calculating Two-Way χ² • Expected frequencies are based on the proportions found in the column and row totals. • Degrees of freedom are limited by the column and row totals. • Once expected frequencies and df have been found, calculate χ² the same as in a one-way test.

Calculating χ² • df = (columns − 1)(rows − 1) • df = (3 − 1)(2 − 1) = 2 • From the chi-square table, the critical value is 5.99. • Our value of 9.17 exceeds 5.99, so reject the null hypothesis. • There is a relationship between neighborhood and letter return rate.
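A two-way calculation can be sketched as follows, with each expected frequency computed as (row total × column total) / grand total. The 2 × 3 table is hypothetical: the slides do not list the returned-letter counts, so this example does not reproduce the 9.17 above. The function name `two_way_chi_square` is likewise an assumption.

```python
def two_way_chi_square(table):
    """Return (chi-square, df, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected frequency for each cell under independence.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    chi2 = sum(
        (f_o - f_e) ** 2 / f_e
        for obs_row, exp_row in zip(table, expected)
        for f_o, f_e in zip(obs_row, exp_row)
    )
    df = (len(col_totals) - 1) * (len(row_totals) - 1)
    return chi2, df, expected


# HYPOTHETICAL counts: rows = returned / not returned,
# columns = three types of neighborhood.
table = [
    [30, 20, 10],
    [10, 20, 30],
]
chi2, df, expected = two_way_chi_square(table)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

With these made-up counts every expected frequency is 20, so the middle column contributes nothing and the two outer columns account for the whole statistic.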

Effect Size for χ² • Squared Cramér's phi coefficient: φ_c² = χ² / [n(k − 1)], where n is the total sample size and k is the smaller of the number of rows or columns. • Roughly estimates the proportion of explained variance (predictability) between two qualitative variables. • .01 = small effect • .09 = medium effect • .25 = large effect
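A minimal sketch of the effect size calculation, using the standard definition of the squared Cramér's phi coefficient, χ² / [n(k − 1)], which is assumed here to be the formula behind the guidelines above. The χ² of 9.17 and the 2 × 3 layout come from the returned-letter example; the sample size n = 200 is an assumption, since the slides do not state it. Both function names and the "negligible" label are hypothetical.

```python
def cramers_phi_squared(chi2, n, n_rows, n_cols):
    """Squared Cramér's phi: chi-square / (n * (k - 1)), k = min(rows, cols)."""
    k = min(n_rows, n_cols)
    return chi2 / (n * (k - 1))


def effect_label(phi2):
    """Map the value onto the rough guidelines from the slide."""
    if phi2 >= 0.25:
        return "large"
    if phi2 >= 0.09:
        return "medium"
    if phi2 >= 0.01:
        return "small"
    return "negligible"  # label not from the slides


# ASSUMPTION: n = 200 is illustrative; the slides do not give the sample size.
phi2 = cramers_phi_squared(chi2=9.17, n=200, n_rows=2, n_cols=3)
print(f"{phi2:.3f} ({effect_label(phi2)})")
```

Under that assumed n, the value falls between the small and medium guidelines, so a statistically significant χ² does not by itself imply a strong relationship.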

Precautions • Observations must be independent of each other. • One observation per subject. • Avoid small expected frequencies: each must be 5 or more. • Avoid small sample sizes, which increase the danger of a Type II error (retaining a false null hypothesis). • Avoid very large sample sizes, which can make trivially small differences statistically significant.
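The expected-frequency precaution can be checked mechanically before running the test. This is a sketch with made-up expected frequencies; `small_expected_cells` is a hypothetical helper, and the threshold of 5 is the one given above.

```python
def small_expected_cells(expected, minimum=5):
    """Return (row, column, value) for every expected frequency below minimum."""
    return [
        (i, j, f_e)
        for i, row in enumerate(expected)
        for j, f_e in enumerate(row)
        if f_e < minimum
    ]


# HYPOTHETICAL expected frequencies for a 2 x 2 table.
expected = [
    [12.0, 4.5],
    [8.0, 3.0],
]
problems = small_expected_cells(expected)
print(problems)
```

Any cell returned here signals that the sample is too small for the categories as defined.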

A Repertoire of Hypothesis Tests • z-test – for use with normal distributions when σ is known. • t-test – for use with one or two groups, when σ is unknown. • F-test (ANOVA) – for comparing means for multiple groups. • Chi-square test – for use with qualitative data.

Null and Alternative Hypotheses • How you write the null and alternative hypotheses varies with the design of the study, and so does the type of test statistic. • Which table you use to find the critical value depends on the test statistic (t, F, χ², U, T, H). • t and z tests can be directional.

Deciding Which Test to Use • Are the data qualitative or quantitative? • If qualitative, use chi-square. • How many groups are there? • If two, use a t-test; if more, use ANOVA. • Is the design within or between subjects? • How many independent variables (IVs or factors) are there?

Summary of t-tests • Single-group t-test – for one sample compared to a population mean. • Independent sample t-test – for comparing two groups in a between-subject design. • Paired (matched) sample t-test – for comparing two groups in a within-subject design.

Summary of ANOVA Tests • One-way ANOVA – for one IV, independent samples • Repeated Measures ANOVA – for one or more IVs where samples are repeated, matched or paired. • Two-way (factorial) ANOVA – for two or more IVs, independent samples. • Mixed ANOVA – for two or more IVs, between and within subjects.

Summary of Nonparametric Tests • Two samples, independent groups – Mann-Whitney (U). • Like an independent sample t-test. • Two samples, paired, matched or repeated measures – Wilcoxon (T). • Like a paired sample t-test. • Three or more samples, independent groups – Kruskal-Wallis (H). • Like a one-way ANOVA.

Summary of Qualitative Tests • One-way chi-square (χ²) – one variable. • Tests whether frequencies are distributed across the possible categories in equal or otherwise specified proportions. • Two-way chi-square – two variables. • Tests whether there is a relationship (interaction) between the two variables.
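The questions and summaries above can be condensed into a rough decision function. This is only a sketch of the slides' logic, not a complete guide: the function name and parameters are hypothetical, and the `parametric` flag is an assumption, since the slides do not say when to prefer a nonparametric test.

```python
def choose_test(qualitative, n_variables=1, n_groups=2, within_subjects=False,
                n_ivs=1, mixed=False, parametric=True):
    """Suggest a test from the design, following the summaries above."""
    if qualitative:
        return "one-way chi-square" if n_variables == 1 else "two-way chi-square"

    if not parametric:  # ASSUMPTION: caller decides when this applies
        if n_groups == 2:
            return "Wilcoxon T" if within_subjects else "Mann-Whitney U"
        if not within_subjects:
            return "Kruskal-Wallis H"
        return "not covered in these slides"

    if n_groups == 1:
        return "single-group t-test"
    if n_groups == 2 and n_ivs == 1:
        return "paired-sample t-test" if within_subjects else "independent-sample t-test"

    # More than two groups, or more than one IV: some form of ANOVA.
    if mixed:
        return "mixed ANOVA"
    if within_subjects:
        return "repeated-measures ANOVA"
    return "one-way ANOVA" if n_ivs == 1 else "two-way (factorial) ANOVA"


print(choose_test(qualitative=True, n_variables=2))
print(choose_test(qualitative=False, n_groups=2, within_subjects=True))
print(choose_test(qualitative=False, n_groups=3))
```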
