Quantitative Methods

Quantitative Methods Part 3: Chi-Squared Statistic

Recap on T-Statistic • It uses the mean and standard error of a sample drawn from a population • The data is measured on an “interval” (scale) level • The mean and standard error are the parameters • This approach is therefore known as parametric testing • The alternative approach is non-parametric testing

Introduction to Chi-Squared • It is a non-parametric test: it does not use the mean and standard error of a sample • It works on frequencies (counts) per category: each respondent can choose only one category (unlike the scale data used for the T-Statistic) • The expected frequency should be at least 5 in every category for the test to be valid • If any category has an expected frequency of fewer than 5, you need to increase your sample size

Example using Chi-Squared • “Is there a preference amongst the UW student population for a particular web browser?” (Dr C. Price’s data) • Each student could indicate only one choice • The responses from the sample are the observed frequencies

Was it just chance? • How confident am I? • Was the sample representative of all UW students? • Or was the result just chance? • The Chi-Squared test checks for significance • There are some variations on the test • The simplest uses a “no preference” null hypothesis • H0: The students show “no preference” for a particular browser

Chi-Squared: “Goodness of fit” (No preference) • H0: The students show no preference for a particular browser • This leads to a hypothetical, or expected, frequency distribution • We would expect an equal number of respondents per category • We had 50 respondents and 5 categories, so the expected frequency is 50 / 5 = 10 per category (expected frequency table)

Stage 1: Formulation of Hypothesis • H0: There is no preference in the underlying population for the factor suggested. • H1: There is a preference in the underlying population for the factor suggested. • The basis of the chi-squared test is to compare the observed frequencies against the expected frequencies

Stage 2: Expected Distribution • As our null hypothesis is “no preference”, we need to work out the expected frequencies • You would expect each category to have the same number of respondents (total respondents divided by the number of categories) • Show this in an “Expected frequency” table • Each expected frequency has to be at least 5 for the test to be valid
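Stage 2 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names are hypothetical, and the validity check treats the rule of thumb as "at least 5 per category", which is an assumption about how the threshold is read.

```python
def expected_no_preference(n_respondents, n_categories):
    """Expected count per category when every category is equally likely."""
    return [n_respondents / n_categories] * n_categories


def meets_minimum(expected, minimum=5):
    """Rule-of-thumb check (assumed threshold): every expected count >= minimum."""
    return all(e >= minimum for e in expected)


# The lecture's browser example: 50 respondents, 5 categories.
expected = expected_no_preference(50, 5)
print(expected)                 # [10.0, 10.0, 10.0, 10.0, 10.0]
print(meets_minimum(expected))  # True
```

With only 20 respondents over the same 5 categories, each expected count would be 4 and the check would fail, which is the situation where a larger sample is needed.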

Stage 3a: Level of confidence • Choose the level of significance (often 0.05, i.e. 95% confidence) • 0.05 means we accept a 5% chance of rejecting the null hypothesis when the result was really just chance • Put the other way, we can be 95% confident in a conclusion to reject

Stage 3b: Degrees of freedom • We need to find the degrees of freedom (df) • For goodness of fit, df = number of categories - 1 • We had 5 categories, so df = 5 - 1 = 4

Stage 3: Critical value of Chi-Squared • In order to compare our calculated chi-squared value with the “critical value” in the chi-squared table we need: • The level of significance (0.05, i.e. 95% confidence) • The degrees of freedom (4) • Our critical value from the table = 9.49
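The table lookup can be cross-checked numerically. The sketch below is one possible approach, not the lecture's method: it relies on the closed form of the chi-squared upper-tail probability that holds only for an even number of degrees of freedom, and finds the critical value by bisection. The function names are hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-squared variable; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def critical_value(alpha, df, upper=200.0):
    """Bisection search for the x whose upper-tail probability equals alpha."""
    lower = 0.0
    for _ in range(100):
        mid = (lower + upper) / 2.0
        if chi2_upper_tail_even_df(mid, df) > alpha:
            lower = mid
        else:
            upper = mid
    return (lower + upper) / 2.0


print(round(critical_value(0.05, 4), 2))  # 9.49
```

For odd degrees of freedom this shortcut does not apply, and a printed table or a statistics package would be needed instead.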

Stage 4: Calculate the statistic • For each category, we take the difference between the observed and the expected frequency • We square each difference and divide it by the expected frequency • We add all of them up: χ² = Σ (O - E)² / E = 52
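As a rough sketch of Stage 4, the statistic is the sum over categories of (observed - expected)² / expected. The observed counts below are hypothetical, made up only so that they sum to 50; they are not Dr C. Price's data, so the result differs from the lecture's value of 52.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts for 5 browser categories (illustration only).
observed = [22, 12, 8, 5, 3]
expected = [10.0] * 5  # 50 respondents / 5 categories

stat = chi_squared(observed, expected)
print(round(stat, 2))  # 22.6
print(stat > 9.49)     # True: exceeds the critical value for df = 4 at 0.05
```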

Stage 5: Decision • Can we reject the H0 that students show no preference for a particular browser? • Our value of 52 is way beyond the critical value of 9.49, so we are 95% confident the result did not occur by chance • So yes, we can reject the null hypothesis • Which browser do they prefer? • Firefox, as its observed frequency is way above the expected frequency of 10

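Chi-Squared: “No Difference from a Comparison Population” • RQ: Are drivers of high performance cars more likely to be involved in accidents? • We have a sample of n = 50 and market research data on the proportion of people driving each category of car • Here the expected frequency for each category is the sample size multiplied by that category's proportion in the comparison population • Once the expected frequencies under the null hypothesis have been worked out, the analysis is the same as for the no-preference calculation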
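A minimal sketch of how the expected frequencies could be derived from a comparison population. The car categories and percentage shares below are hypothetical placeholders, not the market research figures used in the lecture.

```python
def expected_from_population(n, shares_percent):
    """Expected count per category = sample size * population share."""
    if sum(shares_percent.values()) != 100:
        raise ValueError("population shares must add up to 100%")
    return {name: n * pct / 100 for name, pct in shares_percent.items()}


# Hypothetical market shares (percent of all drivers), for illustration only.
shares = {"high performance": 10, "family": 60, "small": 30}

expected = expected_from_population(50, shares)
print(expected)  # {'high performance': 5.0, 'family': 30.0, 'small': 15.0}
```

These expected counts then take the place of the equal-split values in the no-preference calculation.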

Chi-Squared test for “Independence” • What makes computer games fun? • A review found the following: • Factors (Mastery, Challenge and Fantasy) • Different opinions depending on gender • Research sample of 50 males and 50 females (observed frequency table)

What is the research question? • View 1: a single sample with individuals measured on 2 variables • RQ: “Is there a relationship between fun factor and gender?” • H0: “There is no such relationship” • View 2: two separate samples representing 2 populations (male and female) • RQ: “Do male and female players have different preferences for fun factors?” • H0: “Male and female players do not have different preferences”

Chi-Squared analysis for “Independence” • Establish the null hypothesis (previous slide) • Determine the critical value of chi-squared, which depends on the level of significance (0.05) and the degrees of freedom • df = (R - 1) * (C - 1) = 1 * 2 = 2 (R = 2 rows, C = 3 columns) • Look up in the chi-squared table • Critical chi-squared value = 5.99

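Chi-Squared analysis for “Independence” • Calculate the expected frequencies • Add up each column (fun factor) and divide by the number of groups (in this case 2 genders) • This shortcut only works if you have an equal number for each gender; in general, expected frequency = (row total * column total) / grand total (if unsure, come and see me)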
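The expected frequencies for an independence test can be sketched with the general rule (row total * column total) / grand total, which reduces to "column total divided by 2" when both genders have the same sample size. The observed counts below are hypothetical, not the lecture's table.

```python
def expected_table(observed):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts; columns are Mastery, Challenge, Fantasy.
observed = [
    [10, 30, 10],  # males (n = 50)
    [22, 8, 20],   # females (n = 50)
]

print(expected_table(observed))
# [[16.0, 19.0, 15.0], [16.0, 19.0, 15.0]]
```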

Chi-Squared analysis for “Independence” • Calculate the statistic using the chi-squared formula, χ² = Σ (O - E)² / E • Ensure you sum over every cell, i.e. include both the male and the female data
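Putting the independence steps together, a minimal sketch might look like the following. The observed table is hypothetical, so the statistic it produces is not the lecture's value; the function names are also illustrative.

```python
def expected_table(observed):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_independence(observed):
    """Return (statistic, degrees of freedom) for an R x C table of counts."""
    expected = expected_table(observed)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df


# Hypothetical observed counts; columns are Mastery, Challenge, Fantasy.
observed = [
    [10, 30, 10],  # males
    [22, 8, 20],   # females
]

stat, df = chi_squared_independence(observed)
print(df)              # 2
print(round(stat, 2))  # 20.57
print(stat > 5.99)     # True: reject H0 at the 0.05 level for this made-up table
```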

Stage 5: Decision • Can we reject the null hypothesis? • Our value of 24.01 is way beyond the critical value of 5.99, so we are 95% confident the result did not occur by chance • Conclusion: We are 95% confident that there is a relationship between gender and fun factor • But what else can we get from this? • Significant fun factor for males = Challenge • Significant fun factors for females = Mastery and Fantasy

Workshop • Work on Workshop 7 activities • Your journal (Homework) • Your Literature Review (Complete/update)

References • Price, C. (2010) Lecture notes • Gravetter, F. and Wallnau, L. (2003) Statistics for the Behavioral Sciences, New York: West Publishing Company
