Explore the use of Chi-Square testing methods for comparing observed and expected frequencies, without assuming a normal distribution. This chapter covers the characteristics, limitations, and guidelines for conducting hypothesis tests.
Chapter 15 Nonparametric Methods: Chi-Square Applications
Nonparametric • One-Look.Com Definition: • adjective: not involving an estimation of the parameters of a statistic • adjective: not requiring knowledge of underlying distribution: used to describe or relating to statistical methods that do not require assumptions about the form of the underlying distribution • You mean we can test without assuming a normal curve? • Yes!
We can test a hypothesis without assuming the data distribution is normal! Goals • Conduct a test of hypothesis comparing an observed set of frequencies to an expected set of frequencies • Goodness-of-fit tests: • Equal Expected Frequencies • Unequal Expected Frequencies • List the characteristics of the Chi-square distribution
Chi-square (χ²) Applications • Testing method where we don’t need assumptions about the shape of the data • Testing methods for nominal data • Data with no natural order • Examples: • Gender • Brand preference • Color • There will be two differences from earlier tests when we do our hypothesis testing: • Look up the critical value of Chi-square in appendix B • Use a new formula for the calculated test statistic
Conduct A Test Of Hypothesis Comparing An Observed Set Of Frequencies To An Expected Set Of Frequencies • Goodness-of-fit tests: • Equal Expected Frequencies • Unequal Expected Frequencies
Purpose Of Goodness-of-fit Tests: • Compare an observed distribution (sample) to an expected distribution (population) • We will ask the question: • Is the difference between the observed values and the expected values: • Due to chance (sampling error): • The observed distribution is the same as the expected distribution • Not due to chance: • The observed distribution is not the same as the expected distribution
Hypothesis Testing: Equal Expected Frequencies • Step 1: State null and alternate hypotheses • Ho : There is no significant difference between the set of observed frequencies and the set of expected frequencies • H1 : There is a difference between the observed and expected frequencies • Step 2: Select a level of significance • α = .01 or .05…
Hypothesis Testing • Step 3: Identify the test statistic (Chi-square = χ²) and draw the curve with the critical value • Use α and df to look up the critical value in appendix B • k = number of categories • (k – 1) = degrees of freedom
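The table lookup in Step 3 can also be approximated numerically. The sketch below is one possible stand-in for appendix B, not the method the slides use: it evaluates the chi-square cumulative distribution with a series expansion and then bisects for the value that leaves α in the upper tail. The function names `chi2_cdf` and `chi2_critical_value` are hypothetical helpers introduced here for illustration.

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom.

    Uses the series expansion of the regularized lower incomplete gamma
    function, which is adequate for the small df found in textbook tables.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))


def chi2_critical_value(alpha, k):
    """Critical value for significance level alpha and k categories (df = k - 1)."""
    if k < 2:
        raise ValueError("need at least two categories")
    df = k - 1
    target = 1.0 - alpha
    lo, hi = 0.0, float(df)
    # Widen the bracket until it contains the target quantile.
    while chi2_cdf(hi, df) < target:
        hi *= 2.0
    # Bisect: the CDF is increasing, so the quantile is unique.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # k = 11 categories gives df = 10; at alpha = .05 this should agree
    # with the tabulated value to about three decimals.
    print(round(chi2_critical_value(0.05, 11), 3))
```

Calling `chi2_critical_value(alpha, k)` with the chosen α and the number of categories plays the same role as reading the row for df = k – 1 and the column for α in the table.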
Hypothesis Testing • Step 4: Formulate a decision rule • If our calculated test statistic is greater than the critical value (18.307 in this example, the appendix B value for α = .05 and df = 10), we reject Ho and accept H1; otherwise we fail to reject Ho
Hypothesis Testing • Step 5: Take a random sample, compute the calculated test statistic, compare it to the critical value, and make a decision to reject or not reject the null hypothesis • χ² = Σ [ (fo – fe)² / fe ] • fo = observed frequency, fe = expected frequency • Equal Expected Frequencies: fe = n / k • Unequal Expected Frequencies: fe will be given, or fe = n × (expected % for the cell)
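The calculation in Step 5 can be sketched in a few lines. This is a minimal illustration that assumes the usual goodness-of-fit statistic, the sum over cells of (fo – fe)² / fe. The function names and the counts in the example are hypothetical and do not come from the slides.

```python
def expected_equal(observed):
    """Equal expected frequencies: every cell gets n / k."""
    n = sum(observed)
    k = len(observed)
    return [n / k] * k


def expected_from_proportions(n, proportions):
    """Unequal expected frequencies: n times the expected share of each cell."""
    return [n * p for p in proportions]


def chi_square_statistic(observed, expected):
    """Sum of (fo - fe)^2 / fe over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical counts for four categories, n = 100, so fe = 25 per cell.
    observed = [30, 20, 25, 25]
    print(chi_square_statistic(observed, expected_equal(observed)))

    # Hypothetical expected shares of 50%, 30%, and 20% with n = 100.
    observed = [60, 25, 15]
    expected = expected_from_proportions(100, [0.5, 0.3, 0.2])
    print(chi_square_statistic(observed, expected))
```

The resulting number is the calculated test statistic. It is compared with the critical value from Step 3, and Ho is rejected only if the statistic is larger.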
Hypothesis Testing • Step 5 (continued): Conclude • Either: • The sample evidence suggests that there is not a difference between the observed and expected frequencies • The observed distribution is the same as the expected distribution • Or: • The sample evidence suggests that there is a difference between the observed and expected frequencies • The observed distribution is not the same as the expected distribution
List The Characteristics Of The Chi-square Distribution • It is positively skewed • However, as the degrees of freedom increase, the curve approaches normal • It is non-negative • Because (fo – fe)² is never negative • There is a family of chi-square distributions • df determines which curve to use • df = k – 1 • k = # of categories
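The claim that the skew fades as df grows can be checked with a standard property that the slides do not state: a chi-square distribution with df degrees of freedom has skewness √(8 / df). The short sketch below uses a hypothetical helper `chi2_skewness` to tabulate that value for a few df.

```python
import math


def chi2_skewness(df):
    """Skewness of a chi-square distribution: sqrt(8 / df)."""
    if df <= 0:
        raise ValueError("df must be positive")
    return math.sqrt(8.0 / df)


if __name__ == "__main__":
    # The skewness stays positive but shrinks toward 0 (the value for a
    # normal curve) as the degrees of freedom increase.
    for df in (3, 5, 10, 100):
        print(df, round(chi2_skewness(df), 3))
```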
χ² Distribution • [Figure: chi-square curves for df = 3, df = 5, and df = 10; horizontal axis: χ²]
Limitations Of Chi-Square • Because fe is used in the denominator, a very small fe could result in a very large calculated test statistic • In general, avoid using Chi-square when: • There are only two cells and either fe is less than 5 (each fe should be 5 or more) • There are more than two cells and more than 20% of the fe cells contain values less than 5
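These guidelines can be turned into a quick check to run before the test. The sketch below reads the rule as: with two cells, every fe must be at least 5; with more than two cells, no more than 20% of the cells may have fe below 5. The function name and default thresholds are illustrative assumptions.

```python
def expected_frequencies_ok(expected, minimum=5, max_share_small=0.20):
    """Return True if the expected frequencies satisfy the rule of thumb."""
    k = len(expected)
    if k < 2:
        raise ValueError("need at least two cells")
    small = sum(1 for fe in expected if fe < minimum)
    if k == 2:
        # Two cells: both expected frequencies must reach the minimum.
        return small == 0
    # More than two cells: at most 20% of cells may fall below the minimum.
    return small / k <= max_share_small


if __name__ == "__main__":
    print(expected_frequencies_ok([10, 3]))              # two cells, one too small
    print(expected_frequencies_ok([10, 10, 10, 10, 3]))  # 1 of 5 cells is small
    print(expected_frequencies_ok([10, 10, 3]))          # 1 of 3 cells is small
```

When the check fails, a common remedy is to combine adjacent categories until the expected frequencies are large enough.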