
The χ² test


sidney

Presentation Transcript


  1. The χ² test • Sections 19.1 and 19.2 of Howell • This section actually includes 2 totally separate tests: • goodness-of-fit test • contingency table analysis • Each has its own point, and requires different things • Only thing in common: the same formula • Keep them separate in your mind!

  2. Return to hypothesis testing • We can test statistical significance, no prob • need p and alpha (and a computer) • Sometimes, no computer available • can use tables to test statistical significance • Little more work, but works just as well • This method uses the same logic as the p value method

  3. Testing Ho without a PC • The strategy (new stuff is underlined) • Step 1: Set up Ho, Ha and decide on alpha • Step 2: Calculate the statistic and df • Step 3: Get the critical value from the table • Step 4: Compare critical value to statistic

  4. Step 1 • Set up Ho and alpha: already known • Ha is the alternative hypothesis • If Ho is false, what do we believe then? (Ha) • Ha represents the opposite of Ho • e.g. if Ho: r = 0 then Ha: r ≠ 0 • If we reject Ho (because it's false), then we must accept Ha as being true.

  5. Step 2 • Nothing different • use appropriate formulas for stat and df!

  6. Step 3 • Get the critical value • from the table (back of Howell) • Use alpha and df to look it up • Critical value: the value of your statistic at which p = alpha (the edge of the rejection region)

  7. Step 4 • Compare your stat to the crit value: • Ignore any minuses (look only at the value) • If your calculated stat is more than the crit value, then p < alpha (i.e. significance!) • The test is significant if the calculated value is greater than the crit value • Reject the Ho, and accept the Ha. • Pretty easy!

  8. Example • Let's use an r value: • We get r = 0.61 with df = 10, alpha = 0.05 • Is this significant? • Critical value: use df and alpha on table D2 in Howell (significant values of the correlation coefficient) • For alpha = 0.05 and df = 10, crit value = 0.576

  9. Example • Now we have the calculated value and crit value • Calculated = 0.61 • Critical = 0.576 • Check: • if calculated > critical, reject Ho • 0.61 > 0.576, so we reject Ho • The result is statistically significant!
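The four-step table method can be sketched in a few lines of Python. This is only an illustration: the function name `is_significant` is made up here, and the critical value is still looked up by hand in the table (the code does not compute it).

```python
def is_significant(calculated, critical):
    """Table-based decision rule: reject Ho when the calculated
    statistic (ignoring its sign) is greater than the critical value."""
    return abs(calculated) > critical


# Values from the example: r = 0.61 with df = 10 and alpha = 0.05.
r_calculated = 0.61
r_critical = 0.576  # read from the table by hand, not computed here

reject_h0 = is_significant(r_calculated, r_critical)
print("Reject Ho" if reject_h0 else "Do not reject Ho")  # prints "Reject Ho"
```

Taking the absolute value mirrors the "ignore any minuses" instruction in Step 4.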

  10. Return to χ² • Note: χ² only works with discrete data • What is the point of χ²? • Goodness-of-fit: used to see if data matches a hypothetical distribution • Are there the same number of men as women? • Are about 25% of South Africans unemployed? • Contingency table analysis (independence test): used as a correlation for discrete data (are the variables related?)

  11. Goodness-of-fit χ² • Used to test a model distribution of data • Have an idea of how data should be distributed • e.g. There should be 60% brunettes, 40% blondes • Collect data, check to see if our idea (model) is supported by the data • Does the data fit the model? • Before starting a goodness-of-fit test, always be sure of what the model is

  12. Creating a model • We put our expectations as percentages on a table • One cell of the table for each possible value of the variable • Each cell has the percentage of observations we expect

  13. Example model • We expect 40% blondes, 60% brunettes, so:
       Blondes   Brunettes
       40%       60%

  14. Observed scores and Expected scores • Strategy: Want to see if our observation matches our model • We collect some data (Observed scores) • We work out what the data would look like if our model were correct (Expected scores) • Compare the two: do the observed scores show the same pattern as the expected scores?

  15. Converting the model to expected scores • We have our model as percentages • We must now convert % to actual values (frequencies), using n (the number of observations) • If we collected 134 observations, then:
                  Blondes                  Brunettes
       Model      40%                      60%
       Expected   (40/100) x 134 = 53.6    (60/100) x 134 = 80.4

  16. Converting % to frequency • To do this: • (percentage / 100) x n • Keep the decimals! • You cannot work with % for χ²: you must have frequencies (number of observations)
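As a rough sketch of the conversion, the following Python applies (percentage / 100) x n to every cell of a model. The function name `expected_frequencies` and the check that the percentages add up to 100 are additions made for illustration, not part of the slides.

```python
def expected_frequencies(model_percent, n):
    """Convert a model given in percentages into expected frequencies.

    model_percent maps each category to its expected percentage;
    n is the number of observations collected.
    """
    # Assumption: a complete model should account for 100% of observations.
    if abs(sum(model_percent.values()) - 100) > 1e-9:
        raise ValueError("model percentages should add up to 100")
    # Keep the decimals: expected frequencies need not be whole numbers.
    return {category: (pct / 100) * n for category, pct in model_percent.items()}


# The hair-colour model (40% blondes, 60% brunettes) with 134 observations.
model = {"Blondes": 40, "Brunettes": 60}
expected = expected_frequencies(model, 134)
for category, value in expected.items():
    print(f"{category}: {value:.1f}")
```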

  17. Beginning the χ² analysis • To begin, need Ho • For χ², it is always “observed data = expected data” • Need to state the model (in %) • Collect the data • Create an expected freq table (using your model and n) • Calculate χ² to see if the observed = expected

  18. χ² formula • $\chi^2 = \sum \frac{(O - E)^2}{E}$ • O = observed score • E = expected score

  19. χ² formula, step by step • Step 1: for each category, take that category’s O minus its E • Step 2: for each category, square the step 1 result • Step 3: for each category, take the step 2 result and divide it by that category’s E • Step 4: sum all the step 3 results

  20. Table method for χ² • Use the following columns, and add up the last one:
       O     E     O-E     (O-E)²     (O-E)²/E
                                      (add up here)
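The table method maps directly onto a loop: one row per category, one column per step. The sketch below is a minimal version under that reading; the function name `chi_square_table` is hypothetical, and the observed and expected counts are made-up numbers chosen only to keep the arithmetic easy to follow.

```python
def chi_square_table(observed, expected):
    """Build the O, E, O-E, (O-E)^2, (O-E)^2/E table and add up the last column."""
    if len(observed) != len(expected):
        raise ValueError("each O needs a matching E")
    rows = []
    for o, e in zip(observed, expected):
        diff = o - e                # Step 1: O minus E
        squared = diff ** 2         # Step 2: square the difference
        contribution = squared / e  # Step 3: divide by E
        rows.append((o, e, diff, squared, contribution))
    total = sum(row[4] for row in rows)  # Step 4: sum the last column
    return rows, total


# Made-up illustrative counts (not from the slides): 2 categories, 50 observations.
rows, chi_square = chi_square_table(observed=[30, 20], expected=[25, 25])
for row in rows:
    print(row)
print("chi-square =", chi_square)
```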

  21. Degrees of freedom (df) • The df for goodness-of-fit tests is easy to calculate: • df = k-1 • k is the number of possible values for your variable (categories) • using males and females k = 2 • using coke, pepsi, sprite k = 3 • using easy, moderate, hard, awesome k =4

  22. Worked example 1 • We suspect that there is a 50%/50% gender distribution at UCT. We observed 147 people, 68 male, 79 female. Do we really have a 50%/50% distribution? • Set up (step 1) • Ho: Distribution is 50%/50% • Ha: Distribution is not 50%/50% • alpha = 0.05

  23. Example: work out expected scores • (What would we have seen if Ho were true?) • Model: • Males 50% • Females 50% • Convert to scores • n = 147 • Males expected: (50/100) x 147 = 73.5 • Females expected: (50 / 100) x 147 = 73.5

  24. Example: O and E values • Now we have our values
       Value     O     E      O-E    (O-E)²   (O-E)²/E
       Male      68    73.5
       Female    79    73.5

  25. Example - Work out the columns
       Value     O     E      O-E    (O-E)²   (O-E)²/E
       Male      68    73.5   -5.5   30.25    0.411
       Female    79    73.5    5.5   30.25    0.411

  26. Example - Add up the values in the last column
       Value     O     E      O-E    (O-E)²   (O-E)²/E
       Male      68    73.5   -5.5   30.25    0.411
       Female    79    73.5    5.5   30.25    0.411
       Total                                  0.823

  27. Example - df • Now we have our χ² value: 0.823 • Is it statistically significant? (does the model explain the population?) • Need the critical value for this! • Degrees of freedom: k-1 • 2 categories (male, female) • so df = 1

  28. Example: critical value • What is the critical value for our male/female example? • df: k = 2 (male and female), so df = 1 • For df = 1 and alpha = 0.05, the table says: • crit = 3.84 • To be significant, our value must be more than 3.84

  29. Example: conclusions • Calculated < critical • (0.823 < 3.84), so we do not reject Ho • (this means: we keep the model “distribution is 50%/50%”) • Conclusion: it seems that at UCT there are as many males as there are females.
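Worked example 1 can be reproduced end to end as a check on the hand calculation. This is a sketch, not a general-purpose routine: the function name `goodness_of_fit` is invented for illustration, and the critical value 3.84 is typed in from the table rather than computed.

```python
def goodness_of_fit(observed, model_percent):
    """Return (chi-square, df) for observed counts against a model in percentages."""
    n = sum(observed)
    # Expected frequency for each category: (percentage / 100) x n
    expected = [(pct / 100) * n for pct in model_percent]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # df = k - 1
    return chi_square, df


# 147 people observed: 68 male, 79 female; the model is 50%/50%.
chi_square, df = goodness_of_fit(observed=[68, 79], model_percent=[50, 50])

critical = 3.84  # from the table for df = 1, alpha = 0.05 (looked up by hand)
significant = chi_square > critical

print(f"chi-square = {chi_square:.3f}, df = {df}")
print("Reject Ho" if significant else "Do not reject Ho")
```

The computed statistic agrees with the 0.823 obtained by hand, and because it is below 3.84 the 50%/50% model is kept.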

  30. Interpreting χ² findings • χ² findings are interpreted a little differently • Rejecting Ho (significance) means we cannot accept the model (the model is wrong for this population) • Not rejecting Ho (non-significance) means we go on assuming that the model applies to this population • This is the case for goodness-of-fit tests

  31. Contingency table analysis with χ² • Pearson’s product-moment correlation allowed us to establish a relationship between 2 continuous variables • It doesn’t work for discrete data (categories) • e.g. “is there a relationship between gender and owning a dog or cat?” (2 discrete variables) • Contingency table analysis is used for this • It can work with nominal variables

  32. Something old, something new • Quite similar to goodness-of-fit tests • Work out the expected values • Use the chi square formula • Work out df • get a critical value from the table • Differences: • Slightly different O table • New way of working out expected values • New way of working out df

  33. Observed values • For each person, we ask 2 questions (2 vars) • “are you male/female” and “do you have a dog or a cat” (let’s assume we sample only pet owners) • We end up with:
       Subject   Gender   Pet
       1         M        D
       2         M        C
       3         F        D
       etc.

  34. O table • We need to convert those data into a frequency table that looks like:
                        GENDER
                    Male   Female
       PET   Dog
             Cat

  35. Filling in the O table • Each cell has only one number in it • number of people fitting that condition
                        GENDER
                    Male   Female
       PET   Dog    1      2
             Cat    3      4
     • In cell 1: number of people who are Male AND have a dog • In cell 2: number of people who are Female AND have a dog • In cell 3: number of people who are Male AND have a cat • etc.

  36. The finished O table • An O table usually looks like:
                        GENDER
                    Male   Female
       PET   Dog    36     34
             Cat    7      32
     • We had 7 males with cats • We had 34 females with dogs • This table is a 2x2 table: 2 rows (pet) and 2 columns (gender)

  37. Notes about O tables • The numbers inside the cells are frequencies (just like goodness-of-fit) • You can have as many levels of a variable as you like • eg. dog, cat, parakeet, moose, hamster, other (6 levels) • BUT you can only have 2 variables • eg. not gender, pet AND car type

  38. E values • Expected values are a bit more tricky • We want to finish with an E table, of the same form as the O table • Need to calculate a value for each cell; we will use the O values to do this
       Expected   Male   Female
       Dog
       Cat

  39. E values, step by step • Step 1: work out the grand total from the O table (N) • Step 2: work out the marginal totals from the O table • Step 3: use a formula (RiCj/N) to get a value for each cell of the E table

  40. Step 1: Grand total (N) • How many people did we use? • Same idea as the usual n • called capital N (for some reason) • To calculate: Add up all the numbers in each of the cells • So in the gender/pet example: N is • 36+34+7+32 = 109 • N = 109

  41. Step 2: Marginal totals • We can work out the total of the margins of the O table • The marginal totals are written on the edges of the O table
       O        Male   Female   Total
       Dog      36     34       70
       Cat      7      32       39
       Total    43     66

  42. Step 2: Calculating marginals • For each marginal, add up the numbers in that line • Do the rows AND the columns!
       O        Male         Female        Total
       Dog      36           34            36+34 = 70
       Cat      7            32            7+32 = 39
       Total    36+7 = 43    34+32 = 66
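Steps 1 and 2 (grand total and marginal totals) amount to summing the O table in three ways. A small sketch, assuming the O table is stored as a list of rows with Dog first and Cat second, and Male before Female within each row:

```python
# O table from the gender/pet example: rows are pets, columns are genders.
observed = [
    [36, 34],  # Dog: Male, Female
    [7, 32],   # Cat: Male, Female
]

# Row marginals: add up the numbers along each row.
row_totals = [sum(row) for row in observed]

# Column marginals: add up the numbers down each column.
col_totals = [sum(col) for col in zip(*observed)]

# Grand total N: add up all the cells.
grand_total = sum(row_totals)

print("Row totals:", row_totals)
print("Column totals:", col_totals)
print("N =", grand_total)
```

A quick sanity check is that the row totals and the column totals both add up to N.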

  43. Step 3: Work out E table • Write your marginals around your blank E table, in the right places! • We will now use the marginals to compute one E value for each cell • The formula for E: $E = \frac{R_i \times C_j}{N}$
       E        Male   Female   Total
       Dog                      70
       Cat                      39
       Total    43     66

  44. Step 3: Work out a single cell • For each cell, look at the cell’s row and column marginal (Ri and Cj) • For Male/Dog: Ri = 70, Cj = 43 • The formula for E: $E = \frac{R_i \times C_j}{N} = \frac{70 \times 43}{109} = 27.614$ • Do the same for each cell
       E        Male     Female   Total
       Dog      27.614            70
       Cat                        39
       Total    43       66
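Applying E = Ri x Cj / N to every cell is a natural fit for a nested loop. The sketch below assumes the same list-of-rows layout for the O table; `expected_table` is a name chosen here for illustration.

```python
def expected_table(observed):
    """Compute E = (row total x column total) / N for every cell of an O table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# O table from the gender/pet example (rows: Dog, Cat; columns: Male, Female).
observed = [[36, 34], [7, 32]]
expected = expected_table(observed)

for label, row in zip(["Dog", "Cat"], expected):
    print(label, [round(value, 3) for value in row])
```

Rounded output can differ from the slide values in the last decimal place, since the slides appear to cut the numbers off at three decimals rather than round them.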

  45. Ready to calculate χ² • Now we have O and E, ready to calculate χ² (using the same formula as before)
       O        Male   Female        E        Male     Female
       Dog      36     34            Dog      27.614   42.385
       Cat      7      32            Cat      15.385   23.614

  46. Calculate χ² • This is almost the same as for goodness-of-fit, but be careful in building your table (the O and the E columns)
       O     E        O-E      (O-E)²    (O-E)²/E
       36    27.614
       34    42.385
       7     15.385
       32    23.614

  47. Matching up the O and E columns • Be careful!! Each type of response has an O and an E - match up the correct ones! • Male/Cat has O = 7 and E = 15.385 • Female/Dog has O=34 and E = 42.385 • If you get the wrong E for an O, all your results are wrong!! Do it slowly.

  48. Working out the table • Step 1: O-E (go row by row, slowly)
       O     E        O-E      (O-E)²    (O-E)²/E
       36    27.614    8.385
       34    42.385   -8.385
       7     15.385   -8.385
       32    23.614    8.385

  49. Working out the table • Step 2: square the differences
       O     E        O-E      (O-E)²    (O-E)²/E
       36    27.614    8.385   70.3136
       34    42.385   -8.385   70.3136
       7     15.385   -8.385   70.3136
       32    23.614    8.385   70.3136

  50. Working out the table • Step 3: divide the squares by E
       O     E        O-E      (O-E)²    (O-E)²/E
       36    27.614    8.385   70.3136   2.546
       34    42.385   -8.385   70.3136   1.658
       7     15.385   -8.385   70.3136   4.57
       32    23.614    8.385   70.3136   2.977
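The whole table for the gender/pet example can be rebuilt in code, which also guarantees that each O is matched with the right E. This is a sketch under the same list-of-rows assumption as before; the variable names are illustrative, and the last step simply adds up the final column as the χ² formula requires.

```python
# O table (rows: Dog, Cat; columns: Male, Female).
observed = [[36, 34], [7, 32]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

table = []
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # matching E for this O
        diff = o - e                           # Step 1: O - E
        squared = diff ** 2                    # Step 2: square the difference
        contribution = squared / e             # Step 3: divide by E
        table.append((o, e, diff, squared, contribution))

chi_square = sum(row[4] for row in table)      # add up the last column

for o, e, diff, squared, contribution in table:
    print(f"{o:>3} {e:8.3f} {diff:8.3f} {squared:9.4f} {contribution:7.3f}")
print(f"chi-square = {chi_square:.3f}")
```

Because the code keeps full precision, a printed value may differ from the hand-worked table by one unit in the last decimal place.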
