Statistic for the day: Portion of all international arms sales since 1980 that went to the Middle East: 2 out of 5.
Source: US Arms Control and Disarmament Agency
Assignment: Read Chapter 12, pp. 195-203. Do the exercises at the end of this lecture. (Answers will be given in Fri. lecture.)
These slides were created by Tom Hettmansperger and in some cases modified by David Hunter.
Arby’s: serving size and calories

 #   Sandwich                     Serving (oz)   Calories
 1   reg roast beef                    5.5          383
 2   beef and cheddar                  6.9          508
 3   junior roast beef                 3.1          233
 4   super roast beef                  9.0          552
 5   giant roast beef                  8.5          544
 6   chicken breast fillet             7.2          445
 7   grilled chicken deluxe            8.1          430
 8   French dip                        6.9          467
 9   Italian sub                      10.1          660
10   roast beef sub                   10.8          672
11   turkey sub                        9.7          533
12   light roast beef deluxe           6.4          294
13   light roast turkey deluxe         6.8          260
14   light roast chicken deluxe        6.8          276
Research Question: At Arby’s, are calories related to the size of the sandwich?
• Observational study
• Response = calories
• Explanatory variable = small or large sandwich
• Small sandwich means less than 7 oz (n = 7)
• Large sandwich means more than 7 oz (n = 7)
Observational study
• Response = calories
• Explanatory variable = small or large sandwich
THE RESPONSE VARIABLE IS A MEASUREMENT VARIABLE.
THE EXPLANATORY VARIABLE IS A CATEGORICAL VARIABLE.
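As a rough first look at the small-versus-large comparison, the sketch below splits the 14 sandwiches at 7 oz and compares the mean calories in each group. Comparing means is an illustrative choice, not necessarily the summary shown on the original slide, and the variable names are made up for this example.

```python
from statistics import mean

# (serving size in oz, calories) for the 14 Arby's sandwiches in the data table
sandwiches = [
    (5.5, 383), (6.9, 508), (3.1, 233), (9.0, 552), (8.5, 544),
    (7.2, 445), (8.1, 430), (6.9, 467), (10.1, 660), (10.8, 672),
    (9.7, 533), (6.4, 294), (6.8, 260), (6.8, 276),
]

# Categorical explanatory variable: small (< 7 oz) versus large (> 7 oz)
small = [cal for oz, cal in sandwiches if oz < 7]
large = [cal for oz, cal in sandwiches if oz > 7]

print("n small:", len(small), " n large:", len(large))
print("mean calories, small:", round(mean(small), 1))
print("mean calories, large:", round(mean(large), 1))
```

A gap between the two means suggests a relationship, but it does not by itself say whether the gap is statistically significant.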
There seems to be a difference. (Is it statistically significant?) We can refine the explanatory variable and get more information about the relationship between calories and serving (sandwich) size: rather than splitting it into small and large, keep the numerical values of the explanatory variable.
Observational study
• Response = calories
• Explanatory variable = size of the sandwich (in oz.)
BOTH RESPONSE AND EXPLANATORY VARIABLES ARE MEASUREMENT VARIABLES. (NEITHER IS A CATEGORICAL VARIABLE.)
Best fitting line through the data: called the REGRESSION LINE
Strength of relationship: measured by CORRELATION
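A minimal sketch of both quantities for the sandwich data follows. It assumes "best fitting" means the ordinary least-squares line, which is the usual definition but is not spelled out on the slide, and it uses only plain arithmetic so each step is visible.

```python
from math import sqrt

# Serving size (oz) and calories for the 14 sandwiches, in table order
sizes = [5.5, 6.9, 3.1, 9.0, 8.5, 7.2, 8.1, 6.9, 10.1, 10.8, 9.7, 6.4, 6.8, 6.8]
calories = [383, 508, 233, 552, 544, 445, 430, 467, 660, 672, 533, 294, 260, 276]

n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(calories) / n

# Sums of squared deviations and cross-products around the means
sxx = sum((x - mean_x) ** 2 for x in sizes)
syy = sum((y - mean_y) ** 2 for y in calories)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, calories))

slope = sxy / sxx                      # extra calories per extra ounce
intercept = mean_y - slope * mean_x    # the line passes through (mean_x, mean_y)
r = sxy / sqrt(sxx * syy)              # correlation, always between -1 and 1

print(f"regression line: calories = {intercept:.1f} + {slope:.1f} * ounces")
print(f"correlation: r = {r:.2f}")
```

A positive slope together with a correlation close to 1 would indicate that larger sandwiches tend to have more calories and that the points lie fairly close to the line.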
Can we have two categorical variables? Recall we split the explanatory variable at 7 oz. So we defined small as less than 7 oz and large to be greater than 7 oz. Next we split the response variable at 456 calories. Then we define low calories as less than 456 and high calories to be greater than 456.
Data: Response = calories; Explanatory = size
Response: calories; Explanatory: size

Counts
          Low   High   All
Small       5      2     7
Large       2      5     7
All         7      7    14

Proportions
          Low   High
Small     .71    .29
Large     .29    .71

Percentages
          Low   High
Small     71%    29%
Large     29%    71%

Question: Is the difference between 71% and 29% (about 43 percentage points) significant?
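The counts in the two-way table can be reproduced from the raw data by applying both splits (7 oz for size, 456 calories for the response). The sketch below does this; the names `SIZE_CUTOFF`, `CALORIE_CUTOFF`, and `counts` are illustrative only.

```python
# (serving size in oz, calories) for the 14 Arby's sandwiches
sandwiches = [
    (5.5, 383), (6.9, 508), (3.1, 233), (9.0, 552), (8.5, 544),
    (7.2, 445), (8.1, 430), (6.9, 467), (10.1, 660), (10.8, 672),
    (9.7, 533), (6.4, 294), (6.8, 260), (6.8, 276),
]

SIZE_CUTOFF = 7.0      # small: below 7 oz, large: above 7 oz
CALORIE_CUTOFF = 456   # low: below 456 calories, high: above 456

# No sandwich falls exactly on either cutoff, so "<" versus "otherwise" is safe here
counts = {
    ("small", "low"): 0, ("small", "high"): 0,
    ("large", "low"): 0, ("large", "high"): 0,
}
for oz, cal in sandwiches:
    size = "small" if oz < SIZE_CUTOFF else "large"
    level = "low" if cal < CALORIE_CUTOFF else "high"
    counts[(size, level)] += 1

for size in ("small", "large"):
    print(size, counts[(size, "low")], counts[(size, "high")])
```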
Two categorical variables:
Explanatory variable: Gender
Response variable: Body Pierced or Not
Survey question: Have you pierced any other part of your body? (Except for ears)
Research Question: Is there a significant difference between women and men in terms of body piercings?
Data: Response = Pierced?; Explanatory = Gender
From Stat 100.2, spring 2004 (missing responses omitted)
Percentages
62.22% = 84/135; 96.97% = 96/99

Response: body pierced?
            no     yes      All
female    62.22   37.78   100.00
male      96.97    3.03   100.00
All       76.92   23.08   100.00

Research question: Is there a significant difference between women and men? (i.e., between 62.22% and 96.97%)
Counts and percentages
Rows: gender   Columns: body pierced?
(top line of each row = count, bottom line = row percentage)

            no     yes      All
female      84      51      135
          62.22   37.78   100.00
-------------------------------
male        96       3       99
          96.97    3.03   100.00
-------------------------------
All        180      54      234
          76.92   23.08   100.00

Are the differences between females and males significant? (i.e., between 62.22% and 96.97%)
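Each row percentage is the cell count divided by its row total. A short sketch, with the counts taken from the survey table and the dictionary layout chosen only for illustration:

```python
# Observed counts: rows are gender, columns are the answer to "body pierced?"
counts = {
    "female": {"no": 84, "yes": 51},
    "male": {"no": 96, "yes": 3},
}

row_pct = {}
for gender, row in counts.items():
    total = sum(row.values())  # row total: 135 for female, 99 for male
    row_pct[gender] = {
        answer: round(100 * count / total, 2) for answer, count in row.items()
    }

for gender, pcts in row_pct.items():
    print(gender, pcts)
```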
The Debate: The research advocate claims that there is a significant difference. The skeptic claims there is no real difference. The data differences simply happen by chance.
The strategy for determining statistical significance:
• First, figure out what you expect to see if there is no difference between females and males.
• Second, figure out how far the data are from what is expected.
• Third, decide if the distance in the second step is large.
• Fourth, if it is large, claim there is a statistically significant difference.
Research Advocate: OK. Suppose there is really no difference in the population as you, the Skeptic, claim. We will compare what you, the Skeptic, expect to see and what you actually do see in the data.
Skeptic: How do we figure out what we expect to see?
Actual data:
Rows: gender   Columns: body pierces

            no     yes     All
female      84      51     135
male        96       3      99
All        180      54     234
Rows: gender   Columns: body pierces
(top line of each row = observed, bottom line = expected)

             no       yes       All
female       84        51       135
          103.85     31.15    135.00
male         96         3        99
           76.15     22.85     99.00
All         180        54       234
          180.00     54.00    234.00
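The expected counts are consistent with the usual rule for a table with no relationship: expected = (row total × column total) / grand total. The slide does not state this rule explicitly, so treat the sketch below as one way to reproduce the numbers rather than as the authors' own computation.

```python
# Observed counts: rows = (female, male), columns = (no, yes)
observed = [
    [84, 51],
    [96, 3],
]

row_totals = [sum(row) for row in observed]        # [135, 99]
col_totals = [sum(col) for col in zip(*observed)]  # [180, 54]
grand_total = sum(row_totals)                      # 234

# If gender made no difference, each row would split in the overall proportions
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for label, row in zip(("female", "male"), expected):
    print(label, [round(value, 2) for value in row])
```

For example, 180 of 234 students overall answered "no", so among 135 women the skeptic expects 135 × 180 / 234 of them to answer "no".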
How to measure the distance between what the research advocate observes in the table and what the skeptic expects:
Add up the following for each cell:

    (observed - expected)^2 / expected

The total is called the chi-squared statistic. For the body-piercing table it is 38.85.
Now how do we decide if 38.85 is large or not? If it is large enough, the skeptic concedes to the research advocate and agrees there is a statistically significant difference. How large is large enough?
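The distance described here can be checked numerically. The sketch below assumes the per-cell quantity is (observed − expected)² / expected, the standard chi-squared contribution, and computes the expected counts from the table margins; the function name `chi_squared` is hypothetical.

```python
def chi_squared(observed):
    """Sum of (observed - expected)^2 / expected over all cells of a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    total = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            total += (obs - exp) ** 2 / exp
    return total


# Body-piercing data: rows = (female, male), columns = (no, yes)
piercing = [[84, 51], [96, 3]]
chi2 = chi_squared(piercing)
print(round(chi2, 2))
```

Because the function works from the observed counts alone, it can be reused on any two-way table of counts.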
Chi-squared distribution with 1 degree of freedom: If the chi-squared statistic is larger than 3.84, it is declared large and the research advocate wins. Our chi-squared statistic is 38.85, so the research advocate easily wins! There is a statistically significant difference between men and women.
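The cutoff 3.84 matches the 5% critical value of the chi-squared distribution with 1 degree of freedom; the 5% level is an assumption here, since the slide gives only the number. With 1 degree of freedom the statistic behaves like the square of a standard normal variable, so the cutoff and a p-value can be sketched with the standard library alone:

```python
from math import erfc, sqrt
from statistics import NormalDist

chi2 = 38.85  # chi-squared statistic for the body-piercing table

# Assumed significance level of 5%: a 1-df chi-squared variable is Z^2,
# so its upper 5% point is the square of the two-sided normal cutoff (about 1.96).
z = NormalDist().inv_cdf(0.975)
critical = z ** 2

# P(chi-squared with 1 df > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2))
p_value = erfc(sqrt(chi2 / 2))

print("critical value:", round(critical, 2))
print("statistic exceeds critical value:", chi2 > critical)
print("p-value:", p_value)
```

This shortcut only applies to 1 degree of freedom, which is the case for any table with two rows and two columns.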
Exercise: Follow the 4 steps and answer the research question: Is there a relationship between gender and ownership of cell phones in Stat 100.2?

Data
Rows: gender   Columns: cell phone

            no     yes     All
female      12     124     136
male        14      87     101
All         26     211     237
Exercise: Follow the 4 steps and answer the research question: Is there a statistically significant difference in calories between small and large sandwiches? Data on slide #12.