
Introduction to Hypothesis Testing

Introduction to Hypothesis Testing. Raymond J. Carroll Department of Statistics Faculty of Nutrition Texas A&M University http://stat.tamu.edu/~carroll. Outline. Series of Examples Data Collection for Examples. Example #1.


Presentation Transcript


  1. Introduction to Hypothesis Testing Raymond J. Carroll Department of Statistics Faculty of Nutrition Texas A&M University http://stat.tamu.edu/~carroll

  2. Outline • Series of Examples • Data Collection for Examples

  3. Example #1 • My Hypothesis: Texas A&M students simply guess when they are asked whether they are drinking diet Pepsi or diet Coke • The Experiment: Blind taste test. You are asked which of the cups you drink contains diet Coke • Our Goal: Test this hypothesis, using statistical principles and probability statements

  4. A Warning • Yes or No: No statistician will ever answer a question “yes” or “no” • Probabilities: We always say things like “the chance is less than 5% that your hypothesis is correct”

  5. Example #1 • Data Model: The data model is • Normal? • Gamma? • Binomial? • Poisson?

  6. Example #1 • Data Model: The data model is • Normal? • Gamma? • Binomial? Because each outcome is yes or no, success or failure • Poisson?

  7. Example #1 • My Hypothesis in terms of population parameters: I have claimed that you can do no better than guess • Each of you is a Binomial(1,π) or Binomial(1,p) • When I say you are guessing, what am I saying about the population?

  8. Example #1 • My Hypothesis in terms of population parameters: I have claimed that you can do no better than guess • Each of you is a Binomial(1,π) or Binomial(1,p) • When I say you are guessing, what am I saying about the population? • That the proportion of successes is π = p = ½
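As a rough illustration of how the guessing hypothesis p = ½ could be checked, the sketch below computes an exact two-sided binomial p-value. The function name `binomial_two_sided_p_value` and the class result (14 correct out of 20) are assumptions made up for this example, not data or code from the lecture.

```python
from math import comb

def binomial_two_sided_p_value(successes: int, n: int, p0: float = 0.5) -> float:
    """Exact two-sided p-value for H0: p = p0 in n independent yes/no trials."""
    # Probability of every possible number of successes if H0 is true.
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Add up all outcomes that are no more likely than the one observed
    # (small tolerance guards against floating-point ties).
    return sum(prob for prob in pmf if prob <= observed * (1 + 1e-9))

# Hypothetical class result: 14 of 20 students identify diet Coke correctly.
p_value = binomial_two_sided_p_value(14, 20)
print(f"p-value under pure guessing: {p_value:.4f}")
```

A small p-value would mean that a result this lopsided is unlikely if everyone is guessing; a large one means the data are compatible with guessing.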

  9. Example #2 • My Hypotheses: Keebler used to advertise • 17 chocolate chips per cookie • More chocolate chips than another brand • The Experiment: Get a cookie of each type, count the number of chips, criticize the experiment • Our Goal: Test these hypotheses, using statistical principles and probability statements

  10. Example #2 • Data Model: The data model is • Normal? • Gamma? • Binomial? • Poisson?

  11. Example #2 • Data Model: The data model is • Normal? • Gamma? • Binomial? • Poisson? • It could be Poisson or normal. Poisson is the better choice, because it is a count • We’ll use the central limit theorem to make inferences

  12. Example #2 • My Hypothesis in terms of population parameters: Keebler has claimed that it gives you 17 chips per cookie, on average • Each of you is a Poisson with mean λ • When I say Keebler is correct, what am I saying about the population?

  13. Example #2 • My Hypothesis in terms of population parameters: Keebler has claimed that it gives you 17 chips per cookie, on average • Each of you is a Poisson with mean λ • When I say Keebler is correct, what am I saying about the population? • That the population mean number of chips is λ = 17
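One way to examine the claim that the mean is 17 is to build a central-limit-theorem interval for the population mean and see whether 17 falls inside it. This is only a sketch: the chip counts are invented, `mean_confidence_interval` is a hypothetical helper, and with so few cookies a t multiplier would be the more cautious choice than the normal one used here.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation (CLT) confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * stdev(data) / sqrt(len(data))
    centre = mean(data)
    return centre - half_width, centre + half_width

# Hypothetical chip counts from 12 cookies (made-up numbers, not class data).
chip_counts = [15, 18, 14, 17, 16, 19, 13, 15, 16, 14, 17, 15]
low, high = mean_confidence_interval(chip_counts)
claim_is_plausible = low <= 17 <= high
print(f"95% interval: ({low:.2f}, {high:.2f}); contains 17: {claim_is_plausible}")
```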

  14. Example #3 • My Hypotheses: The percentage of regular M&M’s that are green is the same as the percentage of peanut M&M’s that are green • The Experiment: Compute the percentage of green M&M’s in each bag • Our Goal: Test these hypotheses, using statistical principles and probability statements

  15. Example #3 • Data Model: The data model is • Normal? • Gamma? • Binomial? • Poisson?

  16. Example #3 • Data Model: The data model is • Normal? • Gamma? • Binomial? • Poisson? • Roughly normal, since each data point is a percentage • We’ll use the central limit theorem to make inferences

  17. Example #3 • My Hypothesis in terms of population parameters: The %-green M&M’s does not depend on the type of M&M’s • What am I saying about the two populations?

  18. Example #3 • My Hypothesis in terms of population parameters: The %-green M&M’s does not depend on the type of M&M’s • What am I saying about the two populations? • That they have the same population mean.
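The "same population mean" hypothesis can be examined by looking at the difference between the two sample means. The sketch below, using invented percent-green values for a few bags of each type and a hypothetical helper `difference_interval`, builds a normal-approximation interval for that difference and checks whether it contains 0.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def difference_interval(sample_a, sample_b, confidence=0.95):
    """CLT interval for (mean of population A) minus (mean of population B)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = mean(sample_a) - mean(sample_b)
    # Standard error of a difference between two independent sample means.
    se = sqrt(variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b))
    return diff - z * se, diff + z * se

# Hypothetical percent-green values, one per bag (not real class data).
plain_pct_green = [16.0, 12.5, 18.2, 14.9, 15.5, 13.8, 17.1, 15.0]
peanut_pct_green = [14.2, 17.5, 13.0, 16.8, 15.9, 12.7, 18.0, 14.5]

low, high = difference_interval(plain_pct_green, peanut_pct_green)
same_mean_is_plausible = low <= 0 <= high
print(f"95% interval for the difference: ({low:.2f}, {high:.2f})")
```

If 0 lies inside the interval, the data are compatible with the two types having the same percentage of green M&M's.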

  19. Example #4 • My Hypotheses: Women who keep track of their diet by diaries or PDA do not lower their caloric intake in a 6-day period • The Experiment: The WISH Study at the National Cancer Institute, with 400 women • The data appear to contradict my hypothesis

  20. Typical (Median) Values of Reported Caloric Intake Over 6 Diary Days: WISH Study • A major point of STAT211 is to prepare you to answer the question of whether these data, which look convincing, really are convincing in terms of probability statements.

  21. Example #4 • Data Model: The data model is • Normal? • Gamma? • Binomial? • Poisson?

  22. Example #4 • Data Model: The data model is • Normal? • Gamma? • Binomial? • Poisson? • Lognormal, so most people take logarithms of caloric intake and analyze them as normal

  23. Example #4 • Data Model: The data that we use is the difference between Day 1 and Day 6, i.e., Day 1 – Day 6

  24. Example #4 • My Hypothesis in terms of population parameters: • What am I saying about the population, when I claim that writing down diets will not lead to a change in reported caloric intake?

  25. Example #4 • My Hypothesis in terms of population parameters: • What am I saying about the population, when I claim that writing down diets will not lead to a change in reported caloric intake? • That the population mean difference between Day 1 and Day 6 = 0
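Because each woman contributes both a Day 1 and a Day 6 value, the analysis works on one difference per woman. A minimal sketch with invented numbers (these are not WISH data, and the variable names are assumptions) computes the differences and a 99% normal-approximation interval for their population mean:

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical reported calories for 8 women (made up for illustration).
day1 = [2100, 1950, 2300, 1800, 2050, 2250, 1900, 2150]
day6 = [1900, 1850, 2000, 1750, 1800, 2100, 1850, 1900]

# One paired difference per woman: Day 1 minus Day 6.
differences = [first - last for first, last in zip(day1, day6)]

z = NormalDist().inv_cdf(0.995)  # two-sided 99% confidence
half_width = z * stdev(differences) / sqrt(len(differences))
low = mean(differences) - half_width
high = mean(differences) + half_width

# "No change" corresponds to a population mean difference of 0.
rejects_no_change = low > 0 or high < 0
print(f"99% interval: ({low:.1f}, {high:.1f}); excludes 0: {rejects_no_change}")
```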

  26. Some Final Comments • Formulating statistical hypothesis testing is really intuitive • Don’t let the formulae obscure the fact that all we are doing is • Asking questions about population parameters • Constructing confidence intervals for population parameters • Using these confidence intervals to answer the question

  27. The WISH Data • I computed a 99% confidence interval for the population mean change in the WISH data. • This interval was entirely above 0, and ranged roughly from 75 to 375 • In other words, with 99% confidence, the mean reported intake on Day 1 was between 75 and 375 calories higher than on Day 6. • Is the hypothesis true?
