
Stat 31, Section 1, Last Time


mathewg


  1. Stat 31, Section 1, Last Time • Sampling • Experiments • Design • Controls • Randomization • Blind & Double Blind • Pepsi Challenge

  2. Midterm I Coming up: Tuesday, Feb. 15 Material: HW Assignments 1 – 4 Extra Office Hours: Mon. Feb. 14, 8:30 – 12:00, 2:00 – 3:30 (Instead of Review Session) Bring Along: 1 8.5” x 11” sheet of paper with formulas

  3. Sec. 3.4: Basics of “Inference” Idea: Build foundation for statistical inference, i.e. quantitative analysis (of uncertainty and variability) Fundamental Concepts: Population described by parameters e.g. mean μ, SD σ. Unknown, but can get information from…

  4. Fundamental Concepts Last page: Population, here: Sample (usually random), described by corresponding “statistics” e.g. mean x̄, SD s. (Will become important to keep these apart)

  5. Population vs. Sample E.g. 1: Political Polls • Population is “all voters” • Parameter of interest is: p = % in population for A (bigger than 50% or not?) • Sample is “voters asked by pollsters” • Statistic is p̂ = % in sample for A (careful to keep these straight!)

  6. Population vs. Sample E.g. 1: Political Polls • Notes • p̂ is an “estimate” of p • Variability is critical • Will construct models of variability • Possible when sample is random • Recall random sampling also reduces bias

  7. Population vs. Sample E.g. 2: Measurement Error (seemingly quite different…) • Population is “all possible measurem’ts” (a thought experiment only) • Parameters of interest are: μ = population mean, σ = population SD

  8. Population vs. Sample E.g. 2: Measurement Error • Sample is “measurem’ts actually made” • Statistics are: x̄ = mean of measurements, s = SD of measurements

  9. Population vs. Sample E.g. 2: Measurement Error • Notes: • x̄ estimates μ • s estimates σ • Again will model variability • “Randomness” is just a model for measurement error
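
The measurement-error example can be sketched in a few lines of Python. This is only an illustration, not part of the course materials: the "population of all possible measurements" is modeled as a Normal distribution, and the values `MU` and `SIGMA`, the sample size, and the function name `take_measurements` are all assumptions made up for the sketch.

```python
import random
import statistics

# Assumed (hypothetical) population parameters: in practice these are unknown.
MU = 10.0      # population mean of all possible measurements
SIGMA = 0.5    # population SD of all possible measurements


def take_measurements(n, seed=0):
    """Return n simulated measurements drawn from the Normal model."""
    rng = random.Random(seed)
    return [rng.gauss(MU, SIGMA) for _ in range(n)]


# The sample is the set of measurements actually made.
sample = take_measurements(n=25)

# Statistics computed from the sample estimate the population parameters.
x_bar = statistics.mean(sample)   # estimates MU
s = statistics.stdev(sample)      # estimates SIGMA (sample SD, n - 1 divisor)

print(f"x-bar = {x_bar:.3f}  (estimates mu = {MU})")
print(f"s     = {s:.3f}  (estimates sigma = {SIGMA})")
```

Changing `seed` gives a different sample and therefore different values of the statistics, while the parameters stay fixed. That difference is the reason to keep parameters and statistics apart.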

  10. Population vs. Sample HW: 3.59 3.61

  11. Basic Mathematical Model Sampling Distribution Idea: Model for “possible values” of statistic E.g. 1: Distribution of p̂ in “repeated samplings” (thought experiment only) E.g. 2: Distribution of x̄ in “repeated samplings” (again thought experiment)

  12. Basic Mathematical Model Sampling Distribution Tools Can study these with: • Histograms → “shape”: often Normal • Mean → Gives measure of “bias” • SD → Gives measure of “variation”
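
The "repeated samplings" thought experiment can be acted out by simulation. The sketch below is a hedged illustration for the political poll example: the population proportion `P_TRUE`, the poll size `N`, the number of repetitions `REPS`, and the function names are assumptions chosen for the example, not values from the course.

```python
import random
import statistics

P_TRUE = 0.55   # assumed population proportion for candidate A (hypothetical)
N = 100         # voters asked in each poll (hypothetical)
REPS = 2000     # number of repeated polls in the thought experiment


def one_poll(rng, p, n):
    """Ask n randomly chosen voters; return the sample proportion for A."""
    votes_for_a = sum(1 for _ in range(n) if rng.random() < p)
    return votes_for_a / n


def sampling_distribution(p, n, reps, seed=0):
    """Repeat the poll reps times and collect the sample proportions."""
    rng = random.Random(seed)
    return [one_poll(rng, p, n) for _ in range(reps)]


p_hats = sampling_distribution(P_TRUE, N, REPS)

center = statistics.mean(p_hats)     # mean of the sampling distribution
spread = statistics.pstdev(p_hats)   # SD of the sampling distribution
bias = center - P_TRUE               # near 0 for a random sample

print(f"mean of p-hat = {center:.4f}, bias = {bias:+.4f}")
print(f"SD of p-hat   = {spread:.4f}")
```

A histogram of `p_hats` would show the shape of the sampling distribution. The mean and SD are the numerical summaries of bias and variation.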

  13. Bias and Variation Graphical Illustration Scanned from text: Fig. 3.9

  14. Bias and Variation Class Example: Results from previous class on “Estimate % of males at UNC” https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg16.xls Recall several approaches to estimation (3 bad, one sensible)

  15. E.g. % Males at UNC At top: • Counts • Corresponding proportions (on [0,1] scale) • Bin Grid (for histograms on [0,1] numbers) Next Part: • Summarize mean of each • Summarize SD (spread) of each Histograms (appear next)

  16. E.g. % Males at UNC Recall 4 ways to collect data: Q1: Sample from class Q2: Stand at door and tally • Q1 “less spread and to left”? Q3: Make up names in head • Q3 “more to right”? Q4: Random Sample • Supposed to be best, can we see it?

  17. E.g. % Males at UNC Better comparison: Q4 vs. each other one Use “interleaved histograms” Q1 & Q4: • Q1 has smaller center: • i.e. “biased”, since Class ≠ Population • And less spread: • since “drawn from smaller pool”

  18. E.g. % Males at UNC Q2 & Q4: • Centers have Q2 bigger: • Reflects bias in door choice • And Q2 is “more spread” : • Reflects “spread in doors chosen” + “sampling spread”

  19. E.g. % Males at UNC Q3 & Q4: • Center for Q3 is bigger: • Reflects “more people think of males”? • And Q3 is “more spread” : • Reflects “more variation in human choice”

  20. E.g. % Males at UNC A look under the hood: • Highlight an interleaved Chart • Click Chart Wizard • Note Bar (and interleaved subtype) • Different colors are in “series” • Computed earlier on left • Using Tools → Data Anal. → Histo’m

  21. E.g. % Males at UNC Interesting question: What is “natural variation”? Will model this soon. This is “binomial” part of this example, which we will study later.

  22. Bias and Variation HW: 3.62 (Hi bias – hi var, lo bias – lo var, lo bias – hi var, hi bias – lo var) 3.65

  23. Chapter 4: Probability Goal: quantify (get numerical) uncertainty • Key to answering questions above (e.g. what is “natural variation” in a random sample?) (e.g. which effects are “significant”) Idea: Represent “how likely” something is by a number

  24. Simple Probability E.g. (will use for a while, since simplicity gives easy insights) Roll a die (6 sided cube, faces 1,2,…,6) • 1 of 6 faces is a “4” • So say “chances of a 4” are: “1 out of 6”, i.e. 1/6. • What does that number mean? • How do we find such for harder problems?

  25. Simple Probability A way to make this precise: “Frequentist Approach” In many replications (repeats of die roll), expect about 1/6 of total will be 4s Terminology (attach buzzwords to ideas): Think about “outcomes” (e.g. #s on die) from an “experiment” (e.g. roll die, observe #)
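
The frequentist idea can be checked by simulated die rolls. The following is a minimal sketch, assuming Python's pseudo-random `randint` is a fair stand-in for a real die; the function name `fraction_of_fours` and the roll counts are made up for the illustration.

```python
import random


def fraction_of_fours(n_rolls, seed=0):
    """Roll a simulated fair die n_rolls times; return the fraction of 4s."""
    rng = random.Random(seed)
    fours = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 4)
    return fours / n_rolls


# Experiment: roll die, observe #.  Outcome of interest: the # is 4.
for n in (60, 600, 6000, 60000):
    print(f"{n:>6} rolls: fraction of 4s = {fraction_of_fours(n):.4f}")

print(f"compare with 1/6 = {1 / 6:.4f}")
```

With few rolls the fraction can be well away from 1/6. With many rolls it should settle close to 1/6, which is the sense in which the frequentist approach gives meaning to the number.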

  26. Simple Probability Quantify “how likely” by assigning “probabilities” I.e. a number between 0 and 1, to each outcome, reflecting “how likely”: Intuition: • 0 means “can’t happen” • ½ means “happens half the time” • 1 means “must happen”

  27. Simple Probability HW: C10: Match one of the probabilities: 0, 0.01, 0.3, 0.6, 0.99, 1 with each statement about an event: • Impossible, can’t occur. • Certain, will happen on every trial. • Very unlikely, but will occur once in a long while. • Event will occur more often than not.

  28. Simple Probability Main Rule: Sum of all probabilities (i.e. over all outcomes) is 1. E.g. for die rolling: P{1} + P{2} + … + P{6} = 6 × (1/6) = 1
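
The main rule is easy to verify mechanically. Here is a small sketch using exact fractions, so that no rounding error gets in the way; the helper name `is_valid_assignment` is hypothetical and only serves the illustration.

```python
from fractions import Fraction

# Probability assignment for a fair die: each of the 6 faces gets 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}


def is_valid_assignment(probs):
    """Check that each probability is in [0, 1] and that they sum to 1."""
    in_range = all(0 <= p <= 1 for p in probs.values())
    return in_range and sum(probs.values()) == 1


print(sum(die.values()))          # total probability over all outcomes
print(is_valid_assignment(die))   # does the die assignment obey the rule?
```

An assignment such as 1/2 and 1/3 for two outcomes fails the check, because the probabilities total only 5/6.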

  29. Simple Probability HW: 4.13a 4.15

  30. Probability General Rules for assigning probabilities: i. Frequentist View (what happens in many repetitions?) ii. Equally Likely: for n outcomes P{one outcome} = 1/n (e.g. die rolling) iii. Based on Observed Frequencies e.g. life tables summarize when people die; gives “prob of dying” at a given age and “life expectancy”

  31. Probability General Rules for assigning probabilities: iv. Personal Choice: • Reflecting “your assessment” • E.g. Oddsmakers • Careful: requires some care HW: 4.16
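
Two of the rules above, "equally likely" and "based on observed frequencies", translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the age-group tallies are made-up numbers standing in for a life table, not real data.

```python
from collections import Counter
from fractions import Fraction


def equally_likely(outcomes):
    """Assign probability 1/n to each of n equally likely outcomes."""
    n = len(outcomes)
    return {outcome: Fraction(1, n) for outcome in outcomes}


def from_observed(observations):
    """Assign each outcome its observed relative frequency."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {outcome: Fraction(c, total) for outcome, c in counts.items()}


# Equally likely: die rolling.
die = equally_likely([1, 2, 3, 4, 5, 6])

# Observed frequencies: MADE-UP tallies of age at death for 100 people,
# for illustration only (not taken from any real life table).
observed = ["under 60"] * 15 + ["60 to 79"] * 40 + ["80 or over"] * 45
age_probs = from_observed(observed)

print(die[4])
for group, prob in age_probs.items():
    print(group, float(prob))
```

Both assignments obey the main rule that the probabilities over all outcomes sum to 1. The personal-choice rule has no such recipe, which is why it requires care.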
