
Chapter 15

Chapter 15. Inference in Practice. Effective use of inferential methods requires more than knowing the facts. It requires understanding the reasoning behind the process. z Procedures. If we know the standard deviation σ before the data are collected, the confidence interval for µ is:

asha

Presentation Transcript


  1. Chapter 15: Inference in Practice

  2. Effective use of inferential methods requires more than knowing the facts. It requires understanding the reasoning behind the process.

  3. z Procedures
     • If we know the standard deviation σ before the data are collected, the confidence interval for µ is: x̄ ± z*·σ/√n
     • To test H0: µ = µ0, we use this statistic: z = (x̄ − µ0) / (σ/√n)
     • These are called z procedures because they rely on critical values from the standard Normal distribution, Z ~ N(0, 1)
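The two z procedures on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the summary statistics are made up, the test is taken to be two-sided (the slide does not say which alternative is intended), and the standard Normal values come from `statistics.NormalDist` in the standard library.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # Z ~ N(0, 1)


def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Return (low, high) for the interval xbar ± z* · sigma / sqrt(n)."""
    # z* leaves (1 - confidence) / 2 in each tail of the standard Normal curve.
    z_star = STANDARD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def z_test(xbar, mu0, sigma, n):
    """Return the z statistic and a two-sided P-value for H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Made-up summary statistics, for illustration only.
    low, high = z_confidence_interval(xbar=103.0, sigma=15.0, n=25)
    z, p = z_test(xbar=103.0, mu0=100.0, sigma=15.0, n=25)
    print(f"95% CI: ({low:.2f}, {high:.2f})")
    print(f"z = {z:.2f}, two-sided P = {p:.3f}")
```

Both functions take σ as an input rather than estimating it from the data, which mirrors the requirement that σ be known before the data are collected.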

  4. Conditions for z Procedures
     • Data must resemble an SRS from the population. Ask: “Where did the data come from?”
     • Bad samples (see next slide) invalidate the methods
     • The population must be Normal, BUT a fact known as the Central Limit Theorem tells us the sampling distribution of the sample mean x̄ will be approximately Normal even if the population is not Normal, provided the sample is “large enough”
     • In practice, z procedures are robust in large samples
     • The population standard deviation σ must be known before the data are collected. Chapter 17 will introduce procedures that can be used when σ is not known

  5. Examples of Bad Samples
     • Convenience samples: selecting the members of the population that are easiest to reach
       Example: in a sample of mall shoppers, teenagers and retired people will be over-represented
     • Voluntary response samples: people who choose themselves by responding to a broad appeal
       Example: online polls are scientifically useless (people who take the trouble to respond are not representative of the larger population)
     • Undercoverage: some groups in the population are left out or under-represented
       Example: using a telephone listing to select subjects (not everyone has a listed phone number)
     • If the data do not come from an SRS or a randomized experiment, conclusions are open to challenge
     • Always ask where the data came from

  6. Normality Assumption and the Central Limit Theorem
     Normality can be assumed when n is large because of the Central Limit Theorem.
     • Sample size less than 15: “Normality” can be assumed if the data are symmetric, have a single peak, and have no outliers. If the data are highly skewed, avoid z [and t] procedures.
     • Sample size at least 15: Normality can be assumed unless the data are strongly skewed or have outliers.
     • Large samples (n > 30 to 60; roughly n ≥ 40): Normality can be assumed even for skewed distributions.
     9/23/2014 Inference about µ
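A small simulation can make the Central Limit Theorem claim concrete. The sketch below is an illustration, not part of the original slides: it assumes a right-skewed exponential population, and the helper names, the seed, and the number of repetitions are arbitrary choices. It measures how skewed the distribution of sample means is at several sample sizes.

```python
import random
from statistics import fmean, pstdev


def skewness(values):
    """Sample skewness: the average cubed z-score (near 0 for a symmetric shape)."""
    mean = fmean(values)
    sd = pstdev(values)
    return fmean([((v - mean) / sd) ** 3 for v in values])


def simulated_sample_means(n, reps, rng):
    """Draw `reps` samples of size n from a right-skewed exponential
    population and return the mean of each sample."""
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]


rng = random.Random(0)  # fixed seed so the run is repeatable
skew_by_n = {n: skewness(simulated_sample_means(n, 4000, rng)) for n in (2, 15, 40)}

if __name__ == "__main__":
    for n, skew in skew_by_n.items():
        print(f"n = {n:>2}: skewness of the sample means = {skew:.2f}")
```

The skewness of the sample means should shrink toward 0 as n grows, which is the sense in which Normality "can be assumed" for large samples: the claim is about the sampling distribution of the mean, not about the raw data.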

  7. What shape is the population?
     • In practice we rely on previous studies and exploratory data analysis to shed light on population shape
     • Previous studies may suggest a population is Normal (e.g., heights are approximately Normal, but weights are not)
     • If individual data points are available, always explore the data’s shape (make a stemplot) before doing inference
     • The shape of the data (sample) parallels the shape of the population
     • Beware that small samples have a lot of chance variation. It is difficult to judge “Normality” in small samples (but you can still check for clear departures)

  8. Can Normality be assumed? Moderately sized data set (n = 20) with strong skew. Normality cannot be assumed. Do NOT use z [or t] procedures.

  9. Can Normality be assumed? Extremely large data set (n ≈ 1000). The data have a strong positive skew, but because the sample is large, the Central Limit Theorem applies and we can assume the sampling distribution of the mean is approximately Normal. Do use z [or t] procedures.

  10. Can Normality be assumed? n is moderate. The distribution has no clear departures from Normality. Therefore, we can trust z [and t] procedures.

  11. Additional Caution: GIGO (Garbage In, Garbage Out)
     • A study is only as good as the quality of its data
     • CIs and P-values are valueless when the INFORMATION is of POOR QUALITY
     • Example: self-reported data can be inaccurate and biased

  12. Additional Caution: P-values
     • P-values (significance tests) are often misunderstood
     • Even large differences can fail to be significant if the sample is small
     • Statistical significance does NOT tell us whether a finding is important: statistical significance is NOT the same as practical significance
     • A P-value is NOT the probability that H0 is true; it is the probability, computed assuming H0 is true, of getting data at least as extreme as the data actually observed
     • Failure to reject H0 is NOT the same as accepting H0
     • Although α = 0.05 is a common cut-off, there is NO set border between “significant” and “insignificant” results: surely God loves P = .06 nearly as much as P = .05
     PSLS/2e
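The first two cautions can be checked numerically. The sketch below uses made-up numbers (a known σ of 60 and two hypothetical studies) and a two-sided z test; the function name is an illustrative choice, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(difference, sigma, n):
    """Two-sided P-value of a z test for an observed difference (xbar - mu0)."""
    z = difference / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


SIGMA = 60.0  # assumed known population standard deviation

# A large observed difference with a tiny sample ...
p_large_diff_small_n = two_sided_p_value(difference=30.0, sigma=SIGMA, n=4)
# ... versus a trivially small difference with a huge sample.
p_small_diff_large_n = two_sided_p_value(difference=1.0, sigma=SIGMA, n=40_000)

if __name__ == "__main__":
    print(f"difference = 30, n = 4:      P = {p_large_diff_small_n:.3f}")
    print(f"difference = 1,  n = 40000:  P = {p_small_diff_large_n:.5f}")
```

Under these assumptions the 30-point difference is not significant at α = 0.05 while the 1-point difference is, which is why a P-value alone says little about practical importance.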

  13. Margin of Error (m)
     • When estimating µ with confidence level C, the margin of error is: m = z*·σ/√n
     • The margin of error is half the length of the CI and indicates the precision of the estimate
     • z* is fixed by the confidence level and σ is a fixed property of the population, so neither can be changed at a given level of confidence
     • To increase precision, increase the sample size: ↑ n → ↓ m → ↑ precision
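To see how the margin of error responds to sample size, the sketch below evaluates m = z*·σ/√n for a few values of n. The function name, σ = 60, the 90% confidence level, and the sample sizes are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """Margin of error m = z* · sigma / sqrt(n) for a z confidence interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / sqrt(n)


if __name__ == "__main__":
    for n in (100, 400, 1600):
        m = margin_of_error(sigma=60.0, n=n, confidence=0.90)
        print(f"n = {n:>4}: m = {m:.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.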

  14. Choosing a Sample Size
     • To determine the sample size required to achieve margin of error m when estimating µ, use: n = (z*·σ / m)²

  15. Example: National Assessment of Educational Progress (NAEP) Math Scores. NAEP math scores predict success following high school. Suppose that we want to estimate the population mean NAEP score with 90% confidence and want the margin of error to be no more than ±5 points. We know that NAEP math scores have σ = 60. What sample size will be required to enable us to create such an interval?

  16. Example: NAEP Quantitative Scores. n = (z*·σ / m)² = (1.645 × 60 / 5)² = 389.67. If you round down, your margin of error will be bigger. If you round up, your margin of error will be smaller (a good thing). Always round UP to the next integer. Study 390 individuals so that m is no greater than 5.

  17. Example: Decrease margin of error m. Now suppose we want to estimate the population mean NAEP score with 90% confidence and want the margin of error not to exceed 3 points (recall that σ = 60). What sample size will be required to enable us to create such an interval?

  18. Case Study: NAEP Quantitative Scores. n = (z*·σ / m)² = (1.645 × 60 / 3)² = 1082.41. Therefore resolve to study 1083 individuals so that the margin of error does not exceed 3 points. Note that lowering the margin of error to 3 points required a much larger sample size!
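The sample-size calculation for the NAEP examples can be sketched as follows. The function name is an illustrative choice, and z* is computed with `statistics.NormalDist` rather than read from a table, so the unrounded values differ slightly from a hand calculation with z* = 1.645; after rounding up, the answers are the same.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence):
    """Smallest n for which z* · sigma / sqrt(n) does not exceed `margin`."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round UP: rounding down would leave the margin of error too large.
    return ceil((z_star * sigma / margin) ** 2)


if __name__ == "__main__":
    # NAEP example: sigma = 60, 90% confidence.
    print(required_sample_size(sigma=60.0, margin=5.0, confidence=0.90))  # 390
    print(required_sample_size(sigma=60.0, margin=3.0, confidence=0.90))  # 1083
```

Tightening the margin from 5 points to 3 multiplies the required sample size by roughly (5/3)² ≈ 2.8, which is why the second study is so much larger.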
