Statistical Inference and Hypothesis Testing in Population Studies

Explore one-sample mean, variance, proportion, and hypothesis testing concepts in two independent populations with classic examples.

ulyssesj

Presentation Transcript


  1. CHAPTER 6: Statistical Inference & Hypothesis Testing. 6.1 - One Sample: Mean μ, Variance σ², Proportion π. 6.2 - Two Samples: Means, Variances, Proportions (μ₁ vs. μ₂, σ₁² vs. σ₂², π₁ vs. π₂). 6.3 - Multiple Samples: Means, Variances, Proportions (μ₁, …, μₖ; σ₁², …, σₖ²; π₁, …, πₖ).

  3. Consider two independent populations and a random variable X, normally distributed in each. Population 1: X₁ ~ N(μ₁, σ₁). Population 2: X₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no mean difference"); test at significance level α. Classic example: a randomized clinical trial, with Pop 1 = Treatment and Pop 2 = Control. Take a random sample of size n₁ from Population 1 and a random sample of size n₂ from Population 2. Sampling distribution of X̄₁ – X̄₂ = ?

  5. Consider two independent populations and a random variable X, normally distributed in each. Population 1: X₁ ~ N(μ₁, σ₁). Population 2: X₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no mean difference"); test at significance level α. Classic example: a randomized clinical trial, with Pop 1 = Treatment and Pop 2 = Control. Take a random sample of size n₁ from Population 1 and a random sample of size n₂ from Population 2. Sampling distribution of X̄₁ – X̄₂ = ? Recall from section 4.1 (Discrete Models): Mean(X – Y) = Mean(X) – Mean(Y), and if X and Y are independent, Var(X – Y) = Var(X) + Var(Y).

  11. Consider two independent populations and a random variable X, normally distributed in each. Population 1: X₁ ~ N(μ₁, σ₁). Population 2: X₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no mean difference"); test at significance level α. Classic example: a randomized clinical trial, with Pop 1 = Treatment and Pop 2 = Control. Take a random sample of size n₁ from Population 1 and a random sample of size n₂ from Population 2. Sampling distribution of X̄₁ – X̄₂ = ? Recall from section 4.1 (Discrete Models): Mean(X – Y) = Mean(X) – Mean(Y) = 0 under H₀, and if X and Y are independent, Var(X – Y) = Var(X) + Var(Y).
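
The two recalled rules can be checked numerically. The following is only a simulation sketch: the population mean, standard deviations, sample sizes, and number of replications are hypothetical values chosen for illustration and do not come from the slides. It draws many pairs of samples under H₀ and compares the spread of the simulated differences in sample means with the value implied by Var(X – Y) = Var(X) + Var(Y).

```r
# Simulation sketch of the sampling distribution of the difference in sample means.
# All numeric settings below are hypothetical, chosen only for illustration.
set.seed(1)
mu     <- 100   # common population mean under H0
sigma1 <- 15    # population 1 standard deviation
sigma2 <- 10    # population 2 standard deviation
n1     <- 50    # sample size from population 1
n2     <- 40    # sample size from population 2
reps   <- 20000 # number of simulated pairs of samples

# One difference in sample means per replication
diffs <- replicate(reps, mean(rnorm(n1, mu, sigma1)) - mean(rnorm(n2, mu, sigma2)))

# Variances of independent sample means add, so the standard error is:
theory_se <- sqrt(sigma1^2 / n1 + sigma2^2 / n2)

# Simulated mean and spread next to the theoretical standard error
print(c(mean_diff = mean(diffs), sd_diff = sd(diffs), theory_se = theory_se))
```

Under these assumptions the simulated mean should sit near 0 and the simulated standard deviation near the theoretical standard error.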

  12. Consider two independent populations and a random variable X, normally distributed in each. Population 1: X₁ ~ N(μ₁, σ₁). Population 2: X₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no mean difference"); test at significance level α. Null distribution: X̄₁ – X̄₂ is normal, centered at 0, with standard error s.e. = √(σ₁²/n₁ + σ₂²/n₂). But what if σ₁² and σ₂² are unknown? Then use the sample estimates s₁² and s₂² with a Z- or t-test, if n₁ and n₂ are large.

  13. Consider two independent populations and a random variable X, normally distributed in each. Population 1: X₁ ~ N(μ₁, σ₁). Population 2: X₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no mean difference"); test at significance level α. Null distribution: X̄₁ – X̄₂ is normal, centered at 0, with standard error s.e. = √(σ₁²/n₁ + σ₂²/n₂). But what if σ₁² and σ₂² are unknown? Then use the sample estimates s₁² and s₂² with a Z- or t-test, if n₁ and n₂ are large. (But what if n₁ and n₂ are small? Later…)

  14. Example: X = "$ cost of a certain medical service". Assume X is known to be normally distributed at each of k = 2 health care facilities ("groups"). Hospital: X₁ ~ N(μ₁, σ₁). Clinic: X₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no difference exists"). Two-sided test at significance level α = .05. Data: Sample 1: n₁ = 137; Sample 2: n₂ = 140. Note: the observed difference x̄₁ – x̄₂ = 84 > 0. Null distribution: centered at 0 with standard error 4.2. 95% margin of error = (1.96)(4.2) = 8.232. 95% confidence interval for μ₁ – μ₂: (84 – 8.232, 84 + 8.232) = (75.768, 92.232), which does not contain 0. Z-score = 84 / 4.2 = 20 >> 1.96, so p << .05. Reject H₀; the difference is extremely strongly significant.
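
The arithmetic in this large-sample example can be reproduced in a few lines of R. This is a minimal sketch that takes only the two summary numbers stated on the slide (observed difference 84, standard error 4.2) as inputs; the variable names are illustrative. Note that `qnorm` returns the exact critical value 1.959964 rather than the rounded 1.96, so the margin of error differs from 8.232 in the fourth decimal place.

```r
# Large-sample two-sided Z-test from summary statistics (values from the slide)
diff_hat <- 84     # observed difference in sample means, hospital minus clinic
se       <- 4.2    # standard error of the difference
alpha    <- 0.05   # significance level

z_crit  <- qnorm(1 - alpha / 2)               # two-sided critical value, about 1.96
moe     <- z_crit * se                        # 95% margin of error
ci      <- c(diff_hat - moe, diff_hat + moe)  # 95% confidence interval for mu1 - mu2
z_score <- diff_hat / se                      # test statistic under H0: mu1 - mu2 = 0
p_value <- 2 * pnorm(-abs(z_score))           # two-sided p-value

print(c(moe = moe, lower = ci[1], upper = ci[2], z = z_score, p = p_value))
```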

  15. Consider two independent populations and a random variable X, normally distributed in each. Population 1: X₁ ~ N(μ₁, σ₁). Population 2: X₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no mean difference"); test at significance level α. Sample size n₁ from Population 1, sample size n₂ from Population 2. Large n₁ and n₂: the null distribution of X̄₁ – X̄₂ is normal, centered at 0, with the standard error estimated from s₁² and s₂².

  16. Consider two independent populations and a random variable X, normally distributed in each. Population 1: X₁ ~ N(μ₁, σ₁). Population 2: X₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no mean difference"); test at significance level α. Sample size n₁ from Population 1, sample size n₂ from Population 2. Two cases for the null distribution: large n₁ and n₂, or small n₁ and n₂. For small n₁ and n₂: IF the two populations are equivariant, i.e., σ₁² = σ₂², then conduct a t-test on the "pooled" samples.

  17. Testing equivariance, H₀: σ₁² = σ₂². Test statistic: F = s₁² / s₂². Sampling distribution = ? Working rule of thumb: the acceptance region for H₀ is ¼ < F < 4.
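
As a rough illustration of the rule of thumb, the sketch below computes the variance ratio for the hospital and clinic cost samples that appear in the small-sample example later in this transcript, and checks whether it falls inside (¼, 4). The formal counterpart in base R is `var.test`, shown for comparison; the variable names are illustrative.

```r
# Variance-ratio check on the cost data from the small-sample example
y1 <- c(667, 653, 614, 612, 604)  # hospital sample
y2 <- c(593, 525, 520)            # clinic sample

# Ratio of sample variances (named f_ratio to avoid masking R's F shorthand)
f_ratio <- var(y1) / var(y2)

# Working rule of thumb: treat the variances as equal if 1/4 < F < 4
rule_ok <- f_ratio > 1/4 && f_ratio < 4

# Formal F test of equal variances from the stats package
f_test <- var.test(y1, y2)

print(f_ratio)
print(rule_ok)
print(f_test)
```

With samples this small the F test has little power, so a ratio inside the rule-of-thumb band is weak reassurance rather than proof of equal variances.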

  18. Consider two independent populations and a random variable X, normally distributed in each. Population 1: X₁ ~ N(μ₁, σ₁). Population 2: X₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no mean difference"); test at significance level α. Null distribution for small n₁ and n₂: IF equal variances is accepted, then estimate their common value with a "pooled" sample variance. The pooled variance is a weighted average of s₁² and s₂², using the degrees of freedom as the weights.

  19. Consider two independent populations and a random variable X, normally distributed in each. Population 1: X₁ ~ N(μ₁, σ₁). Population 2: X₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no mean difference"); test at significance level α. Null distribution for small n₁ and n₂: IF equal variances is accepted, then estimate their common value with a "pooled" sample variance. The pooled variance is a weighted average of s₁² and s₂², using the degrees of freedom as the weights. IF equal variances is rejected, then use the Satterthwaite test, Welch test, etc. SEE LECTURE NOTES AND TEXTBOOK.
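
For the unequal-variance branch, base R's `t.test` performs the Welch test with Satterthwaite-approximated degrees of freedom when `var.equal` is left at its default of `FALSE`. The sketch below applies it to the hospital and clinic cost samples from the example later in this transcript, purely to show the call; the slides themselves analyze these data with the pooled test. The hand-computed statistic uses the unpooled standard error for comparison.

```r
# Welch two-sample t-test sketch on the cost data from the later example
y1 <- c(667, 653, 614, 612, 604)  # hospital sample
y2 <- c(593, 525, 520)            # clinic sample

# var.equal defaults to FALSE, which gives the Welch test
welch <- t.test(y1, y2)

# Same statistic by hand: each group keeps its own variance (no pooling)
se_welch <- sqrt(var(y1) / length(y1) + var(y2) / length(y2))
t_welch  <- (mean(y1) - mean(y2)) / se_welch

print(welch)
print(t_welch)
```

The Welch degrees of freedom are generally fractional and no larger than n₁ + n₂ – 2, which is the price paid for dropping the equal-variance assumption.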

  20. Example: Y = "$ cost of a certain medical service". Assume Y is known to be normally distributed at each of k = 2 health care facilities ("groups"). Hospital: Y₁ ~ N(μ₁, σ₁). Clinic: Y₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no difference exists"). Two-sided test at significance level α = .05. Data: Sample 1 = {667, 653, 614, 612, 604}, n₁ = 5; Sample 2 = {593, 525, 520}, n₂ = 3. Analysis via t-test (if equivariance holds). Point estimates, "group means": ȳ₁ = 630 and ȳ₂ = 546. Note: ȳ₁ – ȳ₂ = 84 > 0. "Group variances", s² = SS/df: s₁² = SS₁/df₁ = 3154/4 = 788.5 and s₂² = SS₂/df₂ = 3326/2 = 1663. Pooled variance: the pooled variance is a weighted average of the group variances, using the degrees of freedom df₁ and df₂ as the weights.

  21. Example: Y = "$ cost of a certain medical service". Assume Y is known to be normally distributed at each of k = 2 health care facilities ("groups"). Hospital: Y₁ ~ N(μ₁, σ₁). Clinic: Y₂ ~ N(μ₂, σ₂). Null hypothesis H₀: μ₁ = μ₂, i.e., μ₁ – μ₂ = 0 ("no difference exists"). Two-sided test at significance level α = .05. Data: Sample 1 = {667, 653, 614, 612, 604}, n₁ = 5; Sample 2 = {593, 525, 520}, n₂ = 3. Note: ȳ₁ – ȳ₂ = 84 > 0. Analysis via t-test (if equivariance holds). Pooled variance, s² = SS/df, with SS = 6480 and df = 6: s² = 6480/6 = 1080. The pooled variance is a weighted average of the group variances, using the degrees of freedom as the weights. Standard error = √(1080 (1/5 + 1/3)) = 24, so t = 84/24 = 3.5 on 6 degrees of freedom. p-value, computed in R: > 2 * (1 - pt(3.5, 6)) returns [1] 0.01282634. Reject H₀ at α = .05; statistically significant, Hospital > Clinic.
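
The pooled analysis can be reproduced step by step from the raw data. The following is a minimal sketch with illustrative variable names; it follows the quantities named on the slide (sums of squares, pooled variance, standard error, t statistic, two-sided p-value).

```r
# Pooled two-sample t-test computed by hand from the slide's data
y1 <- c(667, 653, 614, 612, 604)  # hospital sample
y2 <- c(593, 525, 520)            # clinic sample
n1 <- length(y1)
n2 <- length(y2)

# Sums of squared deviations about each group mean
ss1 <- sum((y1 - mean(y1))^2)
ss2 <- sum((y2 - mean(y2))^2)

# Pooled variance: total SS over total degrees of freedom, which equals the
# df-weighted average of the two group variances
df        <- (n1 - 1) + (n2 - 1)
s2_pooled <- (ss1 + ss2) / df

# Standard error of the difference in means, test statistic, two-sided p-value
se      <- sqrt(s2_pooled * (1 / n1 + 1 / n2))
t_stat  <- (mean(y1) - mean(y2)) / se
p_value <- 2 * pt(-abs(t_stat), df)

print(c(SS = ss1 + ss2, df = df, s2_pooled = s2_pooled,
        se = se, t = t_stat, p = p_value))
```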

  22. R code:

      > y1 = c(667, 653, 614, 612, 604)
      > y2 = c(593, 525, 520)
      >
      > t.test(y1, y2, var.equal = TRUE)

              Two Sample t-test

      data:  y1 and y2
      t = 3.5, df = 6, p-value = 0.01283
      alternative hypothesis: true difference in means is not equal to 0
      95 percent confidence interval:
        25.27412 142.72588
      sample estimates:
      mean of x mean of y
            630       546

      Formal Conclusion: p-value < α = .05. Reject H₀ at this level.

      Interpretation: The samples provide evidence that the difference between mean costs is (moderately) statistically significant at the 5% level, with the hospital being higher than the clinic (by an average of $84).
