Chapter 11: Inference About a Mean

In Chapter 11:
11.1 Estimated Standard Error of the Mean
11.2 Student’s t Distribution
11.3 One-Sample t Test
11.4 Confidence Interval for μ
11.5 Paired Samples
11.6 Conditions for Inference
11.7 Sample Size and Power

σ not known
• Prior chapter: σ was known before collecting data, so z procedures were used to help infer µ
• When σ is NOT known, calculate the sample standard deviation s and use it to calculate the estimated standard error of the mean: SE_x̄ = s / √n

Additional Uncertainty
• Using s instead of σ adds uncertainty to inferences → the Normal distribution doesn’t fit well, so we can NOT use z procedures
• Instead, rely on Student’s t procedures
William Sealy Gosset (1876–1937)

Student’s t distributions
• Family of probability distributions
• Family members identified by degrees of freedom (df)
• Similar to “Z”, but with broader tails
• As df increases → tails get skinnier → t becomes like z
A t distribution with infinite degrees of freedom is a Standard Normal Z distribution

Table C (t table)
Rows → df
Columns → probabilities
Entries → t values
Notation: t_{cum_prob, df} = t value
Example: t_{.975, 9} = 2.262
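A Table C entry can also be reproduced numerically. The sketch below is a standard-library-only illustration: it integrates the t density with Simpson's rule and inverts the result by bisection. The names `t_pdf`, `t_cdf`, and `t_quantile` are made up for this example and do not belong to any library; in practice a statistics package would normally do this lookup.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df):
    """Invert t_cdf by bisection: the t value with cumulative probability p."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t_crit = t_quantile(0.975, 9)
print(f"t_(.975, 9) = {t_crit:.3f}")  # Table C lists 2.262

# With many degrees of freedom the t quantile approaches the z quantile (1.96).
print(f"t_(.975, 1000) = {t_quantile(0.975, 1000):.3f}")
```

The second print illustrates the earlier point that t becomes like z as df grows.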

One-Sample t Test
Objective: test a claim about population mean µ
Conditions:
• Simple Random Sample
• Normal population or “large sample”

Hypothesis Statements
• Null hypothesis H0: µ = µ0, where µ0 represents the population mean expected by the null hypothesis
• Alternative hypotheses
  Ha: µ < µ0 (one-sided, left)
  Ha: µ > µ0 (one-sided, right)
  Ha: µ ≠ µ0 (two-sided)

Example
• Do SIDS babies have lower average birth weights than a general population mean µ of 3300 grams?
• H0: µ = 3300
• Ha: µ < 3300 (one-sided) or Ha: µ ≠ 3300 (two-sided)

One-Sample t Test Statistic
t_stat = (x̄ − µ0) / SE_x̄, where SE_x̄ = s / √n
This t statistic has n − 1 degrees of freedom

Example (Data) SRS n = 10 birth weights (grams) of SIDS cases

Example Testing H0: µ = 3300

For a more precise P-value, use a computer utility such as the free utility StaTable. (StaTable output and P-value graph not reproduced in this transcript.)

Interpretation
• Testing H0: µ = 3300 grams
• Two-tailed P > .10
• Conclude: weak evidence against H0
• The sample mean (2890.5) is NOT significantly different from 3300
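The SIDS test can be sketched from summary statistics, since the raw birth weights are not reproduced in this transcript. The statistic is t_stat = (x̄ − µ0) / (s / √n) with n − 1 degrees of freedom. Here x̄ = 2890.5 and n = 10 come from the example, while s = 720 is an assumption borrowed from the σ value that the power example attributes to the SIDS data. The helper names `t_pdf` and `t_cdf` are hypothetical and use plain numerical integration rather than a library routine.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

# Summary statistics (s = 720 is an assumed value, see the note above).
n, xbar, s, mu0 = 10, 2890.5, 720.0, 3300.0

se = s / math.sqrt(n)            # estimated standard error of the mean
t_stat = (xbar - mu0) / se       # one-sample t statistic
df = n - 1

p_one = t_cdf(t_stat, df)              # left tail, for Ha: mu < 3300
p_two = 2 * t_cdf(-abs(t_stat), df)    # both tails, for Ha: mu != 3300

print(f"t = {t_stat:.2f}, df = {df}")
print(f"one-sided P = {p_one:.3f}, two-sided P = {p_two:.3f}")
```

Under these assumptions the two-sided P-value lands just above .10, consistent with the Table C conclusion of weak evidence against H0.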

(1 − α)100% CI for µ
x̄ ± t_{1−α/2, n−1} · SE_x̄, where SE_x̄ = s / √n

Same Data
Interpretation: Population mean µ is between 2375 and 3406 grams with 95% confidence
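The interval can be checked with a few lines of arithmetic using x̄ ± t · s / √n. This is a sketch from summary statistics only: x̄ = 2890.5 and n = 10 come from the example, t_{.975, 9} = 2.262 is the Table C entry quoted earlier, and s = 720 is an assumption taken from the σ value that the power example attributes to the SIDS data.

```python
import math

n, xbar = 10, 2890.5
s = 720.0          # assumed sample standard deviation (not stated on this slide)
t_crit = 2.262     # t_(.975, 9) from Table C

se = s / math.sqrt(n)      # estimated standard error of the mean
margin = t_crit * se       # margin of error
ci_low, ci_high = xbar - margin, xbar + margin

print(f"95% CI for mu: {ci_low:.0f} to {ci_high:.0f} grams")
```

With these inputs the limits round to the 2375 and 3406 grams reported in the interpretation.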

§11.5 Paired Samples
• Two samples
• Each data point in one sample is uniquely matched to a data point in the other sample
• Examples of paired samples
  • “Pre-test/post-test”
  • Cross-over trials
  • Pair-matching

Example
• Does oat bran reduce LDL cholesterol?
• Start half of subjects on CORNFLK diet
• Start other half on OATBRAN
• Two weeks → LDL cholesterol
• Washout period
• Cross-over to other diet
• Two weeks → LDL cholesterol

Oat bran data: LDL cholesterol (mmol/L)
Subject CORNFLK OATBRAN
------- ------- -------
1       4.61    3.84
2       6.42    5.57
3       5.40    5.85
4       4.54    4.80
5       3.98    3.68
6       3.82    2.96
7       5.01    4.41
8       4.34    3.72
9       3.80    3.49
10      4.56    3.84
11      5.35    5.26
12      3.89    3.73

Within-pair difference “DELTA”
• Let DELTA = CORNFLK − OATBRAN
• First three observations in OATBRAN data:
ID   CORNFLK OATBRAN DELTA
---- ------- ------- -----
1    4.61    3.84     0.77
2    6.42    5.57     0.85
3    5.40    5.85    -0.45
etc.
All procedures are now directed toward the difference variable DELTA

Exploratory and descriptive stats
DELTA: 0.77, 0.85, −0.45, −0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16
Stemplot:
-0f|4
-0t|2
+0*|01
+0t|33
+0f|
+0s|6677
+0.|88
×1 LDL (mmol/L)
Subscript d denotes “difference” (for example x̄d, sd, µd)

95% CI for µd
x̄d ± t_{.975, n−1} · sd / √n → we are 95% confident that the population mean difference µd is between 0.105 and 0.656 mmol/L
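The interval for µd can be reproduced from the oat bran table. The sketch below forms the within-pair differences, then applies x̄d ± t · sd / √n with n − 1 = 11 degrees of freedom. The critical value t_{.975, 11} = 2.201 is a standard t-table entry that does not appear in this transcript, so treat it as an assumption of the example.

```python
import math
import statistics

# LDL cholesterol (mmol/L) for the 12 subjects, in subject order.
cornflk = [4.61, 6.42, 5.40, 4.54, 3.98, 3.82, 5.01, 4.34, 3.80, 4.56, 5.35, 3.89]
oatbran = [3.84, 5.57, 5.85, 4.80, 3.68, 2.96, 4.41, 3.72, 3.49, 3.84, 5.26, 3.73]

# Within-pair differences: DELTA = CORNFLK - OATBRAN.
delta = [c - o for c, o in zip(cornflk, oatbran)]

n = len(delta)
mean_d = statistics.mean(delta)     # sample mean difference
sd_d = statistics.stdev(delta)      # sample SD of the differences (n - 1 divisor)
se_d = sd_d / math.sqrt(n)          # estimated standard error of the mean difference

t_crit = 2.201                      # t_(.975, 11), standard table value (assumed)
ci_low = mean_d - t_crit * se_d
ci_high = mean_d + t_crit * se_d

print(f"mean = {mean_d:.3f}, sd = {sd_d:.3f}, se = {se_d:.3f}")
print(f"95% CI for mu_d: {ci_low:.3f} to {ci_high:.3f} mmol/L")
```

Because the pairing reduces the problem to a single column of differences, the calculation is the same as a one-sample interval applied to DELTA.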

Hypothesis Test
Claim: oat bran diet is associated with a decline (one-sided) or change (two-sided) in LDL cholesterol.
Test H0: µd = µ0, where µ0 = 0
Ha: µd > µ0 (one-sided)
Ha: µd ≠ µ0 (two-sided)

Paired t statistic
t_stat = (x̄d − µ0) / SE_x̄d, where SE_x̄d = sd / √n, with n − 1 degrees of freedom

P-value via Table C
|t_stat| = 3.043 with df = 11, which falls between t_{.99, 11} = 2.718 and t_{.995, 11} = 3.106. Thus →
One-tailed: .005 < P < .01
Two-tailed: .01 < P < .02

P-value via Computer (SPSS output for the “Oat Bran” data not reproduced in this transcript)

Interpretation
• Testing H0: µd = 0
• Two-tailed P = 0.011
• → Good reason to doubt H0
• (Optional) The difference is “significant” at α = .05 but not at α = .01
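The paired test can be reproduced from the oat bran table. This is a sketch rather than the output of any particular package: it computes t_stat = x̄d / (sd / √n) on the differences and obtains P-values by numerically integrating the t density. The helper names `t_pdf` and `t_cdf` are hypothetical.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

# LDL cholesterol (mmol/L) for the 12 subjects, in subject order.
cornflk = [4.61, 6.42, 5.40, 4.54, 3.98, 3.82, 5.01, 4.34, 3.80, 4.56, 5.35, 3.89]
oatbran = [3.84, 5.57, 5.85, 4.80, 3.68, 2.96, 4.41, 3.72, 3.49, 3.84, 5.26, 3.73]
delta = [c - o for c, o in zip(cornflk, oatbran)]

n = len(delta)
df = n - 1
se_d = statistics.stdev(delta) / math.sqrt(n)
t_stat = (statistics.mean(delta) - 0.0) / se_d   # H0: mu_d = 0

p_one = 1 - t_cdf(t_stat, df)                # right tail, for Ha: mu_d > 0
p_two = 2 * (1 - t_cdf(abs(t_stat), df))     # both tails, for Ha: mu_d != 0

print(f"t = {t_stat:.3f}, df = {df}")
print(f"one-tailed P = {p_one:.4f}, two-tailed P = {p_two:.4f}")
```

The computed P-values fall inside the Table C brackets, and the two-tailed value agrees with the reported P = 0.011 after rounding.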

The Normality Condition
• t procedures require a Normal population or large samples
• How do we assess this condition?
• Guidelines. Use t procedures when:
  • Population is Normal
  • Population is symmetrical and n ≥ 10
  • Population is skewed and n ≥ ~45 (depends on severity of skew)

Can t procedures be used? Skewed small sample → avoid t procedures

Can t procedures be used? Mild skew in moderate sample → t OK

Can t procedures be used? Skewed moderate sample → avoid t

Sample Size and Power
Methods:
(1) n required to achieve margin of error m when estimating µ
(2) n required to test H0 with 1 − β power
(3) Power of a given test of H0

Power
1 − β = Φ(−z_{1−α/2} + |Δ|√n / σ), with:
• α ≡ alpha (two-sided)
• Δ ≡ “difference worth detecting” = µa − µ0
• n ≡ sample size
• σ ≡ standard deviation
• Φ(z) ≡ cumulative probability of Standard Normal z score

Power: SIDS Example
• Let α = .05, so z_{1−.05/2} = 1.96
• Test: H0: μ = 3300 vs. Ha: μ = 3000. Thus |Δ| = |µa − µ0| = |3000 − 3300| = 300
• n = 10 and σ ≡ 720 (see prior SIDS example)
• 1 − β = Φ(−1.96 + 300·√10 / 720) = Φ(−0.64)
Use Table B to look up cum prob → Φ(−0.64) = .2611, so the power is about 26%
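The SIDS power figure can be checked with the standard library's Normal distribution. The sketch below assumes the usual Normal-approximation formula, 1 − β = Φ(−z_{1−α/2} + |Δ|√n / σ), which reproduces the Φ(−0.64) lookup in this example. The function name `power_two_sided` is made up for illustration.

```python
import math
from statistics import NormalDist

def power_two_sided(alpha, delta, n, sigma):
    """Approximate power of a two-sided test of H0 against a difference delta."""
    z = NormalDist()                          # Standard Normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)         # about 1.96 when alpha = .05
    return z.cdf(-z_crit + abs(delta) * math.sqrt(n) / sigma)

# SIDS example: H0: mu = 3300 vs. Ha: mu = 3000, n = 10, sigma = 720.
power = power_two_sided(alpha=0.05, delta=3000 - 3300, n=10, sigma=720)
print(f"power = {power:.2f}")   # about 0.26, as with the Table B lookup
```

The result differs from .2611 only in the third decimal because Table B rounds the z score to −0.64. Calling the same function with a larger n shows how power rises with sample size.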

Power: Illustrative Example
