Unit 3: Statistical Inferences. Wenyaw Chan, Division of Biostatistics, School of Public Health, University of Texas - Health Science Center at Houston.
Estimation • Point Estimates • A point estimate of a parameter θ is a single number used as an estimate of the value of θ. • e.g. A natural estimate of the population mean μ is the sample mean $\bar{X}$. • Interval Estimation • If a random interval I = (L, U) satisfies Pr(L < θ < U) = 1 - α, the observed values of L and U for a given sample are called a 1 - α confidence interval estimate for θ. Which one is more accurate? Which one is more precise?
Estimation What to estimate? • B(n, p): proportion p • Poisson(μ): mean μ • N(μ, σ²): mean μ and/or variance σ²
Estimation of the Mean of a Distribution • A point estimator of the population mean μ is the sample mean $\bar{X}$. • The sampling distribution of $\bar{X}$ is the distribution of values of $\bar{X}$ over all possible samples of size n that could have been selected from the reference population.
Estimation • An estimator of a parameter is an unbiased estimator if its expectation is equal to the parameter. • Note: Unbiasedness alone is not a sufficient criterion for choosing an estimator. • The unbiased estimator with the minimum variance (MVUE) is preferred. • If the population is normal, then $\bar{X}$ is the MVUE of μ.
Sample Mean • Standard error (of the mean) = standard deviation of the sample mean = $\sigma/\sqrt{n}$ • The estimated standard error is $s/\sqrt{n}$, where s is the sample standard deviation.
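A minimal sketch of the estimated standard error $s/\sqrt{n}$, using only the Python standard library. The function name and the data are made up for illustration.

```python
import math
import statistics

def estimated_standard_error(sample):
    """Return s / sqrt(n), where s is the sample standard deviation."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample s.d., divides by n - 1
    return s / math.sqrt(n)

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data
print(estimated_standard_error(sample))
```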
Central Limit Theorem • Let X1,…,Xn be a random sample from some population with mean μ and variance σ². Then, for large n, $\bar{X}$ is approximately distributed as $N(\mu, \sigma^2/n)$.
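The theorem can be checked informally by simulation. The sketch below assumes an exponential population with rate 1 (mean 1, variance 1), which is an arbitrary choice for illustration, and looks at the mean and s.d. of many sample means of size n = 50.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Means of `reps` samples of size n from an exponential(1) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

means = simulate_sample_means(n=50, reps=4000)
# If the CLT approximation holds, the sample means are roughly N(1, 1/50).
print("mean of sample means:", statistics.fmean(means))
print("s.d. of sample means:", statistics.pstdev(means))
print("sigma / sqrt(n):     ", 1 / 50 ** 0.5)
```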
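Interval Estimation • Let X1,…,Xn be a random sample from a normal population N(μ, σ²). If σ² is known, a 95% confidence interval (C.I.) for μ is $\left(\bar{X} - 1.96\,\sigma/\sqrt{n},\ \bar{X} + 1.96\,\sigma/\sqrt{n}\right)$. Why? Because $Z = (\bar{X}-\mu)/(\sigma/\sqrt{n}) \sim N(0,1)$, so Pr(-1.96 < Z < 1.96) = 0.95, and solving the inequality for μ gives the interval.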
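A short sketch of the known-σ interval $\bar{X} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$. The function name and the numbers are illustrative assumptions, not values from the slides.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, level=0.95):
    """C.I. for the mean of a normal population with known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

low, high = z_confidence_interval(xbar=100.0, sigma=15.0, n=25)
print(low, high)
```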
Interval Estimation Interpretation of Confidence Interval • Over the collection of 95% confidence intervals that could be constructed from repeated random samples of size n, 95% of them will contain the parameter μ. • It is wrong to say: There is a 95% chance that the parameter will fall within a particular 95% confidence interval. (The parameter is fixed; it is the interval that varies from sample to sample.)
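The repeated-sampling interpretation can be illustrated by simulation. This sketch assumes a normal population with known σ and counts how often the 95% interval covers the true mean; all parameter values are made up.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, reps, level=0.95, seed=1):
    """Fraction of simulated known-sigma C.I.s that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps

rate = coverage(mu=10.0, sigma=2.0, n=20, reps=2000)
print(rate)  # expected to be close to 0.95
```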
Interval Estimation • Note: • When σ and n are fixed, a 99% C.I. is wider than a 95% C.I. • If the width of the C.I. is specified, the sample size can be determined. n ↑ ⇒ length ↓
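As a sketch of the sample-size remark: for known σ the interval length is $2 z_{1-\alpha/2}\,\sigma/\sqrt{n}$, so a required length L gives $n \ge (2 z_{1-\alpha/2}\,\sigma / L)^2$. The function name and inputs below are hypothetical.

```python
import math
from statistics import NormalDist

def n_for_ci_length(sigma, length, level=0.95):
    """Smallest n whose known-sigma C.I. is no longer than `length`."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.ceil((2 * z * sigma / length) ** 2)

print(n_for_ci_length(sigma=15.0, length=10.0))              # 95% interval
print(n_for_ci_length(sigma=15.0, length=10.0, level=0.99))  # 99% needs more
```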
Hypothesis Testing • Null hypothesis (H0): the statement to be tested, usually reflecting the status quo. • Alternative hypothesis (H1): the logical complement of H0. • Note: the null hypothesis is analogous to the defendant in court. It is presumed to be true unless the data argue overwhelmingly to the contrary.
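Hypothesis Testing • Four possible outcomes of the decision: • H0 is true and H0 is not rejected: correct decision • H0 is true and H0 is rejected: Type I error • H1 is true and H0 is not rejected: Type II error • H1 is true and H0 is rejected: correct decision • Notation: α = Pr(Type I error) = level of significance; β = Pr(Type II error); 1 - β = power = Pr(reject H0 | H1 is true)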
Hypothesis Testing • Goal: to make α and β both small • Facts: α ↓ then β ↑; β ↓ then α ↑ • General Strategy: fix α, minimize β
Testing for the Population Mean • When the sample is from a normal population: H0: μ = 120 vs H1: μ < 120 • The best test is based on $\bar{X}$, which is called the test statistic. The "best test" means that the test has the highest power among all tests with a given type I error. Is there any bad test? Yes. • Rejection Region: the range of values of the test statistic for which H0 is rejected.
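One-tailed test • Our rejection region is $\bar{X} < c$ for some constant c. • Now, c is chosen so that the type I error is α: $\Pr(\bar{X} < c \mid H_0) = \alpha$. Since $(\bar{X}-\mu_0)/(s/\sqrt{n}) \sim t_{n-1}$ under H0, this gives $c = \mu_0 + t_{n-1,\alpha}\, s/\sqrt{n}$.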
Result • To test H0: μ = μ0 vs H1: μ < μ0, based on a sample taken from a normal population with mean μ and variance σ² unknown, the test statistic is $t = (\bar{X}-\mu_0)/(s/\sqrt{n})$. • Assume the level of significance is α; then • if $t < t_{n-1,\alpha}$, we reject H0. • if $t \ge t_{n-1,\alpha}$, we do not reject H0.
P-value • The minimum α-level at which we can reject H0 based on the sample. • The p-value can also be thought of as the probability of obtaining a test statistic as extreme as or more extreme than the actual test statistic obtained from the sample, given that the null hypothesis is true.
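Remarks • Two different approaches to determining statistical significance: • Critical value method: reject H0 if the test statistic falls in the rejection region. • P-value method: reject H0 if the p-value is less than α.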
One-tailed test • Testing H0: μ = μ0 vs H1: μ > μ0 when σ is unknown and the population is normal. Test Statistic: $t = (\bar{X}-\mu_0)/(s/\sqrt{n})$. Rejection Region: $t > t_{n-1,1-\alpha}$. p-value = $1 - F_{t,n-1}(t)$, where $F_{t,n-1}(\,)$ is the cdf of the t distribution with df = n-1. • Note: If σ is known, the s in the test statistic will be replaced by σ, $t_{n-1,1-\alpha}$ in the rejection region will be replaced by $z_{1-\alpha}$, and $F_{t,n-1}(t)$ will be replaced by Φ(t).
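A sketch of the known-σ case mentioned in the note, where the statistic is compared with $z_{1-\alpha}$ and the p-value is 1 - Φ(z). The function name and the numbers are assumptions for illustration.

```python
import math
from statistics import NormalDist

def upper_tailed_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Test H0: mu = mu0 vs H1: mu > mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha)   # z_{1-alpha}
    p_value = 1 - NormalDist().cdf(z)          # 1 - Phi(z)
    return z, p_value, z > z_crit

z, p_value, reject = upper_tailed_z_test(xbar=105.0, mu0=100.0, sigma=15.0, n=36)
print(z, p_value, reject)
```

Both methods agree here: z exceeds the critical value exactly when the p-value is below α.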
Testing For Two-Sided Alternative • Let X1,…,Xn be a random sample from the population N(μ, σ²), where σ² is unknown. • H0: μ = μ0 vs H1: μ ≠ μ0 • Test Statistic: $t = (\bar{X}-\mu_0)/(s/\sqrt{n})$ • Rejection Region: $|t| > t_{n-1,1-\alpha/2}$ • p-value = $2\,F_{t,n-1}(t)$ if t ≤ 0; $2\,[1 - F_{t,n-1}(t)]$ if t > 0 (see figures on next slide). • Warning: exact p-value requires use of computer.
Testing For Two-Sided Alternative [Figures: p-value when $\bar{X} > \mu_0$; p-value when $\bar{X} \le \mu_0$]
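Since the exact p-value needs a computer, here is one possible sketch using only the standard library. The t cdf is approximated by Simpson's rule on the t density, which is a choice made for this example and not a method from the slides. The helper names and the data are hypothetical, and `math.gamma` overflows for very large df.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """Approximate cdf: 0.5 plus or minus Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def two_sided_t_test(sample, mu0):
    """Return (t, p-value) for H0: mu = mu0 vs H1: mu != mu0."""
    n = len(sample)
    s = statistics.stdev(sample)
    t = (statistics.fmean(sample) - mu0) / (s / math.sqrt(n))
    cdf = t_cdf(t, n - 1)
    p_value = 2 * cdf if t <= 0 else 2 * (1 - cdf)
    return t, p_value

print(two_sided_t_test([1.0, 2.0, 3.0, 4.0, 5.0], mu0=0.0))
```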
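The Power of A Test • To test H0: μ = μ0 vs H1: μ < μ0 in a normal population with known variance σ², the power at a true mean μ1 < μ0 is $\Phi\left(z_{\alpha} + (\mu_0-\mu_1)\sqrt{n}/\sigma\right)$, where $z_\alpha$ is the lower α percentile of N(0, 1). • Review: Power = Pr[rejecting H0 | H0 is false] • Factors Affecting the Power (the quantities in the formula): α, |μ0 - μ1|, σ, and n.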
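A sketch of the one-sided power calculation, assuming the standard known-σ form $\Phi(z_\alpha + (\mu_0-\mu_1)\sqrt{n}/\sigma)$ with $z_\alpha$ the lower α percentile. The function name and the inputs (μ0 = 120, μ1 = 115, σ = 24) are illustrative.

```python
import math
from statistics import NormalDist

def one_sided_z_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the level-alpha test of H0: mu = mu0 vs H1: mu < mu0."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(alpha)  # lower alpha percentile, negative
    return nd.cdf(z_alpha + (mu0 - mu1) * math.sqrt(n) / sigma)

for n in (25, 100, 400):
    print(n, one_sided_z_power(mu0=120.0, mu1=115.0, sigma=24.0, n=n))
```

Running it with different n, σ, α, or μ1 shows how each factor moves the power.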
The Power of The 1-Sample T Test • To test H0: μ = μ0 vs H1: μ < μ0 in a normal population with unknown variance σ², the power, for true mean μ1 and true s.d. σ, is $F(t_{n-1,.05})$ at α = .05, where F( ) is the c.d.f. of the non-central t distribution with df = n-1 and non-centrality $(\mu_1-\mu_0)\sqrt{n}/\sigma$.
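Power Function For Two-Sided Alternative • To test H0: μ = μ0 vs H1: μ ≠ μ0 in a normal population with known variance σ², the power is $\Phi\left(-z_{1-\alpha/2} + (\mu_0-\mu_1)\sqrt{n}/\sigma\right) + \Phi\left(-z_{1-\alpha/2} + (\mu_1-\mu_0)\sqrt{n}/\sigma\right)$, where μ1 is the true alternative mean.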
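A sketch of the two-sided power for known σ, assuming the exact two-term form: the probability of falling in the lower rejection region plus the probability of falling in the upper one. Names and numbers are illustrative.

```python
import math
from statistics import NormalDist

def two_sided_z_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the level-alpha test of H0: mu = mu0 vs H1: mu != mu0."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)                 # z_{1-alpha/2}
    shift = (mu1 - mu0) * math.sqrt(n) / sigma    # mean of Z under mu1
    lower = nd.cdf(-z - shift)                    # Pr(Z < -z)
    upper = nd.cdf(-z + shift)                    # Pr(Z >  z)
    return lower + upper

print(two_sided_z_power(mu0=120.0, mu1=115.0, sigma=24.0, n=100))
```

At μ1 = μ0 the function returns α, as a level-α test should.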
Case of Unknown Variance • For the same test in a population with unknown variance, the power is $F(-t_{n-1,1-\alpha/2}) + 1 - F(t_{n-1,1-\alpha/2})$, where F( ) is the c.d.f. of the non-central t distribution with df = n-1 and non-centrality $(\mu_1-\mu_0)\sqrt{n}/\sigma$.
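Sample Size Determination • For example: H0: μ = μ0 vs H1: μ < μ0; power: $1-\beta = \Phi\left(z_{\alpha} + (\mu_0-\mu_1)\sqrt{n}/\sigma\right)$. Hence, $n = \sigma^2 (z_{1-\alpha} + z_{1-\beta})^2 / (\mu_0-\mu_1)^2$.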
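Factors Affecting Sample Size 1. σ² 2. α 3. the required power 1 - β 4. |μ0 - μ1| • To test H0: μ = μ0 vs H1: μ ≠ μ0, σ² is known. The sample size calculation is $n = \sigma^2 (z_{1-\alpha/2} + z_{1-\beta})^2 / (\mu_0-\mu_1)^2$.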
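A sketch of the sample-size calculation for known σ, assuming the usual form $n = \sigma^2 (z_{1-\alpha^*} + z_{1-\beta})^2/(\mu_0-\mu_1)^2$ with α* = α for a one-sided test and α* = α/2 for a two-sided test, rounded up. The function name and inputs are made up for illustration.

```python
import math
from statistics import NormalDist

def sample_size(mu0, mu1, sigma, alpha=0.05, power=0.80, two_sided=False):
    """Smallest n giving at least the requested power at true mean mu1."""
    nd = NormalDist()
    z_a = nd.inv_cdf(1 - alpha / 2) if two_sided else nd.inv_cdf(1 - alpha)
    z_b = nd.inv_cdf(power)  # z_{1-beta}
    n = sigma ** 2 * (z_a + z_b) ** 2 / (mu0 - mu1) ** 2
    return math.ceil(n)

print(sample_size(120.0, 115.0, 24.0))                  # one-sided
print(sample_size(120.0, 115.0, 24.0, two_sided=True))  # two-sided
```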
Relationship between Hypothesis Testing and Confidence Interval • To test H0: μ = μ0 vs H1: μ ≠ μ0, H0 is rejected with a two-sided level α test if and only if the two-sided 100(1 - α)% confidence interval for μ does not contain μ0.
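The equivalence can be checked numerically. This sketch uses the known-σ z test and z interval as a simplifying assumption and compares the test decision with the interval for a range of μ0 values; all numbers are illustrative.

```python
import math
from statistics import NormalDist

def rejects(xbar, mu0, sigma, n, alpha=0.05):
    """Two-sided level-alpha z test of H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def confidence_interval(xbar, sigma, n, alpha=0.05):
    """Two-sided 100(1 - alpha)% z interval for mu."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

xbar, sigma, n = 100.3, 15.0, 25
low, high = confidence_interval(xbar, sigma, n)
for mu0 in range(90, 111):
    inside = low < mu0 < high
    # H0 is rejected exactly when mu0 lies outside the interval.
    assert rejects(xbar, mu0, sigma, n) == (not inside)
print(low, high)
```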
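Exact Method • With x events observed in n trials and $\hat{p} = x/n$, to test H0: p = p0 vs H1: p ≠ p0: • If $\hat{p} < p_0$, the p-value $= \min\left[2\sum_{k=0}^{x}\binom{n}{k}p_0^k(1-p_0)^{n-k},\ 1\right]$ • If $\hat{p} \ge p_0$, the p-value $= \min\left[2\sum_{k=x}^{n}\binom{n}{k}p_0^k(1-p_0)^{n-k},\ 1\right]$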
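A sketch of an exact two-sided binomial p-value. It assumes the common convention of doubling the tail probability on the side of the observed proportion and capping at 1; the slide's own formulas did not survive, so this convention is an assumption. Function names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def exact_binomial_p_value(x, n, p0):
    """Two-sided exact p-value for H0: p = p0, with x events in n trials."""
    if x / n < p0:
        tail = sum(binom_pmf(k, n, p0) for k in range(0, x + 1))  # Pr(X <= x)
    else:
        tail = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))  # Pr(X >= x)
    return min(2 * tail, 1.0)

print(exact_binomial_p_value(x=0, n=5, p0=0.5))
print(exact_binomial_p_value(x=5, n=10, p0=0.5))
```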
One-Sample Inference for the Poisson Distribution • X ~ Poisson with mean μ • To test H0: μ = μ0 vs H1: μ ≠ μ0 at α level of significance, • Obtain a two-sided 100(1 - α)% C.I. for μ, say (C1, C2) • If μ0 ∈ (C1, C2), we do not reject H0; otherwise we reject H0.
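One-Sample Inference for the Poisson Distribution • The p-value (for the above two-sided test), with observed count x: • If x < μ0, then p-value $= \min\left[2\,F(x \mid \mu_0),\ 1\right]$ • If x > μ0, then p-value $= \min\left[2\,(1 - F(x-1 \mid \mu_0)),\ 1\right]$, where F(x | μ0) is the Poisson c.d.f. with mean μ0.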
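A sketch of the exact two-sided Poisson p-value in terms of the Poisson c.d.f. F(x | μ0). It assumes the doubled-tail convention capped at 1, and it treats x = μ0 like the upper-tail case, which is a choice made for this example. Function names are hypothetical.

```python
import math

def poisson_cdf(x, mu):
    """F(x | mu) = Pr(X <= x) for X ~ Poisson(mu); 0 when x < 0."""
    return sum(math.exp(-mu) * mu ** k / math.factorial(k) for k in range(x + 1))

def exact_poisson_p_value(x, mu0):
    """Two-sided exact p-value for H0: mu = mu0 given an observed count x."""
    if x < mu0:
        tail = poisson_cdf(x, mu0)          # Pr(X <= x)
    else:
        tail = 1 - poisson_cdf(x - 1, mu0)  # Pr(X >= x)
    return min(2 * tail, 1.0)

print(exact_poisson_p_value(x=0, mu0=3.0))
print(exact_poisson_p_value(x=8, mu0=3.0))
```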
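Large-Sample Test for Poisson (for μ0 ≥ 10) • To test H0: μ = μ0 vs H1: μ ≠ μ0 at α level of significance, with observed count x: • Test Statistic: $X^2 = (x-\mu_0)^2/\mu_0$, approximately $\chi^2_1$ under H0 • Rejection Region: $X^2 > \chi^2_{1,1-\alpha}$ • p-value: $\Pr(\chi^2_1 > X^2)$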
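A sketch of a large-sample Poisson test, assuming the chi-square form $X^2 = (x-\mu_0)^2/\mu_0$ compared with a $\chi^2_1$ distribution. Because a $\chi^2_1$ variable is a squared standard normal, the tail probability can be written with `math.erfc` and the critical value as $z_{1-\alpha/2}^2$. Names and numbers are illustrative.

```python
import math
from statistics import NormalDist

def large_sample_poisson_test(x, mu0, alpha=0.05):
    """Return (X^2, p-value, reject?) for H0: mu = mu0 vs H1: mu != mu0."""
    stat = (x - mu0) ** 2 / mu0
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2  # chi-square(1) upper alpha point
    p_value = math.erfc(math.sqrt(stat / 2))         # Pr(chi-square(1) > stat)
    return stat, p_value, stat > crit

print(large_sample_poisson_test(x=20, mu0=10.0))
```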