Inferential Statistics 4 Maarten Buis 18/01/2006
Outline • interpretation of confidence interval • confidence interval and testing • Analysis of Variance
Interpreting confidence intervals • If you draw a hundred samples and compute a 95% confidence interval of the mean in each of these samples, then the population mean will be inside the interval in about 95 of those samples. • If you draw one sample and compute the confidence interval, then the population mean is either within that interval or it is not. • So you cannot say that the population mean is in that particular interval with 95% probability.
Confidence vs. Probability • The procedure produces an interval that contains the population mean in 95% of the times it is used. • You have no way of knowing if you are one of the 95% ‘lucky ones’ or the 5% ‘unlucky ones’ when you have drawn one sample and computed a confidence interval. • All you can say is that you have used a high quality method to construct the interval.
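The claim about repeated samples can be checked with a small simulation. The sketch below is only an illustration: it assumes a normally distributed population with mean 258 and standard deviation 99 (hypothetical values), samples of N = 19, and the two-sided critical t-value 2.101 for 18 degrees of freedom.

```python
import math
import random
import statistics

random.seed(1)                     # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 258.0, 99.0     # assumed (hypothetical) population values
N = 19                             # sample size, so df = 18
T_CRIT = 2.101                     # two-sided 95% critical t-value for df = 18
REPS = 2000                        # number of samples to draw


def interval_covers_mean():
    """Draw one sample, build its 95% CI, report whether it contains POP_MEAN."""
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)   # standard error of the mean
    return m - T_CRIT * se <= POP_MEAN <= m + T_CRIT * se


coverage = sum(interval_covers_mean() for _ in range(REPS)) / REPS
print(f"share of intervals containing the population mean: {coverage:.3f}")
```

The population mean never changes in this simulation; only the intervals move from sample to sample, and the share that contain the mean should come out close to 0.95.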
confidence interval and the sampling distribution • If we have an estimate of the sampling distribution, then the 2.5th and the 97.5th percentiles will form the 95% confidence interval. • These percentiles are the critical values and can be looked up in the appropriate table. • In 5% of the samples the true parameter will be outside that interval. • Notice that the true parameter remains fixed and the estimates of the lower and upper bound change between samples.
Best estimate of the sampling distribution of a mean • Our best estimate of the mean in the population is the mean in the sample • So, our best estimate of the mean of the sampling distribution is the mean of the sample • Our best estimate of the standard error is the standard deviation divided by the square root of N • So our best estimate of the sampling distribution of the mean is a t-distribution with mean equal to the sample mean, a standard deviation of the standard error, and N-1 degrees of freedom
confidence interval for mean rent • N = 19, so df = 18 • look up the two-sided critical t-value in Appendix B, table 2: 2.101 • mean is 258, s = 99, so se = 99/√19 = 22.7 • lb = 258 - 22.7*2.101 = 210 • ub = 258 + 22.7*2.101 = 306
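The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch using the numbers given above (N = 19, mean 258, s = 99, critical value 2.101); the variable names are illustrative only.

```python
import math

n = 19          # sample size
mean = 258.0    # sample mean of rent
s = 99.0        # sample standard deviation
t_crit = 2.101  # two-sided 95% critical t-value for df = n - 1 = 18

se = s / math.sqrt(n)      # standard error of the mean
lb = mean - t_crit * se    # lower bound of the 95% confidence interval
ub = mean + t_crit * se    # upper bound of the 95% confidence interval

print(f"se = {se:.1f}, lb = {lb:.0f}, ub = {ub:.0f}")
```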
Comparing means of more than two groups • Until now we have compared the means of two groups. We have not yet: • compared the means of more than two groups, or • compared means across a continuous x-variable (regression) • In these cases we use analysis of variance (ANOVA) and the F-test
The Null Hypothesis • The null hypothesis is that the means of all groups are equal: μ1 = μ2 = μ3 = ... = μk • We observe the means of groups 1 through k: M1, M2, M3, ..., Mk, and these differ, if only because of sampling error • Are these differences large enough to reject H0?
Decomposition of Sum of Squares • McCall p. 358 • Yi is an individual score, Mk is the mean of the group that score belongs to, and M is the overall mean • (Yi-M) = (Yi-Mk) + (Mk-M) • The deviation of a score from the overall mean consists of the deviation of the score from its group mean plus the deviation of the group mean from the overall mean. • Square and sum: SStotal = SSwithin + SSbetween
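The decomposition can be verified numerically. The sketch below uses three small made-up groups (hypothetical data, not from McCall) and checks that the total sum of squares equals the within-group plus the between-group sum of squares.

```python
from statistics import mean

# Hypothetical scores for three groups (illustration only)
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

all_scores = [y for g in groups for y in g]
grand_mean = mean(all_scores)                    # M, the overall mean

# SStotal: squared deviations of every score from the overall mean
ss_total = sum((y - grand_mean) ** 2 for y in all_scores)

# SSwithin: squared deviations of every score from its own group mean (Mk)
ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

# SSbetween: squared deviation of each group mean from the overall mean,
# counted once for every score in that group
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

print(ss_total, ss_within, ss_between)
```

With these data the group means are 5, 7 and 9 and the overall mean is 7, so the three sums can also be worked out by hand.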
Mean Sum of Squares • Estimates of the Mean Sum of Squares (variance) are obtained by dividing the Sum of Squares by the number of degrees of freedom: • MStotal = SStotal/(N-1) • MSwithin = SSwithin/(N-k) • MSbetween = SSbetween/(k-1) • N is the sample size and k is the number of groups
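As a small illustration of these three formulas, the sketch below turns sums of squares into mean squares. The function name `mean_squares` and the input values (SSbetween = 24, SSwithin = 6, N = 9, k = 3) are hypothetical and chosen only for the example.

```python
def mean_squares(ss_between, ss_within, n, k):
    """Return (MStotal, MSwithin, MSbetween) for n observations in k groups."""
    ss_total = ss_between + ss_within       # SStotal = SSwithin + SSbetween
    ms_total = ss_total / (n - 1)           # df total   = N - 1
    ms_within = ss_within / (n - k)         # df within  = N - k
    ms_between = ss_between / (k - 1)       # df between = k - 1
    return ms_total, ms_within, ms_between


# Hypothetical example: 9 observations in 3 groups
ms_total, ms_within, ms_between = mean_squares(24, 6, n=9, k=3)
print(ms_total, ms_within, ms_between)
```

Note that the degrees of freedom add up in the same way as the sums of squares: (N-k) + (k-1) = N-1.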
old friends • MStotal = variance • MSwithin = (standard error of the estimate)² • SSbetween/SStotal = R², the proportion of variance explained, so: • MSbetween reflects the variance explained
F-test • The F statistic is just an estimate like the mean, or the correlation, so it has a sampling distribution: the F-distribution, appendix 2, table E. • The F-distribution has two types of degrees of freedom: • for the numerator (MSbetween): k-1, and • for the denominator (MSwithin): N-k
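F-test • If H0 is true (all group means are equal), then MSwithin and MSbetween estimate the same variance, so on average MSwithin = MSbetween • Otherwise MSbetween > MSwithin • F = MSbetween / MSwithin • So H0 can be rewritten as: F = 1 • And HA: F > 1 • This is not a directional hypothesis, since F > 1 implies only that at least two of μ1, μ2, μ3, ..., μk differ, not which of them is larger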
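Putting the pieces together, the F statistic can be computed directly from grouped data. The following is a minimal sketch with made-up data; the function name `f_statistic` is hypothetical, and the resulting F still has to be compared with the critical value from the F table for (k-1, N-k) degrees of freedom.

```python
from statistics import mean


def f_statistic(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [y for g in groups for y in g]
    n, k = len(all_scores), len(groups)
    grand_mean = mean(all_scores)

    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

    ms_between = ss_between / (k - 1)    # numerator, df = k - 1
    ms_within = ss_within / (n - k)      # denominator, df = N - k
    return ms_between / ms_within, k - 1, n - k


# Hypothetical scores for three groups (illustration only)
f_value, df_num, df_den = f_statistic([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
print(f"F({df_num}, {df_den}) = {f_value}")
```

An F well above 1 means the group means vary more than the within-group spread alone would lead us to expect; whether it is large enough to reject H0 depends on the critical value for the given degrees of freedom.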
To do before Monday • read chapter 14, pay special attention to pp. 356-360 • Skip: • pp. 367-375 computational procedure • pp. 375-385 • Use SPSS when working the exercises with example data