Chapter 7 and Chapter 8
Inference for the Mean of a Population – Part 1 Chapter 7.1 (omit sign test pp 469 – 470)
The situation where σ is not known • If σ is known, then the standard deviation of the sample mean is σ/√n. • We now consider the more realistic situation where σ is not known. In effect, we estimate σ using s, the sample standard deviation.
Example: The following data are the amounts of vitamin C, measured in mg per 100 grams of blend (dry basis), for a random sample of size 8 from a production run: 26, 31, 23, 22, 11, 22, 14, 31. We want a 95% c.i. for µ, the mean vitamin C content produced during this run.
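A minimal sketch of the one-sample t interval for the vitamin C data, xbar ± t* · s/√n with n − 1 = 7 degrees of freedom. The critical value 2.365 is typed in from a standard t table (an assumption of this sketch, not something computed here), and the variable names are illustrative only.

```python
import math
import statistics

# Vitamin C amounts (mg per 100 g) from the slide
vitamin_c = [26, 31, 23, 22, 11, 22, 14, 31]

n = len(vitamin_c)
xbar = statistics.mean(vitamin_c)   # sample mean
s = statistics.stdev(vitamin_c)     # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)               # standard error of the mean

# Assumption: t* for 95% confidence with df = 7, read from a t table
T_STAR_DF7_95 = 2.365

margin = T_STAR_DF7_95 * se
ci_low, ci_high = xbar - margin, xbar + margin

print(f"mean = {xbar:.2f}, s = {s:.3f}, SE = {se:.3f}")
print(f"95% c.i. for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The interval is only as trustworthy as the normality assumption for such a small sample, so a plot of the data should come first.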
Example: A random sample of 10 one-bedroom apartment rental ads from your local newspaper has these monthly rents (dollars): 500, 650, 600, 505, 450, 550, 515, 495, 650, 395. Do these data give good reason to believe that the mean rent of all advertised one-bedroom apartments is greater than $500 per month?
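The rent question can be set up as a one-sided t test of H0: µ = 500 against Ha: µ > 500. The sketch below computes t = (xbar − 500)/(s/√n) and compares it with the one-sided 5% critical value for 9 degrees of freedom, 1.833, which is taken from a t table (an assumption of this sketch). The 5% level is also an illustrative choice; the slide does not fix one.

```python
import math
import statistics

# Monthly rents (dollars) from the slide
rents = [500, 650, 600, 505, 450, 550, 515, 495, 650, 395]
MU_0 = 500  # hypothesized mean rent under H0

n = len(rents)
xbar = statistics.mean(rents)
s = statistics.stdev(rents)
se = s / math.sqrt(n)

# One-sample t statistic for H0: mu = 500 vs Ha: mu > 500
t_stat = (xbar - MU_0) / se

# Assumption: upper 5% critical value for df = 9, read from a t table
T_CRIT_DF9_05 = 1.833
reject_h0 = t_stat > T_CRIT_DF9_05

print(f"mean = {xbar}, s = {s:.2f}, t = {t_stat:.3f}")
print("reject H0 at the 5% level" if reject_h0 else "do not reject H0 at the 5% level")
```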
Matched Pairs • Here are some sales before and after a motivational course. Does the course appear to be effective in increasing sales?
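A matched pairs analysis reduces to a one-sample t procedure on the differences (after minus before). The sales table from the slide is not reproduced here, so the numbers below are hypothetical placeholders chosen only to show the mechanics; the function name `paired_t_statistic` is likewise illustrative.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """Return (mean difference, t statistic, df) for H0: mean difference = 0."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    dbar = statistics.mean(diffs)
    sd = statistics.stdev(diffs)      # standard deviation of the differences
    t_stat = dbar / (sd / math.sqrt(n))
    return dbar, t_stat, n - 1

# Hypothetical data, NOT the values from the slide
before_sales = [10, 12, 9, 11]
after_sales = [12, 15, 10, 12]

dbar, t_stat, df = paired_t_statistic(before_sales, after_sales)
print(f"mean difference = {dbar:.2f}, t = {t_stat:.3f}, df = {df}")
```

The resulting t statistic would then be compared with a t table on n − 1 degrees of freedom, one-sided if the question is whether sales increased.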
Robustness of the t procedures • A statistical procedure is said to be robust if the probability calculations required are insensitive to violations of the assumptions made. • For t procedures: • n < 15: use t if the data are clearly close to normal. If the data are clearly non-normal or outliers are present, do not use t. • 15 ≤ n < 40: can use t except in the presence of outliers or strong skewness. • Large samples (roughly n ≥ 40): can use t procedures even for clearly skewed data.
Inference for the Mean of a Population – Part 2: Comparing Two Means Chapter 7.2 (omit pp 498- 503)
Overview • Want to compare the means of two populations. • Can use c.i. or hypothesis tests. • Many specialized procedures exist, depending on the data and the underlying distributions. • We'll look at some of the most important ones.
The idealized situation • We assume the variances are known and the populations are normal. This doesn't happen often in practice. • Can do hypothesis tests and compute p-values as in Ch 6. • Example: σ1 = 20, σ2 = 30, n1 = 120, n2 = 150, x̄1 = 67.3, x̄2 = 72.0. H0: µ1 − µ2 = 0. Ha: µ1 − µ2 ≠ 0. (a) Compute the z statistic and p-value. (b) Get a 95% c.i. for µ1 − µ2.
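A sketch of parts (a) and (b) for the known-variance example, using z = (x̄1 − x̄2)/√(σ1²/n1 + σ2²/n2) and the standard normal distribution from Python's `statistics.NormalDist`. Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values given on the slide
sigma1, sigma2 = 20.0, 30.0
n1, n2 = 120, 150
xbar1, xbar2 = 67.3, 72.0

std_normal = NormalDist()  # mean 0, standard deviation 1

# Standard deviation of xbar1 - xbar2 when both sigmas are known
se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
diff = xbar1 - xbar2

# (a) z statistic and two-sided p-value for H0: mu1 - mu2 = 0
z = diff / se
p_value = 2 * std_normal.cdf(-abs(z))

# (b) 95% confidence interval for mu1 - mu2
z_star = std_normal.inv_cdf(0.975)
ci_low, ci_high = diff - z_star * se, diff + z_star * se

print(f"z = {z:.3f}, two-sided p-value = {p_value:.3f}")
print(f"95% c.i. for mu1 - mu2: ({ci_low:.2f}, {ci_high:.2f})")
```

The interval contains 0, which agrees with a two-sided p-value above 0.05.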
Two sample t-procedures • The most common situation. • We use the sample standard deviations s1 and s2 to estimate σ1 and σ2.
Example • The purchasing department has suggested that all new computer monitors for your company should have flat screens. You want to be sure employees like them. The next 20 employees needing screens are randomly divided into two groups of 10: one group gets flat screens, the other gets conventional monitors. One month after receiving the monitors, the employees rate their satisfaction on a scale from 1 to 5 by responding to the statement “I like my new monitor” (1 = strongly disagree, 5 = strongly agree). Flat screen employees have an average satisfaction of 4.6 with a standard deviation of 0.7. The employees with the standard monitors have an average of 3.2 with a standard deviation of 1.6. (a) Give a 95% c.i. for the difference in mean satisfaction scores for all employees. (b) What about a hypothesis test for comparing the two means?
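A sketch of the two-sample t calculations for the monitor example, without assuming equal variances. It assumes the conservative choice of degrees of freedom, min(n1, n2) − 1 = 9, with t* = 2.262 read from a t table; software typically uses the Welch-Satterthwaite approximation instead, which is also computed below for comparison. Names are illustrative.

```python
import math

# Summary statistics from the slide
n1, xbar1, s1 = 10, 4.6, 0.7   # flat screens
n2, xbar2, s2 = 10, 3.2, 1.6   # conventional monitors

v1, v2 = s1**2 / n1, s2**2 / n2
se = math.sqrt(v1 + v2)        # standard error of xbar1 - xbar2
diff = xbar1 - xbar2

# Assumption: conservative df = min(n1, n2) - 1 = 9, t* = 2.262 from a t table
T_STAR_DF9_95 = 2.262
ci_low = diff - T_STAR_DF9_95 * se
ci_high = diff + T_STAR_DF9_95 * se

# Two-sample t statistic for H0: mu1 = mu2
t_stat = diff / se

# Welch-Satterthwaite degrees of freedom, as statistical software would use
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(f"difference = {diff:.2f}, SE = {se:.4f}, t = {t_stat:.3f}")
print(f"95% c.i. (df = 9): ({ci_low:.2f}, {ci_high:.2f})")
print(f"Welch df = {df_welch:.2f}")
```

Because t exceeds 2.262, a two-sided test at the 5% level with df = 9 would reject equal means, consistent with the interval excluding 0.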
Robustness of the two sample procedures • Generally the procedures are quite robust. • If the sample sizes are equal and the distributions of the two populations have similar shapes, p-values from the t table are quite accurate even when n1 and n2 are as small as 5. • If the sample sizes are unequal, use the same guidelines as for one-sample t tests and c.i., but replace n by n1+n2: • n1+n2 < 15: use t if the data are clearly close to normal. If the data are clearly non-normal or outliers are present, do not use t. • 15 ≤ n1+n2 < 40: can use t except in the presence of outliers or strong skewness. • Large samples (roughly n1+n2 ≥ 40): can use t procedures even for clearly skewed data.
Small samples • Have to be very careful. There is substantial uncertainty in the estimates, but if the difference in means is large, we can often detect it. • Specialized procedures: if we can assume that the two populations have equal variances, then we can use the pooled estimator. • Can test for equal variances (F test). • Numerical procedures (optional) appear in the text.
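For reference, a sketch of the pooled estimator: s_p² = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2), with standard error s_p·√(1/n1 + 1/n2) and n1 + n2 − 2 degrees of freedom. The function names are illustrative. The summary statistics from the monitor example are reused purely as inputs; with standard deviations of 0.7 and 1.6 the equal-variance assumption is doubtful there.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation under the equal-variance assumption."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)

def pooled_se(n1, s1, n2, s2):
    """Standard error of xbar1 - xbar2 using the pooled standard deviation."""
    return pooled_sd(n1, s1, n2, s2) * math.sqrt(1 / n1 + 1 / n2)

# Monitor example summary statistics, used only to exercise the formulas
sp = pooled_sd(10, 0.7, 10, 1.6)
se_pooled = pooled_se(10, 0.7, 10, 1.6)
df_pooled = 10 + 10 - 2

print(f"pooled sd = {sp:.4f}, pooled SE = {se_pooled:.4f}, df = {df_pooled}")
```

When n1 = n2 the pooled and unpooled standard errors coincide; the procedures then differ only in their degrees of freedom.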
Excel • The Data Analysis ToolPak can do the two-sample t tests that we have discussed, plus the optional material: • Most important for us is the two-sample t test that does not assume equal variances. • Excel also does the calculation for a specialized test that assumes the two populations have equal variances. • All are very easy to use. • We should always plot the data, do normal quantile plots, etc.
Excel example • Example– Do piano lessons improve spatial-temporal reasoning? • Excel output appears below.
Chapter 8 Inferences for Proportions (Section 8.1)
How do sample proportions behave? • Chapter 5 tells us …
Example • An SRS of 1600 BC residents found that 954 favored construction of a new highway to Whistler. Give a 95% c.i. for the true proportion of BC residents who favor a new highway to Whistler.
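A sketch of the large-sample interval for the highway example, p̂ ± z*·√(p̂(1 − p̂)/n), with z* taken from `statistics.NormalDist`. Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Counts from the slide
n = 1600
favor = 954

p_hat = favor / n                           # sample proportion
se = math.sqrt(p_hat * (1 - p_hat) / n)     # standard error of p_hat
z_star = NormalDist().inv_cdf(0.975)        # about 1.96 for 95% confidence

margin = z_star * se
ci_low, ci_high = p_hat - margin, p_hat + margin

print(f"p_hat = {p_hat:.4f}, SE = {se:.4f}")
print(f"95% c.i.: ({ci_low:.3f}, {ci_high:.3f})")
```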
Using the plus 4 estimator for small samples • 9 of 15 people in an SRS of Buec 232 students felt that the course workload was too heavy. Compute an approximate 90% c.i. for the proportion of students who felt the course workload was too heavy.
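A sketch of the plus four interval for the workload example: add two successes and two failures, so p̃ = (X + 2)/(n + 4), then use p̃ ± z*·√(p̃(1 − p̃)/(n + 4)). Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Counts from the slide
n = 15
too_heavy = 9

# Plus four estimate: two extra successes and two extra failures
n_tilde = n + 4
p_tilde = (too_heavy + 2) / n_tilde

se = math.sqrt(p_tilde * (1 - p_tilde) / n_tilde)
z_star = NormalDist().inv_cdf(0.95)   # about 1.645 for 90% confidence

ci_low = p_tilde - z_star * se
ci_high = p_tilde + z_star * se

print(f"plus four estimate = {p_tilde:.4f}, SE = {se:.4f}")
print(f"approximate 90% c.i.: ({ci_low:.3f}, {ci_high:.3f})")
```

The interval is wide, which reflects how little information 15 observations carry about a proportion.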
Hypothesis tests for proportions – we use sample proportion rather than plus 4 estimate.
Example • We found that 11 customers in a sample of 40 would be willing to buy a software upgrade that costs $100. If the upgrade is to be profitable, we will need to sell it to more than 20% of our customers. Do the sample data give good evidence that more than 20% are willing to buy?
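A sketch of the large-sample z test for the upgrade example, H0: p = 0.20 against Ha: p > 0.20. The test uses the sample proportion (not the plus four estimate) and the standard error computed under H0, √(p0(1 − p0)/n). Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Counts from the slide
n = 40
willing = 11
P_0 = 0.20   # proportion under H0

p_hat = willing / n
se_null = math.sqrt(P_0 * (1 - P_0) / n)   # standard error assuming H0 is true

# z statistic and one-sided p-value for Ha: p > 0.20
z = (p_hat - P_0) / se_null
p_value = 1 - NormalDist().cdf(z)

print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, one-sided p-value = {p_value:.3f}")
```

A p-value of this size is not strong evidence that more than 20% of customers would buy.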
A poll (March 2, 2004) estimated that support for the BC Liberal party was 39%. Using this estimate as a “guessed value” for a follow up study, how large a sample would I need to estimate Liberal support to within +/- 3%? I want a 95% level of confidence in my estimate.
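A sketch of the sample size calculation, n = (z*/m)²·p*(1 − p*), rounded up, with guessed value p* = 0.39 and margin of error m = 0.03. The conservative choice p* = 0.5 is included for comparison; it is not part of the question on the slide. The function name is illustrative.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p_guess, margin, confidence=0.95):
    """Smallest n whose margin of error is at most `margin` if p equals p_guess."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))

n_guess = sample_size_for_proportion(0.39, 0.03)          # using the poll's 39%
n_conservative = sample_size_for_proportion(0.50, 0.03)   # worst case p* = 0.5

print(f"n with p* = 0.39: {n_guess}")
print(f"n with p* = 0.50: {n_conservative}")
```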