Section 9.1 Introduction to Inference
Question for you… • With replacement or without replacement?
Statistical Inferences • Draw conclusions about a population based on data from a sample. • Ask questions about a number that describes a population. • Numbers that describe populations are called… • Parameters • To estimate a parameter, choose a sample from the population and use a statistic (a number calculated from the sample).
Sample Proportions • Population proportion: $p$ • Sample proportion: $\hat{p}$ (also known as p-hat) – we’ve seen this before… it’s the count of individuals in the sample who have the characteristic of interest (e.g., said yes) divided by the sample size • Example: The 2001 Youth Risk Behavioral Survey questioned a nationally representative sample of 12,960 students in grades 9-12. Of these, 3340 said they had smoked cigarettes at least one day in the past month. • Who is the population? • How do you write the sample proportion?
The Answers • The population is high school students in the United States. • The sample is the 12,960 students surveyed. • The sample proportion (p-hat) is $\hat{p} = \dfrac{3340}{12960} \approx 0.2577$. • Give all answers to 4 decimal places.
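The sample proportion in this example can be checked with a few lines of Python. This is only a minimal sketch; the function name `sample_proportion` is made up for illustration and is not part of the slides.

```python
def sample_proportion(count, n):
    """Return p-hat: the count with the characteristic divided by the sample size."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return count / n

# Survey numbers from the example: 3340 of 12,960 students said yes.
p_hat = sample_proportion(3340, 12960)
print(round(p_hat, 4))  # 0.2577
```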
What it means… • The parameter is the proportion of all U.S. high school students who have smoked cigarettes at least one day in the past month. (This is usually an unknown value.) • Using the value of p-hat, we can make general statements about the population. • Based on this sample, we can estimate that the proportion of all high school students who have smoked cigarettes in the past month is about 25.77%.
Confidence Intervals • In the last example, the estimate was about 25.77%. To use a method that captures the true population parameter in 95% of all samples, build a 95% confidence interval… • Remember that with a 68-95-99.7 curve, 95% is 2 standard deviations in each direction. • This is the formula we will use: $\hat{p} \pm 2\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
Confidence Intervals • Using this formula will give two endpoints for a confidence interval. For about 95% of all samples, the interval built this way will contain the true population proportion. • In this formula, the square-root term is the estimated standard deviation of p-hat. n stands for the number in the sample.
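What "95% of all samples" means can be illustrated with a small simulation: draw many samples from a population whose true proportion is known, build the interval from each sample, and count how often the interval contains the true value. This is a rough sketch only. The true proportion of 0.25, the sample size of 500, and the function names are all assumptions chosen for illustration, not values from the survey.

```python
import math
import random

def interval_covers(true_p, n, multiplier, rng):
    """Draw one sample of size n and report whether its interval contains true_p."""
    count = sum(rng.random() < true_p for _ in range(n))  # simulated "yes" answers
    p_hat = count / n
    margin = multiplier * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin <= true_p <= p_hat + margin

def coverage_rate(true_p=0.25, n=500, trials=2000, multiplier=2, seed=1):
    """Fraction of simulated samples whose interval captures true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(interval_covers(true_p, n, multiplier, rng) for _ in range(trials))
    return hits / trials

print(coverage_rate())  # expected to land near 0.95
```

Each individual interval either contains the true proportion or it does not; the 95% describes how often the method succeeds over many samples.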
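Example • Remember, p-hat=0.2577 and n=12960. • $0.2577 \pm 2\sqrt{\dfrac{0.2577(1-0.2577)}{12960}} \approx 0.2577 \pm 2(0.0038) = 0.2577 \pm 0.0076$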
The answer… • The interval is (0.2501, 0.2653). • Intervals built this way catch the true unknown population proportion in 95% of all samples. • In other words, we are 95% confident that the true proportion of high school students who have smoked cigarettes at least one day in the past month is between 25.01% and 26.53%.
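The arithmetic behind this interval can be sketched in Python. The function name `confidence_interval` is hypothetical, and rounding the standard deviation to 4 decimal places before multiplying is an assumption made here so that the result lines up with the "4 decimal places" convention used in these slides.

```python
import math

def confidence_interval(p_hat, n, multiplier, digits=4):
    """Return (low, high) for p_hat +/- multiplier * sqrt(p_hat * (1 - p_hat) / n)."""
    # Assumption: round the standard deviation to `digits` places first,
    # the way a hand calculation to 4 decimal places would.
    std_dev = round(math.sqrt(p_hat * (1 - p_hat) / n), digits)
    margin = multiplier * std_dev
    return round(p_hat - margin, digits), round(p_hat + margin, digits)

low, high = confidence_interval(0.2577, 12960, 2)
print(low, high)  # 0.2501 0.2653
```

Carrying full precision instead of rounding the standard deviation moves each endpoint by about one unit in the fourth decimal place.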
Another way you may see it… • Estimate ± Margin of Error • What would the confidence interval be here?
Critical Values • So far, we have used 68% = 1 std. dev., 95% = 2 std. dev., and 99.7% = 3 std. dev. • These multipliers are rounded approximations. • More accurate values are Critical Values, denoted z* (this is on page 502):
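If the table is not at hand, critical values can also be computed from the standard normal curve. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `critical_value` is made up for illustration. For a confidence level C, z* is the point with area (1 - C)/2 above it.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return z* such that the central `confidence` area lies within +/- z*."""
    upper_tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99, 0.999):
    print(f"{level:.1%}: z* = {critical_value(level):.3f}")
```

Printed tables may differ from these values in the last digit because of rounding.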
So this changes the equation… • z* now takes the place of the number of standard deviations outside the square root: $\hat{p} \pm z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
Conclusions • We are 95% confident that the true proportion of high school students who have smoked cigarettes at least one day in the past month is between 25.03% and 26.51%. • Earlier, using “2” for the number of standard deviations instead of the critical value z* = 1.96, we had stated: • We are 95% confident that the true proportion of high school students who have smoked cigarettes at least one day in the past month is between 25.01% and 26.53%.
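The two versions of the 95% interval can be compared side by side. This is an illustrative sketch; the function name `interval` is hypothetical, and the standard deviation is rounded to 4 decimal places as an assumption, matching the hand calculation.

```python
import math

def interval(p_hat, n, multiplier, digits=4):
    """Return (low, high, margin) for p_hat +/- multiplier * standard deviation."""
    # Assumption: the standard deviation is rounded to 4 decimal places first.
    std_dev = round(math.sqrt(p_hat * (1 - p_hat) / n), digits)
    margin = multiplier * std_dev
    return round(p_hat - margin, digits), round(p_hat + margin, digits), margin

p_hat, n = 0.2577, 12960
for label, multiplier in (("68-95-99.7 rule (2)", 2), ("critical value (1.96)", 1.96)):
    low, high, margin = interval(p_hat, n, multiplier)
    print(f"{label}: ({low}, {high}), margin of error about {margin:.4f}")
```

The critical value gives a slightly smaller margin of error, so its interval is slightly narrower.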
Conclusions • While this is a very small difference, the benefit of using critical values is that there are more confidence levels to choose from (8) than the 3 offered by the 68-95-99.7 rule. • The table of critical values z* covers the following confidence levels: 50%, 60%, 70%, 80%, 90%, 95%, 99%, and 99.9%. • Whole numbers of standard deviations cover only 68%, 95%, and 99.7%.
Comparing critical values… • Remember, the higher the confidence level, the wider the interval. If you don’t need a high level of confidence, you can use a lower level (like 50%). A 50% level gives a narrower interval, but intervals built at that level capture the true unknown population proportion in only about half of all samples. • 50%: (0.2551, 0.2603) • 60%: (0.2545, 0.2609) • 70%: (0.2537, 0.2617) • 80%: (0.2528, 0.2626) • 90%: (0.2514, 0.2640) • 95%: (0.2503, 0.2651) • 99%: (0.2478, 0.2676) • 99.9%: (0.2451, 0.2704) • So the narrowest interval is 25.51% to 26.03% and the widest interval is 24.51% to 27.04%.
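The whole list can be generated in one loop. This sketch computes z* with `NormalDist` from Python's standard `statistics` module and carries full precision throughout; the function name `z_interval` is made up for illustration. Depending on how intermediate values are rounded, an endpoint may differ from the list above by about 0.0001.

```python
import math
from statistics import NormalDist

def z_interval(count, n, confidence):
    """Return (low, high) for p-hat +/- z* * sqrt(p-hat * (1 - p-hat) / n)."""
    p_hat = count / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

for level in (0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99, 0.999):
    low, high = z_interval(3340, 12960, level)
    print(f"{level:.1%}: ({low:.4f}, {high:.4f}), width {high - low:.4f}")
```

The width column grows steadily with the confidence level, which is the trade-off described above.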
Quiz… • What is the formula for p-hat? • $\hat{p} = \dfrac{\text{count in the sample who said yes}}{n}$
Quiz… • What is the formula for a confidence interval, using z* (just the basic formula, no numbers)? • $\hat{p} \pm z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
Quiz… • What does n stand for in the equation? • The sample size.
Quiz… • How many critical values (z* values) are there to choose from? • 8 in the table, one for each of these confidence levels: 50%, 60%, 70%, 80%, 90%, 95%, 99%, and 99.9%
Homework • Page 495-496, #9.1-9.5 • Page 505, #9.9, 9.10