STAT 110 - Section 5 Lecture 5 Professor Hao Wang University of South Carolina Spring 2012
Last time • How to get a simple random sample (SRS)
[Diagram: a SAMPLE selected from the POPULATION]
This time • What Do Samples Tell Us?
Chapter 3 – What Do Samples Tell Us? parameter – a number that describes the population. A parameter is a fixed number, but in practice we don’t know its value.
statistic – a number that describes a sample. The value of a statistic is known once we have taken a sample, but it can change from sample to sample. We often use a statistic to estimate an unknown population parameter.
The BIG Picture • We select a Sample from the Population. • A Parameter describes the Population. • We calculate a Statistic from the Sample. • We use the Statistic to estimate the Parameter.
Example: Parameter and Statistic • Voter registration records show that 18% of all voters in Philadelphia are registered as Republicans. However, a radio talk show host in Philadelphia found that of 20 local residents who called the show recently, 60% were registered Republicans. Population? Parameter? Sample? Statistic?
Example: Parameter and Statistic The Bureau of Labor Statistics announces that last month it interviewed all members of the labor force in a sample of 55,000 households; 5.6% of the people interviewed were unemployed. Population? Parameter?
Example: Parameter and Statistic A lot of ball bearings has an average diameter of 2.503 cm. This is within the specifications of the purchaser. The inspector inspects 100 bearings from the lot, and they have an average diameter of 2.515 cm. The lot is rejected. Population? Sample? A. Lot of ball bearings B. 100 ball bearings. Parameter? Statistic? A. 2.503 cm B. 2.515 cm
Proportions • A Columbia-based health club wants to estimate the proportion of Columbia residents who enjoy running. Let • p = proportion of all Columbia residents who enjoy running • We decide to take an SRS of n = 100 Columbia residents. • p-hat = proportion of residents in our sample who enjoy running
Proportions • In our SRS of n = 100 Columbia residents, 17 said that they enjoy running. The sample proportion is p-hat = 17/100 = 0.17. • Suppose now that I take another SRS of Columbia residents of size n = 100 and 22 of them said that they enjoy running. Find p-hat for this second sample.
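The sample proportion on this slide is a single division, and a short Python sketch makes the sample-to-sample change visible. The function name `sample_proportion` is a hypothetical helper chosen for illustration; the counts are the ones given on the slide.

```python
def sample_proportion(successes: int, n: int) -> float:
    """Return p-hat: the fraction of the sample that has the trait of interest."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return successes / n


# Two SRSs of n = 100 Columbia residents, counts taken from the slide.
first_p_hat = sample_proportion(17, 100)
second_p_hat = sample_proportion(22, 100)

print("first sample p-hat: ", first_p_hat)
print("second sample p-hat:", second_p_hat)
```

The two values differ even though both samples come from the same population, which is the sense in which a statistic changes from sample to sample while the parameter p stays fixed.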
Example How many hours of sleep does an average USC undergrad get? Ask your 2 neighbors and average their answers. A. less than 5 hrs B. 5-6 C. 7-8 D. 9-10 E. more than 10 hrs Population? Parameter? Sample? Statistic?
Example How many hours of sleep does an average USC undergrad get? Ask your 5 neighbors and average their answers. A. less than 5 hrs B. 5-6 C. 7-8 D. 9-10 E. more than 10 hrs Population? Parameter? Sample? Statistic?
Sampling Variability bias – consistent, repeated deviation of the sample statistic from the population parameter in the same direction when we take many samples; the statistic systematically misses in the same direction. variability – describes how spread out the values of the sample statistic are when we take many samples; the amount of scattering.
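Variability of 1,000 values of p-hat from samples of size n = 100 [Figure: histogram of the p-hat values]
Variability of 1,000 values of p-hat from samples of size n = 1,523 [Figure: histogram of the p-hat values] Notice that with larger samples (1,523 vs. 100), there is a lot less variability, but the distribution is still centered at p = 0.60 (so p-hat is unbiased for p).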
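The comparison between the two sample sizes can be approximated with a small simulation. This is only a sketch: it assumes each respondent is an independent yes/no draw with probability p = 0.60, which approximates an SRS from a very large population, and the function name, the seed, and the use of the standard deviation as the measure of spread are choices made here, not part of the lecture.

```python
import random
import statistics


def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return the p-hat from each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(num_samples):
        # Count respondents who say "yes"; each says yes with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        p_hats.append(successes / n)
    return p_hats


small = simulate_p_hats(p=0.60, n=100, num_samples=1000)
large = simulate_p_hats(p=0.60, n=1523, num_samples=1000)

for label, values in (("n = 100", small), ("n = 1523", large)):
    print(label,
          "| center:", round(statistics.mean(values), 3),
          "| spread:", round(statistics.stdev(values), 3))
```

Both sets of p-hat values should center near 0.60 (low bias), while the spread for n = 1523 should be much smaller than for n = 100 (lower variability).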
Example: 2012 Florida Republican Primary http://www.rasmussenreports.com/public_content/politics/elections/election_2012/election_2012_presidential_election/florida/2012_florida_republican_primary
In the previous poll: A – The population is the 750 voters. B – The population is all likely Florida voters.
In the previous poll: A – The percent of all likely FL voters favoring Gingrich is the parameter and the 41% of the 750 is the statistic. B – The percent of all likely FL voters favoring Gingrich is the statistic and the 41% of the 750 is the parameter.
In the previous poll: A – The variability is because Gingrich has been in the news a lot recently, and the bias is because it was a random sample. B – The variability is because it was a random sample, and the bias is because Gingrich has been in the news a lot recently.
Margin of Error • During the week of 8/10/01, CNN asked an SRS of 1000 Americans whether they approve of President Bush's performance as President. The approval rating was 57% (plus or minus 3%). During the week of 9/21/01, CNN repeated the poll, again asking an SRS of 1000 Americans. The approval rating was 90% (plus or minus 3%). • Why the difference in ratings? • Where does plus or minus 3% come from?
Margin of Error The margin of error (MOE) is a value that quantifies the uncertainty in our estimate. When using the sample proportion to estimate the population proportion, the MOE is a measure of how close we believe the sample proportion is to the population proportion.
Calculating Margin of Error • Use the sample proportion p-hat from an SRS of size n to estimate an unknown population proportion p. • For 95% confidence (the quick formula): margin of error = $1/\sqrt{n}$
Example: Margin of Error • The CNN Poll interviewed 1000 people. What is the margin of error for 95% confidence (using the quick formula)? Answer: $1/\sqrt{1000} \approx 0.032$, or about 3%. Recall that the quick formula is for 95% confidence.
Example: Margin of Error If the sample size is 100, what is the margin of error for 95% confidence (using the quick formula)? A. 0.10% B. 0.01% C. 10%
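The quick formula from the preceding slides, margin of error = 1/sqrt(n) for 95% confidence, can be sketched in Python as follows. The function name `quick_moe` is a hypothetical helper; the sample sizes are the two used in the examples.

```python
import math


def quick_moe(n: int) -> float:
    """Quick margin of error for 95% confidence: 1 / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return 1 / math.sqrt(n)


# Sample sizes from the two margin-of-error examples.
for n in (1000, 100):
    print(f"n = {n}: margin of error is about {quick_moe(n):.1%}")
```

Because n sits under a square root, cutting the margin of error in half takes four times as many respondents.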
Confidence Interval • Use MOE to calculate an interval that we think includes the parameter • Form for most confidence intervals: estimate ± margin of error • Approximate (because we’re using the quick MOE) 95% confidence interval for p: $\hat{p} \pm 1/\sqrt{n}$
Confidence Statements A confidence statement interprets a confidence interval and has two parts: a margin of error and a level of confidence. Margin of error says how close the statistic lies to the parameter. Level of confidence says what percentage of all possible samples result in a confidence interval which contains the true parameter
Example: President Bush • Pre 9/11: 57% with MOE 3% • Post 9/11: 90% with MOE 3%
Interpretations • We are 95% confident that the percent of all Americans who approved of the job President Bush was doing was between 54% and 60% before 9/11. • We are 95% confident that the percent of all Americans who approved of the job President Bush was doing was between 87% and 93% after 9/11.
Example: College Education This May 2011 survey finds that 57% of the 2142 adult Americans polled think that “the higher education system in the United States fails to provide students good value for the money they and their families spend”. Using the quick formula for MOE, compute a 95% confidence interval for p.
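One way to work this example is to combine the sample proportion with the quick margin of error, as in the sketch below. The function name `quick_confidence_interval` is hypothetical, and the interval is only approximate because it relies on the quick formula 1/sqrt(n).

```python
import math


def quick_confidence_interval(p_hat: float, n: int) -> tuple[float, float]:
    """Approximate 95% interval: p-hat plus or minus the quick MOE 1/sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    moe = 1 / math.sqrt(n)
    return p_hat - moe, p_hat + moe


# Survey values from the slide: 57% of 2142 adults.
low, high = quick_confidence_interval(p_hat=0.57, n=2142)
print(f"approximate 95% interval for p: {low:.3f} to {high:.3f}")
```

The resulting confidence statement is about the population: we are 95% confident that the percent of all adult Americans who hold this view lies inside the printed interval.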
Example: Coke or Pepsi Suppose you take a sample of 1231 people and ask them if they prefer Coke over Pepsi. You find that 696 say they do. What is p-hat, the observed percent in the sample? A .725 = 72.5% B .565 = 56.5% C .029 = 2.9% D .038 = 3.8%
Example: Coke or Pepsi (continued) Suppose you take a sample of 1231 people and ask them if they prefer Coke over Pepsi. You find that 696 say they do. What is the margin of error for 95% confidence? A square root of 1231 = 35.06 = 35.06% B square root of 696 = 26.38 = 26.38% C 1/square root of 1231 = 0.0285 = 2.85% D 1/square root of 696 = 0.0379 = 3.79%
Hints for Interpretation The conclusion of a confidence statement always applies to the population, not to the sample. Our conclusion about the population is never completely certain. If you want a smaller margin of error with the same confidence, take a larger sample.
Hints for Interpretation • It is very common to report the margin of error for 95% confidence. • If the level of confidence is not mentioned, assume 95% confidence. • Can choose to use a confidence level other than 95%. • Other popular levels: 80%, 90%, 99% • For a fixed sample size, if you increase the level of confidence, your interval will become wider. • For a fixed confidence level, if you increase sample size, your interval will become narrower
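The hint that a larger sample gives a narrower interval can be made concrete by inverting the quick formula: if the margin of error is 1/sqrt(n), then n = 1/MOE^2. The sketch below assumes 95% confidence, since that is the only level the quick formula covers, and `required_sample_size` is a hypothetical helper name.

```python
import math


def required_sample_size(target_moe: float) -> int:
    """Smallest n whose quick 95% margin of error is at most target_moe."""
    if not 0 < target_moe < 1:
        raise ValueError("target_moe must be a proportion between 0 and 1")
    # Round first so floating-point noise (e.g. 99.99999999999999) does not
    # change the result, then round up to a whole number of respondents.
    return math.ceil(round(1 / target_moe ** 2, 6))


for target in (0.10, 0.05, 0.03):
    print(f"margin of error {target:.0%} needs n of about {required_sample_size(target)}")
```

Halving the target margin of error from 10% to 5% quadruples the sample size this calculation asks for.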
Population Size Doesn’t Matter The variability of a statistic from an SRS does not depend on the size of the population as long as the population is at least 100 times larger than the sample. Suppose we take a sample of size 2527 from a population of 300,000. Then we take a sample of 2527 from a population of 1,000,000. Which sample statistic would have more variability?
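The question on this slide can be explored with a simulation that draws repeated SRSs, without replacement, from two populations of different sizes. This is a sketch under assumptions that are not in the lecture: a hypothetical population proportion of p = 0.60, 300 repeated samples, a fixed seed, and the standard deviation of p-hat as the measure of variability.

```python
import random
import statistics


def p_hat_spread(population_size, n, p=0.60, repeats=300, seed=1):
    """Standard deviation of p-hat over repeated SRSs drawn without replacement."""
    rng = random.Random(seed)
    # Hypothetical population: individuals numbered below cutoff have the trait.
    cutoff = round(p * population_size)
    p_hats = []
    for _ in range(repeats):
        sample = rng.sample(range(population_size), n)  # one SRS of size n
        p_hats.append(sum(i < cutoff for i in sample) / n)
    return statistics.stdev(p_hats)


spread_300k = p_hat_spread(population_size=300_000, n=2527)
spread_1m = p_hat_spread(population_size=1_000_000, n=2527)

print("population   300,000:", round(spread_300k, 4))
print("population 1,000,000:", round(spread_1m, 4))
```

The two spreads should come out nearly equal, which is consistent with the slide's claim that the sample size, not the population size, drives the variability.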