Chapter 9: Sampling Distributions

“It has been proved beyond a shadow of a doubt that smoking is one of the leading causes of statistics.”
Fletcher Knebel

9.1 Sampling Distributions (pp. 456-469)
• Samples are examined in order to come to a reasonable conclusion about the population from which they are chosen.
• To glean meaningful information, one must be statistically literate: aware of what the sample results tell us and what they don’t tell us.
• A statistic calculated from a sample may suffer from bias or high variability, in which case it does not represent a good estimate of a population parameter.

Vocabulary Review
• Parameter: a numerical index that describes a population
• Statistic: a numerical index that describes a sample
• Sampling distribution of a statistic: the distribution of values of the statistic taken over all possible samples of a specific size
• A statistic is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated

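Rules to Review:
• Variance formula for a POPULATION: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$
• Variance formula for a SAMPLE: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$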
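The difference between the population and sample variance rules is only the divisor (N versus n − 1). A minimal Python sketch follows; the function names and the data list are illustrative assumptions, not part of the original slides.

```python
from statistics import pvariance, variance

def population_variance(values):
    """Average squared deviation from the mean, divisor N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from x-bar, divisor n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

# Hypothetical data set, used only to compare the two rules.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = population_variance(data)
samp_var = sample_variance(data)

# The standard library offers the same two calculations.
print(pop_var, pvariance(data))
print(samp_var, variance(data))
```

Because the divisor is smaller, the sample variance is always at least as large as the population-style variance computed on the same numbers.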

Consider the 3-element population P = {1, 2, 3}

Now consider all possible samples of size 2, with replacement
• There would be 3 × 3 = 9 samples.
• Order is important!

Complete the Chart

Completed Chart (sample variances use the divisor n − 1)

Sample    Sample Mean    Sample Variance    Sample S.D.
1, 1      1              0                  0
1, 2      1.5            .5                 .7071
1, 3      2              2                  1.4142
2, 1      1.5            .5                 .7071
2, 2      2              0                  0
2, 3      2.5            .5                 .7071
3, 1      2              2                  1.4142
3, 2      2.5            .5                 .7071
3, 3      3              0                  0
MEANS     2              .6667              .6285

The chart shows that…
• The mean of the distribution of sample means is the mean (μ) of the population.
• This illustrates that a sample mean is an unbiased estimator of the population mean.
• The distribution of sample means “centers” around the mean of the population.
• The mean of the distribution of sample variances (s²) is equal to the variance (σ²) of the population.
• This illustrates that a sample variance (s²) is an unbiased estimator of the population variance.
• The distribution of sample variances “centers” around the variance of the population.

Take Note:
• A sample standard deviation is NOT an unbiased estimator of the population standard deviation.
• In the above example, the mean of the sample standard deviations is 0.628539, while the standard deviation of the population is 0.81649658.
• The distribution of sample standard deviations does not center around the standard deviation of the population.
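The claims about the population P = {1, 2, 3} can be checked by brute force. The sketch below enumerates all 9 ordered samples and averages each statistic; it assumes the sample variance and standard deviation use the divisor n − 1, and the variable names are illustrative.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, stdev, variance

population = [1, 2, 3]

# All ordered samples of size 2 drawn with replacement: 3 * 3 = 9 samples.
samples = list(product(population, repeat=2))

sample_means = [mean(s) for s in samples]
sample_variances = [variance(s) for s in samples]  # divisor n - 1
sample_sds = [stdev(s) for s in samples]           # square root of the above

mean_of_means = mean(sample_means)
mean_of_variances = mean(sample_variances)
mean_of_sds = mean(sample_sds)

# Compare each average with the matching population parameter.
print("means:    ", mean_of_means, "vs mu      =", mean(population))
print("variances:", mean_of_variances, "vs sigma^2 =", pvariance(population))
print("s.d.s:    ", mean_of_sds, "vs sigma   =", pstdev(population))
```

The first two comparisons agree, while the average sample standard deviation falls short of σ, which is the bias described above.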

9.2 Sample Proportions (pp. 472-477) The normal distribution curve is often extremely useful in analyzing sample proportions. This section provides insights into the circumstances that allow for use of normal distribution properties.

Consider an SRS of 1000 people from a large population
• X represents the number in this sample who are Republicans.
• There are 1001 possible values for X: 0, 1, …, 1000.
• P-hat represents the sample proportion of Republicans in the sample.
• There are 1001 possible values of p-hat: 0/1000, 1/1000, …, 1000/1000.
• We could choose many SRS’s and calculate a p-hat for each.
• We would expect the distribution of p-hat to be approximately normal.

If we choose an SRS of size n from a large population with population proportion p having some characteristic of interest, and if p-hat is the proportion of the sample having that characteristic, then:
• The sampling distribution of p-hat is approximately normal.
• The mean of the sampling distribution is p (the population parameter).
• The standard deviation of the sampling distribution is $\sqrt{\frac{p(1-p)}{n}}$.

It is reasonable to use the previous statements when:
• Rule of Thumb #1: The population is at least 10 times as large as the sample.
• Rule of Thumb #2: np is at least 10 and n(1 − p) is at least 10.
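As a rough illustration, both rules of thumb and the standard deviation of p-hat can be computed in a few lines of Python. The function name `p_hat_summary` is hypothetical, and the inputs (n = 1000, p = 0.6, a population of 20,000) are borrowed from the voter example that follows, using 20,000 as a lower bound for "over 20,000".

```python
from math import sqrt

def p_hat_summary(n, p, population_size):
    """Mean and SD of p-hat, plus whether each rule of thumb holds."""
    return {
        "mean": p,
        "sd": sqrt(p * (1 - p) / n),
        # Rule of Thumb #1: population at least 10 times the sample size.
        "rule_1": population_size >= 10 * n,
        # Rule of Thumb #2: np and n(1 - p) both at least 10.
        "rule_2": n * p >= 10 and n * (1 - p) >= 10,
    }

summary = p_hat_summary(n=1000, p=0.6, population_size=20_000)
print(summary)
```

With these inputs both rules hold, so the normal approximation is reasonable.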

Suppose it is known that 60% of the registered voters in a district of over 20,000 people are Republicans. IF YOU CHOOSE AN SRS OF 1000 REGISTERED VOTERS:
• What is the probability that the proportion of Republicans in the sample is between 58% and 62%?
• What is the probability that the sample will contain more than 550 Republicans?
• Are both rules of thumb satisfied?

Convert p-hat = .55 (that is, 550 Republicans out of 1000) to its z-score.
• Interpret.
• Is this a rare occurrence?
• .000628 is approximately 1/1592. So if we had 1600 random samples of size 1000, how many of them would we “expect” to have 550 or fewer Republicans?
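The voter questions can be worked through with the normal approximation. The sketch below is one way to do it using only the standard library; `normal_cdf` is a helper defined here (not a library function), and no continuity correction is applied, so the results are approximate and may differ slightly from table or calculator values.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 1000, 0.60
sd = sqrt(p * (1 - p) / n)  # standard deviation of p-hat

# P(0.58 < p-hat < 0.62)
z_low = (0.58 - p) / sd
z_high = (0.62 - p) / sd
prob_between = normal_cdf(z_high) - normal_cdf(z_low)

# z-score for p-hat = 0.55, i.e. 550 Republicans out of 1000
z_550 = (0.55 - p) / sd
prob_at_most_550 = normal_cdf(z_550)
prob_more_than_550 = 1 - prob_at_most_550

# Expected number of such samples among 1600 samples of size 1000
expected_in_1600 = 1600 * prob_at_most_550

print(f"P(0.58 < p-hat < 0.62) = {prob_between:.4f}")
print(f"z for p-hat = 0.55     = {z_550:.4f}")
print(f"P(X <= 550)            = {prob_at_most_550:.6f}")
print(f"P(X > 550)             = {prob_more_than_550:.6f}")
print(f"expected in 1600       = {expected_in_1600:.2f}")
```

A z-score more than 3 standard deviations below the mean marks 550 or fewer Republicans as a rare outcome: about one such sample would be expected among 1600.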

9.3 Sample Means (pp. 481-494)
• This section contains one of the most important of all statistical theorems, the Central Limit Theorem.
• It also emphasizes the convention that the Greek letters μ and σ denote the population mean and standard deviation, while x-bar and s denote the mean and standard deviation of a sample.

The Central Limit Theorem
• Consider an SRS of size n from any population with mean μ and standard deviation σ. When n is large, the sampling distribution of x-bar has the following properties:
• It is approximately normal.
• The mean of the distribution is μ.
• The standard deviation of the distribution is $\sigma/\sqrt{n}$.
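A small simulation can make the theorem concrete. The sketch below is an illustration under assumed settings, not part of the original slides: it draws repeated samples from a strongly skewed population (exponential with mean 1 and standard deviation 1), with a sample size of 50, 5000 repetitions, and a fixed seed chosen arbitrarily.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

mu, sigma = 1.0, 1.0   # exponential(rate=1): mean 1, standard deviation 1
n = 50                 # size of each sample
num_samples = 5000     # number of samples drawn

# One x-bar per simulated sample.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

simulated_mean = mean(sample_means)
simulated_sd = pstdev(sample_means)
theoretical_sd = sigma / sqrt(n)

print("mean of x-bar:", simulated_mean, "(mu =", mu, ")")
print("sd of x-bar:  ", simulated_sd, "(sigma/sqrt(n) =", theoretical_sd, ")")
```

Even though the population is far from normal, the simulated x-bar values center near μ with a spread close to σ/√n.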

Consider the population P = {2, 4, 6}
• Now consider all possible samples of size 2 WITH REPLACEMENT. There would be 3 × 3 = 9 such samples.

Sample    Sample Mean
2, 2      2
2, 4      3
2, 6      4
4, 2      3
4, 4      4
4, 6      5
6, 2      4
6, 4      5
6, 6      6

The mean of the sample means is equal to 4, which is equal to μ.
• This illustrates the second part of the Central Limit Theorem.
• The standard deviation of the sample means is equal to 1.154700538, which is equal to σ/√n = 1.632993162/√2.
• This illustrates the third part of the Central Limit Theorem.
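The two figures quoted for this example can be reproduced by enumerating the nine samples. This is a minimal sketch with illustrative variable names; it treats the nine sample means as a complete distribution, so the population form of the standard deviation (`pstdev`) is used.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6]
n = 2

# All ordered samples of size 2, repeats allowed: 3 * 3 = 9 samples.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

mu = mean(population)
sigma = pstdev(population)

print("mean of sample means:", mean_of_means, "vs mu =", mu)
print("sd of sample means:  ", sd_of_means, "vs sigma/sqrt(n) =", sigma / sqrt(n))
```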
