Chapter 10: Sampling and Central Limit Theorem
Characteristics of a Deck of Cards • Number of Cards: 52 (excluding jokers) • Unit of analysis: 1 card • Denominations: 13 (Ace (1), 2-10, Jack, Queen, King) • Colors: 2 (Black: 26 cards, Red: 26 cards) • Suits: 4 (Clubs ♣, Diamonds ♦, Hearts ♥, Spades ♠)
Population and Sample • Population: The entire collection of individuals, groups, or social artifacts that a researcher is interested in studying • Card Example: We are interested in studying the characteristics of all 52 cards • Sample: A portion of a population • Card Example: Fewer than 52 cards
Representative Sample and Sampling Error • Representative Sample: A sample that is similar to the population from which it was chosen • Card Example: A sample of cards will be representative if… • About 50% are red and 50% are black • About 25% are from each suit • About 7.7% (4 out of 52) are from each denomination • Sampling Error: Difference between characteristics of the sample and population • A representative sample will have low sampling error
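The card example above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the names `deck`, `share_red`, and `sampling_error` are hypothetical, and sampling error is measured here only for one characteristic (the share of red cards).

```python
import random

# Build the 52-card deck: 13 denominations in each of 4 suits.
SUIT_COLORS = {"Clubs": "black", "Spades": "black", "Diamonds": "red", "Hearts": "red"}
DENOMINATIONS = ["Ace"] + [str(n) for n in range(2, 11)] + ["Jack", "Queen", "King"]
deck = [(denomination, suit) for suit in SUIT_COLORS for denomination in DENOMINATIONS]

def share_red(cards):
    """Proportion of red cards in a collection of (denomination, suit) pairs."""
    return sum(SUIT_COLORS[suit] == "red" for _, suit in cards) / len(cards)

def sampling_error(sample, population):
    """Sample share of red cards minus the population share of red cards."""
    return share_red(sample) - share_red(population)

# A random sample of 10 cards (the seed is arbitrary, used only for repeatability).
rng = random.Random(10)
random_hand = rng.sample(deck, 10)
print("random sample error:", sampling_error(random_hand, deck))

# An unrepresentative sample: all 13 hearts, so every card is red.
all_hearts = [(denomination, "Hearts") for denomination in DENOMINATIONS]
print("all-hearts sample error:", sampling_error(all_hearts, deck))
```

A sampling error near 0 means the sample resembles the deck on this characteristic. The all-hearts sample is 100% red against 50% in the deck, so its error is as large as it can be.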
Illustration: Unrepresentative Sample and High Sampling Error
Inferential Statistics and Probability Sampling • Inferential Statistics: Using data from a sample to make generalizations about a population • Card Example: Using a sample of cards to draw inferences about the whole deck • You might say we’re not playing with a full deck… • Probability Sampling: In statistics we assume that probability sampling methods have been used • Sample is selected randomly from a population • The probability of selection from a population is known
Parameter and Statistic • Parameter: A measure used to describe features of a variable in a population • µ = mean • σ = standard deviation • \(\sigma_{\bar{X}}\) = standard error (standard deviation of the sample means) • Statistic: A measure used to describe features of a variable in a sample • \(\bar{X}\) = mean • S = standard deviation • \(S_{\bar{X}}\) = standard error (estimated standard deviation of the sample means)
Problems and Solutions • Problem 1: We don’t know the population parameter • Why? We are studying a sample • Solution? Use the sample statistic as an estimate of the population parameter • Problem 2: There will be sampling error • Why? The sample is usually at least a little different from the population • Solution? Use probability sampling methods, which minimize sampling error • Problem 3: We don’t know how much sampling error there will be • Why? We don’t know the parameter • Solution? Use the sample standard error (\(S_{\bar{X}}\)) as an estimate of sampling error
Central Limit Theorem (CLT) • If all possible random samples of size N are drawn from a population distribution with a mean µ and standard deviation σ, then as N becomes larger, the sampling distribution of the sample means becomes approximately normal with mean µ and standard deviation \(\sigma_{\bar{X}} = \sigma/\sqrt{N}\)
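The theorem can be checked informally by simulation. The sketch below is an approximation rather than a proof: instead of drawing all possible samples, it draws 5,000 random samples for each N. It assumes the card-color population (0 = red, 1 = black) and draws with replacement so that each draw is independent. The function name `simulate_sample_means` and the seed are hypothetical choices.

```python
import math
import random
import statistics

# Population from the card example: 0 = red, 1 = black (26 of each).
population = [0] * 26 + [1] * 26
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # population formula, divides by N

def simulate_sample_means(n, repetitions, rng):
    """Draw `repetitions` random samples of size n and return their means.

    Cards are drawn with replacement (an assumption), so draws are independent.
    """
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(repetitions)]

rng = random.Random(2024)  # arbitrary seed so the run is repeatable
for n in (1, 2, 6, 30):
    means = simulate_sample_means(n, 5_000, rng)
    observed = statistics.pstdev(means)   # spread of the simulated sample means
    predicted = sigma / math.sqrt(n)      # standard error given by the theorem
    print(f"N={n:>2}  mean of means={statistics.fmean(means):.3f}  "
          f"SD of means={observed:.3f}  sigma/sqrt(N)={predicted:.3f}")
```

In each row the mean of the simulated means should sit close to µ, and the observed spread should sit close to σ/√N, shrinking as N grows.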
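Population Distribution of Card Color • Variable Values • 0 = red • 1 = black • Parameters • Mean: \(\mu = \frac{\sum X}{N} = \frac{26(0) + 26(1)}{52} = 0.50\) • Standard Deviation*: \(\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} = \sqrt{\frac{52(0.25)}{52}} = 0.50\) *When dealing with the population, we divide by N rather than N-1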
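The footnote about dividing by N rather than N-1 maps directly onto two functions in Python's `statistics` module: `pstdev` uses the population formula and `stdev` uses the sample formula. The sketch below computes the card-color parameters and, for contrast, the statistics of a small sample. The five-card sample is invented for illustration, and the standard error is estimated with the usual formula S/√N.

```python
import math
import statistics

# Population: 26 red cards coded 0 and 26 black cards coded 1.
population = [0] * 26 + [1] * 26
mu = statistics.fmean(population)        # population mean
sigma = statistics.pstdev(population)    # population SD, divides by N
print("mu =", mu, " sigma =", sigma)

# Hypothetical sample of 5 cards: black, red, black, black, red.
sample = [1, 0, 1, 1, 0]
x_bar = statistics.fmean(sample)         # sample mean
s = statistics.stdev(sample)             # sample SD, divides by N - 1
standard_error = s / math.sqrt(len(sample))  # estimated standard error of the mean
print("x_bar =", x_bar, " S =", round(s, 3), " SE =", round(standard_error, 3))
```

The sample mean will rarely equal µ exactly. The gap between the two is the sampling error, and the estimated standard error indicates how large that gap typically is.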
Population Distribution of Card Color • Population Histogram
Sampling Distribution of the Mean for N = 1 Card • Possible Outcomes and Mean of Each Outcome
Sampling Distribution of the Mean for N = 1 Card • Mean of the Sampling Distribution: \(\mu_{\bar{X}} = \mu = 0.50\) • Standard Error (Standard Deviation of the Means): \(\sigma_{\bar{X}} = \sigma/\sqrt{N} = 0.50/\sqrt{1} = 0.50\)
Sampling Distribution of the Mean for N = 1 Card • Sampling Distribution of the Mean
Sampling Distribution of the Mean for N = 2 Cards • Possible Outcomes and Mean of Each Outcome
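Sampling Distribution of the Mean for N = 2 Cards • Mean of the Sampling Distribution: \(\mu_{\bar{X}} = \mu = 0.50\) • Standard Error (Standard Deviation of the Means): \(\sigma_{\bar{X}} = \sigma/\sqrt{N} = 0.50/\sqrt{2} \approx 0.354\)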
Sampling Distribution of the Mean for N = 2 Cards • Sampling Distribution of the Mean
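For small N, the possible outcomes and their means can be listed exhaustively. The sketch below does this for N = 2 and assumes each card is drawn with replacement, so all ordered outcomes are equally likely. That assumption is mine, and the helper `all_sample_means` is a hypothetical name. The same function can be called with 3 or 4 to list the larger cases.

```python
import itertools
import statistics

COLORS = {"red": 0, "black": 1}

def all_sample_means(n):
    """Return the mean of every ordered sample of n card colors.

    Assumes draws with replacement, so each of the 2**n outcomes is equally likely.
    """
    return [statistics.fmean(outcome)
            for outcome in itertools.product(COLORS.values(), repeat=n)]

# List each possible outcome for N = 2 next to its mean.
for outcome in itertools.product(COLORS, repeat=2):
    values = [COLORS[color] for color in outcome]
    print(outcome, "mean =", statistics.fmean(values))

means = all_sample_means(2)
print("mean of the means:", statistics.fmean(means))
print("standard error:", round(statistics.pstdev(means), 3))
```

Because every possible outcome is included, `pstdev` (divide by the number of outcomes) is the appropriate formula for the standard deviation of the means.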
Sampling Distribution of the Mean for N = 3 Cards • Possible Outcomes and Mean of Each Outcome
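Sampling Distribution of the Mean for N = 3 Cards • Mean of the Sampling Distribution: \(\mu_{\bar{X}} = \mu = 0.50\) • Standard Error (Standard Deviation of the Means): \(\sigma_{\bar{X}} = \sigma/\sqrt{N} = 0.50/\sqrt{3} \approx 0.289\)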
Sampling Distribution of the Mean for N = 3 Cards • Sampling Distribution of the Mean
Sampling Distribution of the Mean for N = 4 Cards • Possible Outcomes and Mean of Each Outcome
Sampling Distribution of the Mean for N = 4 Cards • Mean of the Sampling Distribution: \(\mu_{\bar{X}} = \mu = 0.50\) • Standard Error (Standard Deviation of the Means): \(\sigma_{\bar{X}} = \sigma/\sqrt{N} = 0.50/\sqrt{4} = 0.25\)
Sampling Distribution of the Mean for N = 4 Cards • Sampling Distribution of the Mean
Sampling Distribution of the Mean for N = 6 Cards • Mean of the Sampling Distribution: \(\mu_{\bar{X}} = \mu = 0.50\) • Standard Error (Standard Deviation of the Means): \(\sigma_{\bar{X}} = \sigma/\sqrt{N} = 0.50/\sqrt{6} \approx 0.204\)
Sampling Distribution of the Mean for N = 6 Cards • Sampling Distribution of the Mean
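Listing all 64 ordered outcomes for N = 6 is tedious, but the probability of each possible mean can be computed directly from binomial coefficients. The sketch below assumes independent draws with a 0.5 chance of black on each draw (that is, sampling with replacement). The function name `mean_distribution` is hypothetical.

```python
import math

def mean_distribution(n, p_black=0.5):
    """Map each possible sample mean k/n to its probability.

    k is the number of black cards (coded 1) among n independent draws.
    """
    return {k / n: math.comb(n, k) * p_black**k * (1 - p_black)**(n - k)
            for k in range(n + 1)}

dist = mean_distribution(6)
mean_of_means = sum(m * p for m, p in dist.items())
std_error = math.sqrt(sum(p * (m - mean_of_means) ** 2 for m, p in dist.items()))

# Crude text histogram: one '#' per 1/64 of probability.
for m, p in dist.items():
    print(f"mean {m:.2f}  p={p:.4f}  {'#' * round(p * 64)}")
print("mean of the means:", round(mean_of_means, 3))
print("standard error:", round(std_error, 3))
```

Even at N = 6 the probabilities pile up around 0.50 and thin out toward 0 and 1, which is the bell-like shape the central limit theorem describes.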
Using the Central Limit Theorem • Sample Size (N): As a rule of thumb, you need a random sample of 50 or more people in order to apply the central limit theorem • Sample Mean: If the sample size is at least 50 (randomly selected), then the sample mean (\(\bar{X}\)) will be a good estimate of the population mean (µ)
Using the Central Limit Theorem • Standard Error: If the sample size is at least 50 (randomly selected), then the sample standard error (\(S_{\bar{X}}\)) will be a good estimate of the population standard error (\(\sigma_{\bar{X}}\)), and it can be used to determine how accurate the sample mean is as an estimate of the population mean • Normal Distribution: We can use the normal distribution (Chapter 9) to draw inferences about the sample mean, no matter what the distribution of the variable in the population (Chapters 11 and 12)
Central Limit Theorem: Example Questions • Question 1: The income distribution of everyone in a large city has a mean of µ = $25,000 and a standard deviation of σ = $20,000. The distribution of income is highly positively skewed, and a researcher wants to study a random sample of N = 1,000 people from this population. Explain whether the sampling distribution of the mean is approximately normal. Justify your response. • Answer: The sampling distribution of the mean is approximately normal by the central limit theorem because the sample size is at least 50 • Question 2: For the scenario given in Question 1, calculate the mean and standard error for the sampling distribution of the mean for the sample of N = 1,000 people • Answer: The mean of the sampling distribution is the same as the population mean ($25,000). The standard error for the sampling distribution is: \(\sigma_{\bar{X}} = \sigma/\sqrt{N} = 20{,}000/\sqrt{1{,}000} \approx 632.46\) dollars
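The arithmetic in Question 2 can be reproduced in Python, and the normal approximation can then be used to say something about where sample means are likely to fall. This is a sketch under the question's stated values. The $1,000 margin is a hypothetical choice made only to illustrate the idea.

```python
import math
from statistics import NormalDist

mu = 25_000      # population mean income, from Question 1
sigma = 20_000   # population standard deviation, from Question 1
n = 1_000        # sample size

# Standard error of the mean: sigma divided by the square root of N.
standard_error = sigma / math.sqrt(n)
print("standard error:", round(standard_error, 2))

# By the central limit theorem the sample means are approximately normal,
# centered on mu with standard deviation equal to the standard error.
sampling_dist = NormalDist(mu, standard_error)

# Share of sample means expected within $1,000 of mu (hypothetical margin).
margin = 1_000
p_within = sampling_dist.cdf(mu + margin) - sampling_dist.cdf(mu - margin)
print("share of sample means within $1,000 of mu:", round(p_within, 3))
```

Although individual incomes are highly skewed and vary by about $20,000, the mean of 1,000 incomes varies by only about $632 from sample to sample.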
Central Limit Theorem: In-Class Exercise • Description: The weights of all newborn babies in the U.S. have a normal distribution with mean µ = 7.5 pounds and standard deviation σ = 1.25 pounds. • Question 1: Explain whether the mean of 7.5 pounds is a parameter or statistic. • Question 2: Suppose that a researcher is interested in studying the weights of a random sample of N = 100 newborn babies. Explain whether you can conclude that the sampling distribution of the mean is approximately normal. Justify your response. • Question 3: Calculate the mean and standard error for the sampling distribution of the mean for the sample of 100 newborn babies.