Outline • Review of last week • Sampling distributions • The sampling distribution of the mean • The Central Limit Theorem • Confidence intervals • Normal distribution example • Sampling distribution example • Confidence interval example
Review of last week • Last week, we learned how to use the Standard Normal Distribution to work out the probability of finding individual scores in some interval – e.g., what is the probability that the next Canadian woman we meet is taller than 175 cm? • Today, we’re going to do the same sort of thing with sample means rather than individual scores.
The sampling distribution of a sample statistic (such as $\bar{X}$) is the probability distribution of that statistic. [Figure: a sample drawn from a population with mean µ yields a sample mean $\bar{X}$.]
The sampling distribution of the mean consists of all possible sample means – for all possible samples of size n – that you could take from the population. [Figure: Samples 1 through 4 drawn from a population with mean µ, yielding sample means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$, $\bar{X}_4$.]
When we draw a sample from a population, we are at the same time drawing a sample mean from the distribution of sample means for samples of size n. [Figure: distribution of sample means for samples of size n, centered on $\mu_{\bar{X}}$, with one sample mean $\bar{X}$ marked.]
Sampling distributions • The sampling distribution of a sample statistic is the probability distribution of that statistic. • We can have sampling distributions of any sample statistic: the mean ($\bar{X}$), the median (M), the variance ($s^2$), the standard deviation (s).
The sampling distribution of the mean • The sampling distribution of the sample mean $\bar{X}$ has expected value $E(\bar{X}) = \mu_{\bar{X}} = \mu$. • Variability of this distribution is given by the standard error of the mean: $\sigma_{\bar{X}} = \sigma/\sqrt{n} \cong s/\sqrt{n}$
The Central Limit Theorem • Consider a random sample of n observations from a population with mean $\mu$ and standard deviation $\sigma$. When n is sufficiently large, the sampling distribution of $\bar{X}$ will be approximately normal with mean $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. • Note: this is true regardless of the shape of the underlying distribution of raw scores.
The Central Limit Theorem • The larger the sample size, the better the approximation to the normal distribution. • For most populations, n ≥ 30 will be “sufficiently large.”
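The claim that the result holds regardless of the shape of the raw scores can be checked with a small simulation. The sketch below is only an illustration: the exponential population (mean 1, standard deviation 1), the number of repetitions, and the function name `simulate_sample_means` are assumptions chosen for the demonstration, not part of the lecture.

```python
import random
import statistics


def simulate_sample_means(n, num_samples=5000, seed=0):
    """Return sample means of num_samples samples of size n.

    Each sample is drawn from an exponential population with mean 1 and
    standard deviation 1 (a strongly skewed, non-normal population chosen
    purely for illustration).
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


n = 30
means = simulate_sample_means(n)
mean_of_means = statistics.fmean(means)   # should be close to mu = 1
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
predicted_se = 1 / n ** 0.5               # standard error predicted by the CLT

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f}")
print(f"sigma / sqrt(n):      {predicted_se:.3f}")
```

Even though the population is skewed, the simulated sample means should cluster around µ with a spread close to σ/√n.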
The Central Limit Theorem • When we draw a sufficiently large sample and measure its mean, by the CLT, we may assume the sampling distribution of the sample mean is approximately normal. • That means we can use the standard normal distribution (SND) to work out the probability of finding a sample mean in a given range relative to the population mean.
[Figure: the sampling distribution of the sample mean, a normal curve centered on μ.]
The sampling distribution of the mean • We use the sampling distribution of the mean the way we used the SND last week. We obtain probabilities of finding sample means in a given range relative to the population mean, for samples of size n. • Don’t forget to use the standard error, $\sigma_{\bar{X}}$, rather than the standard deviation, $\sigma$!
Confidence Intervals • There are two ways to estimate population parameters such as the mean: • Point estimates, such as $\bar{X}$ • Interval estimates, which tell us a range of values that will contain the parameter with known probability.
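[Figure: sampling distribution of $\bar{X}$ centered on $\mu_{\bar{X}}$, with an area of .45 on each side of the mean between Z = -1.645 and Z = 1.645.] 90% of the time, $\bar{X}$ will fall within the range Z = -1.645 to Z = +1.645.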
Confidence Intervals • If 90% of the time $\bar{X}$ falls in the range Z = -1.645 to Z = +1.645 around the mean µ, then… • 90% of the time, µ must fall within a range of the same width centered on $\bar{X}$.
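The step from the first statement to the second can be written out by rearranging the inequality so that µ is isolated in the middle. This is a sketch of the standard argument, writing $\sigma_{\bar{X}}$ for the standard error of the mean:

```latex
\begin{align*}
P\!\left(-1.645 \le \frac{\bar{X}-\mu}{\sigma_{\bar{X}}} \le 1.645\right) &= .90 \\
P\!\left(-1.645\,\sigma_{\bar{X}} \le \bar{X}-\mu \le 1.645\,\sigma_{\bar{X}}\right) &= .90 \\
P\!\left(\bar{X}-1.645\,\sigma_{\bar{X}} \le \mu \le \bar{X}+1.645\,\sigma_{\bar{X}}\right) &= .90
\end{align*}
```

The last line describes an interval of the same width as the first, but centered on $\bar{X}$ rather than on µ.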
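Confidence Intervals • For given $\alpha$, the $100(1-\alpha)\%$ Confidence Interval for $\mu_{\bar{X}} = \mu$ is: • C.I. = $\bar{X} \pm Z_{\alpha/2}\,\sigma_{\bar{X}}$ • C.I. = $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$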
Confidence Intervals • When $\sigma$ is not known and n is large (≥ 30), use s: • C.I. = $\bar{X} \pm Z_{\alpha/2}\,s_{\bar{X}}$ • C.I. = $\bar{X} \pm Z_{\alpha/2}\,s/\sqrt{n}$
Normal Distribution Example • The amount of time that students wait to be served when buying coffee from the “Campus Perks” coffee outlet is normally distributed with a mean of 62.0 seconds and a 98.5th percentile of 79.36 seconds. In a random sample of 30 students buying coffee at Campus Perks, approximately how many will wait between 40 and 58 seconds to be served? NOTE: This is not a question about a sample mean!
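Normal Distribution Example [Figure: normal curve of waiting times with mean 62; the area below the mean is .50 and the area between 62 and the 98.5th percentile (79.36) is .4850; the interval from 40 to 58 is marked.] Z for .4850 = 2.17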
Normal Distribution Example • $\sigma = \frac{79.36 - 62}{2.17} = 8$ • $Z_1 = \frac{40 - 62}{8} = -2.75$ (p = .4970 from table) • $Z_2 = \frac{58 - 62}{8} = -0.50$ (p = .1915 from table)
Normal Distribution Example • P(40 ≤ X ≤ 58) = .4970 - .1915 = .3055 • The probability of any one student waiting between 40 and 58 seconds is .3055. • Therefore, in a random sample of 30, we expect approximately .3055 (30) = 9.165 ≈ 9 students to wait between 40 and 58 seconds.
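The table lookups in this example can be cross-checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the results differ from the table values only by rounding.

```python
from statistics import NormalDist

mean_wait = 62.0      # mean waiting time in seconds (given)
p98_5 = 79.36         # 98.5th percentile in seconds (given)
sample_size = 30      # number of students in the sample (given)

# Z score that cuts off the lowest 98.5% of a standard normal (about 2.17).
z_98_5 = NormalDist().inv_cdf(0.985)

# Solve Z = (P98.5 - mean) / sigma for sigma (about 8 seconds).
sigma = (p98_5 - mean_wait) / z_98_5

# Probability that one student waits between 40 and 58 seconds.
wait = NormalDist(mean_wait, sigma)
p_between = wait.cdf(58) - wait.cdf(40)

# Expected number of such students in a sample of 30.
expected_count = sample_size * p_between

print(f"sigma = {sigma:.2f}")
print(f"P(40 <= X <= 58) = {p_between:.4f}")
print(f"expected students = {expected_count:.2f}")
```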
Sampling Distribution Example • People’s reaction times (RTs) to a simple visual stimulus are normally distributed with a mean of 500 milliseconds and a standard deviation of 150 milliseconds. You believe that people who go on a low-carb diet, however, will have slower (longer) RTs than this, on average, though their standard deviation will remain at 150. To test your belief, you take a random sample of 40 people who self-report having been on a low-carb diet for at least 6 months and measure their RTs. You decide that your belief will be supported if the mean RT of the low-carb group is 565 milliseconds or slower. What is the probability that you will conclude that your belief has been supported even if a low-carb diet actually has no effect on RTs whatsoever?
[Figure: sampling distribution of the mean RT for samples of n = 40, centered at 500; the shaded upper tail at and beyond 565 is the probability we want.]
Example 2 • What is $P(\bar{X} \ge 565 \mid \mu = 500)$? • $Z = \frac{565 - 500}{150/\sqrt{40}} = \frac{65}{23.72} = 2.74$ • P for Z = 2.74 (from table) is .4969. • Therefore, desired probability is .5 - .4969 = .0031.
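The same tail probability can be computed directly from the sampling distribution of the mean. This is a minimal sketch with the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 500.0        # population mean RT in ms if the diet has no effect (given)
sigma = 150.0     # population standard deviation in ms (given)
n = 40            # sample size (given)
cutoff = 565.0    # decision threshold for the sample mean in ms (given)

# Standard error of the mean: sigma / sqrt(n), about 23.72.
standard_error = sigma / sqrt(n)

# Z score of the cutoff on the sampling distribution of the mean (about 2.74).
z = (cutoff - mu) / standard_error

# Upper-tail probability that a sample mean is 565 ms or slower.
sampling_dist = NormalDist(mu, standard_error)
p_false_support = 1 - sampling_dist.cdf(cutoff)

print(f"standard error = {standard_error:.2f}")
print(f"Z = {z:.2f}")
print(f"P(sample mean >= 565 | mu = 500) = {p_false_support:.4f}")
```

Note that the standard error, not the raw standard deviation of 150, sets the scale here, because the question is about a sample mean.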
Example 3 • Two variables important to a professional football player are speed and strength. Each year, camps are held to determine potential players’ speed and strength, both of which are continuous, normally-distributed, and independent of each other. The middle 90% of strength scores is bounded by 600 and 900 (on a composite strength index). The average time to run 40 yards is 4.6 seconds, and 40 yard time exceeds 6 seconds only 5% of the time. • a. In order to be considered by a team, a potential player must not exceed the 75th percentile for time to run 40 yards. What is the slowest a player can run 40 yards and still be considered?
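[Figure: probability distribution for time to run 40 yards (seconds), centered at 4.6; the area between 4.6 and 6 is .45, and the unknown 75th percentile X cuts off an area of .25 above the mean.]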
Example 3 • $Z_{.45} = 1.645 = \frac{6 - 4.6}{\sigma}$ • $\sigma = \frac{6 - 4.6}{1.645} = .851$
Example 3 • Now we can find X (the 75th percentile): • $Z_{.25} = 0.675 = \frac{X - 4.6}{.851}$ • $X = 0.675 \times .851 + 4.6 = 5.17$ (seconds)
[Figure: distribution of 40 yard times with 4.6, 5.17, and 6 seconds marked.] The 75th percentile for 40 yard times is 5.17 seconds.
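The two-step calculation for part a (first recover σ from the 95th percentile, then find the 75th percentile) can be reproduced numerically. This is a minimal sketch using the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mean_time = 4.6    # mean 40 yard time in seconds (given)
p95_time = 6.0     # time exceeded only 5% of the time, i.e. the 95th percentile (given)

# Z score with 95% of the standard normal below it (about 1.645).
z_95 = NormalDist().inv_cdf(0.95)

# Solve Z = (6 - 4.6) / sigma for sigma (about 0.851 seconds).
sigma_time = (p95_time - mean_time) / z_95

# 75th percentile of 40 yard times: the slowest time still considered.
# 0.675 * 0.851 + 4.6 is about 5.17 seconds.
p75_time = NormalDist(mean_time, sigma_time).inv_cdf(0.75)

print(f"sigma = {sigma_time:.3f} seconds")
print(f"75th percentile = {p75_time:.2f} seconds")
```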
Example 3 • Two variables important to a professional football player are speed and strength. Each year, camps are held to determine potential players’ speed and strength, both of which are continuous, normally-distributed, and independent of each other. The middle 90% of strength scores is bounded by 600 and 900 (on a composite strength index). The average time to run 40 yards is 4.6 seconds, and 40 yard time exceeds 6 seconds only 5% of the time. • b. You take a random sample of 200 potential players. What is the probability that the average strength score of the sample is less than or equal to 740?
[Figure: probability distribution for strength scores, with µ = 750 midway between 600 and 900 and an area of .45 on each side of the mean.]
Example 3 • $Z = 1.645 = \frac{900 - 750}{\sigma}$ • $\sigma = \frac{900 - 750}{1.645} = 91.19$ • $Z = \frac{740 - 750}{91.19/\sqrt{200}} = -1.55$
Example 3 • The area between the mean and Z = 1.55 is .4394 (from table). • By symmetry, the tail probability below Z = -1.55 is .5 – .4394 = .0606.
[Figure: the sampling distribution of mean strength scores for samples with n = 200, centered at 750; the shaded lower tail below 740 has area .0606, the probability that the mean for a sample of 200 players is less than or equal to 740.]
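Part b can be checked the same way. The sketch below takes 600 and 900 as the Z = ±1.645 bounds used in the worked solution (an area of .45 on each side of the mean); that Z value and the variable names are assumptions carried over from the slides, not independent facts.

```python
from math import sqrt
from statistics import NormalDist

lower, upper = 600.0, 900.0   # bounds on the middle of the strength scores (given)
z_bound = 1.645               # Z for an area of .45 beside the mean (as in the worked solution)
n = 200                       # sample size (given)
target = 740.0                # sample-mean value of interest (given)

# By symmetry, the population mean sits midway between the bounds.
mu = (lower + upper) / 2                  # 750

# Solve Z = (900 - 750) / sigma for sigma (about 91.19).
sigma = (upper - mu) / z_bound

# Standard error of the mean for samples of 200 (about 6.45).
standard_error = sigma / sqrt(n)

# Z score of 740 on the sampling distribution (about -1.55).
z = (target - mu) / standard_error

# Lower-tail probability that the sample mean is 740 or less.
p_below = NormalDist(mu, standard_error).cdf(target)

print(f"sigma = {sigma:.2f}")
print(f"Z = {z:.2f}")
print(f"P(sample mean <= 740) = {p_below:.4f}")
```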
Confidence Interval Example • A researcher samples 36 undergraduates from a local university and finds it took them 36.4 days, on average, to find a job, with a standard deviation of 8 days. Use these data to form a 96% confidence interval for the true mean time it takes for graduates to find a job. NOTE: We are not given the population standard deviation
Confidence Interval Example • Recall: • C.I. = $\bar{X} \pm Z_{\alpha/2}\,s_{\bar{X}} = \bar{X} \pm Z_{\alpha/2}\,s/\sqrt{n}$ • $\bar{X} = 36.4$ • $s = 8$ • $n = 36$ • $s/\sqrt{n} = 8/6 = 1.33$
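Confidence Interval Example • $100(1-\alpha)\% = 96\%$, so $\alpha = .04$ and $\alpha/2 = .02$; this is the tail probability. • A tail of $\alpha/2 = .02$ leaves an area of .48 between the mean and Z, and the table gives $Z_{.48} = 2.05$. • C.I. = $36.4 \pm 2.05\,(1.33)$ • $(33.67 \le \mu \le 39.13)$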
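The interval can be reproduced with a small helper. This is a minimal sketch assuming a large-sample Z interval with s standing in for σ, as in the slides; the function name `z_confidence_interval` is hypothetical, and the endpoints differ from the hand calculation only by table rounding.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sample_sd, n, confidence):
    """Return (low, high) for a large-sample Z confidence interval.

    Uses the sample standard deviation in place of sigma, which the
    slides treat as acceptable when n >= 30.
    """
    alpha = 1 - confidence
    # Z score that leaves alpha/2 in the upper tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Values from the example: mean 36.4 days, s = 8 days, n = 36, 96% confidence.
low, high = z_confidence_interval(36.4, 8.0, 36, 0.96)
print(f"96% C.I.: ({low:.2f}, {high:.2f})")
```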