220 likes | 622 Views
INFERENTIAL STATISTICS. Samples are only estimates of the population Sample statistics will be slightly off from the true values of its population’s parameters Sampling error: The difference between a sample statistic and a population parameter Probability theory
E N D
INFERENTIAL STATISTICS • Samples are only estimates of the population • Sample statistics will be slightly off from the true values of its population’s parameters • Sampling error: • The difference between a sample statistic and a population parameter • Probability theory • Permits us to estimate the accuracy or representativeness of the sample
The “Catch-22” of Inferential Statistics • When we collect a sample, we know nothing about the population’s distribution of scores • We can calculate the mean (x-bar) & standard deviation (s) of our sample, but and are unknown • The shape of the population distribution (normal or skewed?) is also unknown
Probability Theory Allows Us To Answer: What is the likelihood that a given sample statistic accurately represents a population parameter? SampleN = 150 μ = ??? (N= Thousands) X=9.6 Number of serious crimes committed in year priorto prison for inmates entering the prison system
Sampling Distribution(a.k.a. “Distribution of Sample Outcomes”) • “OUTCOMES” = proportions, means, etc. • From repeated random sampling, a mathematical description of all possible sampling event outcomes • And the probability of each one • Permits us to make the link between sample and population… • What is the probability that a sample finding is reflects the population? • Is something that is true of a sample statistic likely to be true of a population parameter?
Relationship between Sample, Sampling Distribution & Population
Sampling Distribution: Characteristics • Central tendency • Sample means will cluster around the population mean • Since samples are random, the sample means should be distributed equally on either side of the population mean • The mean of the sampling distribution is always equal to the population mean • Shape: Normal distribution • Central Limit Theorem: • Regardless of the shape of a raw score distribution (sample or population) of an interval-ratio variable, the sampling distribution will be approximately normal, as long as sample size is ≥ 100
Sampling Distribution: Characteristics • Dispersion: Standard Error (SE) • Measures the spread of sampling error that occurs when a population is sampled repeatedly • Same thing as standard deviation of the sampling distribution • Tells exactly how much error, on average, should exist between the sample mean & the population mean • Formula: σ / √N • However, because σ usually isn’t known, s (sample standard deviation) is used to estimate population standard deviation
Sampling Distribution • Standard Error • Law of Large Numbers: The larger the sample size (N), the more probable it is that the sample mean will be close to the population mean • In other words: a big sample works better (should give a more accurate estimate of the pop.) than a small one • Makes sense if you study the formula for standard error • Sampling Distribution Applet
1. Estimation ESTIMATION
Introduction to Estimation • Estimation procedures • Purpose: • To estimate population parameters from sample statistics • Using the sampling distribution to infer from a sample to the population • Most commonly used for polling data • 2 components: • Point estimate • Confidence intervals
Point Estimate: Value of a sample statistic used to estimate a population parameter Confidence Interval: A range of values around the point estimate Estimation Point Estimate Confidence Interval .58 .546 .614 Confidence Limit (Lower) Confidence Limit (Upper)
Example • CNN Poll (CNN.com; Feb 20, 2009): Slight majority thinks stimulus package will improve economy • “The White House's economic stimulus plan isn't a surefire winner with the American public, but a majority does think the recovery plan will help. According to a new poll, fifty-three percent said the plan will improve economic conditions, while 44 percent said it won't stimulate the economy.” • “On an individual level, there was less hope for improvement. According to the poll, 67 percent said it would not help them personally.” • “The Poll was conducted Wednesday and Thursday (Feb 18-19, 2009), with 1,046 people questioned by telephone. The survey's sampling error is plus or minus 3 percentage points.”
Estimation • POINT ESTIMATES • (another way of saying sample statistics) • CONFIDENCE INTERVAL • a.k.a. “MARGIN OF ERROR” • Indicates that over the long run, 95 percent of the time, the true pop. value will fall within a range of +/- 3 • Point estimates & confidence interval should be reported together “…but a majority does think the recovery plan will help, according to a new poll. Fifty-three percent said the plan will improve economic conditions, while 44 percent said it won't stimulate the economy. …. The Poll was conducted Wednesday and Thursday (Feb 18-19, 2009), with 1,046 people questioned by telephone. The survey's sampling error is plus or minus 3 percentage points.
Estimation1 : Pick Confidence Level • Confidence LEVEL • Probability that the unknown population parameter falls within the interval • Alpha () • The probability that the parameter is NOT within the interval • is the odds of making an error • Confidence level = 1 - • Conventionally, confidence level values are almost always 95%or 99%
2. Divide the probability of error equally into the upper and lower tails of the distribution (2.5% error in each tail with 95% confidence level) Find the corresponding Z score Procedure for Constructing an Interval Estimate 0.95 .025 .025 -1.96 1.96 Z scores
Procedure for Constructing an Interval Estimate 3. Construct the confidence interval • Proportions (like the eavesdropping poll example): • Sample point estimate (convert % to a proportion): • “Fifty-three percent said the plan will improve economic conditions…” • 0.53 • Sample size (N) = 1,046 • Formula 7.3 in Healey • Numerator = (your proportion) (1- proportion) • 95% confidence level (replicating results from article) • 99% confidence level – intervals widen as level of confidence increases
Example 1: Estimate for the economic recovery poll • p = .53 (53% think it will help) • Z = 1.96 (95% confidence interval) • N = 1046 (sample size) • What happens when we… • Recalculate for N = 10,000 • N back to original, recalculate for p. = .90 • Back to original, but change confidence level to 99%