Statistics Workshop: Principles of Estimation. J-term 2009. Bert Kritzer
Statistical Inference • Inference about populations from samples • Inference about underlying processes • Could the observed pattern have been generated by a random process? • Inference about systematic vs. random (“stochastic”) components: Observation = Systematic + Random • Sampling • Process • Observed statistics as random variables
Two Broad Types of Estimation • Point estimation: a single value • the best single value • θ as the generic parameter to be estimated • mean μ • difference of means Δ • proportion or probability π • variance σ² • correlation ρ (rho) • regression coefficient β • θ̂ as the generic estimate • Interval estimation: a range of values • Confidence interval • “Margin of Error” • Sampling error
Problems with Estimators • Bias • Variability
The Lingo • Bias • Expected Value • Efficiency • Standard Error (a special standard deviation) • Impact of sample size • Mean Squared Error (MSE) • Combines bias & efficiency • Consistency
Expected Value (or, the mean of a random variable) • If you were to roll an honest die many times, and you were paid $1 for a 1, $2 for a 2, etc., what would you expect the payout to average out to be per roll?
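The die question can be answered by weighting each payout by its probability and summing. A minimal sketch, assuming a fair six-sided die (the variable names are illustrative, not from the slides):

```python
from fractions import Fraction

# Payout in dollars for each face of a fair six-sided die.
payouts = [1, 2, 3, 4, 5, 6]

# Each face has probability 1/6; the expected value is the
# probability-weighted sum of the payouts.
prob = Fraction(1, 6)
expected_payout = sum(prob * x for x in payouts)

print(float(expected_payout))  # 3.5
```

No single roll ever pays $3.50; the expected value describes the long-run average per roll.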
Defining and Measuring Bias • Unbiased estimator: E(θ̂) = θ • Statistical bias defined: Bias(θ̂) = E(θ̂) − θ
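Bias can be measured by simulation: draw many samples, average the estimates, and compare that average with the true value. The sketch below uses the variance as the parameter and compares the divide-by-n estimator with the divide-by-(n−1) estimator. The normal population, its variance, the sample size, and the number of replications are all assumptions chosen for illustration.

```python
import random

random.seed(2009)  # fixed seed so the run is repeatable

SIGMA2 = 4.0   # assumed true population variance (sd = 2)
N = 5          # a small sample size makes the bias visible
REPS = 40_000  # number of simulated samples

def variances(sample):
    """Return (divide-by-n, divide-by-(n-1)) variance estimates."""
    n = len(sample)
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)
    return ss / n, ss / (n - 1)

total_n = total_n1 = 0.0
for _ in range(REPS):
    sample = [random.gauss(0.0, SIGMA2 ** 0.5) for _ in range(N)]
    v_n, v_n1 = variances(sample)
    total_n += v_n
    total_n1 += v_n1

# Bias = (average estimate over many samples) - (true value)
bias_n = total_n / REPS - SIGMA2
bias_n1 = total_n1 / REPS - SIGMA2
print(f"bias, divisor n:   {bias_n:+.3f}")   # theory: -SIGMA2 / N = -0.8
print(f"bias, divisor n-1: {bias_n1:+.3f}")  # theory: 0
```

The simulated values should land close to the theoretical ones, with some run-to-run noise if the seed is changed.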
Distribution of Sample Means • μ = 54.94 • mean of means = 54.93 • mean of medians = 54.95
Measuring Relative Efficiency • Example: presumes an underlying normal distribution
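One way to see relative efficiency is to simulate the sampling distributions of two estimators of μ and compare their variances. The sketch below does this for the sample mean and the sample median under a normal population. Only μ = 54.94 comes from the slides; the standard deviation, sample size, and number of replications are assumptions, and the variance ratio is used here as a common definition of relative efficiency rather than a formula recovered from the deck.

```python
import random
import statistics

random.seed(1)

MU, SIGMA = 54.94, 10.0  # SIGMA is an assumed value; the slides give only mu
N = 25                   # assumed sample size
REPS = 20_000            # number of simulated samples

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)

# Relative efficiency of the median with respect to the mean:
# a value below 1 means the median is the noisier estimator.
rel_eff = var_mean / var_median
print(f"mean of means   = {statistics.fmean(means):.2f}")
print(f"mean of medians = {statistics.fmean(medians):.2f}")
print(f"relative efficiency (median vs. mean) = {rel_eff:.2f}")
```

Both estimators center on μ (neither is biased here), but the median varies more from sample to sample. For large normal samples the ratio approaches 2/π, about 0.64.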
Evaluating Bias and Efficiency Three estimators of μ:
Algebra of Expectation: The “sums” rule • The expected value of a weighted sum of random variables is the weighted sum of the expected values.
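Written out, with weights w_i and random variables X_i (notation assumed here, since the slide's own formula did not survive), the rule and its best-known application look like this:

```latex
E\!\left[\sum_{i=1}^{n} w_i X_i\right] = \sum_{i=1}^{n} w_i\, E[X_i]

% Special case: the sample mean uses w_i = 1/n. If every E[X_i] = \mu, then
E[\bar{X}] = \sum_{i=1}^{n} \frac{1}{n}\,\mu = \mu
```

The rule does not require the X_i to be independent, which is why the sample mean is unbiased for μ even when observations are correlated.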
Algebra of Expectation: The Variance Rule • The variance of a weighted sum of independent random variables is the sum of the variances, each multiplied by the square of the respective weight.
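The variance rule can be checked numerically. The sketch below builds a weighted sum of two independent normal variables and compares the simulated variance with the value the rule predicts. The weights, standard deviations, and means are hypothetical.

```python
import random
import statistics

random.seed(42)

A, B = 2.0, -3.0       # hypothetical weights
SD_X, SD_Y = 1.0, 2.0  # hypothetical standard deviations
REPS = 100_000

# X and Y are drawn independently, as the rule requires.
xs = [random.gauss(0.0, SD_X) for _ in range(REPS)]
ys = [random.gauss(5.0, SD_Y) for _ in range(REPS)]
combo = [A * x + B * y for x, y in zip(xs, ys)]

# Rule: Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) = 4*1 + 9*4 = 40
predicted = A ** 2 * SD_X ** 2 + B ** 2 * SD_Y ** 2
simulated = statistics.pvariance(combo)
print(f"predicted variance = {predicted:.1f}")
print(f"simulated variance = {simulated:.1f}")
```

Because the weights are squared, a negative weight still adds variance. Applying the same rule to the sample mean (weights 1/n) gives Var(X̄) = σ²/n.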
Robust Estimation • trimmed mean • biweighted mean • The goal is to reduce the numerator and the denominator such that the ratio itself is lower.
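A trimmed mean drops a fixed share of the smallest and largest observations before averaging, which limits the pull of outliers. The helper below is a minimal sketch with hypothetical data; `trimmed_mean` is an illustrative function, not one named in the slides, and the biweight estimator is not covered here.

```python
import statistics

def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each tail."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number cut from each end
    kept = ordered[k:len(ordered) - k]
    return statistics.fmean(kept)

# Hypothetical data: nine typical values and one wild outlier.
data = [52, 53, 54, 54, 55, 55, 56, 57, 58, 250]

print(statistics.fmean(data))    # 74.4
print(trimmed_mean(data, 0.10))  # 55.25
```

With 10% trimmed from each tail, the single outlier no longer drags the estimate away from the bulk of the data.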
Methods of Point Estimation • Method of “moments” • Using the sample equivalent for your point estimate • Bias: example of the standard deviation • Minimize error • Least Squares or Least Absolute Error • Weighted and Generalized Least Squares • Maximum Likelihood • Choosing the estimate that maximizes the probability you would see the sample you’ve got.
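Maximum likelihood can be illustrated with a proportion: try many candidate values of π and keep the one under which the observed sample is most probable. The sketch below uses a simple grid search on hypothetical data (7 successes in 20 trials). Real applications would normally use calculus or a numerical optimizer instead of a grid.

```python
import math

# Hypothetical sample: 7 successes in 20 independent trials.
SUCCESSES, TRIALS = 7, 20

def log_likelihood(p):
    """Binomial log-likelihood of the sample, dropping the constant term."""
    return SUCCESSES * math.log(p) + (TRIALS - SUCCESSES) * math.log(1 - p)

# Grid search over candidate values 0.001, 0.002, ..., 0.999.
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=log_likelihood)

print(p_hat)               # 0.35
print(SUCCESSES / TRIALS)  # 0.35, the sample proportion
```

Here the maximum likelihood estimate coincides with the method-of-moments estimate (the sample proportion), which is common but not guaranteed in general.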
Interval Estimates • Express estimate as a range rather than a specific point value • True value may lie outside the range you identify, but you know the probability this will happen • Also called • Confidence Interval • Margin of Error • Sampling Error
Central Limit Theorem • The sampling distribution of a sample mean of X approaches normality as the sample size gets large, regardless of the distribution of X. • The mean of this sampling distribution is μ_X and, for a simple random sample of size n, the standard deviation (standard error) is σ_X/√n • The sampling distribution of any statistic that is formed as a weighted sum of N observed variables from a random sample approaches normality as N gets large.
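The theorem can be checked by simulation with a population that is far from normal. The sketch below draws samples from an exponential distribution (an assumed choice, with mean 1 and standard deviation 1) and compares the spread of the sample means with σ/√n.

```python
import random
import statistics

random.seed(7)

N = 50         # assumed sample size
REPS = 20_000  # number of simulated samples

# Exponential(1) is strongly skewed, with mean 1 and standard deviation 1.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(REPS)
]

se_theory = 1.0 / N ** 0.5  # sigma / sqrt(n)
se_simulated = statistics.pstdev(sample_means)

# Under normality, about 95% of sample means fall within 1.96 SE of mu.
within = sum(abs(m - 1.0) <= 1.96 * se_theory for m in sample_means) / REPS

print(f"theoretical SE = {se_theory:.4f}")
print(f"simulated SE   = {se_simulated:.4f}")
print(f"share within 1.96 SE = {within:.3f}")
```

Even though individual observations are skewed, the sample means behave approximately normally at this sample size.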
Methods for Interval Estimation • Analytic: Derived from Probability Theory • Empirical • “Bootstrapping” • Combined analytic & empirical for predicted values • “Clarify”
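Bootstrapping builds an interval empirically: resample the observed data with replacement many times, recompute the statistic each time, and read the interval off the percentiles of those estimates. The sketch below implements the simple percentile version; the data are simulated stand-ins, and `bootstrap_ci` is an illustrative helper, not a function from any package mentioned in the slides.

```python
import random
import statistics

random.seed(123)

# Hypothetical observed sample (any list of numbers would do).
observed = [random.gauss(54.94, 10.0) for _ in range(40)]

def bootstrap_ci(data, stat=statistics.fmean, reps=5000, level=0.95):
    """Percentile bootstrap interval for `stat` at the given confidence level."""
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat(random.choices(data, k=n)) for _ in range(reps)
    )
    alpha = 1 - level
    lo = estimates[int(reps * alpha / 2)]
    hi = estimates[int(reps * (1 - alpha / 2)) - 1]
    return lo, hi

low, high = bootstrap_ci(observed)
print(f"sample mean = {statistics.fmean(observed):.2f}")
print(f"95% bootstrap interval = ({low:.2f}, {high:.2f})")
```

Passing `stat=statistics.median` gives an interval for the median, a case where an analytic formula is less convenient.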
Simple Example (figure not reproduced; labels shown: 99%, 99.9%)
Sample Size Estimation • Solve for n (figure not reproduced; confidence levels shown: 95%, 99%, 99.9%)
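The slide's own formula did not survive, so the sketch below uses one common case as an assumption: the margin of error for a proportion under the normal approximation, E = z·√(p(1−p)/n), solved for n to give n = z²·p(1−p)/E². The function name and the ±3-point margin are illustrative.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, level=0.95, p=0.5):
    """Smallest n whose normal-approximation margin of error is <= margin.

    p = 0.5 is the worst case (largest variance) for a proportion.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # two-sided critical value
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

for level in (0.95, 0.99, 0.999):
    n = sample_size_for_proportion(margin=0.03, level=level)
    print(f"{level:.1%} confidence, +/-3 points: n = {n}")
```

Higher confidence levels demand larger samples for the same margin, and halving the margin roughly quadruples the required n.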
Interval Estimates and Hypothesis Tests • If the null hypothesized value falls outside the interval estimate, you can reject the null hypothesis at the α level corresponding to the confidence level • A 95% confidence level is equivalent to α = .05 • It is possible to construct one-sided intervals
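The link between intervals and tests can be shown with a normal-theory (z) interval and the matching two-sided test. The estimate, standard error, and null values below are hypothetical, and both helper functions are illustrative.

```python
from statistics import NormalDist

def z_interval(mean, se, level=0.95):
    """Two-sided normal-theory interval: mean +/- z * se."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return mean - z * se, mean + z * se

def two_sided_p(mean, se, null_value):
    """Two-sided p-value for H0: parameter == null_value (z test)."""
    z = abs(mean - null_value) / se
    return 2 * (1 - NormalDist().cdf(z))

# Hypothetical estimate and standard error.
estimate, se = 54.9, 1.5

for null_value in (52.5, 51.0):
    lo, hi = z_interval(estimate, se)
    outside = not (lo <= null_value <= hi)
    reject = two_sided_p(estimate, se, null_value) < 0.05
    print(null_value, "outside interval:", outside, "| reject at .05:", reject)
```

For each null value the two answers agree: a value inside the 95% interval is not rejected at α = .05, and a value outside it is.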