The Central Limit Theorem. Paul Cornwell March 31, 2011. Statement of Theorem.
The Central Limit Theorem Paul Cornwell March 31, 2011
Statement of Theorem • Let X1,…,Xn be independent, identically distributed random variables with mean μ and finite, positive variance σ². When n is large, the average of these variables is approximately normally distributed with mean μ and standard deviation σ/√n.
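A quick simulation can make the statement concrete. The sketch below is only an illustration: the choice of an Exponential(1) population (so μ = 1 and σ = 1), the sample size n = 50, and the helper name `sample_means` are all assumptions, not part of the original slides.

```python
import random
import statistics

def sample_means(draw, n, reps, seed=0):
    """Return `reps` averages, each taken over `n` i.i.d. draws from `draw(rng)`."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]

# Assumed example population: Exponential(rate=1), so mu = 1 and sigma = 1.
n, reps = 50, 5_000
means = sample_means(lambda rng: rng.expovariate(1.0), n, reps)

observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)
predicted_sd = 1.0 / n ** 0.5  # sigma / sqrt(n)

print(f"mean of averages: {observed_mean:.4f} (theory: 1.0000)")
print(f"sd of averages:   {observed_sd:.4f} (theory: {predicted_sd:.4f})")
```

The observed mean and standard deviation of the averages should land close to μ and σ/√n; how close the whole distribution is to normal is a separate question.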
Questions • How large a sample size is required for the Central Limit Theorem (CLT) approximation to be good? • What counts as a ‘good’ approximation?
Importance • Permits analysis of random variables even when underlying distribution is unknown • Estimating parameters • Hypothesis Testing • Polling
Testing for Normality • Performing a hypothesis test to determine whether a set of data came from a normal distribution • Considerations • Power: probability that a test will reject the null hypothesis when it is false • Ease of Use
Testing for Normality • Problems • No test is desirable in every situation (no universally most powerful test) • Some tests cannot handle the composite hypothesis of normality (i.e. a normal distribution with unspecified mean and variance rather than the standard normal) • The outcome of a test is sensitive to sample size; with enough data, even a slight departure from normality leads to rejection of the null hypothesis
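The sample-size problem can be illustrated with a small simulation. The sketch below uses the Jarque-Bera test, which is built from the sample's third and fourth standardized moments; it is chosen here only because it fits in a few lines of standard-library Python, not because the slides name it. The data are assumed to be averages of 30 Exponential(1) draws (generated directly as Gamma(30, 1/30) values), which are close to normal but not exactly so. The function names and the 5% level are illustrative.

```python
import math
import random

def jarque_bera(xs):
    """Return (JB statistic, asymptotic p-value) for the sample xs."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    # A chi-square variable with 2 degrees of freedom has survival function exp(-x / 2).
    return jb, math.exp(-jb / 2)

def rejection_rate(n, reps=100, alpha=0.05, seed=1):
    """Share of simulated samples of size n for which normality is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # Gamma(shape=30, scale=1/30) is the law of the mean of 30 Exponential(1) draws.
        xs = [rng.gammavariate(30, 1 / 30) for _ in range(n)]
        if jarque_bera(xs)[1] < alpha:
            rejections += 1
    return rejections / reps

small = rejection_rate(20)
large = rejection_rate(2000)
print(f"rejection rate at n=20:   {small:.2f}")
print(f"rejection rate at n=2000: {large:.2f}")
```

The same mildly non-normal population is rejected rarely with small samples and almost always with large ones, so a rejection by itself says little about whether the normal approximation is adequate in practice.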
Characteristics of the Normal Distribution • Symmetric • Unimodal • Bell-shaped • Continuous
Closeness to Normal • Skewness: Measures the asymmetry of a distribution. • Defined as the third standardized moment • Skew of normal distribution is 0
Closeness to Normal • Kurtosis: Measures peakedness or heaviness of the tails. • Defined as the fourth standardized moment • Kurtosis of normal distribution is 3
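Both measures are straightforward to estimate from data. The sketch below computes the sample versions of the third and fourth standardized moments and applies them to three simulated populations; the helper names, the sample size, and the choice of populations are assumptions made for illustration. For reference, the theoretical (skewness, kurtosis) pairs are (0, 3) for the normal, (0, 1.8) for the uniform, and (2, 9) for the exponential distribution.

```python
import random

def standardized_moment(xs, k):
    """k-th standardized sample moment: the mean of ((x - mean) / sd) ** k."""
    n = len(xs)
    mean = sum(xs) / n
    sd = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    return sum(((x - mean) / sd) ** k for x in xs) / n

def skewness(xs):
    return standardized_moment(xs, 3)

def kurtosis(xs):
    return standardized_moment(xs, 4)

rng = random.Random(0)
size = 100_000
samples = {
    "normal": [rng.gauss(0.0, 1.0) for _ in range(size)],
    "uniform": [rng.random() for _ in range(size)],
    "exponential": [rng.expovariate(1.0) for _ in range(size)],
}
for name, xs in samples.items():
    print(f"{name:12s} skewness={skewness(xs):6.3f} kurtosis={kurtosis(xs):6.3f}")
```

Distance from (0, 3) gives a rough sense of how far a population starts from normality, and therefore how much averaging may be needed before the CLT approximation is reasonable.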
Binomial Distribution • Cumulative distribution function:
Binomial Distribution* *from R
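One possible way to quantify how good the approximation is for a binomial count is the largest gap between the exact binomial CDF and the normal CDF with matching mean np and standard deviation √(np(1 − p)). The sketch below is written in Python rather than R, and both the maximum-gap criterion and the continuity correction (evaluating the normal CDF at k + 0.5) are assumptions; the slides do not state which criterion was used.

```python
import math
from statistics import NormalDist

def max_cdf_error(n, p):
    """Largest gap between the Binomial(n, p) CDF and its normal
    approximation (with continuity correction) over k = 0..n."""
    approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    cdf, worst = 0.0, 0.0
    for k in range(n + 1):
        # Accumulate P(X <= k) one probability mass term at a time.
        cdf += math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        worst = max(worst, abs(cdf - approx.cdf(k + 0.5)))
    return worst

for p in (0.5, 0.1):
    for n in (10, 30, 100):
        print(f"p={p:.1f} n={n:3d} max CDF error={max_cdf_error(n, p):.5f}")
```

The error shrinks as n grows and is larger for skewed cases such as p = 0.1 than for the symmetric case p = 0.5.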
Uniform Distribution • Cumulative distribution function:
Uniform Distribution* *from R
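For uniform data the comparison can be done exactly, because the sum of n independent Uniform(0, 1) variables follows the Irwin-Hall distribution, whose CDF has a closed form. The sketch below is written in Python rather than R and measures the largest gap between that CDF and the matching normal CDF (mean n/2, standard deviation √(n/12)). The maximum-gap criterion, the grid size, and the function names are assumptions for illustration. Comparing sums is equivalent to comparing averages, since the two differ only by the factor n.

```python
import math
from fractions import Fraction
from statistics import NormalDist

def irwin_hall_cdf(x, n):
    """Exact P(U1 + ... + Un <= x) for i.i.d. Uniform(0, 1) variables."""
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    x = Fraction(x)  # exact arithmetic avoids cancellation in the alternating sum
    total = sum(
        (-1) ** k * math.comb(n, k) * (x - k) ** n
        for k in range(math.floor(x) + 1)
    )
    return float(total / math.factorial(n))

def max_cdf_error(n, grid=400):
    """Largest gap between the Irwin-Hall CDF and the normal CDF on a grid over [0, n]."""
    approx = NormalDist(mu=n / 2, sigma=math.sqrt(n / 12))
    points = (n * j / grid for j in range(grid + 1))
    return max(abs(irwin_hall_cdf(x, n) - approx.cdf(x)) for x in points)

for n in (1, 2, 4, 12):
    print(f"n={n:2d} max CDF error={max_cdf_error(n):.5f}")
```

Because the uniform distribution is symmetric with light tails, the gap falls quickly even for small n.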
Exponential Distribution • Cumulative distribution function:
Exponential Distribution* *from R
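The exponential case can also be checked exactly: the sum of n independent Exponential(λ) variables has an Erlang distribution with CDF 1 − Σ_{k=0}^{n−1} e^{−λx}(λx)^k / k!. The sketch below is written in Python rather than R and compares that CDF, for λ = 1, with the normal CDF having mean n and standard deviation √n. The maximum-gap criterion, the six-sigma grid, and the function names are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def erlang_cdf(x, n, rate=1.0):
    """P(X1 + ... + Xn <= x) for i.i.d. Exponential(rate) variables."""
    if x <= 0:
        return 0.0
    term = math.exp(-rate * x)  # k = 0 term of the sum
    tail = term
    for k in range(1, n):
        term *= rate * x / k  # builds e^(-rate*x) * (rate*x)^k / k! iteratively
        tail += term
    return 1.0 - tail

def max_cdf_error(n, grid=2000):
    """Largest gap between the Erlang(n, 1) CDF and the Normal(n, sqrt(n)) CDF,
    checked on a grid within six standard deviations of the mean."""
    sigma = math.sqrt(n)
    approx = NormalDist(mu=n, sigma=sigma)
    lo, hi = n - 6 * sigma, n + 6 * sigma
    points = (lo + (hi - lo) * j / grid for j in range(grid + 1))
    return max(abs(erlang_cdf(x, n) - approx.cdf(x)) for x in points)

for n in (1, 10, 30, 100):
    print(f"n={n:3d} max CDF error={max_cdf_error(n):.5f}")
```

The gap shrinks more slowly here than in the symmetric cases, which is consistent with the exponential distribution's strong skewness.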
For Next Time… • Find minimum n values for more distributions • Refine criteria for quality of approximation • Explore distributions that have no mean (e.g. Cauchy) • Classify distributions in order to have more general guidelines for minimum sample size