
The Central Limit Theorem

The Central Limit Theorem. Paul Cornwell March 31, 2011. Statement of Theorem.


Presentation Transcript


  1. The Central Limit Theorem • Paul Cornwell • March 31, 2011

  2. Statement of Theorem • Let X1,…,Xn be independent, identically distributed random variables with mean μ and finite, positive variance σ². When n is large, the average of these variables is approximately normally distributed with mean μ and standard deviation σ/√n.
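A quick simulation can illustrate the statement. The sketch below is not from the presentation: it assumes an exponential distribution with rate 1 (so μ = 1 and σ = 1), n = 50, and a hypothetical helper `sample_means`, and it compares the mean and standard deviation of the simulated averages with μ and σ/√n.

```python
import random
import statistics


def sample_means(draw, n, reps, seed=0):
    """Return `reps` averages, each taken over `n` independent draws from `draw(rng)`."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]


# Assumed example distribution: exponential with rate 1, so mu = 1 and sigma = 1.
n = 50
means = sample_means(lambda rng: rng.expovariate(1.0), n=n, reps=20_000)

clt_mean = 1.0              # mu
clt_sd = 1.0 / n ** 0.5     # sigma / sqrt(n)
observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)

print(f"mean of averages: {observed_mean:.4f} (theorem: {clt_mean:.4f})")
print(f"sd of averages:   {observed_sd:.4f} (theorem: {clt_sd:.4f})")
```

Agreement of the first two moments does not by itself show that the shape is normal; it only checks the centering and scaling that the theorem predicts.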

  3. Questions • How large a sample size is required for the Central Limit Theorem (CLT) approximation to be good? • What is a ‘good’ approximation?

  4. Importance • Permits analysis of random variables even when underlying distribution is unknown • Estimating parameters • Hypothesis Testing • Polling

  5. Testing for Normality • Performing a hypothesis test to determine whether a set of data came from a normal distribution • Considerations • Power: probability that a test will reject the null hypothesis when it is false • Ease of Use

  6. Testing for Normality • Problems • No test is desirable in every situation (no universally most powerful test) • Some cannot test the composite hypothesis of normality (i.e. a normal distribution with unspecified mean and variance) • Test results are sensitive to sample size; with enough data, even small departures from normality lead to rejection of the null hypothesis

  7. Characteristics of Distribution • Symmetric • Unimodal • Bell-shaped • Continuous

  8. Closeness to Normal • Skewness: Measures the asymmetry of a distribution. • Defined as the third standardized moment • Skew of normal distribution is 0
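As a minimal sketch of the definition above, the third standardized moment can be estimated from a sample as follows. The function name `skewness`, the sample sizes, and the choice of comparison distributions (standard normal, with skewness 0, and exponential with rate 1, with skewness 2) are illustrative assumptions, not taken from the slides.

```python
import math
import random
import statistics


def skewness(data):
    """Estimate the third standardized moment E[(X - mu)^3] / sigma^3 from a sample."""
    mu = statistics.fmean(data)
    sigma = math.sqrt(statistics.fmean((x - mu) ** 2 for x in data))
    return statistics.fmean(((x - mu) / sigma) ** 3 for x in data)


rng = random.Random(1)
normal_draws = [rng.gauss(0, 1) for _ in range(100_000)]
expo_draws = [rng.expovariate(1.0) for _ in range(100_000)]

skew_normal = skewness(normal_draws)   # theoretical value: 0
skew_expo = skewness(expo_draws)       # theoretical value: 2

print(f"normal:      {skew_normal:+.3f}")
print(f"exponential: {skew_expo:+.3f}")
```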

  9. Closeness to Normal • Kurtosis: Measures peakedness or heaviness of the tails. • Defined as the fourth standardized moment • Kurtosis of normal distribution is 3
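The fourth standardized moment can be estimated in the same way. In this sketch the function name `kurtosis` and the three comparison distributions are assumptions chosen for illustration; their theoretical kurtosis values are 3 (normal), 1.8 (uniform), and 9 (exponential).

```python
import math
import random
import statistics


def kurtosis(data):
    """Estimate the fourth standardized moment E[(X - mu)^4] / sigma^4 from a sample."""
    mu = statistics.fmean(data)
    sigma = math.sqrt(statistics.fmean((x - mu) ** 2 for x in data))
    return statistics.fmean(((x - mu) / sigma) ** 4 for x in data)


rng = random.Random(3)
size = 200_000
kurt = {
    "normal": kurtosis([rng.gauss(0, 1) for _ in range(size)]),
    "uniform": kurtosis([rng.random() for _ in range(size)]),
    "exponential": kurtosis([rng.expovariate(1.0) for _ in range(size)]),
}

for name, value in kurt.items():
    print(f"{name:12s} {value:.3f}")
```

Estimates for heavy-tailed distributions such as the exponential are noisy, because the fourth power amplifies the influence of a few extreme draws.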

  10. Binomial Distribution • Cumulative distribution function: F(k) = Σ_{i=0}^{k} C(n, i) p^i (1 − p)^(n − i), for k = 0, 1, …, n

  11. Binomial Distribution* *from R

  12. Uniform Distribution • Cumulative distribution function: F(x) = (x − a)/(b − a) for a ≤ x ≤ b (0 for x < a, 1 for x > b)

  13. Uniform Distribution* *from R

  14. Exponential Distribution • Cumulative distribution function: F(x) = 1 − e^(−λx) for x ≥ 0 (0 for x < 0)

  15. Exponential Distribution* *from R

  16. For Next Time… • Find n values for more distributions • Refine criteria for quality of approximation • Explore distributions with no mean • Classify distributions in order to have more general guidelines for minimum sample size

  17. The Central Limit Theorem (Pt 2) • Paul Cornwell • May 2, 2011

  18. Review • Central Limit Theorem: Averages of i.i.d. variables become approximately normally distributed as sample size increases • Rate of convergence depends on underlying distribution • What sample size is needed to produce a good approximation from the CLT?

  19. Questions • Real-life applications of the Central Limit Theorem • What does kurtosis tell us about a distribution? • What is the rationale for requiring np ≥ 5? • What about distributions with no mean?

  20. Applications of Theorem • Probability for total distance covered in a random walk tends towards normal • Hypothesis testing • Confidence intervals (polling) • Signal processing, noise cancellation

  21. Kurtosis • Measures the “peakedness” of a distribution • A higher peak means fatter tails

  22. Why np? • Traditional rule of thumb for the normal approximation to the binomial is np > 5 or 10 • Skewness of the binomial distribution increases as p moves away from .5 • Skewed distributions require a larger n for convergence
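The link between p, skewness, and approximation quality can be checked numerically. The sketch below uses the standard binomial skewness formula (1 − 2p)/√(np(1 − p)) and measures the largest gap between the exact binomial CDF and a normal CDF. The continuity correction (evaluating the normal CDF at k + 0.5), the fixed n = 100, and the function names are assumptions made for this illustration and are not taken from the slides.

```python
import math
from statistics import NormalDist


def binom_skewness(n, p):
    """Skewness of a Binomial(n, p) distribution."""
    return (1 - 2 * p) / math.sqrt(n * p * (1 - p))


def max_cdf_gap(n, p):
    """Largest gap over k = 0..n between the exact binomial CDF and the
    normal approximation with a continuity correction (an assumed choice)."""
    approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    cdf, gap = 0.0, 0.0
    for k in range(n + 1):
        cdf += math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        gap = max(gap, abs(cdf - approx.cdf(k + 0.5)))
    return gap


n = 100
gaps = {p: max_cdf_gap(n, p) for p in (0.5, 0.2, 0.05)}
for p, gap in gaps.items():
    print(f"p={p:<4} np={n * p:>5.1f} skewness={binom_skewness(n, p):+.3f} max CDF gap={gap:.4f}")
```

With n held fixed, the gap grows as p moves away from .5, which is consistent with stating the rule of thumb in terms of np rather than n alone.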

  23. Cauchy Distribution • Has no moments (including mean, variance) • The average of i.i.d. Cauchy variables has the same distribution as a single observation • CLT does not apply
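A small simulation can make the failure of the CLT visible. In this sketch, which is an illustration rather than the presenter's code, standard Cauchy draws are generated by the inverse-CDF method, and the interquartile range (IQR) of the averages is compared with that of standard normal averages. The IQR is an assumed choice of spread measure, used because the Cauchy distribution has no variance; for a single standard Cauchy draw it equals 2.

```python
import math
import random
import statistics


def cauchy(rng):
    """Standard Cauchy draw via the inverse CDF: tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))


def gauss(rng):
    """Standard normal draw, used as the well-behaved comparison case."""
    return rng.gauss(0, 1)


def iqr_of_averages(draw, n, reps, rng):
    """Interquartile range of `reps` averages, each over `n` draws."""
    means = [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1


rng = random.Random(2)
cauchy_iqr, normal_iqr = {}, {}
for n in (1, 10, 100):
    cauchy_iqr[n] = iqr_of_averages(cauchy, n, 5_000, rng)
    normal_iqr[n] = iqr_of_averages(gauss, n, 5_000, rng)
    print(f"n={n:<4} Cauchy IQR={cauchy_iqr[n]:.3f}  normal IQR={normal_iqr[n]:.3f}")
```

The normal averages concentrate at rate 1/√n, while the spread of the Cauchy averages stays near 2 regardless of n.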

  24. Beta Distribution • α = β = 1/3 • Distribution is symmetric and bimodal • Convergence to normal is fast in averages

  25. Student’s t Distribution • Heavier-tailed, bell-shaped curve • Approaches normal distribution as degrees of freedom increase
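The approach to the normal distribution can be seen by comparing densities directly. This sketch evaluates the standard Student's t density formula and reports the largest gap from the standard normal density on a grid; the grid range, the degrees of freedom shown, and the function names are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def max_density_gap(df, lo=-6.0, hi=6.0, steps=1200):
    """Largest absolute difference between the t and standard normal densities on a grid."""
    normal = NormalDist()
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(abs(t_pdf(x, df) - normal.pdf(x)) for x in xs)


gaps = {df: max_density_gap(df) for df in (1, 5, 30, 100)}
for df, gap in gaps.items():
    print(f"df={df:<4} max density gap={gap:.4f}")
```

With 1 degree of freedom the t distribution is the Cauchy distribution, which is why the gap is largest there.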

  26. Criteria • 4 statistics: K-S distance, tail probabilities, skewness and kurtosis • Different thresholds for “adequate” and “superior” approximations • Both are fairly conservative
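Two of these statistics can be estimated by simulation, as in the rough sketch below. It assumes exponential data with rate 1, a two-sided tail probability at the cutoff 1.96 (nominal value 0.05), and hypothetical function names. The presentation's actual thresholds and definitions are not given in the transcript, so none are applied here.

```python
import math
import random
import statistics
from statistics import NormalDist


def standardized_means(n, reps, rng):
    """Averages of n Exp(1) draws, centered at mu = 1 and divided by sigma/sqrt(n) = 1/sqrt(n)."""
    scale = 1 / math.sqrt(n)
    return [(statistics.fmean(rng.expovariate(1.0) for _ in range(n)) - 1) / scale
            for _ in range(reps)]


def ks_distance(zs):
    """Largest gap between the empirical CDF of zs and the standard normal CDF."""
    normal = NormalDist()
    zs = sorted(zs)
    m = len(zs)
    gap = 0.0
    for i, z in enumerate(zs):
        phi = normal.cdf(z)
        # The empirical CDF jumps from i/m to (i + 1)/m at z; check both sides.
        gap = max(gap, abs((i + 1) / m - phi), abs(i / m - phi))
    return gap


def two_sided_tail(zs, cutoff=1.96):
    """Fraction of standardized averages beyond +/- cutoff (nominal 0.05 for a normal)."""
    return sum(abs(z) > cutoff for z in zs) / len(zs)


rng = random.Random(4)
results = {}
for n in (5, 50):
    zs = standardized_means(n, reps=20_000, rng=rng)
    results[n] = (ks_distance(zs), two_sided_tail(zs))
    print(f"n={n:<3} K-S distance={results[n][0]:.4f}  P(|Z| > 1.96)={results[n][1]:.4f}")
```

Because the distance is computed from simulated averages, it includes sampling noise in addition to the true distance between the distributions.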

  27. Adequate Approximation

  28. Stronger Approximation

  29. Conclusions • Skewness is difficult to shake • Tail probabilities are fairly accurate for small sample sizes • Traditional recommendation is small for many common distributions
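One way to see why skewness lingers: for the average of n i.i.d. variables with finite moments, skewness shrinks like 1/√n, while excess kurtosis (kurtosis minus 3) shrinks like 1/n. The sketch below applies these standard identities to an exponential distribution (skewness 2, excess kurtosis 6). The tolerance of 0.1 and the function names are hypothetical and are not the presentation's thresholds.

```python
import math


def skew_of_mean(skew, n):
    """Skewness of the average of n i.i.d. variables with skewness `skew`."""
    return skew / math.sqrt(n)


def excess_kurtosis_of_mean(excess, n):
    """Excess kurtosis of the average of n i.i.d. variables with excess kurtosis `excess`."""
    return excess / n


def n_for_skew(skew, tol):
    """Approximate smallest n at which the average's skewness is within `tol` of 0."""
    return math.ceil((abs(skew) / tol) ** 2)


def n_for_kurtosis(excess, tol):
    """Approximate smallest n at which the average's excess kurtosis is within `tol` of 0."""
    return math.ceil(abs(excess) / tol)


# Exponential distribution: skewness 2, excess kurtosis 6; tolerance is a made-up example.
tol = 0.1
n_skew = n_for_skew(2.0, tol)
n_kurt = n_for_kurtosis(6.0, tol)
print(f"n needed for |skewness| <= {tol}:        {n_skew}")
print(f"n needed for |excess kurtosis| <= {tol}: {n_kurt}")
```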
