Understanding Normal Distribution in Statistics

Learn about the Normal Distribution, a key concept in statistics, and how to calculate probabilities using Z-scores and standard deviations. Discover how to transform any normal distribution into a standard normal distribution for easier analysis.

eposey

Presentation Transcript


  1. Chapter 4: Continuous Random Variables and Probability Distributions • 4.1 - Probability Density Functions • 4.2 - Cumulative Distribution Functions and Expected Values • 4.3 - The Normal Distribution • 4.4 - The Exponential and Gamma Distributions • 4.5 - Other Continuous Distributions • 4.6 - Probability Plots

  2. ~ The Normal Distribution ~ (a.k.a. “The Bell Curve”) • Symmetric, unimodal • Models many (but not all) natural systems • Mathematical properties make it useful to work with. Notation: X ~ N(μ, σ), with mean μ and standard deviation σ. [Figure: bell curve centered at the mean μ with standard deviation σ; portrait of Johann Carl Friedrich Gauss, 1777-1855]

  3. Standard Normal Distribution: Z ~ N(0, 1), the SPECIAL CASE with mean 0 and standard deviation 1. Total Area = 1. The cumulative distribution function (cdf) is denoted by Φ(z). It is not expressible in explicit, closed form, but is tabulated, and computable in R via the command pnorm.
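Although the standard normal cdf has no closed form in elementary functions, it can be written with the error function as Φ(z) = ½[1 + erf(z/√2)]. The sketch below uses that identity in Python's standard library; Python itself is an assumption of this note (the slides use R's pnorm), and `phi` is a hypothetical helper name.

```python
import math

def phi(z: float) -> float:
    """Standard normal cdf via the error function (hypothetical helper, not from the slides)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(phi(0.0))  # area to the left of the mean: half of the total area
print(phi(1.2))  # counterpart of R's pnorm(1.2)
```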

  4. Example: Standard Normal Distribution, Z ~ N(0, 1). Find Φ(1.2) = P(Z ≤ 1.2), the area under the curve to the left of the “z-score” 1.2. Total Area = 1.

  5. Example: Standard Normal Distribution, Z ~ N(0, 1). Find Φ(1.2) = P(Z ≤ 1.2), the area under the curve to the left of the “z-score” 1.2. Total Area = 1. • Use the included table.

  6. Lecture Notes Appendix…

  7. Example: Standard Normal Distribution, Z ~ N(0, 1). Find Φ(1.2) = P(Z ≤ 1.2). • Use the included table. • Use R: • > pnorm(1.2) • [1] 0.8849303 Total Area = 1, so P(Z ≤ 1.2) = 0.88493 and P(Z > 1.2) = 0.11507. Note: Because this is a continuous distribution, P(Z = 1.2) = 0, so there is no difference between P(Z > 1.2) and P(Z ≥ 1.2), etc.
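The same left-tail and right-tail areas can be reproduced outside R. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` as a stand-in for pnorm (the variable names are illustrative only):

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal, Z ~ N(0, 1)

p_below = Z.cdf(1.2)       # P(Z <= 1.2), the area left of the z-score
p_above = 1.0 - p_below    # P(Z > 1.2), since the total area is 1

print(round(p_below, 5), round(p_above, 5))
```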

  8. Why be concerned about the Standard Normal Distribution Z ~ N(0, 1), when most “bell curves” X ~ N(μ, σ) don’t have mean = 0 and standard deviation = 1? Any normal distribution can be transformed to the standard normal distribution via a simple change of variable.
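The change of variable in question is standardization. The formula does not survive in the transcript, so the following is the usual textbook statement rather than a quotation from the slide:

```latex
Z = \frac{X - \mu}{\sigma} \sim N(0, 1),
\qquad
P(X \le x) = P\!\left(Z \le \frac{x - \mu}{\sigma}\right) = \Phi\!\left(\frac{x - \mu}{\sigma}\right)
```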

  9. Example. POPULATION (Year 2010): Random Variable X = Age at first birth, with X ~ N(25.4, 1.5), i.e., μ = 25.4 and σ = 1.5. Question: What proportion of the population had their first child before the age of 27.2 years old? P(X < 27.2) = ?

  10. Example. POPULATION (Year 2010): Random Variable X = Age at first birth, with X ~ N(25.4, 1.5), i.e., μ = 25.4 and σ = 1.5. Question: What proportion of the population had their first child before the age of 27.2 years old? P(X < 27.2) = ? The x-score = 27.2 must first be transformed to a corresponding z-score.
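Worked out with the slide's values (μ = 25.4, σ = 1.5), the standardization step is:

```latex
z = \frac{x - \mu}{\sigma} = \frac{27.2 - 25.4}{1.5} = \frac{1.8}{1.5} = 1.2
```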

  11. Example. POPULATION (Year 2010): Random Variable X = Age at first birth, with X ~ N(25.4, 1.5), i.e., μ = 25.4 and σ = 1.5. Question: What proportion of the population had their first child before the age of 27.2 years old? P(X < 27.2) = P(Z < 1.2) = 0.88493 • Using R: • > pnorm(27.2, 25.4, 1.5) • [1] 0.8849303
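Both routes to this answer, standardizing first or passing μ and σ directly, can be checked side by side. A small sketch, again assuming Python's `statistics.NormalDist` in place of pnorm:

```python
from statistics import NormalDist

mu, sigma = 25.4, 1.5   # mean and standard deviation from the slide
x = 27.2                # age of interest

z = (x - mu) / sigma                     # z-score of x
p_via_z = NormalDist().cdf(z)            # P(Z < z) on the standard normal
p_direct = NormalDist(mu, sigma).cdf(x)  # P(X < x), like pnorm(27.2, 25.4, 1.5)

print(z, p_via_z, p_direct)
```

The two probabilities agree up to floating-point rounding, which is the point of the change of variable.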

  12. Standard Normal Distribution, Z ~ N(0, 1). What symmetric interval about the mean 0 contains 95% of the population values? That is…

  13. Standard Normal Distribution, Z ~ N(0, 1). What symmetric interval about the mean 0 contains 95% of the population values? That is… the central area is 0.95, leaving 0.025 in each tail: -z.025 = ? and +z.025 = ? • Use the included table.

  14. Lecture Notes Appendix…

  15. Standard Normal Distribution, Z ~ N(0, 1). What symmetric interval about the mean 0 contains 95% of the population values? The central area is 0.95, leaving 0.025 in each tail. • Use the included table. • Use R: • > qnorm(.025) • [1] -1.959964 • > qnorm(.975) • [1] 1.959964 The “.025 critical values” are -z.025 = -1.96 and +z.025 = +1.96.
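The critical values can be recovered with an inverse cdf and then checked by confirming that the area between them is 0.95. A sketch assuming Python's `NormalDist.inv_cdf` as the analogue of qnorm:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

z_lower = Z.inv_cdf(0.025)  # cuts off 2.5% in the left tail
z_upper = Z.inv_cdf(0.975)  # cuts off 2.5% in the right tail

# Area between the two critical values; should be the central 95%.
central_area = Z.cdf(z_upper) - Z.cdf(z_lower)

print(z_lower, z_upper, central_area)
```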

  16. From Z ~ N(0, 1) to X ~ N(μ, σ), here X ~ N(25.4, 1.5): What symmetric interval about the mean age of 25.4 contains 95% of the population values? > areas = c(.025, .975) > qnorm(areas, 25.4, 1.5) [1] 22.46005 28.33995 That is, 22.46 ≤ X ≤ 28.34 yrs, corresponding to the “.025 critical values” -z.025 = -1.96 and +z.025 = +1.96.
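The interval endpoints are the critical values mapped back to the age scale, x = μ ± z·σ. The sketch below computes them both ways, assuming Python's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 25.4, 1.5  # mean age and standard deviation from the slide

# Route 1: quantiles of X ~ N(25.4, 1.5) directly, like qnorm(areas, 25.4, 1.5).
X = NormalDist(mu, sigma)
lower, upper = X.inv_cdf(0.025), X.inv_cdf(0.975)

# Route 2: take the standard normal critical value and undo the standardization.
z = NormalDist().inv_cdf(0.975)
lower_alt, upper_alt = mu - z * sigma, mu + z * sigma

print(round(lower, 2), round(upper, 2))
```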

  17. Standard Normal Distribution, Z ~ N(0, 1). Similarly… What symmetric interval about the mean 0 contains 90% of the population values? The central area is 0.90, leaving 0.05 in each tail: -z.05 = ? and +z.05 = ? • Use the included table.

  18. 0.95 ≈ average of 0.94950 and 0.95053, the table entries for z = 1.64 and z = 1.65… so average 1.64 and 1.65, giving z ≈ 1.645.
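The table-averaging shortcut can be compared with the exact quantile. A brief sketch, assuming Python's `statistics.NormalDist` in place of the printed table:

```python
from statistics import NormalDist

Z = NormalDist()

entry_164 = Z.cdf(1.64)  # table entry for z = 1.64
entry_165 = Z.cdf(1.65)  # table entry for z = 1.65

z_averaged = (1.64 + 1.65) / 2  # the shortcut: average the two z-scores
z_exact = Z.inv_cdf(0.95)       # exact 0.95 quantile, like qnorm(.95)

print(entry_164, entry_165, z_averaged, z_exact)
```

The averaged value differs from the exact quantile only in the fourth decimal place, which is why the table method is adequate here.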

  19. Standard Normal Distribution, Z ~ N(0, 1). Similarly… What symmetric interval about the mean 0 contains 90% of the population values? The central area is 0.90, leaving 0.05 in each tail. • Use the included table. • Use R: • > qnorm(.05) • [1] -1.644854 • > qnorm(.95) • [1] 1.644854 The “.05 critical values” are -z.05 = -1.645 and +z.05 = +1.645.

  20. Standard Normal Distribution, Z ~ N(0, 1). In general… What symmetric interval about the mean 0 contains 100(1 – α)% of the population values? The central area is 1 – α, leaving α/2 in each tail; the endpoints -z(α/2) and +z(α/2) are the “α/2 critical values”.
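The general rule fits in a one-line function: the upper critical value is the quantile at 1 − α/2. A sketch with a hypothetical helper name `critical_value`, assuming Python's `NormalDist.inv_cdf`:

```python
from statistics import NormalDist

def critical_value(alpha: float) -> float:
    """Upper alpha/2 critical value of Z ~ N(0, 1) (hypothetical helper, not from the slides)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for alpha in (0.10, 0.05):
    z = critical_value(alpha)
    # By symmetry, the interval (-z, +z) holds 100(1 - alpha)% of the values.
    print(alpha, -z, z)
```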

  21. Normal Approximation to the Binomial Distribution (continuous vs. discrete). Suppose a certain outcome exists in a population, with constant probability π. We will select a random sample of n individuals, so that the binary “Success vs. Failure” outcome of any individual is independent of the binary outcome of any other individual, i.e., n Bernoulli trials (e.g., coin tosses). Discrete random variable X = # Successes in sample (0, 1, 2, 3, …, n), with P(Success) = π and P(Failure) = 1 – π. Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability function” p(x) = C(n, x) π^x (1 – π)^(n – x), x = 0, 1, 2, …, n, where C(n, x) = n! / [x! (n – x)!].

  22. P(X = 10) for X ~ Bin(100, 0.2), shown as an area in the figure: > dbinom(10, 100, .2) [1] 0.00336282

  23. P(X ≤ 10) for X ~ Bin(100, 0.2), shown as an area in the figure: > pbinom(10, 100, .2) [1] 0.005696381
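Both R results follow from the standard binomial formula p(x) = C(n, x) p^x (1 − p)^(n − x): dbinom evaluates it at a single x, and pbinom sums it from 0 up to x. A minimal sketch in Python, where `binom_pmf` and `binom_cdf` are hypothetical helper names:

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p); plays the role of R's dbinom."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ Bin(n, p); plays the role of R's pbinom."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

p_eq_10 = binom_pmf(10, 100, 0.2)  # compare with dbinom(10, 100, .2)
p_le_10 = binom_cdf(10, 100, 0.2)  # compare with pbinom(10, 100, .2)

print(p_eq_10, p_le_10)
```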

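  24. Therefore, if… • X ~ Bin(n, π) with nπ ≥ 15 and n(1 – π) ≥ 15, • then… X is approximately N(nπ, √(nπ(1 – π))). That is… the “Sampling Distribution” of the sample proportion X/n is approximately N(π, √(π(1 – π)/n)).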
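To see how the approximation behaves, the sketch below compares an exact binomial probability with its normal counterpart for n = 100 and success probability 0.2, the values used in the earlier dbinom example. It assumes the usual approximating normal with mean n·p and standard deviation √(n·p·(1 − p)); the half-unit continuity correction is a common refinement that is not taken from the slides.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.2  # same n and success probability as the dbinom example

# Rule of thumb from the slide: both expected counts should be at least 15.
rule_ok = n * p >= 15 and n * (1 - p) >= 15

mean = n * p                # mean of the approximating normal
sd = sqrt(n * p * (1 - p))  # its standard deviation

x = 20
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))
approx = NormalDist(mean, sd).cdf(x + 0.5)  # continuity correction (assumption)

print(rule_ok, exact, approx)
```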

  25. Classical Continuous Probability Distributions • Normal distribution • Log-Normal ~ X is not normally distributed (e.g., skewed), but Y = “logarithm of X” is normally distributed • Student’s t-distribution ~ Similar to normal distribution, more flexible • F-distribution ~ Used when comparing multiple group means • Chi-squared distribution ~ Used extensively in categorical data analysis • Others for specialized applications ~ Gamma, Beta, Weibull…
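The log-normal entry can be made concrete: if Y = ln X is normal, then P(X ≤ x) = P(Y ≤ ln x), so any normal cdf routine handles it. A brief sketch with hypothetical parameters (the mean 0 and standard deviation 1 for Y are chosen purely for illustration):

```python
import math
from statistics import NormalDist

# Hypothetical parameters of Y = ln(X); not taken from the slides.
mu_log, sigma_log = 0.0, 1.0

def lognormal_cdf(x: float) -> float:
    """P(X <= x) when ln(X) ~ N(mu_log, sigma_log); requires x > 0."""
    return NormalDist(mu_log, sigma_log).cdf(math.log(x))

print(lognormal_cdf(1.0))  # ln(1) = 0 is the mean of Y here
print(lognormal_cdf(5.0))
```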
