
Continuous Probability Distributions: The Normal Distribution


Presentation Transcript


  1. Continuous Probability Distributions: The Normal Distribution

  2. Towards the Meaning of Continuous Probability Distribution Functions: When we introduced probabilities, we spoke of discrete events: S = collection of all possible sample points e_i. 0 ≤ P(e_i) ≤ 1 ⇒ the probability of any event is between zero and one. Σ P(e_i) = 1 ⇒ the probabilities of all elementary events sum to 1 (something happens).

  3. In particular, for the binomial distribution: • For the random variable X, x stands for a particular value. • 0 ≤ P(X = x) ≤ 1 ⇒ the probability that the random variable X takes the value x is between 0 and 1, inclusive. • Σ P(X = x) = 1 ⇒ the sum of the probabilities over all possible values of x is 1.

  4. A continuous variable has infinitely many possible values: With infinitely many possible values, the probability of observing any one particular value is essentially zero: Pr(X = x) = 0, e.g., for x = 1.0 vs 1.02 vs 1.0195 vs 1.01947, … Because it is zero for every single value, Pr(X = x) is not a useful quantity for a continuous random variable – instead, we consider a range of values for X: Pr(a ≤ X ≤ b). We can make this range quite broad or very narrow.

  5. Comparing Probability Distributions for Discrete vs Continuous Random Variables: We need new notation to describe probability distributions for continuous variables. Discrete: list all possible sample points, e.g., S = {e_i}, i = 1 to k. Continuous: state the range of possible values of X, e.g., −∞ < X < ∞. Note: ∞ is the symbol for 'infinity'.

  6. For a continuous random variable, X: • P(X = x) = 0. • Instead, we compute the probability of X within some interval: Pr(a ≤ X ≤ b) = ∫ from a to b of f_X(x) dx. This function f_X(x) is the probability density function of X. Don't worry – if you don't know or have forgotten calculus, I won't be asking you to work with this notation.
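
For concreteness, here is a quick sketch of this idea in Python, assuming SciPy is available; the Normal(μ = 100, σ = 15) parameters and the interval [90, 120] are made-up values for illustration. It checks that the area under the density between a and b matches the difference of cumulative probabilities.

```python
# Illustrative sketch (not part of the original slides).
# Pr(a <= X <= b) is the area under the density f_X(x) between a and b.
from scipy import integrate
from scipy.stats import norm

mu, sigma = 100, 15   # assumed example parameters
a, b = 90, 120        # interval of interest

# Area under the density curve between a and b (numerical integration)
area, _ = integrate.quad(lambda x: norm.pdf(x, loc=mu, scale=sigma), a, b)

# Same probability from the cumulative distribution function
prob = norm.cdf(b, loc=mu, scale=sigma) - norm.cdf(a, loc=mu, scale=sigma)

print(area, prob)     # both are about 0.656
```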

  7. Much of statistical inference is based upon a particular choice of probability density function, f_X(x): • the Normal distribution. • This function is a mathematical model describing one particular pattern of variation of values. • It is appropriate for continuous variables only.

  8. Practically speaking, the normal distribution function is appropriate for: • Many phenomena that occur naturally. • Special cases of other phenomena, e.g., averages of phenomena that, individually, are not normally distributed. • For example, the sampling distribution of means may follow a normal distribution even when the underlying data do not.

  9. The Normal Probability Density Function: f_X(x) = (1/(σ√(2π))) e^(−(x − μ)²/(2σ²)). Features to note: • The range of X is −∞ to ∞. • π is the mathematical constant 3.14159… • e is the mathematical constant 2.71828…

  10. The Normal Probability Density Function. Features to note: • μ is the mean of the distribution • σ is the standard deviation of the distribution • σ² is the variance • (x − μ)², the squared deviation from the mean, appears in the function

  11. Notation: • X ~ N(μ, σ²) • We say • “X follows a Normal distribution with mean μ and variance σ²” • or • “X is Normally distributed with mean μ and variance σ²”
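
For concreteness, a short Python sketch (assuming SciPy is available) that codes the density formula above directly and compares it with scipy.stats.norm. The values μ = 0, σ = 2 are arbitrary; note that scipy.stats.norm is parameterized by the standard deviation (scale = σ), not the variance σ².

```python
# Sketch: the normal density, hand-coded from the formula on the slides,
# compared with scipy.stats.norm.pdf. Example parameters are arbitrary.
import math
from scipy.stats import norm

def normal_pdf(x, mu, sigma):
    """f(x) = exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))"""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 0, 2
for x in (-2, 0, 3):
    print(x, normal_pdf(x, mu, sigma), norm.pdf(x, loc=mu, scale=sigma))  # the two agree
```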

  12. A Picture of the Normal Distribution – the infamous “Bell-shaped Curve”.

  13. There are infinitely many normal distributions, each determined by different values of μ and σ². • The shape of the Normal distribution is characteristically: • Smooth • Defined everywhere on the real axis • Bell-shaped • Symmetric about the mean x = μ (it is defined in terms of deviations about the mean)

  14. The area under the curve represents probability, and the total area under the curve = 1.

  15. The area under the curve up to the value x is often represented by the notation Pr[X < x] – the shaded area from −∞ up to x.

  16. A Feeling for the Shape of the Normal Distribution: μ locates the center, and σ measures the spread.

  17. If μ alone is changed – by adding a constant c – • the entire curve is shifted in location • but the shape remains the same.

  18. If σ alone is changed – by multiplying by a constant c – • the shape of the bell is changed • a larger variance implies a wider spread (or flatter curve) – but the area under the curve is always 1.

  19. Picturing the Normal Probability Density • As the variance, σ², increases: • the bell flattens (gets wider) • values close to the mean are less likely • values farther from the mean are more likely. • As the variance decreases: • the bell narrows • most values are close to the mean • values close to the mean are more likely.

  20. A Very Handy Rough Rule of Thumb: If X follows a Normal distribution, then ~68% of the values of X are in the interval μ ± σ, i.e., between μ − σ and μ + σ.

  21. If X follows a Normal distribution, then ~95% of the values of X are in the interval μ ± 1.96σ, and ~99% of the values of X are in the interval μ ± 2.576σ (roughly μ ± 2.58σ).
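
These coverage figures are easy to check in software; a quick sketch using the standard normal cumulative distribution function in SciPy (not part of the original slides):

```python
# Sketch: checking the 68% / 95% / 99% rule of thumb for the standard normal.
from scipy.stats import norm

for k in (1.0, 1.96, 2.576):
    coverage = norm.cdf(k) - norm.cdf(-k)   # Pr(mu - k*sigma <= X <= mu + k*sigma)
    print(f"within {k} standard deviations: {coverage:.4f}")
# prints roughly 0.6827, 0.9500, 0.9900
```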

  22. Why is the Normal Distribution So Important? • There are two types of data that follow a normal distribution: • A number of naturally occurring phenomena, for example: • heights of men (or women) • total blood cholesterol of adults • Special functions of some non-normally distributed phenomena, in particular sums and averages: • The sampling distribution of sample means tends to be ~ Normal.

  23. Research often focuses on sample means. Example: Blood pressure can vary with time of day, stress, food, illness, etc., so one reading may not be a good representation of what is “typical.” The distribution of a single blood pressure reading for an individual tends to be skewed, with a few high values.

  24. To have a better gauge of an individual’s BP, we might use the average of 5 readings: the sampling distribution of the mean of 5 readings for an individual tends to be approximately Normal, even when the original distribution is not.

  25. A Feeling for the Central Limit Theorem: • Shake a pair of dice. • On each roll, note the total of the two faces. • This total can range from 2 to 12. • The most likely total is 7. (Why?) • How often do the other totals arise? (Figure: histogram of dice totals for n = 100 rolls of the pair, totals 2 through 12 on the horizontal axis.)

  26. (Figure: histogram of dice totals for n = 1000 rolls of the pair.) As the number of rolls n increases, the distribution of the totals of the two dice begins to look more and more normal.
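
The dice experiment is easy to simulate; the sketch below (using NumPy, with an arbitrary seed and repetition counts) prints rough text histograms of the totals for 100 and 1000 rolls:

```python
# Sketch: simulating the dice-total histograms for n = 100 and n = 1000 rolls.
import numpy as np

rng = np.random.default_rng(seed=1)

for n_rolls in (100, 1000):
    totals = rng.integers(1, 7, size=(n_rolls, 2)).sum(axis=1)  # total of two dice per roll
    counts = np.bincount(totals, minlength=13)[2:]              # counts for totals 2..12
    print(f"n = {n_rolls}:")
    for total, count in zip(range(2, 13), counts):
        print(f"  {total:2d}: {'*' * int(round(50 * count / n_rolls))}")
```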

  27. A Statement of the Central Limit Theorem: • For any population with • mean μ and finite variance σ², • the sampling distribution of means, X̄, • from samples of size n from this population, • will be approximately normally distributed • with mean μ, • and variance σ²/n, • for n large. • That is, for n large, if X ~ ??(μ, σ²) – any distribution with mean μ and variance σ² – then X̄_n ~ N(μ, σ²/n).
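
A small simulation sketch of this statement (the population, sample size, and seed are arbitrary choices): draw many samples from a deliberately non-normal exponential population and check that the sample means have approximately mean μ and variance σ²/n.

```python
# Sketch: empirical check of the CLT using an exponential population,
# which has mean mu = 1 and variance sigma^2 = 1.
import numpy as np

rng = np.random.default_rng(seed=2)

n = 50          # sample size
reps = 20_000   # number of samples drawn

sample_means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

print("mean of the sample means:    ", sample_means.mean())  # close to mu = 1.0
print("variance of the sample means:", sample_means.var())   # close to sigma^2 / n = 0.02
```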

  28. This is the main reason for our interest in the normal distribution: • regardless of the underlying distribution • if we take a large enough sample • we can make probability statements about means from such samples • based upon the normal distribution. • This is true, even when the underlying distribution is discrete.

  29. Example: The Central Limit Theorem works even for VERY non-normal data. A population has only 3 outcomes in it: X takes the values 1, 2, and 9, each with P(X = x) = 1/3. The mean of {1, 2, 9} is μ = (1 + 2 + 9)/3 = 12/3 = 4, and the standard deviation of {1, 2, 9} is σ ≈ 3.6.

  30. Experiment: Take a sample of size n with replacement, compute the sum of all n values, and repeat many times. Look at the sampling distribution of the sums for n = 25, n = 50, and n = 100 (a histogram of the sums is shown for each).
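
A sketch of this experiment in Python (the repetition count and seed are arbitrary), checking the simulated sums of draws from {1, 2, 9} against the CLT prediction of mean nμ = 4n and variance nσ²:

```python
# Sketch: sampling with replacement from the population {1, 2, 9},
# summing each sample, and comparing with the CLT prediction.
import numpy as np

rng = np.random.default_rng(seed=3)
population = np.array([1, 2, 9])   # mu = 4, sigma^2 = 38/3
reps = 10_000

for n in (25, 50, 100):
    sums = rng.choice(population, size=(reps, n), replace=True).sum(axis=1)
    print(f"n = {n:3d}: mean of sums = {sums.mean():7.1f} (expect {4 * n}), "
          f"sd of sums = {sums.std():5.1f} (expect {np.sqrt(n * 38 / 3):.1f})")
```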

  31. To compute probabilities for a normal distribution: • Recall that we are looking at intervals of values of the random variable, X. • The probability that X has a value in the interval between a and b is the area under the curve corresponding to that interval. • Note: since Pr(X = a), or the probability of any exact value, is zero, this can be written as Pr(a ≤ X ≤ b) or Pr(a < X < b).

  32. The symmetry of the normal distribution can also help in computing probabilities. • The normal distribution is symmetric about the mean µ. • This tells us that the probability of a value less than the mean is .5 or 50%, • and the probability of a value greater than the mean is also .5 or 50%.

  33. The Standard Normal Distribution: The standard normal distribution is just one of infinitely many possible normal distributions. It has mean μ = 0 and variance σ² = 1. By convention we let the letter Z represent a random variable that is distributed Normally with μ = 0 and σ² = 1: Z ~ N(0,1).

  34. The standard normal distribution is important for several reasons: • Probabilities of Z within any interval have been computed and tabulated. • It is possible to look up Pr(a ≤ Z ≤ b) for any values of a and b in such tables. • Any other normal distribution can be transformed to a standard normal for computing probabilities. • Distances from the mean are equivalent to the number of standard deviations from the mean. • This last is perhaps of greatest interest to us, now that software does much of the transformation and computation for us.

  35. Table 3 in the Appendix of Rosner gives areas under the normal curve, in 4 different ways: • Column A gives the area between −∞ and z, where z is a particular value of the standard normal distribution. (Note: Rosner uses X rather than Z.) • That is, Column A gives values for Pr(−∞ < Z ≤ z) = Pr(Z ≤ z). The value z is also known as a standard normal deviate.

  36. Table 3 in the Appendix of Rosner: • Column B gives the area between z and ∞: Pr(z ≤ Z < ∞) = Pr(z ≤ Z) = Pr(Z ≥ z). • Column C gives the area between 0 and z: Pr(0 ≤ Z ≤ z). • Column D gives the area between −z and z: Pr(−z ≤ Z ≤ z).

  37. A probability calculation for any random variable X ~ Normal(μ, σ²) can be re-expressed as an equivalent probability calculation for a standard Normal(0, 1). This is nice because: • we have tables for probabilities of the Normal(0, 1) distribution, • we can interpret probabilities in terms of the number of standard deviations from the mean. • Of course, we can also use computer programs to compute probabilities for any Normal distribution – the program does the translation for us.

  38. The Normal (0,1) or Standard Normal Table. Positive values of z are read from the first column (under x in Rosner). The shaded area, which is the probability of Z ≤ z, is shown under Column A of the table: Pr(Z < 0.31) = .6217

      z       A        B        C        D
      0.00    .5000    .5000    .0000    .0000
      0.01    .5040    .4960    .0040    .0080
      …
      0.30    .6179    .3821    .1179    .2358
      0.31    .6217    .3783    .1217    .2434

  A check that this makes sense: any positive value of z is above the mean, and should have a probability > .5.
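
The same four column values can be reproduced in software; a sketch using SciPy's standard normal cumulative distribution function:

```python
# Sketch: Columns A-D of the standard normal table for z = 0.31, via scipy.
from scipy.stats import norm

z = 0.31
col_a = norm.cdf(z)                  # Pr(Z <= z)        -> 0.6217
col_b = 1 - norm.cdf(z)              # Pr(Z >= z)        -> 0.3783
col_c = norm.cdf(z) - 0.5            # Pr(0 <= Z <= z)   -> 0.1217
col_d = norm.cdf(z) - norm.cdf(-z)   # Pr(-z <= Z <= z)  -> 0.2434
print(col_a, col_b, col_c, col_d)
```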

  39. Note that only positive values of z are tabulated. • We can take advantage of a few important features of the standard normal to compute probabilities for values of z less than zero: • Symmetry ⇒ Pr(Z ≤ −z) = Pr(Z ≥ z) • Zero is the median ⇒ Pr(Z ≤ 0) = Pr(Z ≥ 0) = .50 • Total area is 1 ⇒ Pr(Z ≤ z) + Pr(Z ≥ z) = 1

  40. For example, we cannot read Pr(Z < −0.31) directly from the tables. We can, however, use the property of symmetry: Pr(Z < −0.31) = Pr(Z > 0.31), and we can read Pr(Z > 0.31) = .3783 from Column B. So Pr(Z < −0.31) = .3783.

  41. (Figure only: the standard normal curve, symmetric about 0, with −z and z marked on the horizontal axis.)

  42. Example Word Problem: What is the probability of a value of Z more than 1 standard deviation below the mean? Solution: Since μ = 0 and σ = 1, 1 standard deviation below the mean is z = μ − (1 × σ) = 0 − 1 = −1. Pr(Z < −1) = 0.1587. The probability of observing a value more than 1 standard deviation below the mean is .1587, or just under 16%.

  43. Example: What is the probability Z is between −1.5 and 1.5? We can read this from Column D of the Table in Rosner: Pr[−1.50 ≤ Z ≤ 1.50] = 0.8664. Example: What is the probability of Z more than 1.5 standard deviations from the mean in either direction? Since probabilities sum to 1: Pr[Z ≤ −1.50 or Z ≥ 1.50] = 1 − 0.8664 = 0.1336. By symmetry, half of this, or 0.0668, lies at either end.

  44. Exercise: Find the area under the standard normal curve between Z = +1 and Z = +2. Solution: It helps to draw pictures! Pr(1 < Z < 2) = Pr(Z < 2) − Pr(Z < 1) = 0.9772 − 0.8413 = 0.1359.
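
The same area computed in software rather than from the table (a SciPy sketch):

```python
# Sketch: Pr(1 < Z < 2) as a difference of cumulative probabilities.
from scipy.stats import norm

area = norm.cdf(2) - norm.cdf(1)
print(round(area, 4))   # 0.1359
```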

  45. Notes on using Standard Normal Tables: • These come in a variety of formats. The examples given here are for the version seen in Rosner, Table 3 in the Appendix. • Look at the accompanying picture of the distribution to be clear what probability is listed in the body of the table. • Draw a sketch (paper and pencil) when computing probabilities – it always helps you keep track of what you are doing. • Minitab provides the same probabilities as Column A: Pr(X<x), when Cumulative Probability is selected

  46. Using Minitab: Calc → Probability Distributions → Normal. Select the option for Pr(Z < z) or Pr(X < x), and enter the value of z (or x).

  47. Finding Percentiles of the Normal Distribution. Example: What is the 75th percentile of N(0,1)? Solution: Again, it helps to draw a picture! We want the area under the curve up to the value to be 75% – that is, the value of z we want is the value below which 75% of values are found. Find z.75 so that Pr(Z < z.75) = .75.

  48. Use the Inverse Cumulative option in Minitab and input the desired percentile. Minitab output:
      Inverse Cumulative Distribution Function
      Normal with mean = 0 and standard deviation = 1.00000
      P( X <= x)        x
          0.7500   0.6745
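
The same percentile can be obtained with SciPy's inverse cumulative distribution function (the ppf method), as a sketch:

```python
# Sketch: the 75th percentile of the standard normal via the inverse CDF.
from scipy.stats import norm

z_75 = norm.ppf(0.75)   # value below which 75% of the standard normal lies
print(round(z_75, 4))   # 0.6745
```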

  49. Standardizing a Normal Random Variate: From N(μ, σ²) to N(0, 1). We can transform any Normal distribution to a standard normal by means of a simple transformation: Z = (X − μ)/σ ~ N(0, 1).

  50. Standardizing a Normal Random Variate: From N(μ, σ²) to N(0, 1). Adding a constant: For X ~ N(μ, σ²), what is the distribution of (X + b)? N(?, ?). The mean is shifted over by ‘b’ units, but the variance or spread of the data is unchanged by adding a constant: (X + b) ~ N(μ + b, σ²).
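
A short sketch of standardization in practice (the parameters μ = 50, σ = 10 and the value x = 65 are made up): a probability computed directly from N(μ, σ²) equals the one computed from the standardized value on the N(0, 1) scale.

```python
# Sketch: standardizing X ~ N(mu, sigma^2) and checking that probabilities agree.
from scipy.stats import norm

mu, sigma = 50, 10       # assumed example parameters
x = 65

z = (x - mu) / sigma     # the standardizing transformation Z = (X - mu) / sigma

p_direct = norm.cdf(x, loc=mu, scale=sigma)   # Pr(X <= 65) computed directly
p_standard = norm.cdf(z)                      # Pr(Z <= 1.5) from the standard normal
print(z, p_direct, p_standard)                # 1.5, ~0.9332, ~0.9332
```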
