More Chapter 2 The Normal Distribution
Normal distribution The normal distribution is a continuous probability distribution. Actually there is a whole family of normal distributions, with each one being unique once we know the mean and standard deviation for each. When I say there is a family of distributions I mean there are many normal distributions. As an analogy, think about circles. There are many circles – different sizes. But they all have certain properties – each has a radius, for example. Knowing the radius helps you know more about the circle. Knowing the mean and standard deviation of a normal distribution helps you know more about the particular normal distribution. On the next slide I show how to think about a normal distribution based on changing a circle. This is not exact or true, but useful to some.
The Normal Distributions - normal dist. and density • Let’s label parts of the normal distributions. [Figure: a normal curve drawn above a number line, with the labels below.] • Number line: this is the number line for the variable, like test score, for example. • Center: this is the center of the distribution. It really is the mean value we call mu. • Inflection point: this point is where the bottom part of the circle flipped. Let’s call it the inflection point. There is one on the other side as well. • Point below the inflection point: this point on the number line is directly below the inflection point. It turns out that the point on the number line is one standard deviation away from the center. s is the notation for the standard deviation.
The Normal Distributions - notation • In general, now, we will talk about a variable having a normal distribution. We will say variable X is normally distributed with mean mu and standard deviation s. • More simply, we say X is N(mu,s). • Don’t let the N(---) part fool you, it means N(mean value listed first, then standard deviation value listed).
The Normal Distributions - example with graphical thinking • Say we have a variable X is N(3, 1). [Figure: X is measured on a number line marked 2, 3, 4; 3 is the mean, and there is a dot on the curve above 2 and another above 4.] Why is this dot, and the one across, above #’s 2 and 4? Use the dots as your guide to draw the normal dist.
The Normal Distributions - another example with graphical thinking • Say we have a variable X is N(3, 2). [Figure: X is measured on a number line marked 1, 2, 3, 4, 5; 3 is the mean, and there is a dot on the curve above 1 and another above 5.] Why is this dot, and the one across, above #’s 1 and 5? Use the dots as your guide to draw the normal dist.
The Normal Distributions - compare the two examples • Here is what the two examples look like, one on top of the other. [Figure: the curves for X is N(3, 1) and X is N(3, 2) drawn over the same number line, marked 1 through 5.]
The Normal Distributions - compare the two examples • Note on the previous screen how the X is N(3, 2) curve has its inflection points farther apart than the X is N(3, 1) curve. • Normal distributions have about 68% of the area under the curve between the two inflection points. There is more.
The Normal Distributions - 68-95-99.7 rule • On any normal distribution the inflection points will be 1 standard deviation on either side of the mean. About 68% of the area under the curve will be within this one standard dev. • By moving out 2 standard deviations on either side of the mean you have about 95% of the area under the curve (1.96 standard deviations gives almost exactly 95%). • By moving out 3 stand. dev.’s you have about 99.7% of the area under the curve.
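The percentages in the rule can be checked numerically. Below is a minimal sketch, assuming Python and its standard library; the helper name area_within is my own and does not come from the book. It relies on the fact that the area within k standard deviations of the mean equals erf(k / sqrt(2)) for every normal distribution.

```python
import math

def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Compare 1, 1.96, 2 and 3 standard deviations.
for k in (1, 1.96, 2, 3):
    print(f"within {k} std devs: {area_within(k):.4f}")
```

The rounded results line up with the 68, 95 and 99.7 figures, and they show why 2 and 1.96 are both quoted for the 95% case.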
The Normal Distributions - 68-95-99.7 rule • What is the meaning of the phrase, “1 standard deviation on either side of the mean?” • The answer is best seen by an example. X is N(10, 2.5) means X is normal with mean 10 and standard deviation 2.5. Thus 10 - 2.5 = 7.5 is 1 stand. dev. on the low side of the mean and 10 + 2.5 = 12.5 is 1 stand. dev. on the high side of the mean.
The Normal Distributions - 68-95-99.7 rule • Now let’s ask a question about the variable X is N(3, 1). What % of the values are between 2 and 4?............. • The area under the normal curve is a relative frequency!!!! • The 68-95-99.7 rule is a relative frequency or probability rule for the normal distribution.
The Normal Distributions - 68-95-99.7 rule • If X is N(5, 2), what % of values is between 3 and 7?........................ • If X is N(7, 3), what % of values is between 4 and 10?........................
The Normal Distributions - magic • I picked points 3 and 7 for X is N(5, 2). Note (3 - 5)/2 = -1 and (7 - 5)/2 = 1. • I picked points 4 and 10 for X is N(7, 3). Note (4 - 7)/3 = -1 and (10 - 7)/3 = 1. • #’s 3 and 4 are 1 stand. dev. on the low side of their respective means, while 7 and 10 are 1 stand. dev. on the high side.
The Normal Distributions - magic [Figure: number line where the variable is measured, marked 3, 5, 7, 9.] If X is N(5, 2), then 3 is 1 standard deviation below the mean of 5 and 7 is 1 standard deviation above the mean. If X is N(7, 3), then 4 is 1 standard deviation below the mean of 7 and 10 is 1 standard deviation above the mean.
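To see the "magic" in numbers, here is a small sketch, assuming Python 3.8 or later for statistics.NormalDist. The function names z_score and area_between are illustrative choices of mine. It computes how many standard deviations each endpoint is from its mean and then the area between the endpoints for both examples.

```python
from statistics import NormalDist

def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev

def area_between(low, high, mean, std_dev):
    """Area under the N(mean, std_dev) curve between low and high."""
    dist = NormalDist(mean, std_dev)
    return dist.cdf(high) - dist.cdf(low)

# (low, high, mean, std dev) for X is N(5, 2) and X is N(7, 3)
examples = [(3, 7, 5, 2), (4, 10, 7, 3)]
for low, high, mean, std_dev in examples:
    print(
        f"N({mean}, {std_dev}): z = {z_score(low, mean, std_dev)} "
        f"to {z_score(high, mean, std_dev)}, "
        f"area = {area_between(low, high, mean, std_dev):.4f}"
    )
```

Both examples run from z = -1 to z = 1, so the two areas agree even though the means and standard deviations differ.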
The Normal Distributions - Z score, The standard normal distribution • As we have seen, there are many normal distributions. Each distribution can be characterized by the mean and the standard deviation - X is N(mean, std dev). • For any normal distribution we can transform the values of X into values that we call Z. A Z value tells us how many standard deviations the original value is from the mean. • The Z formula is Z = (X – mean)/std dev.
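The Normal Distributions - the standard normal [Figure: the standard normal curve over a Z scores axis, with 0 at the center and 1 marked.] Any normal distribution can be transformed into the standard normal distribution, which has mean 0 and standard deviation 1, by the z transformation. In fact, that is how we make probability statements about normal random variables.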
The Normal Distributions - standard normal table • Look at a standard normal table, like in our book. • Note the letter z in the upper left of the table. Z will be in 3 digits, x.xx. A given Z is broken down into x.x + 0.0x. • x.x is down the column and 0.0x is across the top row. • The value in the table down x.x and over 0.0x is the probability of having values less than the z in question. In other words, it is the area under the curve to the left of Z.
Working with the table [Figure: standard normal curve with 0 and 1.79 marked on the Z axis.] In the table we see that if you have a Z of 1.79 then the area to the left of 1.79 and under the curve is .96327. This would mean there is a .96327 probability of getting a value less than 1.79. This also means there is a 1 - .96327 = .03673 probability of getting a value greater than 1.79.
Working with the table [Figure: standard normal curve with -1.79 and 0 marked on the Z axis.] What if we want the area to the left of -1.79? This is not in the table. But the distribution is symmetric. This means the area to the left of -1.79 is equal to the area to the right of 1.79. We just saw that as .03673.
Working with the table [Figure: standard normal curve with 0, .60 and 1.79 marked on the Z axis.] What if we want the area between .60 and 1.79? Take the area to the left of 1.79 and subtract the area to the left of .60!
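The three table moves above (area to the left, area to the right, area between) can be sketched in a few lines. This assumes Python 3.8 or later, with statistics.NormalDist standing in for the printed table; the variable names are mine.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area to the left of 1.79 (what the table lists directly).
left_of_179 = Z.cdf(1.79)

# Area to the right of 1.79: 1 minus the area to the left.
right_of_179 = 1 - left_of_179

# Area to the left of -1.79: by symmetry it matches the area right of 1.79.
left_of_neg_179 = Z.cdf(-1.79)

# Area between .60 and 1.79: left of 1.79 minus left of .60.
between = Z.cdf(1.79) - Z.cdf(0.60)

print(f"left of 1.79:     {left_of_179:.5f}")
print(f"right of 1.79:    {right_of_179:.5f}")
print(f"left of -1.79:    {left_of_neg_179:.5f}")
print(f"between .60-1.79: {between:.5f}")
```

Unlike the printed table, the cdf call accepts negative z values directly, so the symmetry step is only there to mirror the hand calculation.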
Let’s consider an example to highlight some points. Say a company has developed a new tire for cars. In testing the tire it has been determined that the mean tire mileage is 36,500 miles and the standard deviation is 5000 miles. Along the horizontal axis we measure tire mileage. The normal distribution rises above the axis. Note the highest point of the curve occurs above the mean - in our tire example we would be at 36,500. On the curve we have two inflection points, and these occur 1 standard deviation away from the mean. So, mileages 31,500 and 41,500 are 1 standard deviation from the mean and the inflection points occur above them.
[Figure: normal curve over two axes. The miles axis is marked 26,500, 31,500, 36,500, 41,500 and 46,500; the z axis below it is marked -2, -1, 0, 1 and 2 at the same positions.] Remember the concept of a z score from earlier. z = (a value minus the mean)/standard deviation. So the value 26,500 has a z = (26,500 - 36,500)/5000 = -2. This means 26,500 is 2 standard deviations below the mean. You can check the other values.
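Checking the other values by hand is quick, but a short sketch makes the pattern visible. It assumes Python; the mean and standard deviation come from the tire example, while the names z_score and miles_from_z are my own.

```python
MEAN_MILES = 36_500     # mean tire mileage from the example
STD_DEV_MILES = 5_000   # standard deviation from the example

def z_score(miles):
    """z = (a value minus the mean) / standard deviation."""
    return (miles - MEAN_MILES) / STD_DEV_MILES

def miles_from_z(z):
    """Reverse direction: turn a z value back into a mileage."""
    return MEAN_MILES + z * STD_DEV_MILES

mileages = [26_500, 31_500, 36_500, 41_500, 46_500]
z_scores = [z_score(m) for m in mileages]

for miles, z in zip(mileages, z_scores):
    print(f"{miles:>6,} miles -> z = {z:+.0f}")
```

The reverse helper is only there to show that the miles axis and the z axis carry the same information on different scales.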
The standard normal distribution Remember how we said there are many different circles and many different normal distributions? Sure you do. The z value translates any normally distributed variable into what is called the standard normal variable. Technically the picture I have on the previous screen is misleading because the z’s are a different scale than the miles, but don’t worry. A page of the book has a table with z values and areas under the curve. Let’s see how to use the table. Here is one place where I want you to be extra careful when you calculate z. Round z to 2 decimal places. The z value is broken up into two parts. For example, the number 2.13 is broken up into 2.1 and .03.
Using the standard normal table The z = 2.13 means we should go down the table to 2.1 and then over to .03. The number in the table is .9834. This means the probability of getting a value less than z = 2.13 is 98.34%. In the tire example if we look at the mean value 36,500, we see the z = (36,500 - 36,500)/5000 = 0.00 and in the table we see the value .5000. Thus, there is a 50% chance the tire mileage will be less than 36,500. So the table has the area under the curve to the left of the value of interest. Plus, the table only gives z values above the mean. We may want other z’s and other areas. What do we do?
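The table lookup just described (round z to two decimals, go down to the x.x row, then over to the 0.0x column) can be mimicked in code. This is only a sketch, assuming Python 3.8 or later and z values of 0 or more, like the table itself. The names split_z and table_area are hypothetical, and statistics.NormalDist stands in for the printed table.

```python
from statistics import NormalDist

def split_z(z):
    """Split a non-negative z into its table row (x.x) and column (0.0x)."""
    hundredths = round(z * 100)        # e.g. 2.13 -> 213
    row = (hundredths // 10) / 10      # 213 -> 2.1
    column = (hundredths % 10) / 100   # 213 -> 0.03
    return row, column

def table_area(z):
    """Area to the left of z, rounded to 4 places like a table entry."""
    return round(NormalDist().cdf(round(z, 2)), 4)

row, column = split_z(2.13)
print(f"row {row}, column {column}, area {table_area(2.13)}")

# Tire example: 36,500 miles is the mean, so z = 0.00.
z_mean = (36_500 - 36_500) / 5_000
print(f"z = {z_mean:.2f}, area {table_area(z_mean)}")
```

Working in whole hundredths before splitting avoids small floating-point surprises that can show up when subtracting decimals directly.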
Say we want the area to the right of a z that is greater than 0. The table has the area to the left. Whatever the z is, go into the table, get the area, and then take 1 minus the area in the table. [Figure: normal curve with a negative z marked; area a lies to the left of the z and area b lies to the right.] Now the z is negative. Say we want area b. Since the normal curve is symmetric, the area to the right of a negative z equals the area to the left of the positive z that is the absolute value of the negative z. Area a would be found in a similar way to what is above: it equals the area to the right of that positive z, so take 1 minus the table value.
Back in the old days when I had to walk to school uphill both ways in three feet of snow, the standard normal table was all we had to calculate probabilities for a normal distribution. Now we have Microsoft Excel to make the calculations. The NORMSDIST function assumes we have a z value and we want to find the area to the left of the z - the area to the left is the cumulative probability. The function has the form =NORMSDIST(z), where z is the value we have. z can be negative in Excel. This is better than our table. The NORMDIST function allows us to just work with the variable without getting the z, and we can still have the cumulative probability. The function has the form =NORMDIST(value, mean, standard deviation, TRUE). This is an innovation of Excel over the old days.
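For readers without Excel at hand, the same two calculations can be sketched in Python. This assumes Python 3.8 or later. The function names norm_s_dist and norm_dist are my own rough analogues of =NORMSDIST(z) and =NORMDIST(value, mean, standard deviation, TRUE), not Excel's actual implementation.

```python
from statistics import NormalDist

def norm_s_dist(z):
    """Area to the left of z under the standard normal curve."""
    return NormalDist().cdf(z)

def norm_dist(value, mean, std_dev):
    """Area to the left of value under the N(mean, std_dev) curve."""
    return NormalDist(mean, std_dev).cdf(value)

# A negative z is fine here, just as in Excel.
print(f"{norm_s_dist(-1.79):.5f}")

# Tire example: probability the mileage is below 41,500 (z = 1).
print(f"{norm_dist(41_500, 36_500, 5_000):.4f}")
print(f"{norm_s_dist(1.0):.4f}")  # same area, found through the z value
```

The last two lines print the same area, which is the point of the z transformation: working with the variable directly or with its z value gives the same cumulative probability.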
Sometimes we may have an area and want to know the z. The function NORMSINV asks us to give an area to the left of a value and the function will give us the z value. The form of the function is =NORMSINV(cumulative probability). The function NORMINV does the same, except not in z value form. It just gives the value in the same form as the variable. The form of the function is =NORMINV(cumulative prob, mean, standard deviation).
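The inverse lookups can be sketched the same way, again assuming Python 3.8 or later. The names norm_s_inv and norm_inv are hypothetical stand-ins for =NORMSINV and =NORMINV, built on the inv_cdf method of statistics.NormalDist.

```python
from statistics import NormalDist

def norm_s_inv(cumulative_prob):
    """z value that has the given area to its left."""
    return NormalDist().inv_cdf(cumulative_prob)

def norm_inv(cumulative_prob, mean, std_dev):
    """Value of an N(mean, std_dev) variable with the given area to its left."""
    return NormalDist(mean, std_dev).inv_cdf(cumulative_prob)

# Going backwards from the table entry used earlier for z = 1.79.
z = norm_s_inv(0.96327)
print(f"z = {z:.2f}")

# Same area in the tire example: the mileage that 96.327% of tires fall below.
miles = norm_inv(0.96327, 36_500, 5_000)
print(f"mileage = {miles:,.0f}")
```

The cumulative probability has to be strictly between 0 and 1; inv_cdf raises a StatisticsError otherwise.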