Topics Covered • Discrete probability distributions • The Uniform Distribution • The Binomial Distribution • The Poisson Distribution • Each is appropriately applied in certain situations and to particular phenomena
Continuous Random Variables • A continuous random variable can assume any real value within an interval (e.g., rainfall, pH) • Some random variables that are technically discrete exhibit such a tremendous range of values that it is desirable to treat them as if they were continuous variables, e.g. population • Continuous random variables are described by probability density functions
Probability Density Functions • Probability density functions are defined using the same rules required of probability mass functions, with some additional requirements: • The function must have a non-negative value throughout the interval a to b, i.e. f(x) ≥ 0 for a ≤ x ≤ b • The area under the curve defined by f(x), within the interval a to b, must equal 1 [Figure: curve f(x) plotted against x over the interval a to b, with area = 1]
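The two requirements can be checked numerically for a candidate function. The sketch below is only an illustration: the helper name `is_valid_pdf`, the midpoint-rule integration, and the two example functions are assumptions made here, not part of the slides.

```python
def is_valid_pdf(f, a, b, steps=100_000, tol=1e-6):
    """Numerically check the two density requirements on the interval [a, b]."""
    width = (b - a) / steps
    # Midpoint rule: sample f at the centre of each small sub-interval.
    heights = [f(a + (i + 0.5) * width) for i in range(steps)]
    non_negative = all(h >= 0 for h in heights)   # f(x) >= 0 on [a, b]
    area = sum(heights) * width                   # approximate area under f
    return non_negative and abs(area - 1.0) < tol


# Hypothetical density: f(x) = 2x on [0, 1] is a triangle with area 1.
triangle = lambda x: 2 * x
# Not a density: f(x) = x on [0, 1] only has area 0.5.
half_area = lambda x: x

print(is_valid_pdf(triangle, 0.0, 1.0))
print(is_valid_pdf(half_area, 0.0, 1.0))
```

A numerical check like this can only sample the function, so it is a sanity check rather than a proof that a function is a valid density.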
The Normal Distribution • The most common probability distribution is the normal distribution • The normal distribution is a continuous distribution that is symmetric and bell-shaped Source: http://mathworld.wolfram.com/NormalDistribution.html
The Normal Distribution • Many naturally occurring variables are approximately normally distributed (e.g. heights, weights, annual temperature variations, test scores, IQ scores, etc.) • A normal distribution can also be produced by tracking the errors made in repeated measurements of the same thing; Carl Friedrich Gauss was a 19th century astronomer who found that the errors in repeated determinations of the position of the same star formed a normal (or Gaussian) distribution
The Normal Distribution • The probability density function of the normal distribution: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ • You can see how the value of the distribution at x is a function of the mean (μ) and standard deviation (σ)
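To see how the mean and standard deviation shape the curve, the standard normal density formula, f(x) = exp(-(x - μ)² / (2σ²)) / (σ√(2π)), can be evaluated directly. This is a minimal sketch; the function name `normal_pdf` and the parameter values are chosen here for illustration only.

```python
import math


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Illustrative parameters (assumed, not taken from the slides).
peak = normal_pdf(0.0, 0.0, 1.0)       # highest point of the curve, at the mean
left = normal_pdf(-1.0, 0.0, 1.0)      # one standard deviation below the mean
right = normal_pdf(1.0, 0.0, 1.0)      # same height as `left`, by symmetry
wide_peak = normal_pdf(0.0, 0.0, 2.0)  # doubling sigma halves the peak height

print(peak, left, right, wide_peak)
```

Changing the mean slides the curve left or right without changing its shape, while changing the standard deviation stretches or flattens it.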
The probability density function of a normal distribution approximating the probability mass function of a binomial distribution Source: http://en.wikipedia.org/wiki/Normal_Distribution
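The approximation described in this caption can be checked with a few lines of code. The sketch below assumes a binomial distribution with n = 20 and p = 0.5 (values picked here, not from the figure) and compares it with a normal curve that has the same mean np and standard deviation √(np(1 - p)).

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical binomial parameters, chosen only for illustration.
n, p = 20, 0.5
mu = n * p                           # binomial mean
sigma = math.sqrt(n * p * (1 - p))   # binomial standard deviation

# Largest difference between the two curves over all possible counts.
largest_gap = max(
    abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma)) for k in range(n + 1)
)
print(largest_gap)
```

The gap tends to shrink as n grows, and the match is generally worse when p is far from 0.5.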
The Normal Distribution • As with all frequency distributions, the area under the curve between any two x values corresponds to the probability of obtaining an x value in that range • The total area under the curve is equal to one Source: http://en.wikipedia.org/wiki/Normal_Distribution
The Normal Distribution • The way we use the normal distribution is to find the probability of a continuous random variable falling within a range of values ([c, d]) • The probability of the variable falling between c and d is the area under the curve between those two values • The areas under normal curves are given in tables such as Table A.2 in Appendix A of the textbook (Rogerson) [Figure: normal curve f(x) against x, with the interval from c to d marked]
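In software, the area between c and d is usually obtained from the cumulative distribution function instead of a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `prob_between` and the example values are assumptions made for illustration.

```python
from statistics import NormalDist


def prob_between(c, d, mu, sigma):
    """P(c <= X <= d) for a normal variable X, as a difference of two CDF values."""
    dist = NormalDist(mu, sigma)
    # cdf(v) is the area under the curve to the left of v.
    return dist.cdf(d) - dist.cdf(c)


# Illustrative values: mean 55, standard deviation 20, interval [35, 75].
p = prob_between(35.0, 75.0, 55.0, 20.0)
print(round(p, 4))
```

Because the interval here runs from one standard deviation below the mean to one above it, the result should be close to 0.68.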
Normal Tables • Variables with normal distributions may have an infinite number of possible means and standard deviations Source: http://en.wikipedia.org/wiki/Normal_distribution
Normal Tables • Variables with normal distributions may have an infinite number of possible means and standard deviations • Normal tables are standardized (a standard normal distribution with a mean of zero and a standard deviation of one) • Before using a normal table, we must transform our data to a standardized normal distribution
Standardization of Normal Distributions • The standardization is achieved by converting the data into z-scores: $z = \frac{x - \mu}{\sigma}$ • The z-score is the means used to transform our normal distribution into a standard normal distribution (μ = 0 and σ = 1)
Standardization • Original data: mean = 59.70, standard deviation = 12.97 • Z-scores: mean = 0, standard deviation = 1
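The effect of standardizing a whole data set can be reproduced with a short sketch. The data values below are hypothetical (the data set behind this slide is not given), and the population standard deviation is assumed; the slide does not say which form was used.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert every value to a z-score using the data's own mean and spread."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation (an assumption here)
    return [(v - m) / s for v in values]


# Hypothetical measurements, not the data set behind the slide.
data = [42.0, 55.5, 61.0, 48.5, 73.0, 66.5, 58.0, 80.0]
z_scores = standardize(data)

print(round(abs(mean(z_scores)), 6), round(pstdev(z_scores), 6))
```

Whatever the original mean and standard deviation were, the z-scores end up with mean 0 and standard deviation 1, up to floating-point rounding.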
Standardization of Normal Distributions
• Example I – A data set with μ = 55 and σ = 20: $z = \frac{x - \mu}{\sigma} = \frac{x - 55}{20}$
• If one of our data values x = 90 then: $z = \frac{x - \mu}{\sigma} = \frac{90 - 55}{20} = \frac{35}{20} = 1.75$
• Using z-scores in conjunction with standard normal tables (like Table A.2 on page 214 of Rogerson) we can look up areas under the curve associated with intervals, and thus probabilities (P(Z > 1.75) or P(Z < 1.75))
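The z-score calculation is a single line of arithmetic. A minimal sketch, with the function name `z_score` chosen here for illustration, using the mean of 55 and standard deviation of 20 from the example:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


z_high = z_score(90, 55, 20)  # a value above the mean
z_low = z_score(20, 55, 20)   # a value the same distance below the mean

print(z_high, z_low)  # 1.75 -1.75
```

Values equally far above and below the mean give z-scores of equal size and opposite sign.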
Look Up Standard Normal Tables • Using our example z-score of 1.75, we find the position of 1.75 in the table and use the value found there
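Look Up Standard Normal Tables
• Original scale (μ = 55): P(X > 90) = 0.0401 and P(X ≤ 90) = 0.9599
• Standardized scale (μ = 0): P(Z > 1.75) = 0.0401 and P(Z ≤ 1.75) = 0.9599
[Figure: two normal curves f(x), one centred on μ = 55 with the tail above 90 marked, one centred on μ = 0 with the tail above +1.75 marked; each tail has area 0.0401]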
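The table lookup can be cross-checked in code. The sketch below computes the upper-tail area of the standard normal distribution with the complementary error function `math.erfc`; the function name `upper_tail` is an assumption made here.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable, i.e. the area in the tail above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p_above = upper_tail(1.75)  # area above z = 1.75
p_below = 1.0 - p_above     # the total area is 1, so the rest lies below

print(round(p_above, 4), round(p_below, 4))  # 0.0401 0.9599
```

Rounded to four decimal places, these agree with the tabulated values for z = 1.75.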
Standardization of Normal Distributions
• Example II – If we have a data set with μ = 55 and σ = 20, we calculate z-scores using: $z = \frac{x - \mu}{\sigma} = \frac{x - 55}{20}$
• If one of our data values x = 20 then: $z = \frac{x - \mu}{\sigma} = \frac{20 - 55}{20} = \frac{-35}{20} = -1.75$
• Using z-scores in conjunction with standard normal tables (like Table A.2 on page 214 of Rogerson) we can look up areas under the curve associated with intervals, and thus probabilities (P(X > 20) or P(X ≤ 20))
[Figure: standard normal curve f(x) centred on μ = 0, with -1.75 and +1.75 marked on the horizontal axis]
Look Up Standard Normal Tables • Using our example z-score of -1.75, we find the position of 1.75 in the table and use the value found there; because the normal distribution is symmetric the table does not need to repeat positive and negative values
Look Up Standard Normal Tables [Figure: standard normal curve f(x) centred on μ = 0; the tail below -1.75 and the tail above +1.75 each contain 4.01% (0.0401) of the area]
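Look Up Standard Normal Tables
• Standardized scale (μ = 0): P(Z ≤ -1.75) = 0.0401 (4.01%) and P(Z > -1.75) = 0.9599
• Original scale (μ = 55): P(X ≤ 20) = 0.0401 (4.01%) and P(X > 20) = 0.9599
[Figure: two normal curves f(x), one centred on μ = 0 with the tail below -1.75 marked, one centred on μ = 55 with the tail below 20 marked]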
Finding the P(x) for Various Intervals
1. P(Z ≥ a) = (table value) • The table gives the value of P(x) in the tail above a
2. P(Z ≤ a) = [1 - (table value)] • The total area under the curve = 1, and we subtract the area of the tail
3. P(0 ≤ Z ≤ a) = [0.5 - (table value)] • The total area under the curve = 1, thus the area above the mean (Z = 0) is equal to 0.5, and we subtract the area of the tail
Finding the P(x) for Various Intervals
For negative a (look up the positive value of a in the table):
4. P(Z ≤ a) = (table value) • The table gives the value of P(x) in the tail below a, equivalent to P(Z ≥ a) when a is positive
5. P(Z ≥ a) = [1 - (table value)] • This is equivalent to P(Z ≤ a) when a is positive
6. P(a ≤ Z ≤ 0) = [0.5 - (table value)] • This is equivalent to P(0 ≤ Z ≤ a) when a is positive
Finding the P(x) for Various Intervals
7. P(a ≤ Z ≤ b) if a < 0 and b > 0
= (0.5 - P(Z < a)) + (0.5 - P(Z > b)) = 1 - P(Z < a) - P(Z > b)
or
= [0.5 - (table value for a)] + [0.5 - (table value for b)] = [1 - {(table value for a) + (table value for b)}]
• With this set of building blocks, you should be able to calculate the probability for any interval using a standard normal table
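The seven building blocks can be written as small functions around a single "table value". In the sketch below, `table_value` stands in for the printed table by computing the tail area beyond |a|; all function names are assumptions made for illustration, and the case numbers in the comments follow the numbering on these slides.

```python
import math


def table_value(a):
    """Tail area beyond |a|, standing in for a one-sided standard normal table."""
    return 0.5 * math.erfc(abs(a) / math.sqrt(2.0))


def p_at_least(a):
    """P(Z >= a): case 1 when a >= 0, case 5 when a < 0."""
    return table_value(a) if a >= 0 else 1.0 - table_value(a)


def p_at_most(a):
    """P(Z <= a): case 2 when a >= 0, case 4 when a < 0."""
    return 1.0 - table_value(a) if a >= 0 else table_value(a)


def p_zero_to(a):
    """P(0 <= Z <= a) or P(a <= Z <= 0): cases 3 and 6."""
    return 0.5 - table_value(a)


def p_straddling(a, b):
    """P(a <= Z <= b) with a < 0 < b: case 7."""
    return 1.0 - (table_value(a) + table_value(b))


print(round(p_at_least(1.75), 4), round(p_at_most(-1.75), 4))
print(round(p_straddling(-0.5, 1.1), 4))
```

Because the table only needs non-negative entries, every case reduces to looking up |a| and then deciding whether to use the value directly, subtract it from 1, or subtract it from 0.5.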
Finding the P(x) – Example • Suppose we are in charge of buying stock for a shoe store. We will assume that shoe sizes are normally distributed within a gender. Let’s say the mean of women’s shoe sizes is 20 cm and the standard deviation is 5 cm • If our store sells 300 pairs of a popular style each week, we can make some projections about how many pairs we need to stock of a given size, assuming shoes fit feet that are within ±0.5 cm of the length of the shoe (because of course we must use intervals here!)
Finding the P(x) – Example
• 1. How many pairs should we stock of the 25 cm size?
• We first need to convert this to a range according to the assumption specified above, since we can only evaluate P(x) over an interval: P(24.5 cm ≤ x ≤ 25.5 cm) is what we need to find
• Now we need to convert the bounds of our interval into z-scores: $Z_{lower} = \frac{x - \mu}{\sigma} = \frac{24.5 - 20}{5} = \frac{4.5}{5} = 0.9$ and $Z_{upper} = \frac{x - \mu}{\sigma} = \frac{25.5 - 20}{5} = \frac{5.5}{5} = 1.1$
Finding the P(x) – Example
• 1. How many pairs should we stock of the 25 cm size?
• We now have our interval expressed in terms of z-scores as P(0.9 ≤ Z ≤ 1.1) for use in a standard normal distribution, which we can evaluate using our standard normal tables
• We can calculate this area by finding P(0.9 ≤ Z ≤ 1.1) = P(Z ≥ 0.9) - P(Z ≥ 1.1) = 0.1841 - 0.1357 = 0.0484
• Now multiply our P(0.9 ≤ Z ≤ 1.1) by total sales per week to get the number of shoes we should stock in that size: 300 × 0.0484 = 14.52 ≈ 15
[Figure: standard normal curve with the interval from 0.9 to 1.1 marked]
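The same projection can be sketched in code. The figures (mean 20 cm, standard deviation 5 cm, 300 pairs per week, ±0.5 cm fit) come from the example; the names `WEEKLY_SALES`, `SIZES`, `FIT_TOLERANCE` and `pairs_for_size` are assumptions introduced for this illustration, and `statistics.NormalDist` replaces the printed table.

```python
import math
from statistics import NormalDist

WEEKLY_SALES = 300                        # pairs of this style sold per week
SIZES = NormalDist(mu=20.0, sigma=5.0)    # assumed distribution of foot lengths (cm)
FIT_TOLERANCE = 0.5                       # a shoe fits feet within +/- 0.5 cm


def pairs_for_size(size_cm):
    """Expected weekly demand for one shoe size, before rounding."""
    lower = SIZES.cdf(size_cm - FIT_TOLERANCE)
    upper = SIZES.cdf(size_cm + FIT_TOLERANCE)
    return WEEKLY_SALES * (upper - lower)


expected_25 = pairs_for_size(25.0)
# Round up so the projected demand is covered by whole pairs.
print(round(expected_25, 2), math.ceil(expected_25))
```

The unrounded result can differ slightly from a hand calculation because the table values are themselves rounded to four decimal places.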
Finding the P(x) – Example
• 2. How many pairs should we stock of the 18 cm size? (assuming shoes fit feet that are within ±0.5 cm of the length of the shoe)
• Again, we convert this to a range according to the assumption specified above, since we can only evaluate P(x) over an interval: P(17.5 cm ≤ x ≤ 18.5 cm) is what we need to find
• Now we need to convert the bounds of our interval into z-scores: $Z_{lower} = \frac{x - \mu}{\sigma} = \frac{17.5 - 20}{5} = \frac{-2.5}{5} = -0.5$ and $Z_{upper} = \frac{x - \mu}{\sigma} = \frac{18.5 - 20}{5} = \frac{-1.5}{5} = -0.3$
Finding the P(x) – Example
• 2. How many pairs should we stock of the 18 cm size?
• We now have our interval expressed in terms of z-scores as P(-0.5 ≤ Z ≤ -0.3) for use in a standard normal distribution, which we can evaluate using our standard normal tables
• We can calculate this area by finding P(-0.5 ≤ Z ≤ -0.3) = P(Z ≤ -0.3) - P(Z ≤ -0.5) = 0.3821 - 0.3085 = 0.0736
• Now multiply our P(-0.5 ≤ Z ≤ -0.3) by total sales per week to get the number of shoes we should stock in that size: 300 × 0.0736 = 22.08 ≈ 22
[Figure: standard normal curve with the interval from -0.5 to -0.3 marked]
Finding the P(x) – Example
• 3. How many pairs should we stock in the 18 to 25 cm size range?
• As always, we convert this to a range according to the assumption specified above, since we can only evaluate P(x) over an interval: P(17.5 cm ≤ x ≤ 25.5 cm) is what we need to find
• We have already found the appropriate z-scores: $Z_{lower} = \frac{x - \mu}{\sigma} = \frac{17.5 - 20}{5} = \frac{-2.5}{5} = -0.5$ and $Z_{upper} = \frac{x - \mu}{\sigma} = \frac{25.5 - 20}{5} = \frac{5.5}{5} = 1.1$
Finding the P(x) – Example
• 3. How many pairs should we stock in the 18 to 25 cm size range?
• We now have our interval expressed in terms of z-scores as P(-0.5 ≤ Z ≤ 1.1) for use in a standard normal distribution, which we can evaluate using our standard normal tables
• We can calculate this area by finding P(-0.5 ≤ Z ≤ 1.1) = 1 - [P(Z ≤ -0.5) + P(Z ≥ 1.1)] = 1 - [0.3085 + 0.1357] = 1 - 0.4442 = 0.5558
• Now multiply our P(-0.5 ≤ Z ≤ 1.1) by total sales per week to get the number of shoes we should stock in that range: 300 × 0.5558 = 166.74 ≈ 167
[Figure: standard normal curve with the interval from -0.5 to 1.1 marked]
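One way to sanity-check the range result is to confirm that it equals the sum of the individual size intervals it contains. The sketch below assumes that sizes come in whole centimetres from 18 to 25 (an assumption made here, not stated in the example) and uses `statistics.NormalDist` in place of the table; the variable and function names are illustrative.

```python
from statistics import NormalDist

sizes = NormalDist(mu=20.0, sigma=5.0)  # foot lengths in cm, as in the example
weekly_sales = 300


def interval_prob(low_cm, high_cm):
    """P(low_cm <= X <= high_cm) as a difference of two CDF values."""
    return sizes.cdf(high_cm) - sizes.cdf(low_cm)


# Probability for the whole 18 to 25 cm range, widened by the 0.5 cm fit tolerance.
range_prob = interval_prob(17.5, 25.5)

# Assumed whole-centimetre sizes; each covers size - 0.5 to size + 0.5.
per_size = {size: interval_prob(size - 0.5, size + 0.5) for size in range(18, 26)}
summed = sum(per_size.values())

range_pairs = weekly_sales * range_prob
print(round(range_prob, 4), round(summed, 4), round(range_pairs, 2))
```

The per-size intervals tile the range without gaps or overlaps, so their probabilities add up to the range probability.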
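Commonly Used Probabilities [Figure: normal curve f(x) with the horizontal axis marked at -3σ, -2σ, -1σ, μ, +1σ, +2σ and +3σ; the central regions contain 68%, 95% and 99.7% of the area]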
Commonly Used Probabilities
• For a normal distribution, about 68% of the observations lie within one standard deviation of the mean
• Standard normal distribution: P(Z > 1) = 0.1587, so P(-1 ≤ Z ≤ 1) = 1 - 2 × 0.1587 = 0.6826
• Normal distribution as illustrated in the figure: P(μ - σ ≤ x ≤ μ + σ) = 0.6826 ≈ 0.68
[Figure: normal curve f(x) with the interval from μ - σ to μ + σ marked]
Commonly Used Probabilities
• For a normal distribution, about 95% of the observations lie within two standard deviations of the mean
• Standard normal distribution: P(Z > 2) = 0.0228, so P(-2 ≤ Z ≤ 2) = 1 - 2 × 0.0228 = 0.9544
• Normal distribution as illustrated in the figure: P(μ - 2σ ≤ x ≤ μ + 2σ) = 0.9544 ≈ 0.95
[Figure: normal curve f(x) with the interval from μ - 2σ to μ + 2σ marked]
Commonly Used Probabilities
• For a normal distribution, about 99.7% of the observations lie within three standard deviations of the mean
• Standard normal distribution: P(Z > 3) = 0.0013, so P(-3 ≤ Z ≤ 3) = 1 - 2 × 0.0013 = 0.9974
• Normal distribution as illustrated in the figure: P(μ - 3σ ≤ x ≤ μ + 3σ) = 0.9974 ≈ 0.997
[Figure: normal curve f(x) with the interval from μ - 3σ to μ + 3σ marked]
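The three commonly used probabilities can be recomputed without a table. The sketch below uses `statistics.NormalDist` with its default mean of 0 and standard deviation of 1; the function name `within_k_sigma` is an assumption made here.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # defaults to mean 0 and standard deviation 1


def within_k_sigma(k):
    """P(-k <= Z <= k): the share of observations within k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


coverage = {k: within_k_sigma(k) for k in (1, 2, 3)}
for k, share in coverage.items():
    print(k, round(share, 4))
```

These unrounded computations may differ in the fourth decimal place from figures built from rounded table values, but they round to the same 68%, 95% and 99.7%.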