Topics Covered

Topics Covered • Discrete probability distributions • The Uniform Distribution • The Binomial Distribution • The Poisson Distribution • Each is appropriately applied in certain situations and to particular phenomena

Continuous Random Variables • A continuous random variable can assume all real number values within an interval (e.g., rainfall, pH) • Some random variables that are technically discrete exhibit such a tremendous range of values that it is desirable to treat them as if they were continuous variables, e.g. population • Continuous random variables are described by probability density functions

Probability Density Functions • Probability density functions are defined using the same rules required of probability mass functions, with some additional requirements: • The function must have a non-negative value throughout the interval a to b, i.e. f(x) ≥ 0 for a ≤ x ≤ b • The area under the curve defined by f(x), within the interval a to b, must equal 1 • Figure: curve f(x) plotted against x between a and b, with area = 1
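The two requirements above can be checked numerically for a candidate density. The following is a minimal sketch and not part of the original slides: the helper name `is_valid_density`, the trapezoidal integration, the tolerance, and the example density f(x) = 2x on [0, 1] are all illustrative assumptions.

```python
def is_valid_density(f, a, b, steps=10_000, tol=1e-6):
    """Check, at sampled points only, that f(x) >= 0 on [a, b] and that
    the area under f between a and b is 1 (trapezoidal approximation)."""
    width = (b - a) / steps
    xs = [a + i * width for i in range(steps + 1)]
    ys = [f(x) for x in xs]

    # Requirement 1: the function is non-negative throughout the interval.
    if min(ys) < 0:
        return False

    # Requirement 2: the area under the curve equals 1.
    area = width * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    return abs(area - 1.0) <= tol


# Hypothetical example density: f(x) = 2x on the interval [0, 1].
print(is_valid_density(lambda x: 2 * x, 0.0, 1.0))
```

Because the check only samples the function, it can miss a narrow negative dip between sample points; it is a sanity check rather than a proof.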

The Normal Distribution • The most common probability distribution is the normal distribution • The normal distribution is a continuous distribution that is symmetric and bell-shaped Source: http://mathworld.wolfram.com/NormalDistribution.html

The Normal Distribution • Many naturally occurring variables are approximately normally distributed (e.g. heights, weights, annual temperature variations, test scores, IQ scores, etc.) • A normal distribution can also be produced by tracking the errors made in repeated measurements of the same thing; Carl Friedrich Gauss was a 19th-century astronomer who found that the errors in repeated determinations of the position of the same star formed a normal (or Gaussian) distribution

The Normal Distribution • The probability density function of the normal distribution: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$ • You can see how the value of the distribution at x is a function of the mean (μ) and standard deviation (σ)
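The standard form of the normal density, f(x) = exp(-(x - μ)² / (2σ²)) / (σ√(2π)), can be evaluated directly. The sketch below is illustrative only: the function name `normal_pdf` and the values μ = 55 and σ = 20 are assumptions chosen for the demonstration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Illustrative parameters (assumed): mean 55, standard deviation 20.
mu, sigma = 55.0, 20.0

# The curve peaks at the mean and is symmetric about it.
print(normal_pdf(mu, mu, sigma))
print(normal_pdf(mu - 20, mu, sigma), normal_pdf(mu + 20, mu, sigma))
```

Changing `mu` shifts the curve along the x axis; changing `sigma` widens or narrows it and lowers or raises the peak, since the peak height is 1 / (σ√(2π)).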

Source: http://en.wikipedia.org/wiki/Normal_distribution

The Poisson Distribution & The Normal Distribution λ

The probability density function of a normal distribution approximating the probability mass function of a binomial distribution Source: http://en.wikipedia.org/wiki/Normal_Distribution
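The approximation described in this caption can be sketched numerically by comparing a binomial probability mass function with the normal density that has the same mean (np) and standard deviation (√(np(1 - p))). The values n = 40 and p = 0.5 are hypothetical; the slide's figure does not state which parameters it used.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical parameters chosen for illustration.
n, p = 40, 0.5
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Largest pointwise gap between the two curves over k = 0..n.
largest_gap = max(
    abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma)) for k in range(n + 1)
)
print(largest_gap)
```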

The Normal Distribution • As with all frequency distributions, the area under the curve between any two x values corresponds to the probability of obtaining an x value in that range • The total area under the curve is equal to one Source: http://en.wikipedia.org/wiki/Normal_Distribution

The Normal Distribution • The way we use the normal distribution is to find the probability of a continuous random variable falling within a range of values ([c, d]) • The probability of the variable falling between c and d is the area under the curve f(x) between c and d • The areas under normal curves are given in tables such as Table A.2 in Appendix A of the textbook (Rogerson)
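In place of a printed table, the area under a normal curve between c and d can be computed from the cumulative distribution function, which Python's standard library exposes through `math.erf`. This is a sketch under that substitution (software instead of Table A.2); the function names are illustrative.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def prob_between(c, d, mu=0.0, sigma=1.0):
    """Area under the normal curve between c and d, i.e. P(c <= X <= d)."""
    if c > d:
        raise ValueError("c must not exceed d")
    return normal_cdf(d, mu, sigma) - normal_cdf(c, mu, sigma)


# Area within one standard deviation either side of the mean.
print(prob_between(-1.0, 1.0))
```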

Normal Tables • Variables with normal distributions may have an infinite number of possible means and standard deviations Source: http://en.wikipedia.org/wiki/Normal_distribution

Normal Tables • Variables with normal distributions may have an infinite number of possible means and standard deviations • Normal tables are standardized (a standard normal distribution with a mean of zero and a standard deviation of one) • Before using a normal table, we must transform our data to a standardized normal distribution

Standardization of Normal Distributions • The standardization is achieved by converting the data into z-scores: $z = \frac{x - \mu}{\sigma}$ • The z-score is the means used to transform our normal distribution into a standard normal distribution (μ = 0 and σ = 1)

Standardization • Original data: mean = 59.70, standard deviation = 12.97 • Z-scores: mean = 0, standard deviation = 1
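The effect of standardization can be sketched on a small data set: subtracting the mean and dividing by the standard deviation yields values with mean 0 and standard deviation 1. The numbers below are hypothetical (they are not the data behind the slide's 59.70 and 12.97), and the use of the population standard deviation is an assumption.

```python
import statistics


def standardize(values):
    """Convert raw values to z-scores using the population mean and standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(v - mean) / sd for v in values]


# Hypothetical observations, for illustration only.
scores = [42.0, 51.5, 58.0, 60.5, 63.0, 71.0, 77.5]
z_scores = standardize(scores)

print(statistics.fmean(z_scores), statistics.pstdev(z_scores))
```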

Standardization of Normal Distributions • Example I – A data set with μ = 55 and σ = 20: $z = \frac{x - \mu}{\sigma} = \frac{x - 55}{20}$ • If one of our data values is x = 90, then: $z = \frac{x - \mu}{\sigma} = \frac{90 - 55}{20} = \frac{35}{20} = 1.75$ • Using z-scores in conjunction with standard normal tables (like Table A.2 on page 214 of Rogerson) we can look up areas under the curve associated with intervals, and thus probabilities (P(Z > 1.75) or P(Z ≤ 1.75))

Look Up Standard Normal Tables • Using our example z-score of 1.75, we find the position of 1.75 in the table and use the value found there

Look Up Standard Normal Tables • Original data (μ = 55): P(X > 90) = 0.0401 and P(X ≤ 90) = 0.9599 • Standard normal distribution (μ = 0): P(Z > 1.75) = 0.0401 and P(Z ≤ 1.75) = 0.9599
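The table lookup for x = 90 (with mean 55 and standard deviation 20) can be reproduced in code. This sketch swaps the printed table for `math.erfc` from Python's standard library; the function names are illustrative.

```python
import math


def z_score(x, mu, sigma):
    """Standardize a value: number of standard deviations x lies from the mean."""
    return (x - mu) / sigma


def upper_tail(z):
    """P(Z >= z) for the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


z = z_score(90, 55, 20)   # 1.75
p_above = upper_tail(z)   # P(X > 90) = P(Z > 1.75)
p_below = 1.0 - p_above   # P(X <= 90) = P(Z <= 1.75)

print(z, round(p_above, 4), round(p_below, 4))
```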

Standardization of Normal Distributions • Example II – If we have a data set with μ = 55 and σ = 20, we calculate z-scores using: $z = \frac{x - \mu}{\sigma} = \frac{x - 55}{20}$ • If one of our data values is x = 20, then: $z = \frac{x - \mu}{\sigma} = \frac{20 - 55}{20} = \frac{-35}{20} = -1.75$ • Using z-scores in conjunction with standard normal tables (like Table A.2 on page 214 of Rogerson) we can look up areas under the curve associated with intervals, and thus probabilities (P(X > 20) or P(X ≤ 20))

Figure: standard normal curve f(x) centered on μ = 0, with z = -1.75 and z = +1.75 marked

Look Up Standard Normal Tables • Using our example z-score of -1.75, we find the position of 1.75 in the table and use the value found there; because the normal distribution is symmetric the table does not need to repeat positive and negative values

Look Up Standard Normal Tables • Figure: standard normal curve f(x) centered on μ = 0; the tail below -1.75 and the tail above +1.75 each have area 0.0401 (4.01%)

Look Up Standard Normal Tables • Standard normal distribution (μ = 0): P(Z ≤ -1.75) = 0.0401 (4.01%) and P(Z > -1.75) = 0.9599 • Original data (μ = 55): P(X ≤ 20) = 0.0401 (4.01%) and P(X > 20) = 0.9599

Finding the P(x) for Various Intervals (a positive) • 1. P(Z ≥ a) = (table value): the table gives the value of P(x) in the tail above a • 2. P(Z ≤ a) = [1 – (table value)]: the total area under the curve = 1, and we subtract the area of the tail • 3. P(0 ≤ Z ≤ a) = [0.5 – (table value)]: the total area under the curve = 1, thus the area above the mean (Z = 0) is equal to 0.5, and we subtract the area of the tail

Finding the P(x) for Various Intervals (a negative) • 4. P(Z ≤ a) = (table value): the table gives the value of P(x) in the tail below a, which is equivalent to P(Z ≥ |a|), i.e. case 1 for the positive value |a| • 5. P(Z ≥ a) = [1 – (table value)]: this is equivalent to P(Z ≤ |a|), i.e. case 2 for the positive value |a| • 6. P(a ≤ Z ≤ 0) = [0.5 – (table value)]: this is equivalent to P(0 ≤ Z ≤ |a|), i.e. case 3 for the positive value |a|

Finding the P(x) for Various Intervals • 7. P(a ≤ Z ≤ b) if a < 0 and b > 0 = (0.5 – P(Z < a)) + (0.5 – P(Z > b)) = 1 – P(Z < a) – P(Z > b), or in terms of table values = [0.5 – (table value for a)] + [0.5 – (table value for b)] = [1 – {(table value for a) + (table value for b)}] • With this set of building blocks, you should be able to calculate the probability for any interval using a standard normal table
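The building blocks for a one-tailed table can be sketched as a few small functions. Here `table_value` stands in for the printed table by computing the upper-tail area beyond |a| with `math.erfc`; that substitution and all function names are assumptions made for illustration.

```python
import math


def table_value(a):
    """Upper-tail area P(Z >= |a|), the quantity a one-tailed normal table lists."""
    return 0.5 * math.erfc(abs(a) / math.sqrt(2.0))


def p_at_least(a):
    """P(Z >= a): the tail itself for a >= 0, one minus the tail for a < 0."""
    return table_value(a) if a >= 0 else 1.0 - table_value(a)


def p_at_most(a):
    """P(Z <= a): one minus the tail for a >= 0, the tail itself for a < 0."""
    return 1.0 - table_value(a) if a >= 0 else table_value(a)


def p_between(a, b):
    """P(a <= Z <= b) for any a <= b, including intervals that start or end at 0
    and intervals that straddle the mean."""
    if a > b:
        raise ValueError("a must not exceed b")
    return p_at_most(b) - p_at_most(a)


print(round(p_at_least(1.75), 4), round(p_at_most(-1.75), 4))
print(round(p_between(0.0, 1.75), 4))
```

For an interval with a < 0 and b > 0, `p_between` reduces algebraically to 1 minus the two tail areas, the same expression as the straddling-interval rule.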

Finding the P(x) – Example • Suppose we are in charge of buying stock for a shoe store. We will assume that shoe sizes are normally distributed within a gender. Let’s say the mean of women’s shoe sizes is 20 cm and the standard deviation is 5 cm • If our store sells 300 pairs of a popular style each week, we can make some projections about how many pairs we need to stock of a given size, assuming shoes fit feet that are within +/- 0.5 cm of the length of the shoe (because of course we must use intervals here!)

Finding the P(x) – Example • 1. How many pairs should we stock of the 25 cm size? • We first need to convert this to a range according to the assumption specified above, since we can only evaluate P(x) over an interval: P(24.5 cm ≤ x ≤ 25.5 cm) is what we need to find • Now we need to convert the bounds of our interval into z-scores: $Z_{lower} = \frac{x - \mu}{\sigma} = \frac{24.5 - 20}{5} = \frac{4.5}{5} = 0.9$ and $Z_{upper} = \frac{x - \mu}{\sigma} = \frac{25.5 - 20}{5} = \frac{5.5}{5} = 1.1$


Finding the P(x) – Example • 1. How many pairs should we stock of the 25 cm size? • We now have our interval expressed in terms of z-scores as P(0.9 ≤ Z ≤ 1.1) for use in a standard normal distribution, which we can evaluate using our standard normal tables • We can calculate this area by finding P(0.9 ≤ Z ≤ 1.1) = P(Z ≥ 0.9) – P(Z ≥ 1.1) • = 0.1841 – 0.1357 • = 0.0484 • Now multiply our P(0.9 ≤ Z ≤ 1.1) by total sales per week to get the number of shoes we should stock in that size: 300 × 0.0484 = 14.52 ≈ 15

Finding the P(x) – Example • 2. How many pairs should we stock of the 18 cm size? (assuming shoes fit feet that are +/- 0.5 cm of the length of the shoe) • Again, we convert this to a range according to the assumption specified above, since we can only evaluate P(x) over an interval: P(17.5 cm ≤ x ≤ 18.5 cm) is what we need to find • Now we need to convert the bounds of our interval into z-scores: $Z_{lower} = \frac{x - \mu}{\sigma} = \frac{17.5 - 20}{5} = \frac{-2.5}{5} = -0.5$ and $Z_{upper} = \frac{x - \mu}{\sigma} = \frac{18.5 - 20}{5} = \frac{-1.5}{5} = -0.3$


Finding the P(x) – Example • 2. How many pairs should we stock of the 18 cm size? • We now have our interval expressed in terms of z-scores as P(-0.5 ≤ Z ≤ -0.3) for use in a standard normal distribution, which we can evaluate using our standard normal tables • We can calculate this area by finding P(-0.5 ≤ Z ≤ -0.3) = P(Z ≤ -0.3) – P(Z ≤ -0.5) • = 0.3821 – 0.3085 • = 0.0736 • Now multiply our P(-0.5 ≤ Z ≤ -0.3) by total sales per week to get the number of shoes we should stock in that size: 300 × 0.0736 = 22.08 ≈ 22

Finding the P(x) – Example • 3. How many pairs should we stock in the 18 to 25 cm size range? • As always, we convert this to a range according to the assumption specified above, since we can only evaluate P(x) over an interval: P(17.5 cm ≤ x ≤ 25.5 cm) is what we need to find • We have already found the appropriate z-scores: $Z_{lower} = \frac{x - \mu}{\sigma} = \frac{17.5 - 20}{5} = \frac{-2.5}{5} = -0.5$ and $Z_{upper} = \frac{x - \mu}{\sigma} = \frac{25.5 - 20}{5} = \frac{5.5}{5} = 1.1$

Finding the P(x) – Example • 3. How many pairs should we stock in the 18 to 25 cm size range? • We now have our interval expressed in terms of z-scores as P(-0.5 ≤ Z ≤ 1.1) for use in a standard normal distribution, which we can evaluate using our standard normal tables • We can calculate this area by finding P(-0.5 ≤ Z ≤ 1.1) = 1 – [P(Z ≤ -0.5) + P(Z ≥ 1.1)] • = 1 – [0.3085 + 0.1357] • = 1 – 0.4442 • = 0.5558 • Now multiply our P(-0.5 ≤ Z ≤ 1.1) by total sales per week to get the number of shoes we should stock in that range: 300 × 0.5558 = 166.74 ≈ 167
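The three shoe-stocking questions follow the same steps: widen the size to an interval, convert the bounds to z-scores, find the area, and multiply by weekly sales. A minimal sketch of those steps is given below. The function and constant names are illustrative, and `math.erf` replaces the printed table, so unrounded probabilities can differ from the table-based figures in the last decimal place.

```python
import math

MEAN_CM = 20.0       # mean shoe size from the example
SD_CM = 5.0          # standard deviation from the example
WEEKLY_SALES = 300   # pairs sold per week
FIT_CM = 0.5         # a shoe fits feet within +/- 0.5 cm of its length


def standard_normal_cdf(z):
    """P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_pairs(smallest_size, largest_size):
    """Return (probability, expected pairs) for a size or range of sizes in cm."""
    z_lower = (smallest_size - FIT_CM - MEAN_CM) / SD_CM
    z_upper = (largest_size + FIT_CM - MEAN_CM) / SD_CM
    probability = standard_normal_cdf(z_upper) - standard_normal_cdf(z_lower)
    return probability, WEEKLY_SALES * probability


for label, sizes in [("25 cm", (25, 25)), ("18 cm", (18, 18)), ("18 to 25 cm", (18, 25))]:
    p, pairs = expected_pairs(*sizes)
    print(f"{label}: P = {p:.4f}, stock about {round(pairs)} pairs")
```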

Commonly Used Probabilities • Figure: normal curve f(x) with the areas within ±1σ (68%), ±2σ (95%), and ±3σ (99.7%) of the mean μ marked

Commonly Used Probabilities • For a normal distribution, about 68% of the observations lie within one standard deviation of the mean (from μ – σ to μ + σ) • Standard normal distribution: P(z > 1) = 0.1587, so P(-1 ≤ z ≤ 1) = 1 – 2 × 0.1587 = 0.6826 • Normal distribution as illustrated in the figure: P(μ – σ ≤ x ≤ μ + σ) = 0.6826 ≈ 0.68

Commonly Used Probabilities • For a normal distribution, about 95% of the observations lie within two standard deviations of the mean (from μ – 2σ to μ + 2σ) • Standard normal distribution: P(z > 2) = 0.0228, so P(-2 ≤ z ≤ 2) = 1 – 2 × 0.0228 = 0.9544 • Normal distribution as illustrated in the figure: P(μ – 2σ ≤ x ≤ μ + 2σ) = 0.9544 ≈ 0.95

Commonly Used Probabilities • For a normal distribution, about 99.7% of the observations lie within three standard deviations of the mean (from μ – 3σ to μ + 3σ) • Standard normal distribution: P(z > 3) = 0.0013, so P(-3 ≤ z ≤ 3) = 1 – 2 × 0.0013 = 0.9974 • Normal distribution as illustrated in the figure: P(μ – 3σ ≤ x ≤ μ + 3σ) = 0.9974 ≈ 0.997
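The 68%, 95%, and 99.7% figures can be checked without a table, since for a standard normal variable P(-k ≤ Z ≤ k) = erf(k / √2). The sketch below uses that identity; the function name is illustrative. Because it does not round the tail areas to four decimals first, its results can differ from the table-based values in the fourth decimal place.

```python
import math


def within_k_sd(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal distribution."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within_k_sd(k):.4f}")
```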

