Chapter 7 Continuous Distributions
Continuous Random Variables • Values fall within an interval • Measurements • Described by density curves
Density Curve • Always on or above the x-axis • Area underneath it equals 1 • Shows what proportion of data falls within an interval
Unusual Density Curves • Generic continuous distributions • Can be any shape • Probability = area under the curve
How do we find the area of a triangle? P(X < 2) =
What is the area of a line segment? P(X = 2) = 0 P(X < 2) = .25
In continuous distributions, P(X < 2) & P(X ≤ 2) are the same, because P(X = 2) = 0! Hmmmm… Is this different from discrete distributions?
P(X > 3) = .4375 P(1 < X < 3) = .5
P(X > 1) = .75 Triangle: .5(2)(.25) = .25 Rectangle: (2)(.25) = .5 Total: .25 + .5 = .75
P(0.5 < X < 1.5) = .28125
Uniform Distribution • Evenly (uniformly) distributed: intervals of equal length have equal probability • Density curve: rectangle • Probability = area under the curve (How do you find the area of a rectangle?) • μ = (a + b)/2 and σ² = (b – a)²/12 (a & b are the endpoints of the distribution)
Why 12? σ² = ∫ (x – μ)² f(x) dx from a to b = (b – a)²/12, where f(x) = 1/(b – a) and μ = (a + b)/2
The Citrus Sugar Company packs sugar in bags labeled 5 pounds. However, the packaging isn’t perfect, and the actual weights are uniformly distributed with a mean of 4.98 pounds and a range of .12 pounds. a) Construct this uniform distribution. What shape does a uniform distribution have? How long is this rectangle? What is the height of this rectangle? [Figure: rectangle from 4.92 to 5.04 with height 1/.12; the mean, 4.98, is at the center]
b) What is the probability that a randomly selected bag will weigh more than 4.97 pounds? What is the length of the shaded region? P(X > 4.97) = .07(1/.12) = .5833 [Figure: rectangle from 4.92 to 5.04 with height 1/.12, shaded from 4.97 to 5.04]
c) Find the probability that a randomly selected bag weighs between 4.93 and 5.03 pounds. What is the length of the shaded region? P(4.93 < X < 5.03) = .1(1/.12) = .8333 [Figure: rectangle from 4.92 to 5.04 with height 1/.12, shaded from 4.93 to 5.03]
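The "length of the shaded region times height" reasoning in parts b) and c) can be sketched in a few lines of Python. This is only an illustration: the helper name `uniform_prob` is made up here and is not part of the original slides.

```python
def uniform_prob(a, b, lo, hi):
    """P(lo < X < hi) for X uniform on [a, b]: shaded length times height."""
    lo, hi = max(lo, a), min(hi, b)   # clip the interval to [a, b]
    if hi <= lo:
        return 0.0
    height = 1 / (b - a)              # the rectangle's total area must be 1
    return (hi - lo) * height

# Endpoints from the mean (4.98) and the range (.12): 4.92 and 5.04
a, b = 4.98 - 0.12 / 2, 4.98 + 0.12 / 2

print(round(uniform_prob(a, b, 4.97, b), 4))     # part b), P(X > 4.97)
print(round(uniform_prob(a, b, 4.93, 5.03), 4))  # part c), P(4.93 < X < 5.03)
```

Clipping the interval to [a, b] means a question such as "more than 4.97" can be passed with the upper endpoint b as its right edge.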
The time it takes for students to drive to school is evenly distributed with a minimum of 5 minutes and a range of 35 minutes. a) Draw the distribution. Where should the rectangle end? What is the height of the rectangle? [Figure: rectangle from 5 to 40 with height 1/35]
b) What is the probability that it takes less than 20 minutes to drive to school? P(X < 20) = (20 – 5)(1/35) = (15)(1/35) = .4286 [Figure: rectangle from 5 to 40 with height 1/35, shaded from 5 to 20]
c) What are the mean and standard deviation of this distribution? μ = (5 + 40)/2 = 22.5 σ² = (40 – 5)²/12 = 102.083 σ = 10.104
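The arithmetic in part c), and the earlier question of why 12 appears in the variance, can be checked numerically. The sketch below is an illustration only: both function names are hypothetical, and the midpoint Riemann sum is just one simple way to approximate the integral of (x – μ)² times the density 1/(b – a).

```python
import math

def uniform_mean_sd(a, b):
    """Mean and standard deviation of a uniform distribution on [a, b]."""
    mean = (a + b) / 2
    variance = (b - a) ** 2 / 12
    return mean, math.sqrt(variance)

def variance_by_riemann_sum(a, b, steps=10_000):
    """Approximate the variance integral with a midpoint Riemann sum."""
    mu = (a + b) / 2
    dx = (b - a) / steps
    density = 1 / (b - a)             # constant height of the rectangle
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx        # midpoint of the i-th slice
        total += (x - mu) ** 2 * density * dx
    return total

mean, sd = uniform_mean_sd(5, 40)
print(mean, round(sd, 3))
print(round(variance_by_riemann_sum(5, 40), 3), round((40 - 5) ** 2 / 12, 3))
```

The two variance values printed on the last line should agree to three decimal places, which is the numerical counterpart of the (b – a)²/12 formula.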
Normal Distribution • Symmetrical, bell-shaped density curve • Parameters: μ, σ • Probability = area under the curve (How is this done mathematically?) • As σ increases, curve flattens & spreads out • As σ decreases, curve gets taller and thinner
Normal distributions occur frequently. • Quantum harmonic oscillators • Diffusion • Thermal light • Size of living tissue • Blood pressure • Compound interest • Exchange rates • Stock market indices • Length of infants • Height • Weight • ACT scores • Intelligence • Number of typing errors • Velocities of ideal gas molecules • Yearly rainfall
[Figure: two normal curves, A and B, both centered at 6] Do these two normal curves have the same mean? If so, what is it? YES, 6 Which normal curve has a standard deviation of 3? B Which normal curve has a standard deviation of 1? A
Empirical Rule • Approx. 68% of the data fall within 1σ of μ • Approx. 95% of the data fall within 2σ of μ • Approx. 99.7% of the data fall within 3σ of μ
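Suppose the height of male GBHS students is normally distributed with a mean of 71 inches and standard deviation of 2.5 inches. What is the probability that a randomly selected male student is taller than 73.5 inches? 73.5 is 1σ above the mean, and approx. 68% of heights fall within 1σ of μ, so 1 – .68 = .32 lies in the two tails combined. By symmetry, P(X > 73.5) = .32/2 = 0.16 [Figure: normal curve centered at 71 with the middle 68% marked]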
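The Empirical Rule percentages and the height example can be compared against exact normal areas. This is a minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mu = 0, sigma = 1

# Area within k standard deviations of the mean, for k = 1, 2, 3
for k in (1, 2, 3):
    within = standard.cdf(k) - standard.cdf(-k)
    print(k, round(within, 4))

# Height example: mean 71 inches, standard deviation 2.5 inches
heights = NormalDist(mu=71, sigma=2.5)
exact_tail = 1 - heights.cdf(73.5)    # area above one sigma past the mean
print(round(exact_tail, 4))
```

The exact upper-tail area is about 0.1587, so the rule's 0.16 is a rounded version of the same quantity.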
Standard Normal Density Curve μ = 0 & σ = 1 To standardize any normal data: z = (x – μ)/σ Make your life easier – memorize this!
To find normal probabilities/proportions: • Write the probability statement • Draw a picture • Calculate the z-score • Look up the probability in the table
The lifetime of a certain type of battery is normally distributed with a mean of 200 hours and a standard deviation of 15 hours. What proportion of these batteries can be expected to last less than 220 hours? Write the probability statement: P(X < 220) Draw & shade the curve Calculate the z-score: z = (220 – 200)/15 = 1.33 Look up the z-score in the table: P(X < 220) = .9082
The lifetime of a certain type of battery is normally distributed with a mean of 200 hours and a standard deviation of 15 hours. What proportion of these batteries can be expected to last more than 220 hours? P(X > 220) = 1 – .9082 = .0918
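The two battery questions can be reproduced without a table. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; it also shows why a table lookup with z rounded to 1.33 differs slightly from the unrounded result.

```python
from statistics import NormalDist

battery = NormalDist(mu=200, sigma=15)

z = (220 - 200) / 15                  # z-score for 220 hours
p_less = battery.cdf(220)             # P(X < 220), no rounding of z
p_more = 1 - p_less                   # P(X > 220), the complement

# Table-style lookup: round z to two decimals first, as a printed table does
p_less_table = NormalDist().cdf(round(z, 2))

print(round(z, 2), round(p_less, 4), round(p_more, 4), round(p_less_table, 4))
```

The unrounded probability comes out near .9088, while the table-style value matches the .9082 on the slide; the gap is purely from rounding z.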
The lifetime of a certain type of battery is normally distributed with a mean of 200 hours and a standard deviation of 15 hours. How long must a battery last to be in the top 5%? P(X > ?) = .05, so the area to the left is .95 Look up 0.95 in the table to find the z-score: z = 1.645 Solve 1.645 = (x – 200)/15 for x: x = 200 + 1.645(15) = 224.675 hours
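Working backward from an area to a value is an inverse-CDF lookup. As a hedged sketch (Python 3.8 or later, `statistics.NormalDist`), the top-5% cutoff can be found either through the z-score or directly:

```python
from statistics import NormalDist

battery = NormalDist(mu=200, sigma=15)

# Top 5% means 95% of the area lies to the left of the cutoff
z_cut = NormalDist().inv_cdf(0.95)            # z-score with area .95 to its left
cutoff_from_z = 200 + z_cut * 15              # un-standardize: x = mu + z * sigma
cutoff_direct = battery.inv_cdf(0.95)         # same answer in one step

print(round(z_cut, 3), round(cutoff_from_z, 2), round(cutoff_direct, 2))
```

Both routes give a cutoff of roughly 224.7 hours.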
The heights of the female GBHS students are normally distributed with a mean of 65 inches. What is the standard deviation of this distribution if 18.5% of the female students are shorter than 63 inches? P(X < 63) = .185 What is the z-score for 63? Looking up .185 in the table gives z = -0.9 Solve -0.9 = (63 – 65)/σ for σ: σ = 2.22 inches
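The same idea, solving the z-score formula for σ instead of x, can be sketched as follows (Python 3.8 or later assumed). Because the code does not round z to -0.9, its σ differs slightly from a table-based answer.

```python
from statistics import NormalDist

# z-score with 18.5% of the area to its left (close to -0.9)
z = NormalDist().inv_cdf(0.185)

# Rearranged z = (x - mu) / sigma, with x = 63 and mu = 65
sigma = (63 - 65) / z

# Check: with this sigma, P(X < 63) should come back as .185
check = NormalDist(mu=65, sigma=sigma).cdf(63)

print(round(z, 3), round(sigma, 3), round(check, 3))
```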
The heights of female GBHS teachers are normally distributed with mean 65.5 inches and standard deviation 2.25 inches. The heights of male GBHS teachers are normally distributed with mean 70 inches and standard deviation 2.5 inches. a) Describe the distribution of differences of teacher heights (male – female). Normal distribution with μ = 70 – 65.5 = 4.5, σ = √(2.5² + 2.25²) = 3.3634
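b) What is the probability that a randomly selected male teacher is shorter than a randomly selected female teacher? Here X is the difference (male – female), so P(X < 0) = .0901 [Figure: normal curve centered at 4.5, shaded to the left of 0]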
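The difference-of-heights calculation can be sketched as below, assuming the two heights are independent (which is what adding the variances requires) and Python 3.8 or later. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_male, sd_male = 70, 2.5
mu_female, sd_female = 65.5, 2.25

# Difference (male - female): means subtract, variances add
mu_diff = mu_male - mu_female
sd_diff = math.hypot(sd_male, sd_female)      # sqrt(2.5**2 + 2.25**2)

# The male teacher is shorter when the difference is negative
p_male_shorter = NormalDist(mu_diff, sd_diff).cdf(0)

print(mu_diff, round(sd_diff, 4), round(p_male_shorter, 4))
```

The unrounded probability is about .0905; a table lookup with z rounded to -1.34 gives the .0901 shown on the slide.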
Will my calculator do any of this stuff? • normalpdf: gives the height of the density curve, which is not a probability, since P(X = x) = 0! Used for graphing ONLY • normalcdf: calculates probability: normalcdf(lower bound, upper bound, μ, σ) • invNorm (inverse normal): finds the z-score for a given probability (area) to the left
Ways to Assess Normality • Dotplots, boxplots, histograms • Normal probability (quantile) plot
Normal Scores To construct a normal probability plot, we can use quantities called normal scores. The values of the normal scores depend on the sample size n. The normal scores when n = 10 are below: -1.539 -1.001 -0.656 -0.376 -0.123 0.123 0.376 0.656 1.001 1.539 Think of selecting sample after sample of size 10 from a standard normal distribution. Then -1.539 is the average of the smallest value from each sample, -1.001 is the average of the next smallest value from each sample, etc. Widths of Contact Windows Suppose we have the following observations of widths of contact windows in integrated circuit chips: 3.21 2.49 2.94 4.38 4.02 3.62 3.30 2.85 3.34 3.81 Sketch a scatterplot by pairing the smallest normal score with the smallest data value, the 2nd normal score with the 2nd data value, and so on. What should happen if our data set is normally distributed?
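The pairing step can be sketched in Python using the data and normal scores given on the slide. The correlation coefficient at the end is an addition of this sketch, not something the slides compute: it is one common way to put a number on "how straight" the plotted points are, and the helper name `pearson_r` is hypothetical.

```python
import math

# Data and n = 10 normal scores copied from the slide
widths = [3.21, 2.49, 2.94, 4.38, 4.02, 3.62, 3.30, 2.85, 3.34, 3.81]
normal_scores = [-1.539, -1.001, -0.656, -0.376, -0.123,
                 0.123, 0.376, 0.656, 1.001, 1.539]

def pearson_r(xs, ys):
    """Correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Smallest score with smallest width, 2nd with 2nd, and so on
pairs = list(zip(normal_scores, sorted(widths)))
r = pearson_r(normal_scores, sorted(widths))

for score, width in pairs:
    print(score, width)
print(round(r, 3))
```

A value of r close to 1 corresponds to points lying near a straight line, which is what we expect when the data are approximately normal.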
Normal Probability (Quantile) Plots • Plot data against known normal z-scores • Points form a straight line ⇒ data are approximately normally distributed • Stacks of points (repeat observations): granularity
Are these approximately normally distributed? 50 48 54 47 51 52 46 53 52 51 48 48 54 55 57 45 53 50 47 49 50 56 53 52 The normal probability plot is approx. linear, so the data are approx. normal. The histogram/boxplot is approx. symmetrical, so the data are approx. normal.
Premature babies are those born more than 3 weeks early. Newsweek (May 16, 1988) reported that 10% of the live births in the U.S. are premature. Suppose 250 live births are randomly selected and X = the number of “preemies” is determined. What is the probability that there are between 15 and 30 preemies, inclusive? 1) Find this probability using the binomial distribution. P(15 ≤ X ≤ 30) = binomcdf(250, .1, 30) – binomcdf(250, .1, 14) = .866 2) What are the mean and standard deviation of this distribution? μ = 250(.1) = 25, σ = √(250(.1)(.9)) = 4.743
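For readers without a calculator that has binomcdf, the same calculation can be sketched with Python's standard library. The functions `binom_pmf` and `binom_cdf` are written here for illustration; their argument order (n, p, k) simply mirrors the calculator call above.

```python
from math import comb, sqrt

def binom_pmf(n, p, k):
    """P(X = k) for a binomial distribution with n trials and success chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k): add up the probabilities for 0, 1, ..., k."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

n, p = 250, 0.1

# "Between 15 and 30, inclusive" keeps 15, so subtract P(X <= 14)
prob = binom_cdf(n, p, 30) - binom_cdf(n, p, 14)

mean = n * p
sd = sqrt(n * p * (1 - p))

print(round(prob, 3), mean, round(sd, 3))
```

Subtracting the CDF at 14 rather than 15 is what makes the lower endpoint inclusive.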
3) If we were to graph a histogram for the above binomial distribution, what shape would it have? p is only 10%, so it should be skewed right. Let’s graph this distribution: • L1: seq(X, X, 0, 45) • L2: use binompdf to find the binomial probabilities • xmin = -0.5, xmax = 45, xscl = 1 • ymin = 0, ymax = 0.2, yscl = 1 4) What do you notice about the shape? Overlay a normal curve on your histogram: • Y1 = normalpdf(X, 25, 4.743)
We can estimate binomial probabilities with the normal distribution IF… 1) p is close to .5 or 2) n is sufficiently large: np ≥ 10 & n(1 – p) ≥ 10 Why 10?
• Normal distributions extend infinitely in both directions • Binomial distributions go from 0 to n • If we use normal to estimate binomial, we have to cut off the tails of the normal distribution • This is okay if the mean of the normal distribution (for which we use the mean of the binomial) is at least three standard deviations (3σ) from 0 and from n
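We require: μ – 3σ > 0 and μ + 3σ < n
Or, at the lower end: μ > 3σ
As binomial: np > 3√(np(1 – p))
Square it: n²p² > 9np(1 – p)
Simplify: np > 9(1 – p)
Since (1 – p) < 1: np > 9 is enough
And since p < 1: the same steps at the upper end give n(1 – p) > 9p, so n(1 – p) > 9 is enough
Therefore, np should be at least 10 and n(1 – p) should be at least 10.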
Continuity Correction • Discrete histograms: Each bar is centered over a discrete value • Bar for "1" actually goes from 0.5 to 1.5, bar for "2" goes from 1.5 to 2.5, etc. • So if we want to estimate a discrete distribution with a continuous one… Add/subtract 0.5 from each discrete value
5) Since P(preemie) = .1, which is not close to .5, is n large enough? np = 250(.1) = 25 > 10 and n(1 – p) = 250(.9) = 225 > 10, so we can use normal to approximate binomial. 6) Use a normal distribution to estimate the probability that between 15 and 30 preemies, inclusive, are born in the 250 randomly selected babies. Binomial: P(15 ≤ X ≤ 30) Normal (w/ cont. correction): P(14.5 < X < 30.5)
P(14.5 < X < 30.5) = normalcdf(14.5, 30.5, 25, 4.743) = .8634 7) How does the normal answer compare to the binomial answer? Pretty darn close!
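The whole approximation procedure (check the conditions, widen the interval by 0.5 on each side, find the normal area) can be sketched as one function. The name `normal_approx_inclusive` is hypothetical, and `statistics.NormalDist` assumes Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx_inclusive(n, p, lo, hi):
    """Normal estimate of P(lo <= X <= hi) for a binomial count X."""
    # Conditions from the slides: np and n(1 - p) should both be at least 10
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("np and n(1 - p) should both be at least 10")
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    # Continuity correction: each bar extends 0.5 on either side of its value
    return approx.cdf(hi + 0.5) - approx.cdf(lo - 0.5)

estimate = normal_approx_inclusive(250, 0.1, 15, 30)
print(round(estimate, 4))
```

For strict inequalities, convert to inclusive whole-number bounds before calling the function: "less than 20" is 0 through 19, and "at least 30" is 30 through n.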
Estimate each probability using the normal distribution: a) What is the probability that fewer than 20 preemies are born out of the 250 babies? b) What is the probability that at least 30 preemies are born out of the 250 babies? c) What is the probability that fewer than 35 preemies but more than 20 preemies are born out of the 250 babies?