Hypergeometric Random Variables
Sampling without replacement • When sampling with replacement, each trial remains independent. For example,… • If balls are replaced, P(red ball on 2nd draw) = P(red ball on 2nd draw | first ball was red). • If balls are not replaced, then given that the first ball is red, there is less chance of a red ball on the 2nd draw, though for a large population of balls the effect may be minimal.
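The effect of replacement on the second draw can be checked with exact fractions. This is a minimal sketch; the function name and the urn sizes (3 red of 10, and 300 red of 1000) are made up for illustration and do not come from the slides.

```python
from fractions import Fraction

def second_red_given_first_red(red, total, replace):
    """P(red on 2nd draw | red on 1st draw) for an urn with `red` red balls out of `total`."""
    if replace:
        # The first ball goes back, so the urn is unchanged.
        return Fraction(red, total)
    # One red ball has left the urn.
    return Fraction(red - 1, total - 1)

# Hypothetical small urn: 3 red balls out of 10.
small_with = second_red_given_first_red(3, 10, replace=True)
small_without = second_red_given_first_red(3, 10, replace=False)

# Hypothetical large urn with the same proportion: 300 red out of 1000.
large_with = second_red_given_first_red(300, 1000, replace=True)
large_without = second_red_given_first_red(300, 1000, replace=False)

print(small_with, small_without)   # unconditional vs. conditional chance, small urn
print(float(large_with - large_without))  # the gap shrinks for the large urn
```

With the small urn the conditional chance drops from 3/10 to 2/9, while for the large urn the drop is far smaller, which is the "effect may be minimal" remark on the slide.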
n trials, y red balls • Suppose there are r red balls and N – r other balls, for N balls in all. • Consider Y, the number of red balls in n selections, where now the trials may be dependent (sampling without replacement, when the sample size is significant relative to the population). • The probability that y of the n selected balls are red is $P(Y = y) = \binom{r}{y}\binom{N-r}{n-y} \Big/ \binom{N}{n}$.
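Hypergeometric R. V. • A random variable Y has a hypergeometric distribution with parameters N, n, and r if its probability function is given by $p(y) = \binom{r}{y}\binom{N-r}{n-y} \Big/ \binom{N}{n}$, where y is an integer with $0 \le y \le \min(n, r)$ and $n - y \le N - r$.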
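The probability function can be sketched directly from binomial coefficients. The helper name `hypergeom_pmf` and the small urn used to try it out (N = 10, n = 3, r = 4) are illustrative assumptions, not part of the slides.

```python
from math import comb

def hypergeom_pmf(y, N, n, r):
    """P(Y = y): y red balls in n draws without replacement from N balls, r of them red."""
    # Outside the support the probability is zero.
    if y < max(0, n - (N - r)) or y > min(n, r):
        return 0.0
    return comb(r, y) * comb(N - r, n - y) / comb(N, n)

# Hypothetical urn: 10 balls, 4 red, draw 3.
probs = [hypergeom_pmf(y, 10, 3, 4) for y in range(0, 4)]
print(probs)
print(sum(probs))  # the probabilities over the support should add up to 1
```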
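Hypergeometric mean, variance • If Y is a hypergeometric random variable with parameters N, n, and r, the expected value and variance of Y are given by $E(Y) = \frac{nr}{N}$ and $V(Y) = n \left(\frac{r}{N}\right) \left(\frac{N-r}{N}\right) \left(\frac{N-n}{N-1}\right)$. (The proof is not as easy as for previous distributions and is not given at this time.)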
Sounds like… • If we let p = r/N and q = 1 – p = (N – r)/N, then the hypergeometric measures $E(Y) = np$ and $V(Y) = npq \left(\frac{N-n}{N-1}\right)$ look quite similar to the expressions for the binomial distribution, E(Y) = np and V(Y) = npq.
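Although the proof is not given, the standard closed forms E(Y) = np and V(Y) = npq(N – n)/(N – 1) can be checked numerically by summing over the probability function. The sketch below uses exact fractions; the function name and the urn (N = 10, n = 3, r = 4) are assumptions chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def hypergeom_moments(N, n, r):
    """Mean and variance of Y computed by brute-force summation over the support."""
    support = range(max(0, n - (N - r)), min(n, r) + 1)
    pmf = {y: Fraction(comb(r, y) * comb(N - r, n - y), comb(N, n)) for y in support}
    mean = sum(y * p_y for y, p_y in pmf.items())
    second_moment = sum(y * y * p_y for y, p_y in pmf.items())
    return mean, second_moment - mean ** 2

# Hypothetical urn: 10 balls, 4 red, draw 3.
N, n, r = 10, 3, 4
mean, var = hypergeom_moments(N, n, r)

# Closed forms with p = r/N and q = 1 - p.
p = Fraction(r, N)
q = 1 - p
formula_mean = n * p
formula_var = n * p * q * Fraction(N - n, N - 1)

print(mean, formula_mean)
print(var, formula_var)
```

The extra factor (N – n)/(N – 1) is the only difference from the binomial variance npq, and it is close to 1 when n is small relative to N.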
Rule of Thumb • For cases when n/N < 0.05, it may be reasonable to approximate the hypergeometric probabilities using a binomial distribution. • Suppose each hour, 1000 bottles are filled by a machine and on average 10% are “underfilled”. • Each hour 20 of the bottles are randomly selected. Find the probability that at least 3 of the 20 are underfilled. • Since 20/1000 = 0.02 < 0.05, perhaps we could use the binomial distribution to approximate the answer.
Easy binomial probability? • Let p = 0.10, the probability of “success” (a bottle being underfilled). • P(at least 3 underfilled) = 1 – P(0, 1, or 2 underfilled) = 1 – [P(Y = 0) + P(Y = 1) + P(Y = 2)] • This is approximately equal to 1 – binomialcdf(20, 0.10, 2) = 0.32307. • How close is this to the actual hypergeometric probability?
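The calculator call binomialcdf(20, 0.10, 2) can be reproduced with a few lines of code. This is a minimal sketch; `binom_cdf` is a hypothetical helper written here with its arguments in a different order (k, n, p), not a library function.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(Y <= k) for a binomial random variable with n trials and success probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# n = 20 sampled bottles, p = 0.10 chance each is underfilled.
p_at_least_3 = 1 - binom_cdf(2, 20, 0.10)
print(round(p_at_least_3, 5))
```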
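A hypergeometric probability • Taking r = 100 underfilled bottles among N = 1000 and n = 20 selected, P(at least 3 underfilled) = 1 – P(0, 1, or 2 underfilled) = 1 – [P(Y = 0) + P(Y = 1) + P(Y = 2)] $= 1 - \sum_{y=0}^{2} \binom{100}{y}\binom{900}{20-y} \Big/ \binom{1000}{20}$, which is approximately 0.3228. • Compare this with 0.32307 using the binomial approximation.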
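The exact hypergeometric answer for the bottle example can be computed the same way. The sketch below assumes exactly r = 100 of the N = 1000 bottles are underfilled (the slide only says 10% "on average"); the variable names are illustrative.

```python
from math import comb

N, n, r = 1000, 20, 100   # assumed: exactly 100 of the 1000 bottles are underfilled

def hypergeom_pmf(y):
    """P(Y = y) underfilled bottles among the n sampled, without replacement."""
    return comb(r, y) * comb(N - r, n - y) / comb(N, n)

exact_at_least_3 = 1 - sum(hypergeom_pmf(y) for y in range(3))
binomial_approx = 0.32307  # value quoted on the previous slide

print(round(exact_at_least_3, 4))
print(abs(exact_at_least_3 - binomial_approx))  # size of the approximation error
```

Under that assumption the two answers agree to about three decimal places, consistent with the n/N < 0.05 rule of thumb.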
The Binomial Approximation • [Figure: the hypergeometric distribution] • [Figure: …and a very similar binomial distribution]
As population increases • Let N get large while n and p = r/N remain constant. The hypergeometric probabilities then converge to the binomial probabilities, as the trials become “almost independent”. • Proof?
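The convergence can be illustrated numerically, though this is not a proof. The sketch below holds n = 20 and p = 0.10 fixed, lets N grow, and records the largest gap between the two probability functions; the population sizes tried and the helper names are assumptions for illustration.

```python
from math import comb

def hypergeom_pmf(y, N, n, r):
    """P(Y = y) when drawing n balls without replacement from N, r of them red."""
    if y < max(0, n - (N - r)) or y > min(n, r):
        return 0.0
    return comb(r, y) * comb(N - r, n - y) / comb(N, n)

def binom_pmf(y, n, p):
    """P(Y = y) for n independent trials with success probability p."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)

n = 20
max_gaps = []
for N in (100, 1000, 10_000, 100_000):
    r = N // 10                      # keeps p = r/N fixed at 0.10
    gap = max(abs(hypergeom_pmf(y, N, n, r) - binom_pmf(y, n, r / N))
              for y in range(n + 1))
    max_gaps.append(gap)
    print(N, gap)
```

Each tenfold increase in N shrinks the largest gap, which matches the intuition that removing one ball barely changes the makeup of a very large population.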