STATS 361. Software Packages. UNIT 2. Emmanuel Harris. emmaharris2002@yahoo.com // 020 2470867. Feb 2016.
UNIT 2
• DISCRETE DISTRIBUTIONS
• BINOMIAL DISTRIBUTION
• POISSON DISTRIBUTION
• POISSON APPROXIMATION TO BINOMIAL
• GEOMETRIC DISTRIBUTION
• HYPERGEOMETRIC DISTRIBUTION
• NEGATIVE BINOMIAL DISTRIBUTION
DISCRETE PROBABILITY DISTRIBUTIONS
• A probability distribution for a discrete random variable X is a formula, graph, table or any other device that specifies the probability associated with each possible value of X.
• Discrete distributions include:
• BINOMIAL DISTRIBUTION
• POISSON DISTRIBUTION
• GEOMETRIC DISTRIBUTION
• HYPERGEOMETRIC DISTRIBUTION
• NEGATIVE BINOMIAL
BINOMIAL DISTRIBUTION
It is used to model experiments consisting of a sequence of identical and independent trials, each of which results in one of two possible outcomes. A random variable X is said to have a Binomial distribution based on n independent trials with probability of success p if and only if
$$P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x},$$
where $x = 0, 1, \dots, n$.
In R, we use >dbinom(x,n,p) to find the probability of exactly x successes, and >pbinom(x,n,p) to find the probability of at most x successes.
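As a quick check of the definition, the sketch below writes the binomial probability out with choose() and compares it with dbinom and pbinom. The helper name binom_pmf and the values n = 9, p = 0.2 are illustrative choices, not part of the slides' notation.

```r
# Binomial pmf written out from the definition: C(n, x) * p^x * (1 - p)^(n - x)
# binom_pmf is a hypothetical helper name used only for this illustration.
binom_pmf <- function(x, n, p) {
  choose(n, x) * p^x * (1 - p)^(n - x)
}

n <- 9      # illustrative number of trials
p <- 0.2    # illustrative probability of success

# Exact probability P(X = 4): formula versus the built-in function
manual   <- binom_pmf(4, n, p)
built_in <- dbinom(4, n, p)
print(c(manual = manual, built_in = built_in))

# "At most" probability P(X <= 4): summing the pmf should match pbinom
cum_manual   <- sum(binom_pmf(0:4, n, p))
cum_built_in <- pbinom(4, n, p)
print(c(cum_manual = cum_manual, cum_built_in = cum_built_in))
```

Both pairs should agree up to floating-point rounding, which is why dbinom gives exact probabilities and pbinom gives cumulative ones.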
Properties of Binomial Distribution
• The outcomes of each trial are independent of each other.
• There is a fixed number of trials.
• Each trial can have only two outcomes, or outcomes that can be reduced to two outcomes. The outcomes are usually considered as a success or a failure.
• The probability of a success must remain the same for each trial.
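EXAMPLE 1
• For a random variable $X \sim B(9, 0.2)$, find the following:
• i) $P(X = 4)$
• ii) $P(X \le 4)$
• iii) $P(X \le 3)$
• iv) $P(X > 3)$
• v) $P(X \ge 3)$
• vi) $P(2 \le X \le 6)$
• vii) $P(1 \le X \le 7)$
• viii) $P(3 \le X \le 6)$
• ix) $P(2 \le X \le 5)$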
SOLUTION
i) R-Code
> dbinom(4, 9, 0.2)
[1] 0.06606029

ii) R-Code
> pbinom(4, 9, 0.2)
[1] 0.9804186
OR
R-Code
> sum(dbinom(0:4, 9, 0.2))
[1] 0.9804186
OR
R-Code
> dbinom(0,9,0.2) + dbinom(1,9,0.2) + dbinom(2,9,0.2) + dbinom(3,9,0.2) + dbinom(4,9,0.2)
[1] 0.9804186

iii) R-Code
> pbinom(3, 9, 0.2)
[1] 0.9143583
OR
R-Code
> sum(dbinom(0:3, 9, 0.2))
[1] 0.9143583
OR
R-Code
> dbinom(0,9,0.2) + dbinom(1,9,0.2) + dbinom(2,9,0.2) + dbinom(3,9,0.2)
[1] 0.9143583

iv) R-Code
> 1 - pbinom(3, 9, 0.2)
[1] 0.08564173
OR
R-Code
> 1 - sum(dbinom(0:3, 9, 0.2))
[1] 0.08564173
OR
R-Code
> 1 - (dbinom(0,9,0.2) + dbinom(1,9,0.2) + dbinom(2,9,0.2) + dbinom(3,9,0.2))
[1] 0.08564173

v) R-Code
> sum(dbinom(3:9, 9, 0.2))
[1] 0.2618025
OR
R-Code
> 1 - pbinom(2, 9, 0.2)
[1] 0.2618025
OR
R-Code
> 1 - sum(dbinom(0:2, 9, 0.2))
[1] 0.2618025
OR
R-Code
> 1 - (dbinom(0,9,0.2) + dbinom(1,9,0.2) + dbinom(2,9,0.2))
[1] 0.2618025

vi) R-Code
> sum(dbinom(2:6, 9, 0.2))
[1] 0.5634785

vii) R-Code
> sum(dbinom(1:7, 9, 0.2))
[1] 0.8657633

viii) R-Code
> sum(dbinom(3:6, 9, 0.2))
[1] 0.2614886

ix) R-Code
> sum(dbinom(2:5, 9, 0.2))
[1] 0.560726
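The interval and upper-tail probabilities above can also be obtained from pbinom alone. The following is a minimal sketch under that idea, using the lower.tail argument of pbinom and a difference of two cumulative probabilities. The helper name binom_between is hypothetical.

```r
n <- 9
p <- 0.2

# P(X > 3): ask pbinom for the upper tail instead of computing 1 - pbinom(3, n, p)
upper <- pbinom(3, n, p, lower.tail = FALSE)

# P(a <= X <= b) as a difference of two cumulative probabilities.
# binom_between is a hypothetical helper name for this sketch.
binom_between <- function(a, b, n, p) {
  pbinom(b, n, p) - pbinom(a - 1, n, p)
}

# Same event as sum(dbinom(2:6, n, p))
between_2_6 <- binom_between(2, 6, n, p)

print(upper)
print(between_2_6)
```

The difference form avoids typing out every term and behaves the same way when the interval is wide.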
EXAMPLE 2
In a certain town, 38% of the population are females. If five people in the town are selected at random, find the probability that:
(A) exactly three of the people selected are males
(B) more than two of the people selected are males
(C) at most three are males
(D) between 3 and 5 (inclusive) are males
(E) fewer than 2 are males
(F) more than one but fewer than four are males
Solution
Since 38% of the population are females, the probability that a selected person is male is 1 - 0.38 = 0.62.

a) R-Code
> dbinom(3, 5, 0.62)
[1] 0.3441456

b) R-Code
> 1 - sum(dbinom(0:2, 5, 0.62))
[1] 0.7165093

c) R-Code
> pbinom(3, 5, 0.62)
[1] 0.6276363

d) R-Code
> sum(dbinom(3:5, 5, 0.62))
[1] 0.7165093

e) R-Code
> pbinom(1, 5, 0.62)
[1] 0.07256273

f) R-Code
> sum(dbinom(2:3, 5, 0.62))
[1] 0.5550736
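One way to guard against mixing up the 0.38 and 0.62 is to compute the same event from the female side. The sketch below does that for two of the parts. The variable names are illustrative.

```r
n        <- 5
p_female <- 0.38
p_male   <- 1 - p_female

# Exactly three males is the same event as exactly two females
three_males <- dbinom(3, n, p_male)
two_females <- dbinom(n - 3, n, p_female)

# More than two males (3, 4 or 5) is the same event as at most two females
more_than_two_males <- pbinom(2, n, p_male, lower.tail = FALSE)
at_most_two_females <- pbinom(2, n, p_female)

print(c(three_males, two_females))
print(c(more_than_two_males, at_most_two_females))
```

Each pair should print the same value twice, which confirms that the success probability and the counted outcome were matched correctly.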
POISSON DISTRIBUTION
The Poisson distribution for a random variable X represents the number of occurrences of an event in a given interval of time, space or volume. It is defined by
$$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \dots$$
where the mean and variance are equal: $E(X) = \mathrm{Var}(X) = \lambda$.
In R, we use >dpois(x,lambda) to find the probability of exactly x occurrences, and >ppois(x,lambda) to find the probability of at most x occurrences.
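The sketch below checks dpois against the Poisson formula and confirms numerically that the mean and variance coincide. The rate lambda = 3, the helper name pois_pmf, and the cut-off of the infinite sum at x = 100 are assumptions made for this illustration.

```r
lambda <- 3   # illustrative rate

# Poisson pmf from the definition: exp(-lambda) * lambda^x / x!
# pois_pmf is a hypothetical helper name used only here.
pois_pmf <- function(x, lambda) {
  exp(-lambda) * lambda^x / factorial(x)
}

manual   <- pois_pmf(2, lambda)
built_in <- dpois(2, lambda)
print(c(manual = manual, built_in = built_in))

# Mean and variance from the pmf, truncating the infinite sum at x = 100
# (the probability mass beyond that point is negligible for lambda = 3)
x      <- 0:100
mean_x <- sum(x * dpois(x, lambda))
var_x  <- sum((x - mean_x)^2 * dpois(x, lambda))
print(c(mean = mean_x, variance = var_x))
```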
PROPERTIES
• It consists of counting the number of times a particular event occurs during a given unit of time, area, volume or space.
• The occurrence or non-occurrence of the event in any interval of time, space or volume is random and independent of the occurrence or non-occurrence of the event in any other interval.
• The probability of the occurrence of an event in a given interval is proportional to the length of the interval.
EXAMPLE 1
• The number of customers arriving at a certain bank follows a Poisson distribution with $\lambda = 3$. Find the following:
SOLUTION
R-Code
> dpois(2, 3)
[1] 0.2240418

R-Code
> ppois(3, 3)
[1] 0.6472319

R-Code
> ppois(2, 3)
[1] 0.4231901

R-Code
> 1 - ppois(1, 3)
[1] 0.8008517

R-Code
> 1 - dpois(1, 3)
[1] 0.8506388

R-Code
> sum(dpois(3:4, 3))
[1] 0.3920732

R-Code
> sum(dpois(1:3, 3))
[1] 0.5974448

R-Code
> sum(dpois(4:5, 3))
[1] 0.2688502

R-Code
> sum(dpois(3:4, 3))
[1] 0.3920732
EXAMPLE 2
• On average the school photocopier breaks down eight times during the school week (Mon-Fri). Assuming that the number of breakdowns can be modeled by a Poisson distribution, find the probability that it breaks down:
• a) Five times in a given week.
• b) Once on Monday.
• c) Eight times in a fortnight.
SOLUTION
a) R-Code
> dpois(5, 8)
[1] 0.09160366

b) The mean number of breakdowns in a day is $8/5 = 1.6$, so $X \sim \text{Po}(1.6)$.
R-Code
> dpois(1, 1.6)
[1] 0.3230344

c) The mean number of breakdowns in a fortnight is $2 \times 8 = 16$. Therefore $X \sim \text{Po}(16)$.
R-Code
> dpois(8, 16)
[1] 0.01198747
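The rescaling of the mean in parts b) and c) can be sketched as a small helper. It assumes, as the solution does, a five-day school week and a fortnight of two school weeks (ten school days). The name rate_for_days is hypothetical.

```r
weekly_rate   <- 8   # breakdowns per school week, from the example
days_per_week <- 5   # Mon-Fri

# Scale the weekly rate to an interval of the given number of school days.
# rate_for_days is a hypothetical helper name for this sketch.
rate_for_days <- function(days) {
  weekly_rate * days / days_per_week
}

p_once_monday     <- dpois(1, rate_for_days(1))    # mean 1.6 for one day
p_eight_fortnight <- dpois(8, rate_for_days(10))   # mean 16 for ten school days

print(p_once_monday)
print(p_eight_fortnight)
```

Keeping the rate conversion in one place makes it harder to apply the weekly mean to a daily or fortnightly question by mistake.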
POISSON APPROXIMATION TO THE BINOMIAL
Calculating binomial probabilities becomes cumbersome when n is very large and the probability of success p is very small. In such instances the Poisson distribution is used to approximate the binomial distribution. When n is large and p is small, the binomial distribution $B(n, p)$ can be approximated by a Poisson distribution with the same mean, that is $\lambda = np$. The approximation gets better as n gets larger and p gets smaller.
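To illustrate the last claim, the sketch below holds the mean np fixed and measures how far the binomial probabilities are from the Poisson ones as n grows. The mean of 2, the range x = 0..10 and the helper name max_gap are arbitrary choices for the illustration.

```r
lambda <- 2      # fixed mean np, chosen only for illustration
x      <- 0:10   # values of X at which the two pmfs are compared

# Largest absolute gap between the binomial and Poisson pmfs over x,
# for a binomial with n trials and p = lambda / n.
# max_gap is a hypothetical helper name for this sketch.
max_gap <- function(n) {
  p <- lambda / n
  max(abs(dbinom(x, n, p) - dpois(x, lambda)))
}

gaps <- sapply(c(10, 100, 1000, 10000), max_gap)
print(gaps)
```

The printed gaps should shrink at each step, consistent with the approximation improving as n increases and p decreases.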
EXAMPLE
In a manufacturing process where iron products are produced, defects occur, occasionally rendering the piece undesirable for marketing. It is known that on average 1 in every 1000 of these items produced has one or more defects. What is the probability that a random sample of 6000 will yield:
(a) fewer than 2 items possessing defects
(b) exactly 12 items possessing defects
(c) more than 1 item possessing defects
(d) between 2 and 5 (inclusive) items possessing defects
SOLUTION
The problem is essentially a binomial experiment with $n = 6000$ and $p = 0.001$. Since p is close to 0 and n is quite large, we approximate with the Poisson distribution using $\lambda = np = 6000 \times 0.001 = 6$.

a) R-Code
> ppois(1, 6)
[1] 0.01735127

b) R-Code
> dpois(12, 6)
[1] 0.01126448

c) R-Code
> 1 - ppois(1, 6)
[1] 0.9826487

d) R-Code
> sum(dpois(2:5, 6))
[1] 0.4283284
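Since R can evaluate the binomial directly even for n = 6000, the quality of the approximation in this example can be checked. The following is a sketch of that comparison. The names exact and poisson_approx are illustrative.

```r
n      <- 6000
p      <- 0.001
lambda <- n * p

# Exact binomial answers for parts (a) to (d)
exact <- c(
  a = pbinom(1, n, p),
  b = dbinom(12, n, p),
  c = 1 - pbinom(1, n, p),
  d = sum(dbinom(2:5, n, p))
)

# Poisson approximations for the same four events
poisson_approx <- c(
  a = ppois(1, lambda),
  b = dpois(12, lambda),
  c = 1 - ppois(1, lambda),
  d = sum(dpois(2:5, lambda))
)

print(round(cbind(exact, poisson_approx), 6))
```

The two columns should differ only in the later decimal places, which is what makes the approximation acceptable here.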
GEOMETRIC DISTRIBUTION
• Suppose you flip a coin several times. What is the probability that the first head appears on the third toss? In order to answer this question and other similar probability questions, the geometric distribution can be used. The formula for the probability that the first success occurs on the xth trial is
$$P(X = x) = p (1 - p)^{x - 1}, \quad x = 1, 2, 3, \dots$$
• where p is the probability of a success and x is the trial number of the first success.
• In R, we use >dgeom(x-1,p) to find the probability that the first success occurs on exactly the xth trial, and >pgeom(x-1,p) to find the probability that it occurs on or before the xth trial. The shift to x-1 is needed because R's geometric functions count the number of failures before the first success.
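The coin question can be answered with a short sketch. It assumes a fair coin (p = 0.5), which the slide does not state explicitly, and compares the formula with dgeom using the x - 1 shift.

```r
p <- 0.5   # assumed fair coin; the slide does not give p
x <- 3     # first head on the third toss

# Formula: two tails followed by a head
manual <- p * (1 - p)^(x - 1)

# R counts failures before the first success, hence x - 1
built_in <- dgeom(x - 1, p)

print(c(manual = manual, built_in = built_in))
```

Under the fair-coin assumption both values are 0.125, that is (1/2)^3.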
EXAMPLE 1
Given that X follows a geometric distribution with p = 0.35, find the following probabilities:
i) P(X = 5)
ii) P(X ≤ 2)
iii) P(X < 4)
iv) P(X > 4)
SOLUTION
i) R-Code
> dgeom(4, 0.35)
[1] 0.06247719

ii) R-Code
> pgeom(1, 0.35)
[1] 0.5775

iii) R-Code
> pgeom(2, 0.35)
[1] 0.725375

iv) R-Code
> 1 - pgeom(3, 0.35)
[1] 0.1785063
EXAMPLE 2
A box of balls contains 30% that are defective. Balls are drawn until a defective item is found. Find the probability that:
(a) 3 draws are required
(b) Less than 2 draws are required
(c) At least 3 draws are required
(d) 3 to 6 draws are required
SOLUTION
a) R-Code
> dgeom(2, 0.3)
[1] 0.147

b) R-Code
> dgeom(0, 0.3)
[1] 0.3

c) R-Code
> 1 - pgeom(1, 0.3)
[1] 0.49

d) R-Code
> sum(dgeom(2:5, 0.3))
[1] 0.372351
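Parts (c) and (d) can be cross-checked without summing individual terms. The sketch below uses the fact that needing at least 3 draws means the first 2 draws are both non-defective, and writes the 3-to-6 range as a difference of cumulative probabilities. Variable names are illustrative.

```r
p <- 0.3   # probability that a drawn ball is defective

# (c) At least 3 draws: the first two draws are both non-defective
at_least_3_tail  <- (1 - p)^2
at_least_3_pgeom <- 1 - pgeom(1, p)

# (d) 3 to 6 draws: P(X <= 6) - P(X <= 2), with the x - 1 shift used by pgeom
three_to_six_cdf <- pgeom(5, p) - pgeom(1, p)
three_to_six_sum <- sum(dgeom(2:5, p))

print(c(at_least_3_tail, at_least_3_pgeom))
print(c(three_to_six_cdf, three_to_six_sum))
```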
THE HYPERGEOMETRIC DISTRIBUTION
The hypergeometric distribution applies when a probability experiment has two outcomes and the items are selected without replacement. When there are two groups of items such that there are a items in the first group and b items in the second group, so that the total number of items is a+b=N, the probability of selecting x items from the first group and k-x items from the second group is
$$P(X = x) = \frac{\binom{a}{x} \binom{b}{k - x}}{\binom{a + b}{k}},$$
where k is the total number of items selected without replacement.
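The formula can be checked against R's dhyper, whose arguments are, in order, the number of items drawn from the first group, the size of the first group, the size of the second group, and the number of items selected. The sketch below uses the symbols a, b and k from the definition above. The group sizes 3 and 7, the sample size 4 and the helper name hyper_pmf are illustrative assumptions.

```r
# Hypergeometric pmf from the definition: C(a, x) * C(b, k - x) / C(a + b, k)
# hyper_pmf is a hypothetical helper name used only for this illustration.
hyper_pmf <- function(x, a, b, k) {
  choose(a, x) * choose(b, k - x) / choose(a + b, k)
}

a <- 3   # items in the first group (illustrative)
b <- 7   # items in the second group (illustrative)
k <- 4   # items selected without replacement (illustrative)

x        <- 0:3                    # possible counts from the first group
manual   <- hyper_pmf(x, a, b, k)
built_in <- dhyper(x, a, b, k)
total    <- sum(built_in)          # probabilities over all possible x

print(cbind(x, manual, built_in))
print(total)
```

The two columns should match, and the probabilities over all possible x should add up to 1.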
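In R, we use >dhyper(x,k,N-k,n) to find the probability of exactly x outcomes, and >phyper(x,k,N-k,n) to find the probability of at most x outcomes. In these calls k is the number of items of the first kind among the N items, N-k is the number of items of the second kind, and n is the number of items selected. Note that k here plays the role of a in the formula, not the number of items selected.
EXAMPLE
• From a batch of 10 missiles, 4 are selected at random and fired. If the batch contains 30% defective missiles that will not fire, what is the probability that:
• (a) All 4 will fire
• (b) At most 2 will not fire
• (c) At least 2 will not fire
• (d) Three will fire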
SOLUTION
• a. All 4 fire when none of the selected missiles is defective, so x=0, k=3, N=10, n=4
• R-Code
• > dhyper(0,3,7,4)
• [1] 0.1666667