STAT 111 Introductory Statistics

STAT 111 Introductory Statistics Lecture 6: Random Variables May 26, 2004

Today’s Topics • Special random variables • Expected value of a random variable • Variance of a random variable • Conditional Probability • Multiplication Rule and Independence

Review: Random Variables • Recall that a random variable is shorthand notation for the possible numerical outcomes of a random phenomenon. • Across repetitions of this phenomenon, we would expect to get different values of the random variable. • Random variables can be either discrete or continuous.

Discrete Random Variables • Suppose we have an experiment with only two possible outcomes, success and failure. • Examples of this type of experiment: • Getting heads on a coin flip • Rolling a 6 on a standard die • Suppose we have a random variable X such that • X = 1 if the experiment produces a success • X = 0 if the experiment produces a failure • Then X is a Bernoulli random variable.

Discrete Random Variables (cont.) • If X is a Bernoulli random variable, then the probability distribution is given by • P(X = 1) = p • P(X = 0) = 1 – p • Going back to our two aforementioned examples: • P( successful coin flip ) = 1/2 • P( successful roll of the die ) = 1/6

Discrete Random Variables (cont.) • Suppose we have n independent and identically distributed Bernoulli random variables Xi. • Then the sum X of these n Bernoulli random variables is what we call a binomial random variable. • Examples: • Total number of heads from flipping a coin 10 times • Total number of 6’s from rolling a die 100 times

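Discrete Random Variables (cont.) • Our binomial random variable X can take on any integer value between (and including) 0 and n. • The probability of X = k, where k = 0, 1, …, n, is given by P(X = k) = C(n, k) p^k (1 – p)^(n – k) • Here, C(n, k) = n! / (k! (n – k)!) is the number of ways to choose which k of the n trials are successes, and p is the probability of success on each trial.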
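
The binomial probabilities can be tabulated directly from the standard formula P(X = k) = C(n, k) p^k (1 – p)^(n – k). The sketch below does this for the coin example (number of heads in 10 flips), assuming a fair coin; the function name `binomial_pmf` is only an illustrative choice.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Number of heads in 10 flips of a coin assumed to be fair.
n, p = 10, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"P(X = {k:2d}) = {prob:.4f}")
```

The probabilities over k = 0, 1, …, n add up to 1, which is a quick sanity check on any hand calculation.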

Discrete Random Variables (cont.) • The binomial random variable is fairly important when we start dealing with sample and population proportions. • Suppose the number of trials n becomes very large while p becomes small, so that np stays equal to some fixed number λ. Then C(n, k) p^k (1 – p)^(n – k) ≈ e^(–λ) λ^k / k! • A random variable X that has the probabilities P(X = k) = e^(–λ) λ^k / k! for k = 0, 1, 2, …, is a Poisson random variable.
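
A small numerical check can make the Poisson approximation concrete. The sketch below compares binomial probabilities with p = λ/n against the Poisson probabilities e^(–λ) λ^k / k! as n grows. The value λ = 2 and the range k = 0, …, 10 are illustrative choices, not taken from the lecture.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with parameter lam."""
    return exp(-lam) * lam**k / factorial(k)


lam = 2.0  # illustrative value only
gaps = {}
for n in (10, 100, 10_000):
    p = lam / n  # keep n * p fixed at lam
    # Largest difference between the two distributions over k = 0, ..., 10.
    gaps[n] = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))
    print(f"n = {n:6d}: largest gap = {gaps[n]:.6f}")
```

The gap shrinks as n increases, which is the sense in which the Poisson distribution approximates the binomial for a large number of trials.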

Discrete Random Variables (cont.) • The Poisson distribution was originally used to approximate the binomial distribution for very large numbers of trials, but the Poisson was also found to be good at fitting a variety of random phenomena. • One of the earliest phenomena that was fit using a Poisson random variable was data on Prussian cavalry.

Discrete Random Variables (cont.) • During the latter part of the 19th century, Prussian officials gathered information on the hazards that horses posed to cavalry soldiers. • The number of fatalities due to kicks, X, was recorded for each of 20 years and each of 10 corps. • The table shows observed counts and also expected counts obtained by letting λ = 0.61 and multiplying the probabilities by the total number of corps-years observed (200).

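Discrete Random Variables (cont.) • [Table not reproduced here: observed and expected numbers of corps-years, by number of fatalities X]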
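
The expected counts described for the cavalry data can be recomputed from the two numbers the lecture gives, λ = 0.61 and 200 corps-years. The sketch below does only that; it does not reproduce the observed counts, and the cutoff at 4 fatalities is an arbitrary choice for display.

```python
from math import exp, factorial

lam = 0.61         # Poisson parameter quoted in the lecture
corps_years = 200  # 20 years x 10 corps


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with parameter lam."""
    return exp(-lam) * lam**k / factorial(k)


# Expected number of corps-years with exactly k fatalities, k = 0, ..., 4.
expected = {k: corps_years * poisson_pmf(k, lam) for k in range(5)}

for k, count in expected.items():
    print(f"{k} fatalities: about {count:.1f} corps-years expected")
```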

Discrete Random Variables (cont.) • The Poisson distribution has turned out to be an excellent model for describing phenomena as disparate as • Radioactive emission • Outbreak of wars • Positioning of stars in space • Telephone calls from pay phones • Traffic accidents along given stretches of road • Mine disasters • Misprints in books

Discrete Random Variables (cont.) • Go back to the scenario of independent Bernoulli trials, but instead of asking how many successes we obtain in n trials, we ask how many trials we need in order to obtain the first success. • So if X is a random variable representing the trial number of the first success, then for x = 1, 2, …, P(X = x) = p * (1 – p)^(x – 1) • A random variable with this probability distribution is a geometric random variable.

Discrete Random Variables (cont.) • Suppose now that instead of looking at the number of trials required for the first success, we want to look at the number of trials required for the r-th success. • The random variable in this scenario comes from the negative binomial distribution. • The geometric distribution is the special case where r = 1.
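
The relationship between the two distributions can be checked numerically. The sketch below uses the geometric probabilities p (1 – p)^(x – 1) and the standard negative binomial formula C(x – 1, r – 1) p^r (1 – p)^(x – r), which is not shown in the lecture and is supplied here as an assumption about the intended parameterization (X counts trials, not failures). The die example with p = 1/6 is illustrative.

```python
from math import comb


def geometric_pmf(x, p):
    """P(first success occurs on trial x), for x = 1, 2, ..."""
    return p * (1 - p) ** (x - 1)


def negative_binomial_pmf(x, r, p):
    """P(r-th success occurs on trial x), for x = r, r + 1, ...

    Trial x must be a success, and the previous x - 1 trials
    must contain exactly r - 1 successes.
    """
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


p = 1 / 6  # probability of rolling a 6 on a fair die
first_six_on_third_roll = geometric_pmf(3, p)
second_six_on_fifth_roll = negative_binomial_pmf(5, 2, p)

print(f"P(first 6 on roll 3)  = {first_six_on_third_roll:.4f}")
print(f"P(second 6 on roll 5) = {second_six_on_fifth_roll:.4f}")
```

Setting r = 1 in `negative_binomial_pmf` gives the same values as `geometric_pmf`, matching the special case stated above.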

Continuous Random Variables • Previously discussed continuous random variables: • Uniform on an interval [a, b] • Normal with mean µ and standard deviation σ • One other continuous probability distribution that is commonly used to model observed phenomena is the exponential distribution.

Continuous Random Variables (cont.) • If a random variable X has the exponential distribution with parameter λ > 0, then the height of its density curve at any x > 0 is given by λ e^(–λx) • A random variable X that has this density curve is an exponential random variable with parameter λ.

Continuous Random Variables (cont.) • The exponential distribution is useful for describing random phenomena such as • Distance traveled by a molecule of gas before it collides with another molecule • Lifetime of mechanical and electrical equipment (equivalently, failure time) • Demand for a product

Expected Values and Variances • Suppose X is a discrete random variable that takes the values x1, x2, …, xk with probabilities p1, p2, …, pk. • Then the mean of X, or the expected value, is µX = E(X) = x1 p1 + x2 p2 + … + xk pk

Expected Values and Variances • The variance of X, where X takes the values x1, …, xk with probabilities p1, …, pk, is Var(X) = σX^2 = (x1 – µX)^2 p1 + (x2 – µX)^2 p2 + … + (xk – µX)^2 pk • The standard deviation σX of X is the square root of the variance. • There is an equivalent expression for the variance that may be somewhat faster to calculate in many cases: Var(X) = E(X^2) – [E(X)]^2

Example: Roll a die • Let X be the number we obtain as a result of our roll. The distribution of X for a fair die, then, is P(X = x) = 1/6 for x = 1, 2, …, 6. • Determine the expected value and variance of X.

Example: Pick 3 Ticket • The payoff X of a $1 ticket in the Tri-State Pick 3 Game is $500 with probability 1/1000 and 0 the rest of the time. Calculate the mean and variance of the payoff.
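
Both the die example and the Pick 3 example can be worked with one small helper that applies the definitions of mean and variance for a discrete distribution. This is a sketch, assuming a fair die; `mean_and_variance` is an illustrative name, and exact fractions are used so the answers are not blurred by rounding.

```python
from fractions import Fraction


def mean_and_variance(dist):
    """dist maps each value x to P(X = x); returns (E(X), Var(X))."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, variance


# Fair six-sided die (assumed): each face has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

# Pick 3 payoff: $500 with probability 1/1000, otherwise $0.
pick3 = {500: Fraction(1, 1000), 0: Fraction(999, 1000)}

die_mean, die_var = mean_and_variance(die)
pick3_mean, pick3_var = mean_and_variance(pick3)

print(f"Die:    mean = {die_mean}, variance = {die_var}")
print(f"Pick 3: mean = {pick3_mean}, variance = {pick3_var}")
```

The large Pick 3 variance relative to its mean reflects a payoff that is almost always $0 and occasionally very large.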

Rules for Means • Let X and Y be (not necessarily independent) random variables and let a and b be constants. Let Z = a + bX be a linear transformation of X. • Rule 1 E(Z) = µZ = E( a + bX ) = a + b E(X) = a + b µX • Rule 2 E( a X + b Y ) = a E(X) + b E(Y) = a µX + b µY • On a side note, if X and Y are independent, then E(XY) = E(X) E(Y)

Rules for Variances • Let X and Y be random variables, and let a and b again be constants. Again, let Z = a + b X be a linear transformation of X. • Rule 1 Var(Z) = Var(a + b X) = b^2 Var(X) • Rule 2 If X and Y are independent, then Var(X ± Y) = Var(X) + Var(Y) • Rule 3 If X and Y have correlation ρ, then Var(X ± Y) = Var(X) + Var(Y) ± 2ρσX σY

Rules for Variances (cont.) • The quantity ρσXσY is also known as the covariance of X and Y. • The covariance is a measure of relationship. • Covariance can also be calculated as follows: Cov(X, Y) = E(XY) – E(X) E(Y) • For X and Y independent, the covariance is 0. • Covariance depends upon the units of the variables, so what is considered a large covariance varies.

Example: Pick 3 Ticket • Suppose you buy a $1 Pick 3 ticket on each of two different days. The payoffs X and Y on the two tickets are independent. Let X + Y be the total payoff. Calculate • Expected value of the total payoff • Variance of the total payoff • Standard deviation of the total payoff
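
The two-ticket example follows from the rules for means and variances: means always add, and variances add because the two payoffs are independent. A minimal sketch, using the single-ticket payoff of $500 with probability 1/1000 stated earlier in the lecture:

```python
from math import sqrt

# Single ticket: payoff 500 with probability 1/1000, otherwise 0.
mean_one = 500 / 1000                  # E(X)
var_one = 500**2 / 1000 - mean_one**2  # Var(X) = E(X^2) - [E(X)]^2

# Two independent tickets: E(X + Y) = E(X) + E(Y), Var(X + Y) = Var(X) + Var(Y).
total_mean = mean_one + mean_one
total_var = var_one + var_one
total_sd = sqrt(total_var)

print(f"Expected total payoff: {total_mean:.2f}")
print(f"Variance of total payoff: {total_var:.2f}")
print(f"Standard deviation of total payoff: {total_sd:.2f}")
```

Note that the standard deviations themselves do not add: the standard deviation of the total is the square root of the summed variances.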

Example: Heights of Women • The height of young women between 18 and 24 in America is approximately normally distributed with mean µ = 64.5 inches and s.d. σ = 2.5 inches. • Two women are randomly chosen from this age group. • What are the mean and s.d. of the difference in their heights? • What is the probability that one is at least 5” taller than the other? • What is the IQR of heights in this age group?
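
One way to work this example numerically is with the standard library's `statistics.NormalDist`. The sketch below assumes the two heights are independent (they come from a random sample), that heights are measured in inches, and that the difference of two independent normal variables is itself normal.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 64.5, 2.5  # heights in inches (units assumed)

# Difference D = X - Y of two independent heights:
# E(D) = mu - mu, Var(D) = sigma^2 + sigma^2.
diff = NormalDist(mu - mu, sqrt(sigma**2 + sigma**2))

# "One is at least 5 inches taller than the other" means |D| >= 5.
p_gap = 2 * (1 - diff.cdf(5))

# IQR of a single height: third quartile minus first quartile.
heights = NormalDist(mu, sigma)
iqr = heights.inv_cdf(0.75) - heights.inv_cdf(0.25)

print(f"Mean of difference: {diff.mean:.2f}, s.d. of difference: {diff.stdev:.3f}")
print(f"P(|difference| >= 5): {p_gap:.4f}")
print(f"IQR of heights: {iqr:.3f}")
```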

Conditional Probability • The probability of an event can change if we know some other event has occurred. • The conditional probability of an event gives us the probability of one event under the condition that we know the outcome of another event. • Let A and B be any two events such that P(B) > 0. The conditional probability of A assuming that B has already occurred is written P(A | B): P(A | B) = P(A and B) / P(B)

Example: Rolling Dice • Let A be the event that a 4 appears on a single roll of a fair 6-sided die, and let B be the event that an even number appears. • Find P(A | B) and P(B | A). • Suppose we add another (different-colored so we can distinguish between the two) die to the mix, and let C be the event that the sum of the two dice is greater than 8. • Find P(A | C) and P(C | A).
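
Because every outcome of a fair die (or pair of dice) is equally likely, each conditional probability here is a ratio of counts: outcomes in both events divided by outcomes in the conditioning event. The sketch below enumerates them. It assumes that, once the second die is added, A means the first die shows a 4; the slide does not say which die A refers to. The helper name `cond_prob` is illustrative.

```python
from fractions import Fraction
from itertools import product


def cond_prob(event, given, outcomes):
    """P(event | given) over equally likely outcomes, computed as a ratio of counts."""
    given_outcomes = [o for o in outcomes if given(o)]
    both = [o for o in given_outcomes if event(o)]
    return Fraction(len(both), len(given_outcomes))


# One die: A = "a 4 appears", B = "an even number appears".
one_die = list(range(1, 7))
p_a_given_b = cond_prob(lambda d: d == 4, lambda d: d % 2 == 0, one_die)
p_b_given_a = cond_prob(lambda d: d % 2 == 0, lambda d: d == 4, one_die)

# Two dice: C = "sum is greater than 8".
two_dice = list(product(range(1, 7), repeat=2))


def is_a(roll):
    return roll[0] == 4  # assumption: A refers to the first die


def is_c(roll):
    return sum(roll) > 8


p_a_given_c = cond_prob(is_a, is_c, two_dice)
p_c_given_a = cond_prob(is_c, is_a, two_dice)

print(p_a_given_b, p_b_given_a, p_a_given_c, p_c_given_a)
```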

Example: Gender of Children • Suppose we have a family with two children. Assume all four possible outcomes ({older boy, younger girl},…) are equally likely. What is the probability that both are girls given that at least one is a girl? • Suppose instead that we ignored the age of the children and distinguished only three family types. How would this change the above probability?
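
The first question can be answered by listing the four equally likely (older, younger) outcomes. The second part below is one reading of the follow-up question: if only three family types are distinguished, they are not equally likely, and treating them as if they were leads to a different (incorrect) answer.

```python
from fractions import Fraction
from itertools import product

# Four equally likely outcomes, written as (older, younger).
families = list(product("BG", repeat=2))

at_least_one_girl = [f for f in families if "G" in f]
both_girls = [f for f in at_least_one_girl if f == ("G", "G")]
p_both_given_one = Fraction(len(both_girls), len(at_least_one_girl))

# Ignoring age gives three family types with unequal probabilities.
type_probs = {
    "two boys": Fraction(1, 4),
    "one of each": Fraction(1, 2),
    "two girls": Fraction(1, 4),
}
p_weighted = type_probs["two girls"] / (type_probs["one of each"] + type_probs["two girls"])

# If the two remaining types were wrongly treated as equally likely:
p_naive = Fraction(1, 2)

print(p_both_given_one, p_weighted, p_naive)
```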

Example: Drawing Cards • Draw 2 cards off the top of a well-shuffled deck. • What is the probability that the second card is an Ace, given that the first card was an Ace? • On the other hand, consider only the first card for a minute. Suppose you do not see what the card is, and your friend tells you the card is a King. What is the probability that the card is a diamond?

Multiplication Rule • The probability that both event A and event B occur is given by P(A and B) = P(A) P(B | A) = P(B) P(A | B) • Here, P(A | B) and P(B | A) have the usual meaning of being conditional probabilities.

Example: Home Security • House security experts estimate that an untrained house dog has a 70% probability of detecting an intruder – and, given detection, a 50% chance of scaring the intruder away. • What is the probability that Fido successfully thwarts a burglar? (The probability of a trained watchdog detecting and running off an intruder is estimated to be around 0.75)

Example: Drawing Chips from an Urn • An urn contains 5 white chips and 4 blue chips. Two chips are drawn sequentially and without replacement. What is the probability of obtaining the sequence (W, B)? • The multiplication rule can be extended to higher-order intersections. For example, suppose we throw 3 red chips and 5 yellow chips into our urn. Five chips are drawn sequentially and without replacement. What is the probability of obtaining the sequence (W, R, W, B, Y)?
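
Both urn questions apply the multiplication rule one draw at a time: the probability of each color is its current count divided by the chips remaining, and the count drops by one after each draw. A minimal sketch, with `sequence_probability` as an illustrative helper name:

```python
from fractions import Fraction


def sequence_probability(urn, sequence):
    """Probability of drawing the given color sequence without replacement.

    urn maps each color to its count; sequence is an iterable of colors.
    """
    counts = dict(urn)  # copy so the caller's urn is not modified
    prob = Fraction(1)
    for color in sequence:
        total = sum(counts.values())
        prob *= Fraction(counts[color], total)  # conditional probability of this draw
        counts[color] -= 1                      # the chip is not replaced
    return prob


# 5 white and 4 blue chips; sequence (W, B).
p_wb = sequence_probability({"W": 5, "B": 4}, "WB")

# After adding 3 red and 5 yellow chips; sequence (W, R, W, B, Y).
p_five = sequence_probability({"W": 5, "B": 4, "R": 3, "Y": 5}, "WRWBY")

print(p_wb, p_five)
```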

Independence • Recall that two events A and B are independent if knowing one occurs does not change the probability that the other occurs. • When two events are independent, we have that P(B | A) = P(B) and P(A | B) = P(A) • Recall our example about the probability of a single card draw being a diamond given that we are told it is a King.
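
The two card questions from the lecture illustrate the difference between independent and dependent events. The sketch below builds a standard 52-card deck and compares conditional and unconditional probabilities. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

# Single draw: is "diamond" independent of "King"?
kings = [card for card in deck if card[0] == "K"]
p_diamond = Fraction(sum(1 for c in deck if c[1] == "diamonds"), len(deck))
p_diamond_given_king = Fraction(sum(1 for c in kings if c[1] == "diamonds"), len(kings))

# Two draws without replacement: is "second card is an Ace" independent of "first card is an Ace"?
p_ace = Fraction(4, 52)
p_second_ace_given_first_ace = Fraction(3, 51)

print(p_diamond, p_diamond_given_king)      # equal, so the events are independent
print(p_ace, p_second_ace_given_first_ace)  # different, so the events are dependent
```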

Tree Diagrams • A tree diagram is often helpful for solving more elaborate calculations, and in particular, problems that have several stages. • In a tree diagram, each segment in the tree represents one stage of the problem. • Each complete branch shows a possible path. • Tree diagrams combine both the addition and multiplication rules.

Example: Tossing a Coin Twice • [Tree diagram: the first flip branches into H and T; each branch splits again on the second flip into H and T, giving the four paths HH, HT, TH, TT.] • P(HH) = P(H)P(H) = (1/2)*(1/2) = 1/4 • P(HT) = P(H)P(T) = (1/2)*(1/2) = 1/4 • P(TH) = P(T)P(H) = (1/2)*(1/2) = 1/4 • P(TT) = P(T)P(T) = (1/2)*(1/2) = 1/4

Many Independent Events • Suppose we have n independent events A1, A2, …, An. Then the multiplication rule is P(A1 and A2 and … and An) = P(A1) P(A2) … P(An)

Example: Ten Dice Rolls • Roll a die ten times. • Find P(2 appears 10 times). • Find P(2 appears at least once). • Find P(2 appears fewer than 5 times).
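
Since the ten rolls are independent, the number of 2's is binomial with n = 10 and, assuming a fair die, p = 1/6. The sketch below computes the three probabilities. "At least once" is handled through the complement, P(no 2's at all).

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 1 / 6  # ten rolls of a die assumed to be fair

p_all_ten = binomial_pmf(10, n, p)                                # 2 on every roll
p_at_least_once = 1 - binomial_pmf(0, n, p)                       # complement of "no 2's"
p_fewer_than_5 = sum(binomial_pmf(k, n, p) for k in range(5))     # k = 0, 1, 2, 3, 4

print(f"P(2 appears 10 times)          = {p_all_ten:.3e}")
print(f"P(2 appears at least once)     = {p_at_least_once:.4f}")
print(f"P(2 appears fewer than 5 times) = {p_fewer_than_5:.4f}")
```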

Example: Height of Women • Randomly select 8 American young women aged 18 to 24. • Find P(fewer than 6 women are taller than 65”). • Find P(at least 3 women are between 62” and 67”).
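
This example combines the normal and binomial distributions: the normal model gives the chance that one woman falls in a height range, and the count among 8 independently chosen women is then binomial. The sketch below assumes the mean of 64.5 inches and s.d. of 2.5 inches given earlier in the lecture.

```python
from math import comb
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


heights = NormalDist(64.5, 2.5)  # parameters from the earlier heights example

p_tall = 1 - heights.cdf(65)                # one woman is taller than 65 inches
p_mid = heights.cdf(67) - heights.cdf(62)   # one woman is between 62 and 67 inches

n = 8
p_fewer_than_6_tall = sum(binomial_pmf(k, n, p_tall) for k in range(6))    # k = 0, ..., 5
p_at_least_3_mid = 1 - sum(binomial_pmf(k, n, p_mid) for k in range(3))    # complement of k = 0, 1, 2

print(f"P(taller than 65) for one woman: {p_tall:.4f}")
print(f"P(between 62 and 67) for one woman: {p_mid:.4f}")
print(f"P(fewer than 6 of 8 taller than 65): {p_fewer_than_6_tall:.4f}")
print(f"P(at least 3 of 8 between 62 and 67): {p_at_least_3_mid:.4f}")
```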
