Learn about random variables, expected value, Bernoulli trials, Binomial probabilities, Normal model, significance testing, common pitfalls.
Chapter 15 Probability Models
Random Variables • A random variable assumes a value based on the outcome of a random event. • We use a capital letter, like X, to denote a random variable. • A particular value of a random variable will be denoted with the corresponding lower case letter, in this case x.
Random Variables (cont.) • A probability model for a random variable consists of: • The collection of all possible values of a random variable, and • the probabilities that the values occur.
Expected Value: Center • Of particular interest is the value we expect a random variable to take on, denoted μ (for population mean) or E(X) (for expected value).
Expected Value: Center (cont.) • To calculate the expected value of a (discrete) random variable, multiply each possible value by the probability that it occurs, and find the sum: $\mu = E(X) = \sum x \cdot P(x)$ • Note: Be sure that every possible outcome is included in the sum, and verify that you have a valid probability model to start with.
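The calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_value` and the payout numbers are made up for this example and are not part of the chapter.

```python
from math import isclose

def expected_value(model):
    """Return E(X) for a discrete model given as {value: probability}."""
    probs = model.values()
    # A valid probability model has no negative probabilities and sums to 1.
    if any(p < 0 for p in probs) or not isclose(sum(probs), 1.0):
        raise ValueError("not a valid probability model")
    # Multiply each possible value by its probability, then add them up.
    return sum(x * p for x, p in model.items())

# Hypothetical model: payout X (in dollars) from a made-up game.
payout_model = {0: 0.5, 10: 0.3, 50: 0.2}
print(expected_value(payout_model))
```

Checking that the probabilities sum to 1 before computing the sum mirrors the note on the slide about starting from a valid model.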
First Center, Now Spread… • For data, we calculated the standard deviation by first computing each deviation from the mean and squaring it. We do that with discrete random variables as well. • The variance for a random variable is: $\sigma^2 = Var(X) = \sum (x - \mu)^2 P(x)$ • The standard deviation for a random variable is: $\sigma = SD(X) = \sqrt{Var(X)}$
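As a rough sketch of the spread calculation, the snippet below weights each squared deviation from the mean by its probability. The helper name `mean_var_sd` and the payout model are hypothetical, chosen only for illustration.

```python
from math import sqrt

def mean_var_sd(model):
    """Return (mean, variance, standard deviation) for {value: probability}."""
    mu = sum(x * p for x, p in model.items())
    # Variance: squared deviations from the mean, weighted by probability.
    var = sum((x - mu) ** 2 * p for x, p in model.items())
    return mu, var, sqrt(var)

# Hypothetical model: payout X (in dollars) from a made-up game.
payout_model = {0: 0.5, 10: 0.3, 50: 0.2}
mu, var, sd = mean_var_sd(payout_model)
print(mu, var, sd)
```

For this made-up model the mean works out to about 13, the variance to about 361, and the standard deviation to about 19.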
Bernoulli Trials • The basis for the probability models we will examine in this chapter is the Bernoulli trial. • We have Bernoulli trials if: • there are only two possible outcomes (success and failure). • the probability of success, p, is constant. • the trials are independent.
Binomial Probabilities • A Binomial model tells us the probability for a random variable that counts the number of successes in a fixed number of Bernoulli trials. • Two parameters define the Binomial model: n, the number of trials; and, p, the probability of success. We denote this Binom(n, p).
Binomial Probabilities (cont.) • In n trials, there are ${}_nC_k = \frac{n!}{k!(n-k)!}$ ways to have k successes. • You did this in Chapter 12, remember? • In general, the probability of exactly k successes in n trials is $P(X = k) = {}_nC_k\, p^k q^{n-k}$, where q = 1 – p.
The Binomial Model • Binomial probability model for Bernoulli trials: Binom(n, p) • n = number of trials • p = probability of success • q = 1 – p = probability of failure • x = number of successes in n trials • $P(x) = {}_nC_x\, p^x q^{n-x}$, where ${}_nC_x = \frac{n!}{x!(n-x)!}$
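A minimal Python sketch of the Binomial probability formula follows. It relies on `math.comb` from the standard library (Python 3.8 or later) for the number of combinations; the function name `binom_pmf` and the values n = 5, p = 0.2 are assumptions made for this example.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n Bernoulli trials."""
    q = 1 - p  # probability of failure
    # comb(n, x) counts the ways to place x successes among n trials.
    return comb(n, x) * p**x * q**(n - x)

# Hypothetical example: 5 trials, each with a 20% chance of success.
n, p = 5, 0.2
for x in range(n + 1):
    print(x, round(binom_pmf(x, n, p), 4))
```

With these made-up numbers, the probability of exactly 2 successes comes out to roughly 0.2048, and the probabilities over all possible counts add up to 1, as they should for a valid model.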
The Binomial Model (cont.) • The binomial has an easy-to-find center or mean: $\mu = np$ • And the standard deviation of a binomial model is also fairly simple: $\sigma = \sqrt{npq}$
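One way to convince yourself that these shortcuts agree with the general definitions is to compute the mean and standard deviation both ways. The sketch below does that for a hypothetical Binom(20, 0.25); the variable names and parameter values are illustrative assumptions only.

```python
from math import comb, sqrt

n, p = 20, 0.25  # hypothetical parameters
q = 1 - p

# Shortcut formulas for a Binomial model.
mu_formula = n * p
sd_formula = sqrt(n * p * q)

# The same quantities from the general definitions, summing over every count x.
pmf = {x: comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}
mu_direct = sum(x * prob for x, prob in pmf.items())
sd_direct = sqrt(sum((x - mu_direct) ** 2 * prob for x, prob in pmf.items()))

print(mu_formula, mu_direct)
print(sd_formula, sd_direct)
```

The two pairs of values should match up to floating-point rounding.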
Independence and the 10% Rule • Binomial models require that each selection is independent. • Often, though, we make selections without replacement, which technically violates independence. • When we are sampling from a very large population, however, our selections are close to independent.
Independence (cont.) • This leads to the 10% condition: • As long as our sample is less than 10% of the population, it is okay to act as if the selections are independent.
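The 10% condition is simple enough to express as a one-line check. The function name `meets_ten_percent_condition` and the sample and population sizes below are hypothetical, used only to illustrate the rule as stated above.

```python
def meets_ten_percent_condition(sample_size, population_size):
    """True when the sample is less than 10% of the population."""
    # Integer arithmetic avoids floating-point surprises at the boundary.
    return 10 * sample_size < population_size

# Hypothetical examples.
print(meets_ten_percent_condition(50, 1000))   # 5% of the population
print(meets_ten_percent_condition(200, 1000))  # 20% of the population
```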
The Normal Model to the Rescue! • Binomial probabilities can require adding up a great many individual cases. • When a binomial problem grows big and unwieldy, we can use a Normal model instead! • The Success/Failure Condition states that a Binomial model is approximately Normal if: • np ≥ 10 and nq ≥ 10. • That is to say, if we expect at least 10 successes and at least 10 failures.
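To see the approximation at work, the sketch below checks the Success/Failure Condition and then compares an exact Binomial probability with the Normal model that has mean np and standard deviation √(npq). It uses `statistics.NormalDist` from the Python standard library; the function names and the values n = 100, p = 0.3, k = 35 are assumptions chosen for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def success_failure_ok(n, p):
    """True when we expect at least 10 successes and at least 10 failures."""
    q = 1 - p
    return n * p >= 10 and n * q >= 10

def binom_cdf(k, n, p):
    """Exact P(X <= k) for Binom(n, p), summing one term per count."""
    q = 1 - p
    return sum(comb(n, i) * p**i * q**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """P(X <= k) from a Normal model with mean np and SD sqrt(npq)."""
    q = 1 - p
    return NormalDist(mu=n * p, sigma=sqrt(n * p * q)).cdf(k)

# Hypothetical example: 100 trials with a 30% chance of success each.
n, p, k = 100, 0.3, 35
if success_failure_ok(n, p):
    print(f"exact: {binom_cdf(k, n, p):.4f}")
    print(f"Normal approximation: {normal_approx_cdf(k, n, p):.4f}")
else:
    print("Success/Failure Condition not met; stick with the Binomial model.")
```

The sketch applies the Normal model directly to the count, so the two answers should be close but not identical.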
Statistically Significant • When is something unusual? • When is something unlikely to happen just by chance? • Now we’re in a position to finally answer this question! • Roughly speaking, when we’re using a Normal model and something is 2 or more standard deviations from what we expect… • We have statistical significance! • More on this in upcoming chapters!
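The rough "2 or more standard deviations" rule can be sketched as a z-score calculation for an observed count of successes. The function names and the example numbers are hypothetical, and the threshold of 2 is the rule of thumb from the slide, not a formal test.

```python
from math import sqrt

def z_score(observed, n, p):
    """How many standard deviations an observed count is from the expected np."""
    q = 1 - p
    mu = n * p
    sigma = sqrt(n * p * q)
    return (observed - mu) / sigma

def looks_unusual(observed, n, p, threshold=2):
    """Rule of thumb: flag results 2 or more SDs from what we expect."""
    return abs(z_score(observed, n, p)) >= threshold

# Hypothetical example: 100 trials with p = 0.5, so we expect 50 with SD 5.
print(z_score(62, 100, 0.5), looks_unusual(62, 100, 0.5))
print(z_score(55, 100, 0.5), looks_unusual(55, 100, 0.5))
```

In this made-up example, 62 successes sits 2.4 standard deviations above the expected 50 and gets flagged, while 55 successes is only 1 standard deviation away and does not.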
What Can Go Wrong? • Probability models are still just models. • Models can be useful, but they are not reality. • Question probabilities as you would data, and think about the assumptions behind your models. • If the model is wrong, so is everything else.
What Can Go Wrong? (cont.) • Be sure you have Bernoulli trials. • You need two outcomes per trial, a constant probability of success, and independence. • Remember that the 10% Condition provides a reasonable substitute for independence. • Don’t assume everything is Normal. • Don’t use the Normal approximation with small n. • You need at least 10 successes and 10 failures to use the Normal approximation.
What have we learned? • We can use a probability model for a random variable to find its expected value and standard deviation. • We’ve learned that Bernoulli trials show up in lots of places. • We’ve learned how to calculate Binomial probabilities, as well as the mean and standard deviation.
What have we learned? (cont.) • We’ve learned how to use a Normal model for some Binomial situations (when you expect at least 10 successes and 10 failures). • We’ve learned to use a Normal model to start thinking about statistically significant results: results that are 2 or more standard deviations from what we expect.