Discrete Math CS 2800
Prof. Bart Selman selman@cs.cornell.edu
Module Probability --- Part d)
1) Probability Distributions
2) Markov and Chebyshev Bounds
Discrete Random variable • Discrete random variable • Takes on one of a finite (or at least countable) number of different values. • X = 1 if heads, 0 if tails • Y = 1 if male, 0 if female (phone survey) • Z = # of spots on face of thrown die
Continuous Random Variable • Continuous random variable (r.v.) • Takes on one in an infinite range of different values • W = % GDP grows (shrinks?) this year • V = hours until light bulb fails • For a discrete r.v., we have Prob(X=x), i.e., the probability that r.v. X takes on a given value x. • What is the probability that a continuous r.v. takes on a specific value? E.g. Prob(X_light_bulb_fails = 3.14159265 hrs) = ?? • Individual values have probability 0. • However, ranges of values can have non-zero probability. • E.g. Prob(3 hrs <= X_light_bulb_fails <= 4 hrs) = 0.1
Probability Distribution • The probability distribution is a complete probabilistic description of a random variable. • All other statistical concepts (expectation, variance, etc.) are derived from it. • Once we know the probability distribution of a random variable, we know everything we can learn about it from statistics.
Probability Distribution • Probability function • One form in which the probability distribution of a discrete random variable may be expressed. • Expresses the probability that X takes the value x as a function of x (as we saw before): f(x) = P(X = x)
Probability Distribution • The probability function • May be tabular:
Probability Distribution • The probability function • May be graphical: (bar chart of probabilities over the values of x omitted)
Probability Distribution • The probability function • May be formulaic:
Probability Distribution: Fair die • f(x) = 1/6 for x = 1, 2, …, 6 (bar chart with six equal bars of height 1/6 ≈ .17 over x = 1, …, 6 omitted)
Probability Distribution • The probability function, properties: • f(x) >= 0 for every value x • Σ_x f(x) = 1 (the probabilities sum to 1)
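As a quick illustration (added here, not part of the original slides), a minimal Python sketch that stores a probability function as a table and checks the two properties; the particular pmf is a hypothetical example.

    # A probability function stored as a table: value -> probability.
    # Hypothetical example: a loaded 3-sided spinner.
    pmf = {1: 1/6, 2: 1/3, 3: 1/2}

    # Property 1: every probability is nonnegative (and at most 1).
    assert all(0 <= p <= 1 for p in pmf.values())

    # Property 2: the probabilities sum to 1 (allow tiny floating-point error).
    assert abs(sum(pmf.values()) - 1.0) < 1e-9

    for x, p in sorted(pmf.items()):
        print(f"P(X = {x}) = {p:.3f}")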
Cumulative Probability Distribution • Cumulative probability distribution • The cdf is a function which describes the probability that a random variable does not exceed a given value: F(x) = P(X <= x). • Does this make sense for a continuous r.v.? Yes!
Cumulative Probability Distribution • Cumulative probability distribution • The relationship between the cdf and the probability function: F(x) = P(X <= x) = Σ_{t <= x} f(t)
Cumulative Probability Distribution • Die-throwing example, in tabular and graphical form (table of F(x) = x/6 for x = 1, …, 6 and the corresponding step plot rising from 1/6 to 1 omitted)
Cumulative Probability Distribution • The cumulative distribution function • May be formulaic (die-throwing): F(x) = 0 for x < 1, F(x) = ⌊x⌋/6 for 1 <= x < 6, and F(x) = 1 for x >= 6
Cumulative Probability Distribution • The cdf, properties: • 0 <= F(x) <= 1 • F is non-decreasing • F(x) → 0 as x → -∞ and F(x) → 1 as x → +∞
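A small Python sketch (my own illustration, not from the slides) that builds the cdf of a fair die from its probability function and checks these properties.

    # Probability function of a fair die.
    pmf = {x: 1/6 for x in range(1, 7)}

    def cdf(x):
        # F(x) = P(X <= x) = sum of f(t) over all t <= x.
        return sum(p for t, p in pmf.items() if t <= x)

    for x in range(0, 8):
        print(f"F({x}) = {cdf(x):.3f}")   # 0.000, 0.167, ..., 1.000

    # Properties: F is non-decreasing and goes from 0 to 1.
    values = [cdf(x) for x in range(0, 8)]
    assert all(a <= b for a, b in zip(values, values[1:]))
    assert values[0] == 0.0 and abs(values[-1] - 1.0) < 1e-9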
Example CDFs (plots omitted): • of a discrete probability distribution • of a continuous probability distribution • of a distribution which has both a continuous part and a discrete part.
Functions of a random variable • It is possible to calculate expectations and variances of functions of random variables: for a function g, E[g(X)] = Σ_x g(x) · P(X = x).
Functions of a random variable • Example • You are paid a number of dollars equal to the square root of the number of spots on a die. • What is a fair bet to get into this game? Compute the expected payoff E[√X]:

x      √x       P(X = x)    √x · P(X = x)
1      1.000    1/6         0.167
2      1.414    1/6         0.236
3      1.732    1/6         0.289
4      2.000    1/6         0.333
5      2.236    1/6         0.373
6      2.449    1/6         0.408
Total                       1.805

A fair price to enter is about $1.80, the expected payoff.
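A quick Python check of the computation in the table (an added illustration, not part of the original slides).

    import math

    # Fair die: each face has probability 1/6.
    pmf = {x: 1/6 for x in range(1, 7)}

    # E[g(X)] = sum over x of g(x) * P(X = x), here with g(x) = sqrt(x).
    expected_payoff = sum(math.sqrt(x) * p for x, p in pmf.items())
    print(f"E[sqrt(X)] = {expected_payoff:.3f}")   # about 1.805 dollars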
Functions of a random variable • Linear functions • If a and b are constants and X is a random variable • It can be shown that: E[aX + b] = a·E[X] + b and V(aX + b) = a²·V(X). Intuitively, why does b not appear in the variance? And, why a²?
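As a minimal added sketch (not from the slides), the two identities can be verified numerically for one hypothetical choice of a, b, and X (a fair die).

    # Fair-die pmf, plus exact E[X] and V(X) computed from it.
    pmf = {x: 1/6 for x in range(1, 7)}

    def expectation(g=lambda x: x):
        return sum(g(x) * p for x, p in pmf.items())

    def variance(g=lambda x: x):
        mu = expectation(g)
        return expectation(lambda x: (g(x) - mu) ** 2)

    a, b = 3.0, 5.0                               # arbitrary constants for the check
    print(expectation(lambda x: a * x + b),
          a * expectation() + b)                  # equal: E[aX+b] = aE[X] + b
    print(variance(lambda x: a * x + b),
          a ** 2 * variance())                    # equal: V(aX+b) = a^2 V(X)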
The Most Common Discrete Probability Distributions (some discussed before):
1) Bernoulli distribution
2) Binomial
3) Geometric
4) Poisson
Bernoulli distribution • The Bernoulli distribution is the “coin flip” distribution. • X is Bernoulli if its probability function is: P(X = 1) = p, P(X = 0) = 1 - p. • X = 1 is usually interpreted as a “success.” E.g.: • X = 1 for heads in a coin toss • X = 1 for male in a survey (phone survey) • X = 1 for defective in a test of product • X = 1 for “made the sale” tracking performance
Bernoulli distribution • Expectation: E[X] = 1·p + 0·(1 - p) = p • Variance: V(X) = E[X²] - (E[X])² = p - p² = p(1 - p)
Binomial distribution • The binomial distribution is just n independent Bernoullis added up. • It is the number of “successes” in n trials. • If Z1, Z2, …, Zn are independent Bernoulli random variables, then X = Z1 + Z2 + … + Zn is binomial.
Binomial distribution • The binomial distribution is just n independent Bernoullis added up. Example: testing for defects “with replacement.” • Have many light bulbs • Pick one at random, test for defect, put it back • Pick one at random, test for defect, put it back • and so on • If there are many light bulbs, we do not even have to replace them: sampling without replacement is then essentially the same.
Binomial distribution • Let’s figure out a binomial r.v.’s probability function. Suppose we are looking at a binomial with n = 3.
• We want P(X=0): can happen one way: 000 → (1-p)(1-p)(1-p) = (1-p)³
• We want P(X=1): can happen three ways: 100, 010, 001 → p(1-p)(1-p) + (1-p)p(1-p) + (1-p)(1-p)p = 3p(1-p)²
• We want P(X=2): can happen three ways: 110, 011, 101 → pp(1-p) + (1-p)pp + p(1-p)p = 3p²(1-p)
• We want P(X=3): can happen one way: 111 → ppp = p³
Binomial distribution • So, a binomial r.v.’s probability function is: P(X = x) = C(n, x) · p^x · (1 - p)^(n - x), for x = 0, 1, …, n, where C(n, x) = n! / (x!(n - x)!) is the binomial coefficient.
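A short Python sketch of this probability function (an added illustration; the function name binomial_pmf is mine).

    from math import comb

    def binomial_pmf(x, n, p):
        # P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)
        return comb(n, x) * p ** x * (1 - p) ** (n - x)

    # Reproduce the n = 3 case worked out above, e.g. with p = 0.4.
    for x in range(4):
        print(x, round(binomial_pmf(x, 3, 0.4), 4))

    # Sanity check: the probabilities sum to 1.
    assert abs(sum(binomial_pmf(x, 3, 0.4) for x in range(4)) - 1) < 1e-9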
Binomial distribution • Typical shape of the binomial (histogram omitted): roughly bell-shaped and symmetric (exactly symmetric when p = 0.5).
Binomial distribution • Expectation: E[X] = np • Variance: V(X) = np(1 - p) • Aside: if V(X) = V(Y), what, if anything, does that tell us about the two distributions? Hmm…
Binomial distribution • A salesman claims that he closes the deal 40% of the time. • This month, he closed 1 out of 10 deals. • How likely is it that he did 1/10 or worse given his claim?
Binomial distribution • With X ~ Binomial(n = 10, p = 0.4): P(X <= 1) = P(X = 0) + P(X = 1) = (0.6)^10 + 10 · (0.4) · (0.6)^9 ≈ 0.046. • Less than 5%, or about 1 in 20. So, it’s unlikely that his true success rate is 0.4.
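A one-off Python check of this calculation (added here, not from the slides).

    from math import comb

    n, p = 10, 0.4
    # P(X <= 1) = P(X = 0) + P(X = 1) for X ~ Binomial(10, 0.4)
    prob = sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in (0, 1))
    print(f"P(X <= 1) = {prob:.4f}")   # about 0.0464, i.e. less than 5%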
Binomial and normal / Gaussian distribution • The normal distribution is a good approximation to the binomial distribution B(n, p) (“large” n, small skew). • Prob. density function: f(x) = (1 / (σ√(2π))) · e^(-(x - μ)² / (2σ²)), with μ = np and σ² = np(1 - p) for the approximating normal.
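A small added Python sketch comparing the exact binomial probabilities with this normal density, using the standard choices μ = np and σ² = np(1 - p); the specific n and p are hypothetical.

    from math import comb, exp, pi, sqrt

    n, p = 50, 0.5                       # a "large" n, no skew
    mu, sigma = n * p, sqrt(n * p * (1 - p))

    def binom(x):
        return comb(n, x) * p ** x * (1 - p) ** (n - x)

    def normal_density(x):
        return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

    for x in (20, 25, 30):
        # The two values are close near the mean.
        print(x, round(binom(x), 4), round(normal_density(x), 4))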
Geometric Distribution • A geometric distribution is usually interpreted as the number of time periods until a failure occurs. • Imagine a sequence of coin flips, where the random variable X is the flip number on which the first tails occurs. • The probability of a head (a success) is p.
Geometric • Let’s find the probability function for the geometric distribution: • P(X = 1) = 1 - p (tails on the first flip) • P(X = 2) = p(1 - p) (a head, then tails) • P(X = 3) = p · p · (1 - p) = p²(1 - p) • etc. So, P(X = x) = p^(x-1) · (1 - p) (x is a positive integer)
Geometric • Notice, there is no upper limit on how large X can be. • Let’s check that these probabilities add to 1: Σ_{x >= 1} p^(x-1)(1 - p) = (1 - p)(1 + p + p² + …) = (1 - p) · 1/(1 - p) = 1, using the formula for a geometric series.
Geometric • Expectation: E[X] = 1/(1 - p) • Variance: V(X) = p/(1 - p)² • To derive E[X], start from the geometric series Σ_{k >= 0} p^k = 1/(1 - p) and differentiate both sides w.r.t. p. See Rosen page 158, example 17.
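An added numerical sanity check in Python for these geometric formulas (the series is truncated at a large cutoff, and p = 0.7 is an arbitrary choice).

    p = 0.7                              # probability of heads (a success)
    q = 1 - p                            # probability of tails on any flip

    # P(X = x) = p^(x-1) * (1 - p): x-1 heads followed by the first tails.
    pmf = {x: p ** (x - 1) * q for x in range(1, 2000)}   # truncate the tail

    total = sum(pmf.values())
    mean = sum(x * pr for x, pr in pmf.items())
    var = sum((x - mean) ** 2 * pr for x, pr in pmf.items())

    print(round(total, 6))               # ~ 1, probabilities sum to 1
    print(round(mean, 6), 1 / q)         # ~ E[X] = 1/(1-p)
    print(round(var, 6), p / q ** 2)     # ~ V(X) = p/(1-p)^2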
Poisson distribution • The Poisson distribution is typical of random variables which represent counts. • Number of murders in Ithaca next year. • Number of requests to a server in 1 hour. • Number of sick days in a year for an employee. ?!
The Poisson distribution is derived from the following underlying arrival-time model: • The probability of a unit arriving is uniform through time. • Two units never arrive at exactly the same time. • Arrivals are independent --- the arrival of one unit does not make the next unit more or less likely to arrive quickly.
Poisson distribution • The probability function for the Poisson distribution with parameter λ is: P(X = x) = e^(-λ) · λ^x / x!, for x = 0, 1, 2, … • λ is like the arrival rate --- a higher λ means more/faster arrivals
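A minimal Python sketch of this probability function (an added illustration; lam stands for λ, and the rate of 4 per hour is hypothetical).

    from math import exp, factorial

    def poisson_pmf(x, lam):
        # P(X = x) = e^(-lambda) * lambda^x / x!
        return exp(-lam) * lam ** x / factorial(x)

    lam = 4.0                                  # e.g. 4 requests per hour on average
    for x in range(8):
        print(x, round(poisson_pmf(x, lam), 4))

    # The probabilities over all counts sum to 1 (checked up to a large cutoff).
    assert abs(sum(poisson_pmf(x, lam) for x in range(200)) - 1) < 1e-9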
Poisson distribution • Shape (histograms omitted for low, medium, and high λ: as λ grows the distribution shifts right and spreads out).
Often, you don’t know the exact probability distribution of a random variable. • We would still like to say something about probabilities involving that random variable… • E.g., what is the probability of X being larger (or smaller) than some given value? • We often can, by bounding the probability of events based on partial information about the underlying probability distribution: • Markov and Chebyshev bounds.
Theorem (Markov Inequality). Let X be a nonnegative random variable with E[X] = μ. Then, for any t > 0, P(X >= t) <= μ / t. • Note: relates the cumulative distribution to the expected value. • Hmm. What if t <= μ? Sure! Then μ/t >= 1 and the bound holds trivially. • Intuition: “can’t have too much prob. to the right of E[X]” --- if P(X >= t) were larger than μ/t, the mass at values >= t alone would push the mean above μ.
Proof. For a discrete random variable X:
E[X] = Σ_x x · P(X = x)
= Σ_{x < t} x · P(X = x) + Σ_{x >= t} x · P(X = x)
>= Σ_{x >= t} x · P(X = x)
>= Σ_{x >= t} t · P(X = x)
= t · P(X >= t).
I.e., P(X >= t) <= E[X] / t. • Where did we use X >= 0? In the 3rd line: dropping the sum over x < t can only decrease the total because every term there is nonnegative.
Alt. proof of the Markov Inequality • Define a discrete random variable Y by: Y = t if X >= t, and Y = 0 otherwise. Then Y <= X always (this uses X >= 0), so E[Y] <= E[X]. But E[Y] = t · P(X >= t), hence t · P(X >= t) <= E[X], i.e., P(X >= t) <= E[X] / t.
Example: • Consider a system with mean time to failure of 100 hours. • Use the Markov inequality to bound the reliability of the system, R(t), for t = 90, 100, 110, 200. • X is the time to failure of the system; E[X] = 100. • R(t) = P[X > t], with t = 90, 100, 110, 200. • By Markov, R(t) <= 100 / t: R(90) <= 1.11 (vacuous, since any probability is <= 1), R(100) <= 1, R(110) <= 0.91, R(200) <= 0.5. • The Markov inequality is somewhat crude, since only the mean is assumed to be known.
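The same bounds in a few lines of Python (an added sketch, not from the slides).

    mean_time_to_failure = 100.0          # E[X], in hours

    # Markov: P(X > t) <= P(X >= t) <= E[X] / t for a nonnegative X.
    for t in (90, 100, 110, 200):
        bound = min(1.0, mean_time_to_failure / t)   # a probability is never > 1
        print(f"R({t}) = P(X > {t}) <= {bound:.2f}")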
Theorem Chebyshev's Inequality • Assume that the mean μ = E[X] and the variance σ² = V(X) are given. • We can obtain a better estimate of the probability of events of interest by using Chebyshev’s inequality: for any t > 0, P(|X - μ| >= t) <= σ² / t².
Theorem Chebyshev's Inequality • Proof: Markov Ineq. applied to the nonnegative r.v. Y = (X - μ)²: P(|X - μ| >= t) = P((X - μ)² >= t²) <= E[(X - μ)²] / t² = σ² / t².
Chebyshev inequality: Alternate forms • Yet two other forms of Chebyshev’s inequality: P(|X - μ| >= kσ) <= 1/k² and, equivalently, P(|X - μ| < kσ) >= 1 - 1/k². • These say something about the probability of being “k standard deviations from the mean”.
Theorem Chebyshev's Inequality • Facts: for any distribution with finite variance, at least 1 - 1/4 = 75% of the probability lies within 2 standard deviations of the mean, at least 1 - 1/9 ≈ 88.9% within 3, and at least 1 - 1/16 = 93.75% within 4.
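To see how conservative Chebyshev can be, here is an added Python comparison against an exact binomial distribution (the parameters n = 100, p = 0.5 are a hypothetical example).

    from math import comb, sqrt

    n, p = 100, 0.5
    mu, sigma = n * p, sqrt(n * p * (1 - p))   # mean 50, std dev 5

    def pmf(x):
        return comb(n, x) * p ** x * (1 - p) ** (n - x)

    for k in (2, 3, 4):
        # Exact probability of landing within k standard deviations of the mean.
        inside = sum(pmf(x) for x in range(n + 1) if abs(x - mu) < k * sigma)
        chebyshev = 1 - 1 / k ** 2
        print(f"k={k}: Chebyshev guarantees >= {chebyshev:.3f}, actual = {inside:.3f}")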