Module #16: Probability Theory Rosen 5th ed., ch. 5 Let's move on to probability, ch. 5.
Terminology • A (stochastic) experiment is a procedure that yields one of a given set of possible outcomes. • The sample space S of the experiment is the set of possible outcomes. • An event is a subset of the sample space. • A random variable is a function that assigns a real value to each outcome of an experiment. Normally, a probability is related to an experiment or a trial. Take flipping a coin, for example: what are the possible outcomes? Heads or tails (the front or back side of the coin) will face upwards. After a sufficient number of tosses, we can "statistically" conclude that the probability of heads is 0.5. In rolling a die, there are 6 outcomes. Suppose we want to calculate the probability of the event that the roll is odd. What is that probability?
Probability: Laplacian Definition • First, assume that all outcomes in the sample space are equally likely • This term still needs to be defined. • Then, the probability of event E in sample space S is given by Pr[E] = |E|/|S|. Even though there are many definitions of probability, I would like to use the one from Laplace. The expression "equally likely" may be a little vague from the perspective of pure mathematics, but from an engineering viewpoint I think it is acceptable.
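As a minimal sketch of this definition in Python (the odd-roll event comes from the previous slide; the code itself is illustrative, not from the text):

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}

# Event: the roll is odd.
E = {s for s in S if s % 2 == 1}

# Laplacian definition: Pr[E] = |E| / |S|, assuming equally likely outcomes.
pr_E = Fraction(len(E), len(S))
print(pr_E)  # 1/2
```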
Probability of Complementary Events • Let E be an event in a sample space S. • Then, E̅ = S − E represents the complementary event. • Pr[E̅] = 1 − Pr[E] • Pr[S] = 1
Probability of Unions of Events • Let E1, E2 ⊆ S • Then: Pr[E1 ∪ E2] = Pr[E1] + Pr[E2] − Pr[E1 ∩ E2] • By the inclusion-exclusion principle.
Mutually Exclusive Events • Two events E1, E2 are called mutually exclusive if they are disjoint: E1 ∩ E2 = ∅ • Note that two mutually exclusive events cannot both occur in the same instance of a given experiment. • For mutually exclusive events, Pr[E1 ∪ E2] = Pr[E1] + Pr[E2].
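A quick numeric check of the union rule and the mutually exclusive case, using made-up die events (a sketch, not from the slides):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
pr = lambda A: Fraction(len(A), len(S))  # Laplacian probability

E1 = {1, 2, 3}  # roll is at most 3
E2 = {3, 4}     # roll is 3 or 4 (overlaps E1, so not mutually exclusive)

# Inclusion-exclusion: Pr[E1 ∪ E2] = Pr[E1] + Pr[E2] - Pr[E1 ∩ E2]
assert pr(E1 | E2) == pr(E1) + pr(E2) - pr(E1 & E2)

E3 = {5, 6}     # disjoint from E1: mutually exclusive, so no correction term
assert pr(E1 | E3) == pr(E1) + pr(E3)
```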
Exhaustive Sets of Events • A set E = {E1, E2, …} of events in the sample space S is exhaustive if E1 ∪ E2 ∪ … = S. • An exhaustive set of events that are all mutually exclusive with each other has the property that Pr[E1] + Pr[E2] + … = 1.
Independent Events • Two events E, F are independent if Pr[E ∩ F] = Pr[E]·Pr[F]. • Relates to the product rule for the number of ways of doing two independent tasks. • Example: Flip a coin, and roll a die. Pr[quarter is heads ∩ die is 1] = Pr[quarter is heads] × Pr[die is 1]. Now the question is: how can we figure out whether two events are independent or not?
Conditional Probability • Let E, F be events such that Pr[F] > 0. • Then, the conditional probability of E given F, written Pr[E|F], is defined as Pr[E ∩ F]/Pr[F]. • This is the probability that E would turn out to be true, given just the information that F is true. • If E and F are independent, Pr[E|F] = Pr[E]. This is one of the most important ideas in probability. With conditional probability, we can figure out whether there is a correlation or dependency between two events.
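A small sketch of conditional probability and the independence check, reusing the coin-and-die example from the previous slide (variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

# Sample space: (coin, die) pairs, all 12 equally likely.
S = set(product("HT", range(1, 7)))

def pr(A):
    return Fraction(len(A), len(S))  # Laplacian probability

E = {s for s in S if s[0] == "H"}  # coin is heads
F = {s for s in S if s[1] == 1}    # die is 1

# Conditional probability: Pr[E|F] = Pr[E ∩ F] / Pr[F]
pr_E_given_F = pr(E & F) / pr(F)

# Independence: Pr[E ∩ F] == Pr[E]·Pr[F], equivalently Pr[E|F] == Pr[E].
assert pr(E & F) == pr(E) * pr(F)
assert pr_E_given_F == pr(E)
```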
Bayes' Theorem • Allows one to compute the probability that a hypothesis Hj is correct, given data D: Pr[Hj|D] = Pr[D|Hj]·Pr[Hj] / Σi Pr[D|Hi]·Pr[Hi] • The set of Hi is exhaustive (and mutually exclusive).
Bayes' theorem: example • Suppose 1% of the population has AIDS • Prob. that a positive result is right: 95% • Prob. that a negative result is right: 90% • What is the probability that someone who has a positive result is actually an AIDS patient? • H: event that a person has AIDS • D: event of a positive result • P[D] = P[D|H]P[H] + P[D|H̄]P[H̄] = 0.95·0.01 + 0.1·0.99 = 0.1085 • P[H|D] = 0.95·0.01/0.1085 ≈ 0.0876
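The slide's numbers can be reproduced directly; this sketch assumes only the figures given above:

```python
# Prior, sensitivity, and specificity as stated on the slide.
p_H = 0.01                 # Pr[H]: person has AIDS
p_D_given_H = 0.95         # Pr[D|H]: positive result given infected
p_notD_given_notH = 0.90   # Pr[D̄|H̄]: negative result given healthy

p_D_given_notH = 1 - p_notD_given_notH  # false-positive rate = 0.10

# Total probability: Pr[D] = Pr[D|H]Pr[H] + Pr[D|H̄]Pr[H̄]
p_D = p_D_given_H * p_H + p_D_given_notH * (1 - p_H)

# Bayes' theorem: Pr[H|D] = Pr[D|H]Pr[H] / Pr[D]
p_H_given_D = p_D_given_H * p_H / p_D
print(round(p_D, 4), round(p_H_given_D, 4))  # 0.1085 0.0876
```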
Expectation Values • For a random variable X(s) having a numeric domain, its expectation value or expected value or weighted-average value or arithmetic mean value E[X] is defined as E[X] = Σs∈S Pr[s]·X(s).
Linearity of Expectation • Let X1, X2 be any two random variables derived from the same sample space. Then: • E[X1+X2] = E[X1] + E[X2] • E[aX1 + b] = aE[X1] + b
Variance • The variance Var[X] = σ²(X) of a random variable X is the expected value of the square of the difference between the value of X and its expectation value E[X]: Var[X] = E[(X − E[X])²] = E[X²] − (E[X])². • The standard deviation or root-mean-square (RMS) difference of X is σ(X) ≡ (Var[X])^(1/2).
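A minimal worked example, assuming a fair six-sided die (an illustration, not from the slides), computing E[X], Var[X], and σ:

```python
from fractions import Fraction

# Fair die: each outcome x = 1..6 has probability 1/6.
dist = {x: Fraction(1, 6) for x in range(1, 7)}

E_X = sum(p * x for x, p in dist.items())        # E[X] = 7/2
E_X2 = sum(p * x * x for x, p in dist.items())   # E[X^2] = 91/6
var = E_X2 - E_X**2                              # Var[X] = E[X^2] - (E[X])^2 = 35/12
sigma = float(var) ** 0.5                        # standard deviation ≈ 1.708
print(E_X, var, round(sigma, 3))
```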
Visualizing Sample Space • 1. Listing • S = {Head, Tail} • 2. Venn Diagram • 3. Contingency Table • 4. Decision Tree Diagram
Venn Diagram • Experiment: Toss 2 coins. Note faces. • Sample space S = {HH, HT, TH, TT}. • (Diagram: the event "Tail" = {TH, HT, TT} drawn as a region inside S, with the outcome HH outside it.)
Contingency Table • Experiment: Toss 2 coins. Note faces.

              2nd Coin
  1st Coin    Head      Tail      Total
  Head        HH        HT        HH, HT
  Tail        TH        TT        TH, TT
  Total       HH, TH    HT, TT    S

Sample space S = {HH, HT, TH, TT}. Each cell is an outcome; each row total is a simple event, e.g., "Head on 1st coin" = {HH, HT}.
Event Probability Using Contingency Table

             Event B1      Event B2      Total
  Event A1   P(A1 ∩ B1)    P(A1 ∩ B2)    P(A1)
  Event A2   P(A2 ∩ B1)    P(A2 ∩ B2)    P(A2)
  Total      P(B1)         P(B2)         1

The cell entries are joint probabilities; the row and column totals are marginal (simple) probabilities.
Marginal probability • Let S be partitioned into m×n disjoint sets, where the general subset is denoted Ei ∩ Fj. Then the marginal probability of Ei is Pr[Ei] = Σj=1..n Pr[Ei ∩ Fj].
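A sketch of marginalization over a joint distribution; the two-coin joint probabilities used here are illustrative:

```python
from fractions import Fraction

# Joint probabilities Pr[Ei ∩ Fj] for the two-coin experiment
# (E: face of 1st coin, F: face of 2nd coin), all 1/4 for fair coins.
joint = {
    ("H", "H"): Fraction(1, 4), ("H", "T"): Fraction(1, 4),
    ("T", "H"): Fraction(1, 4), ("T", "T"): Fraction(1, 4),
}

# Marginal probability: Pr[Ei] = sum over j of Pr[Ei ∩ Fj].
def marginal_E(e):
    return sum(p for (ei, _), p in joint.items() if ei == e)

print(marginal_E("H"))  # 1/2
```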
Tree Diagram • Experiment: Toss 2 coins. Note faces. • 1st coin H → 2nd coin H gives outcome HH; 2nd coin T gives HT. • 1st coin T → 2nd coin H gives TH; 2nd coin T gives TT. • Sample space S = {HH, HT, TH, TT}.
Discrete Random Variable • Possible values (outcomes) are discrete • E.g., natural numbers (0, 1, 2, 3, …) • Obtained by counting • Usually a finite number of values • But could be infinite (must be "countable")
Discrete Probability Distribution 1. List of all possible [x, p(x)] pairs • x = value of the random variable (outcome) • p(x) = probability associated with that value 2. Mutually exclusive (no overlap) 3. Collectively exhaustive (nothing left out) 4. 0 ≤ p(x) ≤ 1 5. Σ p(x) = 1
Visualizing Discrete Probability Distributions • Listing: {(0, .25), (1, .50), (2, .25)} — the number of tails in two coin tosses.

  # Tails x   Count f(x)   p(x)
  0           1            .25
  1           2            .50
  2           1            .25

Graph: a bar plot of p(x) vs. x. Equation: p(x) = n!/(x!(n − x)!) · p^x (1 − p)^(n−x).
Binomial Distribution 1. Sequence of n Identical Trials 2. Each Trial Has 2 Outcomes • ‘Success’ (Desired/specified Outcome) or ‘Failure’ 3. Constant Trial Probability • Trials Are Independent • # of successes in n trials is a binomial random variable
Binomial Probability Distribution Function • p(x) = n!/(x!(n − x)!) · p^x (1 − p)^(n−x) • p(x) = probability of x 'successes' • n = sample size • p = probability of 'success' • x = number of 'successes' in sample (x = 0, 1, 2, ..., n)
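A direct transcription of this PMF as a sketch (math.comb supplies the binomial coefficient); the two-coin check matches the table shown earlier:

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Pr[X = x] = C(n, x) * p**x * (1 - p)**(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Two fair-coin tosses, counting tails: matches {(0,.25), (1,.50), (2,.25)}.
print([binomial_pmf(x, n=2, p=0.5) for x in range(3)])  # [0.25, 0.5, 0.25]
```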
Binomial Distribution Characteristics • Mean: μ = E[X] = np • Standard deviation: σ = √(np(1 − p)) • (Plots shown for n = 5, p = 0.1 and for n = 5, p = 0.5.)
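The mean and standard deviation formulas, evaluated at the two parameter settings mentioned on the slide (function name is illustrative):

```python
from math import sqrt

def binomial_mean_std(n: int, p: float) -> tuple[float, float]:
    """Return (mu, sigma) with mu = n*p and sigma = sqrt(n*p*(1-p))."""
    return n * p, sqrt(n * p * (1 - p))

print(binomial_mean_std(5, 0.1))  # (0.5, ≈0.671)
print(binomial_mean_std(5, 0.5))  # (2.5, ≈1.118)
```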
Useful Observation 1 • For any X and Y: E[X + Y] = E[X] + E[Y]
One Binary Outcome • Random variable X, one binary outcome • Code success as 1, failure as 0 • P(success) = p, P(failure) = (1 − p) = q • E(X) = 1·p + 0·q = p
Mean of a Binomial • Independent, identically distributed • X1, …, Xn; E(Xi) = p; binomial X = X1 + ⋯ + Xn • E(X) = E(X1) + ⋯ + E(Xn) = np, by Useful Observation 1
Useful Observation 2 • For independent X and Y: E[XY] = E[X]·E[Y]
Useful Observation 3 • For independent X and Y: Var(X + Y) = Var(X) + Var(Y) • (The cross term 2(E[XY] − E[X]E[Y]) in the expansion is cancelled by Observation 2.)
Variance of Binomial • Independent, identically distributed • X1, …, Xn; E(Xi) = p; binomial X = X1 + ⋯ + Xn • Var(Xi) = E(Xi²) − E(Xi)² = p − p² = pq • Var(X) = Var(X1) + ⋯ + Var(Xn) = npq, by Observation 3
Useful Observation 4 • For any X: Var(X) = E(X²) − (E(X))²
Continuous Prob. Density Function 1. Mathematical formula 2. Shows all values x and frequencies f(x) • f(x) is not a probability 3. Properties: ∫ f(x) dx = 1 over all x (area under the curve); f(x) ≥ 0 for a ≤ x ≤ b
Continuous Random Variable Probability • P(c ≤ x ≤ d) = ∫ from c to d of f(x) dx • Probability is area under the curve!
Uniform Distribution 1. Equally likely outcomes 2. Probability density: f(x) = 1/(d − c) for c ≤ x ≤ d 3. Mean and standard deviation: μ = (c + d)/2; σ = (d − c)/√12 • (The density is flat over [c, d], and the mean equals the median.)
Uniform Distribution Example • You're the production manager of a soft-drink bottling company. You believe that when a machine is set to dispense 12 oz., it really dispenses 11.5 to 12.5 oz. inclusive. • Suppose the amount dispensed has a uniform distribution. • What is the probability that less than 11.8 oz. is dispensed?
Uniform Distribution Solution • f(x) = 1/(d − c) = 1/(12.5 − 11.5) = 1.0 • P(11.5 ≤ x ≤ 11.8) = (Base)(Height) = (11.8 − 11.5)(1.0) = 0.30
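The same solution as a small sketch; the function name is just illustrative:

```python
def uniform_prob(c: float, d: float, lo: float, hi: float) -> float:
    """P(lo <= X <= hi) for X ~ Uniform(c, d): base * height."""
    height = 1.0 / (d - c)
    return (hi - lo) * height

# Bottling example: amount ~ Uniform(11.5, 12.5); P(X < 11.8)?
print(uniform_prob(11.5, 12.5, 11.5, 11.8))  # ≈ 0.30 (floating point)
```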
Normal Distribution 1. Describes many random processes or continuous phenomena 2. Can be used to approximate discrete probability distributions • Example: binomial 3. Basis for classical statistical inference • A.k.a. Gaussian distribution
Normal Distribution 1. 'Bell-shaped' and symmetrical 2. Mean, median, and mode are equal 3. Random variable has infinite range * A light-tailed distribution
Probability Density Function • f(x) = (1/(σ√(2π))) · e^(−(1/2)((x − μ)/σ)²) • f(x) = frequency of random variable x • σ = population standard deviation • π = 3.14159…; e = 2.71828… • x = value of the random variable (−∞ < x < ∞) • μ = population mean
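A direct transcription of the density as a sketch (function name illustrative), evaluated at the standard normal peak as a sanity check:

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """f(x) = (1 / (sigma * sqrt(2*pi))) * exp(-0.5 * ((x - mu) / sigma)**2)."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

print(normal_pdf(0.0, mu=0.0, sigma=1.0))  # ≈ 0.3989, the N(0, 1) peak
```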
Normal Distribution Probability Probability is area under curve!
Infinite Number of Tables Normal distributions differ by mean & standard deviation. Each distribution would require its own table. That’s an infinite number!
Standardize the Normal Distribution • Z = (X − μ)/σ • Any normal distribution maps onto the standardized normal distribution (μ = 0, σ = 1). One table!
Intuitions on Standardizing • Subtracting μ from each value X just moves the curve around, so values are centered on 0 instead of on μ • Once the curve is centered, dividing each value by σ > 1 moves all values toward 0, compressing the curve
Standardizing Example • Normal distribution → standardized normal distribution: each value x is converted to z = (x − μ)/σ. (Figures of the original and standardized curves omitted.)
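A sketch of the standardization step with illustrative values (not from the slides):

```python
def standardize(x: float, mu: float, sigma: float) -> float:
    """z = (x - mu) / sigma: maps N(mu, sigma) onto the standard normal N(0, 1)."""
    return (x - mu) / sigma

# Illustrative: P(X < 6.2) for X ~ N(5, 10) reduces to P(Z < 0.12).
print(standardize(6.2, mu=5.0, sigma=10.0))  # ≈ 0.12
```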