This course covers probability trees, probability distributions, special distributions, and statistical modeling. Topics include ways of determining probabilities, combining events, conditional probabilities, and statistical independence.
Statistics & Data Analysis Course Number B01.1305 Course Section 31 Meeting Time Wednesday 6-8:50 pm CLASS #3
Class #3 Outline • Brief review of last class + Probability Trees • Questions on homework • Chapter 4: Probability Distributions • Chapter 5: Some Special Distributions
Review of Last Class • Introduction to probability • Ways of determining probabilities • Rules for combining probabilities • Conditional probabilities • Probability Trees
Combining Events • The union A ∪ B is the event consisting of all outcomes in A or in B or in both. • The intersection A ∩ B is the event consisting of all outcomes in both A and B. • If A ∩ B contains no outcomes, then A and B are said to be mutually exclusive. • The complement of the event A consists of all outcomes in the sample space S that are not in A.
Statistical Independence Events A and B are statistically independent if and only if P(B|A) = P(B). Otherwise, they are dependent. If events A and B are independent, then P(A ∩ B) = P(A)P(B).
Constructing Probability Trees • Events forming the first set of branches must have known marginal probabilities, must be mutually exclusive, and should exhaust all possibilities • Events forming the second set of branches must be entered at the tip of each first branch. Conditional probabilities, given the relevant first branch, must be entered unless an independence assumption allows the use of unconditional probabilities • Branches must always be mutually exclusive and exhaustive
Let’s Make a Deal • In the show Let’s Make a Deal, a prize is hidden behind one of three doors. The contestant picks one of the doors. • Before the chosen door is opened, one of the other two doors is opened to show that the prize isn’t behind it. • The contestant is offered the chance to switch to the remaining door. • Should the contestant switch? • Solve by making a tree…
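The tree for this problem can be enumerated directly. The sketch below is one possible way to do it, not the in-class solution. It assumes (none of this is stated on the slide) that the prize is equally likely to be behind each door, that the door opener knows where the prize is and never reveals it, and that a door is chosen at random when two could be opened. The variable names are illustrative.

```python
from fractions import Fraction

# Assumptions (not stated on the slide): the prize is equally likely to be
# behind each door; the door that gets opened is never the contestant's door
# or the prize door; when two doors qualify, one is chosen at random.
doors = [1, 2, 3]
pick = 1  # by symmetry, fix the contestant's first pick

p_win_stay = Fraction(0)
p_win_switch = Fraction(0)

for prize in doors:                       # first set of branches
    p_prize = Fraction(1, 3)
    openable = [d for d in doors if d != pick and d != prize]
    for opened in openable:               # second set of branches
        p_opened = Fraction(1, len(openable))
        p_path = p_prize * p_opened       # probability of this path
        remaining = next(d for d in doors if d not in (pick, opened))
        if prize == pick:
            p_win_stay += p_path
        if prize == remaining:
            p_win_switch += p_path

print("P(win if staying)   =", p_win_stay)    # 1/3
print("P(win if switching) =", p_win_switch)  # 2/3
```

Under these assumptions, switching wins twice as often as staying.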
Employee Drug Testing • A firm has a mandatory, random drug testing policy. • The testing procedure is not perfect. • If an employee uses drugs, the test will be positive with probability 0.90. • If an employee does not use drugs, the test will be negative 95% of the time. • Confidential sources say that 8% of the employees are drug users. • 8% is an unconditional probability; 90% and 95% are conditional probabilities.
Employee Drug Testing (cont.) • Create a probability tree and verify the following probabilities: • Probability of randomly selecting a drug user who tests positive = 0.072 • Probability of randomly selecting a non-user who tests positive = 0.046 • Probability of randomly selecting someone who tests positive = 0.118 • Conditional probability of testing positive given a non-drug user = 0.05
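The probabilities above can be checked by multiplying along the branches of the tree. This is a minimal sketch using only the three numbers given on the slides; the variable names are illustrative, and the last quantity (the chance that a positive test belongs to a user) is an extra that the slide does not ask for.

```python
# Inputs taken from the slides
p_user = 0.08               # unconditional: employee is a drug user
p_pos_given_user = 0.90     # conditional: test positive given user
p_neg_given_nonuser = 0.95  # conditional: test negative given non-user

# Complements
p_nonuser = 1 - p_user
p_pos_given_nonuser = 1 - p_neg_given_nonuser

# Multiply along each branch of the tree
p_user_and_pos = p_user * p_pos_given_user
p_nonuser_and_pos = p_nonuser * p_pos_given_nonuser

# Add the branches that end in a positive test
p_pos = p_user_and_pos + p_nonuser_and_pos

# Extra, not asked on the slide: P(user | positive)
p_user_given_pos = p_user_and_pos / p_pos

print(f"P(user and positive)     = {p_user_and_pos:.3f}")       # 0.072
print(f"P(non-user and positive) = {p_nonuser_and_pos:.3f}")    # 0.046
print(f"P(positive)              = {p_pos:.3f}")                # 0.118
print(f"P(positive | non-user)   = {p_pos_given_nonuser:.2f}")  # 0.05
print(f"P(user | positive)       = {p_user_given_pos:.3f}")     # 0.610
```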
Statistical Modeling • Statistical modeling is the process of creating mathematical representations that reflect physical phenomena and make use of any available data or information • Possible uses: • Description • Prediction • Optimization • Uncertainty analysis • Statistical models may integrate real-world data with results from physical experiments and computer codes.
Chapter Goals • To understand the concepts of probability distributions • To be able to calculate the expected value and standard deviation of a distribution • To understand the difference between sample mean and expected value
Random Variables • A quantity that takes on different values depending on chance • Next quarter’s sales for a given company • The proportion of interviewees that express an intention to buy • Your day-trading profits for next year • The number of free throws out of 10 a player makes • Number of defective products produced next week • A random variable is the result of a random experiment in the abstract sense, before the experiment is performed • The value the random variable actually assumes is called an observation • A probability distribution is the pattern of probabilities a random variable assumes
Random Variables (cont) • You can think of your data set as observations of a random variable resulting from several repetitions of a random experiment • We associate the random variable with a population and view observations of the random variable as data • Example: • Suppose we toss a coin five times. The observed data set is a sequence of zeros and ones, such as 1 1 0 1 0. Each of the five digits in this sequence represents the outcome of the random experiment of tossing a coin once, where 1 denotes Heads and 0 denotes Tails. We have five repetitions of the experiment.
Discrete Probability Distribution • A list of the possible values of a discrete random variable, together with their associated probabilities • The probability distribution tells us everything we can know about a random variable, before it becomes an observation • Example: Distribution of # Heads in Two Tosses • S = {HH, HT, TH, TT}
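The distribution in this example can be built by listing the sample space and counting heads in each outcome. The sketch below assumes a fair coin, so the four outcomes in S are equally likely; the names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses: HH, HT, TH, TT
sample_space = ["".join(tosses) for tosses in product("HT", repeat=2)]

# Count how many outcomes give 0, 1, or 2 heads
counts = Counter(outcome.count("H") for outcome in sample_space)

# Assumption: fair coin, so each outcome has probability 1/4
distribution = {
    heads: Fraction(n, len(sample_space)) for heads, n in sorted(counts.items())
}

for heads, prob in distribution.items():
    print(f"P(X = {heads}) = {prob}")
# P(X = 0) = 1/4
# P(X = 1) = 1/2
# P(X = 2) = 1/4
```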
Example: Discrete Distribution • When quality control testing entails destroying the tested product, only a sample of items is tested, for obvious economic reasons. • A plant produces cell phones in equal quantities on two production lines. • Quality control experts determined the plant should test three randomly selected phones: 2 from one line and 1 from the other, where the number from each line is chosen by flipping a coin three times • Construct the probability distribution of the number of phones chosen from Line 1 • Construct sample space • Probability • Random variable value
Example: Discrete Distribution • We want to conduct two one-on-one interviews with neurologists to get their opinions on an existing drug • Suppose that a random sample of two neurologists is to be selected from a population in which 70% have prescribed the drug at some point and 30% have not • Questions • List all possible outcomes of the selection • Assign probabilities • Define the quantitative variable Y as the number of neurologists in the sample who have prescribed the drug. Specify the possible values that the random variable can assume and determine the probability of each
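One way to work through this example is to list the four outcomes and multiply the branch probabilities. The sketch assumes the two selections are independent, which is reasonable only if the population of neurologists is large relative to the sample; the labels "P" (has prescribed) and "N" (has not) are illustrative.

```python
from itertools import product

# "P" = has prescribed the drug, "N" = has not (labels are illustrative)
p = {"P": 0.7, "N": 0.3}

# Assumption: the two selections are independent (large population)
outcomes = {}
for first, second in product("PN", repeat=2):
    outcomes[first + second] = p[first] * p[second]

# Y = number of prescribers in the sample of two
dist_y = {}
for outcome, prob in outcomes.items():
    y = outcome.count("P")
    dist_y[y] = dist_y.get(y, 0.0) + prob

for outcome, prob in outcomes.items():
    print(f"{outcome}: {prob:.2f}")
for y in sorted(dist_y):
    print(f"P(Y = {y}) = {dist_y[y]:.2f}")
# P(Y = 0) = 0.09
# P(Y = 1) = 0.42
# P(Y = 2) = 0.49
```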
Example: CDF • Suppose a financial firm plans to release a new fund. The fund manager has assessed the following subjective probabilities for the first-year return for a $10,000 investment • Find the following probabilities, as assessed by the fund manager:
Example: Expected Value • A firm is considering two possible investments. The firm assigns rough probabilities of losing 20% per dollar invested, losing 10%, breaking even, gaining 10%, and gaining 20%. • Let Y be the return per dollar invested in the first project and Z the return per dollar invested in the second.
Standard Deviation • Measure of probability dispersion, variability, or risk of a random variable
Example: Standard Deviation • A firm is considering two possible investments. The firm assigns rough probabilities of losing 20% per dollar invested, losing 10%, breaking even, gaining 10%, and gaining 20%. • Let Y be the return per dollar invested in the first project and Z the return per dollar invested in the second.
Standard Deviation (cont) • Note: σ² and σ defined above are theoretical variance and standard deviation of X. You don’t need any data to compute them. You just need to know the distribution of X. • The mean of a random variable is NOT the same thing as a sample mean. The variance of a random variable is NOT the same thing as a sample variance.
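The distinction can be made concrete with a short sketch. It assumes the standard definitions for a discrete random variable, μ = Σ x·P(x) and σ² = Σ (x − μ)²·P(x), and uses the number of heads in two fair coin tosses as the distribution. The function name `mean_and_sd`, the seed, and the sample size of 20 are arbitrary choices for illustration.

```python
import math
import random

def mean_and_sd(distribution):
    """Theoretical mean and standard deviation of a discrete distribution.

    `distribution` maps each possible value x to its probability P(x).
    """
    mu = sum(x * p for x, p in distribution.items())
    variance = sum((x - mu) ** 2 * p for x, p in distribution.items())
    return mu, math.sqrt(variance)

# Number of heads in two tosses of a fair coin
heads_in_two_tosses = {0: 0.25, 1: 0.50, 2: 0.25}

# Theoretical values: no data needed, only the distribution
mu, sigma = mean_and_sd(heads_in_two_tosses)
print(f"expected value = {mu:.4f}, standard deviation = {sigma:.4f}")
# expected value = 1.0000, standard deviation = 0.7071

# A sample mean, by contrast, comes from observations and varies by sample
rng = random.Random(0)  # arbitrary seed
sample = [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(20)]
sample_mean = sum(sample) / len(sample)
print(f"sample mean of 20 observations = {sample_mean:.4f}")
```

Rerunning with a different seed changes the sample mean but not the expected value.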
Continuous Random Variables • As with discrete probability distributions, one can also determine the expected value and standard deviation of continuous probability distributions • Mathematical definitions for continuous random variables necessarily involve calculus • Sums are simply replaced by integrals • This is beyond the expectations for this class
Example • A call option on a stock is being evaluated. • If the stock goes down, the option expires and is worthless. If it goes up, the payoff depends on how high the stock goes. • Assume a discrete payoff distribution: • Questions: • What is the expected value of the payoff? • What is the standard deviation of the payoff? • Find the probability that the option will pay at least $15 • Find the probability that the option will pay less than $20
Chapter 5 Some Special Probability Distributions
Chapter Goals • Introduce some special, often used distributions • Understand methods for counting the number of sequences • Understand situations consisting of a specified number of distinct success/failure trials • Understand random variables that follow a bell-shaped distribution
Counting Possible Outcomes • In order to calculate probabilities, we often need to count how many different ways there are to do some activity • For example, how many different outcomes are there from tossing a coin three times? • To help us to count accurately, we need to learn some counting rules • Multiplication Rule: If there are m ways of doing one thing and n ways of doing another thing, there are m times n ways of doing both
Example • An auto dealer wants to advertise that for $20,000 you can buy either a convertible or 4-door car with your choice of either wire or solid wheel covers. • How many different arrangements of models and wheel covers can the dealer offer?
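The multiplication rule can be checked by listing every combination. This is a small sketch; the list names are illustrative, and the three-toss coin count from the previous slide is included for comparison.

```python
from itertools import product

models = ["convertible", "4-door"]
wheel_covers = ["wire", "solid"]

# Every (model, wheel cover) pairing the dealer can offer
arrangements = list(product(models, wheel_covers))
for model, cover in arrangements:
    print(model, "with", cover, "wheel covers")

# Multiplication rule: m ways times n ways
print(len(arrangements))  # 4

# Same rule applied three times: outcomes of tossing a coin three times
coin_outcomes = list(product("HT", repeat=3))
print(len(coin_outcomes))  # 8
```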
Counting Rules • Recall the classical interpretation of probability: P(event) = number of outcomes favoring event / total number of outcomes • Need methods for counting possible outcomes without the labor of listing entire sample space • Counting methods arise as answers to: • How many sequences of k symbols can be formed from a set of r distinct symbols using each symbol no more than once? • How many subsets of k symbols can be formed from a set of r distinct symbols using each symbol no more than once? • Difference between a sequence and a subset is that order matters for a sequence, but not for a subset
Counting Rules (cont) • Create all k=3 letter subsets and sequences of the r=5 letters: A, B, C, D and E • How many sequences are there? • How many subsets are there?
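For a set this small, the sequences and subsets can simply be listed and counted. The sketch below does that and compares the counts with the standard formulas r!/(r−k)! for sequences and r!/(k!(r−k)!) for subsets.

```python
import math
from itertools import combinations, permutations

letters = "ABCDE"  # r = 5 distinct symbols
k = 3

# Sequences: order matters (ABC and BAC are different)
sequences = ["".join(s) for s in permutations(letters, k)]

# Subsets: order does not matter (ABC and BAC are the same subset)
subsets = ["".join(s) for s in combinations(letters, k)]

print(len(sequences), math.perm(len(letters), k))  # 60 60
print(len(subsets), math.comb(len(letters), k))    # 10 10
print(subsets)
# ['ABC', 'ABD', 'ABE', 'ACD', 'ACE', 'ADE', 'BCD', 'BCE', 'BDE', 'CDE']
```

Each subset of 3 letters can be ordered in 3! = 6 ways, which is why there are 6 times as many sequences as subsets.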
Example • A group of three electronic parts is to be assembled into a plug-in unit for a TV set • The parts can be assembled in any order • How many different ways can they be assembled? • There are eight machines but only three spaces on the machine shop floor. • How many different ways can eight machines be arranged in the three available spaces? • The paint department needs to assign color codes for 42 different parts. Three colors are to be used for each part. How many colors, taken three at a time, would be adequate to color-code the 42 parts?
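The three questions above can be sketched with the counting functions in Python's `math` module. For the color-code question, the sketch assumes that the order of the three colors on a part does not matter (a subset, not a sequence); the slide does not say so explicitly, and the answer would differ if order mattered.

```python
import math

# Three parts assembled in any order: sequences of all 3 parts
orderings_of_parts = math.factorial(3)
print(orderings_of_parts)  # 6

# Eight machines, three spaces: sequences of 3 chosen from 8
machine_arrangements = math.perm(8, 3)
print(machine_arrangements)  # 336

# Color codes: smallest number of colors whose 3-color subsets cover 42 parts
# Assumption: the order of colors on a part does not matter
parts_to_code = 42
n_colors = 3
while math.comb(n_colors, 3) < parts_to_code:
    n_colors += 1
print(n_colors, math.comb(n_colors, 3))  # 8 56
print(math.comb(7, 3))                   # 35, so seven colors fall short
```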
Review: Sequence and Subset • For a sequence, the order of the objects matters: each different ordering counts as a different outcome • For a subset, order of the objects is not important
Binomial Distribution • Percentages play a major role in business • When a percentage is determined by counting the number of times something happens out of the total possibilities, the occurrences might follow a binomial distribution • Examples: • Number of defective products out of 10 items • Of 100 people interviewed, number who expressed intention to buy • Number of female employees in a group of 75 people • Number of Independent Party votes cast in the next election
Binomial Distribution (cont) • Each time the random experiment is run, either the event happens or it doesn’t • The random variable X, defined as the number of occurrences of a particular event out of n trials has a binomial distribution if: • For each of the n trials, the event always has the same probability of happening • The trials are independent of one another
Example: Binomial Distribution • You are interested in the next n=3 calls to a catalog order desk and know from experience that 60% of calls will result in an order • What can we say about the number of calls that will result in an order? • Create a probability tree • Questions: • What is the expected number of calls resulting in an order? • What is the standard deviation?
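The probability tree for three calls has 2 × 2 × 2 = 8 paths, which can be enumerated directly. The sketch assumes the calls are independent and each results in an order with probability 0.60, as the binomial conditions require; the names are illustrative.

```python
import math
from itertools import product

n, p = 3, 0.60  # three calls, each an order with probability 0.60

# Walk every path of the tree: 1 = order, 0 = no order
dist = {}
for calls in product([1, 0], repeat=n):
    path_prob = math.prod(p if c else 1 - p for c in calls)
    x = sum(calls)  # number of orders on this path
    dist[x] = dist.get(x, 0.0) + path_prob

for x in sorted(dist):
    print(f"P(X = {x}) = {dist[x]:.3f}")
# P(X = 0) = 0.064
# P(X = 1) = 0.288
# P(X = 2) = 0.432
# P(X = 3) = 0.216

# Expected value and standard deviation from the distribution
mean = sum(x * pr for x, pr in dist.items())
sd = math.sqrt(sum((x - mean) ** 2 * pr for x, pr in dist.items()))
print(f"mean = {mean:.3f}, sd = {sd:.3f}")  # mean = 1.800, sd = 0.849
```

The results agree with the binomial shortcuts n·p = 1.8 and √(n·p·(1 − p)) ≈ 0.849.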
Example: Binomial Probabilities • How many of your n=6 major customers will call tomorrow? • There is a 25% chance that each will call • Questions: • How many do you expect to call? • What is the standard deviation? • What is the probability that exactly 2 call? • What is the probability that more than 4 call?
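These questions can be answered with the binomial probability formula P(X = k) = C(n, k)·p^k·(1 − p)^(n − k). The helper `binom_pmf` below is a hypothetical name for a hand-written function, not a library call, and the sketch assumes the six customers call independently.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials, success prob p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 6, 0.25  # six customers, each calls with probability 0.25

expected_calls = n * p
sd_calls = math.sqrt(n * p * (1 - p))
p_exactly_2 = binom_pmf(2, n, p)
p_more_than_4 = sum(binom_pmf(k, n, p) for k in range(5, n + 1))  # X = 5 or 6

print(f"expected calls  = {expected_calls:.2f}")  # 1.50
print(f"std deviation   = {sd_calls:.4f}")        # 1.0607
print(f"P(exactly 2)    = {p_exactly_2:.4f}")     # 0.2966
print(f"P(more than 4)  = {p_more_than_4:.4f}")   # 0.0046
```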
Example • It’s been a terrible day for the capital markets with losers beating winners 4 to 1 • You are evaluating a mutual fund composed of 15 randomly selected stocks and will assume a binomial distribution for the number of securities that lost value • Questions: • What assumptions are being made? • How many securities do you expect to lose value? • Find the probability that exactly 8 securities lose value • What is the probability that 12 or more lose value?
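A sketch of the calculation follows. It reads "4 to 1" as a probability of 4/5 = 0.8 that any given stock lost value, and it assumes the 15 stocks behave independently with that same probability. Both are modeling assumptions of the kind the first question asks about, and real stocks on a bad market day are unlikely to be independent. The helper `binom_pmf` is a hypothetical hand-written function.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials, success prob p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n = 15   # stocks in the fund
p = 0.8  # assumption: "4 to 1" read as P(stock lost value) = 4/5

expected_losers = n * p
p_exactly_8 = binom_pmf(8, n, p)
p_12_or_more = sum(binom_pmf(k, n, p) for k in range(12, n + 1))

print(f"expected losers = {expected_losers:.1f}")  # 12.0
print(f"P(exactly 8)    = {p_exactly_8:.4f}")      # 0.0138
print(f"P(12 or more)   = {p_12_or_more:.4f}")     # 0.6482
```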
Computing Tutorial • Simulation • Calculating probabilities
Homework #3 To be handed out in class
Next Time • Normal distribution • Statistical Inference