Chapter 6 Random Variables and Probability Distributions Created by Kathy Fritz
What are possible values for x? Consider the chance experiment of randomly selecting a customer who is leaving a store. One numerical variable of interest to the store manager might be the number of items purchased by the customer. Let’s use the letter x to denote this variable. Until a customer is selected and the number of items counted, the value of x is uncertain. In this example, the possible values of x are isolated points on the number line. Another variable of interest might be y = number of minutes spent in a checkout line. The possible y values form an entire interval on the number line: one possible value of y is 3.0 minutes and another is 4.0 minutes, but any other number between 3.0 and 4.0 is also a possibility.
Random Variable In this chapter, we will look at different distributions of discrete and continuous random variables. A random variable is a numerical variable whose value depends on the outcome of a chance experiment. A random variable associates a numerical value with each outcome of a chance experiment. • A random variable is discrete if its possible values are isolated points along the number line. This is typically a “count” of something. • A random variable is continuous if its possible values are all points in some interval. This is typically a “measure” of something.
Identify the following variables as discrete or continuous • The number of items purchased by each customer: discrete • The amount of time spent in the checkout line by each customer: continuous • The weight of a pineapple: continuous • The number of gas pumps in use: discrete
Probability Distributions for Discrete Random Variables Properties
In Wolf City (a fictional place), regulations prohibit more than five dogs or cats per household. Let x = the number of dogs or cats per household in Wolf City. Is this variable discrete or continuous? What are the possible values for x? The possible values are 0, 1, 2, 3, 4, and 5, so x is discrete. Although you know what the possible values for x are, it would also be useful to know how this variable would behave if it were observed for many houses. A discrete probability distribution provides this information.
Discrete Probability Distribution The probability distribution of a discrete random variable x gives the probability associated with each possible x value. Each probability is the long-run proportion of the time that the corresponding x value will occur. Common ways to display a probability distribution for a discrete random variable are a table, probability histogram, or formula. If one possible value of x is 2, it is common to write p(2) in place of P(x = 2).
Properties of Discrete Probability Distributions 1) For every possible x value, 0 ≤ P(x) ≤ 1. 2) The sum of P(x) over all values of x is equal to one: ΣP(x) = 1.
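The two properties can be checked mechanically. The sketch below is only an illustration: the function name `is_valid_distribution` and the Wolf City probabilities are hypothetical (the slides do not give them).

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs.values())
    sums_to_one = math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Hypothetical probabilities for x = 0..5 pets per household (not from the slides).
wolf_city = {0: 0.30, 1: 0.35, 2: 0.20, 3: 0.08, 4: 0.05, 5: 0.02}

print(is_valid_distribution(wolf_city))         # satisfies both properties
print(is_valid_distribution({0: 0.5, 1: 0.6}))  # probabilities sum to 1.1
```

A table that fails either check cannot be a probability distribution, whatever the context.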
Suppose that each of four randomly selected customers purchasing a refrigerator at an appliance store chooses either an energy-efficient model (E) or one from a less expensive group of models (G) that do not have an energy-efficient rating. Assume that these customers make their choices independently of one another and that 40% of all customers select an energy-efficient model. Consider the next four customers. Let: x = the number of energy-efficient refrigerators purchased by the four customers What are the possible values for x? 0, 1, 2, 3, 4
Refrigerators continued . . . x = the number of energy-efficient refrigerators purchased by the four customers P(0) = P(GGGG) = 0.6(0.6)(0.6)(0.6) = 0.1296 P(1) = P(EGGG) + P(GEGG) + P(GGEG) + P(GGGE) = 0.0864 + 0.0864 + 0.0864 + 0.0864 = 0.3456 Similarly, P(2) = 0.3456, P(3) = 0.1536, and P(4) = 0.0256. The probability distribution of x is summarized in the following table:
x      0       1       2       3       4
p(x)   0.1296  0.3456  0.3456  0.1536  0.0256
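The calculation above can be reproduced by listing all 16 outcomes (EEEE, EEEG, ..., GGGG), multiplying the individual probabilities (valid because the choices are assumed independent), and grouping by the number of E's. This is a minimal sketch; the variable names are illustrative.

```python
from itertools import product

P_E = 0.4  # probability a customer chooses an energy-efficient model (from the slide)

# dist[k] accumulates P(x = k) for k = 0..4.
dist = {k: 0.0 for k in range(5)}
for outcome in product("EG", repeat=4):
    prob = 1.0
    for choice in outcome:
        prob *= P_E if choice == "E" else 1 - P_E
    dist[outcome.count("E")] += prob

for k, p in dist.items():
    print(k, round(p, 4))
```

The five probabilities sum to 1, as the second property of discrete distributions requires.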
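Refrigerators continued . . . The probability distribution can be used to determine probabilities of various events involving x. For example, the probability that at least two of the four customers choose energy-efficient models is P(x ≥ 2) = p(2) + p(3) + p(4) = 0.3456 + 0.1536 + 0.0256 = 0.5248. This means that in the long run, a group of four refrigerator purchasers will include at least two who select energy-efficient models about 52.48% of the time.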
Refrigerators continued . . . Does this include the x value of 2? In discrete probability distributions, pay close attention to whether the value in the probability statement is included (≤ or ≥) or the value is not included (< or >).
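To see why the inequality sign matters, the sketch below compares P(x ≥ 2) with P(x > 2) for the refrigerator example. It rebuilds the distribution from the stated setup (four independent customers, each choosing E with probability 0.4); the variable names are illustrative.

```python
from math import comb

N, P_E = 4, 0.4  # four customers, 40% choose an energy-efficient model

# p[k] = (number of outcomes with k E's) * P(one such outcome)
p = {k: comb(N, k) * P_E**k * (1 - P_E) ** (N - k) for k in range(N + 1)}

at_least_two = sum(prob for x, prob in p.items() if x >= 2)   # includes x = 2
more_than_two = sum(prob for x, prob in p.items() if x > 2)   # excludes x = 2

print(round(at_least_two, 4), round(more_than_two, 4))
```

The two answers differ by exactly p(2), which is why the wording of a discrete probability statement must be read carefully.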
Refrigerators continued . . . A probability histogram is a graphical representation of a discrete probability distribution. The graph has a rectangle centered above each possible value of x. The area of each rectangle is proportional to the probability of the corresponding value.
Probability Distributions for Continuous Random Variables Properties
Consider the random variable: x = the weight (in pounds) of a full-term newborn child. What type of variable is this? Suppose that weight is reported to the nearest pound. The following probability histogram displays the distribution of weights. The area of the rectangle centered over 7 pounds represents the probability 6.5 < x < 7.5. What is the sum of the areas of all the rectangles? Now suppose that weight is reported to the nearest 0.1 pound. This would be the probability histogram. Notice that the rectangles are narrower and the histogram begins to have a smoother appearance. If weight is measured with greater and greater accuracy, the histogram approaches a smooth curve. This is an example of a density curve. The shaded area represents the probability 6 < x < 8.
Probability Distributions for Continuous Variables A probability distribution for a continuous random variable x is specified by a curve called a density curve. The function that describes this curve is denoted by f(x) and is called the density function. The probability that x falls in any particular interval is the area under the density curve and above the interval.
Properties of continuous probability distributions 1. f(x) ≥ 0 (the curve cannot dip below the horizontal axis) 2. The total area under the density curve equals one.
Suppose x is a continuous random variable defined as the amount of time (in minutes) taken by a clerk to process a certain type of application form. Suppose x has a probability distribution with density function: f(x) = 0.5 for 4 < x < 6, and f(x) = 0 otherwise. The following is the graph of f(x), the density curve: [Figure: horizontal density curve; horizontal axis: Time (in minutes); vertical axis: Density] When the density is constant over an interval (resulting in a horizontal density curve), the probability distribution is called a uniform distribution. Why is the height of this density curve 0.5?
Application Problem Continued . . . What is the probability that it takes at least 5.5 minutes to process the application form? Find the probability by calculating the area of the shaded region (base × height). P(x ≥ 5.5) = (6 − 5.5)(0.5) = 0.25
Application Problem Continued . . . What is the probability that it takes exactly 5.5 minutes to process the application form? x = 5.5 is represented by a line segment. What is the area of this line segment? P(x = 5.5) = 0
Application Problem Continued . . . What is the probability that it takes more than 5.5 minutes to process the application form? P(x > 5.5) = (6 − 5.5)(0.5) = 0.25 In continuous probability distributions, P(x > a) and P(x ≥ a) are equal!
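The base-times-height calculation can be written as a small helper. This is a sketch under an assumption: the processing time is taken to be uniform on [4, 6] minutes, which is inferred from the height 0.5 and the upper endpoint 6 used in the slides. The function name `uniform_prob` is hypothetical.

```python
def uniform_prob(lo, hi, a, b):
    """P(lo <= x <= hi) for x uniform on [a, b], computed as base * height."""
    height = 1.0 / (b - a)                 # total area must equal 1
    left, right = max(lo, a), min(hi, b)   # clip the interval to [a, b]
    base = max(0.0, right - left)
    return base * height

# Assumed support [4, 6] minutes (inferred, not stated explicitly above).
print(uniform_prob(5.5, 6.0, 4.0, 6.0))   # at least 5.5 minutes
print(uniform_prob(5.5, 5.5, 4.0, 6.0))   # exactly 5.5 minutes: zero-width base
```

Because a single point has a base of width zero, including or excluding the endpoint does not change the area.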
Two hundred packages shipped using the Priority Mail rate for packages less than 2 pounds were weighed, resulting in a sample of 200 observations of the variable x = package weight (in pounds) from the population of all Priority Mail packages under 2 pounds. A histogram (using the density scale, where height = (relative frequency)/(interval width)) of 200 weights is shown below. The shape of this histogram suggests that a reasonable model for the distribution of x might be a triangular distribution.
What proportion of the Priority Mail packages weigh over 1.5 pounds? The easiest way to find the area of the shaded region is to find 1 − the area of x ≤ 1.5. The region x ≤ 1.5 is a triangle with base b = 1.5 and height h = 0.75, so P(x > 1.5) = 1 − ½(1.5)(0.75) = 1 − 0.5625 = 0.4375.
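The complement calculation can be sketched in code. It assumes the triangular density rises linearly from 0 at x = 0 to 1 at x = 2, i.e. f(x) = 0.5x, which is consistent with the height 0.75 at x = 1.5 but is not stated explicitly in the slides. The function name is illustrative.

```python
def triangle_area_left_of(c, slope=0.5):
    """Area under f(x) = slope * x between 0 and c (a right triangle)."""
    base = c
    height = slope * c
    return 0.5 * base * height

p_at_most = triangle_area_left_of(1.5)   # P(x <= 1.5)
p_over = 1 - p_at_most                   # P(x > 1.5), by the complement rule

print(p_at_most, p_over)
```

As a sanity check, the area to the left of 2 pounds is 1 under this assumed density, so it is a legitimate density curve on [0, 2].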
Students at a university use an online registration system to register for courses. The variable x = length of time (in minutes) required for a student to register was recorded for a large number of students using the system. The resulting values were used to construct a probability histogram (below). The general form of the histogram can be described as bell shaped and symmetric. A smooth curve has been superimposed on the histogram and is a reasonable model for the probability distribution of x. How can you find the area under this smooth curve?
Some density curves resemble the one below. Integral calculus is used to find the area under these curves. Don’t worry: we will use tables (with the values already calculated). We can also use calculators or statistical software to find the area.
The probability that a continuous random variable x lies between a lower limit a and an upper limit b is P(a < x < b) = (cumulative area to the left of b) − (cumulative area to the left of a) = P(x < b) − P(x < a)
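Statistical software applies the same subtraction of cumulative areas. The sketch below uses Python's standard-library `statistics.NormalDist` as a stand-in for a bell-shaped density curve; the choice of a standard normal distribution (mean 0, standard deviation 1) is an assumption made only for illustration.

```python
from statistics import NormalDist

def prob_between(dist, a, b):
    """P(a < x < b) = (area to the left of b) - (area to the left of a)."""
    return dist.cdf(b) - dist.cdf(a)

# Hypothetical bell-shaped model: mean 0, standard deviation 1.
bell = NormalDist(mu=0.0, sigma=1.0)

print(round(prob_between(bell, -1.0, 1.0), 4))
```

The `cdf` method plays the role of the printed table: it returns the cumulative area to the left of a value.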
Mean and Standard Deviation of a Random Variable Of Discrete Random Variables Of Continuous Random Variables
Means and Standard Deviations of Probability Distributions The mean value of a random variable x, denoted by μ_x, describes where the probability distribution of x is centered. The standard deviation of a random variable x, denoted by σ_x, describes variability in the probability distribution. When the value of σ_x is small, observed values of x will tend to be close to the mean value. The larger the value of σ_x, the more variability there will be in observed x values.
How do the means and standard deviations of these three density curves compare? These two density curves have the same mean but different standard deviations. What happens to the appearance of the density curve as the standard deviation increases?
Mean Value for a Discrete Random Variable The mean value of a discrete random variable x, denoted by μ_x, is computed by first multiplying each possible x value by the probability of observing that value and then adding the resulting quantities. Symbolically, μ_x = Σ x · p(x), where the sum is taken over all possible x values. The term expected value is sometimes used in place of mean value, and E(x) is another way to denote μ_x.
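As an illustration of the formula, the sketch below computes the mean for the refrigerator example (four customers, each choosing an energy-efficient model with probability 0.4). The function name `mean` is illustrative.

```python
# Refrigerator distribution: x -> p(x), for n = 4 customers and P(E) = 0.4.
p = {0: 0.1296, 1: 0.3456, 2: 0.3456, 3: 0.1536, 4: 0.0256}

def mean(dist):
    """Mean of a discrete random variable: the sum of x * p(x) over all x."""
    return sum(x * px for x, px in dist.items())

mu = mean(p)
print(round(mu, 4))
```

The result, 1.6, is the long-run average number of energy-efficient purchases per group of four customers; it need not be a possible value of x.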
Individuals applying for a certain license are allowed up to four attempts to pass the licensing exam. Consider the random variable x = the number of attempts made by a randomly selected applicant The probability distribution of x is as follows:
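Standard Deviation for a Discrete Random Variable The variance of a discrete random variable x, denoted by σ_x², is computed by first subtracting the mean from each possible x value to obtain the deviations, then squaring each deviation and multiplying the result by the probability of the corresponding x value, and finally adding these quantities. Symbolically, σ_x² = Σ (x − μ_x)² · p(x), where the sum is taken over all possible x values. The standard deviation of x, denoted by σ_x, is the square root of the variance.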
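A minimal sketch of the variance and standard deviation calculation for a discrete random variable, again using the refrigerator example (four customers, P(E) = 0.4) as the input. The function name `mean_and_sd` is illustrative.

```python
import math

# Refrigerator distribution: x -> p(x), for n = 4 customers and P(E) = 0.4.
p = {0: 0.1296, 1: 0.3456, 2: 0.3456, 3: 0.1536, 4: 0.0256}

def mean_and_sd(dist):
    """Return (mean, standard deviation) of a discrete random variable."""
    mu = sum(x * px for x, px in dist.items())
    # Variance: probability-weighted average of squared deviations from the mean.
    variance = sum((x - mu) ** 2 * px for x, px in dist.items())
    return mu, math.sqrt(variance)

mu, sigma = mean_and_sd(p)
print(round(mu, 4), round(sigma, 4))
```

Here the variance is 0.96, so the standard deviation is a little under 1 refrigerator.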
Mean and Standard Deviation When x is Continuous For continuous probability distributions, μ_x and σ_x can be defined and computed using methods from calculus. The mean value μ_x locates the center of the continuous distribution and gives the approximate long-run average of observed x values. The standard deviation, σ_x, measures the extent to which the continuous distribution (density curve) spreads out around μ_x and indicates the amount of variability that can be expected in observed x values.
A company can purchase concrete of a certain type from two different suppliers. Let x = compression strength of a randomly selected batch from Supplier 1 and y = compression strength of a randomly selected batch from Supplier 2. Suppose that μ_x = 4650 pounds/inch², σ_x = 200 pounds/inch², μ_y = 4500 pounds/inch², and σ_y = 275 pounds/inch². Which supplier should the company purchase the concrete from? Explain. The density curves look similar to these below. [Figure: two density curves centered at μ_y and μ_x; horizontal axis ticks at 4300, 4500, 4700, 4900]
Consider the experiment in which a customer of a propane gas company is randomly selected. Suppose that the mean and standard deviation of the random variable x = number of gallons required to fill a propane tank are 318 gallons and 42 gallons, respectively. The company is considering two different pricing models. Model 1: $3 per gallon Model 2: service charge of $50 + $2.80 per gallon The company is interested in the variable y = amount billed. For each of the two models, y can be expressed as a function of the random variable x: y (Model 1) = 3x and y (Model 2) = 50 + 2.8x
Revisit the propane gas company . . . μ_x = 318 gallons, σ_x = 42 gallons. The company is considering two different pricing models. Model 1: $3 per gallon Model 2: service charge of $50 + $2.80 per gallon For a linear function y = a + bx, μ_y = a + bμ_x and σ_y = |b|σ_x. Model 1: μ_y = 3(318) = $954 and σ_y = 3(42) = $126. Model 2: μ_y = 50 + 2.8(318) = $940.40 and σ_y = 2.8(42) = $117.60. The mean billing amount for Model 1 is a bit higher than for Model 2, as is the variability in billing amounts. Model 2 results in slightly more consistency from bill to bill in the amount charged.
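The comparison of the two pricing models rests on the standard rule for a linear function y = a + bx: the mean becomes a + b·(mean of x) and the standard deviation becomes |b|·(standard deviation of x). A short sketch, with an illustrative function name:

```python
def linear_transform_stats(a, b, mean_x, sd_x):
    """Mean and standard deviation of y = a + b*x."""
    return a + b * mean_x, abs(b) * sd_x

MEAN_GALLONS, SD_GALLONS = 318, 42   # from the propane example

model_1 = linear_transform_stats(0, 3.0, MEAN_GALLONS, SD_GALLONS)    # y = 3x
model_2 = linear_transform_stats(50, 2.8, MEAN_GALLONS, SD_GALLONS)   # y = 50 + 2.8x

print("Model 1: mean = %.2f, sd = %.2f" % model_1)
print("Model 2: mean = %.2f, sd = %.2f" % model_2)
```

The fixed service charge shifts the mean but has no effect on the standard deviation; only the per-gallon rate scales the variability.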
Let’s consider a different type of problem . . . Suppose that you have three tasks that you plan to do on the way home: return a library book, deposit a paycheck, and purchase printer paper. x1 = time required to return book x2 = time required to deposit check x3 = time required to buy printer paper You can define a new variable, y, to represent the total amount of time to complete these tasks: y = x1 + x2 + x3
Linear Combinations If x1, x2, …, xn are random variables and a1, a2, …, an are numerical constants, the random variable y defined as y = a1x1 + a2x2 + … + anxn is a linear combination of the xi’s. Let’s see how to compute the mean, variance, and standard deviation of a linear combination.
Mean and Standard Deviations for Linear Combinations If x1, x2, …, xn are random variables with means μ1, μ2, …, μn and variances σ1², σ2², …, σn², respectively, and y = a1x1 + a2x2 + … + anxn, then 1. μ_y = a1μ1 + a2μ2 + … + anμn. This result is true regardless of whether the xi’s are independent. 2. σ_y² = a1²σ1² + a2²σ2² + … + an²σn² and σ_y = √(a1²σ1² + a2²σ2² + … + an²σn²). This result is true ONLY if the xi’s are independent.
A commuter airline flies small planes between San Luis Obispo and San Francisco. For small planes the baggage weight is a concern. Suppose it is known that the variable x = weight (in pounds) of baggage checked by a randomly selected passenger has a mean and standard deviation of 42 and 16, respectively. Consider a flight on which 10 passengers, all traveling alone, are flying. The total weight of checked baggage, y, is y = x1 + x2 + … + x10
Airline Problem Continued . . . μ_x = 42 and σ_x = 16 The total weight of checked baggage, y, is y = x1 + x2 + … + x10 What is the mean total weight of the checked baggage? μ_y = μ1 + μ2 + … + μ10 = 42 + 42 + … + 42 = 420 pounds
Airline Problem Continued . . . μ_x = 42 and σ_x = 16 The total weight of checked baggage, y, is y = x1 + x2 + … + x10 What is the standard deviation of the total weight of the checked baggage? Since the 10 passengers are all traveling alone, it is reasonable to think that the 10 baggage weights are unrelated and therefore independent. Then σ_y² = σ1² + σ2² + … + σ10² = 16² + 16² + … + 16² = 10(256) = 2560, so σ_y = √2560 ≈ 50.6 pounds.
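The airline calculation is an instance of the general rule for linear combinations of independent random variables: means add (weighted by the constants) and variances add (weighted by the squared constants). A minimal sketch, with an illustrative function name; the variance step assumes independence, as the slide does.

```python
import math

def linear_combination_stats(coeffs, means, sds):
    """Mean and sd of y = a1*x1 + ... + an*xn, assuming the xi are independent."""
    mean_y = sum(a * m for a, m in zip(coeffs, means))
    var_y = sum((a * s) ** 2 for a, s in zip(coeffs, sds))  # needs independence
    return mean_y, math.sqrt(var_y)

n = 10  # passengers, each with mean 42 lb and sd 16 lb of checked baggage
mean_y, sd_y = linear_combination_stats([1] * n, [42] * n, [16] * n)

print(mean_y, round(sd_y, 2))
```

Note that the standard deviation of the total is about 50.6 pounds, not 10 × 16 = 160 pounds: standard deviations do not add, variances do.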
Binomial and Geometric Distributions Properties of Binomial Distributions Mean of Binomial Distributions Standard Deviation of Binomial Distributions Properties of Geometric Distributions
Suppose we decide to record the gender of the next 25 newborns at a particular hospital. What is the chance that at least 15 are female? What is the chance that between 10 and 15 are female? Out of the 25 newborns, how many can we expect to be female? These questions can be answered using a binomial distribution.
Properties of a Binomial Experiment A binomial experiment consists of a sequence of trials with the following conditions: • There are a fixed number of trials • Each trial results in one of only two possible outcomes, labeled success (S) and failure (F). • Outcomes of different trials are independent • The probability of success is the same for each trial. The binomial random variable x is defined as x = the number of successes observed when a binomial experiment is performed We use n to denote the fixed number of trials. The term success does not necessarily mean something positive. For example, if the random variable is the number of defective items produced, then being “defective” is a success. The probability distribution of x is called the binomial probability distribution.
Binomial Probability Formula: Let n = number of independent trials in a binomial experiment and p = constant probability that any particular trial results in a success. Then p(x) = P(x successes among the n trials) = [n! / (x!(n − x)!)] p^x (1 − p)^(n − x), for x = 0, 1, 2, …, n. The coefficient n! / (x!(n − x)!) can be written as nCx. Notice that the probability distribution is specified by a formula rather than a table or probability histogram.
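The binomial formula translates directly into code. In this sketch, `math.comb(n, x)` computes nCx, and the function name `binomial_pmf` is illustrative. The example reuses the refrigerator setting (n = 4, p = 0.4) as a check.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x successes in n independent trials, each with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Refrigerator example: n = 4 customers, p = 0.4.
for x in range(5):
    print(x, round(binomial_pmf(x, 4, 0.4), 4))
```

Summing the formula over x = 0, 1, …, n always gives 1, so it defines a valid discrete probability distribution.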
Sixty percent of all computers sold by a large computer retailer are laptops and 40% are desktop models. The type of computer purchased by each of the next 12 customers will be recorded. Define the random variable of interest as x = the number of laptops among these 12. The binomial random variable x counts the number of laptops purchased. The purchase of a laptop is considered a success and is denoted by S. The probability distribution of x is given by p(x) = [12! / (x!(12 − x)!)] (0.6)^x (0.4)^(12 − x), for x = 0, 1, 2, …, 12.
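For the laptop example (n = 12, p = 0.6), the whole distribution can be tabulated from the binomial formula. This is a sketch; the variable names are illustrative, and it assumes the 12 purchases are independent, as a binomial experiment requires.

```python
from math import comb

N, P_LAPTOP = 12, 0.6  # 12 customers, 60% of computers sold are laptops

# laptop_dist[x] = P(exactly x of the 12 customers buy a laptop)
laptop_dist = {
    x: comb(N, x) * P_LAPTOP**x * (1 - P_LAPTOP) ** (N - x)
    for x in range(N + 1)
}

for x, prob in laptop_dist.items():
    print(f"{x:2d}  {prob:.4f}")

most_likely = max(laptop_dist, key=laptop_dist.get)
print("most likely number of laptops:", most_likely)
```

Any event probability, such as "at least 8 laptops", can then be found by adding the relevant entries of the table.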