Discrete Probability Distribution
Henry Mesa
In this slide show we will look at creating a probability distribution. A probability distribution consists of
• The sample space
• The probability of each measurement in that sample space.
Visualizing, or creating, the sample space can be the portion that takes the most time, and I will show you two different examples. One will be used again in chapter 17 and belongs to a very specific discrete distribution called the Binomial. The Binomial distribution is basic, very important, and used often in practice. It is also used to derive other distribution types. We will create this distribution so you can see its structure and then understand what this type of distribution can represent. I will start with a situation in which every member of the population is equally likely to be chosen.
Suppose that I have two dice, or that I have one die and I am throwing it twice. Notice this is a situation in which I am sampling twice: a sampling distribution. The die is fair, with six sides marked with the values {1, 2, 3, 4, 5, 6}. My random variable, D, will be the sum of the two dice. So, how do I create the sample space here? Organization is essential, so I am going to create a grid. Since I only have two dice, and both dice are fair, this will be the easiest method.
Inside this grid will go the sum of the two dice. Notice that while we cannot tell (if both dice are identical) which die produces a “four”, for example, the two dice are physically separate and must be recognized as such as we create the sample space. So, now I will put the sum of the row value and the column value in the grid itself. Using the results here I can create the sample space of the random variable D, which is the sum of the two dice, and I can also find the probability of each outcome.
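The grid described above can be sketched in a few lines of Python. This is only an illustration of the idea; the names `faces`, `grid`, and `sample_space` are mine and do not come from the slides.

```python
# Faces of one fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]

# grid[i][j] holds the sum when the first die shows faces[i]
# and the second die shows faces[j].
grid = [[row + col for col in faces] for row in faces]

# Print the grid with the second die's values as a header row.
print("    " + " ".join(f"{col:2d}" for col in faces))
for row, sums in zip(faces, grid):
    print(f"{row:2d} |" + " ".join(f"{s:2d}" for s in sums))

# The sample space of D is the set of distinct sums in the grid.
sample_space = sorted({s for sums in grid for s in sums})
print(sample_space)
```

Each of the 36 cells is a separate outcome, even when two cells hold the same sum, which is the point made about treating the two dice as physically distinct.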
The sample space of D is {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}. Do not confuse the sample space with the probabilities of getting those outcomes. I want to make an important distinction at this moment, one that is the main issue you will see in chapters 17 and 18. Again, notice that I am “sampling” twice from the population, which consists of the values of a die. Furthermore, what am I doing with the values I sampled? Adding them. I know it may be difficult to grasp, at the moment, what I am trying to convey with the “sampling and then adding” statement above, but keep in mind that what I am saying is at the heart of Statistics!
Before continuing with the original problem, I want to emphasize what I am alluding to. In Statistics we measure so we can understand some characteristic of a population. But once you collect the data, you must summarize this information (chapters 4 and 5 provide examples of those summaries). The purpose of the summaries is to condense all the data into values that can be interpreted and that somehow represent the characteristic we want to learn more about. But what we are interested in is the “chance” of getting the values of those summaries. Gosh, can you tell that I am pulling my hair out trying to pass some important information to you? We will revisit this in chapter 18 and in the modules after exam 3. Back to the problem.
Notice that every single square on that grid is equally likely because the dice are fair. So, because there are 36 squares, all equally likely, the sum of “six” (color coded yellow) is just as likely as the other “six” (color coded blue). All the sixes in that grid are equally likely to occur, and each is unique (it occurs because of a unique roll of the dice). Notice that P(six) = 5/36, since there are 36 possibilities, all equally likely, and five of the 36 meet the criterion of “sum of six”.
So, here is the probability distribution, which is created by matching the sample space with the probability of each outcome.
D:    2     3     4     5     6     7     8     9     10    11    12
P(D): 1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
Notice that to get the probabilities all I have to do is count how many times that particular sum appears in the grid. If the dice were not fair, getting the probabilities would not be as easy.
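To illustrate the remark about unfair dice, here is a minimal sketch that adds up the probability of each cell instead of counting cells. The function name `sum_distribution` and the loaded die are my own hypothetical example, not something from the slides; for fair dice the result agrees with simple counting.

```python
from collections import defaultdict
from fractions import Fraction


def sum_distribution(face_probs):
    """Distribution of the sum of two independent rolls of the same die.

    face_probs maps each face to its probability.
    """
    dist = defaultdict(Fraction)
    for a, pa in face_probs.items():
        for b, pb in face_probs.items():
            # Independence: P(first = a AND second = b) = P(a) * P(b).
            dist[a + b] += pa * pb
    return dict(sorted(dist.items()))


# Fair die: every face has probability 1/6.
fair = {face: Fraction(1, 6) for face in range(1, 7)}
fair_dist = sum_distribution(fair)

# Hypothetical loaded die (an assumption for illustration only):
# a six comes up half the time, the other faces share the rest equally.
loaded = {face: Fraction(1, 10) for face in range(1, 6)}
loaded[6] = Fraction(1, 2)
loaded_dist = sum_distribution(loaded)

for total in fair_dist:
    print(total, fair_dist[total], loaded_dist[total])
```

With the loaded die the 36 cells are no longer equally likely, so each cell has to be weighted by its own probability before the cells with the same sum are added together.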
Here is a different problem in which things are not equally likely and counting will not work. At a college 59% of the student body are women. Suppose that four students are sampled at random. I will define the random variable X to count how many women are in the sample of four. Notice this is a scenario where I am sampling four times. So, my sample space for the random variable X is… see if you can write down the sample space on your own. {0, 1, 2, 3, 4} Now I need to find the probability of each of those values. Let's look at the two easiest values to calculate: P(X = 0) and P(X = 4). So, P(X = 4) means P(woman AND woman AND woman AND woman). Do I have independence during my sampling? Close enough to say we do, since I am assuming this is a college with a couple thousand students and I am only sampling four people from that population: P(woman | woman) ≈ P(woman) and P(woman | woman AND woman AND woman) ≈ P(woman).
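To see how close the draws are to independent, here is a small check. The enrollment of 2,000 is my assumption (the slide says only “a couple thousand”), with 1,180 women so that the share is exactly 59%.

```python
from fractions import Fraction

# Assumed population size; the slides only say "a couple thousand".
students = 2000
women = 1180  # 59% of 2000

p_woman = Fraction(women, students)

# Sampling without replacement: after one woman is drawn,
# one fewer woman and one fewer student remain.
p_woman_given_woman = Fraction(women - 1, students - 1)

# The same idea after three women have been drawn.
p_woman_given_three = Fraction(women - 3, students - 3)

print(float(p_woman))
print(float(p_woman_given_woman))
print(float(p_woman_given_three))

# Exact probability of four women without replacement,
# compared with the independence approximation (0.59)^4.
exact = Fraction(1)
for i in range(4):
    exact *= Fraction(women - i, students - i)
approx = p_woman ** 4
print(float(exact), float(approx))
```

Under this assumed enrollment the conditional probabilities differ from 0.59 only in the third or fourth decimal place, which is why treating the four draws as independent is a reasonable approximation.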
So, P(X = 4) means P(woman AND woman AND woman AND woman) = (0.59)(0.59)(0.59)(0.59) = 0.1212. Try to calculate P(X = 0) on your own. P(Not Woman AND Not Woman AND Not Woman AND Not Woman) = $(1 - 0.59)^4 = 0.0283$
Now let us get to the tougher issue of finding P(X = 1), P(X = 2) and P(X = 3). We saw what we are after when we found P(X = 4), for instance. Because the number of possibilities is going to increase, I will make the notation a bit cleaner: instead of writing Woman AND Woman AND Woman AND Woman I will write WWWW, and for Not Woman AND Not Woman AND Not Woman AND Not Woman I will write NNNN. So W indicates a person is a woman, and N indicates a person is not a woman.
To see how we can get the rest of these probabilities, I will use a tree diagram. P(X = 1) = P(WNNN OR NWNN OR NNWN OR NNNW). We have P(W) = 0.59 and P(N) = 0.41, we have independence, and the compound events are disjoint, thus $P(X = 1) = 0.59(0.41)^3 + 0.59(0.41)^3 + 0.59(0.41)^3 + 0.59(0.41)^3 = 4(0.59)(0.41)^3 = 0.1627$. I can follow a similar approach to find P(X = 2) and P(X = 3).
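The leaves of the tree diagram can also be listed by a short program. This is a sketch of the same reasoning (multiply along a branch, then add the disjoint branches); the helper name `leaf_probability` is mine.

```python
from itertools import product

# W = woman, N = not a woman.
p = {"W": 0.59, "N": 0.41}

# Every leaf of the tree: all 2**4 = 16 ordered sequences of W and N.
leaves = ["".join(seq) for seq in product("WN", repeat=4)]


def leaf_probability(leaf):
    # Independence: multiply the probabilities along the branch.
    prob = 1.0
    for person in leaf:
        prob *= p[person]
    return prob


# Group the leaves by how many women they contain and add their
# probabilities (the leaves are disjoint events).
distribution = {k: 0.0 for k in range(5)}
for leaf in leaves:
    distribution[leaf.count("W")] += leaf_probability(leaf)

one_woman = [leaf for leaf in leaves if leaf.count("W") == 1]
print(one_woman)
for k, prob in distribution.items():
    print(k, round(prob, 4))
```

The list `one_woman` contains exactly the four sequences WNNN, NWNN, NNWN and NNNW used above, and every leaf with the same number of women has the same probability, which is why the sum collapses to a count times a single product.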
See if you can find P(X = 2). P(X = 2) = P(WWNN OR WNWN OR WNNW OR NWWN OR NWNW OR NNWW), so $P(X = 2) = 6(0.59)^2(0.41)^2 = 0.3511$
See if you can find P(X = 3). Use the tree as a guide. P(X = 3) = P(WWWN OR WWNW OR WNWW OR NWWW), so $P(X = 3) = 4(0.59)^3(0.41) = 0.3368$
Thus the probability distribution of X is
X:    0       1       2       3       4
P(X): 0.0283  0.1627  0.3511  0.3368  0.1212
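The multipliers 1, 4, 6, 4, 1 in these calculations count the ways to place the women among the four draws. Assuming this is the Binomial pattern mentioned at the start of the slide show, the whole distribution can be sketched with Python's `math.comb`. The function name `binomial_pmf` is mine.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) when X counts successes in n independent draws,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 4, 0.59
table = {k: round(binomial_pmf(k, n, p), 4) for k in range(n + 1)}
for k, prob in table.items():
    print(k, prob)
```

Here `comb(4, k)` replaces the step of listing the sequences by hand, and the two powers are the product along any one branch of the tree.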
An over-the-counter AIDS test is 90% effective at providing a positive result if a person has AIDS. A sample of three people with AIDS is randomly chosen and the AIDS test is given to them. Let the random variable X count the number of positive results from the sample of three. Again, notice that I am sampling three times from the population.
1. What is the sample space of X? {0, 1, 2, 3}
2. Find the probability of each outcome. $P(X = 0) = (0.1)^3 = 0.0010$ and $P(X = 3) = (0.9)^3 = 0.7290$
$P(X = 1) = 3(0.1)^2(0.9) = 0.0270$ and $P(X = 2) = 3(0.1)(0.9)^2 = 0.2430$
3. Write down the probability distribution of X.
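One way to check your answer is to enumerate the eight possible result sequences with exact fractions. This sketch assumes, as the calculations in this example do, that the three test results are independent; the variable names are mine.

```python
from fractions import Fraction
from itertools import product

p_pos = Fraction(9, 10)  # test is positive for a person with AIDS
p_neg = 1 - p_pos        # test misses

dist = {k: Fraction(0) for k in range(4)}

# "+" marks a positive result, "-" a negative one.
for outcome in product("+-", repeat=3):
    prob = Fraction(1)
    for result in outcome:
        # Independence: multiply the probabilities of the three results.
        prob *= p_pos if result == "+" else p_neg
    dist[outcome.count("+")] += prob

for k, prob in dist.items():
    print(k, float(prob))
```

The four probabilities should add to exactly 1, which is a quick test that no sequence was missed or counted twice.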