DATA 220 Mathematical Methods for Data Analysis September 24 Class Meeting

  1. DATA 220 Mathematical Methods for Data Analysis: September 24 Class Meeting. Department of Applied Data Science, San Jose State University, Fall 2019. Instructor: Ron Mak, www.cs.sjsu.edu/~mak

  2. Recall: Interpretations of Probability • Classical interpretation: The probability of an event E is the ratio of the number of outcomes \(N_e\) favorable to event E to the total number of possible outcomes N: \( P(E) = N_e / N \). • Example: In an arbitrary line-up of 15 children, what is the probability of a bad line-up, where Mary and John are together?
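
The line-up example can be worked with the classical ratio. The sketch below assumes that "together" means Mary and John stand next to each other; the function names are illustrative and not from the slides. It counts favorable line-ups by gluing the pair into one block, and cross-checks the formula by brute force on a small group.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def p_adjacent_formula(n: int) -> Fraction:
    """P(two particular children are adjacent in a random line-up of n)."""
    # Treat the pair as one block: (n - 1)! orders of blocks,
    # times 2 orders inside the pair, out of n! line-ups in total.
    return Fraction(2 * factorial(n - 1), factorial(n))


def p_adjacent_brute_force(n: int) -> Fraction:
    """Same probability by enumerating all n! line-ups (small n only)."""
    favorable = 0
    total = 0
    for lineup in permutations(range(n)):
        total += 1
        # Children 0 and 1 stand in for Mary and John.
        if abs(lineup.index(0) - lineup.index(1)) == 1:
            favorable += 1
    return Fraction(favorable, total)


print(p_adjacent_formula(15))  # 2/15
```

Under that assumption the probability of a bad line-up is 2/15, about 13%.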

  3. Reminder: Interpretations of Probability, cont’d • Relative frequency interpretation: Repeat an experiment n times and count the number of outcomes \(n_e\) favorable to event E. The approximation of the probability of E improves as n grows larger: \( P(E) \approx n_e / n \). • Example: In Lab Exercise 4, you generated n random numbers from a given probability distribution. Your graph of the distribution approached its theoretical shape as you increased the value of n.
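
A minimal sketch of the relative frequency idea, using a die roll rather than the Lab Exercise 4 distributions (the event "roll an even number" and the function name are assumptions chosen for illustration). The estimate should drift toward the classical value 1/2 as n grows.

```python
import random


def relative_frequency(n: int, seed: int = 0) -> float:
    """Estimate P(even roll) as n_e / n from n simulated die rolls."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    n_e = sum(1 for _ in range(n) if rng.randint(1, 6) % 2 == 0)
    return n_e / n


for n in (10, 100, 1_000, 100_000):
    print(n, relative_frequency(n))
```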

  4. Counting When Order Matters • Left unsaid last week: When we were counting the number of items in sequences (letter sequences, 3-digit numbers, the number of possible teams, etc.), order was significant. • ABCD and ADCB are two different items. • 123 and 321 are two different items. • In a baseball team of 9 children, there are nine positions (roles). A team where Susan is the pitcher and Bobby is the catcher is a different team from one where Bobby is the pitcher and Susan is the catcher.

  5. Counting When Order Doesn’t Matter • How do we count the number of items in a collection, where order doesn’t matter? • Example: How many 3-digit numbers are there using the digits 1 through 9 that have no repeated digits, but numbers like 123, 132, 213, 231, 312, and 321 are all considered to be the same item?
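
One way to check the 3-digit example is to enumerate both kinds of items. This sketch uses Python's `itertools`; the variable names are illustrative only.

```python
from itertools import combinations, permutations

digits = range(1, 10)  # the digits 1 through 9

# Order matters: 123 and 321 are different items (9 * 8 * 7 of them).
ordered = sum(1 for _ in permutations(digits, 3))

# Order doesn't matter: each set of 3 digits is counted once.
unordered = sum(1 for _ in combinations(digits, 3))

print(ordered, unordered)  # 504 84
```

Each unordered item corresponds to 3! = 6 ordered ones, which is why 504 / 6 = 84.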

  6. Counting When Order Doesn’t Matter, cont’d • Example: How many different committees of 4 children can you make from a group of 15 children? • Order doesn’t matter – George, Katie, Tom, and Jill are the same committee no matter in what order you list them. • Suppose instead of a committee, we are making a slate of 4 class officers: president, vice president, secretary, and treasurer. How many possible slates?

  7. Counting When Order Doesn’t Matter, cont’d • We can make the slate of 4 class officers (order does matter) using a different strategy that involves two steps: • Make a committee of 4 students. (We don’t know yet how many ways there are to do this.) • Assign each of these 4 students to an office. There are 4! possible ways of assigning offices to the 4 students. • Therefore: (the number of ways to make a committee) × 4! = (the number of ways to select class officers), so (the number of ways to make a committee) = (the number of ways to select class officers) / 4!.
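
The two-step argument can be checked numerically for the 15-children example. This is a small sketch using `math.perm` and `math.comb` (Python 3.8 or later); the variable names are assumptions.

```python
from math import comb, factorial, perm

# Slates of 4 officers from 15 children: order matters.
slates = perm(15, 4)  # 15 * 14 * 13 * 12

# Each committee of 4 yields 4! different slates, so divide by 4!.
committees = slates // factorial(4)

print(slates, committees)  # 32760 1365
```

The result agrees with the built-in `comb(15, 4)`.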

  8. Counting When Order Doesn’t Matter, cont’d • The number of ways to choose a collection of k objects from n objects, without repetition and where order doesn’t matter, is \( \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!} \).

  9. Counting When Order Doesn’t Matter, cont’d • Example: Pizza Palace offers 7 toppings from which you can choose 3. How many different pizzas can you possibly order? • n = 7, k = 3: \( \frac{7 \cdot 6 \cdot 5}{3 \cdot 2 \cdot 1} = 35 \) • Notice that there are 3 factors above and 3 factors below.

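  10. Binomial Coefficients • The formula has its own math notation: \( \binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k!} = \frac{n!}{k!\,(n-k)!} \) • The symbol \( \binom{n}{k} \) is a binomial coefficient. • Commonly read “n choose k”.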

  11. Binomial Coefficients, cont’d • Counting the number of ways to make a committee of 4 students out of 15 students is the same as counting the number of ways to leave 11 students off of the committee: \( \binom{15}{4} = \binom{15}{11} \). In general, \( \binom{n}{k} = \binom{n}{n-k} \).

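  12. Binomial Coefficients, cont’d • Example: How many possible 5-card hands can you make from a standard deck of 52 cards? • The answer is \( \binom{52}{5} = \frac{52!}{5!\,47!} \), which is much too difficult to calculate directly from the factorials. • But you know there will be cancellations in the numerator and in the denominator: \( \frac{52 \cdot 51 \cdot 50 \cdot 49 \cdot 48}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 2{,}598{,}960 \). There are 5 factors above and 5 factors below.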
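
The cancellation idea translates directly into code. The sketch below is one possible implementation (the function name `choose` is an assumption); it multiplies and divides one factor at a time so that no large factorial is ever formed, and it also uses the symmetry from the previous slide.

```python
from math import comb


def choose(n: int, k: int) -> int:
    """Binomial coefficient using k factors above and k factors below."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)  # symmetry: choosing k is the same as leaving out n - k
    result = 1
    for i in range(1, k + 1):
        # After step i, result equals C(n - k + i, i), so the division is exact.
        result = result * (n - k + i) // i
    return result


print(choose(52, 5))  # 2598960
```

Python's standard library already provides `math.comb`, which gives the same value.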

  13. Counting Formulas So Far • What about those collections (order never matters) where we allow repetitions?

  14. Collections that Allow Repetitions • Example: You want to make a fruit salad that contains 5 servings (1 fruit = 1 serving) of fruit. In your fruit bowl, you have the following 8 varieties of fruit: • apple, apricot, banana, kiwi, lime, orange, pear, quince • If it’s OK to have more than one serving (among the 5 servings) of any fruit – repetitions are allowed – how many different salads can you make? • Note that the order of the fruits in your salad doesn’t matter. They are listed alphabetically here only for convenience.

  15. Collections that Allow Repetitions, cont’d apple, apricot, banana, kiwi, lime, orange, pear, quince • Let’s solve this problem graphically! • Line up 5 white squares (1 per fruit serving in your salad) and 7 gray squares to separate the 8 varieties of fruit. • The order in which we line up the white and gray squares matters. • The first gray square separates apples and apricots. The second gray square separates apricots and bananas. The third gray square separates bananas and kiwis, etc. Again, we refer to the fruits alphabetically only for convenience.

  16. Collections that Allow Repetitions, cont’d apple, apricot, banana, kiwi, lime, orange, pear, quince • Example: Your salad contains an apple, an apricot, two oranges, and a pear. • All the gray squares are alike. Each one gets its fruit-separating role solely from its position: first, second, third, etc. • [Figure: 12 squares in the order white (apple), gray, white (apricot), gray, gray, gray, gray, white (orange), white (orange), gray, white (pear), gray. In order, the 7 gray squares separate apples and apricots, apricots and bananas, bananas and kiwis, kiwis and limes, limes and oranges, oranges and pears, and pears and quinces.]

  17. Collections that Allow Repetitions, cont’d apple, apricot, banana, kiwi, lime, orange, pear, quince • Example: Your salad contains a banana, a kiwi, an orange, and two quinces. • We have a one-to-one correspondence between the placement of gray squares and the fruits we have in our salad. • [Figure: 12 squares in the order gray, gray, white (banana), gray, white (kiwi), gray, gray, white (orange), gray, gray, white (quince), white (quince). The 7 gray squares play the same separating roles, in the same order, as on the previous slide.]
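
The one-to-one correspondence can be made concrete by encoding a salad as a string of squares and decoding it again. In this sketch, "W" stands for a white square and "G" for a gray one; the letters and the function names are assumptions made for illustration.

```python
FRUITS = ["apple", "apricot", "banana", "kiwi",
          "lime", "orange", "pear", "quince"]


def salad_to_squares(salad: list[str]) -> str:
    """One white square per serving, with gray squares between varieties."""
    groups = ["W" * salad.count(fruit) for fruit in FRUITS]
    return "G".join(groups)  # 8 groups are joined by 7 gray squares


def squares_to_salad(squares: str) -> list[str]:
    """Invert the encoding: the i-th run of white squares is the i-th fruit."""
    salad = []
    for fruit, group in zip(FRUITS, squares.split("G")):
        salad.extend([fruit] * len(group))
    return salad


print(salad_to_squares(["apple", "apricot", "orange", "orange", "pear"]))
print(salad_to_squares(["banana", "kiwi", "orange", "quince", "quince"]))
```

Because decoding recovers the salad exactly, counting salads is the same as counting placements of the 7 gray squares among 12 positions.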

  18. Collections that Allow Repetitions, cont’d apple, apricot, banana, kiwi, lime, orange, pear, quince • There are many possible positions for the 7 gray squares. Some more examples:

  19. Collections that Allow Repetitions, cont’d • Since the 7 gray squares are all alike, their order doesn’t matter. How many ways can we choose to make 7 squares gray out of 12? \( \binom{12}{7} = 792 \) • We used the mathematical trick of reducing our original problem of how many different fruit salads we can make to a known problem that we can solve.

  20. Collections that Allow Repetitions, cont’d • In the fruit salad problem: • n = 8 = the number of fruit varieties (n – 1 = 7 gray squares) • k = 5 = the number of fruit servings in the salad (k = 5 white squares) • (n – 1) + k = 12 = the total number of squares • The number of ways to choose a collection of k objects from n objects, with repetitions allowed and where order doesn’t matter, is \( \binom{(n-1)+k}{k} = \binom{(n-1)+k}{n-1} \).
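
The formula can be checked against a direct enumeration of the fruit salads. This sketch uses `itertools.combinations_with_replacement`; the function name `multichoose` is an assumption, not a term used in the slides.

```python
from itertools import combinations_with_replacement
from math import comb


def multichoose(n: int, k: int) -> int:
    """Collections of k objects from n varieties, repetitions allowed."""
    return comb((n - 1) + k, k)


# Enumerate every salad of 5 servings from 8 varieties (numbered 0..7).
salads = sum(1 for _ in combinations_with_replacement(range(8), 5))

print(salads, multichoose(8, 5))  # 792 792
```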

  21. Counting Formulas • Permutations: Arrangements of items where order does matter. • Combinations: Arrangements of items where order does not matter. • Combinatorics: The branch of mathematics primarily concerned with counting.

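  22. Expected Value • You must pay to enter a contest where you’ll be paid $1 times the number that you roll with a die. What’s the most you should pay? • The probability of each number is \( \frac{1}{6} \). Therefore, the average payoff – the expected value – is \( 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6} = \frac{21}{6} = 3.5 \). • Therefore, don’t pay more than $3.50.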

  23. Expected Value, cont’d If an experiment has n possible results with probabilities \(p_1, \dots, p_n\) and payoffs \(a_1, \dots, a_n\), then the expected value of the experiment is \( a_1 p_1 + a_2 p_2 + \cdots + a_n p_n = \sum_{i=1}^{n} a_i p_i \).
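
The definition is a one-line weighted sum. The sketch below applies it to the die contest from the previous slide; the function name and the use of exact fractions are choices made here, not part of the lecture.

```python
from fractions import Fraction


def expected_value(probabilities, payoffs):
    """Sum of payoff times probability over all possible results."""
    probabilities = list(probabilities)
    payoffs = list(payoffs)
    if len(probabilities) != len(payoffs):
        raise ValueError("need one payoff per probability")
    if sum(probabilities) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(p * a for p, a in zip(probabilities, payoffs))


# Die contest: payoffs $1..$6, each with probability 1/6.
die_value = expected_value([Fraction(1, 6)] * 6, range(1, 7))
print(die_value, float(die_value))  # 7/2 3.5
```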

  24. Expected Value, cont’d • Example: A slot machine has 3 reels, each of which has the same 5 different pictures. Assume that each reel is independent of the others, and the 5 pictures on each reel have equal probabilities. The payoffs are: • 3 bells: $50 • 3 of any of the other 4 pictures: $10 • 2 bells and any other picture: $2 • Is it worth $1 to play this slot machine?

  25. Expected Value, cont’d • Is it worth $1 to play this slot machine? • There are \(5^3 = 125\) equally likely outcomes. • P(3 bells) = \( \frac{1}{125} \). • P(3 of any of the other 4 pictures) = \( \frac{4}{125} \). • There are \( \binom{3}{2} = 3 \) ways that two reels can stop on a bell, which leaves 4 picture choices for the third reel. P(2 bells and any other picture) = \( \frac{3 \cdot 4}{125} = \frac{12}{125} \). • 3 reels each with 5 pictures: • 3 bells: $50 • 3 of any of the other 4 pictures: $10 • 2 bells and any other picture: $2
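
To answer the slide's question, the expected payoff can be computed by enumerating all 125 outcomes. In this sketch the four non-bell picture names are hypothetical placeholders (the slides only name the bell), and the payoff rules are taken from the slide.

```python
from fractions import Fraction
from itertools import product

# "bell" is from the slide; the other four names are made up for illustration.
PICTURES = ["bell", "cherry", "lemon", "plum", "seven"]


def payoff(reels) -> int:
    """Dollar payoff for one outcome of the three reels."""
    bells = reels.count("bell")
    if bells == 3:
        return 50
    if reels[0] == reels[1] == reels[2]:
        return 10  # three of a kind, not bells
    if bells == 2:
        return 2
    return 0


outcomes = list(product(PICTURES, repeat=3))  # 5**3 equally likely outcomes
expected_payoff = Fraction(sum(payoff(r) for r in outcomes), len(outcomes))
print(expected_payoff, float(expected_payoff))  # 114/125 0.912
```

The expected payoff is about $0.91 per play, slightly less than the $1 cost, so on average the player loses about 9 cents per play.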

  26. Expected Value, cont’d • Mary, age 60, has a life-threatening illness. Should she choose surgical procedure A or B? • A: 40% of cases cured outright with patients living out their average life expectancy (20 more years for Mary), 20% died in surgery, 40% lived 5 more years. • B: 20% of cases cured outright with patients living out their average life expectancy, no one died in surgery, 80% lived 5 more years.
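
One way to compare the two procedures is by expected additional years of life. The sketch below assumes that dying in surgery counts as 0 additional years and that a cure gives Mary the full 20 years; the variable names are illustrative.

```python
from fractions import Fraction

# (probability, additional years) for each possible result.
procedure_a = [(Fraction(40, 100), 20),  # cured outright
               (Fraction(20, 100), 0),   # died in surgery (assumed 0 years)
               (Fraction(40, 100), 5)]   # lived 5 more years
procedure_b = [(Fraction(20, 100), 20),  # cured outright
               (Fraction(80, 100), 5)]   # lived 5 more years

expected_a = sum(p * years for p, years in procedure_a)
expected_b = sum(p * years for p, years in procedure_b)
print(expected_a, expected_b)  # 10 8
```

By expected value alone, procedure A (10 years) beats procedure B (8 years), although A carries a 20% chance of dying in surgery that the expected value by itself does not express.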

  27. Venn Diagram of \(A \cup B\) • Let A be the set of outcomes favorable to some event \(E_A\) and B be the set of outcomes favorable to some event \(E_B\). There may be some common outcomes in both A and B. • \(A \cup B\) is “A union B”: event A or event B occurs. • \(A \cap B\) is “A intersect B”: event A and event B both occur. • What if event A and event B are disjoint (\(A \cap B = \emptyset\), the empty set)? • [Figure: Venn diagram of overlapping sets A and B]

  28. Venn Diagram of \(A \cup B\), cont’d • Example: What is the probability that a roll of a die produces an even number or a number between 2 and 5? • Let A = the set of even number outcomes. Let B = the set of between 2 and 5 outcomes. • [Figure: Venn diagram placing the die outcomes 1 through 6 relative to sets A and B]
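
The die example can be worked with Python sets. This sketch assumes "between 2 and 5" is inclusive, so B = {2, 3, 4, 5}; that reading and the helper name `prob` are assumptions, since the slide's diagram is not preserved.

```python
from fractions import Fraction

SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}
A = {n for n in SAMPLE_SPACE if n % 2 == 0}   # even outcomes
B = {n for n in SAMPLE_SPACE if 2 <= n <= 5}  # assumed inclusive range


def prob(event: set) -> Fraction:
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event), len(SAMPLE_SPACE))


p_union = prob(A | B)
# Inclusion-exclusion: subtract the overlap so it is not counted twice.
p_check = prob(A) + prob(B) - prob(A & B)
print(p_union, p_check)  # 5/6 5/6
```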

  29. Venn Diagram of \(A \cap B\) • Let A and B be two independent events. • The outcome of one event is not dependent on the outcome of the other event. • [Figure: Venn diagram of overlapping sets A and B]

  30. Venn Diagram of \(A \cap B\), cont’d • Example: What is the probability that a roll of a die produces an even number and the number is between 2 and 5? • Let A = the set of even number outcomes. Let B = the set of between 2 and 5 outcomes. • [Figure: Venn diagram placing the die outcomes 1 through 6 relative to sets A and B]
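
A short check of the intersection example, again assuming "between 2 and 5" is inclusive (B = {2, 3, 4, 5}). Under that assumption the two events turn out to be independent, because the probability of the intersection equals the product of the individual probabilities.

```python
from fractions import Fraction

SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}
A = {n for n in SAMPLE_SPACE if n % 2 == 0}   # even outcomes
B = {n for n in SAMPLE_SPACE if 2 <= n <= 5}  # assumed inclusive range


def prob(event: set) -> Fraction:
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event), len(SAMPLE_SPACE))


p_both = prob(A & B)          # outcomes 2 and 4
p_product = prob(A) * prob(B)  # 1/2 * 2/3
print(p_both, p_product)       # 1/3 1/3
```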

  31. Break

  32. Conditional Probability • What if events A and B are not independent? • Example: What is the probability that a roll of a die produced an even number given that the number was between 2 and 5? • Let A = the set of even number outcomes. Let B = the set of between 2 and 5 outcomes. • Now we look only at the reduced sample space consisting of numbers between 2 and 5. • [Figure: Venn diagram placing the die outcomes 1 through 6 relative to sets A and B]

  33. Conditional Probability, cont’d • Let A = the set of even number outcomes. Let B = the set of between 2 and 5 outcomes. • [Figure: Venn diagram placing the die outcomes 1 through 6 relative to sets A and B]

  34. Conditional Probability, cont’d • Let A = the set of even number outcomes. Let B = the set of between 2 and 5 outcomes. • [Figure: Venn diagram placing the die outcomes 1 through 6 relative to sets A and B]
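
The reduced-sample-space idea can be sketched in a few lines. The code assumes the standard definition P(A | B) = P(A ∩ B) / P(B) and an inclusive reading of "between 2 and 5"; the worked calculation from the slides is not preserved in this transcript, so this is a reconstruction under those assumptions.

```python
from fractions import Fraction

SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}
A = {n for n in SAMPLE_SPACE if n % 2 == 0}   # even outcomes
B = {n for n in SAMPLE_SPACE if 2 <= n <= 5}  # assumed inclusive range


def prob(event: set) -> Fraction:
    """Classical probability over the full sample space."""
    return Fraction(len(event), len(SAMPLE_SPACE))


# Reduced sample space: count only outcomes inside B.
p_a_given_b_counting = Fraction(len(A & B), len(B))

# Standard definition of conditional probability.
p_a_given_b_formula = prob(A & B) / prob(B)

print(p_a_given_b_counting, p_a_given_b_formula)  # 1/2 1/2
```

Both routes give 1/2, which here happens to equal P(A).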

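  35. Conditional Probability, cont’d • Example: A spa manufacturer has 25 spas in stock. Each spa is equipped with an on-off switch. The switches, some of which are defective, are supplied by two sources. • [Table: switch counts by source and defect status. Only the values 15, 10, 18, 7, and 25 survive in this transcript; the cell layout is lost.]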

  36. Conditional Probability, cont’d • A random spa is selected for testing. • Let E = the event that the spa’s switch is from Source 1. • Let F = the event that the spa’s switch is defective. • Testing reveals a defective switch. What is the probability the switch came from Source 1? • [Table: same switch counts as on the previous slide; surviving values 15, 10, 18, 7, 25]

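  37. Independent Events Two events A and B are independent if \( P(A \cap B) = P(A)\,P(B) \). Otherwise, they are dependent. \( P(A \cap B) = P(A)\,P(B) \) implies both \( P(A \mid B) = P(A) \) and \( P(B \mid A) = P(B) \).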

  38. The Monty Hall Problem • Monty Hall hosted a popular TV game show. • You are a contestant on the show. • Monty shows you three closed doors #1, #2, and #3. • Behind one door is a new car, but behind each of the other doors is a goat. You pick one of the doors. • Before Monty opens the door that you picked, he opens one of the other two doors and reveals a goat. • He gives you the option to switch to the third door. • You want to win that car! Should you stay with your original pick, or should you switch?

  39. The Monty Hall Problem, cont’d • After seeing a goat behind the door that Monty opened: • Should you stay with the door you originally picked? • Should you switch to the other unopened door? • Does it make any difference whether you stay or switch? • This is a conditional probability problem!

  40. The Monty Hall Problem, cont’d • Let’s assume that: • The car can be hidden behind any door with the equal probability of \( \frac{1}{3} \). • You pick door #1 at the start. • The calculations are the same for any other starting door. • Monty opens door #2 to reveal a goat. • The calculations are the same for opening the other door. • If Monty has a choice of two goat doors to open, he can pick either of the two doors with the equal probability of \( \frac{1}{2} \). • Should you switch from door #1 to door #3?

  41. The Monty Hall Problem, cont’d • We need to compute and compare two probabilities: • The probability that the car is behind door #1 given that Monty opened door #2 to reveal a goat. • Important to know if we decide to stay with door #1. • The probability that the car is behind door #3 given that Monty opened door #2 to reveal a goat. • Important to know if we decide to switch to door #3.

  42. The Monty Hall Problem, cont’d • If the car is behind #1, Monty can pick either door #2 or #3 to open, so he opens #2 with probability 1/2. • If the car is behind #2, Monty will not open #2. • If the car is behind #3, Monty must open #2. • Stay: P(car behind #1 | Monty opened #2) = 1/3. Switch: P(car behind #3 | Monty opened #2) = 2/3. • Your probability of winning the car is twice as high if you switch from door #1 to door #3.
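
The two conditional probabilities can be computed with Bayes' rule from the assumptions on slide 40. This is a sketch of one standard derivation, not necessarily the layout used on the original slide; the dictionary names are illustrative.

```python
from fractions import Fraction

# Prior: the car is equally likely to be behind each door.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# P(Monty opens #2 | car location), given that you picked door #1.
opens_2_given_car = {
    1: Fraction(1, 2),  # Monty may open #2 or #3 with equal probability
    2: Fraction(0),     # Monty never reveals the car
    3: Fraction(1),     # #2 is the only goat door he can open
}

# Total probability that Monty opens door #2.
p_opens_2 = sum(prior[d] * opens_2_given_car[d] for d in prior)

# Bayes' rule: P(car behind d | Monty opened #2).
posterior = {d: prior[d] * opens_2_given_car[d] / p_opens_2 for d in prior}

print(posterior[1], posterior[3])  # 1/3 2/3
```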

  43. The Monty Hall Problem, cont’d • A more intuitive explanation? • If you stay with your original door: • Your likelihood of winning remains the same: one in three. • If you switch to the other closed door: • One time in three, your original door was right, so you lose by switching. • Two times in three, your original door was wrong, so you win by switching.

  44. Lab Assignment #5: Monty Hall Simulation • Create a Jupyter notebook that simulates being a contestant on Monty Hall’s game program. • The sequence for each simulation: • Randomly hide the car behind door #1, #2, or #3. • You randomly pick a door as the contestant. • Monty opens a door to reveal a goat. • Monty knows behind which door the car is hidden. • If you picked the right door, he randomly chooses one of the other two doors to open and reveal a goat. • Otherwise, he opens the only door he can to reveal a goat.

  45. Lab Assignment #5, cont’d • The sequence for each simulation, cont’d: • Record whether you win the car by staying with your original door or by switching to the other unopened door. • For each simulation, show: • behind which door the car is hidden • which door you picked first • which door Monty opens • which door would be your second pick • whether you win the car by staying or by switching

  46. Lab Assignment #5, cont’d • After 100 simulations, what is the ratio of winning by switching to winning by staying? • Due Monday, September 30. • Sample output and chart (after only 10 simulations; the chart is not preserved in this transcript):

               Car    Your   Monty    Your    Win      Win
Simulation  hidden   first  opened  second     if       if
     index   here!  choice    door  choice  stay?  switch?
         1       3       1       2       3             yes
         2       1       1       2       3    yes
         3       1       3       2       1             yes
         4       2       1       3       2             yes
         5       3       2       1       3             yes
         6       1       1       2       3    yes
         7       2       2       1       3    yes
         8       3       2       1       3             yes
         9       2       2       1       3    yes
        10       3       2       1       3             yes

4 wins if you stayed with your first choice
6 wins if you switched to your second choice
Win ratio of switching over staying: 1.5
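
The simulation sequence described in the assignment can be sketched as follows. This is one possible starting point under stated assumptions, not the instructor's solution: the function names, the seeded generator, and the dictionary layout are all choices made here, and the chart and per-simulation table printing are left out.

```python
import random

DOORS = [1, 2, 3]


def simulate_once(rng: random.Random) -> dict:
    """Play one round and record both the stay and the switch result."""
    car = rng.choice(DOORS)    # randomly hide the car
    first = rng.choice(DOORS)  # contestant's random first pick
    # Monty may open any door that is neither your pick nor the car.
    goat_doors = [d for d in DOORS if d != first and d != car]
    opened = rng.choice(goat_doors)  # two options only if first == car
    # The second pick is the one door that is still closed and unpicked.
    second = next(d for d in DOORS if d not in (first, opened))
    return {"car": car, "first": first, "opened": opened, "second": second,
            "stay_wins": first == car, "switch_wins": second == car}


def run(trials: int, seed: int = 0) -> tuple[int, int]:
    """Return (wins by staying, wins by switching) over many rounds."""
    rng = random.Random(seed)
    results = [simulate_once(rng) for _ in range(trials)]
    stay = sum(r["stay_wins"] for r in results)
    switch = sum(r["switch_wins"] for r in results)
    return stay, switch


stay_wins, switch_wins = run(10_000)
print(stay_wins, switch_wins, switch_wins / stay_wins)
```

With many trials the ratio of switch wins to stay wins should settle near 2, in line with the 2/3 versus 1/3 analysis; with only 10 or 100 trials it can vary noticeably, as the sample ratio of 1.5 shows.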