
Picking a Winner in the NHL… Is It Possible?


jake

Presentation Transcript


  1. Picking a Winner in the NHL…Is It Possible? An Investigation By

  2. History • Since the National Hockey League was formed in 1917, there have been people to coach, people to play and people to watch. • Hockey is and has been a source of enjoyment for many people around the world. • To enhance this enjoyment, people have been placing wagers on the outcomes of certain games. • At the time of the NHL’s commencement, people wagered gold and land, and owners even wagered their players.

  3. Gambling Today • Today, gambling is more structured and is mostly organized through the Federal Lottery and Gaming Commission. • A report from Statistics Canada showed that Canadians wagered over 100 million dollars in 2002 on sports alone. This does not even take into account the money wagered “under the table”. Statistics Canada also reported that there was a return of only 14 million dollars. This means that 86% of the money wagered was lost! • It has been the job of many people in Las Vegas and other “gambling hot spots” to come up with predictions and set odds. These people use statistics to make educated predictions.

  4. Gambling Now…

  5. Synopsis • Using statistics from the entire 2002-2003 NHL Regular Season, we will use our knowledge from the Mathematics of Data Management course to see if we can predict a Stanley Cup winner!!!

  6. Gambling After Investigation…

  7. Variables Investigated Some of the variables that we will be analyzing are: • Goals • Shots • Face-offs • Winning Percentage • Power Play Percentage • Penalty Kill Percentage

  8. In Summation Using as many of the tools we have learned as possible, we will make an educated guess as to who the Stanley Cup winner will be, and whether there is a reliable method for picking the winner of any individual game. Basically, what we are trying to ask is: using statistics and mathematical theories…

  9. CAN WE PREDICT A WINNER?

  10. Procedures Taken • The data was collected by locating the official score sheets, face-off comparisons and super stats. • The statistics contained in each of the aforementioned documents were extracted and organized into charts for each team, as well as for the entire league.

  11. Procedures… • The data contained in the charts was once again extracted and placed into sub-charts. Some of these sub-charts include the following: • Correlation between goals and shots • Correlation between face-off percentage and goals • Correlation between face-off percentage and winning percentage • Correlation between average shots taken and winning percentage • Probability of having more shots than opponent • Probability of winning more face-offs than opponent • Correlation between power play percentage and winning percentage • Correlation between penalty minutes for and goals against

  12. Procedures… • This data was then used to create scatter plots to have a better understanding of the data • Trend lines were used to pinpoint any type of relationship between the data • The correlation coefficient was established, again using technology, and this was used to determine how well or how poorly the data fit the newly discovered trend
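The trend-line and correlation steps above can be sketched in a few lines of Python. This is only an illustration of the standard least-squares line and Pearson correlation coefficient, not the spreadsheet actually used; the function name `trend_line_and_r` and the sample numbers are made up for the example and are not NHL data.

```python
from math import sqrt

def trend_line_and_r(xs, ys):
    """Return (slope, intercept, r) for a least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

# Made-up sample points (NOT NHL data), used only to show the calculation.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept, r = trend_line_and_r(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}, r = {r:.3f}")
```

The closer r is to 1 or -1, the more tightly the points follow the trend line; a value near 0 means the line explains very little.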

  13. Excel Functions Some of the commands used in Microsoft Excel were: • AVERAGE() – used to calculate the mean of the data • STDEVP() – used to calculate the standard deviation of the data • CORREL() – Used to find the correlation coefficient of the data (Same as PEARSON()) • The Sort command was used quite often to sort data by different variables • Sum and Difference functions were used when analyzing data
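For readers without Excel, the same calculations can be approximated with Python's standard library. This is a rough mapping under our own assumptions: the helper `correl` is a hypothetical name written for this example, and the shot and goal values are made up rather than taken from the score sheets.

```python
import statistics

def correl(xs, ys):
    """Pearson correlation coefficient, the quantity Excel's CORREL() reports."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Made-up per-game values (NOT NHL data).
shots = [25, 30, 28, 35, 22]
goals = [2, 3, 2, 4, 1]

avg_shots = statistics.mean(shots)    # like AVERAGE()
sd_shots = statistics.pstdev(shots)   # like STDEVP(): population standard deviation
r = correl(shots, goals)              # like CORREL()
by_shots = sorted(zip(shots, goals))  # like the Sort command, ordered by shots

print(avg_shots, round(sd_shots, 3), round(r, 3), by_shots[0])
```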

  14. Procedures… • Certain empirical probabilities were calculated from the data collected • Measures of central tendency were calculated along with measures of spread • The data collection and organization can be considered an iterative process (i.e., data is taken from a score sheet and placed into the Excel spreadsheet, and this repeats until there are no more data points to fill in)

  15. Data Collected

  16. Team Data Team data was organized into chart form using Microsoft Excel. This data was then used to create the graphs contained in this section.

      Team        Winning %   Face-off %   PP %   PK %   GF    (6 more headings)
      Atlanta     38          46           17.2   81.6   226
      Florida     29          46           14.2   81.4   176
      Minnesota   51          47           14.2   86     198
      (27 more teams)

  17. Shots vs. Goals The first scatter plot that we made looked at the correlation between shots and goals, the most obvious comparison made by all sports statisticians. With this data we found a very weak correlation. The correlation coefficient, r, is 0.15, which is well below 0.33.

  18. Given this, we can conclude that predicting the outcome of a game from the relation between goals and shots would not be very accurate. The vital statistics about the data in the graph above are contained in the following chart. • Mean (Shots): 28.37 • Standard Deviation (Shots): 6.56 • Mean (Goals): 2.65 • Standard Deviation (Goals): 1.62

  19. Power Play Percentage vs. Winning Percentage Many sports commentators get hung up on the idea that a great power play wins games, but is this true? The following graph compares the two variables, and we will discuss whether this prediction made by analysts everywhere is correct. Compared to the last set of data examined, this correlation is stronger. With a moderate correlation coefficient of 0.568, we can see that as the power play percentage increases, so does the winning percentage.

  20. Apparently the statisticians who boast this fact are correct. The vital statistics about the data in the graph above are contained in the following chart. • Mean (PP%): 16.36 • Standard Deviation (PP%): 2.86 • Mean (Winning Percentage): 43.67 • Standard Deviation (Winning Percentage): 9.55

  21. Penalty Minutes For vs. Goals Against We will examine the perceived ill effects of having to kill a penalty by looking at the relationship between penalty minutes for and goals scored against for each game. Although the correlation is not very strong (r=0.147), it can still be seen that as the penalty minutes increase, more goals are scored on your team.

  22. This weak correlation combined with other statistics could prove to be useful in the future. The vital statistics about the data in the graph above are contained in the following chart. • Mean (Penalty Minutes): 14.26 • Standard Deviation (Penalty Minutes): 10.50 • Mean (Goals Against): 2.65 • Standard Deviation (Goals Against): 1.62

  23. Face-off Percentage vs. Goals Scored It seems that in every playoff game, the face-offs are key to winning the game. On a key face-off win, the game-winning goal is scored, or when it is lost, the momentum shifts to the other team. In the hunt for the cup, do the face-offs you win help you score goals? The graph below will examine this more closely.

  24. Interestingly enough, there is little to no correlation (r=0.0316), and as can be seen, after a certain point, the more face-offs you win, the fewer goals you score. The vital statistics about the data in the graph above are contained in the following chart. • Mean (Face-off Percentage): 50.00 • Standard Deviation (Face-off Percentage): 7.26 • Mean (Goals): 2.65 • Standard Deviation (Goals): 1.62

  25. Average Face-Off Winning Percentage vs. Game Winning Percentage As in the last relationship where face-off wins were compared with goals scored, on our quest to find the Holy Grail of statistics, we must examine whether winning face-offs improves your overall winning percentage. This differs from the last example because it goes on a team average basis, where the average from the entire season was taken, along with the winning percentage obtained by each team.

  26. Similarly to the “Face-off Wins vs. Goals” data, this set follows an odd trend, where after a certain point, the more face-offs you win, the worse your winning percentage becomes. This odd occurrence follows a weak correlation, with r=0.171. The vital statistics about the data in the graph above are contained in the following chart. • Mean (Face-Off Percentage): 50.00 • Standard Deviation (Face-Off Percentage): 2.25 • Mean (Winning Percentage): 43.67 • Standard Deviation (Winning Percentage): 9.55

  27. Average Shots vs. Winning Percentage It would be assumed that the more shots you have, the better your chance at scoring, and therefore the better your chance at winning, but is that necessarily true? This relationship could very well show the most promise when it comes to picking a winner because, as any young player learns, “You miss 100% of the shots you don’t take”. Since it is taught early on, it must be important and must lead to winning. The following graph shows the results.

  28. From the slight upward slope of the trend line, it can be observed that the more shots you take on average, the higher your winning percentage at the end of the season. So, apparently, taking a lot of shots is important. The correlation coefficient shows a moderate correlation of r=0.364. Let's solve for how many shots you would need to take per game to win 100% of the games.
      100 = 2.0275x – 13.841
      100 + 13.841 = 2.0275x
      113.841/2.0275 = x
      56.15 = x
      Therefore, in order to win every game, you would need to take on average 56 shots per game.
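The same rearrangement can be checked in code. This is a minimal sketch that only reuses the trend-line coefficients quoted on the slide (2.0275 and –13.841); the function names are our own. Note that 100% lies far outside the range of the data, so the result is an extrapolation of the line rather than something observed.

```python
# Trend-line coefficients as quoted on the slide: win % = 2.0275 * shots - 13.841
SLOPE = 2.0275
INTERCEPT = -13.841

def predicted_win_pct(avg_shots):
    """Winning percentage predicted by the linear trend line."""
    return SLOPE * avg_shots + INTERCEPT

def shots_for_win_pct(target_win_pct):
    """Invert the trend line: average shots needed for a target winning percentage."""
    return (target_win_pct - INTERCEPT) / SLOPE

print(round(shots_for_win_pct(100), 2))  # 56.15
```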

  29. The vital statistics about the data in the graph above are contained in the following chart. • Mean (Average Shots): 28.36 • Standard Deviation (Average Shots): 1.72 • Mean (Winning Percentage): 43.67 • Standard Deviation (Winning Percentage): 9.55

  30. Measures of Central Tendencies and Measures of Spread

  31. Mean • Goals per game = 2.7 each • Shots per game = 28.4 each • Penalty minutes per game = 14.3 each • Face-off percentage = 50% each
      Note: The shots-per-game figure reflects 50% winning and 50% losing. If it were doubled to represent 100% winning, the number of shots taken would be 56.8 (28.4 * 2), which is almost identical to the value we found in the “Average Shots vs. Winning Percentage” section.

  32. Median • Goals per game = 2 • Shots per game = 28 • Penalty minutes per game = 12 • Face-off percentage = 50%

  33. Mode • Goals per game = 2 • Shots per game = 28 • Penalty minutes per game = 8 • Face-off percentage = 50%

  34. Standard Deviation When calculating the standard deviation of this data, it is important to realize that we are dealing with a population, not a sample, because we are dealing with every single game that took place during the 2002-2003 NHL season. • Goals per game = +/- 1.6 each • Shots per game = +/- 6.6 each • Penalty minutes per game = +/- 10.5 each • Face-off percentage = +/- 7.3% each
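The measures above, and the population-versus-sample distinction, can be illustrated with Python's `statistics` module. The goal counts below are made up for the example (they are not the season data); the point is only that the population formula divides by n while the sample formula divides by n - 1, so the two give different answers.

```python
import statistics

# Made-up goals-per-game values (NOT NHL data).
goals = [2, 4, 4, 4, 5, 5, 7, 9]

mean_goals = statistics.mean(goals)      # measure of central tendency
median_goals = statistics.median(goals)  # middle value of the sorted data
mode_goals = statistics.mode(goals)      # most frequent value

# Population standard deviation (divide by n): appropriate when every game is included.
population_sd = statistics.pstdev(goals)
# Sample standard deviation (divide by n - 1): for a sample drawn from a larger set.
sample_sd = statistics.stdev(goals)

print(mean_goals, median_goals, mode_goals)
print(round(population_sd, 3), round(sample_sd, 3))
```

Because the investigation covers every game of the season, the population version (Excel's STDEVP()) is the matching choice.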

  35. Probabilities

  36. Probability of Having More Shots Than Opponent and More Face-offs (Separate)

  37. Probability of Having More Shots and More Face-off Wins (Combined)
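An empirical probability is simply the number of games in which an event happened divided by the total number of games. The sketch below shows the separate and combined calculations on a handful of made-up game records; the tuple layout and function names are assumptions for illustration, not the layout of the original spreadsheet.

```python
# Each record: (shots_for, shots_against, faceoffs_won, faceoffs_lost).
# Made-up games (NOT NHL data).
games = [
    (30, 25, 28, 30),
    (22, 31, 33, 27),
    (35, 28, 31, 29),
    (27, 27, 25, 35),
]

def more_shots(game):
    return game[0] > game[1]

def more_faceoffs(game):
    return game[2] > game[3]

def empirical_probability(games, event):
    """Fraction of games in which the event occurred."""
    return sum(1 for g in games if event(g)) / len(games)

p_shots = empirical_probability(games, more_shots)
p_faceoffs = empirical_probability(games, more_faceoffs)
# Combined: both events must occur in the same game.
p_both = empirical_probability(games, lambda g: more_shots(g) and more_faceoffs(g))

print(p_shots, p_faceoffs, p_both)
```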

  38. Breakthrough! This combination of statistics is the biggest breakthrough when looking at how to pick a winner. As can be seen, the two 2003 NHL Stanley Cup Finalists sit atop the leader board. We can ignore Philadelphia because they are part of the same conference as New Jersey. So New Jersey and Anaheim are both in first place in their respective conferences. Let's evaluate this further!

  39. Analysis Taking into account everything we have learned, we can now start to make a prediction as to who the winners of the Stanley Cup will be this year. We will evaluate the most important statistics, face-offs and shots.

  40. Face-offs We learned earlier in our “Average Face-Off Winning Percentage vs. Game Winning Percentage” section that winning too many face-offs can be detrimental to your team's chances. We will examine the two Stanley Cup Finalists. • New Jersey has a 51% chance of winning more face-offs than their opponent. • Anaheim has a 56% chance of winning more face-offs than their opponent.

  41. When we put these percentages into our equation:
      New Jersey: Winning Percentage = -0.2578(51)^2 + 26.024(51) – 611.62 = 45.0662
      Anaheim: Winning Percentage = -0.2578(56)^2 + 26.024(56) – 611.62 = 37.2632
      Therefore, in the face-off category, the advantage goes to the New Jersey Devils.
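The two substitutions can be reproduced with a short function. This sketch only restates the quadratic trend line quoted on the slide; the function name `win_pct_from_faceoffs` is our own.

```python
def win_pct_from_faceoffs(faceoff_pct):
    """Quadratic trend line quoted on the slide for face-off % vs. winning %."""
    return -0.2578 * faceoff_pct ** 2 + 26.024 * faceoff_pct - 611.62

new_jersey = win_pct_from_faceoffs(51)
anaheim = win_pct_from_faceoffs(56)
print(round(new_jersey, 4), round(anaheim, 4))  # 45.0662 37.2632
```

Because the squared term is negative, the curve rises to a peak and then falls, which is why the higher face-off figure produces the lower predicted winning percentage.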

  42. Shots We learned earlier in our “Average Shots vs. Winning Percentage” section that the more shots you take, the better your winning percentage. For this we need to look past the above chart, and back at our Team Stats Chart. • New Jersey averages 31.7 shots per game • Anaheim averages 27.4 shots per game

  43. When we put these into our equation for shots:
      New Jersey: Winning Percentage = 2.0275(31.7) – 13.841 = 50.43075
      Anaheim: Winning Percentage = 2.0275(27.4) – 13.841 = 41.7125
      Therefore, in the shot category, the advantage once again goes to the New Jersey Devils.
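The shot comparison can be written the same way. This is a small sketch using the linear trend line and the two shot averages quoted on the slides; the dictionary layout and variable names are assumptions made for the example.

```python
def win_pct_from_shots(avg_shots):
    """Linear trend line quoted on the slides for average shots vs. winning %."""
    return 2.0275 * avg_shots - 13.841

# Average shots per game for the two finalists, as given on the slide.
avg_shots = {"New Jersey": 31.7, "Anaheim": 27.4}

predictions = {team: win_pct_from_shots(s) for team, s in avg_shots.items()}
shot_advantage = max(predictions, key=predictions.get)

for team, pct in predictions.items():
    print(team, round(pct, 5))
print("Advantage:", shot_advantage)
```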

  44. From these two pieces of information we can conclude confidently that the NEW JERSEY DEVILS will be the 2003 NHL Stanley Cup Champions.

  45. Testing The Face-off Win and Shot Equations In NHL 2003 by EA Sports

  46. Conclusions To answer the question that we first posed, “Can you predict a winner?”, the answer is yes. Based on our findings for shots and face-offs, we are confident that the New Jersey Devils will win the Stanley Cup. However, these findings are not limited to this year; they can be used over and over again. Sometimes they will need to be adjusted for rule changes and the like, but the principle will stay the same. To win the game you need two things: 1) More Shots 2) Less than 51.2% Face-off Wins

  47. What About a Split? If the equations return a different team for each (e.g., New Jersey with the advantage for shots, and Anaheim with the advantage for face-offs), then the team with the advantage for shots shall be chosen as the winner, because its correlation coefficient is larger and therefore the data fits the trend better.
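Put together, the selection rule might look like the sketch below. It is one possible reading of the rule, not a tested prediction system: the equations and r values are the ones quoted on the slides, while the function names, the dictionary layout, and the idea of feeding the face-off figure in under the key `faceoff_pct` are our own assumptions.

```python
# Correlation coefficients quoted on the slides for each relationship.
SHOTS_R = 0.364
FACEOFF_R = 0.171

def win_pct_from_shots(avg_shots):
    return 2.0275 * avg_shots - 13.841

def win_pct_from_faceoffs(faceoff_pct):
    return -0.2578 * faceoff_pct ** 2 + 26.024 * faceoff_pct - 611.62

def pick_winner(team_a, team_b):
    """Pick the team favoured by both equations; on a split, trust the larger r."""
    teams = (team_a, team_b)
    shots_pick = max(teams, key=lambda t: win_pct_from_shots(t["avg_shots"]))
    faceoff_pick = max(teams, key=lambda t: win_pct_from_faceoffs(t["faceoff_pct"]))
    if shots_pick is faceoff_pick:
        return shots_pick["name"]
    # Split decision: the shots equation has the larger correlation coefficient.
    return shots_pick["name"] if SHOTS_R > FACEOFF_R else faceoff_pick["name"]

new_jersey = {"name": "New Jersey", "avg_shots": 31.7, "faceoff_pct": 51}
anaheim = {"name": "Anaheim", "avg_shots": 27.4, "faceoff_pct": 56}
print(pick_winner(new_jersey, anaheim))  # New Jersey
```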

  48. Discussion When doing an investigation like this, it is important to note certain things. The first thing we would like to note is that when taking data, there can sometimes be a bias. In this case, since we used all of the data that the NHL made available, there is no bias. All of the data is fact: since it actually occurred, there can be no discrepancies.

  49. The second thing we would like to note is that, although the data may look nice and sound nice, and we think we have predicted the winners, in a game like hockey there are so many outliers. Some of these include: • Sickness • Temperature of the rink • Puck elasticity • Meals eaten • Loss of sleep the night before • Muscle pulls/injuries • New equipment/equipment problems • The list goes on…

  50. The Last Word… Every aspect of the investigation was carried out in the way that we learned in class. This was done to keep the concepts, theories and mathematical evaluations familiar, so that all who have taken Data Management, and even some who haven’t, could understand.
