Using Performance Metrics to Forecast Success in the National Hockey League

Presentation Transcript


  1. Josh Weissbock Using Performance Metrics to Forecast Success in the National Hockey League

  2. Outline • Introduction to Hockey • Introduction to Performance Metrics • Predicting the outcome of a Single Game • Exploration of Performance Metrics J. Weissbock (2013)

  3. Aim Performance Metrics (or “Advanced Stats”) have been shown online to correlate much more strongly with wins and points in the standings, for the National Hockey League, than the traditional statistics published by the NHL. Can we use these advanced stats to predict success in the NHL?

  4. Introduction Hockey has received little academic attention. It is hard to analyze because of the low number of events (goals). We attempt to use Machine Learning to predict the outcome of a single game in the National Hockey League: using Traditional Statistics; using Performance Metrics; and “Tuning Performance Metrics”.

  5. Sports in Machine Learning Chen et al. (1994) used Neural Networks to predict greyhound races. 2006 Soccer World Cup prediction accuracy 75%. NFL accuracy with neural networks: 78.6%. NCAA Football games prediction accuracy: 76%. NBA basketball accuracy at 76%.

  6. Background Hockey is a sport played on a rectangular sheet of ice, 60-61 m x 25-30 m. Two teams of five players and one goalkeeper (goalie). The team that scores the most goals in 60 minutes wins. The NHL is the top league in the world. Other major leagues: KHL, SHL, ELH. The top 16 teams at the end of the year compete in an elimination tournament for the Stanley Cup: four rounds of best-of-seven series.

  7. Background Traditional statistics are “Real Time Scoring System” (RTSS) statistics. They are based on goals (low-event), are usually simple goal-based stats, and are subject to rink bias, e.g. Goals, Assists, +/-, Giveaways, Takeaways. Bias has been demonstrated amongst RTSS statistics across NHL arenas. Advanced statistics are based on more events (all shots, misses, blocks, and goals) and have been shown to be highly correlated with wins and points.

  8. Advanced vs Traditional Statistics Source: http://blogs.thescore.com/nhl/2013/02/25/breaking-news-puck-possession-is-important-and-nobody-told-the-cbc/

  9. Advanced vs Traditional Statistics Source: http://www.nucksmisconduct.com/2013/2/13/3987546/exploring-marginal-save-percentage-and-if-the-canucks-should-trade-a

  10. Advanced Statistics Fenwick Close: a statistic of possession, the summation of shots on goal, missed shots, and goals (blocked shots are excluded; counting them as well gives the related Corsi statistic). Correlates to zone time. “Close” refers to counting events only when the score is within 1 in the 1st/2nd period, or tied in the 3rd/OT, to eliminate “Score Effects”. PDO: a statistic of “luck” (or random chance), the summation of Shooting % + Save %. Regresses to 100% +/- 2% over an 82-game season. 5/5 Goals For/Against: the ratio of goals scored for and against during even-strength play.
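The “Close” filter and the PDO sum described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's collection script: the function names, the integer encoding of periods (overtime as period 4 or later), and the argument order are all assumptions.

```python
def is_score_close(period: int, score_diff: int) -> bool:
    """Return True if an event counts toward a "Close" statistic.

    Assumed encoding: periods 1 and 2 are the first two periods,
    period 3 and anything above it (overtime) require a tied score.
    """
    if period <= 2:
        return abs(score_diff) <= 1   # score within one goal
    return score_diff == 0            # tied in the 3rd period or OT


def pdo(goals_for: int, shots_for: int,
        goals_against: int, shots_against: int) -> float:
    """PDO = shooting % + save %, both expressed on a 0-100 scale."""
    shooting_pct = 100.0 * goals_for / shots_for
    save_pct = 100.0 * (shots_against - goals_against) / shots_against
    return shooting_pct + save_pct


if __name__ == "__main__":
    # Hypothetical totals: 10 goals on 100 shots for, 8 goals on 100 shots against.
    print(pdo(10, 100, 8, 100))
    print(is_score_close(period=2, score_diff=-1))
    print(is_score_close(period=3, score_diff=1))
```

A value above 100 suggests a team has been running “hot” on shooting or goaltending, which is the sense in which the slide calls PDO a luck statistic.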

  11. Experiment 1 • Predicting a single game in the NHL using both advanced and traditional statistics. • To see how high an accuracy we can obtain. • To see whether advanced or traditional statistics are more helpful for prediction at the micro-scale.

  12. Data 517 games of the 2012-2013 NHL season (72% of the season). A Python script collected data daily. 14 features per team were collected, including: Location, Goals For & Against, Season Goals For & Against, PP%, PK%, Sh%, Sv%, 5v5 Goals For/Against, Win Streak, Conference Standing, Fenwick Close, PDO. Data were collected before and after each game (goals scored for & against, shots for & against) to assist in calculating statistics for future games. Sources: NHL.com, BehindTheNet.com, TSN.ca/NHL.

  13. Example of data

  14. Experiment Each game is represented as the differentials between the two teams' statistics, with two entries per game (one for each team), each labelled as either “win” or “loss”. Classifiers: Weka’s implementations of SMO (Support Vector Machines), Neural Networks, J48 (Decision Tree), and NaiveBayes. Binary classification evaluated with 10-fold cross-validation. Compared datasets of only traditional statistics, only advanced statistics, and both.
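The differential representation can be illustrated with a short sketch. The feature names, the dictionary layout, and the 1/0 encoding of location are assumptions made for the example; the slides do not show the actual file format passed to Weka.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]


def game_to_rows(home: Features, away: Features,
                 home_won: bool) -> List[Tuple[Features, str]]:
    """Turn one game into two labelled rows of feature differentials.

    Each row is written from one team's point of view: its own
    pre-game statistics minus the opponent's.
    """
    keys = sorted(home)
    home_row = {k: home[k] - away[k] for k in keys}
    away_row = {k: away[k] - home[k] for k in keys}
    # Location is not a differential; 1.0 = home, 0.0 = away (assumed encoding).
    home_row["location"] = 1.0
    away_row["location"] = 0.0
    return [
        (home_row, "win" if home_won else "loss"),
        (away_row, "loss" if home_won else "win"),
    ]


if __name__ == "__main__":
    # Hypothetical pre-game statistics for the two teams.
    home = {"pdo": 101.0, "season_goals_for": 50.0}
    away = {"pdo": 99.0, "season_goals_for": 45.0}
    for row, label in game_to_rows(home, away, home_won=True):
        print(label, row)
```

Because the two rows of a game are mirror images of each other, a classifier that treats them independently can assign both the same label, which is the inconsistency addressed later in the experiment.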

  15. Experiment

  16. Experiment Best results came from Neural Networks: with additional tuning, an accuracy of 59.38%, not statistically different from SMO. Splitting the data into training/testing sets (66%/33%) gave an accuracy of 57.83%. Looking at pairs that the algorithm labelled Win/Win or Loss/Loss, keeping the label with the highest confidence and inverting the other, gave an accuracy of 59%. Ensemble learning with stacking and voting returned similar accuracy.
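The pair-correction step can be sketched as follows, assuming each prediction is available as a (label, confidence) tuple. The function name and the tie-breaking rule (keep the first entry's label on equal confidence) are assumptions, not details taken from the slides.

```python
from typing import Tuple

Prediction = Tuple[str, float]  # (label, classifier confidence)

_FLIP = {"win": "loss", "loss": "win"}


def reconcile_pair(first: Prediction, second: Prediction) -> Tuple[str, str]:
    """Make the two predictions for one game consistent.

    If both entries received the same label, keep the more confident
    one and invert the other; otherwise leave the labels unchanged.
    """
    label_a, conf_a = first
    label_b, conf_b = second
    if label_a != label_b:
        return label_a, label_b
    if conf_a >= conf_b:            # ties keep the first entry (assumption)
        return label_a, _FLIP[label_a]
    return _FLIP[label_b], label_b


if __name__ == "__main__":
    # Both entries of the game were predicted "win"; the second is less confident.
    print(reconcile_pair(("win", 0.90), ("win", 0.60)))
```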

  17. Experiment Using Consistency Subset Evaluation, the top three features were: Location; Goals against; Goal differential.

  18. Experiment In the second half of our experimental evaluation, we consider shortening PDO to the last n games to see how “lucky” a team has been recently.
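One way to compute PDO over only the last n games is to pool goals and shots across that window before taking the percentages. This is a sketch under assumptions: the per-game tuple layout and the choice to pool totals (rather than average per-game PDO values) are illustrative and not confirmed by the slides.

```python
from typing import Optional, Sequence, Tuple

# (goals_for, shots_for, goals_against, shots_against) for one game, oldest first
GameTotals = Tuple[int, int, int, int]


def recent_pdo(games: Sequence[GameTotals], n: int) -> Optional[float]:
    """PDO computed over the last n games by pooling goals and shots.

    Returns None when the window contains no shots for or against.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    window = games[-n:]             # all games if fewer than n were played
    goals_for = sum(g[0] for g in window)
    shots_for = sum(g[1] for g in window)
    goals_against = sum(g[2] for g in window)
    shots_against = sum(g[3] for g in window)
    if shots_for == 0 or shots_against == 0:
        return None
    shooting_pct = 100.0 * goals_for / shots_for
    save_pct = 100.0 * (shots_against - goals_against) / shots_against
    return shooting_pct + save_pct


if __name__ == "__main__":
    # Hypothetical three-game log for one team.
    log = [(3, 30, 2, 25), (1, 20, 4, 40), (2, 25, 1, 30)]
    for n in (1, 2, 3):
        print(n, recent_pdo(log, n))
```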

  19. Discussion Altering PDO does not appear to have a significant effect on accuracy. We can predict ~60% of games correctly. Although possession has been shown to be more useful for long-term predictions, traditional statistics are better for predicting a single game. In a single game the most valued features are: goals against, goal differential, and location.

  20. Future Work Collecting additional features: rest days, days of travel, time-zone shifts, altitude shifts, change in weather at the arena, gambling odds, injuries, score-adjusted Fenwick, possession over the last n games. Collecting a full season of data (normally 1,230 games). Training on past seasons of data. Comparing prediction of a single game in other leagues with the same features. Predicting the playoffs with best-of-seven series.

  21. Conclusion ~60% accuracy in predicting a single game. Traditional statistics are more effective than advanced statistics for predicting a single game. Predicting a single game is difficult due to large variance in the standings. The theoretical limit for machine-learning prediction of a single NHL game appears to be 62%. This limit changes with the parity of the league and the number of events.

  22. Questions? Joshua Weissbock jweis035@uottawa.ca Follow my hockey analysis on Twitter: @joshweissbock
