Haslametrics.com offers unique predictive analysis for NCAA basketball based on teams' prior performances, using transitive comparisons and play-by-play data. The methodology focuses on shooting and scoring metrics to generate accurate rankings and outcomes. The algorithms factor in game pace, home-court advantage, and key game data elements, providing valuable insights for fans and teams alike.
Erik A. Haslam • Sports Business Club, University of Wisconsin • Madison, WI • October 9th, 2019
What is Haslametrics? • Haslametrics.com is a website (founded in 2014) designed to offer predictive analysis based on teams' prior performances in a given NCAA basketball season. • Algorithms behind the scenes fuel team rankings, projected outcomes of future games, and bracketology estimates throughout a majority of the season. • Transitive comparisons are the basis for the rankings and ratings. • Ratings adjusted for game pace, home-court advantage, meaningful minutes, and recentness of game data. • Ratings and rankings are based on overall game performance, not wins and losses. • Scripts and SQL Server do the work behind the curtain.
How is Haslametrics Unique? (1/3) • I have sworn off several of the more popular methods endorsed by hoops stats enthusiasts to rate teams. • Dean Oliver's “Four Factors of Basketball Success” • Effective field goal percentage • Turnover percentage • Offensive rebounding percentage • Free throw rate • For much of the year, is there enough data available to properly formulate a reliable equation?
How is Haslametrics Unique? (2/3) • My methodology focuses directly on shooting and scoring. • How many opportunities teams have to shoot • How close to the basket each shot is • How well teams shoot from different locations on the floor • How often steals and offensive rebounds affect the shot selection and success • Remember, we must also factor in these same traits from a defensive perspective.
How is Haslametrics Unique? (3/3) • My algorithms use play-by-play logs rather than box score data. • From the play-by-play logs that I have collected and parsed, I use only the portion of each game in which the outcome is still in question. (Clock times are included.) • Using a formula to determine when a game is “analytically final,” I can truncate data that is likely to be contaminated by bench players (“scrubs”) getting time on the floor once a lead is out of reach. • Play-by-play logs help us differentiate between “mid-range” two-point field goals and “near-proximity” two-point field goals. They also reveal special scoring scenarios.
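The truncation step can be sketched as follows. The slide does not give the "analytically final" formula, so this sketch takes the rule as a caller-supplied function. The names `truncate_at_final` and `placeholder_rule`, the event layout, and the cutoff numbers are all hypothetical and are not the Haslametrics method. The sketch also assumes that data is dropped from the first moment the rule fires, even if the lead later shrinks.

```python
from typing import Callable, List, Tuple

# Hypothetical event layout: (seconds_remaining, score_margin, description)
Event = Tuple[float, int, str]


def truncate_at_final(
    events: List[Event],
    is_final: Callable[[int, float], bool],
) -> List[Event]:
    """Keep events in order until the supplied rule declares the game decided."""
    kept = []
    for event in events:
        seconds_remaining, margin, _ = event
        if is_final(abs(margin), seconds_remaining):
            break  # everything from here on is treated as contaminated
        kept.append(event)
    return kept


def placeholder_rule(lead: int, seconds_remaining: float) -> bool:
    """Illustrative cutoff only; NOT the formula used by Haslametrics."""
    return lead > 20 and seconds_remaining < 300


events = [
    (600.0, 10, "made jumper"),
    (280.0, 22, "made layup"),   # placeholder rule fires here
    (100.0, 15, "made three"),
]
meaningful = truncate_at_final(events, placeholder_rule)
```

Passing the rule in as a parameter keeps the truncation logic separate from whatever formula actually defines "analytically final."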
Solving for “Where” vs. “How” • Three-pointers and free throws are self-explanatory. • Near-proximity field goals account for shots labeled as layups, tips, dunks, or alley-oops. • Mid-range field goals account for all other two-point shots. • “Second chance” opportunities account for shots five seconds or less after an offensive rebound. • “Breakaway” opportunities account for shots ten seconds or less after a steal.
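The definitions above can be expressed as a small classifier. This is a minimal sketch, assuming each shot has already been parsed into a record. The `Shot` fields, the function names, and the substring matching on the play-by-play text are assumptions for illustration; real logs would likely need a more careful parser. Only the labels and the five-second and ten-second windows come from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional

# Labels the slide lists as near-proximity shots
NEAR_PROXIMITY_LABELS = ("layup", "tip", "dunk", "alley-oop")


@dataclass
class Shot:
    description: str                          # play-by-play text for the shot
    points_attempted: int                     # 1 = free throw, 2 or 3 = field goal
    clock: float                              # game seconds elapsed at the shot
    last_off_rebound: Optional[float] = None  # elapsed seconds at the team's latest offensive rebound
    last_steal: Optional[float] = None        # elapsed seconds at the team's latest steal


def shot_location(shot: Shot) -> str:
    """Answer the 'where' question for one shot."""
    if shot.points_attempted == 1:
        return "free throw"
    if shot.points_attempted == 3:
        return "three-pointer"
    text = shot.description.lower()
    if any(label in text for label in NEAR_PROXIMITY_LABELS):
        return "near-proximity"
    return "mid-range"  # every other two-point shot


def shot_scenarios(shot: Shot) -> List[str]:
    """Answer the 'how' question: special scenarios that apply to the shot."""
    tags = []
    if shot.last_off_rebound is not None and 0 <= shot.clock - shot.last_off_rebound <= 5:
        tags.append("second chance")
    if shot.last_steal is not None and 0 <= shot.clock - shot.last_steal <= 10:
        tags.append("breakaway")
    return tags
```

Returning a list of scenario tags avoids having to decide which label wins when a shot follows both a steal and an offensive rebound, a case the slide does not address.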
The Transitive Comparison (3/3) • Sets of new transitive comparisons can be used to form secondary transitive comparisons. • We just formed a transitive comparison between Wisconsin and Boston College in our previous example. • If we form another transitive comparison between Boston College and, say, Duke, we have then formed a secondary comparison between Wisconsin and Duke. • Transitive comparisons are like cooking ingredients. • Metaphorically speaking, 353 empty pots are on the stove on Day One. (All teams are considered equal. There is no subjective bias.) • New ingredients are added to each pot at the conclusion of each day’s set of games during the season. • Game data “burns off” at a rate of 1.5% of its present value on a daily basis. • Only D-1 vs. D-1 games are considered.
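The "burn off" rule amounts to exponential decay: losing 1.5% of present value each day leaves a game with a weight of 0.985 raised to its age in days. The sketch below shows that weight, plus one way it might be applied. The slide states only the decay rate, so the function names and the weighted-average step are assumptions.

```python
from typing import List, Tuple

DAILY_BURN_OFF = 0.015  # 1.5% of present value per day, from the slide


def game_weight(days_old: int) -> float:
    """Remaining weight of a game's data after `days_old` days."""
    if days_old < 0:
        raise ValueError("days_old must be non-negative")
    return (1.0 - DAILY_BURN_OFF) ** days_old


def weighted_average(samples: List[Tuple[float, int]]) -> float:
    """Average of (value, days_old) pairs, with recent games counting more.

    Assumption: a weighted mean is only one plausible use of the weights.
    """
    total_weight = sum(game_weight(days) for _, days in samples)
    return sum(value * game_weight(days) for value, days in samples) / total_weight
```

At this rate a game's data loses about half its weight after roughly 46 days, since 0.985 raised to the 46th power is just under 0.5.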
Master Ratings (Offense & Defense) • Master ratings reflect predicted performance against the “AO” (average opponent), a fictitious opponent who represents the average in every one of our statistical categories.
Predicting Future Outcomes (1/2) • We have our master ratings, as well as the D-1 average for each of our metrics. • For any particular rating metric M, forecasting a game outcome O is performed by summing the offensive and defensive deviations from the D-1 average on top of the D-1 average itself. • $O_{\text{Offense1}} = \text{Avg}_M + (M_{\text{Offense1}} - \text{Avg}_M) + (M_{\text{Defense2}} - \text{Avg}_M)$ • Projections require adjustment for home-court advantage and assume teams play at “full strength” for the entire 40 minutes of play.
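The projection formula translates directly into code. In this minimal sketch the function and parameter names are illustrative, the example numbers are made up, and the home-court adjustment is left out because the slide does not give its size.

```python
def project_metric(
    offense_rating: float,
    opponent_defense_rating: float,
    d1_average: float,
) -> float:
    """Projected value of one metric for team 1's offense against team 2's defense.

    Adds the offensive and defensive deviations from the D-1 average
    to the D-1 average itself.
    """
    offensive_deviation = offense_rating - d1_average
    defensive_deviation = opponent_defense_rating - d1_average
    return d1_average + offensive_deviation + defensive_deviation


# Made-up numbers: an offense 0.10 above average against a defense
# that holds opponents 0.05 below average projects to about 1.05.
projection = project_metric(1.10, 0.95, 1.00)
```

Algebraically the expression reduces to the offensive rating plus the opposing defensive rating minus the D-1 average. The sketch keeps the longer form so that it mirrors the formula as written on the slide.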
Predicting Future Outcomes (2/2) • Factoring in game pace, the algorithms can estimate a final score for any of the 62,128 possible matchups in D-1 college basketball. • Doing so results in an All-Play Percentage for each team. • All-Play Percentage measures how many D-1 opponents each team should beat on a neutral court. • All-Play Percentage also determines the rankings you see on a daily basis during the season at Haslametrics.com.
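The figure of 62,128 is the number of distinct pairings among 353 teams. The sketch below checks that count and shows one way an all-play figure could be computed from projected neutral-court margins. The function names and the `predict_margin` callback are hypothetical, a projected tie is assumed to credit neither team, and the result is expressed as a fraction between 0 and 1.

```python
from itertools import combinations
from math import comb
from typing import Callable, Dict, List


def all_play_percentages(
    teams: List[str],
    predict_margin: Callable[[str, str], float],
) -> Dict[str, float]:
    """Share of opponents each team is projected to beat on a neutral court.

    `predict_margin(a, b)` is a hypothetical callback returning a's projected
    scoring margin over b. Assumption: a projected tie credits neither team.
    """
    wins = {team: 0 for team in teams}
    for a, b in combinations(teams, 2):
        margin = predict_margin(a, b)
        if margin > 0:
            wins[a] += 1
        elif margin < 0:
            wins[b] += 1
    opponents = len(teams) - 1
    return {team: wins[team] / opponents for team in teams}


def rank_teams(percentages: Dict[str, float]) -> List[str]:
    """Teams ordered from highest to lowest all-play percentage."""
    return sorted(percentages, key=percentages.get, reverse=True)


# 353 D-1 teams yield the 62,128 pairings cited on the slide.
matchups = comb(353, 2)
```

With three teams whose projected margins come from made-up ratings of 5, 2, and -1, the resulting fractions are 1.0, 0.5, and 0.0.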
Haslametrics Bracketology • “Bracketology Deserves” attempt to place teams based on a pre-determined algorithm. • System values “résumé” over “ratings.”
Automated Team Capsules • Analytics succeed only if you can translate them for the “consumer.” • Automated game previews have been developed (slated for a December 2019 release).
Ongoing Challenges • Injuries or loss of key contributors • How do you determine the true impact of a single player? • The “eye test” is still an important factor. Analytics simply cannot do it all. Consider them to be evidence at a crime scene. • Duke’s loss of Zion Williamson for five games near the end of the 2018-19 regular season was a great example. • Data acquisition and validity • More robust solutions are currently unaffordable. • STATS LLC costs are north of $10,000 per year. • Limited by quality of the data • e.g. “Mid-range” shots – Did they come from 5’ or 17’?
Recommendations • Simplify wherever possible! Don’t overcomplicate! • “The art of sports analytics is like digging a grave. 6" isn't enough, but 600' is just overkill. (No pun intended.)” • Don’t bury people under a mound of charts and useless data. • Know your audience! • Think gray, not black & white. • e.g. Performance vs. wins/losses • Be unique! • “Be the shepherd, not the sheep.” • Challenge the status quo. (The “10th-man mentality”) • Remember: analytics can be like wine-tasting. There is no good/bad, just personal preference.
Questions? • Erik A. Haslam, Haslametrics.com • Email: haslam@haslametrics.com • Twitter: @haslametrics • Facebook: www.facebook.com/Haslametrics