
How to Catch a Tiger: Understanding Putting Performance on the PGA TOUR

Discover statistical models predicting golf performance on PGA TOUR, correcting for player skill & course difficulty, and assessing shot value.


Presentation Transcript


  1. How to Catch a Tiger: Understanding Putting Performance on the PGA TOUR • Jason Acimovic, MIT Operations Research Center, acimovic@mit.edu • Douglas Fearing, MIT Operations Research Center, dfearing@mit.edu • Professor Stephen Graves, MIT Sloan School of Management, sgraves@mit.edu

  2. Agenda • Introduction • Project Question • Applications • Approach and contribution • Golf and data overview • Putting model • Off-green model • Situational analysis

  3. Project Question • How well do people perform on tasks? • Tasks differ from each other • Not everyone performs every task • Even the same task can be different from person to person

  4. Applications • Evaluating employees in a distribution center • Pickers in a warehouse vary in skill (picks per hour) • Pick zones vary in difficulty (books vs. electronics) • Difficulty also varies by hour of day and day of week • Pickers shift around, but not enough to ensure perfect mixing • How do you compensate the best employees and identify underperformers? • Golf putting • Different golfers play different tournaments • Greens vary in their difficulty • Different golfers start on the green from different distances • How do we identify the best putters?

  5. Project approach and contribution • Develop statistical models to predict strokes-to-go • Correct for player skill and course difficulty • Evaluate incremental value of each shot taken relative to the expectation for the field • Compare predicted strokes-to-go before and after shot • Aggregate shot value across players, shot types, etc. to better understand player performance • Compare our model to current metrics, namely, Putting Average • Paper: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1538300 (or email us)

  6. Agenda • Introduction • Golf and data overview • Strokes-to-go example • ShotLink data • Putting model • Off-green model • Situational analysis

  7. Quick golf primer • The goal is to get from the tee to the pin in the fewest number of strokes • 18 holes in a round of golf • Typically 4 rounds in a tournament • Lowest total score wins [Figure: diagram of a hole, labeled Tee, Fairway, Green]

  8. Strokes-to-go example • Shot value = strokes-to-go before – strokes-to-go after – 1 stroke taken • 4.4 – 3.0 – 1 = 0.4
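The arithmetic on this slide matches the approach described on slide 5: compare the predicted strokes-to-go before and after a shot, then subtract the one stroke that was spent. A minimal sketch in Python; the function name `shot_value` is ours and does not come from the presentation.

```python
def shot_value(strokes_to_go_before: float, strokes_to_go_after: float) -> float:
    """Value of one shot relative to the field expectation.

    The shot costs one stroke, so it gains value only if it lowers the
    expected strokes-to-go by more than 1.
    """
    return strokes_to_go_before - strokes_to_go_after - 1.0


# The slide's example: expectation drops from 4.4 to 3.0 with one stroke.
value = shot_value(4.4, 3.0)
print(f"shot value: {value:.1f}")
```

A positive value means the shot beat the field expectation; a holed shot has a strokes-to-go of 0 afterwards.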

  9. ShotLink Data • Every tournament, 250 volunteers gather data on every shot • Lasers pinpoint the ball location to within an inch • Field volunteers gather qualitative characteristics • Data are used for both real-time reporting and detailed analyses • 5 million shot data points • 2 million putt data points

  10. Visual explanation of ShotLink™ dataset • Fields: Course, Year, Round Number, Hole Number, Player, Shot Number, Location Type, Ball Lie, Hole Par, Stimp Reading, Green Length • Location fields: Tee Location, Ball Location, Pin Location, with X, Y, and Z coordinates [Figure: 16th Hole on Colonial]

  11. Data for the 14th hole at Quail Hollow – 1 day

  12. Agenda • Introduction • Golf and data overview • Putting model • Empirical data • Two stage model • Holing out submodel • Distance-to-go submodel • Markov chain • Correct for hole difficulty and player skill • Putts-gained per round and results • Off-green model • Situational analysis

  13. Empirical mean and std. dev. of putts-to-go [Figure panels: Mean; Std. Dev.]

  14. Two-stage model to predict putts-to-go • First stage sub-model • From anywhere on the green, the first model predicts the probability of sinking the putt [Figure annotation: probability of 0.1 of making it in on this putt]

  15. Second stage finds conditional distance-to-go • Second stage sub-model • If the golfer misses the putt, the second model calculates the distribution of the distance-to-go over the green [Figure annotation: if I miss, I have a 0.0021 probability of being in this blue area; calculate this for the entire green]

  16. Combine and … • We can calculate the putts-to-go distribution from anywhere on the green • We consider only distance in our model

  17. Empirical probabilities of holing out [Figure: empirical probability of holing out vs. distance]

  18. Normal regression is inappropriate • With ordinary least squares regression, one might predict the probability of making a putt from the starting distance • But: • We want to predict a probability, which must lie between 0 and 1 • Errors are not normally distributed

  19. One-putt logistic regression model • Y – putts-to-go • d – initial distance to the pin • Fitted model parameters: • Probability:

  20. Model holing out as a logistic regression [Figure: model probability of holing out vs. distance]
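The fitted parameters did not survive in this transcript, so the following is only a sketch of the shape of such a model, not the authors' fit. It uses the form p = [1 + exp(…)]^(-1) that appears on the Markov chain slide. The linear predictor in log-distance and the coefficients `B0` and `B1` are assumptions invented for illustration.

```python
import math

# Hypothetical coefficients, invented for illustration only.
# They are NOT the fitted values from the presentation.
B0, B1 = -5.0, 2.4


def holing_out_probability(distance_ft: float) -> float:
    """P(Y = 1 | d): probability of holing out with a single putt."""
    # Assumed linear predictor: intercept plus slope times log-distance.
    eta = B0 + B1 * math.log(distance_ft)
    # Logistic link keeps the result strictly between 0 and 1.
    return 1.0 / (1.0 + math.exp(eta))


for d in (3, 10, 30):
    print(f"{d:>2} ft: {holing_out_probability(d):.2f}")
```

Unlike an ordinary least squares fit, the logistic form cannot return a probability outside (0, 1), which is the objection raised on slide 18.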

  21. 2nd-stage problem, determining distance-to-go • What happens if we miss the first putt? [Figure: distance-to-go z after a miss]

  22. Empirical mean and std. dev. of distance-to-go [Figure panels: Mean; Std. Dev.]

  23. Empirical distributions of distance-to-go [Figure panels: From 10 ft.; From 30 ft.]

  24. Distance-to-go gamma regression model • d – initial distance to the pin • z – distance-to-go (assuming a miss) • Fitted model parameters: • Mean: • Density:
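The mean and density formulas are missing from this transcript. As a hedged illustration of what a gamma model for distance-to-go looks like, the sketch below uses a gamma density whose mean depends on the initial distance d. The shape value and the linear mean function are assumptions made up for the example, not the fitted model.

```python
import math

SHAPE = 2.0  # hypothetical gamma shape parameter


def mean_distance_to_go(d: float) -> float:
    """Hypothetical mean leave (ft) after a miss from d ft."""
    return 0.5 + 0.08 * d


def distance_to_go_density(z: float, d: float) -> float:
    """f(z | d): gamma density of the distance-to-go z, given a miss from d."""
    if z <= 0.0:
        return 0.0
    scale = mean_distance_to_go(d) / SHAPE  # gamma mean = shape * scale
    log_pdf = ((SHAPE - 1.0) * math.log(z) - z / scale
               - math.lgamma(SHAPE) - SHAPE * math.log(scale))
    return math.exp(log_pdf)


# Numerical sanity check from 30 ft: total mass and mean of the density.
step = 0.01
zs = [k * step for k in range(1, 10001)]  # 0.01 ft to 100 ft
mass = sum(distance_to_go_density(z, 30.0) for z in zs) * step
mean = sum(z * distance_to_go_density(z, 30.0) for z in zs) * step
print(f"mass = {mass:.4f}, mean = {mean:.3f} ft")
```

A gamma distribution is a natural candidate here because distance-to-go is positive and right-skewed, which a normal model would not capture.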

  25. Distance-to-go model: mean and std. dev. [Figure panels: Mean; Std. Dev.]

  26. Distance-to-go model distributions [Figure panels: From 10 ft.; From 30 ft.]

  27. Putts-to-go as Markov chain • States: distance d from the pin, plus an absorbing state H (holed out) • Transition from d to H: p = [1 + exp(…)]^(-1) • Transition from d to z: g(z|d) = (1 − [1 + exp(…)]^(-1)) × f(z|d) • Once in H, the chain stays there: p = 1 • Probability of holing out in n putts is the probability of reaching the absorbing state in n transitions • Where g(z|d) is the probability density of ending up at z conditioned on starting at d, and f(z|d) is the probability density of ending up at z conditioned on missing and starting at d (from the distance-to-go gamma regression model)
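One way to turn this chain into numbers is to discretize distance onto a grid and push probability mass forward one putt at a time. The sketch below does that under stated assumptions: the logistic coefficients, the gamma shape and mean function, and the 0.5 ft grid are all hypothetical choices for illustration, and the presentation does not say how the authors evaluated the chain.

```python
import math

B0, B1 = -5.0, 2.4            # hypothetical logistic coefficients
SHAPE = 2.0                   # hypothetical gamma shape
STEP, MAX_DIST = 0.5, 100.0   # grid resolution and extent, in feet
GRID = [(j + 0.5) * STEP for j in range(int(MAX_DIST / STEP))]


def p_hole(d):
    """Transition probability from distance d to the absorbing state H."""
    return 1.0 / (1.0 + math.exp(B0 + B1 * math.log(d)))


def miss_density(z, d):
    """f(z | d): gamma density of the leave z after a miss from d."""
    scale = (0.5 + 0.08 * d) / SHAPE  # hypothetical mean function
    return math.exp((SHAPE - 1) * math.log(z) - z / scale
                    - math.lgamma(SHAPE) - SHAPE * math.log(scale))


def miss_row(d):
    """Discretized f(. | d), normalized to sum to 1 over the grid."""
    weights = [miss_density(z, d) for z in GRID]
    total = sum(weights)
    return [w / total for w in weights]


HOLE = [p_hole(d) for d in GRID]
ROWS = [miss_row(d) for d in GRID]


def putts_distribution(start_distance, max_putts=20):
    """Return [P(N = 1), ..., P(N = max_putts)] for the number of putts N."""
    start = min(range(len(GRID)), key=lambda j: abs(GRID[j] - start_distance))
    state = [0.0] * len(GRID)
    state[start] = 1.0
    probs = []
    for _ in range(max_putts):
        # Mass absorbed into H on this putt.
        probs.append(sum(s * h for s, h in zip(state, HOLE)))
        # Remaining mass moves to new distances: g(z|d) = (1 - p) * f(z|d).
        nxt = [0.0] * len(GRID)
        for i, s in enumerate(state):
            if s > 0.0:
                miss = s * (1.0 - HOLE[i])
                for j, w in enumerate(ROWS[i]):
                    nxt[j] += miss * w
        state = nxt
    return probs


def expected_putts(start_distance):
    probs = putts_distribution(start_distance)
    return sum(n * p for n, p in enumerate(probs, start=1))


print(f"E[putts] from 30 ft: {expected_putts(30.0):.2f}")
```

The cumulative sums of `putts_distribution` give the "within n putts" probabilities, and `expected_putts` gives the mean putts-to-go for a starting distance.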

  28. Making it within n putts (model prediction) • Over 90% of golfers 2-putt or better from within 35 ft. • Only a 1.6% chance of 4-putting or worse at 100 ft. [Figure: two-stage model, probability of holing out within N putts]

  29. Two-stage model mean and std. dev. [Figure panels: Mean; Std. Dev.]

  30. Comparing putt quality • Greens vary in difficulty • Fast vs. slow greens • Type and length of grass • Good putts on a hard green should be valued more than the same on an easy green • Adjust parameters for each hole to the logistic and gamma regression models

  31. Revised logistic and gamma regressions • Every player p and hole h have their own dummy variables and specific holing-out probabilities* • I_p is the indicator variable: it equals 1 if observation i contains player p and 0 otherwise • Instead of a regression with 6 parameters, we now have thousands of parameters • E.g., there is a β_0h parameter for every hole • The gamma regression is adjusted similarly • *The actual analysis accounts for the number of observations per player and per hole, so that the model is more complex for players about whom we know more.
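The revised equations are not preserved in this transcript. The sketch below shows only the general idea of indicator (dummy) variables shifting the intercept of a logistic model by player and by hole. The base coefficients, the offsets, and the names `player_A`, `hole_easy`, and so on are hypothetical, and the real model may adjust more than the intercept.

```python
import math

BASE = (-5.0, 2.4)  # hypothetical field-average intercept and slope
# Hypothetical intercept offsets; a negative offset raises the holing-out
# probability in this parameterization.
PLAYER_OFFSETS = {"player_A": -0.2, "player_B": 0.3}
HOLE_OFFSETS = {"hole_easy": -0.25, "hole_hard": 0.25}


def indicator(observed, target) -> int:
    """I_p for one observation: 1 if it involves `target`, else 0."""
    return 1 if observed == target else 0


def holing_out_probability(distance_ft, player, hole):
    eta = BASE[0] + BASE[1] * math.log(distance_ft)
    # Each dummy variable switches on exactly one player or hole offset.
    eta += sum(b * indicator(player, p) for p, b in PLAYER_OFFSETS.items())
    eta += sum(b * indicator(hole, h) for h, b in HOLE_OFFSETS.items())
    return 1.0 / (1.0 + math.exp(eta))


for hole in ("hole_easy", "hole_hard"):
    p = holing_out_probability(8.0, "player_A", hole)
    print(f"player_A, {hole}, 8 ft: {p:.2f}")
```

A player or hole without an entry gets no offset, so it falls back to the field-average curve.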

  32. Visualizing player skill level differences • Comparison of above average (Brent Geiberger), below average (John Huston), and field average putter for an average green

  33. Visualizing green difficulty differences • Comparison of an easy green (Bay Hill #9), difficult green (Sawgrass #1), and average green based on a field average golfer

  34. Calculating putts gained per round • Calculate the gain associated with each putt • Relative to the putts-to-go for each specific hole • Example: Golfer starts at 12 ft. and takes 2 putts to sink the ball • Expected putts-to-go: 1.71 • Actual number of putts: 2 • Relative gain: 1.71 – 2 = –0.29 • Sum the relative gains for each player • Divide by the number of rounds played
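The steps on this slide can be sketched as a small aggregation. The records below are hypothetical apart from the 12 ft. example (expected 1.71, actual 2), and the record layout and function name are our own.

```python
from collections import defaultdict

# (player, expected putts-to-go at the first putt, actual putts taken).
# Hypothetical data; the first row is the slide's 12 ft. example.
GREEN_VISITS = [
    ("player_A", 1.71, 2),
    ("player_A", 1.95, 2),
    ("player_A", 1.10, 1),
    ("player_B", 1.71, 1),
    ("player_B", 2.05, 2),
]
ROUNDS_PLAYED = {"player_A": 1, "player_B": 1}


def putts_gained_per_round(visits, rounds_played):
    """Sum (expected - actual) per player, then divide by rounds played."""
    totals = defaultdict(float)
    for player, expected, actual in visits:
        totals[player] += expected - actual
    return {p: totals[p] / rounds_played[p] for p in totals}


for player, gain in putts_gained_per_round(GREEN_VISITS, ROUNDS_PLAYED).items():
    print(f"{player}: {gain:+.2f} putts gained per round")
```

Because each gain is measured against the expectation for the specific hole and starting distance, a golfer is not penalized for starting far from the pin.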

  35. Top 10 putts gained per round

  36. Putting average is the most popular metric today • Putting Average • Average number of putts per green* • When a golfer reaches a green • Count the putts it takes to get it in the hole • Average this among all his green appearances • Regardless of how close he starts on the green *Actually, a green in regulation, which means the green was reached in no more than (par – 2) strokes
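For comparison, the Putting Average defined on this slide can be sketched as follows. The hole records are hypothetical; the green-in-regulation test follows the footnote (green reached in no more than par – 2 strokes).

```python
def putting_average(holes):
    """Average putts per green in regulation.

    `holes` is an iterable of (par, strokes_to_reach_green, putts).
    """
    gir_putts = [putts for par, reach, putts in holes if reach <= par - 2]
    if not gir_putts:
        raise ValueError("no greens in regulation")
    return sum(gir_putts) / len(gir_putts)


# Hypothetical round fragment: the second hole is not a green in regulation.
HOLES = [(4, 2, 2), (4, 3, 1), (3, 1, 1), (5, 3, 2)]
print(f"putting average: {putting_average(HOLES):.3f}")
```

Nothing in this calculation uses the starting distance of the first putt, which is why the metric can mix putting skill with approach-shot quality.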

  37. Comparing with putting average

  38. Understanding the discrepancies • [Figure placeholder: first-putt distance histograms for the most severe outlier] • Percentage of 1st putts 20 ft. or closer: • 54% for all players • 51% for Stephen Leaney • 60% for Ernie Els • On average Els starts closer to the hole, so his putting average is flattered by his excellent approach shots

  39. Agenda • Introduction • Golf and data overview • Putting model • Off-green model • Situational analysis

  40. Evaluating off-green performance • For each hole, calculate “field par” • Empirical average number of strokes corrected for player skill and hole difficulty • Calculate total strokes gained per round for each player • Calculate off-green strokes gained per round (Off-green strokes gained = Total strokes gained – putts gained)
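The decomposition on this slide is simple subtraction once the per-hole "field par" values are known. A minimal sketch, with hypothetical field-par values and function names of our own:

```python
def strokes_gained_per_round(hole_results, rounds):
    """Total strokes gained per round.

    `hole_results` is an iterable of (field_par, actual_strokes) per hole.
    """
    return sum(field_par - strokes for field_par, strokes in hole_results) / rounds


def off_green_strokes_gained(total_gained, putts_gained):
    """Off-green strokes gained = total strokes gained - putts gained."""
    return total_gained - putts_gained


# Hypothetical three-hole fragment of a single round.
HOLE_RESULTS = [(4.1, 4), (3.2, 3), (4.9, 5)]
total = strokes_gained_per_round(HOLE_RESULTS, rounds=1)
off_green = off_green_strokes_gained(total, putts_gained=0.5)
print(f"total: {total:+.2f}, off-green: {off_green:+.2f}")
```

In this made-up fragment the golfer gains strokes overall but loses them off the green, meaning the putter is carrying the score.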

  41. Top 10 golfers (on and off green performance)

  42. Agenda • Introduction • Golf and data overview • Putting model • Off-green model • Situational analysis • Player specific putts • Fourth round pressure • Tiger Woods’ fourth round performance

  43. Situational putting performance • Above, we used the general putting model to evaluate putting relative to the field of professionals • We also have the capability to evaluate a golfer’s putting relative to his own expected performance • For instance, even if Tiger Woods usually putts better than the field, we can also determine when he putts worse than himself • Does he putt better or worse after the cut? • Does he putt better or worse for birdie vs. for par?

  44. Player-specific putts gained – example • On the 10th green at Quail Hollow, 9 feet from the pin: • Tiger Woods’ personal expected putts-to-go is 1.54 • Vijay Singh’s personal expected putts-to-go is 1.59 • If they each sink it, Tiger gains only 0.54 strokes whereas Vijay gains 0.59 strokes

  45. Advantages of player-specific putts gained • Easy to test various hypotheses • After calculating the shot value for every putt, we need only to filter and aggregate the results • Describes the magnitude in terms of score impact • Suggests areas for further investigation • Standard deviation of putts gained provides the relative significance of the effect
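The "filter and aggregate" step can be sketched as below. The records are hypothetical, and reporting a standard error alongside the mean is our own choice for judging the size of an effect; the presentation only says that the standard deviation of putts gained is used.

```python
import math
import statistics

# (round number, personal expected putts-to-go, actual putts). Hypothetical.
PUTT_RECORDS = [
    (1, 1.54, 1),
    (2, 1.80, 2),
    (3, 1.20, 1),
    (4, 1.54, 2),
    (4, 1.90, 2),
    (4, 1.10, 1),
]


def summarize(records, keep):
    """Mean, std. dev., and standard error of putts gained for kept rounds."""
    gains = [expected - actual for rnd, expected, actual in records if keep(rnd)]
    mean = statistics.mean(gains)
    std_dev = statistics.stdev(gains)
    return mean, std_dev, std_dev / math.sqrt(len(gains))


early = summarize(PUTT_RECORDS, lambda rnd: rnd < 4)
final = summarize(PUTT_RECORDS, lambda rnd: rnd == 4)
print(f"rounds 1-3: mean {early[0]:+.3f}, std. err. {early[2]:.3f}")
print(f"round 4:    mean {final[0]:+.3f}, std. err. {final[2]:.3f}")
```

Testing a different hypothesis, such as birdie putts versus par putts, only requires a different filter on the same per-putt values.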

  46. Fourth round pressure • Putting does not seem to be affected by the pressures of being in the fourth round

  47. Tiger Woods’ fourth round performance • A common perception is that Tiger has the ability to kick it up a notch during the final round • Looking at his putts-gained suggests otherwise

  48. Conclusion • Developed a model for putting • Corrected for player skill and hole difficulty • Intuitive model that describes how putts occur • Demonstrated the differences between our metric and current putting statistics • Developed a “field par” which corrects for hole difficulty and quality of field • Compared on- and off-green performance • Examined situational putting performance
