PitchFX: Sounds great! ... Now, where do I get it?

Presentation Transcript


  1. PitchFX: Sounds great! ... Now, where do I get it? Daniel I. Brooks, The University of Iowa

  2. PitchFX

  3. PitchFX • Tracks each and every pitch thrown in MLB in real time • Provides all of the parameters necessary to very accurately model the flight of the baseball from the pitcher’s hand to the plate • Data includes accompanying play-by-play info • Available at a very affordable price. (Free!) • Sounds great! ... Now, where do I get it?

  4. Overview Prologue: Accessing PitchFX Data in 2007 PitchFX Data Today Part 1: How can interested fans get access to it? Part 2: What can they get out of it? Part 3: How could availability and analysis improve?

  5. Prologue: How to get PitchFX Data Circa 2007

  6. (The First?) Step by Step Guide • Alan Nathan • Published August 6th, 2007 • http://webusers.npl.illinois.edu/~a-nathan/pob/tracking.htm • Contains a section on “How to Download”. • Ought to be straightforward enough, right?

  7. How to Download MLB Extended Gameday Pitch Logs I. How to Download Downloading the data: Go to the web site http://gd2.mlb.com/components/game/mlb/. Click on the year, then on the month; on the next page click on the day; on the next page click on the specific game; on the next page click on pbp; on the next page click on pitchers. For the Baltimore vs. Boston game played on August 1, 2007, the full link is as follows: http://gd2.mlb.com/components/game/mlb/year_2007/month_08/day_01/gid_2007_08_01_balmlb_bosmlb_1/pbp/pitchers/ The above steps take you to a page with a bunch of links that are of the form zzzzzz.xml, where zzzzzz is a six-digit code for a specific pitcher (see section III). For the above game, click on 122201.xml, which will get you to the pitch logs of Paul Shuey, who pitched to two batters in the 7th inning. There is no way you could have known this.
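The click-through path on this slide can also be assembled programmatically. The following is a minimal Python sketch, assuming the directory layout shown in the Baltimore vs. Boston example. The function name `gameday_pitchers_url` and its parameters are hypothetical, and the ordering of the two team codes (Baltimore first, Boston second) is inferred from that single example. It only builds the URL string and does not attempt a download, since the server may no longer serve these files.

```python
# Base directory quoted on the slide.
BASE_URL = "http://gd2.mlb.com/components/game/mlb/"


def gameday_pitchers_url(year, month, day, first_team, second_team, game_number=1):
    """Build the pitcher-log directory URL for one game.

    Team codes such as "bal" and "bos" and their ordering are assumptions
    taken from the one example URL on the slide.
    """
    date_part = f"{year:04d}_{month:02d}_{day:02d}"
    game_id = f"gid_{date_part}_{first_team}mlb_{second_team}mlb_{game_number}"
    return (
        f"{BASE_URL}year_{year:04d}/month_{month:02d}/day_{day:02d}/"
        f"{game_id}/pbp/pitchers/"
    )


# Reproduce the example from the slide: Baltimore vs. Boston, August 1, 2007.
url = gameday_pitchers_url(2007, 8, 1, "bal", "bos")
print(url)
```

Appending a pitcher file name such as `122201.xml` to the result gives the address of a single pitcher's log, but the six-digit code still has to be looked up separately, which is the difficulty the slide points out.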

  8. Here’s What You Get: Error Message?

  9. How to Download You will then see a lot of numbers on the screen. Use whatever tools you have with your browser (e.g., "save page as") to save it as 122201.xml in some convenient folder. Now launch Excel. From the File menu, open the file you just saved. An Open XML box will pop up. Check the As an XML list box, then click OK, and the file will load. You should see columns A through AK (37 columns total) filled and with headers in the first row. Immediately save it as an Excel file. The number of columns may change depending on when the file was written, and there is no guarantee that the number will remain the same into the future. However, the header names will hopefully stay constant. In the next section, I will discuss the meaning of the important parameters in the database. As long as your version of Excel supports this… earlier versions (and some Mac versions) do not.
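The Excel import described on this slide amounts to turning each pitch record into a row and each field into a column. Below is a rough Python sketch of the same idea using only the standard library. The sample XML, the `pitch` element name, and the attribute names are illustrative assumptions, not the confirmed layout of the real files. Because the slide warns that the number of columns may change, the header is built from whatever attribute names actually appear.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a saved pitcher file such as 122201.xml.
SAMPLE_XML = """<pitches>
  <pitch des="Ball" start_speed="91.2" pfx_x="-4.1" pfx_z="8.3"/>
  <pitch des="Called Strike" start_speed="78.5" pfx_x="3.0" pfx_z="-2.2" pitch_type="CU"/>
</pitches>"""


def pitches_to_rows(xml_text):
    """Return (header, rows): one dict per <pitch> element."""
    root = ET.fromstring(xml_text)
    rows = [dict(pitch.attrib) for pitch in root.iter("pitch")]
    # Collect column names in first-seen order, so new columns are kept
    # instead of breaking the export.
    header = []
    for row in rows:
        for name in row:
            if name not in header:
                header.append(name)
    return header, rows


def rows_to_csv(header, rows):
    """Write the rows as CSV text; missing fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=header, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


header, rows = pitches_to_rows(SAMPLE_XML)
csv_text = rows_to_csv(header, rows)
print(csv_text)
```

The resulting CSV opens in any spreadsheet program, which sidesteps the dependence on a particular Excel version's XML import.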

  10. It gets harder… • But that’s only if you want to look at two batters’ worth of data (pitched by Paul Shuey!) • Want to look at multiple starts? Then you need a database. And you probably need Perl or PHP or some other scripting language with easy XML parsing. And then you need a database front-end, and you need to learn SQL to access your database…
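To give a sense of what the database step on this slide involves, here is a small Python sketch using the standard-library `sqlite3` module with an in-memory database. The table name, column names, game identifiers, and pitch values are all made up for illustration. It is one possible setup, not the workflow anyone in the talk is confirmed to have used.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pitches ("
    "game_id TEXT, pitcher_id INTEGER, pitch_type TEXT, start_speed REAL)"
)

# Hypothetical pitches from two starts by the same pitcher.
sample_pitches = [
    ("game_a", 1, "FF", 92.0),
    ("game_a", 1, "FF", 94.0),
    ("game_a", 1, "CU", 76.0),
    ("game_b", 1, "FF", 91.0),
    ("game_b", 1, "FF", 93.0),
]
conn.executemany("INSERT INTO pitches VALUES (?, ?, ?, ?)", sample_pitches)

# Pitch count and average velocity per start and pitch type.
query = """
    SELECT game_id, pitch_type, COUNT(*), AVG(start_speed)
    FROM pitches
    WHERE pitcher_id = ?
    GROUP BY game_id, pitch_type
    ORDER BY game_id, pitch_type
"""
summary = conn.execute(query, (1,)).fetchall()
for game_id, pitch_type, count, avg_speed in summary:
    print(game_id, pitch_type, count, avg_speed)
```

Even this toy version requires a schema, a loading step, and a SQL query, which illustrates why the slide treats multi-start analysis as a real barrier.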

  11. Accessing PitchFX data in 2007… • Is really hard. • Is really time consuming. • Requires a high level of technical expertise. • And that’s before you ever get into what it means. • What has changed to help remedy this problem?

  12. Part 1: The Casual Sabermetrician, or How can interested fans access PitchFX Data?

  13. The Casual Sabermetrician • A new group of baseball viewers • “Casual Sabermetricians” • This group is roughly made up of: • Bloggers • Forum Dwellers And… • Sportswriters • Major League Scouts

  14. The Casual Sabermetrician • The casual sabermetrician: • Wants to answer data-driven questions • Knows PitchFX Data is out there • Lacks expertise to access PitchFX data

  15. How to Access PitchFX Data? • There are now a few different ways these individuals can access PitchFX data: • Josh Kalk’s Website (now offline) • Fangraphs.com • BrooksBaseball.net

  16. PitchFX Tools • Fangraphs.com • Seasonal detail • Some game-by-game info • Lots of other sabermetric statistics handy • BrooksBaseball.net • Lots of game-by-game detail • Easily view other pitchers from same game • Strikezone maps / Splits / Situational Graphs

  17. PitchFX Tools • These tools simplify getting information • Still require that you can interpret the data once you have it… • …but they offload the busy work onto computers

  18. Part 2: What can we get from the PitchFX data?

  19. Let’s Pick a Pitcher • Suppose we were interested in Jon Lester. • Let’s generate a “scouting report”: • What does he throw? • How hard does he throw the ball? • What mix of pitches does he use in games? • Which pitches worked for him? • When does he throw different pitches?

  20. FanGraphs.com: How to Search

  21. FanGraphs.com: Click “PitchFX”

  22. FanGraphs.com: Pitch Selection

  23. FanGraphs.com: Velocity Tracking

  24. PitchFX through B-Ref

  25. PitchFX through B-Ref

  26. Searching BrooksBaseball.net

  27. Jon Lester Pitch Clusters

  28. Jon Lester Pitch Clusters

  29. Lefty/Righty Splits Vs. LHH Vs. RHH

  30. Maintaining Velocity

  31. Different Pitches in Different Counts

  32. Smoltz Pitching Backwards

  33. Strikezone Map

  34. PitchFX Tools • Using a combination of PitchFX tools we can get an incredible amount of information about how a pitcher has performed. • Fangraphs: season-wide perspective • BrooksBaseball: start-by-start perspective

  35. PitchFX Tools • Each tool provides other information that can help evaluate a pitcher: • FanGraphs provides easy access to other sabermetric pitching statistics • BrooksBaseball provides easy access to other pitcher detail from the same game

  36. One More Case Study: Aroldis Chapman “He has a fastball clocked at 101 or 102 MPH, and a plus curveball and plus slider, to use the scouts' vernacular.” -ESPN “Aroldis Chapman has a tantalizing 100 mph fastball, but also question marks about his other pitches -- and his maturity.” -…also ESPN “He throws 100 and 101 mph… If he polishes up his changeup and tightens up his slider, he can be a young Randy Johnson.” -His Agent “His fastball was clocked from anywhere between 97 and 100 mph.” -MLB.com "In order to become the best pitcher, I still need lots of things. I need to improve professionally. I need to work. I need to work with curveballs. I need to work with other kinds of pitches." -Chapman

  37. Case Study: Aroldis Chapman

  38. Can Aroldis Throw 100mph?

  39. A “plus slider and plus curveball”?

  40. Case Study: Aroldis Chapman • You can go do this at home. • You need to know virtually nothing about computers; you just need to know who Aroldis Chapman is and when he might have pitched.

  41. Part 3: Improving Availability and Analysis

  42. The New Access Barrier • The casual sabermetrician: • Wants to answer data-driven questions • Can easily access PitchFX data Problem Solved! ... Right?

  43. The New Access Barrier • The casual sabermetrician: • Wants to answer data-driven questions • Can easily access PitchFX data Problem Solved! ... Right? • Two existing problems: • Data analysis is non-trivial • How trustworthy is the data?

  44. Data Analysis is Non-Trivial • Identifying pitches can be difficult at first • though it gets easier with practice • Sabermetricians are notoriously descriptive statisticians. • You could read dozens of articles online and not find a single inferential statistical test or any measure of variability. • This is exacerbated by a strange fascination with small sample sizes.
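As a concrete illustration of the "measure of variability" this slide finds missing, the sketch below reports a sample standard deviation and a standard error alongside an average fastball velocity. The velocities are invented for the example, and the variable names are hypothetical. With a sample this small the standard error is itself only a rough guide.

```python
import math
import statistics

# Hypothetical fastball velocities (mph) from a single outing.
speeds = [92.0, 94.0, 93.0, 95.0, 91.0]

mean_speed = statistics.mean(speeds)
# Sample standard deviation (n - 1 in the denominator).
spread = statistics.stdev(speeds)
# Standard error of the mean: how uncertain the average itself is.
standard_error = spread / math.sqrt(len(speeds))

print(f"mean = {mean_speed:.1f} mph")
print(f"standard deviation = {spread:.2f} mph")
print(f"standard error = {standard_error:.2f} mph (n = {len(speeds)})")
```

Reporting the sample size and standard error next to the average makes it easier to judge whether a difference between two outings is larger than the noise.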

  45. Problems with Trust • Our tools purport to show lots of information • How accurate is this info? • Scouts/Teams may feel that the data isn’t trustworthy enough to use to evaluate pitchers. • May feel that due to obvious errors, data is bad • Consistent pitch classification is a huge problem. • May feel that due to odd conventions, data makes no sense

  46. That Graph From Earlier

  47. Consistent Classification is an Issue

  48. Problems with Trust • Certain conventions that the community has adopted are strange, and educated fans/scouts get frustrated • Vertical Movement (rising fastballs, etc.) • Certain results from the data are so counterintuitive that people get worried • The large majority of sinkers don’t really sink.
