
Football for KMS: NFL ’01

April 30th, 2008. Abhijit Kumar, Kaijia Bao, Vishal Rupani. Course Instructor: Prof. Hsinchun Chen.


Presentation Transcript


  1. Football for KMS: NFL ’01. April 30th, 2008. Abhijit Kumar, Kaijia Bao, Vishal Rupani. Course Instructor: Prof. Hsinchun Chen

  2. Agenda
     • Team: Vishal, Kai, Abhi
     • Project tasks: Data Cleaning, Statistical Analysis, Final Paper; Data Collection, Client Relations, Final Presentation; Data Import, Data Transformation, Data Mining
     • Topics: Objectives, Literature Overview, Knowledge Discovery, Statistical Analysis, Data Mining Techniques, Key Findings, KMS Demonstration, Conclusion

  3. Research Objectives • Pattern identification • Descriptive Statistics • Data Mining Techniques • Prediction • Developing a strategy • Fantasy League

  4. Literature Overview
     • Moneyball: The Art of Winning an Unfair Game, Michael Lewis
     • Las Vegas Odds: www.VegasInsider.com
     • NFL Fantasy League: www.Nfl.com/fantasy

  5. Knowledge Discovery Process
     DATA
     • Pro-Football: 3 tables, 40 columns, 82,346 rows
     • Lisa Ordonez: 1 table, 90 columns, 50,417 rows
     TRANSFORMATION
     • Dependent Variables: Play Decision, Intended Player, Play Direction, Yards
     • Calculated Variables: GameNum, IsPlayChal, PlayZone, TotalOffTO, PlayDecision, QtrTimeLeft, HalfTimeLeft, GameTimeLeft
     • Independent Variables: Defense, Down, GAP, Halftime Left, Off Ydl, Offense, Play Zone, QTR, ToGo, Total Off TO
     Tools: SQL 2005 AS, SQL 2005 IS
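
The calculated variables QtrTimeLeft, HalfTimeLeft and GameTimeLeft suggest that clock values were derived from the quarter and the quarter clock. The slides do not give the formulas, so the following is only a minimal sketch under stated assumptions: regulation 15-minute quarters, no overtime, and times expressed in seconds. The function name `derive_time_left` is hypothetical.

```python
QUARTER_SECONDS = 15 * 60  # assumption: regulation NFL quarter length, in seconds


def derive_time_left(qtr, qtr_time_left):
    """Derive half and game time left (seconds) from quarter number and quarter clock.

    Assumed inputs: qtr in 1..4, qtr_time_left in 0..900. Overtime is not handled.
    """
    if qtr not in (1, 2, 3, 4):
        raise ValueError("this sketch covers regulation quarters 1-4 only")
    if not 0 <= qtr_time_left <= QUARTER_SECONDS:
        raise ValueError("quarter clock must be between 0 and 900 seconds")

    # Quarters 1 and 3 are followed by one more full quarter in the same half.
    full_quarters_left_in_half = 1 if qtr in (1, 3) else 0
    full_quarters_left_in_game = 4 - qtr

    return {
        "QtrTimeLeft": qtr_time_left,
        "HalfTimeLeft": qtr_time_left + full_quarters_left_in_half * QUARTER_SECONDS,
        "GameTimeLeft": qtr_time_left + full_quarters_left_in_game * QUARTER_SECONDS,
    }


# Example: 5:00 left in the third quarter
example = derive_time_left(3, 300)
```

Keeping all three values in seconds makes them directly comparable as model inputs; whether the original project used seconds or minutes is not stated.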

  6. Knowledge Discovery Process
     DATA
     • Pro-Football: 3 tables, 40 columns, 82,346 rows
     • Lisa Ordonez: 1 table, 90 columns, 53,000 rows
     TRANSFORMATION
     • Dependent Variables, Calculated Variables, Independent Variables
     PROCESSING
     • Simple Statistics: Play Decision, Intended Player, Play Direction, Yards
     MINING
     • Models: ID3, Neural Networks
     • Accuracy: Lift Charts, Classification Matrix
     Tools: SQL 2005 AS, SQL 2005 IS, MS Excel 2007

  7. Dependency Network

  8. Dependency Network

  9. Intended Player: Statistics
     Top 3 intended players for passes, for each of the 4 teams that played in the semi-finals:
     • H.Ward (142), P.Burress (121), B.Shaw (44)
     • T.Brown (143), D.Patten (93), M.Edwards (39)
     • T.Holt (133), M.Faulk (104), I.Bruce (103)
     • J.Thrash (107), D.Staley (89), T.Pinkston (83)

  10. Play Direction: Statistics
     • Direction of rushes for all plays in the 2001 season
     [Figure: rush directions: Left End, Left Tackle, Left Guard, Middle, Right Guard, Right Tackle, Right End]

  11. Play Direction: Statistics
     • Direction of rushes for all plays in the 2001 season
     [Chart: Number of Rushes by Direction]

  12. Yardage: Statistics
     • Yardage during each down for passes and rushes
     [Chart: Average Yards Covered and Yards To Go, for Passes and Rushes]

  13. Play Decision: Statistics
     • Play decisions for the 4 teams that played in the semi-finals
     [Chart: Number of Decisions by Play Decision Type]

  14. Play Decision: Analysis Overview • Discovery of what environmental and/or game factors affect play decision • Discovery of football expert knowledge through data mining • Prediction of play decisions based on game factors

  15. Play Decision: ID3 Analysis

  16. Play Decision: ID3 Analysis
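
ID3 builds a decision tree by repeatedly splitting on the attribute with the highest information gain. The slide images are not part of this transcript, so the following is only a minimal sketch of that selection step. The rows are invented toy data, not the NFL 2001 records, and the function names are hypothetical; only the column names (Down, PlayZone, PlayDecision) come from the slides.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction of `target` obtained by splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Toy rows, invented for illustration only.
plays = [
    {"Down": 1, "PlayZone": "own", "PlayDecision": "Rush"},
    {"Down": 1, "PlayZone": "opp", "PlayDecision": "Rush"},
    {"Down": 3, "PlayZone": "own", "PlayDecision": "Pass"},
    {"Down": 3, "PlayZone": "opp", "PlayDecision": "Pass"},
]

# ID3 would split first on the attribute with the largest gain.
best_attribute = max(
    ["Down", "PlayZone"],
    key=lambda a: information_gain(plays, a, "PlayDecision"),
)
```

In this toy set Down separates the two decisions perfectly while PlayZone carries no information, so Down is chosen as the first split. Real play-by-play data would give far less clear-cut gains.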

  17. Play Decision: Accuracy

  18. Rush Accuracy: Lift Chart

  19. Field Goal Accuracy: Lift Chart
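
A lift chart compares how well a model ranks cases against random selection. The charts themselves are not in this transcript, so the following is only a sketch of lift in its common ratio form: the positive rate among the top-scored fraction of cases divided by the overall positive rate. The charts on the slides may plot a related but different quantity, and the scores and outcomes below are invented toy values.

```python
def lift_at(scores, actual, fraction):
    """Lift for the top `fraction` of cases when ranked by descending score.

    `actual` holds 1 for the target class (for example, the play was a rush) and 0 otherwise.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    base_rate = sum(actual) / len(actual)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive cases")

    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:k]) / k
    return top_rate / base_rate


# Toy model scores and true outcomes, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
actual = [1, 1, 0, 1, 0, 0, 1, 0]

lift_top_quarter = lift_at(scores, actual, 0.25)
lift_top_half = lift_at(scores, actual, 0.50)
```

A lift above 1 for a small fraction means the model concentrates the target class among its highest-scored cases; at a fraction of 1.0 the lift is always 1.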

  20. Play Decision: Classification Matrix
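
A classification matrix tabulates actual against predicted classes, which shows which play decisions a model confuses with each other. The matrix from the slide is not in this transcript; the sketch below only illustrates the construction on invented toy labels, with hypothetical function names.

```python
from collections import Counter


def classification_matrix(actual, predicted):
    """Return nested dicts: matrix[actual_label][predicted_label] -> count."""
    counts = Counter(zip(actual, predicted))
    labels = sorted(set(actual) | set(predicted))
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}


def accuracy(actual, predicted):
    """Fraction of cases where the predicted label equals the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Toy labels, invented for illustration only.
actual = ["Rush", "Rush", "Pass", "Pass", "Punt", "Pass"]
predicted = ["Rush", "Pass", "Pass", "Pass", "Punt", "Rush"]

matrix = classification_matrix(actual, predicted)
overall_accuracy = accuracy(actual, predicted)
```

Reading the matrix row by row gives per-decision accuracy, which can differ considerably between decision types even when overall accuracy looks acceptable.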

  21. Play Decision: Key Findings
     • Football strategy can be discovered from data rather than from domain experts
     • Top 3 factors affecting the decision: Down, Off Ydl, Time
     • Model accuracy differs depending on which play decision is being predicted
     • Team-specific strategies may be discovered with more data

  22. Play Direction: Analysis Overview
     • Discover each team’s strengths and weaknesses in defense and/or offense
     • Predict play direction based on game factors
     [Figure: rush directions: Left End, Left Tackle, Left Guard, Middle, Right Guard, Right Tackle, Right End]

  23. Play Direction: Accuracy

  24. Play Direction: Key Findings (ID3)

  25. Intended Player: Analysis Overview • Discover each team’s favored recipient of a pass • Prediction of intended player based on game factors

  26. Intended Player: Lift Chart

  27. Intended Player: Key Findings • There are 400+ intended players • Not enough data to accurately predict intended players • Not enough data to gain knowledge over statistical models

  28. Conclusions

  29. Future Direction
     • Increase sample set: more instances of different scenarios
     • Incorporate additional information: Pro-football-Reference.com, VegasInsider.com (odds for favorites)
     • Extend analysis: nested case (historical performance)

  30. References
     • Prof. Lisa Ordóñez: Professor in Statistics
     • Steve Aldrich: Author of Moneyball in Football
     • About Football: Glossary of terms

  31. Knowledge Discovery Process
     DATA
     • Pro-Football: 3 tables, 40 columns, 82,346 rows
     • Lisa Ordonez: 1 table, 90 columns, 53,000 rows
     TRANSFORMATION
     • Dependent Variables, Calculated Variables, Independent Variables
     PROCESSING
     • Simple Statistics: Play Decision, Intended Player, Play Direction, Yards
     MINING
     • Models: ID3, Neural Networks
     • Accuracy: Lift Charts, Classification Matrix
     Tools: SQL 2005 AS, SQL 2005 IS, MS Excel 2007

  32. Research Objectives; Literature Overview; Knowledge Discovery; Statistics: Intended Player; Statistics: Play Direction; Statistics: Yardage; Statistics: Play Decision; Accuracy: Lift Chart Charts; Analysis: Play Decision; Analysis: Play Direction; Analysis: Intended Player; Conclusions; Future Directions; System Design

  33. Backup Slide Section

  34. Data Collection
     • 55,000 rows, 90 columns
     • 47,033 rows, 30 columns
     • Variables: Dependent: 4, Independent: 10, Calculated: 9

  35. System Design
     [Diagram labels: NFL KMS; Football Data (NFL Season 2001); DB; Model Building; Testing/Accuracy; Pattern Analysis; Field Strategy; Defense Strategy; Metrics; Formations; Accuracy; Substitutions; Performance; Play Decisions]

  36. Yards Analysis
     • Yards gained on the play is used as a metric to measure effort
     • Discover how environmental and/or game factors affect players’ effort
     • Key Findings: Top 4 environmental factors: Off Ydl, Time, Down, Gap
