Neural Network Prediction of NFL Football Games. Joshua Kahn ECE539 – Fall2003. Overview. Introduction Work Performed Data Collection Preliminary Study Training and Prediction Set Creation Data Preprocessing Making Predictions Results Conclusion. Introduction.
Neural Network Prediction of NFL Football Games Joshua Kahn ECE539 – Fall2003
Overview • Introduction • Work Performed • Data Collection • Preliminary Study • Training and Prediction Set Creation • Data Preprocessing • Making Predictions • Results • Conclusion
Introduction • The National Football League (NFL) is a multi-billion dollar business • Many web sites claim to be able to predict the outcome of NFL games • Some of these sites are trustworthy, others are downright seedy • Which are actually correct?
Project Goal • Most prognostications are based on human opinion • Invariably, some degree of bias enters in • This project aims to create a completely objective, statistics-based system for predicting the outcome of NFL games • The trouble lies in the “intangible” aspects of the game • Even so, it seems plausible to create a statistical system
Why a Neural Network? • Teams can win in a variety of ways • No linear mapping exists to determine the outcome • This problem essentially boils down to a pattern classification problem • Neural networks are very good at solving these problems • Neural network provides a non-linear mapping
Data Collection • Data had to be available from a typical NFL box score • A large data set was required to represent the large number of ways to win • Collected from NFL.com • Used Excel’s web query feature to acquire tabular data, such as box scores and team averages
Data Collection • Data was extracted from the box scores using a Perl script • Perl provides an Excel interface • Statistics could be selected from the box scores as desired • Perl also allowed additional data processing • Needed to determine which statistics to use
Preliminary Study • Data was analyzed using Matlab to look for dependency, redundant data, etc. • No hyperplane exists to separate wins and losses based on statistical analysis
Preliminary Study Results • Determined the following statistics were most predictive: • Total yardage differential • Rushing yardage differential • Time of possession differential (in seconds) • Turnover differential • Home or away • Differential statistics provide insight into offensive and defensive performance • Scoring data was excluded as it would bias the network’s output toward a single feature
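The differential statistics listed above can be sketched as a small feature builder. This is only an illustration: the `TeamBoxScore` type, its field names, the sign convention for turnovers, and the +1/-1 encoding of home or away are assumptions, not details taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class TeamBoxScore:
    """Hypothetical container for one team's box-score totals in one game."""
    total_yards: int
    rushing_yards: int
    possession_seconds: int
    turnovers_committed: int
    is_home: bool


def differential_features(team, opponent):
    """Build the five inputs for `team` as (team minus opponent) differentials."""
    return [
        team.total_yards - opponent.total_yards,
        team.rushing_yards - opponent.rushing_yards,
        team.possession_seconds - opponent.possession_seconds,
        # Assumed convention: positive means the opponent gave the ball away more.
        opponent.turnovers_committed - team.turnovers_committed,
        # Assumed encoding: +1 for the home team, -1 for the away team.
        1 if team.is_home else -1,
    ]


home = TeamBoxScore(350, 120, 1900, 1, True)
away = TeamBoxScore(300, 90, 1700, 3, False)
print(differential_features(home, away))
print(differential_features(away, home))
```

With this convention the two teams in a game receive mirror-image vectors, which matches the idea that each differential reflects both offensive and defensive performance.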
Training and Prediction Sets • Training sets include the statistics for both teams for each game • Each training vector also includes the outcome of the game • Outcome marked for both teams • 1 = win, -1 = loss • Two prediction sets were created: • One based on team season averages • Other based on average of prior 3 weeks • Both sets were applied to determine effectiveness
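A minimal sketch of how the training rows and the two prediction sets described above might be assembled. The function names and the list-of-lists layout are hypothetical, and tied games are not handled because the slides do not mention them.

```python
def game_to_training_rows(home_features, away_features, home_won):
    """Return one (features, label) row per team, with 1 = win and -1 = loss."""
    home_label = 1 if home_won else -1
    return [(home_features, home_label), (away_features, -home_label)]


def trailing_average(weekly_stats, week, window=3):
    """Average each statistic over the `window` weeks before `week` (0-indexed)."""
    prior = weekly_stats[max(0, week - window):week]
    if not prior:
        raise ValueError("no prior weeks to average")
    return [sum(column) / len(prior) for column in zip(*prior)]


def season_average(weekly_stats, week):
    """Average each statistic over every week played before `week`."""
    return trailing_average(weekly_stats, week, window=week)


# Hypothetical weekly [total yards, rushing yards] for one team.
weekly = [[300, 90], [330, 120], [360, 150], [390, 180]]
print(trailing_average(weekly, 4))  # prior 3 weeks
print(season_average(weekly, 4))    # all 4 prior weeks
```

Either averaged vector can then be turned into differentials against the opponent's averaged vector to form a prediction input.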
Neural Network Selection • Back-propagation multi-layer perceptron provides a great deal of flexibility • Good pattern classifier • Supervised learning • Network parameters and structure were determined based on testing
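For readers unfamiliar with the technique, here is a small back-propagation multi-layer perceptron in NumPy. It is a sketch under assumptions, not the network used in the project: the single hidden layer, the tanh activations (chosen so outputs fall in the same -1 to 1 range as the win/loss labels), the hidden size, learning rate, and epoch count are all illustrative.

```python
import numpy as np


def train_mlp(x, y, hidden=4, epochs=2000, lr=0.1, seed=0):
    """Train a one-hidden-layer tanh MLP by batch gradient descent on squared error."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=0.5, size=(x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    target = y.reshape(-1, 1)
    for _ in range(epochs):
        # Forward pass.
        h = np.tanh(x @ w1 + b1)
        out = np.tanh(h @ w2 + b2)
        # Backward pass: propagate the error through both tanh layers.
        d_out = (out - target) * (1 - out ** 2)
        d_h = (d_out @ w2.T) * (1 - h ** 2)
        # Gradient step, averaged over the batch.
        w2 -= lr * h.T @ d_out / len(x)
        b2 -= lr * d_out.mean(axis=0)
        w1 -= lr * x.T @ d_h / len(x)
        b1 -= lr * d_h.mean(axis=0)
    return w1, b1, w2, b2


def mlp_output(x, params):
    """Return one network output in [-1, 1] per input row."""
    w1, b1, w2, b2 = params
    return np.tanh(np.tanh(x @ w1 + b1) @ w2 + b2).ravel()


# Toy data: one input feature, label 1 (win) or -1 (loss).
x = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
params = train_mlp(x, y)
print(mlp_output(x, params))
```

In practice the structure and parameters would be tuned by testing, as the slide notes.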
Data Preprocessing • Processed all data using singular value decomposition • Gives additional weight to the most pertinent features prior to network input • Makes training more effective • Performed using Matlab’s svd function
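The slides do not say exactly how the SVD was applied, so the following NumPy sketch shows one plausible reading: project the data onto the right singular vectors of the training matrix, so the leading columns carry the directions with the largest singular values. The column standardization step and the function names are assumptions added for illustration; `np.linalg.svd` stands in for Matlab's `svd`.

```python
import numpy as np


def fit_svd_basis(train_features):
    """Standardize the columns, then return the SVD basis of the training data."""
    mean = train_features.mean(axis=0)
    scale = train_features.std(axis=0)
    scale[scale == 0] = 1.0  # avoid dividing by zero for constant columns
    standardized = (train_features - mean) / scale
    _, singular_values, vt = np.linalg.svd(standardized, full_matrices=False)
    return mean, scale, singular_values, vt


def apply_svd_basis(features, mean, scale, vt, keep=None):
    """Project rows onto the right singular vectors, optionally keeping the first few."""
    projected = ((features - mean) / scale) @ vt.T
    return projected if keep is None else projected[:, :keep]


# Hypothetical data: 20 games, 5 input statistics.
rng = np.random.default_rng(0)
train = rng.normal(size=(20, 5))
mean, scale, singular_values, vt = fit_svd_basis(train)
projected = apply_svd_basis(train, mean, scale, vt)
print(projected.shape, singular_values)
```

The same `mean`, `scale`, and `vt` fitted on the training set would be reused for the prediction sets so that both pass through an identical transform.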
Making Predictions • Trained network using training data • Applied prediction data three times • Used both the season averages and the three-week averages to compare the effectiveness of the two • Found the average of the three trials • Classified winner/loser of game • Winner had higher network output
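The final classification step above can be sketched as follows. The function name and the "home"/"away" return values are hypothetical, and returning `None` on an exact tie is an assumption, since the slides do not say how equal outputs were handled.

```python
def pick_winner(home_outputs, away_outputs):
    """Average each team's outputs over the trials and pick the higher average."""
    home_avg = sum(home_outputs) / len(home_outputs)
    away_avg = sum(away_outputs) / len(away_outputs)
    if home_avg == away_avg:
        return None  # assumed behaviour: no pick when the averages are equal
    return "home" if home_avg > away_avg else "away"


# Hypothetical network outputs from three trials for each team.
print(pick_winner([0.6, 0.4, 0.5], [-0.2, 0.1, -0.1]))
```

Averaging over several trials smooths out differences between training runs before the two teams' outputs are compared.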
Results • Neural network classification is correct 94% of the time when actual (not predicted) statistics are used • NFL teams seem to be consistent over the long term
Results • [Slide graphics: prediction results for Week 14 and Week 15]
Baseline Study • Neural network was more accurate on average • Previous neural network predictors were accurate for 63% of games
Conclusions • Each of the eight misclassifications can be subjectively placed in one of three categories
Conclusions • Prediction rate could be improved by adding the “human element” • Take immeasurables into consideration • Las Vegas betting lines • Subjective team rankings • Training set could be based on previous season data • Ways in which teams win presumably do not change over time • Shows that a statistically based system can be developed to predict the outcome of NFL games
References • Haykin, S. (1999). Neural Networks: A Comprehensive Foundation. Upper Saddle River, New Jersey: Prentice-Hall, Inc. • Purucker, M.C. (1996). Neural Network Quarterbacking. IEEE Potentials, vol. 15:3, pp. 9-15. • ESPN.com, http://www.espn.com [Retrieved Dec 2003]. • NFL.com, http://www.nfl.com [Retrieved Dec 2003].
Questions??? Thank you…