Who is King of the Arena? NBA-Related Data Analysis • Team: NewBie Aggression • 2010.10.18
A Brief Project Introduction • Based on • Our love for the NBA • Orion’s powerful analysis capabilities • Aimed at • Enthusiasts who won’t be satisfied with simple statistics • We can tell • Which position has been more important in recent seasons • Which player is likely to win the MVP this year • Which teams are likely to reach the playoffs • And much more!
Dashboard Showcase It’s Show Time!
Technical Design • Logical Data Model • Warehouse Design • Project Creation & Management
Logical Data Model • Data Source Available • http://www.databasebasketball.com • Includes NBA seasonal team/player/coach statistics, playoff teams, final champions, and MVP winners from 1946 to 2008 • Lowest Data Level • Season - Team - facts (games won, games lost, pts, …) • Season - Team - Player - facts (pts, asts, …)
Logical Data Model • [Diagram: Season, Team, and Player entities with related attributes (Region, Period, City, Position, Age, Coach, …), an M:M relationship, and the lowest-level facts: Season-Team Facts and Season-Player Facts]
Warehouse Design • Lookup tables • Season, Player, Coach, Team, City, Region • Lowest-level fact tables • Same as the data source • Aggregate fact tables • Player Career Facts & Coach Career Facts • Aggregated on “season”, “player”… • Added for performance reasons
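The aggregate fact tables can be pictured as a roll-up of the lowest-level facts. The following is only a minimal sketch, assuming season-player rows that carry points and assists; the row layout, the sample values, and the name `build_player_career_facts` are hypothetical and not taken from the project.

```python
from collections import defaultdict

# Hypothetical season-player fact rows: (season, team, player, pts, asts).
# All values are made up for illustration.
season_player_facts = [
    (2006, "TeamA", "Player X", 1500, 400),
    (2007, "TeamA", "Player X", 1700, 450),
    (2007, "TeamB", "Player Y", 900, 120),
]

def build_player_career_facts(rows):
    """Roll season-level rows up to one career row per player."""
    career = defaultdict(lambda: {"seasons": set(), "pts": 0, "asts": 0})
    for season, _team, player, pts, asts in rows:
        entry = career[player]
        entry["seasons"].add(season)   # count distinct seasons played
        entry["pts"] += pts            # sum facts across seasons
        entry["asts"] += asts
    return {
        player: {"seasons": len(e["seasons"]), "pts": e["pts"], "asts": e["asts"]}
        for player, e in career.items()
    }

player_career_facts = build_player_career_facts(season_player_facts)
```

Precomputing such a table once means career-level reports do not have to scan every season row at query time, which is the performance motivation the slide mentions.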
Project Creation & Management • Metadata Schema • Regular Metrics & Reports • Widget Usage • Dashboard Layout • All standard procedures • Skipped here because of the time limit; details available in Q & A
Project Creation Spotlights – Evaluation Metrics • Examples: • Player credits (value) • Team Offensive/Defensive Efficiency • Team Winning Rate • Compound: • Combinations of simple metrics such as Asts, Pts, … • Reasonable: • Based on established NBA analysis sites • http://www.databasebasketball.com • http://www.nbastuffer.com
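As one small illustration of a compound metric built from simple facts, a team winning rate can be derived from the games-won and games-lost facts. This is a sketch under the assumption that winning rate means wins divided by games played; the function name is hypothetical, and the slides do not give the exact definitions of the efficiency metrics, so those are not shown.

```python
def winning_rate(games_won, games_lost):
    """Share of games won; returns 0.0 when no games were played."""
    total = games_won + games_lost
    return games_won / total if total else 0.0
```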
Project Creation Spotlights – Training Metric • Player evaluation: All-Star player prediction • Very difficult • All-Star players : all players = 1 : 20 • Training Metric - Logistic regression • Training set: 1950 – 2000 • Test set: 2001 – 2008 • Features • Asts, Blk, Dreb, Oreb, Pts, Reb, Stl, VI, Games played, Credits, FTA, FGA • Result • Precision: 69.7% • Recall: 60.5% • F Score: 64.8%
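The training setup above (logistic regression, trained on earlier seasons and tested on later ones) can be sketched in plain Python. This is not the project's implementation: the gradient-descent trainer, the two pre-scaled features, the sample rows, and all function names are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(weights, bias, x, threshold=0.5):
    """Return 1 (All-Star) when the predicted probability reaches the threshold."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= threshold else 0

# Made-up, already-scaled rows: (season, [pts, asts], is_all_star).
samples = [
    (1990, [0.9, 0.8], 1), (1990, [0.2, 0.1], 0),
    (1995, [0.8, 0.9], 1), (1995, [0.3, 0.2], 0),
    (2000, [0.1, 0.3], 0), (2000, [0.7, 0.7], 1),
    (2005, [0.85, 0.75], 1), (2005, [0.15, 0.2], 0),
]

# Split by season as on the slide: train through 2000, test from 2001.
train = [(x, y) for season, x, y in samples if season <= 2000]
test = [(x, y) for season, x, y in samples if season >= 2001]

weights, bias = train_logistic([x for x, _ in train], [y for _, y in train])
test_predictions = [predict(weights, bias, x) for x, _ in test]
```

Splitting by season rather than at random keeps the test seasons strictly later than the training seasons, so the model is never evaluated on years it has already seen.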
Project Creation Spotlights – Training Metric • Player evaluation: MVP player prediction • Much more difficult than All-Star prediction • MVP player : all players = 1 : 450 • The training metric is no longer effective because the dataset is extremely imbalanced • Instead, we used an empirical metric, “player credits”, to predict the MVP • Player Credits = PTS + REB + AST + STL + BLK - FG Missed - FT Missed - TO • Result • Top-1 credits player: 37.5% • Top-3 credits players: 62.5%
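The player credits formula translates directly into code. The sketch below also shows one plausible reading of the reported results, namely the fraction of seasons in which the actual MVP ranks among the top k players by credits; that reading, the function names, and the sample data are assumptions, not details confirmed by the slides.

```python
def player_credits(pts, reb, ast, stl, blk, fg_missed, ft_missed, to):
    """Player Credits = PTS + REB + AST + STL + BLK - FG Missed - FT Missed - TO."""
    return pts + reb + ast + stl + blk - fg_missed - ft_missed - to

def top_k_hit_rate(seasons, k):
    """Fraction of seasons whose actual MVP is among the k highest-credit players."""
    hits = 0
    for credits_by_player, mvp in seasons:
        ranked = sorted(credits_by_player, key=credits_by_player.get, reverse=True)
        if mvp in ranked[:k]:
            hits += 1
    return hits / len(seasons)

# Made-up seasons: ({player: credits}, actual MVP).
example_seasons = [
    ({"A": 2250, "B": 2100, "C": 1800}, "A"),
    ({"A": 2000, "B": 2200, "C": 2100}, "C"),
]

top1_rate = top_k_hit_rate(example_seasons, 1)
top3_rate = top_k_hit_rate(example_seasons, 3)
```

Ranking by a fixed formula needs no training labels, which is why it remains usable when there is only about one positive example per 450 players.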
Project Creation Spotlights – Training Metric • Team evaluation: playoff team prediction • Much easier • Training Metric - Logistic regression • Training set: 1950 – 2000 • Test set: 2001 – 2008 • Features • Offensive Efficiency, Defensive Efficiency, D Reb, D Pts, O Reb, O Pts, Turnovers • Result • Precision: 89.7% • Recall: 95.3% • F Score: 92.4%
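The three reported figures are related: assuming the F Score is the balanced F1 measure (the harmonic mean of precision and recall), it can be recomputed from the other two. The helper names below are hypothetical, and the confusion counts in the usage lines are made up.

```python
def f_score(precision, recall):
    """Balanced F score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def precision_recall_f(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F1 from confusion counts."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall, f_score(precision, recall)

# Recompute the F score from the precision and recall reported on the slide.
reported_f = f_score(0.897, 0.953)  # about 0.924

# Made-up counts: 8 correct playoff picks, 2 wrong picks, 2 missed playoff teams.
example_precision, example_recall, example_f = precision_recall_f(8, 2, 2)
```

Under the F1 assumption, the recomputed value rounds to the 92.4% shown on the slide.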
Division of Work • Yang Fan • Cao Lei • Tang Yang • Logical Data Model & Warehouse Design • Metadata Schema • ETL Processing • Metadata & Warehouse Check • Metric & Report • Metric & Report • Metric & Report • Dashboard • Training Metric • Presentation
Q & A Thank You!