
Generalized Linear Mixed Model


juan

Presentation Transcript


  1. Generalized Linear Mixed Model English Premier League Soccer – 2003/2004 Season

  2. Introduction
     • English Premier League Soccer (Football)
       • 20 teams; each plays all others twice (home/away)
       • Games consist of two halves (45 minutes each)
       • No overtime
       • Each team is on offense and defense for 38 games (38 first halves and 38 second halves)
     • Response Variable: Goals in a half
     • Potential Independent Variables
       • Fixed Factors: Home Dummy, Half2 Dummy, Game # (1-38)
       • Random Factors: Offensive Team, Defensive Team
     • Distribution of Response: Poisson?
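
The data layout described on this slide can be sketched as follows. This is a minimal illustration, not the author's data step: the team labels are hypothetical placeholders, and the column names `offteam`, `defteam`, `home`, and `half2` are borrowed from the SAS code on the last slide. It only confirms that a double round-robin of 20 teams gives 380 games and 1520 team-half observations.

```python
from itertools import permutations

# Hypothetical team labels; the real data use the 2003/2004 club names.
teams = [f"T{i:02d}" for i in range(1, 21)]

# Every ordered (home, away) pair is one game: each team hosts every other team once.
games = list(permutations(teams, 2))

# Each game contributes four observations: home and away offense in each half.
rows = []
for home_team, away_team in games:
    for half2 in (0, 1):
        rows.append({"offteam": home_team, "defteam": away_team, "home": 1, "half2": half2})
        rows.append({"offteam": away_team, "defteam": home_team, "home": 0, "half2": half2})

n_games = len(games)
n_obs = len(rows)
halves_on_offense = sum(r["offteam"] == "T01" for r in rows)

print(n_games, n_obs, halves_on_offense)
```

The observation count matches the "Number of Observations Read" in the Model 1 output, and each team appears on offense in 76 halves (38 first halves and 38 second halves).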

  3. Preliminary Summary

  4. Summary of Previous Slide
     • Teams vary extensively on offense and defense
       • Offense (goals scored): min=38, max=73, mean=50.6, SD=8.85
       • Defense (goals allowed): min=26, max=79, mean=50.6, SD=13.75
     • Strong negative correlation between offense and defense: r = -0.80
     • Home teams outscore away teams 1.3:1
     • Second half outscores first half 1.2:1
     • No evidence of autocorrelation in total goals scored over weeks (Durbin-Watson statistic = 2.03)

  5. “Marginal Analysis” – No Team Effects • Break Down Goals by Home/Half2 (380 Games)

  6. Summary of Previous Slide
     • Means (variances) for the 4 half types:
       • Home/1st Half: Mean = 0.692, Variance = 0.689
       • Away/1st Half: Mean = 0.521, Variance = 0.514
       • Home/2nd Half: Mean = 0.813, Variance = 0.912
       • Away/2nd Half: Mean = 0.637, Variance = 0.628
     • Thus, means and variances are in strong agreement
     • Chi-square statistics for testing for Poisson:
       • df = (4 categories - 1) - (1 parameter estimated) = 2
       • P-values all exceed 0.50 (.8505, .5440, .7353, .6957)
     • Goals scored are consistent with a Poisson distribution
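
As a rough illustration of the Poisson check, the sketch below computes the variance-to-mean ratio for each half type and the expected number of halves in each goal category under a Poisson distribution with the reported mean. Two things are assumptions, not stated on the slide: that the 4 categories are 0, 1, 2, and 3+ goals, and that each half type has 380 observations (one per game). The observed counts are not in the transcript, so the chi-square statistics themselves are not reproduced.

```python
import math

# (mean, variance) for each half type, as reported on the slide.
half_types = {
    "Home/1st": (0.692, 0.689),
    "Away/1st": (0.521, 0.514),
    "Home/2nd": (0.813, 0.912),
    "Away/2nd": (0.637, 0.628),
}
N_HALVES = 380  # assumption: one observation per game for each half type


def poisson_pmf(k, mu):
    """P(Y = k) for a Poisson random variable with mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def expected_counts(mu, n=N_HALVES):
    """Expected halves with 0, 1, 2, and 3+ goals (assumed categories)."""
    probs = [poisson_pmf(k, mu) for k in range(3)]
    probs.append(1.0 - sum(probs))  # everything above 2 goals
    return [n * p for p in probs]


dispersion = {label: var / mean for label, (mean, var) in half_types.items()}

for label, (mean, _) in half_types.items():
    counts = [round(c, 1) for c in expected_counts(mean)]
    print(label, round(dispersion[label], 3), counts)
```

A variance-to-mean ratio near 1 is what a Poisson distribution implies; a chi-square test would compare these expected counts with the observed counts in the same categories.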

  7. Generalized Linear Models • Dependent Variable: Goals Scored • Distribution: Poisson • Link Function: log • Independent Variables: Home, Half2 Dummy Variables • Models: Model fit using generalized linear model software packages

  8. Parameter Estimates / Model Fit – Model 1

     Distribution                   Poisson
     Link Function                  Log
     Dependent Variable             goals
     Number of Observations Read    1520
     Number of Observations Used    1520

     Criteria For Assessing Goodness Of Fit
     Criterion             DF      Value         Value/DF
     Deviance              1517    1650.4574     1.0880
     Scaled Deviance       1517    1650.4574     1.0880
     Pearson Chi-Square    1517    1549.2570     1.0213
     Scaled Pearson X2     1517    1549.2570     1.0213
     Log Likelihood                -1411.0226

     Algorithm converged.

  9. Parameter Estimates / Model Fit – Model 1

     Analysis Of Parameter Estimates
     Parameter    DF   Estimate   Std Error   Wald 95% Confidence Limits   Chi-Square   Pr > ChiSq
     Intercept    1    -0.6397    0.0588      -0.7549    -0.5245           118.48       <.0001
     home         1     0.2624    0.0634       0.1381     0.3866            17.12       <.0001
     half2        1     0.1783    0.0631       0.0546     0.3020             7.98       0.0047
     Scale        0     1.0000    0.0000       1.0000     1.0000

     NOTE: The scale parameter was held fixed.
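
The Wald columns in this output follow directly from each estimate and its standard error. The sketch below recomputes them from the rounded values shown above, so small differences in the last digit are expected. The function name `wald_summary` is just an illustrative helper.

```python
import math

# (estimate, standard error) for Model 1, copied from the output above.
model1 = {
    "Intercept": (-0.6397, 0.0588),
    "home": (0.2624, 0.0634),
    "half2": (0.1783, 0.0631),
}
Z_975 = 1.959964  # 97.5th percentile of the standard normal


def wald_summary(estimate, std_error):
    """Return (chi-square, lower limit, upper limit, p-value) for H0: parameter = 0."""
    chi_sq = (estimate / std_error) ** 2
    lower = estimate - Z_975 * std_error
    upper = estimate + Z_975 * std_error
    # Upper tail of chi-square with 1 df: P(X >= x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, lower, upper, p_value


results = {name: wald_summary(est, se) for name, (est, se) in model1.items()}
for name, (chi_sq, lower, upper, p_value) in results.items():
    print(f"{name:10s} chi2={chi_sq:7.2f}  CI=({lower:.4f}, {upper:.4f})  p={p_value:.4f}")
```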

  10. Parameter Estimates / Model Fit – Model 2

      Criteria For Assessing Goodness Of Fit
      Criterion             DF      Value         Value/DF
      Deviance              1516    1650.3613     1.0886
      Scaled Deviance       1516    1650.3613     1.0886
      Pearson Chi-Square    1516    1549.7072     1.0222
      Scaled Pearson X2     1516    1549.7072     1.0222
      Log Likelihood                -1410.9745

      Algorithm converged.

  11. Parameter Estimates / Model Fit – Model 2

      Analysis Of Parameter Estimates
      Parameter     DF   Estimate   Std Error   Wald 95% Confidence Limits   Chi-Square   Pr > ChiSq
      Intercept     1    -0.6519    0.0711      -0.7912    -0.5126            84.15       <.0001
      home          1     0.2839    0.0941       0.0995     0.4683             9.10       0.0026
      half2         1     0.2007    0.0958       0.0129     0.3885             4.39       0.0363
      home*half2    1    -0.0395    0.1274      -0.2891     0.2101             0.10       0.7566
      Scale         0     1.0000    0.0000       1.0000     1.0000

      NOTE: The scale parameter was held fixed.

  12. Testing for Home/Half2 Interaction
      • H0: No Home x Half2 Interaction (β_HomeHalf2 = 0)
      • HA: Home x Half2 Interaction (β_HomeHalf2 ≠ 0)
      • Test 1 – Wald Test
      • Test 2 – Likelihood Ratio Test
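
The slide's formulas did not survive in this transcript, but both tests can be recomputed from the Model 1 and Model 2 output. The sketch below is one way to do it, assuming the usual forms: the Wald statistic is the squared ratio of the interaction estimate to its standard error, and the likelihood ratio statistic is twice the difference in log likelihoods. Each is referred to a chi-square distribution with 1 degree of freedom.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# Test 1: Wald test, using the home*half2 row of the Model 2 output.
beta_interaction, se_interaction = -0.0395, 0.1274
wald_stat = (beta_interaction / se_interaction) ** 2
wald_p = chi2_sf_1df(wald_stat)

# Test 2: likelihood ratio test, Model 1 (reduced) vs Model 2 (full).
loglik_model1 = -1411.0226
loglik_model2 = -1410.9745
lr_stat = 2.0 * (loglik_model2 - loglik_model1)
lr_p = chi2_sf_1df(lr_stat)

print(f"Wald: X2 = {wald_stat:.4f}, p = {wald_p:.4f}")
print(f"LR:   X2 = {lr_stat:.4f}, p = {lr_p:.4f}")
```

Both statistics are close to 0.10 with P-values above 0.75, so neither test gives evidence of a Home x Half2 interaction. The difference in deviances (1650.4574 - 1650.3613) gives the same likelihood ratio statistic.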

  13. Testing for Main Effects for Home & Half2 • Wald tests only reported here (both effects are very significant) • Tests based on Model 1 (no interaction model)

  14. Interpreting the GLM
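
The body of this slide is missing from the transcript. As a hedged sketch of the standard interpretation of a log-link Poisson model, the code below exponentiates the Model 1 estimates: exp(intercept) is the fitted mean for an away team in the first half, and exp(home) and exp(half2) are multiplicative effects on the mean. The fitted means are compared with the observed means from slide 6.

```python
import math

# Model 1 estimates (log scale), from the parameter estimates output.
b0, b_home, b_half2 = -0.6397, 0.2624, 0.1783


def expected_goals(home, half2):
    """Fitted mean goals in a half under Model 1 (no interaction)."""
    return math.exp(b0 + b_home * home + b_half2 * half2)


# Observed means from the marginal analysis, keyed by (home, half2).
observed = {(1, 0): 0.692, (0, 0): 0.521, (1, 1): 0.813, (0, 1): 0.637}

home_ratio = math.exp(b_home)    # home vs away scoring rate
half2_ratio = math.exp(b_half2)  # second vs first half scoring rate

for (home, half2), obs in observed.items():
    fit = expected_goals(home, half2)
    print(f"home={home} half2={half2}  fitted={fit:.3f}  observed={obs:.3f}")
print(f"home ratio = {home_ratio:.2f}, half2 ratio = {half2_ratio:.2f}")
```

The two ratios agree with the 1.3:1 home advantage and the 1.2:1 second-half advantage reported in the preliminary summary.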

  15. Incorporating Random (Team) Effects
      • Teams clearly vary in terms of offensive and defensive skills (see slide 3)
      • Since many factors are inputs into team abilities (players, coaches, chemistry), we will treat team offensive and defensive effects as random
      • There will be 20 random offensive effects (one per team) and 20 random defensive effects

  16. Random Team Effects
      • All effects are on the log scale for goals scored
      • Offense Effects: o_i ~ NID(0, σ_o²)
      • Defense Effects: d_i ~ NID(0, σ_d²)
      • In the estimation process, assume COV(o_i, d_i) = 0, which seems a stretch (but we can still “observe” the covariance of the estimated random effects)

  17. Mixed Effects Model
      • Fixed Effects: Intercept, Home, Half2 (α)
      • Random Effects: Offteam, Defteam (β)
      • Conditional Model (on Random Effects)

  18. Model in Matrix Notation - Example
      • League has 3 teams: A, B, C
      • Order of entry of games: A@B, A@C, B@C, B@A, C@A, C@B
      • Order of entry of scores within game: Home/1st, Away/1st, Home/2nd, Away/2nd
      • 3 offense effects, 3 defense effects, 24 observations

  19. Model – Based on 3 Teams
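
The matrices for this slide are missing from the transcript. The sketch below rebuilds a fixed-effects design matrix `X` and a random-effects design matrix `Z` for the 3-team example, under two assumptions: "A@B" means A is the away team playing at B, and the columns are ordered (Intercept, Home, Half2) for `X` and (offense A, B, C, defense A, B, C) for `Z`.

```python
teams = ["A", "B", "C"]

# Games in the order given on the previous slide, stored as (away, home).
games = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "A"), ("C", "A"), ("C", "B")]

X = []  # columns: Intercept, Home, Half2
Z = []  # columns: offense A, B, C, then defense A, B, C

for away, home in games:
    for half2 in (0, 1):          # first half, then second half
        for is_home in (1, 0):    # home team's score, then away team's score
            offense, defense = (home, away) if is_home else (away, home)
            X.append([1, is_home, half2])
            z_row = [0] * (2 * len(teams))
            z_row[teams.index(offense)] = 1
            z_row[len(teams) + teams.index(defense)] = 1
            Z.append(z_row)

for x_row, z_row in zip(X, Z):
    print(x_row, z_row)
```

Each row of `Z` has exactly two 1s (one offense team and one defense team), and each team appears on offense in 8 of the 24 rows.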

  20. Sequence of Potential Models
      • No fixed or random effects (common mean)
      • Fixed home and second half effects, no random effects
      • Fixed home and second half effects, random offense team effects
      • Fixed home and second half effects, random defense team effects
      • Fixed home and second half effects, random offense and defense team effects

  21. Results – Estimates (P-Values)
      • Reported P-values are based on Z-tests, not the preferred likelihood ratio test
      • Likelihood ratio test for the offense variance component: H0: σ_o² = 0 vs HA: σ_o² > 0
        • Test statistic: 4958.6 - 4951.9 = 6.7
        • P-value = 0.5 × P(χ²₁ ≥ 6.7) = .005
      • Based on AIC and BIC, the model with both offense and defense effects is best
      • No interaction found between team effects and home or half2
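
The likelihood ratio P-value on this slide can be checked with a few lines. Because the null value of a variance component lies on the boundary of the parameter space, the upper-tail chi-square probability (1 df) is halved, which is the 0.5 factor on the slide. The sketch assumes the two numbers being subtracted are -2 log-likelihood values for the models without and with the offense effects; the slide does not label them.

```python
import math

# Values from the slide; assumed to be -2 log-likelihoods of the two models.
fit_without_offense = 4958.6
fit_with_offense = 4951.9

lr_stat = fit_without_offense - fit_with_offense

# Upper tail of chi-square with 1 df, halved for the boundary null hypothesis.
chi2_tail = math.erfc(math.sqrt(lr_stat / 2.0))
p_value = 0.5 * chi2_tail

print(f"LR statistic = {lr_stat:.1f}, P-value = {p_value:.4f}")
```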

  22. Goodness of Fit
      • We test whether the Poisson GLMM is an appropriate model by means of the deviance
      • H0: Model fits; HA: Model lacks fit
      • Deviance = 1570.7
      • DF = N - (number of fixed parameters) = 1520 - 3 = 1517
      • P-value = P(χ² ≥ 1570.7) = 0.1646
      • No evidence of lack of fit*
      • * If we use the scaled deviance, we do reject, where scaled deviance = 1570.7/0.9531 = 1647.9
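
The P-value above can be approximated without a statistics library. The sketch below uses the Wilson-Hilferty normal approximation to the chi-square distribution, which is a substitute for the exact tail probability, not the method used on the slide. It is accurate for large degrees of freedom such as 1517.

```python
import math


def chi2_sf_wilson_hilferty(x, df):
    """Approximate P(chi-square_df >= x) via the Wilson-Hilferty cube-root transform."""
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # standard normal upper tail


n_obs, n_fixed = 1520, 3
df = n_obs - n_fixed

deviance = 1570.7
scaled_deviance = deviance / 0.9531

p_deviance = chi2_sf_wilson_hilferty(deviance, df)
p_scaled = chi2_sf_wilson_hilferty(scaled_deviance, df)

print(f"df = {df}")
print(f"deviance        = {deviance:.1f}, approx. P-value = {p_deviance:.4f}")
print(f"scaled deviance = {scaled_deviance:.1f}, approx. P-value = {p_scaled:.4f}")
```

With the unscaled deviance the approximate P-value is close to the 0.1646 on the slide; with the scaled deviance it falls below 0.05, which matches the footnote.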

  23. Best Linear Unbiased Predictors (BLUPs)
      • Estimated Team (Random) Effects (teams with high defense values allow more goals)
      • Estimated Fixed Effects
      • For each Half_ijkl, compute exp{-0.6605 + HOME_i + HALF2_j + o_k + d_l} as the BLUP
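
The prediction formula can be written as a small function. Only the intercept (-0.6605) is given in this transcript; the tables of estimated fixed and team effects are missing, so the effect values in the example call are hypothetical and serve only to show how the terms combine on the log scale.

```python
import math

INTERCEPT = -0.6605  # fixed intercept quoted on the slide


def predicted_goals(home_term, half2_term, offense_effect, defense_effect):
    """exp{intercept + HOME_i + HALF2_j + o_k + d_l}: predicted goals in one half."""
    linear_predictor = INTERCEPT + home_term + half2_term + offense_effect + defense_effect
    return math.exp(linear_predictor)


# Away team, first half, average offense against an average defense.
baseline = predicted_goals(0.0, 0.0, 0.0, 0.0)

# Hypothetical team effects (NOT estimates from the presentation):
# a slightly above-average offense against a defense that allows more goals.
example = predicted_goals(0.0, 0.0, 0.10, 0.20)

print(f"baseline = {baseline:.4f}, example = {example:.4f}")
```

Because the model is additive on the log scale, each effect multiplies the predicted goals: the hypothetical effects above scale the baseline by exp(0.10 + 0.20).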

  24. Comparison of BLUPs with Actual Scores
      • For each team half, we have the actual score and the BLUP
      • Correlation between actual and BLUP = 0.2655
      • Concordant pairs of halves (one half is higher than the other on both actual and BLUP) = 452471
      • Discordant pairs of halves = 355617
      • “Gamma” = (452471 - 355617)/(452471 + 355617) = 0.1199
      • Evidence of some positive association between actual and predicted scores
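
The gamma statistic here is the Goodman-Kruskal gamma: the difference between concordant and discordant pair counts divided by their sum, with tied pairs ignored. The sketch below recomputes the slide's value and shows how the pair counts could be obtained from two lists. The short lists are made-up toy data, not the league data.

```python
def concordance_counts(actual, predicted):
    """Count concordant and discordant pairs; pairs tied on either variable are skipped."""
    concordant = discordant = 0
    n = len(actual)
    for i in range(n):
        for j in range(i + 1, n):
            sign = (actual[i] - actual[j]) * (predicted[i] - predicted[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return concordant, discordant


def gamma(concordant, discordant):
    """Goodman-Kruskal gamma from pair counts."""
    return (concordant - discordant) / (concordant + discordant)


# Value from the slide's pair counts.
slide_gamma = gamma(452471, 355617)

# Toy data (hypothetical) to show the pair counting.
toy_actual = [0, 1, 2, 1]
toy_predicted = [0.5, 0.9, 0.7, 0.6]
toy_counts = concordance_counts(toy_actual, toy_predicted)

print(f"slide gamma = {slide_gamma:.4f}")
print(f"toy counts = {toy_counts}, toy gamma = {gamma(*toy_counts):.2f}")
```

The pairwise loop is quadratic in the number of observations, which is acceptable for the 1520 team halves in this data set.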

  25. Sources
      • Data: SoccerPunter.com
      • Methods:
        • Littell, Milliken, Stroup, and Wolfinger (1996). “SAS System for Mixed Models.”
        • Wolfinger, R. and M. O’Connell (1993). “Generalized Linear Mixed Models: A Pseudo-Likelihood Approach,” J. Statist. Comput. Simul., Vol. 48, pp. 233-243.

  26. SAS Code

      data one;
        infile 'engl2003d.dat';
        input hteam $ 1-20 rteam $ 21-40 goals 47-48 half2 56 home 64 round 71-73;
        /* The scoring team is the home team when home=1, otherwise the road team */
        if home=1 then do; offteam=hteam; defteam=rteam; end;
        else do; offteam=rteam; defteam=hteam; end;
      run;

      /* Load the GLIMMIX macro, then fit the Poisson GLMM with a log link */
      %include 'glmm800.sas';
      %glimmix(data=one, procopt=method=reml,
        stmts=%str(
          class offteam defteam;
          model goals = home half2 / s;
          random offteam defteam / s;
        ),
        error=poisson, link=log);
      run;
