380 likes | 728 Views
PROC MIXED. Underlying Ideas with Examples. D. A. Dickey NCSU. Random or Fixed ?. 5 Clinics F R 4 Doctors/Clinic F R 3 Patients/Doctor F R 3 Drugs F R (1 per patient) . Poll:.
E N D
PROC MIXED Underlying Ideas with Examples D. A. Dickey NCSU
5 Clinics F R 4 Doctors/Clinic F R 3 Patients/Doctor F R 3 Drugs F R (1 per patient) Poll: Yijk = m + tk + Ci + Dij + Pijk
Twins: One gets SAS training method 1, the other gets method 2 Response Y = programming times
PROC MIXED Model Model ; Random ; Repeated ; Y = X b + Zg + e Variance of g is G = ,Variance of e is R =
PROCMIXED DATA=TWINS; CLASS FAMILY METHOD; MODEL TIME = METHOD; RANDOM FAMILY; Covariance Parameter Estimates Cov Parm Estimate family 21.2184 Residual 40.8338 Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F method 1 19 9.60 0.0059 Intraclass correlation (related to heritability) s2F /(s2F + s2) Estimated as 21.2/62 or about 1/3. Q: Why not usual (Pearson) correlation?
BLUP Di = Family mean – m = Fi + ei. best estimate of Fi = ? Variance of (Fi – b Di) is (1-b)2s2F + b2s2/2 Use b = s2F /(s2F + s2/2) Estimate: b = 21.2/(21.2 + 40.8/2) = 0.510 Overall mean + 0.510(Family i mean – Overall mean) PROCMIXED DATA=TWINS; CLASS FAMILY METHOD; MODEL TIME = METHOD; RANDOM FAMILY; ESTIMATE "1 " intercept 1 | family 1; ESTIMATE "2 " intercept 1 | family 01; PROCGLM DATA=TWINS; CLASS FAMILY METHOD; MODEL TIME = FAMILY METHOD; LSMEANS FAMILY;
MEANS andBLUPs (GLM) (MIXED)
REML Estimation Y: {10, 11, 12, and 15 } mean 12, deviations -2, -1, 0, 3 SSq = 4+1+0+9 = 14 Estimate s2 MLE: 14/4 = 3.5 Unbiased: 14/3 = 4.67 ( MLE would be unbiased IF we knew and usedm ) Use MLE of {Z1, Z2, Z3 } = {-3,1,-2} SSq = 14, “n”=3, MLE is 14/3 MLE of {Z1, Z2, Z3 } is REML ! Known mean 0.
MLE (too narrow) REML Just Right! off center too wide Y: {10, 11, 12, and 15 }
REML vs. ML m s2
Tests for Variance Components (1) Wald test: very approximate - not recommended (2) Likelihood ratio test – next best method -2 REML log likelihood for full, reduced model Difference ~ Chi-Square (df = omitted parameter count) (Self & Liang: divide p-value by 2 for single variance component test) (3) When available use GLM F test Same as method=type3 in MIXED Available for simple variance component structures.
A small Monte Carlo: RCB with 10 blocks, 9 trts. Y = p-value from Wald test (division by 2) X = p-value from PROC GLM (exact F test ) Error variance 1 Block variance 1/8, 1/4, 1/2, 1, 2, 4 1000 reps
A small Monte Carlo: RCB with 10 blocks, 9 trts. Y = p-value from Wald test (division by 2) X = p-value from PROC GLM (exact F test ) Error variance 1 Block variance 0, 1/8, 1/4, 1/2, 1, 2, 4 1000 reps “Jittered p-values for F
Unbalanced Data I vs. III free of subject effects for red data. Misses info in other data.
procglm; class plug worker; model loss = worker plug; Random Worker; Estimate "I vs III - GLM" Plug -101; run; procmixed; class plug worker; model Loss=Plug; Random Worker; Estimate "I vs III - Mixed" Plug -101; run; Covariance Parameter Estimates Cov Parm Estimate worker 37.578 Residual 6.1674 GLM Source DF Type III SS F Value Pr > F worker 6 451.9062500 12.21 0.0074 plug 2 62.6562500 5.08 0.0625 Standard Parameter Estimate Error t Value Pr > |t| I vs III - GLM -4.8125 1.9635 -2.45 0.0579 Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F plug 2 5 5.79 0.0499 Estimates Standard Label Estimate Error DF t Value Pr > |t| I vs III - Mixed -5.2448 1.9347 5 -2.71 0.0422
SPLIT PLOT 4 Aquariums, 2 aerated 2 not six dishes / aquarium one plant / dish soil x variety combinations ANOVA Source Air Error A V S VA VS AS AVS Error B Soil Variety 1 1 1 2 2 1 2 2 3 1 3 2
Variety j (j=1..b) Aeration i (i=1..a) Soil k (k=1…c) Errorterms: + interactions Soil 1 aerated Soil 1 not aerated
error A: tank(air) error B b=2 varieties r=2 reps Expected mean squares: MS(B) + [MS(A)-MS(B)]/c = (1/3)[MS(A)+2 MS(B)] (c=3) (1/3)[20.83333 + 2(7.7333)] = 12.10 standard error:
(1/3)[20.83333 + 2(7.7333)] = 12.10 MS(B) + [MS(A)-MS(B)]/c = (1/3)[MS(A)+2 MS(B)] (c=3) df=? Satterthwaite ! Automatic in MIXED! (ddfm=satterthwaite) OK regardless of summing to 0 assumptions!
PROCMIXED; CLASS VAR AQUARIUM SOIL AIR; MODEL YIELD = AIR SOIL VAR SOIL*VAR AIR*SOIL AIR*VAR AIR*SOIL*VAR / DDFM=SATTERTHWAITE; RANDOM AQUARIUM(AIR); ESTIMATE "SOIL 1: AIR EFFECT" AIR -11 AIR*SOIL -110000; RUN;
Covariance Parameter Estimates Cov Parm Estimate AQUARIUM(AIR) 2.1833 Residual 7.7333 Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F AIR 1 2 16.20 0.0565 SOIL 2 10 7.87 0.0088 VAR 1 10 24.91 0.0005 VAR*SOIL 2 10 0.04 0.9631 SOIL*AIR 2 10 1.08 0.3752 VAR*AIR 1 10 4.22 0.0669 VAR*SOIL*AIR 2 10 0.23 0.7973 Standard Label Estimate Error DF t Value Pr > |t| SOIL 1: AIR EFFECT 5.2500 2.4597 5.47 2.13 0.0812
Random Coefficient Models the basic idea mistakes Dave Averageprogrammer a0 + b0 t Joy Program writing time Line for individual j: (a0 + aj) + ( b0 + bj )t
Random Coefficient Models Collars measure activity levels, several bears. Random coefficient model: Average daily activity pattern all bears Individual patterns, one per bear Assume sinusoidal function Smooth across midnights, periodic Fun with trigonometry: a sin(wt + d) = A sin(wt) + B cos(wt) where A = a cos(d) B = a sin(d) Moral: put in both to calibrate amplitude and phase to data.
Random Coefficient Models When encountering active bears, remember: You don’t have to outrun the bear, you just have to outrun your slowest friend. Put in fundmental pair (w = 2p/24) and harmonics (wj = 2pj/24), j =2,3,… coefficients Aj, Bj Assume fixed A0’s and B0’s for the average bear. (model statement) Assume random (Ak,Bk) ~ N(0, I2sTrig2) for individual bear k’s random adjustments. (random statement) ……
ods output solutionR=BLUPbear; procmixed data=bears covtest; class bear; model Y = s1--c4/solution outpm=means outp=pbear; random bear; random s1*bear c1*bear s2*bear c2*bear s3*bear c3*bear s4*bear c4*bear /type=toep(1) solution; run; quit; Note: s1, c1 are fundamental sine, cosine, others are harmonics. Toeplitz: (toep) toep(1):
Fixed Diurnal Pattern for Average Bear Y = probability of activity X = time of day
Adding in Random Bear Effects All Individual Bears
Recall: BLUPs were output to a dataset procprint data=blupbear; where Probt<0.001; run; StdErr Obs Effect bear Estimate Pred tValue Probt 1 bear 192 -0.1189 0.02872 -4.14 <.0001 22 s2*bear 396 0.08687 0.02339 3.71 0.0002 24 s3*bear 396 -0.1047 0.02339 -4.47 <.0001 40 s2*bear 414 -0.1601 0.03248 -4.93 <.0001 42 s3*bear 414 0.1308 0.03248 4.03 <.0001 60 s3*bear 432 -0.1094 0.03247 -3.37 0.0008 65 s1*bear 437 -0.09941 0.02134 -4.66 <.0001 91 bear 465 -0.1702 0.03234 -5.26 <.0001 94 s2*bear 465 0.1461 0.03246 4.50 <.0001 101 s1*bear 491 0.09137 0.02340 3.91 <.0001
The End (I bearly finished on time)
Appendix: Random & Repeated? (4 visits per patient) Model Y=drug ; Random patient ; Repeated visit / type=AR(1); Y = X b + Zg + e
PROC MIXED Model Model ; Random ; Repeated ; Y = X b + Zg + e Variance of g is G = ,Variance of e is R =
V12x12 = Var(Y) = Var(Zg + e) Block Diagonal, three 4x4 blocks This is NOT just another AR(1) CAN use both random and repeated statements!
Repeated /Type = CS; *(compound symmetry); V12x12 = Var(Y) = Var(Zg + e) Block Diagonal, three 4x4 blocks This is just another CS Model Y=drug ; Random patient ; Repeated visit / type=CS; Can’t converge – can’t separate from A, B
Repeated /Type = CS; *(compound symmetry); V12x12 = Var(Y) = Var(Zg + e) Block Diagonal, three 4x4 blocks A = 3, B = 16, sP2=2 A = 1, B = 14, sP2=4 A = 5, B = 18, sP2=0 No unique solution!
Fun with algebra: sP2+sVisit2=18, sP2+ rsVisit2 = 14, sP2+ r2sVisit2 = 12 (1-r) sVisit2 = 18-14=4 r(1-r) sVisit2 =14-12=2, r=2/4 = 0.5 0.5 sVisit2 =4, sVisit2 =8, sP2+sVisit2=18 sVisit2=10 sP2=10, sVisit2=8, r=0.5 (no choice – unique solution)