This study examines the effectiveness of a treatment on the growth of the upper jaw in Japanese children. The regression analysis is conducted using serial data to determine the impact of age on the dependent variable.
Regression using serial data
Jyoti Sarkar, IUPUI
jsarkar@math.iupui.edu

The Problem
Given: On n units, (x, y) “before” and (x, y) “after” a treatment
Goal: Regress y on x
• x = a predictor (easy/inexpensive)
• y = a response (difficult/expensive)
• Assume the n units are independent

Example
Concern: Many Japanese exhibit a bigger lower jaw than upper jaw.
Treatment (for growth of upper jaw): Children (4-12 years old) wore a mouth gear 8-10 hours daily for 1-2 years.
Questions
• Was the treatment effective? … no control group
• How did measurements change with age?

Face Mask Experiment
Sample: 25 boys and 18 girls, measured “before” and “after” treatment
• age (year day)
From X-ray plates, measure:
• ccorr = corrected C-axis SM (mm)
• theta = angle (C-axis, anterior cranial base SN)
• alpha = angle (C-axis, palatal plane through M) (½ degree)
Objective: Regress y=ccorr on x=age

Face Mask data
patient  gender  age1   theta1  alpha1  ccorr1  age2   theta2  alpha2  ccorr2
1        2       4.99   39.0    35.0    76.076  5.99   40.5    32.0    77.064
2        2       9.90   43.0    32.0    68.666  11.43  47.0    34.0    72.124
etc.

Regress y=ccorr on x=age
(1) “before” data: girls (n=18)
>Regress ccorr1 on 1 age1
ccorr1 = 66.97 + 0.2530 age1
• r²=.023, r²(adj)=.000
• S=3.632, SE(b1)=0.4096, p-value=.545
• t.975,16=2.120
• 95% CI(b1) = (-0.6153, 1.1214)

(2) “after” data: girls (n=18)
>Regress ccorr2 on 1 age2
ccorr2 = 71.30 + 0.1142 age2
• r²=.006, r²(adj)=.000
• S=3.321, SE(b1)=0.3738, p-value=.764
• t.975,16=2.120
• 95% CI(b1) = (-0.6782, 0.9066)

(3) “superimposed” data: (n=36)
Data size doubled, range expanded
>Stack age1 age2 age
>Stack ccorr1 ccorr2 ccorr
>Regress ccorr on 1 age
ccorr = 67.5636 + 0.3793 age
• r²=.049, r²(adj)=.021
• S=3.745, SE(b1)=0.2880, p-value=.197
• t.975,34=2.032
• 95% CI(b1) = (-0.2060, 0.9646)

Regress y on x: naïve attempts
All 3 naïve attempts yield
• Low r²
• Large p-value => cannot reject slope = 0
• 95% CI contains 0
Conclusion:
• Either “ccorr does not depend on age”
• Or “we need a better regression model”

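As a quick arithmetic check on the three naïve fits, the following Python sketch recomputes each interval as b1 ± t × SE(b1) from the slope, standard error, and t critical value reported on the slides. The dictionary keys and the helper name `slope_ci` are illustrative and not part of the original MINITAB session.

```python
# Slope, SE(b1), and t critical value as reported for each naive fit.
fits = {
    "before (n=18)": (0.2530, 0.4096, 2.120),
    "after (n=18)": (0.1142, 0.3738, 2.120),
    "superimposed (n=36)": (0.3793, 0.2880, 2.032),
}


def slope_ci(b1, se, t_crit):
    """Two-sided confidence interval b1 +/- t_crit * SE(b1)."""
    half_width = t_crit * se
    return b1 - half_width, b1 + half_width


intervals = {name: slope_ci(*vals) for name, vals in fits.items()}
for name, (lo, hi) in intervals.items():
    print(f"{name}: ({lo:.4f}, {hi:.4f})  contains 0: {lo < 0 < hi}")
```

All three intervals straddle zero, which is why none of the naïve fits can reject a zero slope.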
Serial Bivariate Plot
• ccorr increases with age (for most girls)
• Regression of ccorr on age should have positive slope, especially under treatment
Why then is r² low? Between-subject variation is high.
Study within-subject change to see if ccorr depends on age.

Within-subject change
• Δage = age2 - age1 = treatment duration
• Δccorr = ccorr2 - ccorr1 = change in ccorr
• Δccorr / Δage = within-subject slope
Means (n=18 girls)
            age    ccorr
after (2)   8.39   72.26
before (1)  7.26   68.80
change (Δ)  1.13   3.46
Δccorr / Δage = 3.0251
Recall b1 = (1) 0.2530, (2) 0.1142, (3) 0.3793

Regress Δccorr on Δage
>Regress dccorr on 1 dage;
>noconstant.
dccorr = 3.0763 dage
• S=2.374, SE(b1)=0.4847, p-value = .000
• t.975,17=2.110
• 95% CI(b1) = (2.0536, 4.0990)
Conclusion: ccorr increases with age

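A regression through the origin has the closed-form slope Σ(Δage·Δccorr) / Σ(Δage²). The Python sketch below applies that formula to the only two patients listed on the "Face Mask data" slide, so its slope is purely illustrative and is not expected to match the 18-girl estimate. It then rechecks the reported interval from b1, SE(b1), and the t value. The function name `slope_through_origin` is an assumption for illustration.

```python
def slope_through_origin(dx, dy):
    """Least-squares slope for dy = b * dx with no constant term."""
    return sum(x * y for x, y in zip(dx, dy)) / sum(x * x for x in dx)


# The two patients shown on the data slide: (age1, ccorr1, age2, ccorr2).
rows = [
    (4.99, 76.076, 5.99, 77.064),
    (9.90, 68.666, 11.43, 72.124),
]
dage = [a2 - a1 for a1, _, a2, _ in rows]
dccorr = [c2 - c1 for _, c1, _, c2 in rows]

b = slope_through_origin(dage, dccorr)
print(f"slope from these two rows only: {b:.4f}")

# Interval reported on the slide: b1 +/- t * SE(b1) with 17 degrees of freedom.
ci_lo = 3.0763 - 2.110 * 0.4847
ci_hi = 3.0763 + 2.110 * 0.4847
print(f"95% CI(b1): ({ci_lo:.4f}, {ci_hi:.4f})")
```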
A Paradox:
• Naïve regression slopes are not significantly different from zero
• Within-subject slope is non-zero
What to do?
• Find the proper regression model:
• Repeated Measures/Growth Curves
• Repeated Measures with Covariate
• Serial Correlation

Serial Correlation Model 1
• Regression model: ccorr = β0 + β1 age + error
• Errors are identically distributed N(0, σ²), but dependent
• Between-subject errors are uncorrelated
• Within-subject errors have correlation ρ

Regression Model 1

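The equation on this slide did not survive extraction. One way to write Model 1 that is consistent with the preceding slide's assumptions is shown below. The subscript notation (subject i, occasion 1 = before, 2 = after, x = age, y = ccorr) is an assumption introduced here, not the author's.

```latex
\[
\begin{pmatrix} y_{i1} \\ y_{i2} \end{pmatrix}
=
\begin{pmatrix} 1 & x_{i1} \\ 1 & x_{i2} \end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \end{pmatrix},
\qquad
\operatorname{Cov}\begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \end{pmatrix}
= \sigma^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix},
\qquad i = 1, \dots, n,
\]
```

with error vectors of different subjects uncorrelated.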
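If ρ known
Pre-multiply each subject's (before, after) pair by the symmetric matrix [k1 k2; k2 k1], where k1 = (1/sqrt(1+ρ) + 1/sqrt(1-ρ))/2 and k2 = (1/sqrt(1+ρ) - 1/sqrt(1-ρ))/2
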
Orthogonalized Model 1

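The weights k1 and k2 used in the deck's MINITAB code form the symmetric matrix K = [[k1, k2], [k2, k1]]. The Python sketch below checks numerically that K is an inverse square root of the within-subject correlation matrix [[1, ρ], [ρ, 1]], so the transformed errors are uncorrelated with equal variance. It also shows that the column of ones turns into the constant 1/sqrt(1+ρ). The function names are illustrative assumptions.

```python
import math


def ortho_weights(rho):
    """Weights k1, k2 with the same formulas as the deck's MINITAB code."""
    a = 1.0 / math.sqrt(1.0 + rho)
    b = 1.0 / math.sqrt(1.0 - rho)
    return (a + b) / 2.0, (a - b) / 2.0


def whitened_covariance(rho):
    """Return K * Sigma * K for K = [[k1, k2], [k2, k1]], Sigma = [[1, rho], [rho, 1]]."""
    k1, k2 = ortho_weights(rho)
    K = [[k1, k2], [k2, k1]]
    S = [[1.0, rho], [rho, 1.0]]
    KS = [[sum(K[i][m] * S[m][j] for m in range(2)) for j in range(2)] for i in range(2)]
    return [[sum(KS[i][m] * K[m][j] for m in range(2)) for j in range(2)] for i in range(2)]


rho = 0.730  # initial correlation used in the deck
cov = whitened_covariance(rho)
k1, k2 = ortho_weights(rho)
print(cov)      # numerically the 2x2 identity matrix
print(k1 + k2)  # transformed intercept column, equal to 1/sqrt(1 + rho)
```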
Stacking …

If ρ unknown

Algorithm: Estimate ρ
0. Begin with ρ̂ = correlation(ccorr1, ccorr2)
1. Orthogonalize age and ccorr using ρ̂ to obtain tage & tccorr
2. Regress tccorr on 1 tage; save residuals
3. If |corr(tresi1, tresi2)| < .001, STOP
   Else set ρ̂ = ρ̂ + corr(tresi1, tresi2)/2 and go to Step 1.

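The steps above can be sketched in Python as follows. This is a minimal illustration rather than the author's MINITAB session. The data are synthetic (generated under Model 1 with a modest within-subject correlation), the half-step update mirrors the `k3+corr(...)/2` line in the deck's MINITAB code, and the clamp to (-0.999, 0.999) and the iteration cap are added safeguards that are not in the original. All function names are hypothetical.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def ols_residuals(x, y):
    """Residuals from least-squares regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return [b - intercept - slope * a for a, b in zip(x, y)]


def orthogonalize(v1, v2, rho):
    """Apply the weights k1, k2 to each (before, after) pair."""
    a = 1.0 / math.sqrt(1.0 + rho)
    b = 1.0 / math.sqrt(1.0 - rho)
    k1, k2 = (a + b) / 2.0, (a - b) / 2.0
    t1 = [k1 * p + k2 * q for p, q in zip(v1, v2)]
    t2 = [k2 * p + k1 * q for p, q in zip(v1, v2)]
    return t1, t2


def clamp(r):
    """Keep rho strictly inside (-1, 1); an added safeguard, not in the deck."""
    return max(-0.999, min(0.999, r))


def estimate_rho(age1, age2, y1, y2, tol=0.001, max_iter=50):
    n = len(age1)
    rho = clamp(corr(y1, y2))                             # step 0
    for n_fits in range(1, max_iter + 1):
        tage1, tage2 = orthogonalize(age1, age2, rho)     # step 1
        ty1, ty2 = orthogonalize(y1, y2, rho)
        resid = ols_residuals(tage1 + tage2, ty1 + ty2)   # step 2, stacked data
        r = corr(resid[:n], resid[n:])                    # step 3
        if abs(r) < tol:
            break
        rho = clamp(rho + r / 2.0)
    return rho, r, n_fits


# Synthetic serial data: a subject effect induces within-subject correlation.
random.seed(0)
n = 40
age1 = [random.uniform(4.0, 11.0) for _ in range(n)]
age2 = [a + random.uniform(1.0, 2.0) for a in age1]
subject = [random.gauss(0.0, 1.2) for _ in range(n)]
y1 = [60.0 + 1.5 * a + s + random.gauss(0.0, 1.0) for a, s in zip(age1, subject)]
y2 = [60.0 + 1.5 * a + s + random.gauss(0.0, 1.0) for a, s in zip(age2, subject)]

rho_hat, last_r, n_fits = estimate_rho(age1, age2, y1, y2)
print(f"rho_hat={rho_hat:.4f}, residual correlation={last_r:.4f}, fits={n_fits}")
```

Regressing the transformed response on a constant and the transformed age is valid here because, under Model 1, the transformed column of ones is itself a constant.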
MINITAB code (1)
>corr c7 c12                                   # initial rho
>let k3=.730                                   # enter above/updated rho
>let k1=(1/sqrt(1+k3)+1/sqrt(1-k3))/2
>let k2=(1/sqrt(1+k3)-1/sqrt(1-k3))/2
# orthogonalize age
>let c21=k1*c3+k2*c8
>let c22=k2*c3+k1*c8
>stack c21 c22 c31
>name c31 'tage'

MINITAB code (2)
>let c23=k1*c7+k2*c12                          # orthog… ccorr
>let c24=k2*c7+k1*c12
>stack c23 c24 c32
>name c32 'tccorr'
>regress 'tccorr' 1 'tage';
>resi c33;
>coef c34.
>unstack c33 c35 c36;
>subs c18.
>corr c35 c36                                  # STOP if <.001, else
>let k3=k3+corr(c35,c36)/2

“Orthogonalized” data: (n=36)
First iteration (Model 1): initial ρ̂ = .730
• tccorr = 46.9184 + 1.1271 tage
• r²=.216, r²(adj)=.193, p-value=.004
• corr(tresi1, tresi2) = .191
Revised ρ̂ = .82545

Iteration History (Model 1)
Iter   ρ̂         corr(tresi1, tresi2)
0      .730      .191
1      .825      .066
2      .858      .012
3      .8641     .001
4      .8646     .000
5      .864621

“Orthogonalized” data: (n=36)
After five iterations: ρ̂ = .8646
• corr(tresi1, tresi2) = .000
• tccorr = 42.132 + 1.6613 tage
• r²=.347, r²(adj)=.328
• S=5.1319, SE(c1)=0.3908, p-value=.000

Regress y on x: (Model 1)
ccorr = 57.532 + 1.6613 age
• ρ̂ = 0.8646
• σ̂ = 5.2091, SE(b1) = .3967, p-value = .000
• t.975,33 = 2.0345
• 95% CI(b1) = (0.8543, 2.4683)

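The deck does not spell out how the orthogonalized fit is converted back to the original scale. The reported numbers are consistent with the following Python sketch, which rests on two assumptions labeled in the comments. First, the intercept is rescaled by sqrt(1 + ρ̂), because the transformed column of ones equals 1/sqrt(1 + ρ̂). Second, S and the standard error are inflated by sqrt(34/33), charging one degree of freedom for estimating ρ.

```python
import math

rho_hat = 0.8646            # converged estimate (Model 1)
c0, c1 = 42.132, 1.6613     # intercept and slope of the orthogonalized fit
s, se_c1 = 5.1319, 0.3908   # S and SE(c1) of the orthogonalized fit
n_obs = 36

# Assumption 1: the transformed intercept column is the constant 1/sqrt(1 + rho),
# so the original-scale intercept is c0 * sqrt(1 + rho); the slope is unchanged.
b0 = c0 * math.sqrt(1.0 + rho_hat)
b1 = c1

# Assumption 2: one extra degree of freedom is charged for estimating rho (34 -> 33).
df_fit, df_adj = n_obs - 2, n_obs - 3
inflate = math.sqrt(df_fit / df_adj)
sigma_hat = s * inflate
se_b1 = se_c1 * inflate

print(f"ccorr = {b0:.3f} + {b1:.4f} age")
print(f"sigma_hat = {sigma_hat:.4f}, SE(b1) = {se_b1:.4f}, df = {df_adj}")
```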
Serial Correlation Model 2
• Regression model 2: ccorr = β0 + β1 (age) + error
• Errors are identically distributed N(0, σ²), but dependent
• Between-subject errors are uncorrelated
• Within-subject errors have correlation ρ^(age2 - age1)

Regression Model 2

MINITAB code (3)
>let c19='age2' - 'age1'
>name c19 'dage'
>corr c7 c12
>let k3=.730                                   # enter above/updated correlation
# use rho**dage to orthogonalize
>let c51=(1/sqrt(1+k3**c19)+1/sqrt(1-k3**c19))/2
>let c52=(1/sqrt(1+k3**c19)-1/sqrt(1-k3**c19))/2
>let c21=c51*c3+c52*c8
>let c22=c52*c3+c51*c8
etc.

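In Model 2 the weights differ from subject to subject because the within-subject correlation is ρ raised to the treatment duration. The Python sketch below mirrors the MINITAB columns c51 and c52 for the two patients listed on the data slide. It also transforms a column of ones, which is no longer constant across subjects; this is presumably the role of the variable the deck later calls intdage, though that reading is an assumption. Function names are illustrative, and ρ is assumed positive so that the fractional power is defined.

```python
import math


def model2_weights(rho, dage):
    """Per-subject weights using within-subject correlation rho ** dage."""
    r = rho ** dage
    a = 1.0 / math.sqrt(1.0 + r)
    b = 1.0 / math.sqrt(1.0 - r)
    return (a + b) / 2.0, (a - b) / 2.0


def orthogonalize_model2(v1, v2, rho, dages):
    """Apply each subject's own weights to its (before, after) pair."""
    t1, t2 = [], []
    for x1, x2, d in zip(v1, v2, dages):
        k1, k2 = model2_weights(rho, d)
        t1.append(k1 * x1 + k2 * x2)
        t2.append(k2 * x1 + k1 * x2)
    return t1, t2


# Ages of the two patients shown on the data slide.
age1 = [4.99, 9.90]
age2 = [5.99, 11.43]
dage = [a2 - a1 for a1, a2 in zip(age1, age2)]

rho = 0.730
tage1, tage2 = orthogonalize_model2(age1, age2, rho, dage)

# The intercept column of ones becomes 1/sqrt(1 + rho**dage), which varies by subject,
# so it enters the regression as an ordinary predictor with no separate constant.
ones = [1.0] * len(age1)
int1, int2 = orthogonalize_model2(ones, ones, rho, dage)
print(tage1, tage2)
print(int1, int2)
```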
Iteration History (Model 2)
Iter   ρ̂         corr(tresi1, tresi2)
0      .730      .231
1      .845      .063
2      .877      .002
3      .8782     .001
4      .8781     .000
5      .878120

“Orthogonalized” data: (n=36)
After five iterations: ρ̂ = .8781
• corr(tresi1, tresi2) = .000
• tccorr = 57.935 intdage + 1.6097 tage
• r²=.336, r²(adj)=.316
• S=5.092, SE(c1)=0.3912, p-value=.000

Regress y on x: (Model 2)
ccorr = 57.935 + 1.6098 age
• ρ̂ = 0.8781
• σ̂ = 5.169, SE(b1) = 0.3971, p-value = .000
• t.975,33 = 2.0345
• 95% CI(b1) = (0.8018, 2.4176)

Summary
• Model serial data properly
• Estimate serial correlation (use iterated algorithm)
• Regress orthogonalized data
• Obtain regression of y on x
• Adjust σ̂, SE(b1) and CI(b1)
• Can extend to more repeats per subject

Thank you.