240 likes | 320 Views
Workshop in R and GLMs: #4. Diane Srivastava University of British Columbia srivast@zoology.ubc.ca. Exercise. Fit the binomial glm survival = size*treat 2. Fit the bionomial glm parasitism = size*treat 3. Predict what size has 50% parasitism in treatment “0”.
E N D
Workshop in R and GLMs: #4 Diane Srivastava University of British Columbia srivast@zoology.ubc.ca
Exercise • Fit the binomial glm survival = size*treat • 2. Fit the bionomial glm parasitism = size*treat • 3. Predict what size has 50% parasitism in treatment “0”
Predicting size for p=0.5, treat=0 Output from logistic regression with logit link: predicted loge (p/1-p) = a+bx So when p=0.5, solve log(1)=a+bx
What is equation for treat 0? treat 1? Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -2.38462 0.16780 -14.211 <2e-16 *** size 0.76264 0.04638 16.442 <2e-16 *** treat 0.28754 0.23155 1.242 0.214 size:treat -0.09477 0.06357 -1.491 0.136
Rlecture.csv 3.12
Model simplification • Parsimonious/ Logical sequence (e.g. highest order interactions first) • 2. Stepwise sequence • 3. Bayesian comparison of candidate models (not covered)
ANCOVA: Difference between categories…. Constant, doesn’t depend on size Depends on size size*treat sig size*treat ns 12 10 8 Logit parasitism Logit parasitism 6 4 2 0 0 2 4 6 Plant size Plant size
Deletion tests • How to change your model quickly: • model2<-update(model1,~.-size:treat) • How to do a deletion test: • anova(reduced model, full model, test="Chi") • Test for interaction in logit parasitism ANCOVA • If not sig, remove and continue. If sig, STOP! • 2. Test covariate If not sig, remove and continue. If sig, put back and continue • 3. Test main effect
Code for “parasitism” analysis > ds<-read.table(file.choose(), sep=",", header=TRUE); ds > attach(ds) > par<-cbind(parasitism, 100-parasitism); par > m1<-glm(par~size*treat, data=ds, family=binomial) > summary(m1) > m2<-update(m1, ~.-size:treat) > summary(m2) > anova(m2,m1, test="Chi") > m3<-update(m2, ~.-size) > anova(m3,m2, test="Chi") > m3<-update(m2, ~.-treat) > anova(m3,m2, test="Chi")
Context (often) matters! What is the p-value for treat in: size+treat? treat? Stepwise regression: step(model)
Jump height (how high ball can be raised off the ground) 8 9 10 11 Feet off ground Total SS = 11.11
X variable parameter SS F1,13 p Height +0.943 9.96 112 <0.0001 of player
X variable parameter SS p Weight +0.040 7.92 32 <0.0001 of player F1,13
An idea Perhaps if we took two people of identical height, the lighter one might actually jump higher? Excess weight may reduce ability to jump high…
X variable parameter SS F p Height +2.133 9.956 803 <0.0001 Weight -0.059 1.008 81 <0.0001 lighter heavier
Tall people can jump higher Heavy people often tall (tall people often heavy) + Height Jump + - Weight People light for their height can jump a bit more
Species.txt Rothamsted Park Grass experiment started in 1856
Exercise (species.txt) diane<-read.table(file.choose(), header=T); diane; attach(diane) Univariate trends: plot(Species~Biomass) plot(Species~pH) Combined trends: plot(Species~Biomass, type="n"); points(Species[pH=="high"]~Biomass[pH=="high"]); points(Species[pH=="mid"]~Biomass[pH=="mid"], pch=16); points(Species[pH=="low"]~Biomass[pH=="low"], pch=0)
Exercise (species.txt) 1. With a normal distribution, fit pH*Biomass • check model dignostics • test interaction for significance 2. With a poisson distribution, fit pH *Biomass • check model dignostics • test interaction for significance
Moral of the story: Make sure you KNOW what you are modelling!
Exercise (species.txt) 1. Fit glm: Species~pH, family=gaussian 2. Test if low and mid pH have the same effect • this is a planned comparison
Further reading Statistics: An Introduction using R (M.J. Crawley, Wiley publishers) Extending the linear model with R (JJ Faraway, Chapman & Hall/CRC)
Code for “Species” analysis > m1<-glm(Species~pH*Biomass, family=gaussian, data=diane) > summary(m1) > m2<-update(m1, ~.-pH:Biomass) > anova(m2,m1, test="Chi") > par(mfrow=c(2,2)); plot(m1) > m3<-glm(Species~pH*Biomass, family=poisson, data=diane) > m4<-update(m3, ~.-pH:Biomass) > anova(m4,m3, test="Chi") > par(mfrow=c(2,2)); plot(m3) >PH<-(pH!="high")+0 > m5<-glm(Species~pH, family=gaussian, data=diane) > m6<-update(m5, ~.-pH+PH) > anova(m6,m5, test="Chi")