200 likes | 483 Views
SPSS & R. By Gilbert MacKenzie The Centre of Biostatistics, University of Limerick . Introduction. This is a Then and Now talk Revisit my York 2001 critique Review Progress with SPSS since then Talk about the opportunities with R. Then Assess Talk 2001.
E N D
SPSS &R By Gilbert MacKenzie The Centre of Biostatistics, University of Limerick
Introduction • This is a Then and Now talk • Revisit my York 2001 critique • Review Progress with SPSS since then • Talk about the opportunities with R
Then Assess Talk 2001 • My York 2001 talk criticised the package in terms of: • Technical Content • Inconsistencies • ProgrammeStructure
Then Assess 2001 Technical Content • Then weak mainly in relation Complex Modelling: • Linear Mixed Models (Now fixed) • Generalised Linear Models (Now fixed) • Generalised Linear Mixed Models(Now in R) • General Purpose MLE fitting ? (Now in R ) • NB: ThenWebsitereferred toCNLR
Then Using CNLR Fitting the Generalised Time Dependent Logistic Survival Model (MacKenzie, 1996, JRSS D, 45, 1, 21-34; SIM, 1997, 16, 1831-1843.) GTDL: TDL: RR:
Then CNLR – Novel Survival Programme model program b0 = 0.05 alpha = 0.01. compute const = 1. compute fi = 0. compute lambda = exp(fi). compute term0 = b0*const. compute term1 = term0 + alpha*surtim. compute pi = exp(term1)/(1+exp(term1)). compute qi = 1-pi. compute wi = (1+exp(term0)). compute pred = ( ( 1+exp(term1)) / ( 1+exp(term0)) )**(-lambda/alpha) . compute loss = -1*(di*fi+di*ln(pi)+(lambda/alpha)*(ln(qi)+ln(wi)) ). cnlr di /pred = pred /loss=loss/save=pred/bootstrap. First time CNLR is used for survival analysis - this is the method but, watch out the bootstrap does not respect the censoring distribution
Now! – SPSSVersion 16 withR • Seems to solve many of these problems at a stroke • Calling R programme code within SPSS • Passing SPSS objects (eg data) to R • Return R objects (eg results) to SPSS For SPSS users this is an immediate major advance
Now! – How to runR in SPSS-V16 comment SPSS Syntax file . comment Preamble SPSS . GET FILE='C:\Data\SPSSDATA\R_SPSS\lung_cancer.sav'. DATASET NAME DataSet1 WINDOW=FRONT. freq surtim/histogram. comment Invoke R – wrap code. BEGIN PROGRAM R. library(survival) # load libraries library(graphics) cancer<-spssdata.GetDataFromSPSS() # fetch SPSS data print(colnames(cancer)) # check variables subset<-cancer[1:10,1:5] # subset data print(subset) print(mean(surtim)) # send to spv print(var(surtim)) # send to spv #print(hist(age)) #print(nlm) END PROGRAM. comment– more SPSS commands and/orR blocks
Now! – Block Structure ofRin SPSS-V16 comment Preamble SPSS . GET FILE='C:\Data\SPSSDATA\R_SPSS\lung_cancer.sav'. DATASET NAME DataSet1 WINDOW=FRONT. BEGIN PROGRAM R. library(survival) # load libraries cancer<-spssdata.GetDataFromSPSS() # fetch SPSS data results1<- nlm(…) spsspivottable.Display(results1,title="Results1",format=formatSpec.GeneralStat) END PROGRAM. BEGIN PROGRAM R. results2<- nlm(…) spsspivottable.Display(results2,title="Results2",format=formatSpec.GeneralStat) END PROGRAM. BEGIN PROGRAM R. results3<- nlm(…) spsspivottable.Display(results3,title="Results3",format=formatSpec.GeneralStat) END PROGRAM.
Now! – The nlm function inR This is a general minimisation routine to minimise a function of p variables. Typical usage ie a call nlm.res <- nlm(loglik, theta, extra=extra, hess=TRUE) where: loglik = a user written likelihood function loglike(param,extra) theta = a vector of parameter starting values extra= an object containing other quantities required by loglike hess = TRUE forces computation of Hessian matrix H()
Now! – Some Standard Theory H() = D2{ loge(L() } Hessian Io() = -H()Observed Information I() = -E [H()]Fisher Information ()= [I()] -1Covariance Matrix () [Io()] -1 Covariance Matrix var()= diag[Io() -1]Vector of variances se() =sqrt[var()]Vector of Std. Errors
Now! – The loglike function inR This is a user written function to maximise the log-likelihood ie to minimise minus the log-likelihood. Here is one for the Exponential Model (1 parameter) # must set up extra before call loglikexp <- function(x, extra) { fi <- x lam <-exp(fi) # keep scalar hazard lam >0 ti <- extra[,1] # vector of survival times deltai <- extra[,2] # vector of censoring indicators loglike <- -sum(deltai*fi-lam*ti) } call result <- loglikexp(param, extra) result # displays loglikeon console in R Input Parameters Return Value
Now! – SomeSurvival Examples inR • Basic Exponential Survival • Non-PH GTDL • Non-PH ~ Gamma Frailty Move to Syntax File Now!
Now! – Setting up the Software & Interfaces -steps • Install SPSS Version 16 • Go to Cran mirror & download R V2.5 and install • Download R integration installer and documentation from www.spss.devcentral and install • Read the R integration document first • Run SPSS using blocked syntax files
Then Assess 2001 - Conclusions • Obvious need to • Improve the Technical Content of SPSS • Create aConsistent Programming Environment • Support Object Orientated Language • Encourage aTechnical Development Group
Now! – Overall Conclusions R • Basic Technical Deficiencies covered by R • Need to develop output interface • Need to allowR graphical objects out or to • chart editor • Guidance on future SPSS development strategies for Rintegration • Progress Indeed!!
References See http://www.staff.ie/mackenzieg shortly for a complete download of R integration tools, data and code in this talk.