INS Investigators’ Workshop: Methods for Single-Case Studies in Neuropsychology

John R. Crawford, School of Psychology, College of Life Sciences and Medicine, King’s College, University of Aberdeen, and School of Psychology, Flinders University of South Australia • j.crawford@abdn.ac.uk • www.abdn.ac.uk/~psy086/dept/ • Cognitive Neuroscience Research Group

Collaborators: • Prof Paul H Garthwaite, The Open University • Also: Prof David C Howell, University of Vermont; Prof Keith R Laws, University of Hertfordshire; Prof Annalena Venneri, University of Hull; Dr Colin D Gray, University of Aberdeen; Prof Adelchi Azzalini, University of Padova; (Dr Sytse Knypstra, University of Groningen)

The Importance of Dissociations “Dissociation is the key word of neuropsychology.” (Rossetti & Revonsuo, 2000, p. 2)

The Case for Single Cases “Studies in groups of patients which aim at elucidating the neurological and functional architecture of mental processes are useless and harmful, since they provide misleading results. The only appropriate method is to study individual patients” (Vallar, 2000, p. 334)

The need for methodological rigour in single-case studies “If advances in theory are to be sustainable they must be based on unimpeachable methodological foundations.” (Caramazza & McCloskey, 1988, p.619).

Evaluating Tests for Deficits in Single-Case Studies • Massive revival of interest in single-case studies in neuropsychology and neurology • The arguments for single-case studies over group studies are viewed by many as compelling • However, it is clear that they present difficulties when it comes to statistical analysis • This aspect of single-case studies has been relatively neglected

Single-case research: The three basic approaches to drawing inferences concerning a patient’s performance • Patient is administered fully standardized neuropsychological tests and performance is compared to large sample normative data • At other extreme, patient’s performance is not referenced to normative data or control performance; i.e., analysis is limited to intra-individual comparisons • Patient is compared to a (modestly sized) matched control sample

Limitations of the fully standardized approach • Can only be used in fairly circumscribed situations because: • New constructs are constantly emerging in neuropsychology • In contrast, collection of norms is a long and arduous process • Even where norms are available, may not be applicable to patient

Dangers of the intra-individual approach • Results can be very misleading as performance is not referenced to normal performance • Category specificity literature provides a good example of the dangers • Reports of apparently striking dissociations between naming of living versus non-living things and even within these categories (Broccoli’s area?) • In the vast majority of these studies inferences are drawn on the basis of chi-square tests comparing a patient’s living and non-living naming

Laws, Gale, Leeson & Crawford’s (2005) study on living / non-living naming • Laws et al. examined cases of AD who had or had not been classified as exhibiting a dissociation using intra-individual approach (chi-square test) • It was found that the performance of some patients with “dissociations” was not unusual when referenced to control performance • Moreover, patients who had not been identified as exhibiting dissociations were identified as such when performance was referenced to controls • In one case a patient classified as exhibiting a dissociation in favour of non-living things was found to exhibit a dissociation in the opposite direction

Testing for a deficit in single-case studies: the “standard” method • Patient’s performance is converted to a standard score based on mean and SD of control sample and referred to table of areas under the normal curve • The statistics of the control sample are treated as population parameters • When sample size is large this is not too much of a problem as the statistics provide sufficiently accurate estimates of the parameters • However, large sample sizes are rare in the single-case literature
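A minimal sketch of the “standard” method, for illustration only. The control scores and the patient score of 44 are hypothetical, and the function name is an assumption rather than anything taken from the workshop materials. Note that the control mean and SD are used as if they were population parameters.

```python
from statistics import NormalDist, mean, stdev

def z_test_deficit(patient_score, controls):
    """Standard method: control mean and SD are treated as population parameters."""
    m = mean(controls)
    s = stdev(controls)                  # sample SD (n - 1 denominator)
    z = (patient_score - m) / s          # patient's standard score
    p_one_tailed = NormalDist().cdf(z)   # lower-tail area under the normal curve
    return z, p_one_tailed

# Hypothetical control sample (N = 10) and hypothetical patient score
controls = [52, 47, 55, 49, 50, 53, 46, 51, 48, 49]
z, p = z_test_deficit(44, controls)
print(f"z = {z:.3f}, one-tailed p = {p:.4f}")
```

With only 10 controls, the mean and SD are themselves uncertain, which this calculation ignores.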

Testing for a deficit in single-case studies using Crawford & Howell’s (1998) proposed method: • Uses formula set out by Sokal and Rohlf (1995) • Modified t-test: tests hypothesis that patient did not come from the control population (under null hypothesis patient is an observation from this population) • Control sample statistics are treated as statistics • Crawford & Garthwaite (2002) also developed method of setting confidence limits on abnormality of score (using non-central t distributions)
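The modified t-test can be sketched as below, assuming the usual statement of the formula: t = (x* - mean) / (s * sqrt((N + 1) / N)) on N - 1 degrees of freedom. The data are hypothetical, the function names are illustrative, and the t distribution's cumulative probability is obtained by simple numerical integration so that only the standard library is needed. Confidence limits on abnormality (which require non-central t distributions) are not covered.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """Cumulative probability via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def crawford_howell(patient_score, controls):
    """Modified t-test: control statistics are treated as sample statistics."""
    n = len(controls)
    m, s = mean(controls), stdev(controls)
    t = (patient_score - m) / (s * math.sqrt((n + 1) / n))
    df = n - 1
    return t, df, t_cdf(t, df)           # one-tailed (lower-tail) p

# Hypothetical control sample (N = 10) and hypothetical patient score
controls = [52, 47, 55, 49, 50, 53, 46, 51, 48, 49]
t, df, p = crawford_howell(44, controls)
print(f"t({df}) = {t:.3f}, one-tailed p = {p:.4f}")
print(f"estimated % of control population scoring lower: {100 * p:.2f}")
```

The one-tailed p value, multiplied by 100, also serves as the point estimate of the abnormality of the patient's score.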

Comparison of two methods for testing for a deficit: Type I errors (Crawford & Garthwaite, Neuropsychology, 2005) • Monte Carlo simulation study • 5 control sample sizes (N) were examined: 5, 10, 20, 50 and 100 • For each value of N, one million samples of N + 1 observations were drawn from a normal distribution • The first N observations were taken as the control sample data and the (N + 1)th observation as the control case • The alternative tests for deficits were applied to these data and the percentage of Type I errors compared to the specified rate of 5%
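The simulation design described above can be sketched on a much smaller scale. This is an illustration under stated assumptions, not the published code: it uses 20,000 trials rather than one million, a one-tailed test at the 5% level, and critical values of t hard-coded from standard tables. All names are hypothetical.

```python
import math
import random

# Standard one-tailed 5% critical values of t, keyed by control sample size N (df = N - 1)
T_CRIT_ONE_TAILED = {5: 2.132, 10: 1.833, 20: 1.729}
Z_CRIT = 1.645

def type_i_error_rates(n, trials=20000, seed=1):
    """Proportion of control cases flagged as 'deficient' by the z and modified t methods."""
    rng = random.Random(seed)
    z_hits = t_hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n + 1)]
        controls, case = sample[:n], sample[n]     # first N = controls, last = control case
        m = math.fsum(controls) / n
        sd = math.sqrt(math.fsum((c - m) ** 2 for c in controls) / (n - 1))
        z = (case - m) / sd                        # standard method
        t = z / math.sqrt((n + 1) / n)             # modified t-test
        z_hits += z < -Z_CRIT
        t_hits += t < -T_CRIT_ONE_TAILED[n]
    return z_hits / trials, t_hits / trials

z_rate, t_rate = type_i_error_rates(5)
print(f"N = 5: z method {100 * z_rate:.1f}%, modified t {100 * t_rate:.1f}%")
```

Because every case is drawn from the control population, any significant result is by definition a Type I error.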

Monte Carlo simulation: Sampling from the control population • Perform statistical tests comparing control case and control sample and record if significant, i.e. record if Type I error • Step (2) Get machine to repeat this one million times • Step (3) Meanwhile go and get yourself a…

Comparison of two methods for testing for a deficit: Type I errors (Crawford & Garthwaite, 2005)

Departures from normality • Both z and modified t-test assume control data are drawn from a normal distribution • However, in single-case studies there is often evidence of negative skew in scores of the control samples (e.g., control mean = 50, SD = 10, but max score = 55) • We have run Monte Carlo simulations to examine control of Type I error rate when control data are non-normal • Same method as in previous study except that the N + 1 observations were sampled from distributions that were skewed and/or leptokurtic
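A rough sketch of the same kind of simulation with negatively skewed data. The choice of distribution here is an assumption made for illustration: a skew-normal variate generated as delta*|Z0| + sqrt(1 - delta^2)*Z1, with delta = -0.95 giving moderate negative skew. The distributions, skew values and trial counts used in the actual study may differ. N = 10 and the tabled one-tailed critical value t(9) = 1.833 are likewise illustrative.

```python
import math
import random

def skew_normal(rng, delta):
    """Skew-normal draw via delta*|Z0| + sqrt(1 - delta^2)*Z1; negative delta gives negative skew."""
    z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
    return delta * abs(z0) + math.sqrt(1 - delta * delta) * z1

def type_i_rate_skewed(n=10, delta=-0.95, t_crit=1.833, trials=20000, seed=1):
    """Type I error rate of the modified t-test (one-tailed, 5%) under skewed control data."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [skew_normal(rng, delta) for _ in range(n + 1)]
        controls, case = sample[:n], sample[n]
        m = math.fsum(controls) / n
        sd = math.sqrt(math.fsum((c - m) ** 2 for c in controls) / (n - 1))
        t = (case - m) / (sd * math.sqrt((n + 1) / n))
        hits += t < -t_crit
    return hits / trials

rate = type_i_rate_skewed()
print(f"Type I error rate under negative skew: {100 * rate:.1f}%")
```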

Sampling from negatively skewed and/or leptokurtic distributions • Perform statistical tests and record if significant, i.e. record if Type I error • Step (2) Repeat this one million times • Step (3) Meanwhile go and get yourself a…

Results of a Monte Carlo study: Robustness in face of moderate skew

The Internet • Most of the calculations involved with these methods are simple (the exception being the confidence limits) • However, they are still tedious and error-prone • Therefore, we have written computer programs that implement these methods • Freely available on the web: www.abdn.ac.uk/~psy086/dept/SingleCaseMethodsComputerPrograms.HTM • Calculations can be performed literally in seconds

In this example, the patient’s score is significantly below controls and so we conclude he/she has a deficit. Also, it is estimated that only 1.13% of the control population would exhibit this poor a score; the 95% CI on this estimate of abnormality is from 0.05% to 4.68%

Modified T-Test Versus Modified ANOVA • Mitchell and colleagues (Mycroft et al, 200; Mitchell et al, 2004) have criticised the foregoing method • They argue that (a) a notional patient population will have markedly increased variance relative to the control population, and (b) our method will therefore produce inflated Type I errors • Mitchell and colleagues propose an ANOVA that employs more conservative critical values to overcome this perceived problem

Modified T-Test Versus Modified ANOVA • We believe there are two major problems with Mitchell et al.’s position: • The argument over Type I errors is untenable (Crawford et al, Cognitive Neuropsychology, 2004) • Statistical power to detect a deficit is very low for Mitchell et al.’s method (Crawford & Garthwaite, Cognitive Neuropsychology, in press)

A graphic illustrating Mitchell et al.’s scenario: A notional patient population (gray line) has the same mean as controls (dark line) but has greater variability • This scenario is not realistic: If the means do not differ but patients are more variable, then scores below the control mean must be exactly balanced by scores above the control mean

If the patient mean is lower than control mean (even marginally) then issue of Type I error does NOT arise: a deficit is present and the question is whether it can be detected (i.e., it is a power issue)

Power to detect a (2 SD) deficit: comparison of three methods (Crawford & Garthwaite, Cognitive Neuropsychology, in press)
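For illustration, power for the modified t-test alone can be estimated with the same Monte Carlo approach. This sketch assumes the patient is drawn from a population whose mean is 2 control SDs below the control mean and whose SD equals that of the controls. N = 10, 20,000 trials and the one-tailed critical value t(9) = 1.833 are assumptions. Mitchell et al.'s ANOVA is not implemented here.

```python
import math
import random

def power_modified_t(n=10, deficit_sd=2.0, t_crit=1.833, trials=20000, seed=1):
    """Proportion of simulated patients (true deficit of deficit_sd SDs) detected by the modified t-test."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        controls = [rng.gauss(0, 1) for _ in range(n)]
        patient = rng.gauss(-deficit_sd, 1)        # patient population shifted downwards
        m = math.fsum(controls) / n
        sd = math.sqrt(math.fsum((c - m) ** 2 for c in controls) / (n - 1))
        t = (patient - m) / (sd * math.sqrt((n + 1) / n))
        hits += t < -t_crit
    return hits / trials

power = power_modified_t()
print(f"Estimated power (N = 10, 2 SD deficit): {100 * power:.1f}%")
```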

PART 2: Dissociations in Neuropsychology and Statistical Tests on Differences

DISSOCIATIONS • In neuropsychology, deficits are of limited theoretical interest unless they are accompanied by preserved or less impaired performance on other tasks; i.e. the aim of many single-case studies is to demonstrate dissociations of function

Conventional Definition of a Classical Dissociation “If patient X is impaired on task 1 but performs normally on task 2, then we may claim to have a dissociation between tasks” (Ellis and Young, 1996, p. 5)

A Classical Dissociation (based on Shallice, 1988) [Figure: performance on Task X and Task Y]

The Importance of Dissociations “Dissociations play an increasingly crucial role in the methodology of cognitive neuropsychology… they have provided critical support for several influential, almost paradigmatic, models in the field.” (Dunn & Kirsner, 2003, p. 2)

Criteria for Dissociations: Three Problems • What constitutes a “deficit” and being “within normal limits” is very poorly specified • One half of the typical definition essentially involves an attempt to prove the null hypothesis • A patient’s score on the “impaired” task could lie just below the critical value for defining impairment and the performance on the other test lie just above it (see Caramazza & Shelton, 1998 for similar point)

Problems with Conventional Criteria for a Classical Dissociation [Figure: performance on Task X and Task Y]

Crawford, Garthwaite & Gray (2003):

Potential Solutions to the Three Problems • Crawford et al. (2003) provided fully explicit criteria for a deficit (using Crawford & Howell’s test) • They also introduced a requirement that the patient’s performance on Task X should be significantly poorer than performance on Task Y • This criterion deals with the problem of trivial differences • It also provides us with a positive test for a dissociation (thereby lessening reliance on what boils down to an attempt to prove the null hypothesis of no deficit or impairment on Task Y)

Crawford et al.’s criteria for a classical dissociation • X significantly below controls on Crawford & Howell’s test (one-tailed) • Y not significantly different from controls on Crawford & Howell’s test (one-tailed) • X significantly different from Y [Figure: performance on Task X and Task Y]
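The three criteria can be combined in a small decision rule. This is a sketch only: the function name and the p values passed in are hypothetical, and the p values are assumed to have been obtained beforehand (one-tailed tests for a deficit on X and on Y, plus a test on the difference between X and Y).

```python
def classical_dissociation(p_x_deficit, p_y_deficit, p_difference, alpha=0.05):
    """Return True only if all three criteria for a classical dissociation are met."""
    deficit_on_x = p_x_deficit < alpha        # X significantly below controls
    y_within_normal = p_y_deficit >= alpha    # Y not significantly below controls
    x_differs_from_y = p_difference < alpha   # X significantly different from Y
    return deficit_on_x and y_within_normal and x_differs_from_y

# Hypothetical p values: clear deficit on X, normal Y, reliable X-Y difference
print(classical_dissociation(0.01, 0.40, 0.02))   # True

# Hypothetical p values: X just below and Y just above the cut-off (trivial difference)
print(classical_dissociation(0.04, 0.06, 0.60))   # False
```

The second call illustrates the case the conventional criteria would accept: X falls just on the impaired side and Y just on the normal side, but the two scores do not differ reliably.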

Testing for a difference between a patient’s performance on Tasks X and Y • How should we test for significant difference between a patient’s score on Tasks X and Y ? • In most single-case studies the two tasks of interest will have different means and SDs • For example a patient’s performance on a ToM task with (mean=35, SD=12) is to be compared with performance on an executive task (mean=22 SD=6) • In order to meaningfully compare performance it is necessary to standardize scores on the two tasks

The Payne and Jones method • A long established method is that of Payne & Jones (1957):
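For reference, the Payne and Jones statistic is usually written as below. The notation is an assumption: z_X and z_Y are the patient's scores on the two tasks standardized against the control means and SDs, and r_XY is the correlation between the tasks in the controls. The result is referred to the standard normal distribution, so the control statistics are again treated as population parameters.

```latex
z_D = \frac{z_X - z_Y}{\sqrt{2 - 2\,r_{XY}}},
\qquad
z_X = \frac{x^{*} - \bar{x}}{s_X},
\quad
z_Y = \frac{y^{*} - \bar{y}}{s_Y}
```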

Testing for a difference between a patient’s performance on Tasks X and Y: Crawford, Howell & Garthwaite (1998) method
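A sketch of this test, assuming the usual statement of the formula: t = (z_X - z_Y) / sqrt((2 - 2r)((N + 1) / N)) on N - 1 degrees of freedom. The control means and SDs below are those of the ToM and executive task example; the patient's scores, the correlation of 0.5 and the control sample size of 20 are hypothetical.

```python
import math

def standardized_difference_t(x, y, mean_x, sd_x, mean_y, sd_y, r_xy, n):
    """t statistic for the difference between a patient's standardized scores on two tasks."""
    z_x = (x - mean_x) / sd_x
    z_y = (y - mean_y) / sd_y
    t = (z_x - z_y) / math.sqrt((2 - 2 * r_xy) * ((n + 1) / n))
    return t, n - 1                      # statistic and degrees of freedom

# ToM task: mean = 35, SD = 12; executive task: mean = 22, SD = 6
# Hypothetical patient scores (14 and 20), r = 0.5, N = 20 controls
t, df = standardized_difference_t(14, 20, 35, 12, 22, 6, r_xy=0.5, n=20)
print(f"t({df}) = {t:.3f}")              # compare with tabled critical values of t
```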

Revised Standardized Difference Test (Crawford & Garthwaite, 2005b; Garthwaite & Crawford, 2004): • Looks nasty but is essentially of familiar form…

Monte Carlo Evaluation of tests for differences between tasks • 5 control sample sizes (N) were examined: 5, 10, 20, 50 and 100 • For each value of N and for each of 4 values of r (the correlation between tasks), one million samples of N + 1 pairs of observations were drawn from a bivariate normal distribution • The first N pairs were taken as the control sample data and the (N + 1)th pair was taken as the control case • The alternative tests for differences were applied to these data and the percentage of Type I errors compared to the specified rate of 5%
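A scaled-down sketch of this design, for illustration only. It compares a Payne and Jones style z (critical value 1.96) with the t version on N - 1 df (tabled two-tailed critical value 2.776 for N = 5). The Revised Standardized Difference Test is not implemented. The trial count, seed, two-tailed testing and all names are assumptions.

```python
import math
import random

def mean_sd(values):
    n = len(values)
    m = math.fsum(values) / n
    return m, math.sqrt(math.fsum((v - m) ** 2 for v in values) / (n - 1))

def difference_type_i_rates(n=5, rho=0.5, t_crit=2.776, trials=20000, seed=1):
    """Type I error rates (two-tailed, 5%) for two tests on the X-Y difference."""
    rng = random.Random(seed)
    pj_hits = t_hits = 0
    for _ in range(trials):
        xs, ys = [], []
        for _ in range(n + 1):           # N + 1 pairs from a bivariate normal
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            xs.append(z1)
            ys.append(rho * z1 + math.sqrt(1 - rho * rho) * z2)
        cx, cy = xs[:n], ys[:n]          # first N pairs = control sample
        mx, sx = mean_sd(cx)
        my, sy = mean_sd(cy)
        cov = math.fsum((a - mx) * (b - my) for a, b in zip(cx, cy)) / (n - 1)
        r = cov / (sx * sy)              # sample correlation in the controls
        denom = math.sqrt(max(2 - 2 * r, 1e-12))   # guard against rounding when r is near 1
        z_d = ((xs[n] - mx) / sx - (ys[n] - my) / sy) / denom
        t = z_d / math.sqrt((n + 1) / n)
        pj_hits += abs(z_d) > 1.96
        t_hits += abs(t) > t_crit
    return pj_hits / trials, t_hits / trials

pj_rate, t_rate = difference_type_i_rates()
print(f"N = 5, r = 0.5: z version {100 * pj_rate:.1f}%, t version {100 * t_rate:.1f}%")
```

By construction the z version rejects whenever the t version does, so its error rate can never be the lower of the two.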

Simulation study of Type I errors for tests on differences [Figure: N + 1 sampled pairs of (X, Y) observations] • Perform statistical tests comparing control case with control sample and record if significant, i.e. record if Type I error • Meanwhile, have a noodle about on the…

Monte Carlo simulation: Type I error rate for Revised Standardized Difference Test (rxy = 0.5 in this example)

Evaluating criteria for classical dissociations • To recap: we have considered two sets of criteria for detecting dissociations: the conventional criteria and Crawford & Garthwaite’s (2005b) criteria • We now have a suitable test for Crawford & Garthwaite’s third criterion (i.e. it requires a significant difference between a patient’s scores on X and Y) • We (Crawford & Garthwaite, 2005a, Neuropsychology) have examined the performance of these two sets of criteria • Same approach as used for evaluating the foregoing tests; i.e. sample from bivariate distributions but apply the sets of criteria rather than individual tests for components of these criteria

Type I error rate for Crawford & Garthwaite’s (2003; 2005b) criteria and conventional criteria for a classical dissociation (in this example rxy = 0.5)

Evaluating criteria for a classical dissociation: Conclusions • Conventional criteria for a classical dissociation will misclassify a worryingly high percentage of healthy controls as exhibiting a classical dissociation regardless of the size of the control sample (rate was as high as 18.6% in one of the scenarios) • In contrast, Crawford & Garthwaite’s (2003; 2005b) criteria are conservative; i.e. very low percentage of controls misclassified • Results underline importance of testing the difference between patient’s X and Y scores • Crawford & Garthwaite’s criteria were relatively robust in face of skewed control data
