Understand the complexities of measuring change in psychotherapy, focusing on reliability and validity in measurements, and new approaches in psychometrics. Learn about real data examples and how to overcome measurement challenges efficiently.
Complexities of measuring change in psychotherapy Chris Evans
Acknowledgements • Phil Richardson, Kevin Jones & others • Jan Lees, Mark Freestone, Nick Manning & others • Michael Barkham and many others • Mark Ashworth, Mel Shepherd, Susan Robinson, Maria Kordowicz & others • Susan McPherson • Jo-anne Carlyle
Classical psychometric model • We all have a position on an unmeasurable (“latent”) dimension of interest and of change (“true” value). • Quality of measurement a function of two issues: • Reliability • Validity • “No validity without reliability”
Reliability • Extent to which the measure is uncontaminated by random noise • Example 1 (“hard measurement”): when working with obesity, if you use poor scales to measure people’s weight, the readings may fluctuate a lot, entirely randomly. • Example 2 (“our measurement”): when measuring depression with a visual analogue rating scale, the measurement may be contaminated by imprecision in where the person places their mark (and potentially much else).
Validity • Extent to which the measure measures what it is supposed to measure and is uncontaminated by systematic (non-random) influences from other things • Example 1 (obesity): measuring people’s weight is pretty useless unless you also measure height, as obesity is (largely) a function of weight and height, so measuring one without the other leaves your measure systematically biased: invalid. • Example 2: in a multi-item measure of depression, an item asking about weight loss will be systematically affected by recent deliberate dieting, by drinking alcohol rather than eating wisely, or by serious physical illness causing weight loss (or by famine, but rarely in the western world).
Reliability: a graphical model • circles are “latent”, unmeasurable, variables • squares are measurables • straight arrows show directional influence • everything is nomothetic,... • ...i.e. something on which each person has a value
Psychometrics: classical model • assume one source of common variance... • ... the latent trait to measure • … only source of covariation between items • each item is also affected by “error” • errors are independent, and • ... uncorrelated with the latent trait of interest
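The assumptions on this slide can be illustrated with a small simulation. This is only a sketch: the function name `simulate_items`, the sample size, the number of items, the normal distributions and the equal weighting of the trait in every item are all assumptions made for illustration, not part of the talk.

```python
import numpy as np

def simulate_items(n_people=2000, n_items=4, error_sd=1.0, seed=0):
    """Simulate item scores under the classical model (illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    # One latent trait value per person: the only source of common variance.
    trait = rng.normal(0.0, 1.0, size=n_people)
    # Independent error for every person-by-item cell, drawn separately from the trait.
    errors = rng.normal(0.0, error_sd, size=(n_people, n_items))
    # Each observed item score is the latent trait plus that item's own error.
    items = trait[:, None] + errors
    return trait, errors, items

trait, errors, items = simulate_items()
# Correlations between items arise only through the shared latent trait.
item_corr = np.corrcoef(items, rowvar=False)
```

With trait and error variances both set to 1, the expected correlation between any two items is 0.5; the simulated values in `item_corr` scatter around that figure.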
Cronbach’s alpha • Reliability: the proportion of the measured variation (the sum of the boxes) that is • … due to the latent trait • ... and not due to the sources of error • Estimated as coefficient alpha, which is based on the ratio of inter-item covariance to total score variance
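As a minimal sketch, coefficient alpha can be computed from a respondents-by-items score matrix with the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy data are illustrative only.

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for a 2-D array: rows = respondents, columns = items."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    # Variance of each item across respondents.
    item_variances = scores.var(axis=0, ddof=1)
    # Variance of the total (summed) score across respondents.
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# Toy data (made up): three respondents, two items.
toy_scores = [[1, 2], [2, 1], [3, 3]]
toy_alpha = cronbach_alpha(toy_scores)
```

The more of the total score variance that comes from covariance between items rather than from the items' separate variances, the closer alpha gets to 1.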
Challenges of “our measurement” • We have little time or money for measuring • What we measure is often either complex (“quality of life”) or • idiosyncratic (“recovering from death of partner bringing back abuse in childhood and early death of abusing parent”).
Recent measures • Format: • Short(ish), • multi-item, • self-report measures • Intention • Not so much designed to provide strong measurement of a unidimensional latent variable but • … to provide rapid coverage of a broad range of issues likely to capture change for many clients. • Typical e.g.s: • Brief Symptom Inventory • CORE-OM • OQ-45.
Typical measures • Multiple items, e.g. • “I have felt terribly alone and isolated” • Time focus • “Over the last week” • Use rating anchors by frequency: • “Not at all”, “Only occasionally”, “Sometimes”, “Often”, “Most or all the time” • Or intensity • “Not at all” to “Extremely”
Issues about items • I have felt I have someone to turn to for support when needed • What does “turning to” involve? What is “support”? How much does “when needed” limit applicability? • I have felt O.K. about myself • How OK is OK?! • I have felt able to cope when things go wrong • How wrong is wrong? What is coping? (Quite a few European languages don’t have a verb “to cope”) • Tension and anxiety have prevented me doing important things • What if it was only tension? Or only anxiety? How important do things have to be to be important?
Issues about time frame • “Over the last week” • Do people really anchor to that? • Could it mean: • since Sunday? • since Monday? • the last seven days?
Issues about anchors • Not at all • Only occasionally • Sometimes • Often • Most or all the time • Is my “Only occasionally” your “Sometimes”?
Simple change variance model • Instead of modelling each occasion separately … • … look at the variance of the differences between observed scores for each individual • Get the internal reliability of item change
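One way to sketch this idea is to take the item-by-item differences between two occasions and compute coefficient alpha on those change scores. This is an illustrative reading of the slide, not necessarily the analysis used in the talk; `change_alpha` and the toy data are hypothetical.

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for a 2-D array: rows = respondents, columns = items."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

def change_alpha(pre, post):
    """Internal reliability of item change: alpha on (post - pre) item scores."""
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    if pre.shape != post.shape:
        raise ValueError("pre and post must have the same shape")
    # One change score per person per item.
    return cronbach_alpha(post - pre)

# Toy data (made up): four respondents, three items, two occasions.
pre_items = [[3, 2, 3], [2, 2, 1], [4, 3, 4], [1, 1, 2]]
post_items = [[1, 1, 2], [2, 1, 1], [1, 1, 2], [1, 0, 1]]
alpha_of_change = change_alpha(pre_items, post_items)
```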
Item change • A binary (Y/N) item now has three possible change scores: -1, 0, +1 • Three-level item: five scores: -2, -1, 0, +1, +2 • Four-level item: seven scores: -3, -2, -1, 0, +1, +2, +3 • n-level item: always 2n – 1 possible change scores
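The 2n – 1 rule can be checked by brute force. A small sketch, assuming item levels are scored 0 to n – 1; the function name `possible_change_scores` is made up for illustration.

```python
def possible_change_scores(n_levels):
    """All distinct (post - pre) values for an item scored 0 .. n_levels - 1."""
    if n_levels < 2:
        raise ValueError("an item needs at least two levels")
    levels = range(n_levels)
    # Collect every distinct difference over all pre/post combinations.
    return sorted({post - pre for pre in levels for post in levels})

binary_changes = possible_change_scores(2)      # [-1, 0, 1]
five_level_changes = possible_change_scores(5)  # -4 .. +4, nine values
```

So a five-level item, such as one using the frequency anchors from “Not at all” to “Most or all the time”, has nine possible change scores.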
Real data for the simple model • Exploratory, pragmatic RCT • “Slim” paradigm RCT: • Twelve weeks of group-based art therapy (AT) cf. treatment as usual • Design was N = 120 (60 per arm) • Minimisation randomisation • Richardson, Jones, Evans, Stevens & Rowe (2007) An exploratory randomised trial of group based art therapy as an adjunctive treatment in severe mental illness. Journal of Mental Health 16(4): 483-491.
Diversity & complexity of change • Naturalistic study of Therapeutic Communities in the UK • Borderline Syndrome Index • Lees, Evans, et al. (2006) Who comes into therapeutic communities? A description of the characteristics of a sequential sample of client members admitted to 17 therapeutic communities. Therapeutic Communities 27(3): 411-433. • Lees, Evans, et al. (2005) A cross-sectional snapshot of therapeutic community client members. Therapeutic Communities 26(3): 295-314.
Idiographic & hybrid measures • “Patient generated” measures: • Problem rating & target rating • Personal questionnaire • PSYCHLOPS (from MYMOPS) • www.psychlops.org • Ashworth, Robinson, et al. (2005) Measuring mental health outcomes in primary care: the psychometric properties of a new patient-generated outcome measure, 'Psychlops' ('Psychological Outcome Profiles'). Primary Care Mental Health 3: 261-270. • Ashworth, Evans, et al. (2009) Measuring psychological outcomes after cognitive behaviour therapy in primary care: a comparison between a new patient-generated measure ‘PSYCHLOPS’ (Psychological Outcome Profiles) and ‘HADS’ (Hospital Anxiety and Depression Scale). Journal of Mental Health 18(2): 169-177.
Conventional psychometrics • 110 paired pre and post PSYCHLOPS from primary care, largely CBT interventions • Cronbach’s alpha: t1 .79 and t2 .87 (cf. usual .94/.95 for CORE-OM) • Change effect size large: 1.53 cf. 1.06 for CORE-OM (p < .001) • Correlations with CORE-OM: .48 to .61
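The slide does not say how the change effect size was calculated. One common convention, assumed here purely for illustration, divides the mean pre-to-post change by the standard deviation of the pre-therapy scores. The function name and the data below are made up and are not the PSYCHLOPS or CORE-OM data.

```python
import numpy as np

def pre_post_effect_size(pre, post):
    """Mean (pre - post) change divided by the SD of pre scores (assumed convention)."""
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    if pre.shape != post.shape:
        raise ValueError("pre and post must be paired and of equal length")
    # Positive values mean scores fell (improved) between the two occasions.
    mean_change = (pre - post).mean()
    return mean_change / pre.std(ddof=1)

# Made-up paired total scores for four clients.
pre_scores = [10, 12, 14, 16]
post_scores = [8, 9, 12, 13]
d = pre_post_effect_size(pre_scores, post_scores)
```

Other denominators (pooled SD, SD of the change scores) give different values from the same data, so effect sizes are only comparable across measures when the same convention is used for each.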
Conclusions • Applying cross-sectional psychometric models (same for IRT/Rasch) is hiding complexity in our change data • Group summaries are hiding non-linearity and diversity in change profiles • Nomothetic questionnaires should be complemented with patient generated measures (PSYCHLOPS/PQ) • We need to stop hiding the complexity of our therapies! • … but we need a paradigm shift if we’re to manage the organisational anxieties that provokes • … and we need money and time to explore complexity • … and we won’t get money/time without a paradigm shift that answers questions
Thanks! chris@psyctc.org