
The Psychometrics Behind Neurocognitive Evaluation for Concussion

The Psychometrics Behind Neurocognitive Evaluation for Concussion. Philip Schatz, PhD, Department of Psychology, Saint Joseph’s University, schatzSJU@gmail.com. Disclosures. Consulting/support: International Brain Research Foundation, Department of Defense


Presentation Transcript


  1. The Psychometrics Behind Neurocognitive Evaluation for Concussion • Philip Schatz, PhD • Department of Psychology, Saint Joseph’s University • schatzSJU@gmail.com

  2. Disclosures • Consulting/support: • International Brain Research Foundation • Department of Defense • Sports Concussion Center of New Jersey • ImPACT Applications, Inc. • Disclaimer: • No role in the conceptualization, design, collection or analysis of data, manuscript preparation, or decision to submit for publication.

  3. Concussion Publications – to date

  4. Concussion Publications – projected

  5. Concussion Publications – psychometrics

  6. Overview • Basics of correlation and variance • Psychometric properties of concussion tests, in context of: • common psychological tests • other tests • Psychometric properties of a two-factor theory of concussion

  7. Reliability vs. Validity • Highly Reliable and Valid • Highly Reliable but Not Valid • Neither Reliable nor Valid

  8. Variance: What We Have Learned

  9. Variance: What We Expect

  10. Variance: What We Often See

  11. Psychometric Issues • Reliability in a nutshell: • Test-retest reliability assumes: • Fluctuations/changes are due to deficiencies in measure • Human behavior does not deviate from Time 1->Time 2 • We are measuring traits and not states

  12. Psychometric Issues • Test-retest reliability assumes: • Fluctuations/changes are due to deficiencies in measure • Human behavior does not deviate from Time 1->Time 2 • We MAY BE measuring states and not traits • Broglio, et al., 2007: 118 student “volunteers” completed: • ImPACT, HeadMinders, CogSport, MACT • One test session • 40 subjects (34%) had invalid baselines • Cameron, Schatz (unpublished thesis): 90 student “volunteers” completed ImPACT back-to-back • One test session • 18 subjects (20%) had invalid baselines • An additional 15 subjects (21%) had “red flag” scores (<1.5 SD)

  13. Psychometric Issues • Reliability in a nutshell: • Random error: • Situational fluctuations or changes in mood or environment • sleep, fatigue, diet, metabolism • distractions, noise, equipment

  14. Psychometric Issues • Random error: • Situational fluctuations or changes in mood or environment • sleep, fatigue, diet, metabolism • Athletes sleeping <7 hrs performed worse on 3/4 ImPACT composite scores, and endorsed more symptoms (McClure, et al., In Review, AJSM) • distractions, noise, equipment • Athletes in Group setting scored significantly worse than athletes tested in Individual setting on: • Verbal: 83.4 vs 86.5 (p=.003) • Visual: 71.6 vs 76.7 (p=.0001) • Motor: 35.6 vs 38.4 (p=.0001) • RT: 0.61 vs 0.57 (p=.001) • (Moser et al., 2001, AJSM)

  15. Psychometric Issues • Reliability in a nutshell: • Systematic error: • Factors that consistently affect measurement across the sample • practice effects • increased exposure, familiarity with measure, device

  16. Psychometric Issues • Evidence of systematic error on ImPACT? • e.g., practice effects • No significant differences on any Composite Score: • Back-to-back administrations (Cameron, Schatz: MS thesis) • Pre-Season->Mid-Season->Post-Season (Miller et al., 2007) • Significant improvement on: • Processing Speed at 30 days, 1 year (Schatz, Ferris, 2013; Elbin et al., 2011) • Visual Memory, RT at 1 year (Elbin et al., 2011)

  17. Psychometric Issues

  18. Psychometric Issues • Reliability in a nutshell: • Can we measure or distinguish between specific types of “error” • Can we measure or distinguish between specific “error” at X1 versus X2?

  19. Psychometric Issues • Reliability in a nutshell: • Can we measure or distinguish between specific types of “error”? • Can we measure or distinguish between specific “error” at X1 versus X2? • Cameron (unpublished MS thesis): 90 student “volunteers” completed ImPACT back-to-back • Using Iverson’s RCI cut-offs: • 8% showed significant decreases at T2 on Verbal Memory • 8% showed significant decreases at T2 on Visual Memory • 7% showed significant decreases at T2 on Motor Speed • 7% showed significant increases at T2 on Reaction Time • 26% showed significantly worse performance at T2 on 1 composite score
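
The slide above applies reliable change index (RCI) cut-offs without showing the arithmetic. The sketch below uses one common formulation, in which the standard errors of measurement at each time point are combined into a standard error of the difference. The slides do not state which formula or confidence level Iverson's cut-offs use, so the formula choice, the default critical value of 1.28 (a two-tailed 80% interval), and all numbers in the example are assumptions for illustration only.

```python
import math


def standard_error_of_difference(sd_t1, sd_t2, r_xx):
    """Standard error of the difference between two administrations.

    sd_t1, sd_t2: standard deviations at Time 1 and Time 2.
    r_xx: test-retest reliability coefficient (assumed the same at both times).
    """
    sem_t1 = sd_t1 * math.sqrt(1.0 - r_xx)  # standard error of measurement, Time 1
    sem_t2 = sd_t2 * math.sqrt(1.0 - r_xx)  # standard error of measurement, Time 2
    return math.sqrt(sem_t1 ** 2 + sem_t2 ** 2)


def reliable_change_index(score_t1, score_t2, sd_t1, sd_t2, r_xx):
    """Retest difference expressed in standard-error-of-difference units."""
    return (score_t2 - score_t1) / standard_error_of_difference(sd_t1, sd_t2, r_xx)


def classify_change(rci, z_crit=1.28):
    """Label a change as reliable if it exceeds the chosen critical value.

    z_crit=1.28 is an assumed default, not a value taken from the slides.
    """
    if rci <= -z_crit:
        return "reliable decrease"
    if rci >= z_crit:
        return "reliable increase"
    return "no reliable change"


if __name__ == "__main__":
    # Hypothetical athlete: 90 at Time 1, 80 at Time 2, SD = 10, r = .75.
    rci = reliable_change_index(90, 80, 10, 10, 0.75)
    print(round(rci, 2), classify_change(rci))
```

For scores where higher is worse, such as Reaction Time, a "reliable increase" is the unfavorable direction, which matches the way the slide reports increases on that composite.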

  20. Psychometric Issues: Reliability • How do we measure reliability? Pearson’s r? Intra-class correlations? • “There is literally no such thing as the reliability of a test, unqualified; the coefficient has meaning only when applied to specific populations” (Streiner and Norman, 1995)

  21. Psychometric Issues: Reliability • How do we measure reliability? • Pearson’s r: • general measure of strength of linear relationship • considered a weak measure of reliability when group means are similar but there is variation in individual scores • does not allow for correlation of multiple trials • “inter-class” correlation; does not account for variation within trials • cannot detect “systematic error” (e.g., practice effects; Weir, 2005)

  22. Psychometric Issues: Reliability • How do we measure reliability? • Pearson’s r: Example • considered a weak measure of reliability when group means are similar but there is variation in individual scores • Back-to-back administrations of ImPACT • Similar Group Means: 94.5 to 92.7 • Similar Standard Deviations: 4.8 to 5.6 • t(48)=1.22, p=.23 • r=.01

  23. Psychometric Issues: Reliability • How do we measure reliability? • Intra-Class Correlation Coefficient (ICC): • originally developed for analysis of “inter-judge” (inter-rater) effects • large differences between “judges” will result in low coefficients • indicates proportion of variability in the measure (e.g., mean) that is due to variation between individuals • as applied to test-retest reliability • ICC is used to analyze “trial-to-trial” consistency • Thus, reflective of the reliability of the measure
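
The contrast drawn in the preceding slides, that Pearson's r cannot detect systematic error while an ICC reflects trial-to-trial consistency, can be illustrated with a small sketch. It assumes an absolute-agreement, single-measure ICC (the Shrout and Fleiss ICC(2,1) form); the slides do not say which ICC form the cited studies used, and the scores below are made up to show a uniform 10-point practice effect.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def icc_absolute_agreement(trials):
    """ICC(2,1) from a two-way ANOVA decomposition.

    trials: list of k lists, each holding the same n subjects' scores.
    """
    k = len(trials)
    n = len(trials[0])
    grand = sum(sum(t) for t in trials) / (n * k)
    subject_means = [sum(t[i] for t in trials) / k for i in range(n)]
    trial_means = [sum(t) / n for t in trials]
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_trials = n * sum((m - grand) ** 2 for m in trial_means)
    ss_error = sum(
        (trials[j][i] - subject_means[i] - trial_means[j] + grand) ** 2
        for j in range(k)
        for i in range(n)
    )
    ms_subjects = ss_subjects / (n - 1)
    ms_trials = ss_trials / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_trials - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical data: every subject improves by exactly 10 points at retest.
    time1 = [80, 85, 90, 95, 100]
    time2 = [90, 95, 100, 105, 110]
    print("Pearson r:", pearson_r(time1, time2))
    print("ICC(2,1): ", round(icc_absolute_agreement([time1, time2]), 3))
```

In this made-up case the rank order and spacing of subjects are preserved perfectly, so Pearson's r is at its maximum, while the absolute-agreement ICC is pulled down by the shift between trials.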

  24. Psychometric Issues: Reliability
      Five published articles on reliability of ImPACT, listed chronologically:
      • Iverson, G., Lovell, M. R., & Collins, M. W. (2003). Interpreting change on ImPACT following sport concussion. Clin Neuropsychol.
      • Broglio, S. P., Ferrara, M. S., Macciocchi, S. N., Baumgartner, T. A., & Elliott, R. (2007). Test-retest reliability of computerized concussion assessment programs. J Athl Train.
      • Schatz, P. (2009). Long-term test-retest reliability of baseline cognitive assessments using ImPACT. Am J Sports Med.
      • Elbin, R. J., Schatz, P., & Covassin, T. (2011). One-year test-retest reliability of the online version of ImPACT in high school athletes. Am J Sports Med.
      • Schatz, P., & Ferris, C. (2013). One-month test-retest reliability of the ImPACT test battery. Arch Clin Neuropsych.

  25. Psychometric Issues: Reliability • Update to Broglio’s 2007 study: • Nakayama (MSU Dissertation) replicated Broglio’s 2007 study using only ImPACT. • Nakayama used ACSM standard for “athletically active” • 75mod-150vig min/wk cardio, 2-3 days/wk resistance training • <3% of subjects had Invalid results (vs. 34% for Broglio) • Higher ICCs across all Composite scores

  26. ImPACT: Reliability Data

  27. ImPACT: Reliability Data

  28. Working Memory: Reliability Data
      Test-retest reliability of other Working Memory measures:
      • ImPACT (VrM), 1 month: ICC .79 (Schatz, Ferris, 2013)
      • ImPACT (VrM), 45 days: ICC .76 (Nakayama, 2013)
      • ImPACT (VrM), 1 year: ICC .62 (Elbin et al., 2011)
      • CogSport (WM), 1 year: ICC .51 (Collie et al., 2001)
      • ImPACT (VrM), 2 years: ICC .46 (Schatz, 2010)
      • ANAM (CPT), 1 week: ICC .32 (Segalowitz et al., 2007)
      • CogSport (WM), 1 hour: ICC .24 (Collie et al., 2001)
      • Digit Span, 60 days: r .70 (Barr et al., 2003)
      • WMS (LM), 11 months: r .70 (Tulsky et al., 2003)
      • WMS (VR), 11 months: r .62 (Tulsky et al., 2003)
      • WMS (PA), 11 months: r .57 (Tulsky et al., 2003)
      • RAVLT, 1 year: r .55 (Snow et al., 1988)
      • RVDLT-R, 1 month: r .45 (Benedict, 1997)

  29. Reaction Time: Reliability Data
      Test-retest reliability of Reaction Time measures:
      • ImPACT (RT), 1 month: ICC .77 (Schatz, Ferris, 2013)
      • ImPACT (RT), 1 year: ICC .76 (Elbin et al., 2011)
      • CogSport (RT), 1 week: ICC .76 (Collie et al., 2001)
      • ImPACT (RT), 2 years: ICC .68 (Schatz, 2010)
      • ImPACT (RT), 45 days: ICC .68 (Nakayama, 2013)
      • Analog (RT), 1 year: ICC .65 (Eckner et al., 2011)
      • CogState (RT), 1 year: ICC .51 (Eckner et al., 2011)
      • ANAM (RT), 1 week: ICC .46 (Segalowitz et al., 2007)
      • CPT-II (child), 6 months: ICC .65 (Zabel et al., 2009)
      • BOT* (adults), 1 session: ICC .53 (Mercer et al., 2009)
      • CANTAB* (kids), 10 weeks: ICC .37 (Fisher et al., 2011)
      • Laser, 1 session: r .99 (Matsumura et al., 2013)
      * BOT: Bruininks-Oseretsky Test of Motor Proficiency; CANTAB: Cambridge Neuropsychological Test Battery

  30. Psychomotor Speed: Reliability Data
      Test-retest reliability of other Processing Speed/Coding measures:
      • ImPACT (PS), 1 month: ICC .88 (Schatz, Ferris, 2013)
      • ImPACT (PS), 45 days: ICC .86 (Nakayama, 2013)
      • ImPACT (PS), 1 year: ICC .82 (Elbin et al., 2011)
      • CogSport (CM), 1 week: ICC .76 (Collie et al., 2003)
      • ImPACT (PS), 2 years: ICC .74 (Schatz, 2010)
      • ANAM (CDS), 1 week: ICC .54 (Segalowitz et al., 2007)
      • SDMT, 10 days: r .74 (Hinton-Bayre et al., 1997)
      • Digit Symbol, 60 days: r .73 (Barr et al., 2003)
      • Tapping, 6 months: r .71 (Ruff, Parker, 1993)
      • Trails B, 60 days: r .65 (Valovich, 2006; Barr, 2003)
      • BVMT-R, 55 days: r .60 (Benedict, 1997)

  31. Other Tests: Reliability Data
      Test-retest reliability of other/common tests:
      • Systolic BP, 3 months: r .50
      • Diastolic BP, 3 months: r .53
      • Heart Rate, 4 visits: ICC .56
      • Heart Rate, 1 week: ICC .74
      • Gluc. Metab., immediate: r .77
      • BESS, 60 days: r .70
      Field Sobriety/Blood ETOH:
      • Actual BAC, immediate: r .97
      • Saliva ETOH, 10 mins: r .90
      • 1-leg Stand, immediate: r .61
      • Arrest Decis., immediate: r .54
      • Est. BAC, immediate: r .68

  32. “State-Trait” Issues
      Test-retest reliability of other constructs:
      • (Adult) Manifest Anxiety, 1 week: .67-.90
      • Children’s Manifest Anxiety, 1 week: .54-.76
      • Trait Anxiety-Adult, 1 hour: .84 (M), .76 (F)
      • State Anxiety-Adult, 1 hour: .33 (M), .16 (F)
      • Trait Anxiety-Adult, 20 days: .86 (M), .76 (F)
      • State Anxiety-Adult, 20 days: .54 (M), .27 (F)
      • Trait Anxiety-College, 30 days: .73 (M), .86 (F)
      • State Anxiety-College, 30 days: .51 (M), .36 (F)
      • Trait Anxiety-Children, 30 days: .65 (M), .71 (F)
      • State Anxiety-Children, 30 days: .31 (M), .47 (F)

  33. Two-Factor Theory • Rationale • Verbal Memory: • Information presented visually • Can be encoded verbally • Visual Memory: • Information presented visually • Cannot be easily encoded verbally • Reaction Time: • Speed of responses: Simple Choice -> Complex Choice • Visual Motor Speed: • Speed of information processing • Confusion in interpretation • Simplified by using “Memory”, “Speed”?

  34. Two-Factor Theory • Factor analysis: • Reduce a larger number of variables to a smaller number of factors • Analogy: see bumps under covers on bed, hear laughing • one “cluster” of bumps moves in one direction • the other “cluster” moves in another direction • identify them as “Child 1” and “Child 2” • each “Child” is a unique “Factor” • Can also be used to select a subset of variables from a larger set, based on which variables have the highest correlations with the principal components (or factors)

  35. Two-Factor Theory Factor analysis results: Baseline Group (N=22k) Concussion Group (N=560)

  36. Two-Factor Theory

  37. Two-Factor Theory Factor analysis results (data from Schatz & Sandel, 2012) Baseline Group Concussion Group

  38. Two-Factor Theory Factor analysis results (data from Schatz & Sandel, 2012)

  39. Two-Factor Theory Calculated Z-scores, using normative data (Mean, SD) for both baseline and post-concussion scores: • Baseline: $Z = \frac{\text{Athlete's Score} - \text{Baseline Mean}}{\text{Baseline SD}}$ • Post-concussion: $Z = \frac{\text{Athlete's Post-concussion Score} - \text{Baseline Mean}}{\text{Baseline SD}}$ • Averaged Verbal/Visual, Visual Motor/Reaction Time
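
The Z-score step on this slide can be sketched as follows. Both baseline and post-concussion scores are standardized against the same baseline normative mean and SD, then the four composites are averaged into a "Memory" and a "Speed" factor. The normative values and athlete scores below are invented placeholders, not ImPACT norms, and the option to flip the sign of the Reaction Time Z-score (so that higher always means better) is an assumption; the slide does not say how direction was handled.

```python
def z_score(score, norm_mean, norm_sd):
    """Standardize a score against baseline normative data."""
    return (score - norm_mean) / norm_sd


def two_factor_scores(scores, norms, reverse_reaction_time=True):
    """Average composite Z-scores into 'memory' and 'speed' factors.

    scores: dict of composite name -> athlete's score.
    norms:  dict of composite name -> (baseline mean, baseline SD).
    reverse_reaction_time: ASSUMPTION, not stated on the slide; flips the
        Reaction Time Z-score because lower times indicate better performance.
    """
    z = {name: z_score(scores[name], *norms[name]) for name in scores}
    if reverse_reaction_time:
        z["reaction_time"] = -z["reaction_time"]
    return {
        "memory": (z["verbal_memory"] + z["visual_memory"]) / 2,
        "speed": (z["visual_motor_speed"] + z["reaction_time"]) / 2,
    }


if __name__ == "__main__":
    # Hypothetical baseline norms (mean, SD); placeholders for illustration.
    norms = {
        "verbal_memory": (85.0, 10.0),
        "visual_memory": (75.0, 12.0),
        "visual_motor_speed": (38.0, 6.0),
        "reaction_time": (0.60, 0.08),
    }
    # Hypothetical post-concussion scores, each one SD worse than the norm.
    post = {
        "verbal_memory": 75.0,
        "visual_memory": 63.0,
        "visual_motor_speed": 32.0,
        "reaction_time": 0.68,
    }
    print(two_factor_scores(post, norms))
```

Passing an athlete's baseline scores through the same function gives the baseline factor scores, since the slide uses the baseline mean and SD as the reference in both cases.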

  40. Two-Factor Theory Calculated Z-scores, using normative data (Mean, SD) for both baseline and post-concussion scores:

  41. Validity • The extent to which a test measures what it is intended to measure. • Traditionally achieved using a criterion group (e.g., clinical, diagnosed) and a control group (e.g., absence of diagnosis) • Expressed in terms of “sensitivity” and “specificity” • Highly Reliable and Valid

  42. Validity • Calculating sensitivity: correct “positive” hits = 81.9% (i.e., the probability that a test result will be positive when a concussion is present)

  43. Validity • Calculating specificity: correct “negative” hits = 89.4% (i.e., the probability that a test result will be negative when a concussion is not present)
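
The two definitions above reduce to simple ratios over a classification table. In the sketch below, the counts are hypothetical values chosen only because they reproduce the 81.9% and 89.4% figures on the slides; the transcript does not give the underlying group sizes.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of concussed cases the test correctly flags as positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Proportion of non-concussed cases the test correctly flags as negative."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # Hypothetical counts (assumed, not from the slides).
    tp, fn = 59, 13  # concussed group: correctly vs incorrectly classified
    tn, fp = 59, 7   # control group: correctly vs incorrectly classified
    print(f"Sensitivity: {100 * sensitivity(tp, fn):.1f}%")
    print(f"Specificity: {100 * specificity(tn, fp):.1f}%")
```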

  44. Validity Data
      Sensitivity of “concussion” measures:
      • ImPACT (online, 72h): 91% (Schatz, Sandel, 2013)
      • ImPACT (desktop, 72h): 82% (Schatz et al., 2005)
      • PnP, Posture, Sym: 96% (Broglio et al., 2007)
      • ImPACT, Posture, Sym: 92% (Broglio et al., 2007)
      • ImPACT (desktop, 24h): 79% (Broglio et al., 2007)
      • HeadMinder CRI: 79% (Broglio et al., 2007)
      • Symptoms: 68% (Broglio et al., 2007)
      • Posture: 62% (Broglio et al., 2007)
      • Pencil/Paper (battery): 44% (Broglio et al., 2007)
      • BESS, SAC, PnP: 56% (McCrea et al., 2005)
      • PnP battery (Day 2): 23% (McCrea et al., 2005)

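  45. Validity Data
      Sensitivity/Specificity measures:
      • ImPACT (online, 72h): sensitivity 91%, specificity 69% (Schatz, Sandel, 2013)
      • ImPACT (desktop, 72h): sensitivity 82%, specificity 89% (Schatz et al., 2005)
      • SAC (immediate): sensitivity 94%, specificity 76% (McCrea et al., 2001)
      • RapScrCon, Tr B (24h): sensitivity 70%, specificity 74% (DeMonte et al., 2010)
      • Full Battery (Day 2): sensitivity 56%, specificity 79% (McCrea et al., 2005)
      • ANAM/SOT*: sensitivity 50%, specificity 96% (Register-Mihalik et al., 2012)
      • Symptoms (Day 2): sensitivity 27%, specificity 100% (McCrea et al., 2005)
      • Symptom Clusters (Day 2): sensitivity 47%, specificity 77% (Lau et al., 2011)
      • BESS (Day 2): sensitivity 24%, specificity 91% (McCrea et al., 2005)
      • PnP Battery (Day 2): sensitivity 23%, specificity 93% (McCrea et al., 2005)
      • SAC (Day 2): sensitivity 22%, specificity 89% (McCrea et al., 2005)
      * SOT: Sensory Organization Test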

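  46. Validity Data
      Sensitivity/Specificity of common medical conditions:
      • ImPACT (online): sensitivity 91%, specificity 69% (Schatz, Sandel, 2013)
      • ImPACT (desktop): sensitivity 82%, specificity 89% (Schatz et al., 2005)
      • Oxidative Stress (Alz): sensitivity 88%, specificity 70% (Lopez et al., 2013)
      • HBP (Hypertension): sensitivity 84%, specificity 82% (Nascimento et al., 2011)
      • Mammogram (1yr): sensitivity 82%, specificity 91% (Hofvind et al., 2012)
      • Echocardiogram: sensitivity 77%, specificity 61% (Tanaka et al., 2010)
      • Stress Echo: sensitivity 76%, specificity 87% (Sicari et al., 2007)
      • Prostate Exam: sensitivity 75%, specificity 44% (Ojewola et al., 2013)
      • PSA Test (>4): sensitivity 72%, specificity 46% (Rashid et al., 2012)
      • Cholesterol ‘At-Risk’: sensitivity 71%, specificity 76% (Gelsky et al., 1994)
      • Rapid Strep Test: sensitivity 65%, specificity 97% (Gurol et al., 2010)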

  47. Two-Factor Theory Applied to Validation Data (Schatz & Sandel, 2012): Two-Factor versus composite score and sub-scale score sensitivity and specificity.

  48. “Psychometric” Issues? • Is concussion testing falling under a unique level of scrutiny? • Is there an ulterior motive for the criticism of the psychometric properties of computer-based concussion tests and not other tests? • Would a more reliable instrument be valid (e.g., crystallized intelligence)? • Is it necessary to focus solely on one measure (e.g., ImPACT), as part of a more comprehensive assessment, when: • other measures have equal or worse psychometrics • lone measures are not recommended for concussion diagnosis/management

  49. Collaborators: Tracey Covassin, Ph.D.; Mickey Collins, Ph.D.; RJ Elbin, Ph.D.; Robin Karpf, M.D.; Anthony Kontos, Ph.D.; Mark Lovell, Ph.D.; Rosemarie Moser, Ph.D.; Summer Ott, Psy.D.; Gary Solomon, Ph.D. • Student Collaborators: Nicole Cameron, Charles Ferris, Timothy Kelley, Stacey Robertshaw, Natalie Sandel
