Risk Evaluation: Maximizing Risk Accuracy

Risk Evaluation: Maximizing Risk Accuracy MATSA/MASOC Presentation to SORB 1/31/2013

Overview of Presentation Brief history of risk assessment and the different kinds of assessment that have been developed; Indication of where MA SORB Classification fits in these strategies; Summary of the criteria for evaluating risk instruments; Quick overview of the recent empirical evaluations of risk instruments; Suggested strategies for improving the MA SORB Classification.

Brief history of Risk Assessment

Brief History Static factors: fixed or historical factors that cannot be changed (such as age at first offense). Dynamic factors: potentially changeable factors, including both stable but potentially changeable risk traits and acute, rapidly changing factors. First generation – Unstructured clinical judgment, including structured clinical guidelines (SCG). Second generation – Actuarial risk scales comprising static, historical factors. Third generation – The assessment of “criminogenic needs” or dynamic risk factors. (Bonta, 1996)

Brief History First Generation • Characteristics of Unstructured Clinical Judgments – • No items specified for considering risk level; • Method for combining items is not specified. (Hanson & Morton-Bourgon, 2009)

Brief History First Generation • Characteristics of SCGs– • They identify items to use in the decision and typically provide numerical values for each item; • Although they also usually provide a method for combining the items into a total score, they do not specify a priori how the clinician should integrate the items; • No tables linking the summary scores to recidivism rates. (Hanson & Morton-Bourgon, 2009)

Brief History Second Generation • Requirements of Empirical Actuarials – • Provide specific items to make the decision with quantitative anchors, which are derived from empirical investigation; • Method for combining the items into an overall score is specified; • Tables linking the summary scores to recidivism rates are provided. (Hanson & Morton-Bourgon, 2009)

Brief History Second Generation • Requirements of Mechanical Actuarials – • They provide specific items for the decision with numeric values for each item, which are derived from a review of literature and theory; • Method for combining the items into an overall score is specified; • Tables linking the summary scores to recidivism rates are not provided. (Hanson & Morton-Bourgon, 2009)

Brief History Second Generation • Additional condition Adjusted Actuarials – • Use appropriate actuarials (empirical or mechanical); • The clinician adjusts the score (and the recommendation) using factors external to the actuarial. (Hanson & Morton-Bourgon, 2009)

Where Does It Fit? MA SORB Classification Factors

MA SORB Classification Factors Where Does It Fit? • Somewhere between an unstructured judgment and an SCG – • It specifies a set of factors to be considered; but • It does not provide any quantification of these factors (i.e., numeric item scores). • In many items it does not provide clear specification of where the cutoff for “presence” or “absence” of a factor would be. • Thus, it provides limited guidance both on the presence of items and on the combining of items.

MA SORB Classification Factors Example of SVR-20 • Item 3. Psychopathy Code this by reference to the PCL-R. Code PCL-R scores of 30 or above as “Y”, scores of 21-29 as “?”, and scores of 20 or lower as “N”. Y = 2, ? = 1, N = 0
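The SVR-20 coding rule quoted above is explicit enough to write down as a small function, which is the kind of specification the SORB factors lack. The following is only a sketch of that rule: the function name is hypothetical, and it assumes the PCL-R total is available as a whole number.

```python
from typing import Tuple


def code_psychopathy_item(pclr_total: int) -> Tuple[str, int]:
    """Code the psychopathy item from a PCL-R total score.

    Cutoffs follow the rule quoted on the slide:
    30 or above -> "Y" (2), 21-29 -> "?" (1), 20 or lower -> "N" (0).
    Assumes an integer total; the function name is illustrative only.
    """
    if pclr_total >= 30:
        return ("Y", 2)
    if pclr_total >= 21:
        return ("?", 1)
    return ("N", 0)
```

Because the cutoffs are stated, two raters given the same PCL-R total must arrive at the same item score.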

MA SORB Classification Factors Example of SORB Factors • Item 2. Repetitive and Compulsive Behavior • Unspecified: charges, convictions, or self-report? • Unspecified: includes both impulsive and compulsive behavior?

Evaluating Reliability and Validity Evaluating Risk Tools

Do Evaluators Agree? Assessing Reliability

Reliability Reliability is -- • Accuracy • Consistency • Across raters • Across time • Across different measures of the same construct • Freedom from variable error.

Reliability Interrater

Interrater Reliability Agreement between Rater 1 (R1) and Rater 2 (R2)
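Once two raters assign numeric item scores to the same cases, their agreement can be summarized with a single number. The slides do not name a statistic; Cohen's kappa is used here as one common, chance-corrected choice, and the function name is an assumption made for illustration.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Chance-corrected agreement between two raters on the same cases.

    rater1 and rater2 are equal-length sequences of categorical codes
    (for example 0, 1, 2). Kappa is an assumed choice of statistic;
    the presentation itself does not specify one.
    """
    if len(rater1) != len(rater2) or len(rater1) == 0:
        raise ValueError("Both raters must score the same non-empty set of cases.")
    n = len(rater1)
    # Proportion of cases on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts1 = Counter(rater1)
    counts2 = Counter(rater2)
    expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    if expected == 1.0:
        # Both raters used one identical code for every case.
        return 1.0
    return (observed - expected) / (1 - expected)
```

A kappa of 1 indicates perfect agreement and a kappa near 0 indicates agreement no better than chance. Without numeric item scores, as in the current SORB factors, this calculation cannot be made.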

Reliability Interrater Internal Consistency

Internal Consistency

Advantages of Quantification Reliability Checks • Allows one to calculate various forms of reliability – • Item reliability • Reliability of subscales (e.g., sexual deviance, criminality, etc.) • Internal consistency of items in the instrument • Thus, quantification allows us to restructure items and their anchors to improve reliability.
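As an illustration of the internal-consistency check listed above, the sketch below computes Cronbach's alpha from a table of numeric item scores. Alpha is an assumed choice of statistic (the slides do not name one), and the function name and input layout are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Internal consistency of a set of items (Cronbach's alpha).

    item_scores: one row per offender, each row a list of k numeric
    item scores. Layout and name are illustrative assumptions.
    """
    if len(item_scores) < 2:
        raise ValueError("Need scores for at least two cases.")
    k = len(item_scores[0])
    if k < 2 or any(len(row) != k for row in item_scores):
        raise ValueError("Every case needs the same set of two or more items.")
    # Sample variance of each item (column) across cases.
    item_vars = [variance(col) for col in zip(*item_scores)]
    # Sample variance of the total score across cases.
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("Total scores do not vary; alpha is undefined.")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Dropping one item at a time and recomputing alpha is one simple way to see which items weaken a subscale, which is the kind of restructuring that quantification makes possible.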

SCGs and Actuarials Reliability Results Most popular SCGs and actuarials assessed in the comparative literature have acceptable reliability. Unstructured judgments have poor reliability. The reliability of the MA SORB Classification Factors has not been assessed.

Predicting Recidivism Assessing Validity

Validity Answers the Question Does a test measure what it is supposed to measure? What does a test measure? What can one do with the test? What does a test score predict?

Predicting Sexual Recidivism (Hanson & Morton-Bourgon, 2009)

Predicting Sexual Recidivism Overall, controlling for a large number of study variables, empirical and mechanical actuarials were significantly better predictors of recidivism; SCGs using clinical judgment and SCGs that calculate total scores do not differ. In all studies examined, clinicians’ adjustment of actuarial scores consistently lowered predictive accuracy. (Hanson & Morton-Bourgon, 2009)

Why Is Clinical Judgment Inferior? Across multiple areas of prediction, mechanical actuarial prediction (statistical prediction rules [SPRs]) has been shown to be superior to clinical judgment. A recent meta-analysis summarizes the results of years of research (Grove et al., 2000).

(Grove et al., 2000) All studies published in English from the 1920s to the mid-1990s. 136 studies on the prediction of health-related phenomena or human behavior.

(Grove et al., 2000)

Why Is Clinical Judgment Inferior? A large body of research has documented the reasons for the cognitive errors that clinicians make. For instance, clinicians are great at making observations and rating items, but they are worse than a formula at adding the items together and combining them.

Advantages of Quantification Validity Checks • Allows one to use various strategies for improving validity of a measure– • Assess item correlation with outcome; • Adjust item cutoffs to maximize prediction; • Assess the validity of subscales (e.g., sexual deviance, criminality, etc.); • Optimize item weights for decision-making and predicting. • Thus, one can restructure items, their anchors, cutoffs, and combinations to improve validity.
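A minimal sketch of one such validity check follows. It computes the area under the ROC curve (AUC), a commonly used index of predictive accuracy, for any numeric score against a recorded recidivism outcome. The slides do not name a specific statistic, so AUC is an assumption here, as are the function name and the 0/1 outcome coding.

```python
def auc(scores, outcomes):
    """Area under the ROC curve for a numeric score.

    scores: numeric item or total scores, one per offender.
    outcomes: 1 if the offender recidivated, 0 if not (assumed coding).
    Returns the probability that a randomly chosen recidivist scores
    higher than a randomly chosen non-recidivist; ties count as half.
    """
    if len(scores) != len(outcomes):
        raise ValueError("Need one outcome per score.")
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("Need at least one recidivist and one non-recidivist.")
    # Compare every recidivist with every non-recidivist.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 means the score predicts no better than chance. Applying the same function to a single item, to a subscale, or to the total score under different cutoffs gives a direct way to compare alternative versions of an instrument.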

Strategies for Improving MA SORB Classification

Potential Strategies Improving the Current MA SORB Criteria Create separate adult and juvenile actuarials; Create separate male and female actuarials; Divide instrument into static and dynamic item subsets; Use recent meta-analytic literature to purge items that are not likely to have predictive validity;

Examples of Poor Predictors Released from civil commitment vs. not committed (Knight & Thornton, 2007); Maximum term of incarceration; Current home situation (vague and unspecified?); Physical condition; Documentation from a licensed mental health professional specifically indicating that offender poses no risk to reoffend;

Examples of Poor Predictors Recent behavior while incarcerated; Recent threats; Supplemental material; Victim impact statement.

Potential Strategies Improving the Current MA SORB Criteria Create separate adult and juvenile actuarials; Create separate male and female actuarials; Divide instrument into static and dynamic item subsets; Use recent meta-analytic literature to purge items that are not likely to have predictive validity; Transform items into a quantifiable format with clear cutoffs; Do a preliminary check on the predictive validity of revised items using existing databases.

Potential Strategies Improving the Current Criteria Do a small study on a subset of offenders to establish reliability. When using the revised instrument, require item and total scores for future validation studies.

Potential Strategies Alternatively Follow the lead of some other states and use existing static and dynamic instruments on which substantial research has already been done. MATSA and MASOC would be happy to consult on any strategy that MA SORB wishes to implement to improve the reliability and validity of the current classification method.
