Human Subject Abuse Potential Studies: Validated Scales and Statistical and Clinical Assessments. Kerri A. Schoedel, PhD, Scientific Director, Clinical Pharmacology, INC Research. Science of Abuse Liability Assessment, 10-Nov-2011.
Disclosure
• I work through INC Research (Kendle) as a consultant for various pharmaceutical companies, but the opinions expressed in this presentation are my own and don’t represent the views of any pharmaceutical company
Abuse potential measures
• Abuse potential measures are subjective:
  • Model recreational drug abuse, not dependence or reinforcement (directly)
  • ‘Abuse’ is largely a subjective construct; no existing objective measures accurately predict potential for abuse across classes
• FDA 2010 draft guidance:
  • Measures most directly relevant to abuse:
    • Measures of ‘liking’ and subjective effects
    • Take Drug Again
    • Drug Similarity/Identification
  • Drug Class Effects (not directly relevant; used in interpretation):
    • Strength, mood scales such as POMS, ARCI, etc.
    • Cognitive and behavioral effects
    • Physiological measures
  • Other:
    • Drug value/choice procedures
Derived Endpoints
• Multiple timepoints for ‘at this moment’ measures to capture onset, peak and offset, summarized using derived endpoints
• Peak effect (Emax) or peak change from baseline (CFBmax) for each dose
  • Most sensitive but does not account for when effect occurs and how long it lasts
  • Susceptible to “expectancy” (i.e., need only a single sporadic response at one timepoint) → does not necessarily reflect the whole experience
• Partial area under the effect curve (e.g., AUE0-2h; AUE0-3h; AUE0-4h)
  • Generally well-correlated with Emax (Spearman correlation >0.700) across drug classes/studies
  • Assumes that early effects are most important
• Time-weighted mean, AUE measures
  • Provides a measure of “overall” effect across the drug-taking experience
  • But can also visualize this from timecourse data
  • May be preferable to rely on subject’s rating of the overall experience (i.e., end-of-day/next-day measures)
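The derived endpoints above can be sketched for a single subject and dose as follows. This is a minimal illustration, not the analysis used in the studies: the function name, the example scores, and the choice of the linear trapezoidal rule are assumptions, and the partial AUE is assumed to end exactly on a sampled timepoint (no interpolation).

```python
def derived_endpoints(times_h, scores, baseline, partial_end_h):
    """Summarize one 'at this moment' VAS time course (hypothetical helper)."""
    emax = max(scores)                      # peak effect
    cfb_max = emax - baseline               # peak change from baseline
    te_max = times_h[scores.index(emax)]    # time of first occurrence of the peak

    def aue(t_end):
        # Linear trapezoidal area from the first sampled timepoint up to t_end.
        # Assumption: t_end coincides with a sampled timepoint.
        area = 0.0
        for i in range(1, len(times_h)):
            if times_h[i] > t_end:
                break
            area += (times_h[i] - times_h[i - 1]) * (scores[i] + scores[i - 1]) / 2.0
        return area

    total = aue(times_h[-1])
    return {
        "Emax": emax,
        "CFBmax": cfb_max,
        "TEmax": te_max,
        "AUE_partial": aue(partial_end_h),
        # Time-weighted mean: total area divided by the observation window
        "TWM": total / (times_h[-1] - times_h[0]),
    }


# Made-up bipolar Drug Liking scores (50 = neutral) at 0.5, 1, 2, 4 and 8 h
example = derived_endpoints([0.5, 1.0, 2.0, 4.0, 8.0], [60, 80, 70, 55, 50],
                            baseline=50, partial_end_h=2.0)
print(example)
```

In this made-up profile the peak of 80 at 1 h drives Emax, while the time-weighted mean (about 59) sits much closer to neutral, which is the contrast between peak and "overall" summaries that the slide describes.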
Development of Measures
• Many of the measures were developed years ago, e.g., ARCI (Haertzen, 1941); DEQ, NDQ and SDQ (Fraser et al., 1961)
  • Terms may no longer be widely used or have the same meaning
• Many scales were developed in “addicts” or “post-addicts” (in many cases incarcerated):
  • Non-dependent recreational drug users may have different motivations/responses
• Developed largely using stimulants/opioids
  • Content/nature of items based primarily on stimulant/opioid pharmacology
  • Opioid/stimulant effects can be easily detected regardless of scale properties
• Typical batteries contain many redundant items and those not directly relevant to abuse potential (e.g., >90 individual items with ARCI + DEQ/NDQ/SDQ)
  • Complicates statistical analysis and interpretation
  • May lead to subject fatigue
  • But… is consistent with traditional “exploratory” design of these studies
Selecting appropriate measures
• FDA 2010 draft guidance:
  • “The assessment of abuse potential can include co-primary endpoints and some secondary endpoints of interest, if appropriate.”
  • More consistent with a “confirmatory” type of design
  • How to select these “primary” measures?
• Human abuse potential studies assess intrinsic potential for abuse in order to determine appropriate controls at the pre-market stage. Thus, in selecting measures and analyzing data:
  • We need to avoid risk of false positive results and applying more control than necessary
  • But we must be sure that a negative study is a “true” negative
  • Must ensure that measures are sensitive, reliable, and valid for modern subject populations and drugs, including novel drugs
• Retrospective analysis of multiple human abuse potential studies completed at our site (N>300 subjects), based on FDA PRO Guidance (Schoedel, CPDD 2010)
  • Emax for “at this moment” Drug Liking, Good Effects, High, Bad Effects VAS, ARCI MBG
  • Emax for “Global” (end-of-day/next-day) measures: Overall Drug Liking, Take Drug Again VAS, Subjective value (choice procedure)
Sensitivity, Variability and Reliability
• Emax of “at this moment” and Global effects VAS:
  • Highly sensitive: large effect sizes (>1.0) for drugs of abuse
  • ARCI MBG less sensitive with lower effect sizes (<1.0)
• Variability
  • Pooled placebo response (difference Emax vs. ‘neutral’) lower for bipolar Drug Liking VAS/Overall Drug Liking VAS compared to unipolar VAS (e.g., Good Effects, High, Take Drug Again)
  • % CV lowest for bipolar Drug Liking VAS/Overall Drug Liking VAS for placebo and across active drug doses compared to unipolar measures
• Within-study test-retest reliability (“at this moment” measures)
  • Good test-retest reliability for most measures with hydromorphone 8 mg given on 2 occasions (>0.7) and repeated fentanyl infusions (>0.9)
  • Repeated saline in fentanyl study: Drug Liking VAS showed good test-retest reliability (0.825), but not Good Effects (0.435) or High VAS (-2.10)
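As a rough illustration of the three properties on this slide, the sketch below computes an effect size, a % CV, and a test-retest coefficient from per-subject Emax values. The slides do not state which formulas were used, so the choices here are assumptions: a paired standardized mean difference for effect size, sample SD over mean for % CV, and a Pearson correlation between two occasions for test-retest. All data are made up.

```python
from statistics import mean, stdev


def paired_effect_size(active, placebo):
    """Mean within-subject difference divided by the SD of the differences."""
    diffs = [a - p for a, p in zip(active, placebo)]
    return mean(diffs) / stdev(diffs)


def percent_cv(values):
    """Coefficient of variation in percent (sample SD / mean)."""
    return 100.0 * stdev(values) / mean(values)


def pearson_r(x, y):
    """Pearson correlation, used here as a simple test-retest coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical Drug Liking Emax values for five subjects (50 = neutral)
active_emax = [85, 90, 70, 95, 80]
placebo_emax = [50, 52, 51, 50, 55]
retest_emax = [83, 92, 74, 90, 78]   # same active dose, second occasion

print("effect size:", round(paired_effect_size(active_emax, placebo_emax), 2))
print("placebo %CV:", round(percent_cv(placebo_emax), 1))
print("test-retest r:", round(pearson_r(active_emax, retest_emax), 3))
```

An intraclass correlation would be a common alternative to Pearson's r for test-retest reliability; it is not shown here.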
Construct Validity
• Convergent validity: Item correlated to other scales measuring same concept
  • Drug Liking, Good Effects, High VAS: significant and relatively high correlations (>0.700)
  • ARCI MBG less consistently correlated with other positive measures
  • Drug Liking VAS/Good Effects VAS significantly correlated with Overall Drug Liking/Take Drug Again VAS; High VAS and ARCI MBG less consistently correlated
• Discriminant validity: Item not correlated with concepts other than one it is intended to measure
  • Drug Liking VAS/ARCI MBG and most Global measures were not correlated or inversely correlated with Bad Effects VAS
  • Good Effects/High VAS and Drug Value: significant positive correlations with Bad Effects VAS in some cases, in particular for placebos and negative controls
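Convergent and discriminant validity are both read off correlation coefficients between measures. The sketch below shows one way to compute a Spearman rank correlation with tie handling. Spearman is an assumption here (the slide does not name the coefficient used for these results), and the scores are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0   # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical Emax values for five subjects
drug_liking = [60, 75, 90, 82, 55]
good_effects = [20, 55, 95, 70, 10]
bad_effects = [30, 5, 0, 10, 40]

print("Liking vs Good Effects:", spearman_rho(drug_liking, good_effects))
print("Liking vs Bad Effects:", spearman_rho(drug_liking, bad_effects))
```

In this toy data set a high positive coefficient with Good Effects would support convergent validity, and a zero or negative coefficient with Bad Effects would support discriminant validity.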
Measures Performance Conclusions
• Subjective VAS and other measures show good sensitivity with positive control drugs of abuse
• Some unipolar (VAS) measures, such as High VAS, show higher variability, placebo response, and less selectivity
  • May be ambiguous or more subject to “expectancy”
  • Were correlated in some cases with effects other than those they are intended to measure (e.g., Bad Effects)
  • These measures should be interpreted with some caution
• Bipolar Drug Liking VAS and Overall Drug Liking VAS are correlated with unipolar Good Effects and Take Drug Again VAS, but have better measurement properties
• ARCI MBG shows low variability but generally less ability to detect change and lower correlation with other measures
Simplifying Measures Selection and Interpretation
• In general, recommend using bipolar measures where possible
  • “Forced choice” for subject reduces difficulties with interpretation when opposing effects are observed (i.e., concurrent good and bad effects)
  • Better measurement properties and easier for subject to understand
• Measures of “drug liking” as “Primary Endpoints”:
  • Used for many years, high face validity and relevance to concept of abuse
• Bipolar Drug Liking (“at this moment”):
  • Sensitive, specific, reliable, good construct validity
  • Still subject to some interpretation (e.g., impact of onset, relative duration, etc.), expectancy effects
Interpreting “At this moment” Measures
[Figure: Drug Liking time course. Vertical axis: Strong disliking, Neutral, Strong liking. Horizontal axis: Hours post-dose. Annotation: Peak Drug Liking = 80.]
• Abuse potential = TEmax, AUE, Cmax/Tmax, Liking (Emax)/Disliking (Emin)…?
• Or…ask the subject
Bipolar Overall Drug Liking VAS
• High face validity and relevance of end-of-session/next-day measures
  • Public health risk is from repeated administration
• Highly correlated with unipolar Take Drug Again but has better measurement properties
  • Less variability, lower placebo response, etc.
• Slightly less sensitive than “at this moment” Liking (effect sizes are still “large” [>0.8] for abused drugs)
• Subjects decide which aspects of the experience are most important (duration, onset, balance of positive, negative and other effects) → Less subject to interpretation
Limitations of Measures Analysis
• Substantial data on criterion validity are lacking:
  • There is currently no gold standard for measurement of abuse potential
  • Subjective measures are generally correlated with animal studies and other human laboratory studies (i.e., self-administration)
• Publicly available postmarket data also have significant limitations, and abuse-related morbidity is influenced by factors other than intrinsic abuse potential
  • Availability, fads
  • Safety, overdose risks
  • Dependence potential
  • Pattern of abuse (occasional, sporadic vs repeated, escalating)
• Head-to-head data from different classes would be needed from a single study for correlations with good epidemiological data
Contrasts and Hypothesis Testing
• Abuse potential studies typically include a large number of pairwise contrasts:
  • Large number of endpoints and multiple arms
  • Comparator serves 2 purposes: study validity + comparison with the investigational product (IP)
  • Potential multiplicity issue
• Current analysis methods have some features of confirmatory and exploratory approaches
• To determine appropriate analysis methods, need to determine: are these studies exploratory or confirmatory?
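To give a feel for the multiplicity issue, the sketch below counts all pairwise contrasts for a given number of arms and endpoints and computes the chance of at least one false positive if every contrast were tested at the same unadjusted alpha. The independence assumption is unrealistic (endpoints in these studies are correlated), so the result is only an illustration of scale, and the arm and endpoint counts are hypothetical.

```python
from math import comb


def n_pairwise_contrasts(n_arms, n_endpoints):
    """All pairwise arm comparisons, repeated for every endpoint."""
    return comb(n_arms, 2) * n_endpoints


def familywise_error(n_tests, alpha=0.05):
    """P(at least one false positive), ASSUMING independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Hypothetical study: 5 arms (placebo, 2 comparator doses, 2 IP doses), 8 endpoints
m = n_pairwise_contrasts(5, 8)
print(m, "contrasts; familywise error about", round(familywise_error(m), 3))
```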
Data Analysis: Pairwise Comparisons
• More consistent with Exploratory Approach:
  • All doses of positive control and investigational product (IP) vs. placebo and each other
  • Conservative, but large number of contrasts and conclusions of relative abuse potential are not straightforward
• More consistent with a Confirmatory Approach:
  • One-sided sequential approach with all doses of IP and positive control
    • Involves a large number of contrasts, but performed sequentially
    • Like “All Contrasts” approach, doesn’t clearly define relative abuse potential
  • Sequential True Emax approach:
    • Compare Emax of all doses combined: Comparator vs. placebo (validity), IP vs. placebo (any abuse potential?) and Comparator vs. IP (relative abuse potential)
    • Conceptually logical based on requirement to control substances (not doses) and individual abuser preferences for different doses
    • Simplifies analysis (fewer contrasts, multiplicity issues) and interpretation
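One possible reading of the Sequential True Emax approach is sketched below: for each subject, take the highest Emax across all doses of a substance, then run the three contrasts in order. Several details are assumptions rather than the presenter's method: the use of a paired t statistic, the caller-supplied one-sided critical value `t_crit`, stopping only when the validity contrast fails, and all of the data.

```python
from statistics import mean, stdev


def pooled_emax(emax_by_dose):
    """Per-subject maximum Emax across all doses of one substance."""
    return [max(subject) for subject in zip(*emax_by_dose.values())]


def paired_t(a, b):
    """Paired t statistic for the mean of (a - b)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / len(diffs) ** 0.5)


def sequential_contrasts(comparator, ip, placebo, t_crit):
    comp, inv = pooled_emax(comparator), pooled_emax(ip)
    steps = [
        ("comparator vs placebo (validity)", comp, placebo),
        ("IP vs placebo (any abuse potential?)", inv, placebo),
        ("comparator vs IP (relative abuse potential)", comp, inv),
    ]
    results = []
    for idx, (label, a, b) in enumerate(steps):
        t = paired_t(a, b)
        results.append((label, t, t > t_crit))
        if idx == 0 and t <= t_crit:
            break   # validity not shown; later contrasts are not interpreted
    return results


# Hypothetical Drug Liking Emax values, four subjects, lists ordered by subject
comparator = {"low": [70, 80, 65, 75], "high": [85, 78, 90, 88]}
ip = {"low": [52, 55, 50, 58], "high": [60, 54, 57, 62]}
placebo = [50, 52, 51, 53]

# 2.353 is the one-sided 5% critical value of the t distribution with 3 df
for label, t, significant in sequential_contrasts(comparator, ip, placebo, 2.353):
    print(f"{label}: t = {t:.2f}, significant = {significant}")
```

Pooling across doses reduces the analysis to three contrasts per endpoint, which is the simplification the slide points to.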
Power and sample size
• Power calculation issue: lack of data for IP, large number and different types of contrasts (superiority for comparator, inferiority of IP to comparator, ‘equivalence’ to placebo)
• Typical sample size: approx. 30 subjects and getting higher (up to 49 subjects)
  • Risk of false negative is low → risk of false positive?
• Statistical difference vs. clinical relevance
  • With larger sample sizes, more important to understand clinical relevance
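The link between sample size and the smallest detectable difference can be illustrated with the standard normal-approximation formula for a one-sided paired comparison, n = ((z_(1-alpha) + z_(1-beta)) * SD / delta)^2. This is only a sketch: the SD of within-subject differences (20 points) and the target difference (10 points) are hypothetical, and an exact calculation would use the t distribution and give a slightly larger n.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_sum(alpha, power):
    """z_(1-alpha) + z_(power) for a one-sided test."""
    nd = NormalDist()
    return nd.inv_cdf(1.0 - alpha) + nd.inv_cdf(power)


def paired_sample_size(delta, sd_diff, alpha=0.05, power=0.90):
    """Subjects needed to detect a mean paired difference `delta`."""
    return ceil((z_sum(alpha, power) * sd_diff / delta) ** 2)


def detectable_difference(n, sd_diff, alpha=0.05, power=0.90):
    """Smallest mean paired difference detectable with n subjects."""
    return z_sum(alpha, power) * sd_diff / sqrt(n)


# Hypothetical inputs: 10-point difference on a 100-point VAS, SD of differences 20
print("n needed:", paired_sample_size(delta=10, sd_diff=20))
for n in (30, 49):
    print(n, "subjects detect about", round(detectable_difference(n, 20), 1), "points")
```

Under these assumed inputs, moving from 30 to 49 subjects lowers the detectable difference from roughly 10.7 to roughly 8.4 points, so a statistically significant contrast in a larger study may reflect a smaller, and possibly less clinically relevant, difference.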
Summary and Discussion Points
• Traditional Exploratory Design?
  • No primary and secondary endpoints; all contrasts are performed and all data are considered in overall conclusions
  • Consistent with original design of studies and with abuse liability as a complex “safety” issue
  • However, conclusions of abuse potential are subject to varied interpretation
• From endpoint selection and analysis perspective, current studies are a “mixed” exploratory/confirmatory approach with some primary and multiple secondary endpoints
  • Usually all data (primary and secondary) are still considered in conclusions
• Confirmatory-Type Design?
  • Select 1 primary endpoint: Overall Drug Liking → Subject decides overall abuse potential
  • Secondary endpoints and data can be used for discussion points
  • Simplified interpretation within and across studies (regulatory purpose of studies)
  • May oversimplify a complex issue; however, other data (preclinical, AEs, etc.) are still used in Overall Conclusions of abuse liability