Using QResearch for development & validation of risk prediction tools Prof Julia Hippisley-Cox, University of Nottingham, 5th Sept 2013
My roles & interests • Professor of Clinical Epidemiology & General Practice, University of Nottingham • NHS General Practitioner • Member, Confidentiality Advisory Committee (s251) • Director, QResearch & QSurveillance (EMIS/Notts) • Director, ClinRisk Ltd (medical software) • Member, EMIS National User Group • UoN is also a licence holder for the THIN, CPRD, HES and ONS datasets
Acknowledgements • Co-authors Drs Carol Coupland, Peter Brindle, John Robson • QResearch database • University of Nottingham • EMIS & contributing practices & user group • ClinRisk Ltd (software) • Oxford University (independent validation, Prof Altman’s team)
Outline • QResearch database + linked data • General approach to risk prediction • QRISK2 • QIntervention • Any questions
QResearch Database • One of the world's largest and richest research databases • Over 700 general practices across the UK, 14 million patients • Joint not-for-profit venture between EMIS (largest GP system supplier, > 55% of practices) and University of Nottingham • Patient-level pseudonymised database for research • Available for peer-reviewed academic research where outputs are made publicly available • Data from 1989 to present day.
Information on QResearch – GP derived data • Demographic data – age, sex, ethnicity, SHA, deprivation • Diagnoses • Clinical values – blood pressure, body mass index • Laboratory tests – FBC, U&E, LFTs etc • Prescribed medication – drug, dose, duration, frequency, route • Referrals • Consultations
QResearch Data Linkage Project • QResearch database already linked to deprivation data (2002) and cause of death data (2007) • Very useful for research: better definition & capture of outcomes; health inequality analysis; improved performance of QRISK2 and similar scores • Developed new open source technique for data linkage using pseudonymised data
www.openpseudonymiser.org • Scrambles NHS number BEFORE extraction from clinical system • Takes NHS number + project-specific encrypted ‘salt code’ • One-way hashing algorithm (SHA-256, from the SHA-2 family) • Cannot be reverse engineered • Applied twice in two separate locations before data leaves source • Apply identical software to external dataset • Allows two pseudonymised datasets to be linked • Open source – free for all to use
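The salted one-way hashing idea on this slide can be sketched in a few lines. This is a minimal illustration only, not the OpenPseudonymiser implementation: the function names, the plain-text salt, the sample NHS number and the record contents are all hypothetical, and the sketch does not model the encrypted salt or the "applied twice in two locations" step.

```python
import hashlib


def pseudonymise(nhs_number: str, salt: str) -> str:
    """Return a SHA-256 hex digest of an NHS number combined with a project salt.

    Hypothetical helper: spaces are stripped so that differently formatted
    copies of the same number produce the same digest.
    """
    cleaned = nhs_number.replace(" ", "")
    return hashlib.sha256((cleaned + salt).encode("utf-8")).hexdigest()


def link(dataset_a: dict, dataset_b: dict) -> dict:
    """Join two pseudonymised datasets on the digests they have in common."""
    shared = dataset_a.keys() & dataset_b.keys()
    return {digest: (dataset_a[digest], dataset_b[digest]) for digest in shared}


# Illustrative values only: the salt and the identifier are made up.
SALT = "hypothetical-project-salt"
gp_records = {pseudonymise("123 456 7890", SALT): {"systolic_bp": 140}}
death_records = {pseudonymise("1234567890", SALT): {"cause_of_death": "I21"}}

# Both sources hashed the same number with the same salt, so the rows match
# without either side ever exchanging the NHS number itself.
linked = link(gp_records, death_records)
```

Because the digest depends on the salt, a dataset hashed under a different project's salt would not link, which is the property that keeps projects separate.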
QPrediction Scores: a new family of risk prediction tools • Individual assessment • Who is most at risk of preventable disease? • What is the level of that risk and how does it compare? • Who is likely to benefit from interventions? • What is the balance of risks and benefits for my patient? • Enable informed consent and shared decisions • Population level • Risk stratification • Identification of rank-ordered list of patients for recall or reassurance • GP systems integration • Allows updates to the tool over time, and audit of impact on services and outcomes
Criteria for choosing clinical outcomes • Major cause of morbidity & mortality • Represents real clinical need • Related intervention which can be targeted • Related to national priorities (ideally) • Necessary data in clinical record • Helps inform decisions at the point of care • Can be implemented into everyday clinical practice
Change in research question • Leads to • Novel application of existing methods • Development of new methods • Better utilisation of different data sources • Leads to • Lively academic debate! • Changes in policy and guidance • New utilities to implement research findings • (hopefully) Better patient care
Primary prevention CVD (slide from NICE website) • Offer information about: • absolute risk of vascular disease • absolute benefits/harms of an intervention • Information should: • present individualised risk/benefit scenarios • present absolute risk of events numerically • use appropriate diagrams and text
Challenge: to develop a new CVD risk score for use in UK • New cardiovascular disease risk score • Calibrated to UK population • Use routinely collected GP data • Include additional known risk factors (eg family history, deprivation) • Better calibration and discrimination than US derived Framingham score
Why a new CVD risk score? • Framingham has many strengths but some limitations: • Small cohort (5,000 patients) from one American town • Almost entirely white • Developed during peak incidence of CVD in the US • Doesn’t include certain risk factors (body mass index, family history, blood pressure treatment, deprivation) • Over-predicts CVD risk by up to 50% in European populations • Underestimates risk in patients from deprived areas
Derivation of QRISK2 Score • Derivation cohort • 355 practices; 1,591,209 patients; 96,709 events • Traditional risk factors • Additional risk factors: • ethnic group • type 2 diabetes, treated hypertension, rheumatoid arthritis, renal disease, atrial fibrillation • Interactions with age
J Hippisley-Cox, C Coupland, et al. Predicting cardiovascular risk in England and Wales: prospective derivation and validation of QRISK2. BMJ 2008; 336: 1475-1482
Model Derivation • Separate models in males and females • Cox regression analysis • Fractional polynomials to model non-linear risk relationships • Multiple imputation of missing values
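As a rough illustration of the fractional polynomial bullet: a continuous predictor such as age is transformed using powers drawn from a small conventional set, with power 0 read as the natural log. The sketch below assumes that conventional set and a strictly positive predictor; the function names are hypothetical and the powers in the usage line are not those selected for QRISK2.

```python
import math

# Conventional candidate powers for fractional polynomials; 0 denotes log(x).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x: float, power: float) -> float:
    """Return one fractional polynomial term for a positive predictor value."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    return math.log(x) if power == 0 else x ** power


def fp2_terms(x: float, p1: float, p2: float) -> tuple:
    """Return the two terms of a second-degree fractional polynomial.

    When the two powers are equal, the second term is the first multiplied
    by log(x), which is the usual convention for repeated powers.
    """
    first = fp_term(x, p1)
    second = first * math.log(x) if p1 == p2 else fp_term(x, p2)
    return first, second


# Hypothetical usage: age rescaled to decades, with illustrative powers.
age_terms = fp2_terms(55 / 10, -2, 2)
```

In a fitted model each term would get its own Cox regression coefficient, which lets risk rise or fall non-linearly with the predictor.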
Validation • Separate sample of 176 QResearch practices; 750,232 patients; 43,396 events • Validation statistics (for survival data) • D statistic [1] (discrimination) • R squared (% variation explained) • Predicted vs. observed CVD events • Clinical impact in terms of reclassification of patients into high/low risk
[1] Royston and Sauerbrei. A new measure of prognostic separation in survival data. Stat Med 2004; 23: 723-748.
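The "predicted vs. observed" comparison is commonly presented by grouping patients into tenths of predicted risk and comparing the mean predicted risk with the observed event proportion in each group. The sketch below is a simplified illustration under one loud assumption: every patient has a known yes/no outcome at 10 years, so it ignores censoring, which a survival analysis would need to handle (for example with Kaplan-Meier estimates). The function name and the data are hypothetical.

```python
def calibration_by_group(predicted, observed, groups=10):
    """Return (group number, mean predicted risk, observed event proportion).

    predicted: risks between 0 and 1; observed: 0/1 outcomes.
    Patients are ranked by predicted risk and split into equal-sized groups.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    rows = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # fewer patients than groups
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(o for _, o in chunk) / len(chunk)
        rows.append((g + 1, mean_predicted, observed_rate))
    return rows


# Made-up example: four patients split into two groups.
rows = calibration_by_group([0.05, 0.35, 0.15, 0.25], [0, 1, 0, 0], groups=2)
```

A well-calibrated score shows mean predicted and observed values close to each other in every group, not just on average.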
Calculation of risk scores • Risk scores calculated in validation dataset • Risk score calculation: • Used coefficients for risk factors obtained from Cox model using multiple imputed data • Combined these with patient characteristics in validation data to give prognostic index • Combined with baseline survival function estimated at 10 years to give estimated risk of CVD at 10 years for each person
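The calculation described above follows the standard Cox model form: risk at 10 years = 1 - S0(10) ^ exp(prognostic index), where S0(10) is the baseline survival at 10 years. The sketch below shows that arithmetic only. The coefficients, means and baseline survival are invented for illustration and are not QRISK2 values, and centring each risk factor on its mean is an assumption about how the baseline is defined.

```python
import math


def ten_year_risk(coefficients, values, means, baseline_survival_10y):
    """Estimate 10-year risk from a Cox model's coefficients and baseline survival.

    The prognostic index is the sum of coefficient * (value - mean) over all
    risk factors; risk = 1 - S0(10) ** exp(prognostic index).
    """
    prognostic_index = sum(
        coefficients[name] * (values[name] - means[name]) for name in coefficients
    )
    return 1.0 - baseline_survival_10y ** math.exp(prognostic_index)


# Hypothetical numbers, NOT taken from QRISK2.
coefficients = {"systolic_bp": 0.01, "smoker": 0.5}
means = {"systolic_bp": 130.0, "smoker": 0.2}
baseline = 0.95  # assumed baseline survival at 10 years

average_patient = ten_year_risk(coefficients, means, means, baseline)
smoker_high_bp = ten_year_risk(
    coefficients, {"systolic_bp": 160.0, "smoker": 1.0}, means, baseline
)
```

A patient whose values all equal the means has a prognostic index of zero, so their risk is simply 1 minus the baseline survival.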
Validation statistics • Source: Hippisley-Cox J et al. BMJ 2008;336:1475-1482
External validation using THIN database • Additional validation carried out using the THIN database • Based on practices in UK using Vision system • One validation carried out by QRISK authors • Hippisley-Cox J et al. The performance of the QRISK cardiovascular risk prediction algorithm in an independent UK sample of patients from general practice: a validation study. Heart 2007:hrt.2007.134890. • An independent validation carried out by a separate group • Collins GS, Altman DG. An independent and external validation of QRISK2 cardiovascular disease risk score: a prospective open cohort study. BMJ 2010;340:c2442
External validation using THIN database Collins GS, Altman DG. An independent and external validation of QRISK2 cardiovascular disease risk score: a prospective open cohort study. BMJ 2010;340:c2442
Annual updates to QRISK2 • Reasoning: • Changes in population characteristics, e.g. incidence of cardiovascular disease is falling; obesity is rising; smoking rates are falling • Improvements in data quality: recording of predictors and clinical outcomes becomes more complete over time (e.g. ethnic group now recorded for 50%) • Inclusion of new risk factors • Changes in requirements for how the risk prediction scores can be used, e.g. changes in age ranges.
Risks and Benefits of Statins • Two recent papers: • Unintended effects of statins (Hippisley-Cox & Coupland, BMJ, 2010) • Individualising Risks & Benefits of Statins (Hippisley-Cox & Coupland, Heart, 2010) • Conclusions: • New tools to quantify likely benefit from statins • New tools to identify patients who might get rare adverse effects (e.g. myopathy) for closer monitoring