Translational Research IT (TraIT) “TraIT and OpenClinica: partners in translational research” Marinel Cavelaars, Cuneyt Parlayan, Jacob Rousseau, Sander de Ridder, Jan Willem Boiten and Jeroen Beliën Boston; June 21st 2013
Overview • Introduction and background • CTMM • Translational Research • TraIT • Three real-life examples: OpenClinica, BMIA, tranSMART • OpenClinica.com – TraIT partnership • CTMM-TRACER and OpenClinica by Sander de Ridder • Scripts, Long Lists, Tools developed • Things we learned/found useful
Who am I? • My name: Jeroen Beliën, PhD, MSc • Associate Professor, medical informatics, dept. of Pathology, VU University medical center, Amsterdam • Digital Pathology, Image processing, IT in translational research • String of Pearls • IT-lead 2 CTMM projects: DeCoDe and TRACER • CTO CTMM-TraIT • BioMedBridges • Member of taskforce Stichting Palga • Palga: Dutch National Electronic Pathology Archive • Faculty member of NBIC • E-mail: jam.belien@vumc.nl
CTMM, TIPharma and BMM offer an integrated approach for innovations in the Dutch health care sector
• TIPharma: drugs
  • Translational research on novel pharmaceutical therapies
  • Target finding, animal models and lead selection
  • Drug formulation, delivery and targeting
  • Special Theme focusing on the efficiency of the process of drug development
• CTMM: diagnosis
  • Early detection of disease by in-vitro and in-vivo diagnostics
  • Stratification of patients for personalized treatment
  • Assessing efficiency and efficacy of medicines by imaging
  • Image guided delivery of medication
  • Focus on cancer, cardiovascular, neurodegenerative and infectious/autoimmune disease
• BMM: devices
  • Smart drug delivery systems
  • Innovations in contemporary organ replacement therapies
  • Passive and active scaffolds, including cell signalling functions
• Diagram labels: Biomarkers, Image guided drug delivery, Imaging for regenerative medicine, Drug delivery
Public-private partnerships: financial model • Subsidy: 50% of research cost • Government: € 150 mln subsidy (50%) • Industry and academia: € 37.5 mln cash, € 37.5 mln in kind and € 75 mln in kind • CTMM projects: € 300 mln in total
CTMM projects: Stroke, Heart Failure, Arrhythmia, Diabetes, Kidney Failure, Breast, Lung, Thrombosis, Peripheral Vascular Disease, Prostate, Colon, Leukemia, Alzheimer, Rheumatoid Arthritis, Sepsis
Translational research process Guiding principle: connecting phenotype to biology
[Diagram] Patient enters medical center → Clinical Procedures, Electronic Health Record, Imaging, Samples, Experiments → Clinical database, Image database, Biobank database, Experimental data → Data Integration (with External data) → Downstream analysis → Scientific Output, Intellectual Property, Improved Healthcare
TraIT consortium • Started Oct. 2011 • Status 2013: 26 partners • Growing TraIT project team
The TraIT approach • IT infrastructure = main goal • No research on the side • Workflow-oriented approach • Create data pipelines to link data production and data analysis • User driven priority setting • Regular reprioritization possible (agile) • Avoid reinventing wheels • Adopt/adapt existing technology and expertise • Connect with other initiatives • Organizations (NBIC, EBI, PSI, IMI, etc.) • Think big; start small; act now • Short term focus on immediate needs CTMM projects
Division in work packages
TraIT has been subdivided into four work packages (WPs) supporting data generating domains, and two work packages dealing with the overarching TraIT requirements: data integration and professional support respectively.
Five data generating work packages:
• WP 1 Clinical Data
• WP 3 Biobanking Data
• WP 4 Experimental Data
• Imaging Data: WP 7 Pathology Imaging and WP 2 Clinical Imaging Data
Overarching work packages:
• WP 5 Core Infrastructure: data integration & analysis across the four platforms
• WP 6 Deployment: shared service center for hardware, training & support
High-level TraIT data flows [diagram] • Hospital (IT): HIS, PACS, LIS, LIMS, … • Pseudonymization between Hospital (IT) and Translational Research (IT) • Data domains: clinical data (OpenClinica), imaging data and annotations (NBIA), biobanking (CBM-NL), experimental data (various solutions) • Other sources: Research Data, Public Data • Integrated data: e.g. tranSMART/i2b2, cohort explorer • Translational analytics workbench: e.g. R, e.g. Galaxy, …
TraIT Pseudonymization [diagram] • Hospital (IT): HIS, PACS, LIS, LIMS, … • TTP between Hospital (IT) and Translational Research (IT), and for Research Data • Data domains: clinical data, imaging data (NBIA + AIM), biobanking (e.g. CBM catalog, e.g. caTissue), experimental data (e.g. PhenotypeDB, Annai Systems; e.g. Galaxy, Chipster) • Public Data: e.g. GEO, EMBL-EBI • Integrated data: tranSMART/cohort explorer • Translational analytics workbench: R, Galaxy
TraIT - study driven approach (timeline: 2013, 2014, ···) • Task 1: study selection (Study 1, Study 2, Study …) • Task 2: use cases & prototypes (UC 1, UC 2, UC …) • Task 3, 4, 5: development of data integration platform, analytics workbench and shared components • Diagram components: pseudonymization, ETL, integrated translational data warehouse, Analytics, AAA
Three real-life examples • Example 1: CTMM INCOAG (clinical data, OpenClinica) • Example 2: CTMM AIRFORCE (imaging, PACS and NBIA) • Example 3: CTMM PCMM (integrated data, e.g. tranSMART)
Real-life example 1 - CTMM Incoag • Discover new risk factors for thrombotic diseases • Approach: Combine existing clinical studies into one OpenClinica data set for higher statistical power • OpenClinica: • Clinical data capture • Web-based • Open-source • Full audit-trail • 10,000+ installations • TraIT tool of choice
Incoag - Technical integration • Out-of-the-box OpenClinica can be applied in most projects: currently used in CTMM projects AirForce, Cohfar, DeCoDe, Parisk, PCMM, and Tracer • Specific Incoag question: how to combine 5+ independent existing studies from mixed sources into one OpenClinica installation? • Goal: sustainable storage in the TraIT environment
Incoag - Technical integration • Solution: the TraIT team created a batch upload toolbox for OpenClinica • Will be submitted to the OpenClinica open-source community • Result: Study 1, Study 2 and Study 3 in sustainable storage in the TraIT environment
Incoag - Semantic integration • Second question from Incoag project: how to identify common fields and data items? • How to determine the overlap between Study 1 to Study 5?
Incoag - Semantic integration • Second question from Incoag project: how to identify common fields and data items? • 100-150 fields in each study (Study 1 to Study 5) • How to determine the overlap? Common ground? • Studies speak different “languages”: a biomedical “Esperanto” is needed • More than 100^5 combinations to consider!
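The size of the problem can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a "combination" means picking one candidate field from each of the five studies; that reading is an interpretation of the slide, and the helper name `tuple_count` is made up for illustration.

```python
# Count candidate cross-study field tuples, assuming one field is picked
# from each of 5 studies (an interpretation of the slide, not stated in it).
STUDIES = 5


def tuple_count(fields_per_study: int, studies: int = STUDIES) -> int:
    """Number of ways to pick exactly one field from each study."""
    return fields_per_study ** studies


lower = tuple_count(100)  # every study has 100 fields
upper = tuple_count(150)  # every study has 150 fields
print(f"{lower:,} to {upper:,} combinations")
```

Even the lower bound is far too large to review by hand, which is why the next slides turn to dictionaries and automatic mapping.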
Incoag - Semantic integration • Project 1: Provide tools to standardize studies at data registration (as far as possible): TraIT building blocks to rapidly build CRFs for new studies (Study n) based on a common dictionary • Project 2: First test with tools for automatic “after-the-fact” harmonization of historical data: automatic mapping against multiple dictionaries (SNOMED-CT, LOINC, NCI thesaurus & Gene Ontology) • Result: Study 1 to Study 5 combined into a harmonized Incoag dataset
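To give an idea of what "automatic mapping against a dictionary" can look like, here is a deliberately simple sketch. It uses plain string similarity from Python's standard `difflib` module, which is an assumption made for illustration and not the method used by the TraIT tools. The dictionary entries and the `TOY:` codes are placeholders, not real SNOMED-CT or LOINC content.

```python
import difflib

# Toy dictionary: preferred term -> placeholder code.
# These are NOT real SNOMED-CT, LOINC or NCI thesaurus codes.
DICTIONARY = {
    "body weight": "TOY:0001",
    "body height": "TOY:0002",
    "systolic blood pressure": "TOY:0003",
    "date of birth": "TOY:0004",
}


def normalize(label: str) -> str:
    """Lower-case a field label and turn underscores into single spaces."""
    return " ".join(label.lower().replace("_", " ").split())


def map_field(label, dictionary=DICTIONARY, cutoff=0.6):
    """Return (term, code) for the closest dictionary term, or None."""
    candidates = difflib.get_close_matches(
        normalize(label), list(dictionary), n=1, cutoff=cutoff
    )
    if not candidates:
        return None  # leave the field for manual review
    term = candidates[0]
    return term, dictionary[term]


# Fields from two hypothetical studies that end up on the same term
# can be treated as candidates for a common data item.
print(map_field("Body_Weight"))
print(map_field("xyz"))
```

Fields that return `None` would still need manual curation, so a sketch like this narrows the search rather than replacing expert review.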
Real-life example 2 – CTMM AirForce • Personalized chemo-radiation of lung and head & neck cancer • Lung cancer patients with PET-CT (and clinical data & tissue) • VUMC, MUMC+, NKI, UMCG + 35 patients from Policlinico Gemelli in Rome (via MUMC+) • Transfer of images from Rome using TraIT’s BioMedical Image Archive (www.bmia.nl)
WP2 High level design – Upload (Implemented) • Image storage & simple web-shop like image viewing (based on NBIA) • Image pseudonymization pipeline (based on CTP from the RSNA)
AirForce - de-identification of images • Install TraIT de-identification client in Rome • Adopt: Clinical Trial Processor (RSNA, open source, Java) • Configure DICOM de-identification • Remove identifying DICOM tags • Replace Codice Sanitario (PatientID) with AirForce ID • Keep important tags (e.g. some tags are crucial for downstream analysis of PET) • Result: A pipeline to TraIT’s BMIA from the local Rome Image Archive [Figure: DICOM tags and DICOM image]
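The three de-identification rules above (remove, replace, keep) can be sketched on a DICOM header represented as a plain Python dictionary. This is not the Clinical Trial Processor configuration, which is not shown in the slides. The set of removed attributes is an illustrative subset, `PatientWeight` is assumed here as an example of a tag worth keeping for PET analysis, and the ID values are invented.

```python
# Identifying attributes to drop (illustrative subset of DICOM keywords).
REMOVE = {
    "PatientName",
    "PatientBirthDate",
    "PatientAddress",
    "ReferringPhysicianName",
    "InstitutionName",
}


def deidentify(header: dict, id_map: dict) -> dict:
    """Return a de-identified copy of a DICOM header.

    header: attribute keyword -> value
    id_map: local patient ID -> study ID (hypothetical lookup table)
    """
    local_id = header["PatientID"]
    if local_id not in id_map:
        # Refuse to pass on an image that cannot be pseudonymized.
        raise KeyError(f"no study ID registered for patient {local_id!r}")
    # Everything not listed in REMOVE is kept, e.g. PatientWeight.
    cleaned = {key: value for key, value in header.items() if key not in REMOVE}
    cleaned["PatientID"] = id_map[local_id]
    return cleaned


example = {
    "PatientName": "ROSSI^MARIO",      # invented value
    "PatientID": "LOCAL-12345",        # invented local ID
    "PatientBirthDate": "19500101",
    "PatientWeight": 72.0,
    "Modality": "PT",
}
print(deidentify(example, {"LOCAL-12345": "AIRFORCE-001"}))
```

A sketch like this only touches header attributes. Names burnt into the pixel data are not affected, which is why the manual QC step on the next slide is still needed.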
AirForce - QC of de-identification • Perform QC step by collection administrator before images are visible in BMIA to prevent privacy breach (esp. burnt-in names).
AirForce - Resulting image archive in BMIA • Collection AirForce on www.bmia.nl with 35 patients from Rome • Web shop model where you can fill a basket with patients for download
Real-life example 3 – CTMM PCMM • Develop and validate biomarkers for diagnosis of prostate cancer • Requires correlation of phenotype data to biomarker data • Potential solution: tranSMART; to be validated with real-life data from CTMM projects like PCMM • Can we address the generic translational question with the tranSMART solution?
PCMM – tranSMART as a candidate solution • tranSMART: • Developed in J&J • Made open-source • “Data workbench” for translational researchers • Searching across studies • Data exploration
PCMM - Import of prostate data • Prostate data: Gleason score, PSA values, etc. • Reference to public data sources available • Usually gene expression data will be loaded as well; not yet done for PCMM
PCMM - QC of the data set Drag-and-drop data parameters to create simple distribution plots and statistical values
PCMM: tranSMART for correlation analysis Easy to create correlation plots between existing and potential predictors for prostate cancer
Second tranSMART developer/user meeting, June 17th-19th 2013, Amsterdam • Participating organizations: Recombinant / Deloitte, CDISC, Thomson Reuters, Pfizer, eTRIKS / Imperial College, CTMM-TraIT, Sanofi, Philips, Johnson & Johnson, University of Michigan, University of Luxembourg
OpenClinica.com – TraIT partnership: Statement of Work • TraIT: automate data capture in OC as much as possible • E.g. automate upload of Excel data and hospital lab data • Approach: OC’s Web Services • Requires improvements on OIDs and bug fixes • Support configurable role based authentication and authorization within OC • E.g. Central review of images for all subjects in the different sites. Each image is reviewed by three reviewers who are not allowed to see each other’s reports in the CRFs • Parameterized links in CRFs • E.g. Links to images or to other subjects, with a dynamic URL based on data in CRF
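The idea behind a parameterized link is that a URL template is filled in with values already entered in the CRF. The sketch below shows that substitution step only, outside OpenClinica. The template, the host `images.example.org` and the field names `subject_id` and `series_uid` are all hypothetical.

```python
from urllib.parse import quote


def build_link(template: str, crf_values: dict) -> str:
    """Fill a URL template with percent-encoded CRF values."""
    # Encode every value so spaces or slashes in CRF data cannot break the URL.
    safe = {name: quote(str(value), safe="") for name, value in crf_values.items()}
    return template.format(**safe)


# Hypothetical template pointing at an image viewer.
TEMPLATE = "https://images.example.org/view?subject={subject_id}&series={series_uid}"

print(build_link(TEMPLATE, {"subject_id": "AF 001", "series_uid": "1.2.3"}))
```

A missing CRF value raises a `KeyError` here, which is one way to avoid showing a half-built link to the user.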
Other wishes • Study migration • E.g. Users want to switch to different OC server • Currently only "ClinicalData" ODM is imported • Studies can be exported in full detail but cannot be imported as such • Support reference to ontologies in the CRF • Standardization of data • Easy view for data entry • E.g. tree structure that indicates where you are while entering data for easy navigation to other CRF for subject
TraIT OpenClinica usage: 47 studies, 77 sites, 256 users • [Chart with markers: Start DeCoDe OpenClinica, Start TraIT OpenClinica] • Pre TraIT effect: all multicenter VUmc studies • Also multicenter studies UMCU, UMCN, EMC, Meander MC • The load on TraIT OpenClinica increased significantly in 2012 • Considerable time and energy was spent on delivery management (availability, capacity and security) and on improvement of the TraIT OpenClinica user support
Who am I? • My name: Sander de Ridder • Computer Science (MSc) & Bioinformatics (MSc) • Inflammatory Disease Profiling, Dept. of Pathology, VU University medical center, Amsterdam • Bioinformatics for Inflammatory Disease Profiling Group • IT implementation CTMM TRACER s.deridder@vumc.nl
CTMM-TRACER Background information on TRACER • CTMM TRACER: Rheumatoid Arthritis • Prospective data • Retrospective data (To Do) • Go Live: • Wednesday the 5th of June • Started at 9:00 - Finished at 12:00 • Approximately 1 hour/study
Age Calculation After entering the DOB and the date of signing… The age is calculated Age calculation script: http://en.wikibooks.org/wiki/OpenClinica_User_Manual/AgeField Created by Sander de Ridder and improved by Gerben Rienk
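The actual script runs inside the CRF and is available at the URL above. The arithmetic it has to perform can be sketched independently of OpenClinica as follows; the function name and the choice to reject a signing date before the DOB are assumptions made for this illustration.

```python
from datetime import date


def age_in_years(dob: date, signed: date) -> int:
    """Completed years between date of birth and date of signing."""
    if signed < dob:
        raise ValueError("date of signing lies before date of birth")
    years = signed.year - dob.year
    # Subtract one year if the birthday has not yet occurred in the signing year.
    if (signed.month, signed.day) < (dob.month, dob.day):
        years -= 1
    return years


print(age_in_years(date(1980, 6, 21), date(2013, 6, 20)))
print(age_in_years(date(1980, 6, 21), date(2013, 6, 21)))
```

Comparing month and day as a pair avoids the off-by-one error that appears when only the years are subtracted.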
Long List Implementation • Problem: • Maximum of 4000 characters for single-select response options text • Some lists need more characters: e.g. medication list > 9000 characters • Solution: • Created external list • Add field to CRF which opens new page with list • Allows user to select option; selected value is copied back to CRF
Example: Medication User selects “Other” and then clicks on question 3)’s field A new tab/window opens with an HTML page with a single-select The user can select desired medication from the list Selected medication is copied to the CRF
Some tools we created: CRF validator • Compares items between CRFs based on uids and ensures they match • Example: CRF1 has ID: Patient_Weight; DATA_TYPE: INT and CRF2 has ID: Patient_Weight; DATA_TYPE: REAL, giving “Mismatch for Patient_Weight!” • Checks NULL-flavour coding integrity • Coding: -1=No Information, -2=Not Applicable, -3=Unknown, … • Example: CRF1 has RESPONSE_OPTIONS_TEXT: No Information with RESPONSE_VALUES_OR_CALCULATIONS: -2, giving “Incorrect NULL-flavour coding!” • Prevents errors and inconsistencies
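The two checks can be sketched as follows. This is not the TraIT validator itself. It assumes each CRF has already been read into a list of dictionaries keyed by column name, that items are matched on `ITEM_NAME`, and that response options and values are simple comma-separated lists without escaped commas.

```python
# Agreed NULL-flavour coding from the slide (first three codes only).
NULL_FLAVOURS = {"No Information": "-1", "Not Applicable": "-2", "Unknown": "-3"}


def type_mismatches(crf_a, crf_b):
    """Items present in both CRFs whose DATA_TYPE differs."""
    types_b = {item["ITEM_NAME"]: item["DATA_TYPE"] for item in crf_b}
    return [
        (item["ITEM_NAME"], item["DATA_TYPE"], types_b[item["ITEM_NAME"]])
        for item in crf_a
        if item["ITEM_NAME"] in types_b
        and types_b[item["ITEM_NAME"]] != item["DATA_TYPE"]
    ]


def null_flavour_errors(crf):
    """Response options whose NULL-flavour text carries the wrong code."""
    errors = []
    for item in crf:
        texts = [t.strip() for t in item.get("RESPONSE_OPTIONS_TEXT", "").split(",")]
        values = [v.strip() for v in item.get("RESPONSE_VALUES_OR_CALCULATIONS", "").split(",")]
        for text, value in zip(texts, values):
            expected = NULL_FLAVOURS.get(text)
            if expected is not None and value != expected:
                errors.append((item["ITEM_NAME"], text, value, expected))
    return errors


crf1 = [{
    "ITEM_NAME": "Patient_Weight",
    "DATA_TYPE": "INT",
    "RESPONSE_OPTIONS_TEXT": "No Information",
    "RESPONSE_VALUES_OR_CALCULATIONS": "-2",
}]
crf2 = [{"ITEM_NAME": "Patient_Weight", "DATA_TYPE": "REAL"}]

print(type_mismatches(crf1, crf2))
print(null_flavour_errors(crf1))
```

Running both checks before uploading a CRF catches the two example errors from the slide in one pass.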
Some tools we created: ID-Translator • Moving a rules file to a new OC server requires replacing all item IDs • Automatic translation of item identifiers in rules • Prevented replace errors and saved many hours of work • Requires: • ViewCRFVersion file • Contains item ID information for CRF on new server • Rule file with properly specified header • Contains item ID information for CRF on old server
ID-Translator: how it works • Inputs: ViewCRFVersion (new server) and rules for old server, each linking ITEM_NAME to OC_ID • Step 1: Parse ViewCRFVersion to get the mapping ITEM_NAME – new OC_ID, e.g. MedicatieBijgewerkt = I_TRACE_MEDICATIEBIJGEWERKT_4714 • Step 2: Parse header of rule file to get the mapping ITEM_NAME – old OC_ID, e.g. MedicatieBijgewerkt = I_TRACE_PATIENTSTUDIE_MOMENT_AFROND • Step 3: Translate rule file from old OC_ID to new OC_ID via ITEM_NAME, e.g. I_TRACE_PATIENTSTUDIE_MOMENT_AFROND = I_TRACE_MEDICATIEBIJGEWERKT_4714 • Output: translated rules for new server
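The translation step can be sketched as below. This is an illustration rather than the ID-Translator's own code. Parsing of the ViewCRFVersion file and the rule file header is left out because their formats are not shown here, so the two ITEM_NAME to OC_ID mappings are passed in as ready-made dictionaries. The rule expression in the example is invented.

```python
import re


def translate_rules(rule_text: str, old_ids: dict, new_ids: dict) -> str:
    """Replace old OC_IDs in rule_text by new ones, joined via ITEM_NAME.

    old_ids: ITEM_NAME -> OC_ID on the old server (from the rule file header)
    new_ids: ITEM_NAME -> OC_ID on the new server (from ViewCRFVersion)
    """
    old_to_new = {
        old_ids[name]: new_ids[name] for name in old_ids if name in new_ids
    }
    if not old_to_new:
        return rule_text
    # Match whole identifiers only, so an ID that is a prefix of a longer
    # identifier is never replaced halfway.
    alternatives = "|".join(
        re.escape(old) for old in sorted(old_to_new, key=len, reverse=True)
    )
    pattern = re.compile(rf"(?<![A-Za-z0-9_])(?:{alternatives})(?![A-Za-z0-9_])")
    # A single pass avoids re-translating an ID that was just inserted.
    return pattern.sub(lambda match: old_to_new[match.group(0)], rule_text)


old = {"MedicatieBijgewerkt": "I_TRACE_PATIENTSTUDIE_MOMENT_AFROND"}
new = {"MedicatieBijgewerkt": "I_TRACE_MEDICATIEBIJGEWERKT_4714"}
print(translate_rules("I_TRACE_PATIENTSTUDIE_MOMENT_AFROND eq 1", old, new))
```

Doing the replacement in one regular-expression pass, rather than with repeated search-and-replace, is what guards against the replace errors mentioned on the previous slide.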
Things we learned/found useful • ITEM_NAME max 64 characters • SPSS compatibility • Truly unique identifiers (description label) • Easy to link to study definition (CTMMC) • Useful for consistency checking • Negative NULL-flavour coding • Prevent conflict with retrospective data • Easy to keep NULL-flavour coding consistent • Specify identifiers in header of rule file • Automatic translation • JavaScript code • $.noConflict(); • Prevents our code from interfering with OC’s code • Reference to jquery • <script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"> • Prevents dependency on OC’s jQuery version • Create a checklist and follow it during go-live
Goal: make researchers want to use OpenClinica and tranSMART
Acknowledgements And many more…