Open Source Software for BioBanks Vincent Ferretti, Philippe Laflamme
What’s OBiBa?
• OBiBa is a P3G core project
  • Key working units of P3G, independently funded
  • IWG2 led by Samuli Ripatti
• The project objectives are to
  • Create a resource for biobank open source software
  • Ultimately provide a free, integrated, secure and high-performing open source information management system for biobanks
    • Subject recruitment
    • Clinical evaluation
    • Data warehouse
    • Genotyping
    • Sample Management
    • Others
Two ongoing projects (www.obiba.org)
GenoByte
• A storage system for high-throughput genotyping
• Java API able to store and analyze billions of genotypes very efficiently
• Developed at the McGill University and Genome Quebec Innovation Center, Montréal, Québec
• Source code, documentation and a demo application are available on the website.
CPAC: Canadian Partnership Against Cancer
A new Canadian network of 5 prospective cohorts on cancer
• 300,000 participants
• Recruitment starts in 2009
[Map: cohort sizes of 30,000, 50,000, 30,000, 30,000 and 150,000 participants]
CPAC: Canadian Partnership Against Cancer
5 harmonized cohorts
• Cohorts will share a core set of
  • Questions or variables on risk factors, health outcomes, …
  • Physical measurements (blood pressure, bone density, etc.)
• Cohorts will use similar or identical SOPs for physical measurements and blood samples
Harmonized but not identical
• Each cohort will extend this CPAC common core of data with its own specific questions, physical measurements and tests.
CPAC harmonization task force led by Isabel Fortier
• Specialize the generic DataShaper for cancer
A common IT infrastructure
• CPAC cohorts decided to join their efforts to build a common IT solution.
• A three-level infrastructure
[Diagram, top to bottom: Data networking: CPAC Federated Database (300,000 participants). Data warehousing: three Data Coordinating Centers (30,000, 30,000 and 150,000 participants). Data collection: Local Assessment Centers of the Ontario, Atlantic and BC cohorts.]
The OBiBa Consortium
• CPAC decided to fund OBiBa to build its IT infrastructure
• Creation of the OBiBa Consortium
• All software produced will be open source and made freely available to the biobanking community via P3G and the OBiBa web site.
• OBiBa developer team: 8 people in Montreal, at the Cartagene office
Development Phases
• Phase I
  • Data and sample collection software at the assessment centers
    • Participant interview management
    • Consent form
    • Questionnaires (self-administered on a touch screen, and assisted)
    • Physical measurements
    • Cognitive test
    • Sample collection and shipping
• Phase II
  • Data Coordinating Center
    • The cohort database application (data import/export, variable catalogue, reporting tools, etc.)
• Phase III (together with the CPAC Harmonization Task Force)
  • The CPAC federated databases system
Phase I: Assessment center
• Interviews of invited participants are conducted at assessment centers
• Interviews consist of a sequence of data capture stages, e.g. consent signature, health questionnaire, sample collection, etc.
Phase I: Assessment Centers Workflow Overview
[Diagram labels. Assessment Center: Welcome / Reception, Consent signature, Health questionnaire, Physical measurements, Sample collection, Conclusion, Onyx Application Server (intranet). Outside the assessment center: Enrolment & Scheduling Units, Bio-repository, Data Coordinating Center (Data Warehouse, Quality control reports, Study monitoring).]
ONYX: A modular and flexible architecture
[Diagram labels: Appointment List, Participant Reception, Stage 1, Stage 2, Stage 3, Stage n, Data Export]
• Onyx is a web application hosted on the assessment center server; it stores all the data collected by the stages
• Onyx is modular, i.e. it provides the backbone into which the independent data collection software used by the stages can be easily plugged.
• Biobanks can configure Onyx for their own specific stages and software
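The "backbone plus pluggable stages" idea can be sketched as a small registry. This is only an illustration under assumed names: `Backbone`, `register`, `run`, `export` and the two example stages are hypothetical and are not the Onyx API, and the collected values are made up.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: a stage takes a participant ID and returns the data it collected.
StageHandler = Callable[[str], dict]


@dataclass
class Backbone:
    """Minimal stand-in for the backbone: registers stages and stores their data."""
    stages: Dict[str, StageHandler] = field(default_factory=dict)
    order: List[str] = field(default_factory=list)
    store: Dict[str, Dict[str, dict]] = field(default_factory=dict)

    def register(self, name: str, handler: StageHandler) -> None:
        # Plugging in a stage only requires a name and a handler.
        if name in self.stages:
            raise ValueError(f"stage already registered: {name}")
        self.stages[name] = handler
        self.order.append(name)

    def run(self, name: str, participant_id: str) -> None:
        # The backbone, not the stage, is responsible for storing the data.
        data = self.stages[name](participant_id)
        self.store.setdefault(participant_id, {})[name] = data

    def export(self, participant_id: str) -> Dict[str, dict]:
        return dict(self.store.get(participant_id, {}))


def consent_stage(participant_id: str) -> dict:
    return {"signed": True}  # illustrative value


def height_stage(participant_id: str) -> dict:
    return {"height_cm": 172.5}  # illustrative value


backbone = Backbone()
backbone.register("Consent", consent_stage)
backbone.register("Height", height_stage)
for stage_name in backbone.order:
    backbone.run(stage_name, "P-0001")
exported = backbone.export("P-0001")
```

A biobank with different stages would register different handlers; the storage and export code stays unchanged.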
Configurable state machine engine
Onyx controls the availability of the stages in the interview
1 - After reception, only the consent stage is available
2 - After consent
3 - After all stages completed
[Diagram, repeated for each step: Participant Reception, Consent, Questionnaire, SampleCollection, Conclusion]
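One way to picture stage availability is a table of prerequisites per stage. The sketch below is not Onyx's engine; the dependency table is an assumption made for illustration (consent unlocks the questionnaire and sample collection, and the conclusion requires every other stage).

```python
from typing import Dict, Set

# Assumed dependency table (not taken from Onyx): each stage lists the stages
# that must be completed before it becomes available.
PREREQUISITES: Dict[str, Set[str]] = {
    "Consent": set(),
    "Questionnaire": {"Consent"},
    "SampleCollection": {"Consent"},
    "Conclusion": {"Consent", "Questionnaire", "SampleCollection"},
}


def available_stages(completed: Set[str]) -> Set[str]:
    """Return the stages that can be started, given the completed ones."""
    return {
        stage
        for stage, required in PREREQUISITES.items()
        if stage not in completed and required <= completed
    }


after_reception = available_stages(set())
after_consent = available_stages({"Consent"})
after_all = available_stages({"Consent", "Questionnaire", "SampleCollection"})
```

Under these assumptions, `after_reception` contains only `Consent`, `after_consent` contains the questionnaire and sample collection, and `after_all` contains only `Conclusion`. Reconfiguring the interview for another biobank would mean editing the table, not the function.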
Onyx Screenshot: Appointment management
Onyx Screenshot: Participant Reception
Onyx Screenshot: Starting an interview stage
Onyx Screenshot: Participant Interview
Highly secure IT infrastructure architecture for assessment centers
Data Collection Modules
• The OBiBa team is currently developing a series of stage modules for the CPAC cohorts
• Each of them uses generic data models and tools that can be configured to meet the requirements of other biobanks
• These modules are:
  • Jade - Physical Measurements
  • Marble - Consent
  • Quartz - CAPI
  • Mica - Conclusion
  • Ruby - Sample Collection
Jade: The physical measurement module
Two kinds of physical measurement instrument
• Manual instruments
  • No software required to take the measurement
  • E.g. stadiometer for height, measuring tape for hip/waist circumference
  • Data need to be entered manually into the Onyx database
• Electronic instruments
  • Proprietary software required to run the instrument
  • E.g. Lunar Achilles Express for bone density measurement
  • Software requires input values (e.g. gender, weight)
  • Data need to be exported from the proprietary software into the Onyx database
Jade: The physical measurement module
Goals
• Integrate any number of instrument types, both manual and electronic, into a common model and architecture.
• Reduce the amount of programming work required for adding new instruments.
Some Features
• Manual instruments are completely defined (input, output, validation, contraindications, etc.) within a single XML file. Adding an instrument is simply a matter of adding an XML file.
• Electronic instruments also have an XML file, but they inevitably require custom code.
• When possible, input to the instrument is pushed directly into the proprietary software (e.g. participant's age, gender, etc.)
• When possible, output of the instrument is read directly from the software and sent to the server.
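To make the "one XML file per manual instrument" idea concrete, here is a minimal sketch of loading such a definition. The element and attribute names are invented for illustration and do not reflect Jade's real file format; only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical instrument definition; the schema is an assumption,
# not Jade's actual format.
STADIOMETER_XML = """
<instrument name="Stadiometer" type="manual">
  <output name="height" unit="cm" min="10" max="200"/>
  <contraindication code="CANNOT_STAND" kind="observed"/>
</instrument>
"""


def load_instrument(xml_text: str) -> dict:
    """Parse an instrument definition into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {
        "name": root.get("name"),
        "type": root.get("type"),
        "outputs": [
            {
                "name": o.get("name"),
                "unit": o.get("unit"),
                "min": float(o.get("min")),
                "max": float(o.get("max")),
            }
            for o in root.findall("output")
        ],
        "contraindications": [
            (c.get("code"), c.get("kind"))
            for c in root.findall("contraindication")
        ],
    }


instrument = load_instrument(STADIOMETER_XML)
```

With a loader of this kind, adding a manual instrument means adding a file; no new code is needed as long as the new file uses the same elements.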
Jade: The physical measurement module
Some Features
• Allows several types of validation criteria: valid range (10-200), valid spread between values (+/- 5%), likely range (20-180), etc.
• Contraindications: each instrument may define its own set of contraindications. These may be observed (by the nurse) or asked (questions to the participant).
• Interpretive variables: each instrument may define a set of questions that may be used later to interpret the measurement.
• Output of one instrument can become the input of another (unit conversion is done dynamically).
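The three validation criteria can be sketched as follows, reusing the example numbers from the slide. Two points are assumptions rather than Jade's documented behavior: values outside the valid range are treated as errors while values outside the likely range only raise warnings, and "spread" is interpreted as the deviation of each repeated value from the mean.

```python
from typing import List, Tuple


def check_measurements(
    values: List[float],
    valid: Tuple[float, float] = (10.0, 200.0),
    likely: Tuple[float, float] = (20.0, 180.0),
    max_spread: float = 0.05,
) -> List[str]:
    """Return a list of problems found in repeated measurements (empty if none)."""
    if not values:
        return ["error: no values"]
    problems = []
    for v in values:
        if not valid[0] <= v <= valid[1]:
            problems.append(f"error: {v} outside valid range")
        elif not likely[0] <= v <= likely[1]:
            problems.append(f"warning: {v} outside likely range")
    # Assumed reading of "spread": every value must lie within
    # max_spread (as a fraction) of the mean of all values.
    mean = sum(values) / len(values)
    if any(abs(v - mean) > max_spread * abs(mean) for v in values):
        problems.append("error: values deviate from their mean by more than the allowed spread")
    return problems
```

For example, `check_measurements([170.0, 171.0])` reports nothing, while `check_measurements([100.0, 120.0])` reports a spread error because both values are about 9% away from their mean of 110.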
Jade Screenshot: Contraindications
Jade Screenshot: Electronic instrument
Quartz: the questionnaire module
• CPAC cohorts will have similar but different questionnaires to administer
• Each CPAC cohort wants at least two questionnaires
  • One self-administered using a touch screen monitor
  • One more complex, administered by a nurse
• Each administered questionnaire will represent a different stage in Onyx
• Cohorts will be able to define their questionnaires and upload them into Quartz
• Requires a generic data model and software architecture. No hardcoded questions or screens
Quartz: the questionnaire module
• Multi-language versions of the same questionnaire
• Supports many types of questions / answers
  • e.g. free text, single-selection, multi-selection, dropdown lists, boilerplates
• Possibility of grouping the questions by section
• Timestamps taken to evaluate the efficiency of various sections
• Answer validation rules, e.g. valid ranges
• The display of each question may be made to depend on answers to earlier questions (skip patterns)
• Possibility of modifying answers to previous questions
• Questionnaire administration may be interrupted and restarted at the last question answered
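Skip patterns and resuming an interrupted questionnaire can both be expressed as "find the first unanswered question whose display condition holds". The sketch below uses hypothetical names (`Question`, `show_if`, `next_question`) and made-up questions; it is not Quartz's data model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Answers = Dict[str, str]


@dataclass
class Question:
    key: str
    text: str
    # Optional condition on earlier answers; None means "always shown".
    show_if: Optional[Callable[[Answers], bool]] = None


# Illustrative questionnaire, defined as data with no hardcoded screens.
QUESTIONS: List[Question] = [
    Question("smoker", "Have you ever smoked?"),
    Question(
        "cigs_per_day",
        "How many cigarettes per day?",
        show_if=lambda a: a.get("smoker") == "yes",
    ),
    Question("exercise", "Do you exercise weekly?"),
]


def next_question(answers: Answers) -> Optional[Question]:
    """First unanswered question whose display condition holds, or None when done."""
    for q in QUESTIONS:
        if q.key in answers:
            continue
        if q.show_if is None or q.show_if(answers):
            return q
    return None
```

Because the next question is computed from the stored answers alone, restarting after an interruption needs no extra state. A real implementation would also have to discard answers that become unreachable when an earlier answer is modified, which this sketch does not do.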
Marble: the Consent Module
• Used to explain the consent to participants and get their signature
• Each cohort has its own consent form
• Marble uses PDF forms with eSignature fields. Easy, secure and very flexible
• Marble automatically fills consent forms with the participant's name, ID (with barcode) and the current date
Marble Screenshot: both manual and electronic modes are supported
Mica: the Conclusion module
• Module that ends the interview at the assessment center
• Used to print and give the signed consent form to the participant
• Used to print a summary report giving some physical measures taken previously by Jade
• Each cohort has its own summary report
• Mica uses PDF templates. Easy and flexible
Ruby: the sample collection module
[Diagram labels: Within Onyx: Ruby (primary tube barcode scanning). Shipping. Labs (secondary tube barcode scanning). Biobank.]
• Currently in development. First iteration to be released in two weeks
• Scan barcodes
• Configurable validation rules
  • Structured barcodes (e.g. participant ID, sample types)
  • Unstructured barcodes, i.e. meaningless random numbers
• Aliquots and shipping are tracked using separate software
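Configurable barcode validation for the two barcode kinds might look like the sketch below. Both layouts are assumptions chosen for illustration (a 6-digit participant ID, a hyphen and a 2-letter sample type for structured barcodes; 10 meaningless digits for unstructured ones); they are not Ruby's actual rules.

```python
import re
from typing import Optional, Set

# Hypothetical structured layout: 6-digit participant ID, "-", 2-letter sample type.
STRUCTURED = re.compile(r"(?P<participant>\d{6})-(?P<sample_type>[A-Z]{2})")
# Hypothetical unstructured layout: 10 digits carrying no meaning.
UNSTRUCTURED = re.compile(r"\d{10}")


def validate_scan(
    barcode: str, participant_id: str, seen: Set[str], structured: bool
) -> Optional[str]:
    """Return None if the scan is accepted, else the reason it is rejected."""
    if barcode in seen:
        return "duplicate barcode"
    if structured:
        m = STRUCTURED.fullmatch(barcode)
        if m is None:
            return "bad format"
        # A structured barcode must match the participant being interviewed.
        if m.group("participant") != participant_id:
            return "barcode belongs to another participant"
    elif UNSTRUCTURED.fullmatch(barcode) is None:
        return "bad format"
    seen.add(barcode)  # remember accepted barcodes to catch double scans
    return None
```

Structured barcodes allow a cross-check against the participant, while unstructured ones can only be checked for format and uniqueness; switching between the two is a configuration flag here.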
The DCC Application
• Initially developed with and for Cartagene
• Phase II: no current development activity
• Some application goals
  • Store data collected at the assessment centers
  • Produce administrative and scientific reports
    • E.g. number of participants by assessment center, statistical distributions for specific variables, etc.
  • Export data in SPSS, SAS or other formats
  • Node of the CPAC federated databases network (long term)
• Work closely with P3G and the DataShaper team
Conclusion
• OBiBa BIMS is developed for 5 similar but different cohorts
• OBiBa designs generic data models, software architecture and configurable solutions
  • This makes the system potentially interesting to other cohorts worldwide
• Only a few months of development, but code is already available on www.obiba.org
  • No documentation yet on the web site
• New biobanks interested in collaborating can contact us directly
Thanks
• The OBiBa Team
  • Vincent Ferretti, Philippe Laflamme, Nathalie Emond, Nancy Lambert, Alice Carey, Yannick Marcon, Dennis Spathis, Martin Boulanger, Meryam Belhiah, Halley Lin
• Cartagene
  • Serge Mani, the administration team
• The CPAC cohort IT teams
• The CPAC harmonization task force
  • Isabel Fortier, Jean-Pierre Le Cruguel, Anne Vilain
• Funders
  • Genome Quebec, Cancer Care Ontario, OICR