ICT in the IT Future of Medicine Project
Babette Regierer, Daniel Jameson
ITFoM Consortium
170 partners from 34 countries:
• 21 EC Member States
• Associated Countries: Switzerland, Iceland, Israel, Croatia, Turkey, Norway
• Other countries: Australia, New Zealand, Canada, USA, Lebanon, Korea, Japan
[map: number of partners per country]
ITFoM - Project vision
• Assimilation of data about individuals ('omics data, health records)
• Incorporation of these data into mathematical models of each individual's "health"
• Use of these models to predict the health of individuals and, if necessary, the courses of treatment best suited to them
The Virtual Patient: Integration of various models
[diagram: model layers spanning Molecules, Tissues, Anatomy, Statistics]
Structure of ITFoM
• Medical Platform (Kurt Zatloukal)
• Analytical Technologies (Hans Lehrach)
• Infrastructure Hardware and Software (Nora Benhabiles/Oskar Mencer)
• Data Pipelines (Ewan Birney)
• Computational Methodologies (Mark Girolami)
• ICT Integration (Hans Westerhoff)
• Coordination and Management (Hans Lehrach/Markus Pasterk)
Challenges for ICT
• Security
• Automation
• Scalability
Scale
• 12 million new cancer cases worldwide per year
• To address them all you would need to sequence and analyse one cancer roughly every 2.6 seconds, and each case means at least two complete sequences: at least one for the tumour and one for somatic (normal) tissue (worked through below)
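A quick back-of-envelope check of that rate, as a minimal Python sketch; the 12 million cases per year and two genomes per case are the figures from the slide above, everything else is arithmetic:

```python
# Required sequencing throughput if every new cancer case worldwide were
# sequenced. Figures taken from the slide: 12 million cases/year, at least
# two whole genomes per case (tumour + somatic/normal tissue).
CASES_PER_YEAR = 12_000_000
GENOMES_PER_CASE = 2
SECONDS_PER_YEAR = 365 * 24 * 3600  # ~31.5 million seconds

seconds_per_case = SECONDS_PER_YEAR / CASES_PER_YEAR
genomes_per_day = CASES_PER_YEAR * GENOMES_PER_CASE / 365

print(f"one case every {seconds_per_case:.1f} s")        # ~2.6 s
print(f"~{genomes_per_day:,.0f} whole genomes per day")  # ~65,753
```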
Scalability
• All technology must be developed with an eye on scalability
• What is appropriate now is guaranteed not to be appropriate in 10 years
• All data formats, standards and paradigms must be flexible and extensible (see the sketch below)
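As an illustration of what "flexible and extensible" could mean in practice, here is a minimal sketch of a versioned record format with an open extensions map, so readers built today can tolerate fields added tomorrow; the field names and the JSON encoding are illustrative assumptions, not an ITFoM specification:

```python
# Every record carries an explicit schema version plus an open "extensions"
# map: future writers add fields there, and readers built today ignore them.
import json

def make_record(sample_id: str, payload: dict, version: str = "1.0") -> str:
    record = {
        "format_version": version,  # readers branch on this, never on guesswork
        "sample_id": sample_id,
        "payload": payload,
        "extensions": {},           # room to grow without breaking old readers
    }
    return json.dumps(record)

def read_record(raw: str) -> dict:
    record = json.loads(raw)
    major = record["format_version"].split(".")[0]
    if major != "1":                # only refuse genuinely incompatible data
        raise ValueError(f"unsupported major version {major}")
    return record

print(read_record(make_record("S001", {"coverage": 30}))["payload"])
```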
Security
• ITFoM is aware that a huge amount of the data involved in the proposal is sensitive
• The proposal is to develop a robust, federated security framework and accompanying policies
• Mindful of the location of data objects: certain objects must remain within the EU (sketched below)
• Identity management will build on the experience of a variety of partners (EUDAT, UCL, EBI, IBM)
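The location-awareness point could be pictured as a pre-transfer policy check along the following lines; the region names, the eu_only policy tag and the whole API are hypothetical stand-ins, not part of the proposed framework:

```python
# Before any transfer, check the data object's policy tag against the
# destination region; EU-restricted objects may never leave EU regions.
EU_REGIONS = {"eu-west", "eu-central", "eu-north"}  # hypothetical region names

class DataObject:
    def __init__(self, object_id: str, region: str, eu_only: bool):
        self.object_id = object_id
        self.region = region
        self.eu_only = eu_only  # policy tag: must this object stay within the EU?

def may_transfer(obj: DataObject, destination_region: str) -> bool:
    """Refuse any move that would take an EU-restricted object outside the EU."""
    return not (obj.eu_only and destination_region not in EU_REGIONS)

genome = DataObject("patient42/genome", region="eu-west", eu_only=True)
assert may_transfer(genome, "eu-central")
assert not may_transfer(genome, "us-east")
```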
Acquisition: Data gathered
• For data generation we need to consider:
- heterogeneous data produced (molecules, physiology, patient, society…?)
- various technologies for data generation
- different user groups (skilled vs. naïve)
- different data management systems
- different levels of professional expertise
Acquisition: ICT to facilitate
• An easy, user-oriented process from machine to knowledge:
- data analysis pipelines must be easy to handle and fast (e.g. flow computing); see the sketch after this list
- fast data transfer systems
- "online" data generation in the future?
- development of automated processes
- standards for data formats and processes
- suitable data management systems and data storage (local or distributed; security issues)
- is a new database structure needed to speed up data storage, transfer and use? (e.g. the HANA system)
- responsibility for data curation: where, when, how, who?
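One way to picture such an easy-to-handle, automatable pipeline is a chain of named steps that stamps provenance as it runs, answering the "where, when, how, who" of curation; the step names and data layout are illustrative assumptions only:

```python
# A linear chain of named, auditable steps from raw machine output to
# curated results. Each step records itself in the provenance trail.
from typing import Callable

Step = Callable[[dict], dict]

def run_pipeline(raw: dict, steps: list[tuple[str, Step]]) -> dict:
    data = raw
    for name, step in steps:
        data = step(data)
        data.setdefault("provenance", []).append(name)  # who/what touched the data
    return data

pipeline = [
    ("quality_control", lambda d: {**d, "qc": "passed"}),
    ("alignment",       lambda d: {**d, "aligned": True}),
    ("variant_calling", lambda d: {**d, "variants": []}),
]
result = run_pipeline({"sample": "S001"}, pipeline)
print(result["provenance"])  # ['quality_control', 'alignment', 'variant_calling']
```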
Integration: Pipelines to models
• Complete genomes provide the framework to pull all biological data together, such that each piece says something about biology as a whole
• Biology is too complex for any one organisation to have a monopoly on ideas or data
• The more organisations provide data or analysis separately, the harder it becomes for anyone to make use of the results
Integration: Pipelines to models
• The data being gathered must be marshalled into something useful
• Processing, storage, retrieval
• It must be stored
• It must be annotated
• It must be auditable
Integration: ICT to facilitate
• Federated data warehouse with standardised interfaces (a minimal sketch follows below)
• Includes auditing services
• Must integrate with the security layer
• Processing pipelines feed into the warehouse
• Compute tasks handled on an HPC platform using already-established middleware (EBI)
• Several pipelines, drawing on existing databases to automate annotation where possible
• Data-specific compression algorithms
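A minimal sketch of what a standardised, audited warehouse interface could look like, with a plain dictionary standing in for the federated store; every name here is a placeholder, not an ITFoM design:

```python
# Every read and write passes through one standardised facade that appends
# to an audit trail; a real system would also consult the security layer
# and route requests to the appropriate federation node.
from datetime import datetime, timezone

class AuditedWarehouse:
    def __init__(self):
        self.backend = {}  # stand-in for the federated store
        self.audit = []    # append-only trail: (timestamp, user, action, key)

    def _record(self, user: str, action: str, key: str) -> None:
        self.audit.append((datetime.now(timezone.utc).isoformat(), user, action, key))

    def put(self, user: str, key: str, value: bytes) -> None:
        self._record(user, "PUT", key)
        self.backend[key] = value

    def get(self, user: str, key: str) -> bytes:
        self._record(user, "GET", key)
        return self.backend[key]

wh = AuditedWarehouse()
wh.put("alice@clinic", "patient42/variants", b"...")
wh.get("bob@lab", "patient42/variants")
print(len(wh.audit))  # 2: every access leaves a trace
```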
Processing: Simulating models • Variety of model types
Processing: ICT to facilitate
• New algorithms and techniques
• HPC platforms
• Protocols
• New hardware
• One size will not fit all, but all must communicate with each other (see the interface sketch below)
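To make "one size will not fit all, but all must communicate" concrete, here is a sketch of a shared interface that very different model types (ODE, statistical, agent-based) could implement for coupling; the interface, the toy glucose model and the naive sequential coupling scheme are all illustrative assumptions:

```python
# Heterogeneous models agree only on a common step() contract over a shared
# state dictionary, so the virtual patient can drive them together.
from abc import ABC, abstractmethod

class PatientModel(ABC):
    @abstractmethod
    def step(self, state: dict, dt: float) -> dict:
        """Advance the model by dt and return the updated shared state."""

class GlucoseODEModel(PatientModel):
    """Toy ODE: blood glucose relaxes towards a 5.0 mmol/L baseline."""
    def step(self, state: dict, dt: float) -> dict:
        g = state.get("glucose", 5.0)
        return {**state, "glucose": g - 0.1 * (g - 5.0) * dt}

def co_simulate(models: list, state: dict, dt: float, steps: int) -> dict:
    for _ in range(steps):
        for model in models:  # naive sequential coupling; real schemes differ
            state = model.step(state, dt)
    return state

final = co_simulate([GlucoseODEModel()], {"glucose": 8.0}, dt=0.1, steps=100)
print(f"{final['glucose']:.2f} mmol/L")  # drifts from 8.0 towards the baseline
```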
Utilising: Making use of models • Closing the loop
Utilising: Making use of models
• We need to consider:
- different target groups
- easy access to the data/information needed
- making them work in the field/at the bedside
- technology must be available at a low price (e.g. computing power must be cost-effective, i.e. green technology)
Utilising: ICT to facilitate
• The aim is an approach that is easy to handle, cost-efficient and runs on all systems:
- automated data analysis/modelling system
- a well-developed human-computer interface (visualisation)
- automated updating of the information (e.g. by text mining of publications)
- must be easy to plug in new systems (see the registry sketch below)
- legal issues
- instant results
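The "easy to plug in new systems" requirement could be met with something as small as a module registry, sketched below; the registry pattern and all names are assumptions for illustration, not the project's chosen mechanism:

```python
# New analysis or visualisation modules register themselves by name, so the
# core system never needs editing when a new component is plugged in.
ANALYSES = {}

def register(name: str):
    """Decorator: expose a function to the rest of the system under a name."""
    def wrap(fn):
        ANALYSES[name] = fn
        return fn
    return wrap

@register("variant_summary")
def variant_summary(record: dict) -> str:
    return f"{record.get('sample', '?')}: {len(record.get('variants', []))} variants"

def run(name: str, record: dict) -> str:
    return ANALYSES[name](record)  # unknown plug-in names fail loudly (KeyError)

print(run("variant_summary", {"sample": "S001", "variants": [1, 2, 3]}))
```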
ICT Components for Genomic Medicine
[diagram reconstructed as a list; the data-sensitivity axis runs from summary data (Component 1) through anonymised data (Component 6) to personal data (Component 5)]
• Component 1: Human sequence data repositories (summary data). EBI: repositories (petabytes of genome sequence data); Sanger: sequencing (1000 Genomes, UK10K); reference genome sequence ~3 gigabytes; add genomics: up to 60 million variant files = 600 terabytes*
• Component 2: Genotype and phenotype relationship capture. Biomedical Informatics Institute (BII), SMEs etc.: cloud-based, secure services
• Component 3: Additional clinical annotation, with variant files as open data
• Component 4: Individual query analysis
• Component 5: Electronic Health Record (personal data). eHR system (e.g. EMIS): ~10 Mb variant file as attachment per record; healthcare professional decision support system
• Component 6: Research on clinical data (anonymised data): SHIP, GPRD, LSDBs, Research Capability Programme (RCP)
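As a quick consistency check on the diagram's storage figures (arithmetic only; decimal units assumed):

```python
# 60 million variant files at the quoted ~10 Mb each matches the quoted
# 600 terabytes total.
files = 60_000_000
mb_per_file = 10
total_tb = files * mb_per_file / 1_000_000  # MB -> TB, decimal
print(total_tb)  # 600.0
```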
Importance of Automation
• Mentioned frequently in ITFoM
• Pipelining and utilising data on this scale is impossible if all steps are conducted manually
• This includes processing, annotation, hypothesis generation and testing
• Text mining, machine learning
• No one's actually cracked this
Conclusions
• A virtual, or digital, patient has the potential to revolutionise healthcare, but it will rely completely on the creation of a broad, probably federated, IT infrastructure
• An infrastructure such as this is non-trivial
• Any project as ambitious as a virtual patient requires vastly more expertise than any one individual can hold, yet all elements of the project must interact
• Rigorous definition of data standards, interfaces and pipelines must be coupled with a broad view of the topology within which they play a part