Use of Data Provenance and the Grid in Medical Image Analysis and Drug Discovery – an IXI exemplar Kelvin K. Leung1, Mark Holden1, Rolf A. Heckemann2, Nadeem Saeed3, Keith J. Brooks3, Jacky B. Buckton4, Kumar Changani3, David G. Reid3, Daniel Rueckert5, Joseph V. Hajnal2, Derek L.G. Hill1 1Division of Imaging Sciences, King's College London, UK 2Imaging Sciences Department, Imperial College (Hammersmith Hospital Campus), UK 3Imaging Centre, 4RA Disease Biology, ri-CEDD, GlaxoSmithKline, UK 5Department of Computing, Imperial College, UK
Overview • Background • Motivations • Virtual data system • Automatic delineation of multiple bones in serial MR images of joints in a disease model of Rheumatoid Arthritis (RA) • Image registration and segmentation propagation • Methods • Prototype • Results • Conclusions
Motivations • Medical imaging is expected to play an important part in drug discovery • Recent £76m investment by GlaxoSmithKline (GSK) and Imperial College in a new clinical imaging center • Automatic analysis of medical image data requires: • Lots of storage space (each image is about 32Mb in this work) • Computational power (processing one image takes about 20-24 hours on a single desktop computer in this work) • Motivated by the need for computational resources
Motivations • The Grid has the potential to allow better collaboration between industry and universities through the idea of the virtual organisation • Universities can provide image analysis algorithms as services to industry partners, such as GSK, over the Grid • Motivated by the need for better and more effective collaboration with industry
Motivations • Detailed and reliable documentation of the data provenance of all analyses is very important for obtaining regulatory approval for new drugs. • Part 11 of the Guidance for Industry issued by the US Food and Drug Administration (FDA) • Good Laboratory Practice (GLP) and Good Clinical Practice (GCP) • Motivated by the need for data provenance
Virtual data system (VDS or Chimera) • A system to “enable documentation of data provenance, discovery of available methods and on-demand data generation (so-called ‘virtual data’)” • Developed by I. Foster, J. Vöckler, M. Wilde and Y. Zhao at the University of Chicago • It consists of: • A virtual data catalogue: a virtual data schema that provides a representation of computational procedures and their invocations. • A virtual data language interpreter, which handles all requests for constructing and querying the database entries. • Data objects, such as input and output files, are described by logical file names (LFNs), which are mapped to physical files via the Globus replica catalog (RC) or the Globus replica location service (RLS)
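The mapping from logical file names to physical files can be pictured with a small sketch. This is only an in-memory stand-in written for illustration: the class `ReplicaCatalogue`, its methods and the example URLs are hypothetical and are not the Globus RC or RLS API.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Hypothetical in-memory stand-in for a replica catalogue (not the Globus RLS API)."""

    def __init__(self):
        # logical file name (LFN) -> list of physical file locations
        self._replicas = defaultdict(list)

    def register(self, lfn, pfn):
        """Record that logical file `lfn` has a physical copy at `pfn`."""
        if pfn not in self._replicas[lfn]:
            self._replicas[lfn].append(pfn)

    def lookup(self, lfn):
        """Return all known physical locations for `lfn` (empty list if none)."""
        return list(self._replicas.get(lfn, []))


# Example locations are made up for illustration.
catalogue = ReplicaCatalogue()
catalogue.register("file_a", "gsiftp://grid.example.org/data/file_a")
catalogue.register("file_a", "file:///scratch/file_a")
```

Because workflows refer only to the logical name, the same workflow description stays valid when a file is copied or moved; only the catalogue entries change.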
Virtual data system
• Virtual data language (VDL) is used to describe computational procedures and their invocations
• Computational procedures are defined by transformation (TR) statements. Example:
    TR foo(input file1, output file2) { … }
• Invocations are defined by derivation (DV) statements. Example: to invoke foo with logical filenames file_a (input) and file_b (output):
    DV call_foo->foo(file1=@{input:"file_a"},file2=@{output:"file_b"});
• The virtual data schema allows the storage of TRs and DVs
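To show why storing TRs and DVs gives a provenance record, here is a minimal Python sketch that models the `foo` and `call_foo` example as plain data and asks which derivation produced a given file. The classes `Transformation` and `Derivation` and the function `producers` are hypothetical names introduced for illustration; they are not part of VDS.

```python
from dataclasses import dataclass


@dataclass
class Transformation:
    """A computational procedure: a name plus its formal arguments."""
    name: str
    inputs: tuple   # formal input argument names
    outputs: tuple  # formal output argument names


@dataclass
class Derivation:
    """An invocation: binds formal arguments of a transformation to LFNs."""
    name: str
    transformation: str
    bindings: dict  # formal argument name -> logical file name


def producers(lfn, derivations, transformations):
    """Names of the derivations that bind `lfn` to an output argument."""
    found = []
    for dv in derivations:
        tr = transformations[dv.transformation]
        if any(dv.bindings.get(arg) == lfn for arg in tr.outputs):
            found.append(dv.name)
    return found


# The TR/DV pair from the slide, expressed as data.
foo = Transformation("foo", inputs=("file1",), outputs=("file2",))
call_foo = Derivation("call_foo", "foo", {"file1": "file_a", "file2": "file_b"})
produced_file_b = producers("file_b", [call_foo], {"foo": foo})
```

Here `produced_file_b` lists `call_foo`, while the same query for `file_a` returns nothing, since `file_a` is only ever used as an input.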
Virtual data system
• Compound TRs can be built so that workflows can be defined. Example: to call foo twice and pass the output of the first call to the input of the second call:
    TR compound_foo(input file_in, output file_out, io file_io) {
      call foo(file1=@{input:"file_in"}, file2=@{output:"file_io"});
      call foo(file1=@{input:"file_io"}, file2=@{output:"file_out"});
    };
• When an output file is requested from the system, an abstract DAG (containing only LFNs) is generated.
• A planner called “Planning for Execution in Grids (Pegasus)” converts the abstract DAG into a Condor DAGMan script and submits it to the Globus universe of Condor.
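The idea of deriving an abstract DAG from a requested output file can be sketched as a dependency walk over logical file names. The following is an illustrative sketch only, not how VDS or Pegasus is implemented: the function `plan` and the derivation names `first_foo` and `second_foo` are hypothetical, and files that no derivation produces are assumed to exist already.

```python
def plan(target, derivations):
    """Return derivation names in an order that can produce `target`.

    derivations: dict mapping name -> (list of input LFNs, list of output LFNs).
    Files that no derivation produces are assumed to be available already.
    """
    # Which derivation produces each logical file?
    producer = {out: name
                for name, (_, outs) in derivations.items()
                for out in outs}
    order, done, in_progress = [], set(), set()

    def visit(lfn):
        name = producer.get(lfn)
        if name is None or name in done:
            return
        if name in in_progress:
            raise ValueError(f"cycle detected at {name}")
        in_progress.add(name)
        for dep in derivations[name][0]:
            visit(dep)          # schedule everything this job depends on first
        in_progress.discard(name)
        done.add(name)
        order.append(name)

    visit(target)
    return order


# The two calls of compound_foo, written as separate steps.
derivations = {
    "second_foo": (["file_io"], ["file_out"]),
    "first_foo": (["file_in"], ["file_io"]),
}
jobs_for_file_out = plan("file_out", derivations)
```

Requesting `file_out` schedules `first_foo` before `second_foo`; requesting `file_io` schedules only `first_foo`. The plan mentions logical names only, which is what makes it abstract; binding to physical files and sites is left to the planner.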
Automatic delineation of multiple bones • Rheumatoid Arthritis (RA) • Is a chronic, systemic, autoimmune inflammatory disease. • Targets synovial joints, in which there is a massive accumulation of blood-borne cells such as T cells and macrophages. • Blood vessels are formed to support this new tissue and the whole mass is called pannus. • Progressive erosion of cartilage and bone leads to disability in patients • MR images were acquired in a disease model of RA • We are interested in the talus bone and the calcaneus bone in the ankle • The aim is to delineate them from the MR images and study them, e.g. calculate their volume to measure any erosion
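Once a bone has been delineated, one simple way to obtain its volume is to count the labelled voxels and multiply by the voxel volume. The slides do not say how volume was computed, so the sketch below is an assumption; the function name, the label value and the voxel size are hypothetical.

```python
def structure_volume_mm3(labels, label, voxel_size_mm):
    """Volume of one labelled structure in cubic millimetres.

    labels: nested list indexed [z][y][x] of integer labels.
    voxel_size_mm: (dx, dy, dz) voxel dimensions in millimetres.
    """
    dx, dy, dz = voxel_size_mm
    count = sum(v == label for plane in labels for row in plane for v in row)
    return count * dx * dy * dz


# Toy 2 x 2 x 2 label volume; label 1 marks the structure of interest.
toy_labels = [
    [[0, 1], [0, 0]],
    [[1, 1], [0, 0]],
]
toy_volume = structure_volume_mm3(toy_labels, label=1, voxel_size_mm=(0.5, 0.5, 1.0))
```

Comparing this number across the serial images of one subject gives a basic measure of change over time.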
Image registration • Refers to the spatial alignment of two images so that corresponding features in the two images are matched • The result is a spatial mapping or transformation that transforms positions from one image to positions in another image. • Example: Movie showing the rigid registration of two 3D MR images of a knee [Figure: sagittal planes of image 1 and image 2; sagittal, transaxial and coronal views]
Image registration • Rigid registration: translation + rotation = 6 degrees of freedom (dof) • Affine registration: rigid + skewing + scaling = 12 dof • Nonrigid registration: warps one image into another • Very computationally demanding because of the large number of dof • Example: Free form deformation (FFD) models local deformation as the translation of a regularly spaced grid of points (control points) [Figure: movie showing the green MR image of a knee overlaid on top of the grey MR image of a knee before and after warping. White arrows show the amount of translation of the control points.]
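The degree-of-freedom counts can be made concrete with a short sketch. It builds a rigid transformation from its 6 parameters and counts the parameters of an FFD grid, where each control point can translate in 3 directions. The rotation order (about x, then y, then z) and the 10 x 10 x 10 grid are assumptions chosen for illustration, not details taken from the slides.

```python
import math


def rigid_matrix(tx, ty, tz, rx, ry, rz):
    """4x4 homogeneous matrix from 3 translations and 3 rotations (radians).

    Assumed convention: rotate about x, then y, then z (R = Rz * Ry * Rx),
    then translate.
    """
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx, tx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx, ty],
        [-sy, cy * sx, cy * cx, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(m, p):
    """Apply a 4x4 homogeneous matrix to a 3D point."""
    x, y, z = p
    return tuple(m[i][0] * x + m[i][1] * y + m[i][2] * z + m[i][3]
                 for i in range(3))


def ffd_dof(nx, ny, nz):
    """Each control point translates in x, y and z, giving 3 dof per point."""
    return 3 * nx * ny * nz


RIGID_DOF = 3 + 3               # translation + rotation
AFFINE_DOF = RIGID_DOF + 3 + 3  # plus scaling and skewing

# Rotate 90 degrees about z, then translate by (1, 2, 3).
m = rigid_matrix(1.0, 2.0, 3.0, 0.0, 0.0, math.pi / 2)
moved = transform_point(m, (1.0, 0.0, 0.0))
```

Even a modest control-point grid has thousands of parameters to optimise, compared with 6 or 12 for rigid or affine registration, which is why nonrigid registration dominates the running time.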
Segmentation propagation • Makes use of the spatial mapping calculated from the registration of two images to perform segmentation • Requires an atlas • An atlas is a reference image with labelled structures
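Segmentation propagation • All image analysis workflows were entered into VDS [Figure: segmentation propagation of the calcaneus. The atlas is a reference image with a manual segmentation of the calcaneus. Rigid + non-rigid registration of the reference image and the target image gives a spatial mapping; applying the spatial mapping gives the computed boundary of the calcaneus.]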
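Applying the spatial mapping to the atlas labels can be sketched on a toy 2D grid. This is an illustration under stated assumptions, not the software used in this work: the mapping is taken to go from target positions to atlas positions, labels are looked up with nearest-neighbour rounding, and the function name and the shift mapping are hypothetical.

```python
def propagate_labels(atlas_labels, target_shape, target_to_atlas):
    """Label a target grid by looking up each pixel's position in the atlas.

    atlas_labels: 2D nested list of integer labels (0 = background).
    target_shape: (rows, cols) of the target image.
    target_to_atlas: function (row, col) -> (atlas_row, atlas_col), i.e. the
        spatial mapping obtained from registration (direction assumed here).
    """
    rows, cols = target_shape
    a_rows, a_cols = len(atlas_labels), len(atlas_labels[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ar, ac = target_to_atlas(r, c)
            # Nearest-neighbour lookup keeps the labels integral.
            ar, ac = round(ar), round(ac)
            if 0 <= ar < a_rows and 0 <= ac < a_cols:
                out[r][c] = atlas_labels[ar][ac]
    return out


# Toy atlas with a two-pixel structure labelled 1.
atlas = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# Made-up mapping: the target is the atlas shifted down by one row.
propagated = propagate_labels(atlas, (4, 4), lambda r, c: (r - 1, c))
```

In this toy case the labelled structure appears one row lower in `propagated` than in the atlas. With a real registration result the mapping would be the combined rigid and non-rigid transformation rather than a fixed shift.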
Prototype • Simple web interface to replace some command line tools of VDS, Globus Toolkit 2.4 and Condor • Researchers or clinicians working on medical image analysis may not be comfortable with command line tools and the virtual data language • Developed using Java servlets on Apache Tomcat • Web pages for • Querying VDS for transformations and derivations • Invoking transformations in VDS • Querying, uploading and downloading files to and from Globus RLS • Displaying job status in Condor
Prototype • Web portal machine running Apache Tomcat, Globus client and personal Condor (job submission site) • Grid machine running Globus Gatekeeper, GridFTP server, Globus RLS and Condor • Experimental Condor pool of 4 machines (storage and execution site)
Results • [Screenshot: services]
Results • Service to delineate the calcaneus and talus from the target image [Workflow diagram with nodes: target, reference_image, rigid registration, aregdof, talus_seg, cal_seg, segmentation propagation (twice), tal_dof, talus, calcaneus, cal_dof]
Results • [Screenshot: jobs generated]
Results • [Screenshot: job status in Condor]
Results • Click to download files and view in vtkview
Results • Service to render the surfaces of the bones
Results • [Screenshot: job submitted and job status]
Results • Browse all the executed services and click on a file to view its provenance
Conclusions • We integrated Grid middleware and a data provenance tool with medical image processing software in a prototype system, in collaboration with GSK • Data provenance of the results was kept in VDS and can be queried and retrieved easily. • The aim is to satisfy guidelines issued by the US FDA, GLP and GCP on the maintenance of an “audit trail” of electronic records. • The total processing time for delineating 12 bones from 6 subjects was cut from about 132 hours to about 33 hours (a factor of 4) by running the computing tasks on a Condor pool instead of on a single desktop computer
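The reported timings can be checked against the other figures in the slides with a little arithmetic. The sketch below assumes one image per subject and that all 4 machines of the experimental Condor pool were in use; neither assumption is stated explicitly in the slides.

```python
serial_hours = 132   # single desktop computer, from the conclusions
pool_hours = 33      # Condor pool, from the conclusions
subjects = 6
pool_machines = 4    # experimental Condor pool size (assumed to be fully used)

speedup = serial_hours / pool_hours             # 4.0, the reported factor of 4
hours_per_subject = serial_hours / subjects     # 22.0 hours
parallel_efficiency = speedup / pool_machines   # 1.0 under the assumptions above
```

The 22 hours per subject falls inside the 20-24 hours per image quoted in the motivations, and a speed-up equal to the pool size suggests that the per-image jobs ran independently with little scheduling or data transfer overhead.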
Further work • More user feedback is required to evaluate and improve the system • Further validation and application to a larger number of subjects are required to determine the sensitivity of the delineation technique to disease progression
Acknowledgements • EPSRC • GlaxoSmithKline (GSK) • Links • IXI: www.ixi.org.uk • VDS: www.griphyn.org/chimera