Outline • Introduction • Image Registration • High Performance Computing • Desired Testing Methodology • Reviewed Registration Methods • Preliminary Results • Future Work • Cool App Demo
Introduction • Primary Motivation • After some research, the scope of this project increased tenfold
Image Registration • Image Registration is the process of determining a spatial transformation that establishes the correspondence of two images
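To make the definition concrete, here is a minimal sketch, assuming the simplest possible case: two 1-D intensity profiles that differ only by an integer translation. The function names, the sample data, and the exhaustive search are illustrative assumptions, not something taken from the slides or from any registration toolkit.

```python
def mean_squares(fixed, moving, shift):
    """Mean squared intensity difference over the overlap
    when `moving` is translated by `shift` samples."""
    total = 0.0
    count = 0
    for i, f in enumerate(fixed):
        j = i - shift  # sample of `moving` that lands on position i
        if 0 <= j < len(moving):
            total += (f - moving[j]) ** 2
            count += 1
    return total / count if count else float("inf")


def register_translation(fixed, moving, max_shift):
    """Try every integer shift in [-max_shift, max_shift]
    and return the one with the lowest metric value."""
    candidates = range(-max_shift, max_shift + 1)
    return min(candidates, key=lambda s: mean_squares(fixed, moving, s))


# Hypothetical data: the same intensity profile, displaced by two samples.
fixed = [0, 0, 1, 4, 9, 4, 1, 0, 0, 0]
moving = [1, 4, 9, 4, 1, 0, 0, 0, 0, 0]
best = register_translation(fixed, moving, max_shift=3)
```

The recovered shift is the "spatial transformation" of the definition; real problems replace the single integer with a rigid or deformable transform and the exhaustive search with an optimizer.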
Image Registration • Applications of Image Registration • Cartography • Computer Vision • Image Guided Surgery • Brain Mapping • Detection of Disease state change over time • And many more…
Image Registration • Software packages, libraries, and frameworks capable of Image Registration • Automated Image Registration Package (AIR) • Insight Segmentation and Registration Toolkit (ITK) • FLexible Image Registration Toolkit (FLIRT) • MathWorks Image Processing Toolbox • Others… • None currently support registration by means of parallel computing!
Image Registration • Depending on the application, registration can be highly demanding of resources • The data to be processed can be too large for physical memory (which results in disk swapping) • Search spaces can be very large (for deformable problems, as large as roughly 9.8 * 10^6)
High Performance Computing • Extremely effective at reducing performance and memory issues • Steadily decreasing prices and the growing availability of high performance machines have made parallel computing a reality for many • Most image registration specialists are not familiar with parallel and distributed computing techniques • Many researchers have successfully applied such methods, but none have created a generic software module
High Performance Computing • My Role • Administer and maintain the two clusters Nick and Optimus • Head of the USC High Performance Computing Group • Assist users • Developed and (try to) maintain the HPCG Webpage
High Performance Computing Systems: Nick • HARDWARE: 76 Compute Nodes: Dual 3.4 GHz Xeon, 2 MB L2, 4 GB RAM, 1 × 40 GB disk • 1 Master Node: Dual 3.2 GHz Xeon, 2 MB L2, 4 GB RAM, 3 × 73 GB disks, RAID 5 • INTERCONNECT: Topspin Infiniband • SOFTWARE: Platform Rocks 4 (RHEL 4), Platform LSF, OpenMPI (Compiled with Infiniband Libraries), 64bit GCC compiles, Intel Compilers, Star-CD, ITK, others… • Will support starting Summer: GAMESS, NWCHEM, …
High Performance Computing Systems: Optimus • HARDWARE: 64 Compute Nodes: Dual, Dual-core 2.2 GHz Opteron, 2 MB L2, 8 GB RAM, 1 × 250 GB disk • 1 Master Node: Dual, Dual-core 2.2 GHz Xeon, 2 MB L2, 8 GB RAM, 2 × 500 GB disks • INTERCONNECT: GigE • SOFTWARE: Fedora Core 4, ABC Management Software, OpenPBS scheduling software, OpenMPI (Compiled with Infiniband Libraries), 64bit GCC compiles, Intel Compilers, ITK, others… • Will support starting Summer: GAMESS, NWCHEM, …
High Performance Computing • Message Passing • In distributed memory systems, the most prevalent means of communication is message passing • Message Passing Interface (MPI) • Takes care of low-level details such as buffering, error handling, and data-type conversion • Acts as a middleware component used in conjunction with standard programming languages such as C, C++, and Fortran
High Performance Computing • Issues with Multi-core [6] • Memory Contention • Interconnect Contention • Program Locality • Open MPI processor affinity setting: "--mca mpi_paffinity_alone 1"
Desired Testing Methodology • Research and analyze existing registration frameworks to determine whether their workload can be distributed in a parallel environment • Thoroughly test all methods sequentially and in parallel to determine speedup • Test 2-D and 3-D, intermodal and intramodal, and rigid and non-rigid image registration • Focus on intensity-based methods • Address known multi-core issues
Desired Testing Methodology • Two strategies • Parallelizing the optimization method • Parallelizing the metric function
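The first strategy can be sketched as follows: the optimizer proposes several candidate transforms and their metric values are computed concurrently, while each individual metric evaluation stays sequential. This is a minimal sketch under stated assumptions: the 1-D integer-shift transform, the function names, and the use of a Python thread pool as a stand-in for cluster processes are all illustrative, and Python threads show the structure only, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def metric(shift, fixed, moving):
    """Sum of squared differences; samples shifted in from
    outside `moving` are treated as zero (assumed padding)."""
    total = 0.0
    for i, f in enumerate(fixed):
        j = i - shift
        m = moving[j] if 0 <= j < len(moving) else 0.0
        total += (f - m) ** 2
    return total


def parallel_search(fixed, moving, candidates, workers=4):
    """Evaluate every candidate shift concurrently and pick the best.
    pool.map returns results in the same order as `candidates`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda s: metric(s, fixed, moving), candidates))
    best_index = min(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores


# Hypothetical data: the same profile displaced by two samples.
fixed = [0, 0, 1, 4, 9, 4, 1, 0, 0, 0]
moving = [1, 4, 9, 4, 1, 0, 0, 0, 0, 0]
candidates = list(range(-3, 4))
best_shift, scores = parallel_search(fixed, moving, candidates)
```

The second strategy leaves the optimizer sequential and instead splits a single metric evaluation across processes, which is the approach taken by the distributed framework reviewed later in the deck.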
Desired Testing Methodology • The measure of quality will be defined using Parallel Speedup and Parallel Efficiency • Parallel speedup is defined as $S_N = T_S / T_N$, where $T_S$ is the execution time of the best sequential algorithm and $T_N$ is the execution time on $N$ processors • Parallel efficiency is defined as $E_N = S_N / N$, where $N$ is the number of processors
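The two measures above translate directly into a small helper. The timings in the example are hypothetical values chosen only to show the arithmetic; they are not measurements from Nick or Optimus.

```python
def parallel_speedup(t_sequential, t_parallel):
    """Speedup: best sequential time divided by the time on N processors."""
    if t_parallel <= 0:
        raise ValueError("parallel execution time must be positive")
    return t_sequential / t_parallel


def parallel_efficiency(t_sequential, t_parallel, n_processors):
    """Efficiency: speedup divided by the number of processors."""
    if n_processors < 1:
        raise ValueError("need at least one processor")
    return parallel_speedup(t_sequential, t_parallel) / n_processors


# Hypothetical timings: 120 s sequentially, 20 s on 8 processors.
s8 = parallel_speedup(120.0, 20.0)
e8 = parallel_efficiency(120.0, 20.0, 8)
```

With these made-up numbers the speedup is 6 and the efficiency is 0.75, meaning a quarter of the available processor time is lost to communication, load imbalance, or sequential sections.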
Reviewed Registration Methods • Warfield’s Approach [3] • Cachier's demons algorithm [5] as used in [7] • Claimed to be precise and robust, with relatively low computation time • Structure makes it a good candidate for parallelization • Can be divided into three main “bricks”: • Oversampling needed by the pyramidal approach • Search for the matches • Parallel Gaussian filtering
Reviewed Registration Methods • Cachier's demons algorithm [5] as used in [6]
Reviewed Registration Methods • Acceleration of Genetic Algorithm with Parallel Processing with Application in Medical Image Registration (B. Laksanapanai, W. Withayachumnankul, C. Pintavirooj, P. Tosranon) • Very intriguing, but the paper is short and does not really explain how the method was implemented
Reviewed Registration Methods • Distributed Registration Framework as proposed by Michael Kuhn [1] • The metric calculation is organized in a master/slave design • The master process is responsible for data distribution as well as communication with the existing framework • Each slave is assigned a region of the fixed image and calculates an intermediate metric value • The master node coordinates all steps required to collect and process the partial results and passes the final result to the registration framework
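The division of labour described above can be sketched as follows. This is an illustrative reduction for a mean squares metric on 1-D data, not Kuhn's implementation: the function names are hypothetical, a thread pool stands in for the MPI processes, and the moving image is assumed to be already resampled onto the fixed image grid.

```python
from concurrent.futures import ThreadPoolExecutor


def split_regions(n_samples, n_workers):
    """Split range(n_samples) into contiguous, nearly equal regions."""
    base, extra = divmod(n_samples, n_workers)
    regions, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        regions.append((start, stop))
        start = stop
    return regions


def partial_metric(fixed, moving, region):
    """Worker side: squared-difference sum and sample count for one region."""
    start, stop = region
    squared = sum((fixed[i] - moving[i]) ** 2 for i in range(start, stop))
    return squared, stop - start


def distributed_mean_squares(fixed, moving, n_workers=3):
    """Master side: assign regions, collect the partial results,
    and reduce them to the final metric value."""
    regions = split_regions(len(fixed), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda r: partial_metric(fixed, moving, r), regions))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total / count


# Hypothetical data: the images differ in the last sample only.
fixed = [1, 2, 3, 4, 5, 6, 7, 8]
moving = [1, 2, 3, 4, 5, 6, 7, 10]
value = distributed_mean_squares(fixed, moving)
```

Each worker returns a sum and a count rather than a mean, so the master can combine regions of unequal size without bias. In an MPI setting the collection step would typically be a reduction across ranks.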
Reviewed Registration Methods • Implemented these concepts through: • DistributedImageToImageMetric • RegistrationCommunicator • The DistributedImageToImageMetric class is divided into a master part and a slave part, and is derived from the itk::ImageToImageMetric class • RegistrationCommunicator provides an interface for all communication tasks and uses MPI
Reviewed Registration Methods • The whole registration process consists of two stages: Initialization and Optimization • Initialization: distribute data to the nodes • Optimization: optimizers in ITK are iteration based • During each iteration, metric values and derivatives are requested from the metric function • When new values are required, the optimizer requests a metric value from the master; the master then asks the slaves to compute the partial values associated with their fixed regions and transmit them back to the master • The master processes the partial results, and the cycle repeats until the optimization is complete
Preliminary Results • Sequential Runs: MeanSquaresImageToImageMetric
Future Work • Implement a parallel image registration framework (with multi-core support) that can be attached to existing tools such as ITK • Thorough testing on both clusters • Using multiple cores in one node requires a new programming model • Forms of Data Decomposition