160 likes | 285 Views
Parallelization of System Matrix generation code. Mahmoud Abdallah Antall Fernandes. SPECT System. SPECT System. Inverse Cone. Back Projection. Filtered Back Projection is applying a ramp filter on the back projected image. Still widely used for its high speed and easy implementation.
E N D
Parallelization of System Matrix generation code Mahmoud Abdallah Antall Fernandes
Back Projection • Filtered Back Projection is applying a ramp filter on the back projected image. • Still widely used for its high speed and easy implementation. Ref figure: Tomographic Reconstruction of SPECT Data • Bill Amini, Magnus Björklund, Ron Dror, Anders Nygren oo
Maximum Likelihood-Expectation Maximization Algorithm • Is found to reduce noise in reconstruction iteratively • An iterative algorithm is used to solve the following linear problem • FX = P • P – vector of projection data • X – voxelized image • F – projection matrix operator • Needs a large number of iterations to reconstruct an image
EM Algorithm • The EM algorithm is given by • Summation over k is projection operation • Summation over j is the back projection operation
System Matrix • Maps the image space to the data space • Takes detector geometry as input • Generates detector data for every bin for each angle (usually there are 72 angles/frames)
System Matrix Algorithm for each angle DO // number of angles = 72 for each detector bin in U direction Do // bins: around 14 for each detector bin in V direction Do // bins: around 64 for each row in the inverse cone grid Do// <= 99 for each Column in the inverse cone grid Do //<= 99 for each voxel intersected the Ray Do calculate point response end end end end end end Number of loops = 72 x 14 x 64 x 99 x 99 = 632282112
System Matrix Parallelization Observation: At each angle, each bin’s calculations are independent from other bins’. Proposal: Parallelize all calculations for each angle. • E.g. use GPU.
Parallelized System Matrix Algorithm Host Program: for each angle DO Run all kernels for all bins at the same time end GPU Kernel: for each voxel intersected the Ray Do calculate attenuation and store it in SysMat end
SIMD (Architecture of GPU) From: (AMD) Advanced Micro Devices INC 2010 (Introduction to OpenCL Programming)
OpenCL • Based on ISO C99 with some extensions & restrictions • provides parallel computing using task-based and data-based parallelism Architecture • Host Program • Kernel
Program Architecture Host Program Executes on the host system Sends kernels to execute on OpenCL™ devices using command queue. Kernels Similar to C function. Executed on OpenCL™ devices ( GPU).