
Parallelization of System Matrix generation code

Mahmoud Abdallah, Antall Fernandes


Presentation Transcript


  1. Parallelization of System Matrix Generation Code • Mahmoud Abdallah, Antall Fernandes

  2. SPECT System

  3. SPECT System

  4. Inverse Cone

  5. Back Projection • Filtered back projection (FBP) combines back projection with a ramp filter, which compensates for the blurring that plain back projection introduces. • Still widely used because it is fast and easy to implement. Figure reference: "Tomographic Reconstruction of SPECT Data", Bill Amini, Magnus Björklund, Ron Dror, Anders Nygren.

  6. Maximum Likelihood-Expectation Maximization (ML-EM) Algorithm • Iterative reconstruction method found to reduce noise in the reconstructed image • Solves the following linear problem iteratively • FX = P • P: vector of projection data • X: voxelized image • F: projection matrix operator • Needs a large number of iterations to reconstruct an image

  7. EM Algorithm • The EM algorithm is given by [equation image not preserved in the transcript] • The summation over k is the projection operation • The summation over j is the back projection operation
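The update equation on this slide was an image and is missing from the transcript. As a hedged reconstruction, the standard ML-EM update in the notation of slide 6 is $X_i \leftarrow \frac{X_i}{\sum_j F_{ji}} \sum_j F_{ji} \frac{P_j}{\sum_k F_{jk} X_k}$, where the inner sum over voxels k is the projection and the outer sum over bins j is the back projection. The index convention (j for detector bins, i and k for voxels) is an assumption chosen to match the slide's two bullets, and the function name `mlem` and the toy matrix below are illustrative only. A minimal sketch of that update:

```python
def mlem(F, P, n_iter=50):
    """Standard ML-EM iteration for F X = P (sketch, not the authors' code).

    F: list of rows, one per detector bin j, each a list over voxels.
    P: measured projection value per detector bin.
    """
    n_bins, n_vox = len(F), len(F[0])
    X = [1.0] * n_vox  # strictly positive starting image
    # Sensitivity of each voxel: column sums of F.
    sens = [sum(F[j][i] for j in range(n_bins)) for i in range(n_vox)]
    for _ in range(n_iter):
        # Projection: summation over voxels k for every bin j.
        proj = [sum(F[j][k] * X[k] for k in range(n_vox)) for j in range(n_bins)]
        ratio = [P[j] / proj[j] if proj[j] > 0 else 0.0 for j in range(n_bins)]
        # Back projection: summation over bins j for every voxel i.
        back = [sum(F[j][i] * ratio[j] for j in range(n_bins)) for i in range(n_vox)]
        X = [X[i] * back[i] / sens[i] if sens[i] > 0 else 0.0 for i in range(n_vox)]
    return X


# Toy problem: 3 detector bins, 2 voxels, noise-free data from X = [2, 3].
F_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
P_toy = [2.0, 3.0, 5.0]
X_est = mlem(F_toy, P_toy, n_iter=50)
```

On this consistent toy problem the estimate approaches the true image; with real, noisy SPECT data many more iterations and a much larger F are involved, which is why generating F quickly matters.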

  8. System Matrix • Maps the image space to the data space • Takes the detector geometry as input • Generates detector data for every bin at each angle (usually 72 angles/frames)

  9. System Matrix Algorithm
     for each angle do                                   // number of angles = 72
       for each detector bin in the U direction do       // about 14 bins
         for each detector bin in the V direction do     // about 64 bins
           for each row in the inverse cone grid do      // <= 99
             for each column in the inverse cone grid do // <= 99
               for each voxel intersected by the ray do
                 calculate point response
               end
             end
           end
         end
       end
     end
     Number of loops (upper bound on rays traced) = 72 x 14 x 64 x 99 x 99 = 632,282,112
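The loop count quoted on this slide can be checked with a few lines of arithmetic. The sketch below uses the extents given in the slide's comments; the dictionary keys are illustrative names, and since the inverse cone grid is stated as "<= 99" per side the totals are upper bounds.

```python
import math

# Loop extents taken from the slide's comments (cone grid sizes are upper bounds).
loop_extents = {
    "angles": 72,
    "u_bins": 14,
    "v_bins": 64,
    "cone_rows": 99,
    "cone_cols": 99,
}

total_rays = math.prod(loop_extents.values())
bins_per_angle = loop_extents["u_bins"] * loop_extents["v_bins"]
rays_per_angle = total_rays // loop_extents["angles"]

print(f"rays in total:      {total_rays:,}")
print(f"bins per angle:     {bins_per_angle:,}")
print(f"rays per angle:     {rays_per_angle:,}")
```

Each of these rays still runs the innermost voxel loop, so the real amount of work is larger than the count above.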

  10. System Matrix Parallelization • Observation: at each angle, the calculations for each bin are independent of those for the other bins. • Proposal: parallelize all calculations within each angle, e.g. on a GPU.

  11. System Matrix Parallelization on GPU

  12. Parallelized System Matrix Algorithm
     Host program:
       for each angle do
         run the kernels for all bins at the same time
       end
     GPU kernel:
       for each voxel intersected by the ray do
         calculate attenuation and store it in SysMat
       end
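The host/kernel split above can be illustrated on the CPU. The sketch below is not OpenCL and not the authors' code: it uses a Python thread pool as a stand-in for the GPU, the toy sizes replace the slide's 72 x 14 x 64, and `kernel` returns a placeholder value instead of a real attenuation calculation. It only shows the structure: a sequential host loop over angles, with one independent work item per detector bin launched together.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from itertools import product

N_ANGLES, N_U, N_V = 4, 3, 5  # toy sizes; the slides use 72, about 14, about 64


def kernel(angle, u, v):
    """Stand-in for the GPU kernel: one work item per detector bin.

    A real kernel would loop over the voxels intersected by the bin's rays;
    this placeholder just returns a value derived from its indices.
    """
    return float(angle * 100 + u * 10 + v)  # hypothetical "attenuation"


def build_system_matrix():
    sys_mat = {}
    bins = list(product(range(N_U), range(N_V)))
    us, vs = zip(*bins)
    with ThreadPoolExecutor() as pool:
        for angle in range(N_ANGLES):  # host loop: sequential over angles
            # All bins of one angle are independent, so submit them together.
            results = pool.map(partial(kernel, angle), us, vs)
            for (u, v), value in zip(bins, results):
                sys_mat[(angle, u, v)] = value
    return sys_mat


sys_mat = build_system_matrix()
```

Because no bin reads another bin's result, the per-angle work can be mapped onto however many parallel units are available without synchronization inside an angle.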

  13. SIMD (GPU Architecture) • Figure source: Advanced Micro Devices, Inc. (AMD), "Introduction to OpenCL Programming", 2010

  14. OpenCL • Based on ISO C99 with some extensions and restrictions • Supports both task-parallel and data-parallel computing • Architecture: • Host program • Kernel

  15. Program Architecture • Host program: • Executes on the host system • Sends kernels to execute on OpenCL™ devices using a command queue • Kernels: • Similar to a C function • Executed on OpenCL™ devices (e.g. a GPU)

  16. Thank You
