Improvement of CT Slice Image Reconstruction Speed Using SIMD Technology

Improvement of CT Slice Image Reconstruction Speed Using SIMD Technology Xingxing Wu Yi Zhang Instructor: Prof. Yu Hen Hu Department of Electrical & Computer Engineering University of Wisconsin, Madison

Motivation • CT Slice Image Reconstruction is a very important part which will affect the reconstructed image quality and scanning speed • CT Slice Image Reconstruction is very time-consuming • Traditional methods for speedup: • Specially designed hardware • Parallel algorithm running on super computer • Explore a new method: SIMD implementation

Parallel-Beam FBP Image Reconstruction Algorithm • The Algorithm consists on three parts: • data rebinning: • data filtering • back-projection

Parallel-Beam FBP Image Reconstruction Algorithm • Projection: • Data Rebinning: • Data Filtering: • Data Backprojection:

CT Slice Image Reconstruction Is Very Time Consuming A Whole Head Spiral Scanning will generate several GB projection data

Function Profiling

Can FBP Algorithm Benefit from SIMD? • The Algorithm has the following features: • Small, highly repetitive loops that operate on sequential arrays of integers and floating-point values • Frequent multiplies and accumulates • Computation-intensive algorithms • Inherently parallel operations • Wide dynamic range, hence floating-point based • Regular memory access patterns • Data independent control flow

Analysis of Data Dynamic Range and Quantization Errors • Wide Dynamic Range • Relative Error Metric • 32-Bit Single-Precision Floating Point and SSE2

Updated Algorithm to Fit SIMD • Update the algorithm to eliminate some conditional branches • Reduce the on-the-fly calculations which are not suitable for the SIMD implementation

A4 A5 A6 A7 A0 A1 A2 A3 A0*B0+A4*B4 B4 A1*B1+A5*B5 B5 B6 A2*B2+A6*B6 B7 A3*B3+A7*B7 B0 B1 B2 B3 Parallel Implementation of Data Filtering In SIMD Rebinned Data * * * * * * * * Weight Filtered Data

E0 C0 F0 B0 G0 -0.5 D0 H0 A0 +0.5 H1 E1 A1 B1 F1 C1 -0.5 G1 +0.5 D1 B2 A2 +0.5 -0.5 D2 H2 C2 E2 G2 F2 D3 H3 A3 +0.5 G3 -0.5 E3 B3 C3 F3 Parallel Implementation of Backprojection in SIMD Index Calculation Index Floor (index) Ceil (index) (fetch data) Filtered Data Weight Reconstructed Image

Optimization of The Implementation • Optimize Memory Access • Ensure proper alignment to prevent data split across cache line boundary: data alignment, stack alignment, code alignment • Observe store-forwarding constraints • Optimize data structure layout and data locality to ensure efficient use of 64-byte cache line size and also reduce the frequency of memory loading and storing • Use prefetching cacheability instructions control appropriately • Minimize bus latency by segmenting the reads and writes into phases • Replace Branches with Logic Operations • Optimize Instruction Scheduling • Optimize the Parallelism • Loop Unrolling • Break dependence chains

Optimization of The Implementation • Optimize Instruction Selection • avoid longer latency instruction • avoid instructions that unnecessarily introduce dependence-related stalls • Optimize the Floating-point Performance • avoid exceeding the representable range • avoid change floating-point control/status register • enable flush-to-zero and DAZ mode

Improvement of Performance The differences of the reconstructed image pixel values between C implementation and SIMD implementation are less than 0.01

Improvement of CT Slice Image Reconstruction Speed Using SIMD Technology

Improvement of CT Slice Image Reconstruction Speed Using SIMD Technology

Presentation Transcript

Image Reconstruction

Multi-Slice CT

USING TECHNOLOGY FOR IMPROVEMENT

IMAGE RECONSTRUCTION

Image Reconstruction using Dynamic EPI Phase Correction

Sample CT Image

Image Reconstruction

Image Restoration and Reconstruction (Image Reconstruction from Projections)

Statistical image reconstruction

CT image testing

Image Reconstruction

Multi-Slice CT

Image Reconstruction

IMAGE RECONSTRUCTION

Siemens 16 Slice Ct Scanner

Image Reconstruction Methods

Sample CT Image

Super Resolution Image Reconstruction Using Kriging

Single Slice Spiral - Helical CT

Tomographic Image Reconstruction