Implementing Memory & Run Time Efficient Image Texture Classification using NVIDIA GPU SHREYAS PARNERKAR
Motivation
• Texture analysis is important in many computer image analysis applications that classify or segment images based on local spatial variations of intensity or color.
• Applications include industrial and biomedical surface inspection (e.g., for defects and disease), segmentation of satellite or aerial imagery, and segmentation of textured regions in document analysis.
• Most texture classification methods derive features from the output of large filter banks (13–48-dimensional feature spaces).
Motivation
• Tuzel et al. use image intensities together with first- and second-order derivatives of intensity in both the x and y directions, yielding a 5-dimensional feature space for texture classification.
• These features are used to compute covariance matrices via integral images (P and Q).
• Computing the integral images is computationally intensive because of the highly nested loops involved (a serial reference sketch follows).
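As a point of reference, here is a minimal sketch of the serial recurrence commonly used to build an integral image (summed-area table). All names (integralImageCPU, feat, ii, W, H) are illustrative, not taken from the slides; the same pattern would be applied per feature channel for the P and Q images.

```cpp
// CPU reference for an integral image: ii(x, y) holds the sum of all
// feature values up to and including pixel (x, y).
void integralImageCPU(const float* feat, float* ii, int W, int H) {
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) {
            float left   = (x > 0)          ? ii[y * W + (x - 1)]       : 0.0f;
            float up     = (y > 0)          ? ii[(y - 1) * W + x]       : 0.0f;
            float upleft = (x > 0 && y > 0) ? ii[(y - 1) * W + (x - 1)] : 0.0f;
            // Each cell depends on its left, upper, and upper-left
            // neighbors: this is what produces the dependence graph below.
            ii[y * W + x] = feat[y * W + x] + left + up - upleft;
        }
    }
}
```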
Dependence Graph
[Figure: dependence graph of the integral-image recurrence over the image grid, with axes labeled ROWS and COLUMNS]
GPU Utilization Concerns
• Such scheduling allows at most W or H elements to execute in parallel (a sketch follows below).
• At every other instant, the available parallelism is below that maximum.
• GPU utilization therefore drops, slowing execution because many threads sit idle.
• Such scheduling is hence a poor fit for a GPU implementation.
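The scheduling referenced above is presumably the anti-diagonal (wavefront) order implied by the dependence graph: all cells on one diagonal are mutually independent. A hypothetical CUDA sketch of that schedule (waveFrontStep, feat, ii are illustrative names) makes the utilization problem concrete: the d-th launch has only as many useful threads as the diagonal is long, ramping from 1 up to min(W, H) and back down.

```cpp
// One kernel launch per anti-diagonal d; cells (x, y) with x + y == d
// are independent because their left/up/up-left neighbors lie on
// diagonals d-1 and d-2, computed by earlier launches.
__global__ void waveFrontStep(const float* feat, float* ii,
                              int W, int H, int d) {
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    int x = max(0, d - H + 1) + t;   // walk along the d-th anti-diagonal
    int y = d - x;
    if (x >= W || y < 0 || y >= H) return;
    float left   = (x > 0)          ? ii[y * W + (x - 1)]       : 0.0f;
    float up     = (y > 0)          ? ii[(y - 1) * W + x]       : 0.0f;
    float upleft = (x > 0 && y > 0) ? ii[(y - 1) * W + (x - 1)] : 0.0f;
    ii[y * W + x] = feat[y * W + x] + left + up - upleft;
}

// Host side: W + H - 1 launches in total, most of them underpopulated.
// for (int d = 0; d < W + H - 1; ++d)
//     waveFrontStep<<<blocks, threads>>>(feat, ii, W, H, d);
```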
Memory Concerns
• Shared memory is limited to 4 kB, so the entire image cannot be placed in shared memory.
• Global memory is slow compared to shared memory.
• Uploading the entire image into global memory may also interfere with the graphics display (unverified).
• Instead, stage just the required data in shared memory (see the sketch below).
• In the worst case, however, the required data can be the entire image.
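A generic sketch of the staging pattern the last two bullets describe, not the authors' actual kernel: each block copies one tile of the image into shared memory, sized so it fits the stated 4 kB budget. TILE, img, out, and processTile are illustrative names.

```cpp
#define TILE 32  // 32 * 32 * sizeof(float) = 4 kB, the stated limit

// Stage a TILE x TILE patch in shared memory, then work on it there
// instead of re-reading slow global memory.
__global__ void processTile(const float* img, float* out, int W, int H) {
    __shared__ float tile[TILE][TILE];
    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < W && y < H)
        tile[threadIdx.y][threadIdx.x] = img[y * W + x];  // one coalesced read
    __syncthreads();
    // ... per-tile computation on tile[][] would go here ...
    if (x < W && y < H)
        out[y * W + x] = tile[threadIdx.y][threadIdx.x];
}
```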
Updated Dependence Graph
[Figure: the 2-D dependence graph decomposed into a row-wise pass plus a column-wise pass, which combine to give the full computation]
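Reading the "rows + columns" figure as the standard separable decomposition (an assumption, since only the axis labels survive): the 2-D prefix sum splits into a pass over independent rows followed by a pass over independent columns, so H (then W) threads stay busy for an entire pass instead of a varying diagonal count. A minimal sketch, with illustrative names:

```cpp
// Pass 1: running sum along each row; every row is independent.
__global__ void rowScan(float* ii, int W, int H) {
    int y = blockIdx.x * blockDim.x + threadIdx.x;
    if (y >= H) return;
    for (int x = 1; x < W; ++x)
        ii[y * W + x] += ii[y * W + (x - 1)];
}

// Pass 2: running sum down each column; every column is independent.
__global__ void colScan(float* ii, int W, int H) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    if (x >= W) return;
    for (int y = 1; y < H; ++y)
        ii[y * W + x] += ii[(y - 1) * W + x];
}

// Host side (assumed): copy the feature image into ii, then
// rowScan<<<...>>>(ii, W, H);  colScan<<<...>>>(ii, W, H);
```

Each per-thread scan is shown serially for clarity; it could itself be replaced by a parallel prefix sum for further speed-up.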
Results
[Figure: runtime results chart, with CPU over-head indicated]
In Conclusion…
• Implement parallel reduction for even more speed-up (in progress; a common form is sketched below).
• Use the computed P and Q integral images to calculate covariance (can be done on the CPU).
• Read data from actual images (currently, random sample data is generated).
• Compare memory usage for the CPU vs. GPU implementations.
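For reference, one common form of the parallel reduction mentioned above, sketched as a shared-memory tree sum (reduceSum, in, partial are illustrative names; launch with blockDim.x * sizeof(float) bytes of dynamic shared memory):

```cpp
// Each block reduces up to 2 * blockDim.x elements to one partial sum;
// a second launch (or a CPU pass) combines the per-block partials.
__global__ void reduceSum(const float* in, float* partial, int n) {
    extern __shared__ float s[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x * 2 + tid;
    float v = 0.0f;
    if (i < n)              v += in[i];
    if (i + blockDim.x < n) v += in[i + blockDim.x];  // first add during load
    s[tid] = v;
    __syncthreads();
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) s[tid] += s[tid + stride];  // halve active threads
        __syncthreads();
    }
    if (tid == 0) partial[blockIdx.x] = s[0];
}
```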