Softassign and EM-ICP on GPU

Softassign and EM-ICP on GPU Toru Tamaki, Miho Abe, Bisser Raytchev, Kazufumi Kaneda 19th Nov. 2010

Contribution of this talk • Fast GPU implementations of registration algorithms for 3D point sets. • Softassign [Gold et al., 1998] • EM-ICP [Granger et al., 2002] • (Weighted) Horn’s method [Horn, 1987] • So, what is “registration”?

What is “Registration” or “Alignment”? Image registration (figure: a set of images)

3D registration algorithm • Input • Two point sets: X and Y • Output • Rotation matrix R • Translation vector t

Algorithms for registration

Horn’s method: correspondence is known. (Figure: X and Y with known correspondence, versus X and Y with unknown correspondence.)

Horn’s method: correspondence is known. 1. Compute centers 2. Centering 3. Compute the correlation matrix of the centered points 4. Compute 1st eigenvector: quaternion 5. Convert to rotation matrix
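The five steps of Horn’s method can be sketched on the CPU as follows. This is a minimal illustration in plain Python, not the GPU code from the talk: the function name `horn_align`, the shifted power iteration used to find the first eigenvector, and the fixed iteration count are all assumptions made for this sketch.

```python
import math

def horn_align(X, Y, iterations=500):
    """Return (R, t) such that Y[i] is approximately R * X[i] + t."""
    n = len(X)
    # Steps 1-2: compute the centers, then center both point sets.
    cx = [sum(p[k] for p in X) / n for k in range(3)]
    cy = [sum(p[k] for p in Y) / n for k in range(3)]
    Xc = [[p[k] - cx[k] for k in range(3)] for p in X]
    Yc = [[p[k] - cy[k] for k in range(3)] for p in Y]
    # Step 3: 3x3 correlation matrix S[a][b] = sum_i Xc[i][a] * Yc[i][b].
    S = [[sum(x[a] * y[b] for x, y in zip(Xc, Yc)) for b in range(3)]
         for a in range(3)]
    (Sxx, Sxy, Sxz), (Syx, Syy, Syz), (Szx, Szy, Szz) = S
    # Symmetric 4x4 matrix whose first eigenvector is the unit quaternion.
    N = [
        [Sxx + Syy + Szz, Syz - Szy, Szx - Sxz, Sxy - Syx],
        [Syz - Szy, Sxx - Syy - Szz, Sxy + Syx, Szx + Sxz],
        [Szx - Sxz, Sxy + Syx, -Sxx + Syy - Szz, Syz + Szy],
        [Sxy - Syx, Szx + Sxz, Syz + Szy, -Sxx - Syy + Szz],
    ]
    # Step 4: eigenvector of the largest eigenvalue by power iteration.
    # The shift (a Gershgorin bound) makes all eigenvalues non-negative.
    # Assumption of this sketch: the start vector is not orthogonal to
    # the solution and the point set is not degenerate.
    shift = max(sum(abs(v) for v in row) for row in N)
    q = [1.0, 0.7, 0.4, 0.1]
    for _ in range(iterations):
        q = [sum(N[i][j] * q[j] for j in range(4)) + shift * q[i]
             for i in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    # Step 5: convert the unit quaternion (w, x, y, z) to a rotation matrix.
    w, x, y, z = q
    R = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    # The translation maps the rotated center of X onto the center of Y.
    t = [cy[i] - sum(R[i][k] * cx[k] for k in range(3)) for i in range(3)]
    return R, t
```

A production implementation would normally call a symmetric eigensolver instead of the power iteration; the loop is used here only to keep the sketch free of dependencies.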

ICP: correspondence is unknown. • For each point in X, find the closest (nearest) point in Y. • Put that point as its correspondence. • Estimate the rotation and translation by Horn’s method with X and the matched points. • Repeat. Fast, but easy to fail due to hard correspondence.
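The hard-correspondence step of ICP can be illustrated with a brute-force nearest-neighbor search. This is a minimal sketch in plain Python; the function name `closest_points` is hypothetical, and practical implementations usually replace the linear scan with a spatial index such as a k-d tree.

```python
def closest_points(X, Y):
    """For each point in X, return its nearest point in Y (hard correspondence)."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Each x is matched to exactly one y; several x may share the same y.
    return [min(Y, key=lambda y: squared_distance(x, y)) for x in X]
```

Because every point commits to a single match, one wrong match in an early iteration can pull the estimate away from the correct alignment, which is the failure mode that the soft-correspondence methods address.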

Softassign: soft correspondence. • Compute the soft correspondence matrix (GPU!) • Each row and column should be normalized to 1 by Sinkhorn iterations (GPU!) • Estimate the rotation and translation by weighted Horn’s method (GPU!) • Repeat

Sinkhorn iterations: each row and each column should be normalized to sum up to 1. Alternate row normalization and column normalization, and repeat until convergence.
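The alternating normalization can be sketched in a few lines. This is a plain-Python illustration under stated assumptions: the function name `sinkhorn` is hypothetical, all entries are assumed to be positive, and a fixed number of iterations replaces a convergence test. Any slack row or column that a full Softassign formulation may use is omitted.

```python
def sinkhorn(M, iterations=50):
    """Alternately normalize rows and columns of a positive matrix."""
    M = [row[:] for row in M]  # work on a copy
    for _ in range(iterations):
        # Row normalization: divide each row by its sum.
        for row in M:
            s = sum(row)
            for j in range(len(row)):
                row[j] /= s
        # Column normalization: divide each column by its sum.
        for j in range(len(M[0])):
            s = sum(row[j] for row in M)
            for row in M:
                row[j] /= s
    return M
```

After the last pass the column sums are 1 up to rounding, and for a square positive matrix the row sums approach 1 as the iterations proceed.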

Sinkhorn.GPU (row normalization) • Using sgemv of CUBLAS (a matrix-vector product) • Using CUDA kernel: row-wise division • Column normalization is done in the same way.

Weighted Horn’s method • Step 3: normal version and weighted version • Using CUBLAS sgemv twice.

Centering.GPU (weighted version) • CUDA kernel • Weighted sum • CUBLAS sasum (used twice) • Weighted center • Same for the other point set.
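A weighted center is the weighted sum of the points divided by the sum of the weights. The sketch below shows that arithmetic in plain Python; the function name `weighted_center` is hypothetical, and the use of absolute values mirrors what a BLAS `sasum` call returns (the sum of absolute values), which equals the plain sum when all weights are non-negative.

```python
def weighted_center(points, weights):
    """Return sum_i w_i * p_i / sum_i |w_i| for 3D points."""
    # Sum of absolute values, the quantity a BLAS asum routine computes.
    total = sum(abs(w) for w in weights)
    return [sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(3)]
```

For example, `weighted_center([[0, 0, 0], [2, 0, 0]], [1, 3])` places the center three quarters of the way toward the second point.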

Pipeline of Softassign.GPU (split between CPU and GPU) • Compute the correspondence matrix with CUDA kernel • Sinkhorn.GPU • Centering.GPU • Solve eigenvalue problem • Weighted Horn’s method

EM-ICP: soft correspondence. • Each row is normalized once (GPU!) • Pseudo correspondence (GPU!) • Estimate the rotation and translation by weighted Horn’s method (GPU!) • Repeat

Row normalization on GPU Using sgemv of CUBLAS Not normalized yet.

Row normalization on GPU Using CUDA kernel Row-wise division + sqrt Now normalized.

Computing weights Using sgemv of CUBLAS Now normalized.

Pseudo correspondence CUBLAS sgemv Now normalized. Centering: same as Softassign.GPU
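A pseudo correspondence replaces the single nearest point of ICP with a weighted average of all candidate points. The sketch below is a generic EM-ICP-style illustration in plain Python, not the exact formulation of the talk: the Gaussian weight, the parameter `sigma2`, and the function name `pseudo_correspondences` are assumptions, and the square-root step and any outlier term are left out.

```python
import math

def pseudo_correspondences(X, Y, sigma2):
    """For each x, return the weighted mean of Y with Gaussian weights."""
    result = []
    for x in X:
        # Unnormalized weight of every y for this x.
        w = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / sigma2)
             for y in Y]
        # Row normalization (done once): the weights of this row sum to 1.
        # Assumption: at least one weight is non-zero.
        s = sum(w)
        result.append([sum(wj * y[k] for wj, y in zip(w, Y)) / s
                       for k in range(3)])
    return result
```

With a large `sigma2` every pseudo correspondence approaches the mean of Y; with a small `sigma2` it approaches the nearest point, which recovers the hard correspondence of ICP.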

Weighted Horn’s method • Step 3, weighted version (not efficient) • Step 3, weighted version (2 steps): CUDA kernel, then CUBLAS sgemm
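One plausible reading of the two-step version is sketched below in plain Python: an element-wise pass first scales each centered point by its weight, and a single matrix product then forms the 3x3 correlation matrix. The function name `weighted_correlation` and the mapping of the two steps to a CUDA kernel and to `sgemm` are assumptions of this sketch, since the slide equations are not available here.

```python
def weighted_correlation(Xc, Yc, w):
    """Return S[a][b] = sum_i w[i] * Xc[i][a] * Yc[i][b] in two steps."""
    # Step 1 (element-wise; the role a CUDA kernel could play):
    # scale each centered point of X by its weight.
    Xw = [[wi * v for v in x] for wi, x in zip(w, Xc)]
    # Step 2 (matrix product; the role sgemm could play): S = Xw^T * Yc.
    return [[sum(xw[a] * y[b] for xw, y in zip(Xw, Yc)) for b in range(3)]
            for a in range(3)]
```

Splitting the work this way turns the weighted sum into one dense matrix product, which is the kind of operation a BLAS library is tuned for.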

Pipeline of EM-ICP.GPU (split between CPU and GPU) • Compute the correspondence matrix with CUDA kernel • Row normalization on GPU • Centering.GPU • Solve eigenvalue problem • 2-step weighted Horn’s method

Computing time over different numbers of points • GPU: GeForce 8800GT • CPU: Intel Core2 Quad + OpenMP (4 cores) • Successfully aligned 5000 points in less than 7 seconds. • Slightly faster, but failed.

Summary • 3D registration algorithms implemented on a GPU: • Softassign, • EM-ICP, • Weighted Horn’s method. • EM-ICP.GPU is • able to align 5000 points within 7 seconds, • 60 times faster than EM-ICP.CPU, • more robust than ICP.CPU. • Code, binary, and movies are available at: • http://home.hiroshima-u.ac.jp/tamaki/study/cuda_softassign_emicp/

Limitations • Number of points • Should be less than 8000 for GeForce 8800GT with 512MB memory. • More memory, more points. • Stopping condition • Requires storing the whole matrix and comparing it with the previous one: inefficient. • Hence, currently, the number of iterations is fixed.
