Softassign and EM-ICP on GPU. Toru Tamaki, Miho Abe, Bisser Raytchev, Kazufumi Kaneda. 19th Nov. 2010. Contribution of this talk: fast GPU implementations of registration algorithms for 3D point sets. Softassign [Gold et al., 1998], EM-ICP [Granger et al., 2002].
Softassign and EM-ICP on GPU Toru Tamaki, Miho Abe, Bisser Raytchev, Kazufumi Kaneda 19th Nov. 2010
Contribution of this talk • Fast GPU implementations of registration algorithms for 3D point sets. • Softassign [Gold et al., 1998] • EM-ICP [Granger et al., 2002] • (Weighted) Horn’s method [Horn, 1987] • So, what is “registration”?
What is “Registration” or “Alignment”? • Image registration: a set of images
3D registration algorithm • Input • Two point sets: X and Y • Output • Rotation matrix • Translation vector
Horn’s method: correspondence is known. • Known correspondence between X and Y • Unknown correspondence between X and Y?
Horn’s method: correspondence is known. • 1. Compute centers • 2. Centering • 3. [equation not recovered] • 4. Compute 1st eigenvector: quaternion • 5. Convert to rotation matrix
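The five steps can be sketched in plain Python. This is a minimal sketch, not the authors' implementation: the names `horn`, `xs`, `ys`, `R` and `t` are hypothetical, the 4x4 matrix `N` follows the usual quaternion form of Horn's closed-form solution, and the eigenvector is found by a shifted power iteration chosen here only to keep the example dependency-free (the slides do not say which eigen-solver is used).

```python
import math

def horn(xs, ys, iters=500):
    """Estimate R (3x3) and t (3,) such that ys[i] is close to R * xs[i] + t."""
    n = len(xs)
    # Steps 1-2: compute the two centers and subtract them from the points.
    cx = [sum(p[k] for p in xs) / n for k in range(3)]
    cy = [sum(p[k] for p in ys) / n for k in range(3)]
    xc = [[p[k] - cx[k] for k in range(3)] for p in xs]
    yc = [[p[k] - cy[k] for k in range(3)] for p in ys]
    # Step 3: cross-covariance S[a][b] = sum_i xc[i][a] * yc[i][b], then the
    # symmetric 4x4 matrix N of the quaternion formulation.
    S = [[sum(x[a] * y[b] for x, y in zip(xc, yc)) for b in range(3)]
         for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S
    N = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Step 4: eigenvector of the largest eigenvalue, here by power iteration
    # on N + shift*I (the shift makes every eigenvalue non-negative).
    shift = sum(abs(v) for row in N for v in row)
    q = [1.0, 0.5, 0.25, 0.125]  # arbitrary start vector (an assumption)
    for _ in range(iters):
        q = [sum(N[i][j] * q[j] for j in range(4)) + shift * q[i]
             for i in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    # Step 5: unit quaternion (q0, qx, qy, qz) to rotation matrix.
    q0, qx, qy, qz = q
    R = [
        [q0*q0 + qx*qx - qy*qy - qz*qz, 2*(qx*qy - q0*qz), 2*(qx*qz + q0*qy)],
        [2*(qy*qx + q0*qz), q0*q0 - qx*qx + qy*qy - qz*qz, 2*(qy*qz - q0*qx)],
        [2*(qz*qx - q0*qy), 2*(qz*qy + q0*qx), q0*q0 - qx*qx - qy*qy + qz*qz],
    ]
    # Translation that maps the center of xs onto the center of ys.
    t = [cy[i] - sum(R[i][k] * cx[k] for k in range(3)) for i in range(3)]
    return R, t
```

The sketch assumes non-degenerate input: if all points coincide, `N` is zero and the normalization divides by zero. A production version would call a proper symmetric eigen-solver instead of power iteration.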
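ICP: correspondence is unknown. • For each point, find the closest (nearest) point in the other set • Take that point as its correspondence
ICP: correspondence is unknown. • For each point, find the closest (nearest) point in the other set • Take that point as its correspondence • Horn’s method with the matched points: estimate the rotation and translation
ICP: correspondence is unknown. • For each point, find the closest (nearest) point in the other set • Take that point as its correspondence • Horn’s method with the matched points: estimate the rotation and translation • Repeat • Fast, but prone to failure because of hard correspondence.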
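The hard-correspondence step of ICP can be illustrated with a brute-force nearest-neighbour search. This is only a sketch: `closest_points` is a hypothetical helper, and the direction of the search (from each point of `xs` into `ys`) is an assumption, since the symbols were lost from the slide.

```python
def closest_points(xs, ys):
    """For each point in xs, return the index of its nearest point in ys."""
    def sqdist(p, q):
        # Squared Euclidean distance; the square root is not needed to rank.
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [min(range(len(ys)), key=lambda j: sqdist(x, ys[j])) for x in xs]

xs = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ys = [(0.9, 0.1, 0.0), (0.1, -0.1, 0.0), (5.0, 5.0, 5.0)]
matches = closest_points(xs, ys)  # [1, 0]
```

Each point gets exactly one partner, so a single wrong match pulls the estimate as strongly as a right one; that is the "hard correspondence" the slide blames for failures.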
Softassign: soft correspondence. GPU! • Each row and column of the correspondence matrix should be normalized to 1 by Sinkhorn iterations • Weighted Horn’s method with the soft correspondences: estimate the rotation and translation • Repeat
Sinkhorn iterations • Each row and column should be normalized to sum to 1 by Sinkhorn iterations. • Repeat row and column normalization until convergence.
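The alternation of row and column normalization can be sketched as follows. This is a minimal sketch assuming a square matrix with strictly positive entries; `sinkhorn` and `iters` are hypothetical names, a fixed iteration count stands in for a convergence test, and the extra outlier row and column used by the original Softassign formulation are left out.

```python
def sinkhorn(m, iters=50):
    """Alternately normalize the rows and columns of a positive square matrix."""
    m = [row[:] for row in m]  # work on a copy
    n_cols = len(m[0])
    for _ in range(iters):
        # Row normalization: divide every row by its sum.
        for row in m:
            s = sum(row)
            for j in range(n_cols):
                row[j] /= s
        # Column normalization: divide every column by its sum.
        col_sums = [sum(row[j] for row in m) for j in range(n_cols)]
        for row in m:
            for j in range(n_cols):
                row[j] /= col_sums[j]
    return m

balanced = sinkhorn([[1.0, 2.0], [3.0, 4.0]])
```

After enough iterations every row and every column of `balanced` sums to approximately 1.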
Sinkhorn.GPU (row normalization) • Using sgemv of CUBLAS • Each row and column should be normalized to 1 by Sinkhorn iterations
Sinkhorn.GPU (row normalization) • Using CUDA kernel: row-wise division • Each row and column should be normalized to 1 by Sinkhorn iterations • Column normalization is done in the same way.
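One way to read the two slides above is that the row sums come from a matrix-vector product with a vector of ones (the operation a gemv call performs), followed by an element-wise division. The sketch below mimics that split on the CPU; `matvec` and `normalize_rows` are hypothetical stand-ins, not CUBLAS calls, and the use of a ones vector is an assumption about the lost equation.

```python
def matvec(a, x):
    """Plain-Python stand-in for what a BLAS gemv computes: y = A x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def normalize_rows(m):
    ones = [1.0] * len(m[0])
    row_sums = matvec(m, ones)  # step 1: row sums as a matrix-vector product
    # Step 2: row-wise division (on the GPU, one thread per matrix element).
    return [[v / s for v in row] for row, s in zip(m, row_sums)]

normalized = normalize_rows([[1.0, 3.0], [2.0, 2.0]])
# normalized == [[0.25, 0.75], [0.5, 0.5]]
```

Column normalization follows the same pattern with the transposed product.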
Weighted Horn’s method • Step 3, normal version • Step 3, weighted version • Using CUBLAS sgemv twice.
Centering.GPU (weighted version) • CUDA kernel • Weighted sum: CUBLAS sasum (used twice) • Weighted center • Same as for the other point set
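Weighted centering can be sketched as a weighted mean followed by a subtraction. The per-point `weights` below are an assumption (for example, the row sums of the correspondence matrix), and `weighted_center` and `center` are hypothetical names. BLAS `asum` returns the sum of absolute values, which equals the plain sum when all weights are non-negative.

```python
def weighted_center(points, weights):
    """Weighted mean of 3D points: sum_i w_i * p_i / sum_i w_i."""
    total = sum(weights)  # what an asum call gives for non-negative weights
    return [sum(w * p[k] for p, w in zip(points, weights)) / total
            for k in range(3)]

def center(points, c):
    """Subtract the center c from every point."""
    return [[p[k] - c[k] for k in range(3)] for p in points]

pts = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
c = weighted_center(pts, [1.0, 3.0])  # [3.0, 0.0, 0.0]
centered = center(pts, c)
```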
Pipeline of Softassign.GPU (split between CPU and GPU) • Compute the correspondence matrix with CUDA kernel • Sinkhorn.GPU • Centering.GPU and weighted Horn’s method • Solve eigenvalue problem
EM-ICP: soft correspondence. GPU! • Each row is normalized once. • Pseudo correspondence • Weighted Horn’s method with the pseudo correspondences: estimate the rotation and translation • Repeat
Row normalization on GPU • Using sgemv of CUBLAS • Not normalized yet.
Row normalization on GPU • Using CUDA kernel: row-wise division + sqrt • Now normalized.
Computing weights • Using sgemv of CUBLAS • Now normalized.
Pseudo correspondence • CUBLAS sgemv • Now normalized. • Centering: same as in Softassign.GPU
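The idea of a pseudo correspondence can be sketched as a weighted average of the target points, using row-normalized weights. The Gaussian weight `exp(-d^2 / sigma^2)`, the parameter `sigma`, and the function name are assumptions made for illustration; the exact weighting on the slides (including the square root mentioned above) was lost in extraction and is not reproduced here.

```python
import math

def pseudo_correspondences(xs, ys, sigma):
    """For each x, return the weighted mean of ys under row-normalized weights."""
    out = []
    for x in xs:
        # Unnormalized weights: one row of the correspondence matrix.
        w = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (sigma * sigma))
             for y in ys]
        s = sum(w)  # row sum; the row is normalized exactly once
        # Weighted mean of ys: on the GPU this is a matrix-vector product
        # per coordinate.
        out.append([sum(wj * y[k] for wj, y in zip(w, ys)) / s
                    for k in range(3)])
    return out

ys = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
mid = pseudo_correspondences([(0.0, 0.0, 0.0)], ys, sigma=1.0)[0]
near = pseudo_correspondences([(0.9, 0.0, 0.0)], ys, sigma=0.5)[0]
```

A point halfway between two targets is pulled to their midpoint, while a point close to one target is pulled almost entirely to it. If `sigma` is tiny relative to the distances, every weight can underflow to zero and the division fails; the sketch does not guard against that.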
Weighted Horn’s method • Step 3, weighted version (not efficient) • Step 3, weighted version (2 steps): CUDA kernel and CUBLAS sgemm
Pipeline of EM-ICP.GPU (split between CPU and GPU) • Compute the correspondence matrix with CUDA kernel • Row normalization on GPU • Centering.GPU and 2-step weighted Horn’s method • Solve eigenvalue problem
Computing time over different numbers of points • GPU: GeForce 8800 GT • CPU: Intel Core2 Quad + OpenMP (4 cores) • Successfully aligned 5000 points in less than 7 seconds. • Slightly faster, but failed.
Summary • 3D registration algorithms implemented on a GPU: • Softassign, • EM-ICP, • Weighted Horn’s method. • EM-ICP.GPU is • able to align 5000 points within 7 seconds, • 60 times faster than EM-ICP.CPU, • more robust than ICP.CPU. • Code, binary, and movies are available at: • http://home.hiroshima-u.ac.jp/tamaki/study/cuda_softassign_emicp/
Limitations • Number of points • Should be less than 8000 for GeForce 8800 GT with 512MB memory. • More memory, more points. • Stopping condition • Requires storing whole matrices and comparing them with the previous ones: inefficient. • Hence, currently, the number of iterations is fixed.
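A rough memory estimate makes both limits plausible. The sketch assumes that a dense correspondence matrix of single-precision values (4 bytes each) dominates GPU memory; `matrix_megabytes` is a hypothetical helper, and the real memory layout of the implementation may differ.

```python
def matrix_megabytes(n_x, n_y, bytes_per_value=4):
    """Size of a dense n_x-by-n_y matrix in MiB (2**20 bytes)."""
    return n_x * n_y * bytes_per_value / 2**20

one_copy = matrix_megabytes(8000, 8000)  # about 244 MiB
two_copies = 2 * one_copy                # about 488 MiB
```

Under these assumptions one 8000 x 8000 matrix already takes about half of a 512 MB card, and keeping a previous copy for a convergence test would nearly fill it.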