
Low Complexity Keypoint Recognition and Pose Estimation Vincent Lepetit


Presentation Transcript


  1. Low Complexity Keypoint Recognition and Pose Estimation. Vincent Lepetit

  2. Real-Time 3D Object Detection Runs at 15 Hz

  3. Keypoint Recognition. The general approach [Lowe, Matas, Mikolajczyk] is a particular case of classification: one class per keypoint, i.e. the set of the keypoint's possible appearances under various perspective, lighting, and noise conditions; pre-processing to make the actual classification easier; nearest neighbor classification, i.e. a search in the database.

  4. Training phase: a classifier is built during training and used at run-time to recognize the keypoints.

  5. A New Classifier: Ferns. Joint Work with Mustafa Özuysal

  6. We are looking for $\hat{c} = \operatorname{argmax}_c P(C = c \mid \text{patch})$. If the patch can be represented by a set of image features $\{f_j\}$, this is $\operatorname{argmax}_c P(C = c \mid f_1, \ldots, f_N)$, which is proportional to $P(f_1, \ldots, f_N \mid C = c)\,P(C = c)$, but a complete representation of the joint distribution is infeasible. Naive Bayesian ignores the correlation: $P(f_1, \ldots, f_N \mid C = c) \approx \prod_j P(f_j \mid C = c)$. Compromise: group the features into $M$ ferns of $S$ features each, $P(f_1, \ldots, f_N \mid C = c) \approx \prod_{k=1}^{M} P(F_k \mid C = c)$.
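
To make the compromise concrete, a back-of-the-envelope count of the parameters per class (the values $N = 300$, $M = 30$, $S = 10$ are illustrative, not from the slides): the full joint distribution over $N$ binary features needs $2^{N} - 1 \approx 2^{300}$ entries, which is hopeless; Naive Bayes needs only $N = 300$ entries but throws away every correlation; ferns need $M(2^{S} - 1) \approx 3 \times 10^{4}$ entries while still modelling the correlations among the $S$ features inside each fern.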

  7. Presentation on an Example

  8. Ferns: Training. The tests compare the intensities of two pixels around the keypoint: $f_j = 1$ if $I(\mathbf{p}_{j,1}) < I(\mathbf{p}_{j,2})$, and $0$ otherwise. They are invariant to light changes by any monotonically increasing function. Training estimates the posterior probabilities $P(F_k \mid C = c)$ of the test outcomes for each fern and each class.

  9. Ferns: Training. [Example: each training patch of a class is pushed through the ferns; in each fern, the S binary test outcomes (e.g. 011) form an index, and the count of the corresponding (index, class) bin is incremented.]
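
A minimal sketch of this training update, written to match the recognition code on slide 27; the counting structure and the add-one (uniform prior) normalization are my assumptions, not taken from the slides.

  #include <vector>
  #include <cmath>

  // Update the fern counts with one labelled training patch (sketch).
  // counts[k][index][c] counts how often class c produced this index in fern k;
  // it must be pre-sized to M x 2^S x H.
  void train_patch(const unsigned char *K,   // pointer to the training patch
                   int c,                    // class label of the patch
                   const int *D,             // 2*S pixel offsets per fern
                   int M, int S,
                   std::vector<std::vector<std::vector<int>>> &counts)
  {
      for (int k = 0; k < M; k++) {
          int index = 0;
          const int *d = D + k * 2 * S;
          for (int j = 0; j < S; j++) {          // same binary tests as at run-time
              index <<= 1;
              if (*(K + d[0]) < *(K + d[1])) index++;
              d += 2;
          }
          counts[k][index][c]++;                 // increment the bin for (fern, index, class)
      }
  }

  // After training, turn the counts into log-posteriors; the +1 / +2^S terms are a
  // uniform prior so that bins never seen during training do not give log(0).
  double log_posterior(int count_k_index_c, int total_count_k_c, int S)
  {
      return std::log((count_k_index_c + 1.0) / (total_count_k_c + (1 << S)));
  }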

  10. Ferns: Training

  11. Ferns: Training Results

  12. Ferns: Recognition

  13. It Really Works

  14. Ferns outperform Trees. 500 classes, no orientation or perspective correction. Ferns responses are combined multiplicatively (Naive Bayesian rule); Trees responses are combined additively (average). [Plot: recognition rate vs. number of structures, for FERNS and TREES.]

  15. We Can Use Random Tests: Optimized Locations versus Random Locations. Comparison of the recognition rates for 200 keypoints: tests chosen by information-gain optimization versus random tests. [Plot: recognition rate vs. number of trees.]

  16. We Can Use Random Tests. For a small number of classes we can try several tests and retain the best one according to some criterion.

  17. We Can Use Random Tests. For a small number of classes we can try several tests and retain the best one according to some criterion. When the number of classes is large, any test does a decent job:

  18. Another Graphical Interpretation

  19. Another Graphical Interpretation

  20. Another Graphical Interpretation

  21. Another Graphical Interpretation

  22. Another Graphical Interpretation

  23. Another Graphical Interpretation

  24. We Can Use Random Tests: Why It Is Interesting. Building the ferns takes no time (except for the estimation of the posterior probabilities); it simplifies the classifier structure; and it allows incremental learning.

  25. Comparison with SIFT: Recognition rate. [Plot: number of inliers vs. frame index, for FERNS and SIFT.]

  26. Comparison with SIFT: Computation time. SIFT: 1 ms to compute the descriptor of a keypoint (without including the convolutions); FERNS: 13.5 microseconds to classify one keypoint into 200 classes.

  27. Keypoint Recognition in Ten Lines of Code

   1: for(int i = 0; i < H; i++) P[i] = 0.;
   2: for(int k = 0; k < M; k++) {
   3:   int index = 0, * d = D + k * 2 * S;
   4:   for(int j = 0; j < S; j++) {
   5:     index <<= 1;
   6:     if (*(K + d[0]) < *(K + d[1]))
   7:       index++;
   8:     d += 2; }
   9:   p = PF + k * shift2 + index * shift1;
  10:   for(int i = 0; i < H; i++) P[i] += p[i]; }

  Very simple to implement; no need for orientation nor perspective correction; (almost) no parameters to tune; very fast.
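
For readers who want to run the ten lines, here is a self-contained sketch of the same loop; the surrounding declarations (H classes, M ferns, S tests per fern, the offset table D, the patch pointer K, the log-posterior table PF and its strides shift1 and shift2) are my assumptions about what the slide leaves implicit, not code from the talk.

  #include <vector>
  #include <algorithm>

  // Classify one keypoint patch with M ferns of S binary tests each (sketch).
  // PF is assumed to store log P(F_k = index | C = c), laid out as [fern][index][class],
  // so that summing over ferns implements the Naive Bayesian (multiplicative) rule.
  int classify_keypoint(const unsigned char *K,   // pointer to the keypoint's patch
                        const int *D,             // 2*S pixel offsets per fern
                        const float *PF,          // log-posterior table, size M * 2^S * H
                        int H, int M, int S,
                        std::vector<float> &P)    // per-class scores, size >= H
  {
      const int shift1 = H;                 // stride from one index to the next
      const int shift2 = (1 << S) * H;      // stride from one fern to the next
      for (int i = 0; i < H; i++) P[i] = 0.f;
      for (int k = 0; k < M; k++) {
          int index = 0;
          const int *d = D + k * 2 * S;
          for (int j = 0; j < S; j++) {
              index <<= 1;                             // build the S-bit index
              if (*(K + d[0]) < *(K + d[1])) index++;  // one intensity comparison per test
              d += 2;
          }
          const float *p = PF + k * shift2 + index * shift1;
          for (int i = 0; i < H; i++) P[i] += p[i];    // accumulate log-posteriors
      }
      // The recognized class is the one with the highest accumulated score.
      return int(std::max_element(P.begin(), P.begin() + H) - P.begin());
  }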

  28. Ferns Tuning. The number of ferns and the number of tests per fern can be tuned to adapt to the hardware, in terms of CPU power and memory size.
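
As an illustration of that trade-off (the numbers are mine, not from the slides): with $H$ classes, $M$ ferns, and $S$ tests per fern, the classifier stores $M \cdot 2^{S} \cdot H$ posterior values and performs $M \cdot S$ pixel comparisons plus $M \cdot H$ additions per keypoint. For $H = 200$, $M = 30$, $S = 10$ this is about $6.1 \times 10^{6}$ stored values (roughly 24 MB as 4-byte floats), 300 comparisons, and 6000 additions per keypoint; dropping $S$ to 8 divides the memory by 4 without changing the number of additions.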

  29. Feature Harvesting: estimate the posterior probabilities from a training video sequence.

  30. Feature Harvesting. With the ferns, we can easily: - add a class; - remove a class; - add samples of a class to refine the classifier. This gives incremental learning: detect the object in the current frame, then use the matches as new training examples to update the classifier. There is no need to store image patches, and we can select the keypoints the classifier can recognize.

  31. Test Sequence

  32. Handling Light Changes

  33. Low Complexity Keypoint Recognition and Pose Estimation

  34. EPnP: An Accurate Non-Iterative O(n) Solution to the PnP Problem. Joint Work with Francesc Moreno-Noguer

  35. The Perspective-n-Point (PnP) Problem. Known: n 2D/3D correspondences and the internal parameters. Unknown: the rotation and translation. Solutions exist for the specific cases n = 3 [...], n = 4 [...], n = 5 [...], and for the general case [...]. How to take advantage of the internal parameters?
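
Stated as an equation (my formalization of the slide, with $A$ the matrix of internal parameters): find $R$ and $\mathbf{t}$ such that, for each correspondence between a 3D point $\mathbf{p}_i^{w}$ and its image $(u_i, v_i)$,

$$ w_i \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = A \left( R\,\mathbf{p}_i^{w} + \mathbf{t} \right), \quad i = 1, \ldots, n, $$

where the scalars $w_i$ are unknown projective depths.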

  36. A Stable Algorithm. [Plots: mean and median rotation error (%) vs. the number of points used to estimate the pose.] LHM: Lu-Hager-Mjolsness, Fast and Globally Convergent Pose Estimation from Video Images, PAMI'00 (alternately optimizes over rotation and translation); EPnP: our method.

  37. A Fast Algorithm. [Plot: median rotation error (%) vs. computation time (sec), logarithmic scale.]

  38. General Approach. Estimate the coordinates of the 3D points in the camera coordinate system; the rotation and translation then follow [Lu et al., PAMI'00].
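
Once the 3D points are known in both the world and the camera coordinate systems, recovering $R$ and $\mathbf{t}$ is the standard absolute-orientation problem, solved in closed form (this summary is mine; the slide does not detail it): minimize $\sum_i \|\mathbf{p}_i^{c} - (R\,\mathbf{p}_i^{w} + \mathbf{t})\|^{2}$ by centering both point sets, computing $C = \sum_i (\mathbf{p}_i^{c} - \bar{\mathbf{p}}^{c})(\mathbf{p}_i^{w} - \bar{\mathbf{p}}^{w})^{\top}$, taking its SVD $C = U \Sigma V^{\top}$, and setting $R = U V^{\top}$ (with a sign correction on the last column of $U$ if $\det(R) < 0$) and $\mathbf{t} = \bar{\mathbf{p}}^{c} - R\,\bar{\mathbf{p}}^{w}$.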

  39. Introducing Control Points. The 3D points are expressed as a weighted sum of four control points: $\mathbf{p}_i = \sum_{j=1}^{4} \alpha_{ij}\,\mathbf{c}_j$, with $\sum_{j} \alpha_{ij} = 1$. This leaves 12 unknowns: the coordinates of the control points in the camera coordinate system.
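
The weights $\alpha_{ij}$ are barycentric coordinates and can be computed once in the world frame, since the same weights hold in the camera frame; one standard way to obtain them (my rendering, not spelled out on the slide) is to solve, for each point, the small linear system

$$ \begin{bmatrix} \mathbf{c}_1^{w} & \mathbf{c}_2^{w} & \mathbf{c}_3^{w} & \mathbf{c}_4^{w} \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} \alpha_{i1} \\ \alpha_{i2} \\ \alpha_{i3} \\ \alpha_{i4} \end{bmatrix} = \begin{bmatrix} \mathbf{p}_i^{w} \\ 1 \end{bmatrix}, $$

a 4x4 system that is invertible as long as the four control points are not coplanar.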

  40. The Point Reprojections Give a Linear System. For each correspondence $i$: $w_i\,[u_i\; v_i\; 1]^{\top} = A \sum_{j=1}^{4} \alpha_{ij}\,\mathbf{c}_j^{c}$. Rewriting and concatenating the equations from all the correspondences gives a linear system $M\mathbf{x} = 0$ in the 12 control-point coordinates $\mathbf{x}$.
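
Eliminating the projective depth gives two linear equations per correspondence in the control-point coordinates $\mathbf{c}_j^{c} = (x_j^{c}, y_j^{c}, z_j^{c})$; with $f_u, f_v$ the focal lengths and $(u_c, v_c)$ the principal point (my transcription of the standard EPnP equations):

$$ \sum_{j=1}^{4} \alpha_{ij} f_u\, x_j^{c} + \alpha_{ij} (u_c - u_i)\, z_j^{c} = 0, \qquad \sum_{j=1}^{4} \alpha_{ij} f_v\, y_j^{c} + \alpha_{ij} (v_c - v_i)\, z_j^{c} = 0. $$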

  41. The Solution as a Weighted Sum of Eigenvectors • $M\mathbf{x} = 0$, hence $M^{\top}M\,\mathbf{x} = 0$ • $\mathbf{x}$ belongs to the null space of $M^{\top}M$: $\mathbf{x} = \sum_i \beta_i \mathbf{v}_i$, with $\mathbf{v}_i$ the eigenvectors of $M^{\top}M$ associated with null eigenvalues • Computing $M^{\top}M$ is the most costly operation, and it is linear in $n$, the number of correspondences.

  42. From 12 Unknowns to 1, 2, 3, or 4 • The $\beta_i$ are our $N$ new unknowns; • $N$ is the dimension of the null space of $M^{\top}M$; • without noise, $N = 1$ (up to the scale ambiguity); • in practice there are no exactly zero eigenvalues, but several very small ones, and $N \geq 1$ (depending on the noise on the 2D locations); • we found that only the cases $N = 1, 2, 3$, and $4$ must be considered.

  43. How the Control Points Vary with the $\beta_i$. When varying the $\beta_i$: [figure: the reprojections in the image and the corresponding 3D points].

  44. Imposing the Rigidity Constraint. The distances between the control points must be preserved: $\|\mathbf{c}_i^{c} - \mathbf{c}_j^{c}\|^{2} = \|\mathbf{c}_i^{w} - \mathbf{c}_j^{w}\|^{2}$, which gives 6 quadratic equations in the $\beta_i$.

  45. The Case N = 1. $\mathbf{x} = \beta_1 \mathbf{v}_1$, and 6 quadratic equations: • $\beta_1$ can easily be computed; • its absolute value is the solution of a linear system: writing $\mathbf{v}_1^{[i]}$ for the part of $\mathbf{v}_1$ that corresponds to control point $i$, each constraint reads $\beta_1^{2}\,\|\mathbf{v}_1^{[i]} - \mathbf{v}_1^{[j]}\|^{2} = \|\mathbf{c}_i^{w} - \mathbf{c}_j^{w}\|^{2}$, which is linear in $\beta_1^{2}$; • its sign is chosen so that the handedness of the control points is preserved.

  46. The Case N = 2. $\mathbf{x} = \beta_1 \mathbf{v}_1 + \beta_2 \mathbf{v}_2$, and 6 quadratic equations. We use the linearization technique, which gives 6 linear equations in $\beta_{11} = \beta_1^{2}$, $\beta_{12} = \beta_1 \beta_2$, and $\beta_{22} = \beta_2^{2}$:
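
To see where the linear equations come from (my expansion of the slide's constraint): substituting $\mathbf{x} = \beta_1 \mathbf{v}_1 + \beta_2 \mathbf{v}_2$ into the distance constraint between control points $i$ and $j$, and writing $\mathbf{v}_k^{[ij]} = \mathbf{v}_k^{[i]} - \mathbf{v}_k^{[j]}$, gives

$$ \beta_{11}\,\|\mathbf{v}_1^{[ij]}\|^{2} + 2\,\beta_{12}\,\mathbf{v}_1^{[ij]} \cdot \mathbf{v}_2^{[ij]} + \beta_{22}\,\|\mathbf{v}_2^{[ij]}\|^{2} = \|\mathbf{c}_i^{w} - \mathbf{c}_j^{w}\|^{2}, $$

which is linear in $(\beta_{11}, \beta_{12}, \beta_{22})$; the six pairs $(i, j)$ give six such equations, solved by least squares before recovering $\beta_1$ and $\beta_2$ (with the same sign choice as in the case $N = 1$).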

  47. The Case N = 3. $\mathbf{x} = \beta_1 \mathbf{v}_1 + \beta_2 \mathbf{v}_2 + \beta_3 \mathbf{v}_3$, and 6 quadratic equations. The same linearization technique gives 6 linear equations for the 6 unknowns $\beta_{11}, \beta_{12}, \beta_{13}, \beta_{22}, \beta_{23}, \beta_{33}$.
