
Geometric Hashing: A General and Efficient Model-Based Recognition Scheme

Geometric Hashing: A General and Efficient Model-Based Recognition Scheme. Yehezkel Lamdan and Haim J. Wolfson ICCV 1988 Presented by Budi Purnomo Nov 23rd 2004. Motivation. Object recognition (ultimate goal of most computer vision research). Inputs: A database of objects.


Presentation Transcript


  1. Geometric Hashing: A General and Efficient Model-Based Recognition Scheme Yehezkel Lamdan and Haim J. Wolfson ICCV 1988 Presented by Budi Purnomo Nov 23rd 2004

  2. Motivation • Object recognition (ultimate goal of most computer vision research). • Inputs: • A database of objects. • A scene or image to recognize. • Problems: • Objects in the scene undergo some transformations. • Objects may partially occlude each other. • Computationally expensive to retrieve each object from database and compare it against the observed scene.

  3. Problem Statement • Recognition under Similarity Transformation: • “Is there a transformed (rotated, translated and scaled) subset of some model point-set which matches a subset of the scene point-set?”

  4. Outline • Key idea • General Framework • Recognition under Various Transformations • Recognition of 3D Objects from 2D Images • Recognition of Polyhedral Objects • Comparisons • Alignment • Generalized Hough Transform

  5. Key Idea (1/8) Recognizing a pentagon in an image

  6. Key Idea (2/8) Blue: 1

  7. Key Idea (3/8) Red: 1

  8. Key Idea (4/8) Green: 5

  9. Key Idea (5/8) Purple: 1

  10. Key Idea (6/8) Brown: 1

  11. Key Idea (7/8) Blue: 1 Red: 1 Green: 5 Purple: 1 Brown: 1 Object is a pentagon!

  12. Key Idea (8/8) Blue: 1 Red: 2 Green: 2 Purple: 1 Brown: 1 Object is NOT a pentagon!

  13. Brute Force Recognition • Let m be the number of points on the model and n the number of points in the scene. • Recognizing a single model: O((m × n)^2 × t), where t is the cost of verifying the model against the scene. • If m = n and t = n, recognizing a single model takes O(n^5).

  14. General Framework (1/2) • Two-stage algorithm: • Preprocessing (for each model): for each pair of feature points: • Define a local coordinate basis on this pair. • Compute and quantize the coordinates of all other feature points in this basis. • Record (model, basis) in the hash table under each quantized coordinate.
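
The preprocessing stage above can be sketched as follows for 2D similarity transformations. This is a minimal illustration, not the authors' implementation: the names `quantize` and `build_table`, the cell size `step`, and the use of complex numbers to express coordinates in a two-point basis are all assumptions made for the example. Feature points within a model are assumed to be distinct.

```python
from collections import defaultdict
from itertools import permutations

def quantize(z, step=0.25):
    """Map a complex coordinate to an integer grid cell (the hash key)."""
    return (round(z.real / step), round(z.imag / step))

def build_table(models, step=0.25):
    """models: dict mapping a model name to a list of (x, y) feature points."""
    table = defaultdict(list)
    for name, points in models.items():
        pts = [complex(x, y) for x, y in points]
        # Every ordered pair (i, j) of distinct points defines one basis.
        for i, j in permutations(range(len(pts)), 2):
            origin, axis = pts[i], pts[j] - pts[i]
            for k, p in enumerate(pts):
                if k in (i, j):
                    continue
                # Coordinates of p in the basis; unchanged by any rotation,
                # translation, or uniform scaling applied to the whole model.
                key = quantize((p - origin) / axis, step)
                table[key].append((name, (i, j)))
    return table

# Hypothetical model: a right triangle with three feature points.
models = {"triangle": [(0, 0), (4, 0), (0, 3)]}
table = build_table(models)
```

With three points there are six ordered bases and one remaining point per basis, so the table holds six (model, basis) records.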

  15. General Framework (2/2) • Online recognition (given a scene, extract feature points): • Pick an arbitrary ordered pair of scene points: • Compute the coordinates of the other points using this pair as a basis. • For each transformed point, vote for every (model, basis) record that appears in the corresponding hash table entry, and histogram the votes. • Matching candidates: (model, basis) pairs with a large number of votes. • Recover the transformation that gives the best least-squares match between all corresponding feature points. • Transform the model features and verify them against the input image features (if verification fails, return to the first step and pick a different pair).
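
A minimal sketch of the voting step for 2D similarity transformations is shown below. It is self-contained, so it rebuilds a small hash table first. The names `key`, `build_table`, `recognize`, the cell size `STEP`, and the `min_votes` threshold are assumptions for illustration, and the least-squares fit and final verification are left out.

```python
from collections import Counter, defaultdict
from itertools import permutations

STEP = 0.25  # assumed quantization cell size

def key(p, a, b):
    """Quantized coordinates of p in the basis defined by the pair (a, b)."""
    z = (p - a) / (b - a)
    return (round(z.real / STEP), round(z.imag / STEP))

def build_table(models):
    """Preprocessing: record (model, basis) under every quantized coordinate."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):
            for k, p in enumerate(pts):
                if k not in (i, j):
                    table[key(p, pts[i], pts[j])].append((name, (i, j)))
    return table

def recognize(scene, table, min_votes):
    """Try scene bases in turn; return the first (model, basis) candidate
    whose vote count reaches min_votes, or None if no basis succeeds."""
    for a, b in permutations(range(len(scene)), 2):
        votes = Counter()
        for k, p in enumerate(scene):
            if k not in (a, b):
                votes.update(table.get(key(p, scene[a], scene[b]), []))
        if votes:
            candidate, count = votes.most_common(1)[0]
            if count >= min_votes:
                return candidate, (a, b), count
    return None

# Hypothetical model: an irregular pentagon, points stored as complex numbers.
model = [complex(0, 0), complex(4, 0), complex(4, 2), complex(1, 3), complex(-1, 1)]
table = build_table({"pentagon": model})
# Scene: the model rotated by 90 degrees, scaled by 2, and translated,
# plus one clutter point that belongs to no model.
scene = [2j * p + complex(10, 5) for p in model] + [complex(50, 50)]
result = recognize(scene, table, min_votes=3)
```

Here the first scene pair tried happens to correspond to model basis (0, 1), and the three remaining model points each cast one vote for it. In a full system the candidate would still go through the least-squares fit and verification steps described above.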

  16. Two-Stage Algorithm (1/2) [1]

  17. Two-Stage Algorithm (2/2) [1]

  18. Complexity Assume m = n, and let k be the number of points needed to define the basis. • Preprocessing: O(n^(k+1)) for a single model. • Recognition: O(n^(k+1)) against all objects in the database.
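
To see where the exponent k + 1 comes from, the sketch below counts hash table records for one model, assuming every ordered k-tuple of points is used as a basis and each basis records the remaining n - k points. The function name `table_entries` is hypothetical.

```python
from math import perm

def table_entries(n, k):
    """Records stored for one model with n points and a k-point basis:
    perm(n, k) ordered bases, each recording the other n - k points."""
    return perm(n, k) * (n - k)

small = table_entries(5, 2)      # 20 bases * 3 remaining points
# Doubling n with k = 2 multiplies the count by roughly 2**3.
growth = table_entries(200, 2) / table_entries(100, 2)
```

The count is a product of k + 1 factors that each grow linearly in n, which matches the O(n^(k+1)) bound on the slide.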

  19. Under Various Transformations (1/2) • Translation in 2D and 3D. • 1-point basis. • O(n^2). • Similarity transformation in 2D. • 2-point basis. • O(n^3). • Similarity transformation in 3D. • 3-point basis. • O(n^4).

  20. Under Various Transformations (2/2) • Affine transformation • 3-point basis. • O(n^4) • Projective transformation • 4-point basis. • O(n^5)
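
The reason a 3-point basis suffices for 2D affine transformations is that a point's coordinates relative to three non-collinear basis points do not change when the same affine map is applied to all four points. The sketch below checks this numerically; `affine_coords`, `apply_affine`, and the sample matrix are assumptions chosen for illustration.

```python
def affine_coords(p, a, b, c):
    """Coordinates (u, v) of p such that p = a + u*(b - a) + v*(c - a)."""
    e1 = (b[0] - a[0], b[1] - a[1])
    e2 = (c[0] - a[0], c[1] - a[1])
    d = (p[0] - a[0], p[1] - a[1])
    det = e1[0] * e2[1] - e1[1] * e2[0]  # zero if a, b, c are collinear
    u = (d[0] * e2[1] - d[1] * e2[0]) / det
    v = (e1[0] * d[1] - e1[1] * d[0]) / det
    return u, v

def apply_affine(p, m, t):
    """Apply the 2D affine map x -> m x + t, with m a 2x2 nested tuple."""
    return (m[0][0] * p[0] + m[0][1] * p[1] + t[0],
            m[1][0] * p[0] + m[1][1] * p[1] + t[1])

a, b, c, p = (0, 0), (2, 0), (0, 2), (1, 3)
m, t = ((2, 1), (0.5, 3)), (7, -4)   # hypothetical invertible affine map
before = affine_coords(p, a, b, c)
after = affine_coords(*(apply_affine(q, m, t) for q in (p, a, b, c)))
```

Both `before` and `after` equal (0.5, 1.5) up to floating-point rounding, so the quantized coordinates can serve as a hash key that does not depend on the unknown transformation.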

  21. Recognition of 3D Objects from 2D Images (1/5) • Correspondence of planes • Preprocessing: consider planar sections of the 3D object which contain three or more interest points. • Hash the (model, plane, basis) triplet. • Use either a projective or an affine transformation. • Once the plane correspondence has been established, the position of the entire 3D body is solved.

  22. Recognition of 3D Objects from 2D Images (2/5) • Singular affine transformation: A x + b = U, where • A: 2×3 affine matrix • x: 3×1 3D vector • b: 2×1 2D translation vector • U: 2×1 image point
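
The transformation is singular because a 2×3 matrix always has a non-trivial null space, so every 3D point shares its image with a whole line of other 3D points. The following sketch illustrates this with made-up values for A, b, and x; the helper names `project` and `null_direction` are hypothetical.

```python
def project(A, b, x):
    """Image point U = A x + b for a 2x3 matrix A and a 3D point x."""
    return tuple(sum(A[r][c] * x[c] for c in range(3)) + b[r] for r in range(2))

def null_direction(A):
    """Cross product of the two rows of A: a 3D direction d with A d = 0."""
    (a0, a1, a2), (b0, b1, b2) = A
    return (a1 * b2 - a2 * b1, a2 * b0 - a0 * b2, a0 * b1 - a1 * b0)

A = ((1, 0, 2), (0, 1, 3))   # hypothetical 2x3 affine matrix
b = (5, -1)                  # hypothetical 2D translation
x = (2, 4, 1)
d = null_direction(A)
# Moving x along d changes the 3D point but not its image.
x_shifted = tuple(xi + 7 * di for xi, di in zip(x, d))
U, U_shifted = project(A, b, x), project(A, b, x_shifted)
```

Both projections give the same image point, which is why a single image point cannot pin down a unique 3D position.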

  23. Recognition of 3D Objects from 2D Images (3/5) A set of four non-coplanar points in 3D defines a 3D affine basis: • One point serves as the origin. • The vectors from the origin to the other three points serve as the unit vectors of an (oblique) coordinate system. Preprocess the model points in this four-point basis.

  24. Recognition of 3D Objects from 2D Images (4/5) Recognition: • Pick four points p0, p1, p2, and p3, giving three vectors v1, v2, and v3 in the 2D image. • Since three vectors in the plane are linearly dependent, there exist (α, β, γ) ≠ 0 such that α v1 + β v2 + γ v3 = 0. • For a point p in the image, let v be the vector from p0 to p. • Vote for all t ≠ 0 (a line with parameter t): v = (ξ + tα) v1 + (η + tβ) v2 + (tγ) v3, where (ξ, η) are the coordinates of v in the (v1, v2) basis.
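
The voting line on this slide can be sketched as follows. The symbol names are an assumption: (alpha, beta, gamma) denotes a non-zero dependency among the three image vectors, and (xi, eta) the coordinates of v in the (v1, v2) basis. The helper names and sample vectors are made up for illustration, and v1 and v2 are assumed not to be parallel.

```python
def cross(a, b):
    """2D cross product (a scalar)."""
    return a[0] * b[1] - a[1] * b[0]

def dependency(v1, v2, v3):
    """(alpha, beta, gamma) with alpha*v1 + beta*v2 + gamma*v3 = 0."""
    return cross(v2, v3), cross(v3, v1), cross(v1, v2)

def coords_2d(v, v1, v2):
    """(xi, eta) with v = xi*v1 + eta*v2; v1 and v2 must not be parallel."""
    det = cross(v1, v2)
    return cross(v, v2) / det, cross(v1, v) / det

def voting_line(v, v1, v2, v3, ts):
    """Coordinate triples on the line; each one reproduces the same v."""
    alpha, beta, gamma = dependency(v1, v2, v3)
    xi, eta = coords_2d(v, v1, v2)
    return [(xi + t * alpha, eta + t * beta, t * gamma) for t in ts]

v1, v2, v3 = (2, 1), (-1, 3), (1, 1)   # hypothetical image vectors
v = (3, 5)
line = voting_line(v, v1, v2, v3, ts=[1, -2, 0.5])
# Each triple (a, b, c) on the line satisfies a*v1 + b*v2 + c*v3 = v.
rebuilt = [(a * v1[0] + b * v2[0] + c * v3[0],
            a * v1[1] + b * v2[1] + c * v3[1]) for a, b, c in line]
```

Every triple on the line is a candidate set of 3D affine coordinates for the image point, so recognition would cast votes in all hash bins that the line passes through rather than in a single bin.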

  25. Recognition of 3D Objects from 2D Images (5/5) • Establishing a viewing angle with similarity transformation. • Tessellate a viewing sphere (uniform in spherical coordinates). • Record (model, basis, angle) in the hash table. • 2-point basis: O(n^3) (the same order as without viewing angle because the viewing angle introduces only a constant factor -- independent of the scene).

  26. Recognition of Polyhedral Objects Polygonal objects • Choose an edge as the basis, and record (model, basis edge) in the hash table. • Preprocessing and recognition are O(n^2). [1]

  27. Comparisons (1/2) • With the alignment method. • The alignment method exhaustively enumerates all possible pairs in the objects and the images. • Geometric hashing can process all models simultaneously, while the alignment method processes models sequentially. • The alignment method does not require any additional memory, while geometric hashing requires a large amount of memory to store the hash table. • Geometric hashing is more efficient if: • The scene contains enough features (6-10) for efficient recognition by voting. • There are many models.

  28. Comparisons (2/2) • With Generalized Hough Transform (GHT). • GHT quantizes all possible (continuous) transformations between the model and the scene into a set of bins, while • Geometric Hashing quantizes just the (discrete) transformation represented by the basis.

  29. Summary • Can recognize objects that have undergone an arbitrary transformation. • Can perform partial matching. • Efficient and easily parallelized. • Uses a transformation-invariant access key to the hash table. • Two phases (preprocessing and recognition). • Requires a large amount of memory to store the hash table.

  30. References • [1] Yehezkel Lamdan and Haim J. Wolfson, Geometric Hashing: A General and Efficient Model-Based Recognition Scheme, ICCV, 1988.
