Geometric Hashing: A General and Efficient Model-Based Recognition Scheme. Yehezkel Lamdan and Haim J. Wolfson ICCV 1988 Presented by Budi Purnomo Nov 23rd 2004. Motivation. Object recognition (ultimate goal of most computer vision research). Inputs: A database of objects.
Geometric Hashing: A General and Efficient Model-Based Recognition Scheme Yehezkel Lamdan and Haim J. Wolfson ICCV 1988 Presented by Budi Purnomo Nov 23rd 2004
Motivation • Object recognition is the ultimate goal of most computer vision research. • Inputs: • A database of objects. • A scene or image to recognize. • Problems: • Objects in the scene undergo some transformations. • Objects may partially occlude each other. • It is computationally expensive to retrieve each object from the database and compare it against the observed scene.
Problem Statement • Recognition under Similarity Transformation: • “Is there a transformed (rotated, translated and scaled) subset of some model point-set which matches a subset of the scene point-set?”
Outline • Key idea • General Framework • Recognition under Various Transformations • Recognition of 3D Objects from 2D Images • Recognition of Polyhedral Objects • Comparisons • Alignment • Generalized Hough Transform
Key Idea (1/8) Recognizing a pentagon in an image
Key Idea (2/8) Blue: 1
Key Idea (3/8) Red: 1
Key Idea (4/8) Green: 5
Key Idea (5/8) Purple: 1
Key Idea (6/8) Brown: 1
Key Idea (7/8) Blue: 1 Red: 1 Green: 5 Purple: 1 Brown: 1 Object is a pentagon!
Key Idea (8/8) Blue: 1 Red: 2 Green: 2 Purple: 1 Brown: 1 Object is NOT a pentagon!
Brute Force Recognition • Let m be the number of points on the model and n the number of points in the scene. • Recognizing a single model costs $O((m \times n)^2 \times t)$, where t is the cost of verifying the model against the scene. • If m = n and t = n, recognizing a single model costs $O(n^5)$.
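The brute-force count can be unpacked as a worked equation. This is a reading of the slide's formula under the assumption of the 2D similarity case, where matching one pair of model points to one pair of scene points fixes a candidate transformation; the labels under the braces are that interpretation, not text from the slide.

```latex
\underbrace{O(m^2)}_{\text{model point pairs}} \times
\underbrace{O(n^2)}_{\text{scene point pairs}} \times
\underbrace{O(t)}_{\text{verification}}
= O\big((m n)^2\, t\big)
\;\xrightarrow{\; m = n,\; t = n \;}\;
O(n^4 \cdot n) = O(n^5)
```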
General Framework (1/2) • Two-stage algorithm: • Preprocessing (for each model): for each ordered pair of feature points: • Define a local coordinate basis on this pair. • Compute and quantize the coordinates of all other feature points in this basis. • Record (model, basis) in the hash table entry addressed by each quantized coordinate.
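The preprocessing stage can be sketched in a few lines of Python for the 2D similarity case. This is a minimal illustration, not the authors' implementation: representing points as complex numbers, the helper names `basis_coords`, `quantize`, and `build_table`, and the cell size of 0.25 are all assumptions made here. It also assumes the feature points of a model are distinct, so no basis pair has zero length.

```python
import cmath
from collections import defaultdict
from itertools import permutations


def basis_coords(p, a, b):
    """Coordinates of p in the frame that sends a to 0 and b to 1."""
    return (p - a) / (b - a)


def quantize(z, cell=0.25):
    """Snap a complex coordinate to an integer grid cell, used as the hash key."""
    return (round(z.real / cell), round(z.imag / cell))


def build_table(models, cell=0.25):
    """Map each quantized coordinate to the (model, basis) records that produce it."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):  # every ordered basis pair
            for k, p in enumerate(pts):
                if k not in (i, j):
                    key = quantize(basis_coords(p, pts[i], pts[j]), cell)
                    table[key].append((name, (i, j)))
    return table


square = [0 + 0j, 1 + 0j, 1 + 1j, 0 + 1j]
table = build_table({"square": square})

# The basis coordinates do not change when the model is rotated, scaled, and translated.
s, t = 2 * cmath.exp(0.7j), 3 - 1j
moved = [s * p + t for p in square]
drift = abs(basis_coords(moved[2], moved[0], moved[1])
            - basis_coords(square[2], square[0], square[1]))
print(len(table), drift < 1e-9)
```

Dividing by `b - a` cancels any common rotation and scale, and subtracting `a` cancels the translation, which is why the quantized coordinate can serve as a transformation-invariant key.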
General Framework (2/2) • Online recognition (given a scene, extract its feature points): • Pick an arbitrary ordered pair of scene points. • Compute the coordinates of the other scene points using this pair as a basis. • For each transformed point, vote for every (model, basis) record that appears in the corresponding hash table entry, and histogram the votes. • Matching candidates: (model, basis) pairs with a large number of votes. • Recover the transformation that gives the best least-squares match between all corresponding feature points. • Transform the model features and verify them against the input image features (if verification fails, return to step 1 and pick another pair).
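The voting step of the recognition stage can be sketched as follows, again for the 2D similarity case with points stored as complex numbers. The names `key`, `build_table`, and `vote`, the cell size `CELL`, and the two toy models are assumptions for illustration; the least-squares fit and the verification step are left out.

```python
import cmath
from collections import Counter, defaultdict
from itertools import permutations

CELL = 0.25  # quantization step (an assumed value)


def key(p, a, b):
    """Quantized coordinate of p in the basis (a, b)."""
    z = (p - a) / (b - a)
    return (round(z.real / CELL), round(z.imag / CELL))


def build_table(models):
    """Preprocessing: store (model, basis) under every quantized coordinate."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):
            for k, p in enumerate(pts):
                if k not in (i, j):
                    table[key(p, pts[i], pts[j])].append((name, (i, j)))
    return table


def vote(table, scene, i, j):
    """Recognition: histogram the records hit when scene points i, j form the basis."""
    votes = Counter()
    for k, p in enumerate(scene):
        if k not in (i, j):
            votes.update(table.get(key(p, scene[i], scene[j]), []))
    return votes


models = {
    "hook": [0j, 2 + 0j, 2 + 1j, 1 + 3j, -1 + 1j],
    "square": [0j, 1 + 0j, 1 + 1j, 0 + 1j],
}
table = build_table(models)

# Scene: the hook rotated, scaled, and translated, plus one clutter point.
s, t = 1.5 * cmath.exp(0.6j), 4 - 2j
scene = [s * p + t for p in models["hook"]] + [s * (5 + 5j) + t]
votes = vote(table, scene, 0, 1)
print(votes.most_common(3))
```

With scene points 0 and 1 as the basis, the record `("hook", (0, 1))` collects one vote from each of the three remaining hook points, while the clutter point contributes nothing to it. Both models are stored in the same table, so one pass over the scene votes on all of them at once.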
Complexity • Assume m = n, and let k be the number of points needed to define the basis. • Preprocessing: $O(n^{k+1})$ for a single model. • Recognition: $O(n^{k+1})$ against all objects in the database.
Under Various Transformations (1/2) • Translation in 2D and 3D. • 1-point basis. • $O(n^2)$. • Similarity transformation in 2D. • 2-point basis. • $O(n^3)$. • Similarity transformation in 3D. • 3-point basis. • $O(n^4)$.
Under Various Transformations (2/2) • Affine transformation. • 3-point basis. • $O(n^4)$. • Projective transformation. • 4-point basis. • $O(n^5)$.
Recognition of 3D Objects from 2D Images (1/5) • Correspondence of planes • Preprocessing: consider planar sections of the 3D object that contain three or more interest points. • Hash the (model, plane, basis) triplet. • Use either a projective or an affine transformation. • Once the plane correspondence has been established, the position of the entire 3D body can be solved.
Recognition of 3D Objects from 2D Images (2/5) • Singular affine transformation: $A x + b = U$, where • A: 2×3 affine matrix • x: 3×1 3D vector • b: 2×1 2D translation vector • U: 2×1 image point
Recognition of 3D Objects from 2D Images (3/5) A set of four non-coplanar points in 3D defines a 3D affine basis: • One point serves as the origin. • The vectors from the origin to the other three points serve as the unit vectors of an (oblique) coordinate system. Preprocess the model points in this four-point basis.
Recognition of 3D Objects from 2D Images (4/5) Recognition: • Pick four points $p_0$, $p_1$, $p_2$, and $p_3$, giving three vectors $v_1$, $v_2$, and $v_3$ in the 2D image. • There exists a linear dependence $\alpha v_1 + \beta v_2 + \gamma v_3 = 0$, where $(\alpha, \beta, \gamma) \neq 0$. • For a point p in the image, let v be the vector from $p_0$ to p. • Vote for all t ≠ 0 (a line with parameter t): $v = (\xi + t\alpha) v_1 + (\eta + t\beta) v_2 + (t\gamma) v_3$, where $(\xi, \eta)$ are the coordinates of v in the $v_1$, $v_2$ basis.
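A small numeric check shows why voting along a line is enough. In the sketch below, the dependence coefficients are called `alpha`, `beta`, `gamma` and the 2D coordinates of v are called `xi`, `eta`; these names, the matrix `A`, the basis points, and the model coordinates `(x, y, z)` are all made-up values for illustration. The check confirms that the true 3D affine coordinates of the model point lie on the line (xi + t·alpha, eta + t·beta, t·gamma). It uses the known `z` to find t, which a real recognizer cannot do; that is why it must vote for every t along the line.

```python
from fractions import Fraction

A = [[1, 0, 2], [0, 1, 1]]  # assumed 2x3 singular affine map
p0, p1, p2, p3 = (1, 1, 0), (3, 1, 0), (1, 2, 1), (0, 1, 2)  # non-coplanar 3D basis
x, y, z = 2, -1, 3  # 3D affine coordinates of a model point in that basis


def sub(u, w):
    return tuple(a - c for a, c in zip(u, w))


def image_vec(d):
    """Image of a 3D displacement; the translation b cancels in differences."""
    return tuple(sum(A[r][c] * d[c] for c in range(3)) for r in range(2))


e1, e2, e3 = sub(p1, p0), sub(p2, p0), sub(p3, p0)
d = tuple(x * a + y * b + z * c for a, b, c in zip(e1, e2, e3))  # p - p0
v1, v2, v3, v = image_vec(e1), image_vec(e2), image_vec(e3), image_vec(d)

# Cross product of the two rows of [v1 v2 v3]: a vector (alpha, beta, gamma)
# for which alpha*v1 + beta*v2 + gamma*v3 vanishes.
alpha = v2[0] * v3[1] - v3[0] * v2[1]
beta = v3[0] * v1[1] - v1[0] * v3[1]
gamma = v1[0] * v2[1] - v2[0] * v1[1]

# Coordinates (xi, eta) of v in the 2D basis (v1, v2), by Cramer's rule.
# The determinant of [v1 v2] equals gamma, assumed nonzero here.
xi = Fraction(v[0] * v2[1] - v2[0] * v[1], gamma)
eta = Fraction(v1[0] * v[1] - v[0] * v1[1], gamma)

# The parameter value at which the voting line passes through (x, y, z).
t = Fraction(z, gamma)
on_line = (xi + t * alpha, eta + t * beta, t * gamma)
print(on_line == (x, y, z))
```

Exact rational arithmetic (`Fraction`) is used so the comparison is not affected by rounding.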
Recognition of 3D Objects from 2D Images (5/5) • Establishing a viewing angle with a similarity transformation. • Tessellate a viewing sphere (uniformly in spherical coordinates). • Record (model, basis, angle) in the hash table. • 2-point basis: $O(n^3)$ (the same order as without the viewing angle, because the viewing angle introduces only a constant factor that is independent of the scene).
Recognition of Polyhedral Objects Polygonal objects • Choose an edge as the basis, and record (model, basis edge) in the hash table. • Preprocessing and recognition are both $O(n^2)$. [1]
Comparisons (1/2) • With the alignment method. • The alignment method exhaustively enumerates all possible pairs in the objects and the images. • Geometric hashing can process all models simultaneously, while the alignment method processes models sequentially. • The alignment method does not require any additional memory, while geometric hashing requires a large amount of memory to store the hash table. • Geometric hashing is more efficient if: • The scene contains enough features (6-10) for efficient recognition by voting. • There are many models.
Comparisons (2/2) • With Generalized Hough Transform (GHT). • GHT quantizes all possible (continuous) transformations between the model and the scene into a set of bins, while • Geometric Hashing quantizes just the (discrete) transformation represented by the basis.
Summary • Recognizes objects that have undergone an arbitrary transformation. • Can perform partial matching. • Efficient and easily parallelized. • Uses a transformation-invariant access key to the hash table. • Two phases (preprocessing and recognition). • Requires a large amount of memory to store the hash table.
References • [1] Yehezkel Lamdan and Haim J. Wolfson, Geometric Hashing: A General and Efficient Model-Based Recognition Scheme, ICCV, 1988.