Object Detection. 01 – Advance Hough Transformation JJCAO. Line and curve detection. The Hough transform (HT) is a standard tool in image analysis that allows recognition of global patterns in an image space by recognition of local patterns in a transformed parameter space.
Object Detection 01 – Advance Hough Transformation JJCAO
Line and curve detection • The HT is a standard tool in image analysis that allows recognition of global patterns in an image space by recognition of local patterns in a transformed parameter space. • HT: Elegant method for direct object recognition • Edges need not be connected • Complete object need not be visible • Key Idea: Edges VOTE for the possible model • Example: detecting partially occluded lines
HT for Lines • Polar representation of lines: ρ = x cos θ + y sin θ • Image space: points (x, y) on the line y = mx + b • Parameter space: that line maps to the single point (b, m)
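The voting idea for lines can be sketched as follows. This is a minimal illustration, not code from the slides: it assumes the polar form ρ = x cos θ + y sin θ, a sparse accumulator, and hypothetical names (`hough_lines`, `n_theta`, `rho_step`) and made-up sample points.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180, rho_step=1.0):
    """Accumulate votes over (theta index, rho bin) for lines in polar form."""
    acc = Counter()
    for x, y in points:
        # A single point is compatible with one line per sampled theta.
        for t in range(n_theta):
            theta = math.pi * t / n_theta  # theta in [0, pi)
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(t, round(rho / rho_step))] += 1
    return acc


# Ten collinear points on y = 5 plus two outliers (made-up data).
points = [(x, 5) for x in range(0, 100, 10)] + [(30, 40), (70, 2)]
acc = hough_lines(points)
best_bin, best_votes = acc.most_common(1)[0]
print(best_bin, best_votes)  # with the defaults, best_bin is (theta in degrees, rho)
```

The strongest bin collects one vote from every point on the line, while the two outliers scatter their votes over unrelated bins.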
Hough_Grd
• Recall: when we detect an edge point, we also know its gradient direction
• But this means that the line is uniquely determined!
• Modified Hough transform:
    For each edge point (x,y)
        θ = gradient orientation at (x,y)
        ρ = x cos θ + y sin θ
        A(θ, ρ) = A(θ, ρ) + 1
    end
• θ ranges over [0°, 360°], so a conversion is needed
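A minimal sketch of the gradient-based variant, under stated assumptions: edge points arrive as hypothetical `(x, y, theta_deg)` triples, θ is quantized to 1° bins, and the "conversion" is taken to mean folding orientations in [180°, 360°) onto [0°, 180°) using the identity that (θ, ρ) and (θ − 180°, −ρ) describe the same line. The slide does not spell the conversion out, so this is one plausible reading.

```python
import math
from collections import Counter


def hough_lines_gradient(edges, rho_step=1.0):
    """edges: (x, y, theta_deg), theta_deg = gradient orientation in [0, 360)."""
    acc = Counter()
    for x, y, theta_deg in edges:
        theta = math.radians(theta_deg)
        rho = x * math.cos(theta) + y * math.sin(theta)
        # Assumed conversion: (theta, rho) and (theta - 180, -rho) are the
        # same line, so fold [180, 360) back into [0, 180).
        if theta_deg >= 180.0:
            theta_deg -= 180.0
            rho = -rho
        # One vote per edge point instead of one vote per sampled theta.
        acc[(round(theta_deg), round(rho / rho_step))] += 1
    return acc


# Made-up edges on y = 5: half with gradient pointing up, half pointing down.
edges = [(x, 5, 90.0) for x in range(0, 50, 10)]
edges += [(x, 5, 270.0) for x in range(50, 100, 10)]
edges += [(30, 40, 17.0)]  # one unrelated edge point
acc = hough_lines_gradient(edges)
print(acc[(90, 5)], sum(acc.values()))
```

Each edge point casts exactly one vote, so the accumulator holds as many votes as there are edge points.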
Hough transform for circles • [Figure: image space with axes x, y, showing a point (x, y) and radius r; Hough parameter space with axes x, y]
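For circles, each edge point votes for every center that would place it on a circle of a candidate radius. The sketch below assumes a sparse accumulator keyed by `(center_x, center_y, radius)`, integer center bins, and a small set of candidate radii; the function name and sample data are hypothetical.

```python
import math
from collections import Counter


def hough_circles(points, radii, n_angles=360):
    """Vote for (center_x, center_y, radius) bins, one vote per bin per point."""
    acc = Counter()
    for x, y in points:
        for r in radii:
            bins = set()
            # Candidate centers lie on a circle of radius r around the point.
            for k in range(n_angles):
                phi = 2 * math.pi * k / n_angles
                bins.add((round(x - r * math.cos(phi)),
                          round(y - r * math.sin(phi)), r))
            for b in bins:
                acc[b] += 1
    return acc


# 36 made-up points on a circle with center (20, 15) and radius 10.
points = [(20 + 10 * math.cos(2 * math.pi * j / 36),
           15 + 10 * math.sin(2 * math.pi * j / 36)) for j in range(36)]
acc = hough_circles(points, radii=(8, 10, 12))
best_bin, best_votes = acc.most_common(1)[0]
print(best_bin, best_votes)
```

Searching several radii makes the parameter space three-dimensional, which is where the cost of adding model parameters starts to show.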
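Generalizing the H.T.
• Suppose there were m different gradient orientations (m <= n)
• For an edge point Pi = (xi, yi), the object center (xc, yc) is given by:
    xc = xi + ri cos(ai)
    yc = yi + ri sin(ai)
• R-table:
    φ1: (r11,a11), (r12,a12), …, (r1n1,a1n1)
    φ2: (r21,a21), (r22,a22), …, (r2n2,a2n2)
    . . .
    φm: (rm1,am1), (rm2,am2), …, (rmnm,amnm)
• [Figure: boundary points with gradient orientations φi, φj and their displacements (ri, ai), (rj, aj) to the center (xc, yc)]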
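Generalized Hough Transform
• Find object center (xc, yc) given edges (xi, yi, φi)
• Create accumulator array A(xc, yc)
• Initialize: A(xc, yc) = 0 for all (xc, yc)
• For each edge point (xi, yi, φi)
    For each entry (r, a) in the R-table row for φi, compute:
        xc = xi + r cos(a)
        yc = yi + r sin(a)
    Increment accumulator: A(xc, yc) = A(xc, yc) + 1
• Find local maxima in A(xc, yc)
• Assumption: translation is the only transformation here, i.e., orientation and scale are fixed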
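The R-table construction and the voting loop can be sketched together. This is an illustration under assumptions: gradient orientations are already quantized to integer degrees, only translation is handled (as the slide states), and the names `build_r_table` and `ght_vote` plus the square-shaped model are made up for the example.

```python
import math
from collections import Counter, defaultdict


def build_r_table(model_edges, center):
    """Index displacements (r, a) to the center by gradient orientation phi."""
    xc, yc = center
    table = defaultdict(list)
    for x, y, phi in model_edges:
        r = math.hypot(xc - x, yc - y)
        a = math.atan2(yc - y, xc - x)
        table[phi].append((r, a))
    return table


def ght_vote(edges, table):
    """Each edge point votes for every center stored under its orientation."""
    acc = Counter()
    for x, y, phi in edges:
        for r, a in table.get(phi, ()):
            acc[(round(x + r * math.cos(a)), round(y + r * math.sin(a)))] += 1
    return acc


# Model: outline of a 4x4 square with center (2, 2); phi is in degrees.
model = [(0, y, 180) for y in range(5)] + [(4, y, 0) for y in range(5)]
model += [(x, 0, 270) for x in range(1, 4)] + [(x, 4, 90) for x in range(1, 4)]
table = build_r_table(model, center=(2, 2))

# Image: the same outline translated by (10, 7), plus two clutter edges.
image = [(x + 10, y + 7, phi) for x, y, phi in model] + [(3, 3, 90), (0, 0, 45)]
acc = ght_vote(image, table)
print(acc.most_common(1)[0])
```

Every model edge contributes one consistent vote for the true center, while its other table entries spread votes over neighbouring bins.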
Voting schemes • Let each feature vote for all the models that are compatible with it • Hopefully the noise features will not vote consistently for any single model • Missing data doesn’t matter as long as there are enough features remaining to agree on a good model
Application in recognition • Instead of indexing displacements by gradient orientation, index by “visual codeword” • [Figure: training image; visual codeword with displacement vectors] • Combined Object Categorization and Segmentation with an Implicit Shape Model (ECCV 2004) • Object Detection Using a Max-Margin Hough Transform (CVPR 2009)
Application in recognition • Instead of indexing displacements by gradient orientation, index by “visual codeword” • [Figure: test image] • Combined Object Categorization and Segmentation with an Implicit Shape Model (ECCV 2004) • Object Detection Using a Max-Margin Hough Transform (CVPR 2009)
Implicit shape models: Training • Build codebook of patches around extracted interest points using clustering (more on this later in the course)
Implicit shape models: Training • Build codebook of patches around extracted interest points using clustering • Map the patch around each interest point to closest codebook entry • For each codebook entry, store all positions it was found, relative to object center
Implicit shape models: Testing • Given test image, extract patches, match to codebook entry • Cast votes for possible positions of object center • Search for maxima in voting space • Extract weighted segmentation mask based on stored masks for the codebook occurrences
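The training and testing steps can be sketched in a toy form. Everything here is an assumption made for illustration: descriptors are 2-D tuples, the codebook is given rather than clustered, votes are binned on integer positions, each patch splits a unit vote evenly over its stored offsets, and segmentation masks are left out.

```python
import math
from collections import Counter, defaultdict


def nearest_codeword(desc, codebook):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)), key=lambda i: math.dist(desc, codebook[i]))


def train(training_objects, codebook):
    """Store, per codeword, the offsets from patch position to object center."""
    occurrences = defaultdict(list)
    for (cx, cy), patches in training_objects:
        for (px, py), desc in patches:
            occurrences[nearest_codeword(desc, codebook)].append((cx - px, cy - py))
    return occurrences


def vote(test_patches, codebook, occurrences):
    """Each test patch casts weighted votes for possible object centers."""
    votes = Counter()
    for (px, py), desc in test_patches:
        offsets = occurrences.get(nearest_codeword(desc, codebook), [])
        for dx, dy in offsets:
            votes[(px + dx, py + dy)] += 1.0 / len(offsets)
    return votes


codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # toy "wheel", "window", "other"
training = [((50, 30), [((30, 40), (0.1, 0.0)),    # left wheel
                        ((70, 40), (0.0, 0.1)),    # right wheel
                        ((50, 20), (0.9, 0.1))])]  # window
occurrences = train(training, codebook)

# Test image: the same object shifted by (100, 5).
test_patches = [((130, 45), (0.05, 0.05)), ((170, 45), (0.1, 0.1)),
                ((150, 25), (0.95, 0.0))]
votes = vote(test_patches, codebook, occurrences)
print(votes.most_common(1)[0])
```

Each wheel patch is ambiguous and splits its vote between two candidate centers; only the true center receives support from all three patches.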
Implicit shape models: Details • Supervised training • Need reference location and segmentation mask for each training car • Voting space is continuous, not discrete • Clustering algorithm needed to find maxima • How about dealing with scale changes? • Option 1: search a range of scales, as in Hough transform for circles • Option 2: use scale-covariant interest points • Verification stage is very important • Once we have a location hypothesis, we can overlay a more detailed template over the image and compare pixel-by-pixel, transfer segmentation masks, etc.
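Because the voting space is continuous, maxima have to be found by clustering the votes rather than by scanning bins. The slide does not name the algorithm; the sketch below assumes mean shift with a Gaussian kernel as one common choice, with a hypothetical function name, bandwidth, and vote list.

```python
import math


def mean_shift_mode(votes, start, bandwidth=5.0, n_iter=50, tol=1e-6):
    """Climb from `start` to a nearby mode of the weighted votes (x, y, w)."""
    x, y = start
    for _ in range(n_iter):
        wsum = xs = ys = 0.0
        for vx, vy, w in votes:
            # Gaussian kernel weight of this vote as seen from (x, y).
            k = w * math.exp(-((vx - x) ** 2 + (vy - y) ** 2) / (2 * bandwidth ** 2))
            wsum += k
            xs += k * vx
            ys += k * vy
        if wsum == 0.0:
            break
        nx, ny = xs / wsum, ys / wsum
        if math.hypot(nx - x, ny - y) < tol:
            return nx, ny
        x, y = nx, ny
    return x, y


# Made-up continuous votes: three near (150, 35) and two weak stray votes.
votes = [(149.0, 35.5, 1.0), (151.0, 34.5, 1.0), (150.0, 35.0, 1.0),
         (110.0, 35.0, 0.5), (190.0, 35.0, 0.5)]
mx, my = mean_shift_mode(votes, start=(149.0, 35.5))
print(round(mx, 2), round(my, 2))
```

In practice the search would be started from several votes and the resulting modes merged; the bandwidth plays the role that grid size plays in a binned accumulator.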
Hough transform: Discussion • Pros • Can deal with non-locality and occlusion • Can detect multiple instances of a model • Some robustness to noise: noise points unlikely to contribute consistently to any single bin • Cons • Search time increases exponentially with the number of model parameters • Non-target shapes can produce spurious peaks in parameter space • It’s hard to pick a good grid size • Hough transform vs. RANSAC vs. Geometric hashing: see On Geometric Hashing and the Generalized Hough Transform (TSMC 1994)
Detection of multiple object instances
• Victor Lempitsky, Yandex, Moscow; Visual Geometry Group, University of Oxford (postdoc)
• Pushmeet Kohli, Machine Learning and Perception, Microsoft Research Cambridge
• Olga Barinova, Graphics & Media Lab, Moscow State University
• Slides from CVPR 2010 [zip]
• Talk at CVPR 2010 [link]
• C++ code for pedestrian detection: original Visual Studio 2005 solution, or Linux port by Dr. Rodrigo Benenson
• C++ code for line detection: the latest version, which is much faster and more accurate
• Detection of multiple object instances using Hough transform (CVPR 2010)
Major flaw of HT • Lacks a consistent probabilistic model • Does not allow hypotheses to explain away the voting elements • A maximum in the Hough image corresponds to a correctly detected object • The voting elements that were generated by this object also cast votes for other hypotheses • The strength of those spurious votes is not inhibited => pseudo maxima • Various non-maxima suppression (NMS) heuristics have to be used to localize peaks in the Hough image, which involves specifying and tuning several parameters: • sweep-plane approach (Real-time line detection through an improved Hough transform voting scheme, PR 2008) • …
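The "explaining away" idea can be illustrated with a greedy loop: after a hypothesis is accepted, the voting elements that support it stop voting for anything else. This is a simplified sketch, not the probabilistic model of the cited CVPR 2010 paper; the function name, the `min_votes` threshold, and the toy votes are assumptions.

```python
from collections import Counter


def greedy_detect(element_votes, min_votes):
    """element_votes: dict element id -> set of hypotheses it votes for."""
    remaining = dict(element_votes)
    detections = []
    while remaining:
        acc = Counter(h for hyps in remaining.values() for h in hyps)
        if not acc:
            break
        best, score = acc.most_common(1)[0]
        if score < min_votes:
            break
        detections.append(best)
        # Explain away: elements that support `best` no longer vote elsewhere.
        remaining = {e: hyps for e, hyps in remaining.items() if best not in hyps}
    return detections


# Six elements come from object "A"; four of them also vote for a spurious
# hypothesis "ghost". Three other elements come from object "B".
votes = {i: ({"A", "ghost"} if i < 4 else {"A"}) for i in range(6)}
votes.update({i: {"B"} for i in range(6, 9)})

hough_image = Counter(h for hyps in votes.values() for h in hyps)
naive = sorted(h for h, c in hough_image.items() if c >= 3)
print(naive)                      # plain thresholding keeps the pseudo maximum
print(greedy_detect(votes, 3))    # the greedy loop suppresses it
```

In the plain Hough image the spurious hypothesis outscores object "B", which is the failure mode the slide describes; removing explained elements makes it disappear without a separate NMS step.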