Model fitting ECE 847: Digital Image Processing Stan Birchfield Clemson University
Three main questions:
• what object represents this set of tokens best?
• which of several objects gets which token?
• how many objects are there?
(you could read "line" for "object" here, or "circle", or "ellipse", or...)
Fitting
• Choose a parametric object/some objects to represent a set of tokens
• Most interesting case is when the criterion is not local
  • cannot tell whether a set of points lies on a line by looking only at each point and the next
D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Line fitting
• Least squares:
  • Represent line as y = mx + b
  • Stack equations yi = m·xi + b for all i = 1, ..., N with N points
  • Rewrite in matrix notation: Az = d, where z = (m, b) are the unknown parameters, and A and d collect the known (xi, yi) coordinates
  • Solve Az = d using least squares. Result is the same as z = (A^T A)^-1 A^T d, but in practice use Gaussian elimination (much faster than computing the inverse)
  • Result minimizes ||Az − d||, which is vertical distance
• Alternative:
  • Represent line as ax + by + c = 0
  • Stack equations, write in matrix notation: Au = 0, where u = (a, b, c) are the unknown parameters
  • This is a homogeneous equation, so solve using SVD
    • right singular vector associated with the smallest singular value gives the u that minimizes ||Au|| subject to ||u|| = 1
  • Result minimizes perpendicular distance to the line
• Either way, it is best to first shift the origin to the centroid of the points (normalization)
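Both solves on the slide above can be sketched in a few lines of NumPy; the function names `fit_line_lsq` and `fit_line_tls` are illustrative, not from the slides:

```python
import numpy as np

def fit_line_lsq(x, y):
    """Fit y = m*x + b by ordinary least squares (minimizes vertical distance)."""
    A = np.column_stack([x, np.ones_like(x)])  # stack equations y_i = m*x_i + b
    z, *_ = np.linalg.lstsq(A, y, rcond=None)  # solves A z = d by factorization,
    return z                                   # not by forming (A^T A)^-1 explicitly

def fit_line_tls(x, y):
    """Fit a*x + b*y + c = 0, minimizing perpendicular distance."""
    # Shift the origin to the centroid (normalization), so c drops out of the system
    mx, my = x.mean(), y.mean()
    A = np.column_stack([x - mx, y - my])
    # Right singular vector of the smallest singular value solves min ||Au||, ||u||=1
    _, _, Vt = np.linalg.svd(A)
    a, b = Vt[-1]
    c = -(a * mx + b * my)
    return a, b, c
```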
Other curves
• Many 2D curves can be represented using equations that are linear in the coefficients of the curve
  • Lines: ax + by + c = 0
  • Conics: x'Ax = 0 (in homogeneous coordinates); includes parabolas, hyperbolas, ellipses
• The same procedure can be used for any of these
Fitting and the Hough Transform
• Purports to answer all three questions
• We explain for lines
• One representation: a line is the set of points (x, y) such that x cos θ + y sin θ = r
• Different choices of θ, r > 0 give different lines
• For any token (x, y) there is a one-parameter family of lines through this point, given by x cos θ + y sin θ = r
• Each point gets to vote for each line in the family; if there is a line that has lots of votes, that should be the line passing through the points
D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Tokens and votes: θ ranges from 0 to 3.14 rad, r from 0 to 1.55. Brightest point = 20 votes, at θ = 45° = 0.785 rad, r = √2/2 ≈ 0.707. D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Mechanics of the Hough transform
• Construct an array representing θ, r
• For each point, render the curve (θ, r) into this array, adding one at each cell
• Difficulties
  • how big should the cells be? (too big, and we cannot distinguish between quite different lines; too small, and noise causes lines to be missed)
• How many lines? count the peaks in the Hough array
• Who belongs to which line? tag the votes
• Hardly ever satisfactory in practice, because problems with noise and cell size defeat it
D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Hough algorithm
• initialize accumulator A(r, θ) to zero for all r, θ
• for each (x, y)
  • if edge strength |∇I(x, y)| > threshold,
    • for each 0 ≤ θ < 2π,
      • compute r = x cos θ + y sin θ
      • A(r, θ) = A(r, θ) + 1
• find peaks in A
Note: Be sure to translate the origin to the center of the image for best results
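The accumulator loop above might be sketched in NumPy as follows; the function name and the resolutions `n_theta`, `n_r` are illustrative choices, and edge thresholding is assumed to have produced the token list already:

```python
import numpy as np

def hough_lines(points, shape, n_theta=180, n_r=200):
    """Accumulate votes for lines x*cos(theta) + y*sin(theta) = r.

    points: iterable of (x, y) edge tokens; shape: (width, height) of the image.
    The origin is translated to the image center (per the note on the slide),
    so r spans [-r_max, r_max] with theta in [0, pi)."""
    w, h = shape
    r_max = np.hypot(w, h) / 2
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    A = np.zeros((n_r, n_theta), dtype=int)
    for x, y in points:
        xc, yc = x - w / 2, y - h / 2              # translate origin to center
        r = xc * np.cos(thetas) + yc * np.sin(thetas)
        idx = np.round((r + r_max) / (2 * r_max) * (n_r - 1)).astype(int)
        A[idx, np.arange(n_theta)] += 1            # one vote per cell on the curve
    return A, thetas, r_max
```

Collinear tokens all vote for the same (r, θ) cell, so the peak height equals the number of points on the line.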
Hough transform: image space and line space
• Note the symmetry: flip vertical, then slide by π
• Notice that (θ, r) and (θ + π, −r) are the same line
• That's why we get two peaks
• Solution: let 0 ≤ θ < π
Tokens and votes in (θ, r) space; brightest point = 6 votes. D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Noise Lowers the Peaks D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Noise Increases the Votes in Spurious Accumulator Elements D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Optimizations to the Hough Transform Noise: If the orientation of tokens (pixels) is known, only accumulator elements for lines with that general orientation are voted on. (Most edge detectors give orientation information.) Speed: The accumulator array can be coarse at first, with the voting repeated at a finer scale in areas of interest. F. Dellaert, http://www.cc.gatech.edu/classes/AY2007/cs4495_fall/
Real World Example Original Found Lines Edge Detection Parameter Space F. Dellaert, http://www.cc.gatech.edu/classes/AY2007/cs4495_fall/
Circle Example The Hough transform can be used to fit points to any object that can be parameterized. Most common are circles, ellipses. With no orientation, each token (point) votes for all possible circles. With orientation, each token can vote for a smaller number of circles. F. Dellaert, http://www.cc.gatech.edu/classes/AY2007/cs4495_fall/
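As a sketch of the no-orientation case described above, each token votes for every possible center at a known radius; the function name and sampling density are hypothetical:

```python
import numpy as np

def hough_circle_centers(points, shape, radius, n_angles=90):
    """Vote for circle centers at a known radius: with no orientation, each
    token (x, y) votes for every center on a circle of that radius around it."""
    w, h = shape
    A = np.zeros((h, w), dtype=int)
    angles = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
    for x, y in points:
        cx = np.round(x - radius * np.cos(angles)).astype(int)
        cy = np.round(y - radius * np.sin(angles)).astype(int)
        ok = (cx >= 0) & (cx < w) & (cy >= 0) & (cy < h)
        # np.add.at accumulates correctly even with repeated (cy, cx) indices
        np.add.at(A, (cy[ok], cx[ok]), 1)
    return A
```

With gradient orientation available, each token would instead vote only along its gradient direction, shrinking the vote set from a full circle to one or two cells.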
Real World Circle Examples Crosshair indicates results of Hough transform, bounding box found via motion differencing. F. Dellaert, http://www.cc.gatech.edu/classes/AY2007/cs4495_fall/
Finding Coins Original Edges (note noise) F. Dellaert, http://www.cc.gatech.edu/classes/AY2007/cs4495_fall/
Finding Coins (Continued) Penny Quarters F. Dellaert, http://www.cc.gatech.edu/classes/AY2007/cs4495_fall/
Finding Coins (Continued) Note that because the quarters and penny are different sizes, a different Hough transform (with a separate accumulator) was used for each circle size. Coin finding sample images from: Vivek Kwatra F. Dellaert, http://www.cc.gatech.edu/classes/AY2007/cs4495_fall/
Fitting other objects
• The Hough transform is closely related to template matching.
• The Hough transform can be used to fit points to any object that can be parameterized.
• Objects of arbitrary shape can be parameterized by building an R-table. (Assumes orientation information for each token is available.)
• The r and β value(s) are obtained from the R-table, indexed by ω (the token's orientation).
F. Dellaert, http://www.cc.gatech.edu/classes/AY2007/cs4495_fall/
Generalized Hough transform R-Table: each edge point, with its gradient orientation, indicates the location of the reference point. (Figure labels: reference point, gradient orientation, tangent vector, radius.) Ballard and Brown, Computer Vision, 1982, p. 129
Generalized Hough transform Ballard and Brown, Computer Vision, 1982, p. 129
Conclusion • Finding lines and other parameterized objects is an important task for computer vision. • The (generalized) Hough transform can detect arbitrary shapes from (edge detected) tokens. • Success rate depends directly upon the noise in the edge image. • Downsides: Can be slow, especially for objects in arbitrary scales and orientations (extra parameters increase accumulator space exponentially). F. Dellaert, http://www.cc.gatech.edu/classes/AY2007/cs4495_fall/
Tensor voting • Another voting technique
Line fitting Line fitting can be posed as maximum likelihood, but the choice of noise model is important. D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Who came from which line? • Assume we know how many lines there are - but which lines are they? • easy, if we know who came from which line • Three strategies • Incremental line fitting • K-means (MacQueen 1967) • Probabilistic (later!) D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Robustness • As we have seen, squared error can be a source of bias in the presence of noise points • One fix is EM - we’ll do this shortly • Another is an M-estimator • Square nearby, threshold far away • A third is RANSAC • Search for good points D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Line fit to set of points D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
... with one outlier D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
... with a different outlier D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Zoom of previous – clearly a bad fit D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
M-estimators ρ(x; σ) = x² / (σ² + x²) D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
σ is just right: noise is ignored D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Too small: all data is ignored D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
Too large: noise influences outcome D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
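The slides do not spell out how the robust cost is minimized; one standard option (an assumption here, not the slides' prescription) is iteratively reweighted least squares, sketched for ρ(x; σ) = x²/(σ² + x²), whose IRLS weight is ρ′(r)/(2r) = σ²/(σ² + r²)²:

```python
import numpy as np

def robust_line_fit(x, y, sigma, n_iter=50):
    """Fit y = m*x + b minimizing sum rho(residual; sigma), with
    rho(r) = r^2 / (sigma^2 + r^2), via iteratively reweighted least squares."""
    A = np.column_stack([x, np.ones_like(x)])
    z, *_ = np.linalg.lstsq(A, y, rcond=None)      # ordinary LSQ as initial guess
    for _ in range(n_iter):
        r = y - A @ z
        w = sigma**2 / (sigma**2 + r**2) ** 2      # IRLS weight rho'(r) / (2r)
        Aw = A * w[:, None]                        # weighted normal equations:
        z = np.linalg.solve(A.T @ Aw, Aw.T @ y)    # (A^T W A) z = A^T W y
    return z
```

The weight falls off as 1/r⁴ for large residuals, so distant outliers barely influence the fit ("square nearby, threshold far away"), while a poor σ reproduces the failure modes in the figures above.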
RANSAC
• Choose a small subset uniformly at random
• Fit to that
• Anything that is close to the result is signal; all others are noise
• Refit
• Do this many times and choose the best
Issues
• How many times? Often enough that we are likely to have a good line
• How big a subset? Smallest possible
• What does close mean? Depends on the problem
• What is a good line? One where the number of nearby points is so big it is unlikely to be all outliers
D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
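The steps above can be sketched for line fitting as follows (the function name, defaults, and the total-least-squares refit are illustrative choices):

```python
import numpy as np

def ransac_line(points, n_iter=200, tol=0.5, seed=0):
    """RANSAC for a line: repeatedly fit to a minimal random subset (2 points),
    count points within tol of that fit, keep the model with the most inliers,
    then refit to all of its inliers."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    best_inliers = None
    for _ in range(n_iter):
        i, j = rng.choice(len(pts), size=2, replace=False)
        (x1, y1), (x2, y2) = pts[i], pts[j]
        # Line through the two samples: a*x + b*y + c = 0, with (a, b) unit-norm
        a, b = y2 - y1, x1 - x2
        norm = np.hypot(a, b)
        if norm == 0:
            continue                               # coincident sample points
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        d = np.abs(pts @ np.array([a, b]) + c)     # perpendicular distances
        inliers = d < tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit with total least squares on the winning consensus set
    xin, yin = pts[best_inliers, 0], pts[best_inliers, 1]
    mx, my = xin.mean(), yin.mean()
    _, _, Vt = np.linalg.svd(np.column_stack([xin - mx, yin - my]))
    a, b = Vt[-1]
    c = -(a * mx + b * my)
    return (a, b, c), best_inliers
```

Two points is the smallest subset that determines a line, which is why the minimal sample size is used.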
Fitting curves other than lines
• In principle, an easy generalization: the probability of obtaining a point, given a curve, is given by a negative exponential of distance squared
• In practice, rather hard: it is generally difficult to compute the distance between a point and a curve
D. Forsyth, http://luthuli.cs.uiuc.edu/~daf/book/bookpages/slides.html
K-means • Recall line fitting • Now suppose that you have more than one line • But you do not know which points belong to which line
K-Means
• Choose a fixed number of clusters (K)
• Choose cluster centers and point-cluster allocations to minimize error
  • can't do this by search, because there are too many possible allocations
• Algorithm
  • fix cluster centers; allocate points to closest cluster
  • fix allocation; compute best cluster centers
• x could be any set of features for which we can compute a distance (careful about scaling)
* From Marc Pollefeys COMP 256 2003
K-Means * From Marc Pollefeys COMP 256 2003
K-Means: alternate between assigning data points to clusters and computing the mean of each cluster's data points S. Thrun, http://robots.stanford.edu/cs223b07/
Image Segmentation by K-Means
• Select a value of K
• Select a feature vector for every pixel (color, texture, position, or a combination of these)
• Define a similarity measure between feature vectors (usually Euclidean distance)
• Apply the K-means algorithm
• Apply a connected components algorithm (to enforce spatial continuity)
• Merge any component smaller than some threshold into the adjacent component that is most similar to it
* From Marc Pollefeys COMP 256 2003
Example Image Clusters on intensity Clusters on color S. Thrun, http://robots.stanford.edu/cs223b07/
Idea • Data generated from mixture of Gaussians • Latent (hidden) variables: Correspondence between Data Items and Gaussians S. Thrun, http://robots.stanford.edu/cs223b07/
Expectation-Maximization (Generalized K-Means)
• Notice:
  • Given the mixture model, it's easy to calculate the correspondence
  • Given the correspondence, it's easy to estimate the mixture models
• This view of K-means involves
  • Model (hypothesis space): mixture of K Gaussians
  • Latent variables: correspondence of data and Gaussians
• Replace hard assignments with soft assignments: EM
• EM is guaranteed to converge (EM steps do not decrease the likelihood)
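A minimal 1-D sketch of EM for a mixture of Gaussians, replacing K-means' hard assignments with soft responsibilities (the function name and quantile-based initialization are illustrative assumptions):

```python
import numpy as np

def em_gmm_1d(x, k=2, n_iter=100):
    """EM for a 1-D mixture of Gaussians: the E-step computes soft
    responsibilities (vs. K-means' hard assignments); the M-step re-estimates
    weights, means, and variances from them."""
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)  # spread out the initial means
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each Gaussian for each point (N x k)
        p = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: weighted re-estimation of the mixture parameters
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var
```

With responsibilities forced to 0/1 and variances shared and fixed, the loop reduces to K-means, which is the sense in which EM generalizes it.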