
Discovering New Perspectives in Computer Vision with A Chessboard

Uncover the journey into computer vision through a unique chess project, exploring line drawings and edge detection techniques like Hough transform. Dive deep into feature detection methods and practical applications in building a chess system. Witness the evolution of computer vision from the '60s to '80s, and explore the intricacies of transforming 2D lines to 3D space. Join the quest for enhancing human-computer interaction with visual intelligence.

truong




Presentation Transcript


  1. Finding A Chessboard: An Introduction To Computer Vision • Martin C. Martin

  2. Motivation & Inspiration • A spare time project, because I thought it would be fun • Project: Chess against the computer, where the board and your pieces are real, and its pieces are projected • Working so far: very robust localizing of chessboard (full 3D location and orientation)

  3. Requirements • Interaction should be as natural as possible • E.g. pieces don’t have to be centered in their squares, or even completely in them • Should be easy to set up and give demonstrations • As little calibration as possible • Work in many lighting conditions • Although only with this board and pieces • Camera needs to be at an angle to the board • Board made by my father just before he met my mother • My brother and I learned to play on it • I’m not a big chess fan, I just like the project

  4. Computer Vision ’70s to ’80s: Feature Detection • Initial Idea: Corners are unique, look for them • Compute the “cornerness” at each pixel by adding the values of some nearby pixels, and subtracting others, in this pattern:
       + + + - - -
       + + + - - -
       + + + - - -
       - - - + + +
       - - - + + +
       - - - + + +
     • Constant Image (e.g. middle of square): output = 0 • Edge between two regions (e.g. two squares side-by-side): output = 0 • Where four squares come together: output max (+ or -)
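
The mask above can be tried on synthetic images. This is a minimal sketch, assuming NumPy 1.20 or later (for `sliding_window_view`); the function name `cornerness`, the 8-pixel checker squares, and the test images are illustrative choices, not taken from the slides.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# 6x6 checker mask from the slide: + in the top-left and bottom-right
# quadrants, - in the other two.
MASK = np.ones((6, 6))
MASK[:3, 3:] = -1
MASK[3:, :3] = -1

def cornerness(img):
    """Absolute mask response at every position where the 6x6 window fits."""
    windows = sliding_window_view(np.asarray(img, dtype=float), MASK.shape)
    return np.abs((windows * MASK).sum(axis=(2, 3)))

# Hypothetical test images (values in [0, 1]).
constant = np.full((16, 16), 0.5)                    # middle of a square
edge = np.zeros((16, 16))
edge[:, 8:] = 1.0                                    # two squares side by side
board = (np.indices((32, 32)) // 8).sum(axis=0) % 2  # 8-pixel checker squares

for name, img in [("constant", constant), ("edge", edge), ("board", board)]:
    print(name, cornerness(img).max())
```

The response cancels whenever the image varies along only one axis, which is why a plain edge scores zero while the meeting point of four squares scores the maximum.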

  5. Corner Detector In Practice • Actually, the absolute value of the output

  6. Problems With Corner Detection • Edge effects • Strong response for some non-corners • Easily obscured by pieces, hand • How to link them up when many are obscured? • Go back to something older: Find edges

  7. Computer Vision ’60s to Early ’70s: Line Drawings

  8. Edge Finding • Very common in early computer vision • Early computers didn’t have much power • Very early: enter lines by hand • A little later: extract line drawing from image • Basic idea: for vertical edge, subtract pixels on left from pixels on right:
       - - - + + +
       - - - + + +
       - - - + + +
       - - - + + +
       - - - + + +
       - - - + + +
     • Similarly for horizontal edge

  9. Edge Finding • Need separate mask for each orientation? • No! Can compute intensity gradient from horizontal & vertical gradients • Think of intensity as a (continuous) function of 2D position: f(x,y) • Rate of change of intensity in unit direction (u, v) is u ∂f/∂x + v ∂f/∂y • Magnitude changes as cosine of the angle between (u, v) and (∂f/∂x, ∂f/∂y) • Magnitude maximum when (u, v) points along (∂f/∂x, ∂f/∂y) • (∂f/∂x, ∂f/∂y) is called the gradient • Its magnitude is the strength of the line at this point • Its direction is perpendicular to the line
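
As a rough illustration of the gradient idea, the sketch below uses NumPy's finite-difference `np.gradient` in place of the slide's masks (an assumption; the talk's own filters are not shown). The helper names and the ramp image are hypothetical.

```python
import numpy as np

def image_gradient(img):
    """Return (gx, gy, magnitude, direction) using finite differences."""
    # np.gradient returns derivatives per axis: axis 0 is y (rows), axis 1 is x.
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return gx, gy, np.hypot(gx, gy), np.arctan2(gy, gx)

def directional_derivative(gx, gy, angle):
    """Rate of change along the unit direction (cos(angle), sin(angle))."""
    return np.cos(angle) * gx + np.sin(angle) * gy

# Hypothetical test image: a linear ramp f(x, y) = 3x + 4y.
ys, xs = np.mgrid[0:8, 0:8]
ramp = 3.0 * xs + 4.0 * ys
gx, gy, mag, ang = image_gradient(ramp)

along = directional_derivative(gx, gy, ang)               # along the gradient
across = directional_derivative(gx, gy, ang + np.pi / 2)  # along the edge
print(mag[0, 0], along[0, 0], across[0, 0])
```

The derivative taken along the gradient direction equals the gradient magnitude, and the derivative taken at right angles to it (that is, along the edge) vanishes.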

  10. Gradient

  11. Localizing Line • How do we decide where lines are & where they aren’t? • One idea: threshold the magnitude • Problem: what threshold to use? Depends on lighting, etc. • Problem: will still get multiple pixels across the width of each line

  12. Laplacian • Better idea: find the peak • i.e. where the 2nd derivative crosses zero • Image shows magnitude • One ridge is positive, one negative

  13. Gradient: Before And After Suppression

  14. Extracting Whole Lines • So Far: intensity image → “lineness” image • Next: “lineness” image → list of lines • Need to accumulate contributions from across image • Could be many gaps • Want to extract position & orientation of lines • Boundaries won’t be robust, so consider lines to run across the entire image

  15. Hough Transform • Parameterize lines by angle (θ) and distance (d) to center of image • Discretize these and create a 2D grid covering the entire range • Each unsuppressed pixel is part of a line perpendicular to the gradient • Add its strength to the bin for that line
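
The voting scheme can be sketched as follows. This is an illustrative version under stated assumptions: the bin counts, the folding of θ into [0, π), and the function name `hough_from_gradient` are choices made here, not details given in the slides.

```python
import numpy as np

def hough_from_gradient(gx, gy, n_theta=180, n_d=200):
    """Each pixel with nonzero gradient votes once, weighted by its strength.

    theta is the direction of the gradient (the line's normal) and d is the
    signed distance from the image center to the line.
    """
    h, w = gx.shape
    mag = np.hypot(gx, gy)
    ys, xs = np.nonzero(mag > 0)
    x = xs - (w - 1) / 2.0                      # coordinates relative to center
    y = ys - (h - 1) / 2.0
    theta = np.arctan2(gy[ys, xs], gx[ys, xs])  # in (-pi, pi]
    d = x * np.cos(theta) + y * np.sin(theta)

    # (theta, d) and (theta + pi, -d) describe the same line: fold into [0, pi).
    flip = (theta < 0) | (theta >= np.pi)
    theta = np.mod(theta, np.pi)
    d = np.where(flip, -d, d)

    d_max = np.hypot(w, h) / 2.0
    ti = np.minimum((theta / np.pi * n_theta).astype(int), n_theta - 1)
    di = np.clip(((d + d_max) / (2 * d_max) * n_d).astype(int), 0, n_d - 1)

    acc = np.zeros((n_theta, n_d))
    np.add.at(acc, (ti, di), mag[ys, xs])       # accumulate strengths per bin
    thetas = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    ds = -d_max + (np.arange(n_d) + 0.5) * 2 * d_max / n_d
    return acc, thetas, ds

# Hypothetical input: a vertical edge at column 30 of a 64x64 image.
gx = np.zeros((64, 64))
gx[:, 30] = 1.0
gy = np.zeros((64, 64))
acc, thetas, ds = hough_from_gradient(gx, gy)
ti, di = np.unravel_index(acc.argmax(), acc.shape)
print(ti, ds[di], acc.max())
```

Because the gradient already fixes the orientation, each pixel votes for a single bin rather than for a whole sinusoid of candidate lines.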

  16. Hough Transform (figure: accumulator image; axes: angle and distance to center)

  17. Take The 24 Biggest Lines

  18. Demonstration

  19. Coordinate Systems • In 2D (u, v) (i.e. on the screen): • Origin at center of screen • u horizontal, increasing to the right • v vertical, increasing down • Maximum u and v determined by field of view. • In 3D (x, y, z) (camera’s frame): • Origin at eye • Looking along z axis • x & y in directions of u & v respectively

  20. 2D → 3D • Perspective projection • (u, v): image coordinates (2D) • (x, y, z): world coordinates (3D) • Similar Triangles • Free to assume virtual screen at d = 1 • Relation: u = x/z, v = y/z

  21. Lines in 3D map to lines on screen • Proof: the 3D line, plus the origin (eye), form a plane. • All light rays from the 3D line to the eye are in this plane. • The intersection of that plane and the image plane forms a line
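
A small numerical check of the projection relation u = x/z, v = y/z and of the claim that 3D lines map to 2D lines. The sample line, the helper names, and the tolerance are hypothetical.

```python
import numpy as np

def project(p):
    """Perspective projection with the virtual screen at distance 1."""
    p = np.asarray(p, dtype=float)
    return p[..., :2] / p[..., 2:3]

def collinear(pts, tol=1e-9):
    """True if all 2D points lie on the line through the first two."""
    a = pts[1] - pts[0]
    b = pts[2:] - pts[0]
    return bool(np.all(np.abs(a[0] * b[:, 1] - a[1] * b[:, 0]) < tol))

# Hypothetical 3D line p0 + t*d in front of the camera (z > 0 throughout).
p0 = np.array([0.5, -1.0, 4.0])
d = np.array([1.0, 0.5, 2.0])
ts = np.linspace(0.0, 3.0, 5)
pts2d = project(p0 + ts[:, None] * d)

print(pts2d)
print(collinear(pts2d))
```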

  22. From 2D Lines To 3D Lines • If a group of lines are parallel in 3D, what’s the corresponding 2D constraint? • Equation of line in 2D: Au + Bv + C = 0 • Substituting in our formula for u & v: • Ax/z + By/z + C = 0 • Ax + By + Cz = 0 • A 3D plane through the origin containing the 3D line • Let L = (A, B, C) & p = (x, y, z). Then L•p = 0

  23. Recovering 3D Direction • Let’s represent the 3D line parametrically, i.e. as the set of p0 + t d for all t, where d is the direction. • The 9 parallel lines on the board have the same d but different p0. • For all t: L•(p0 + t d) = L•p0 + t L•d = 0 • Since p0 is on the line, L•p0 = 0. • Therefore, L•d = 0, i.e. A x_d + B y_d + C z_d = 0 • That is, the point (x_d, y_d, z_d) (which is not necessarily on the 3D line) projects to a point on the 2D line • d is the same for all 9 lines in a group; so it must be the common intersection point, i.e. the vanishing point • Knowing the 2D vanishing point gives us the full 3D direction!
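
The argument can be checked numerically. In this sketch the 2D line coefficients L = (A, B, C) are built as the cross product p0 × d, which is the normal of the plane through the eye that contains the 3D line; the specific direction, starting point, and spacing are made-up values.

```python
import numpy as np

# Hypothetical group of 9 parallel 3D lines sharing direction d.
d = np.array([1.0, 0.2, 0.5])
first = np.array([-4.0, 1.0, 10.0])      # a point on the first line
step = np.array([0.0, 1.0, 0.3])         # offset between neighbouring lines
p0s = first + np.arange(9)[:, None] * step

# L = p0 x d is perpendicular to both p0 and d, so L.p = 0 for every point
# p = p0 + t*d on the line; its components are the 2D line's (A, B, C).
Ls = np.cross(p0s, d)

# The vanishing point is the projection of d itself.
vp = d[:2] / d[2]
residuals = Ls @ np.array([vp[0], vp[1], 1.0])   # A*u + B*v + C for each line

# Going back: the vanishing point (u, v) gives the 3D direction (u, v, 1),
# up to scale and sign.
d_rec = np.array([vp[0], vp[1], 1.0])
d_rec /= np.linalg.norm(d_rec)

print(residuals)
print(d_rec, d / np.linalg.norm(d))
```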

  24. Vanishing Point

  25. Viewing Direction Ames Room

  26. Finding The Vanishing Point • Want to find a group of 9 2D lines with a common intersection point • For two lines A1u+B1v+C1 = 0 and A2u+B2v+C2 = 0, intersection point is on both lines, therefore satisfies both linear equations: A1u + B1v = -C1 and A2u + B2v = -C2 • In reality, only approximately intersect at a common location
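
One compact way to solve that pair of equations is the cross product of the two coefficient vectors, which yields the intersection in the form (u', v', w) with u = u'/w and v = v'/w. This is a sketch with hypothetical function names, not necessarily how the talk's implementation does it.

```python
import numpy as np

def intersect(l1, l2):
    """Intersection of lines (A1, B1, C1) and (A2, B2, C2) as (u', v', w)."""
    return np.cross(np.asarray(l1, dtype=float), np.asarray(l2, dtype=float))

def to_2d(p, eps=1e-12):
    """Convert (u', v', w) to (u, v); None when the lines are parallel."""
    if abs(p[2]) < eps:
        return None
    return p[:2] / p[2]

# Hypothetical lines: u = 1 and v = 2 meet at (1, 2).
print(to_2d(intersect((1, 0, -1), (0, 1, -2))))

# Parallel lines u = 1 and u = 3 give w = 0: no finite intersection.
print(intersect((1, 0, -1), (1, 0, -3)))
```

Expanding the cross product gives the same expressions as Cramer's rule, with w equal to the determinant A1·B2 − A2·B1.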

  27. Representing the Vanishing Point • Problem: If camera is perpendicular to board, 2D lines are parallel → no solution • Problem: If camera is almost perpendicular to board, small errors in line angle make for big changes in intersection location • Euclidean distance between points is poor measure of similarity

  28. Desired Metric • Sensitivity Analysis for Intersection Point: • Of the form u’/w, v’/w. If -1 ≤ A,B,C ≤ 1, then -2 ≤ u’, v’, w ≤ 2 • So, the only way for the intersection point to be large is if w is small; the right metric is roughly 1/w • A representation with that property is…

  29. Homogeneous Coordinates • Introduced by August Ferdinand Möbius • An example of projective geometry • Represent a 2D point using 3 numbers • The point (u, v, w) corresponds to (u/w, v/w) • Formula for perspective projection means any 3D point is already the homogeneous coordinate of its 2D projection • Multiplying by a scalar doesn’t change the point: • (cu, cv, cw) represents same 2D point as (u, v, w)
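
A short sketch of these properties. Normalizing to unit length and comparing points by the angle between them is an assumption about how the similarity measure might be implemented; the function names and sample points are hypothetical.

```python
import numpy as np

def to_2d(h):
    """Homogeneous (u, v, w) to the 2D point (u/w, v/w)."""
    h = np.asarray(h, dtype=float)
    return h[:2] / h[2]

def on_unit_sphere(h):
    """Scale a homogeneous point to unit length."""
    h = np.asarray(h, dtype=float)
    return h / np.linalg.norm(h)

def angular_distance(h1, h2):
    """Angle between two homogeneous points; h and -h count as the same point."""
    c = abs(np.dot(on_unit_sphere(h1), on_unit_sphere(h2)))
    return float(np.arccos(np.clip(c, 0.0, 1.0)))

# Scaling does not change the 2D point.
print(to_2d((1, 2, 1)), to_2d((2, 4, 2)))

# Two far-away intersection points: 1000 units apart in the image plane,
# yet separated by only a tiny angle.
far_a, far_b = (1000.0, 0.0, 1.0), (2000.0, 0.0, 1.0)
print(np.linalg.norm(to_2d(far_a) - to_2d(far_b)), angular_distance(far_a, far_b))
```

With this measure, two distant intersection points that differ only because of a small error in line angle come out as close together, which is the behaviour the Euclidean distance lacks.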

  30. Homogeneous Coordinates (figure: the plane w = 1, containing the point (u, v, 1))

  31. Project Intersection Point Onto Unit Sphere

  32. Lines Become “Great Circles”

  33. Clustering: Computer Vision In The Nineties • 60s and 70s: Promise of human equivalence right around the corner • 80s: Backlash against AI • Like an “internet startup” now • 90s: Extensions of existing engineering techniques • Applied statistics: Bayesian Networks • Control Theory: Reinforcement Learning

  34. K Means

  35. 2D to 3D: Distance • How do we get the distance to the board? From the spacing between lines. • While 3D lines map to 2D lines, points equally spaced along a 3D line AREN’T equally spaced in 2D:

  36. Let c = (0, 0, zc) be the point were the z axis hits the board For each line in group 1, we find the 3D perpendicular distance from c to the line, along d2 These should be equally spaced in 3D Find the common difference, like Millikan oil drop expr. Distance To Board

  37. Distance To Board • Let Li = (Ai, Bi, Ci) be the ith line in group 1. • We want to find ti such that the point c + ti d2 is on Li, i.e. • Li•(c + ti d2) = 0 • Li•c + ti Li•d2 = 0 • ti = -(Li•c) / (Li•d2) = -(Ci zc) / (Li•d2) • zc is the only unknown, so put that on the left • Let si = ti/zc = -Ci / (Li•d2) • If we choose our 3D units so that the squares are 1x1, then the common difference is 1/zc
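
The formula for si can be checked on a synthetic board. Everything below (the distance, offset, tilt angle, and the name `line_offsets`) is a hypothetical setup chosen for illustration; the line coefficients are again built as p0 × d1.

```python
import numpy as np

# Hypothetical ground truth: board 20 squares away, first line 3.3 squares
# before the point c where the z axis hits the board.
zc_true, t0_true = 20.0, -3.3
d1 = np.array([1.0, 0.0, 0.0])                    # direction of group 1 lines
d2 = np.array([0.0, np.cos(0.6), np.sin(0.6)])    # in-board unit vector, perpendicular to d1
c = np.array([0.0, 0.0, zc_true])

# Nine lines along d1, one square apart along d2.
p0s = c + (np.arange(9) + t0_true)[:, None] * d2
Ls = np.cross(p0s, d1)                            # (Ai, Bi, Ci) for each line

def line_offsets(Ls, d2):
    """s_i = -C_i / (L_i . d2), i.e. t_i / zc without knowing zc."""
    Ls = np.asarray(Ls, dtype=float)
    return -Ls[:, 2] / (Ls @ d2)

s = line_offsets(Ls, d2)
print(np.diff(s))        # common difference, which should equal 1 / zc_true
```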

  38. Distance To Board • So, given the si, we need to find t0 and zc such that: • si = ti/zc = (i + t0) / zc, i = 0…8 • But, we have outliers & occasional omissions • So we use robust estimation

  39. Robust Estimation • Many existing parameter estimation algorithms optimize a continuous function • Sometimes there’s a closed form (e.g. MLE of center of Gaussian is just the sample mean) • Sometimes it’s something more iterative (e.g. Newton-Raphson) • However, these are usually sensitive to outliers • Data cleaning is often a big part of the analysis • This is why decision trees (which are extremely robust to outliers) are the best all-round “off-the-shelf” data mining technique

  40. Robust Estimation • Find distance (and offset) to maximize score: • Involves evaluating on a fine grid • Isn’t time consuming here, since the number of data points is small
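
The score function itself is not reproduced in this transcript, so the sketch below substitutes an assumed one: for each of the 9 expected line positions i + t0, it takes the nearest measured value si·zc and adds a Gaussian-shaped reward, so outliers and missing lines simply contribute nothing. The kernel width, grid ranges, and test data are all hypothetical.

```python
import numpy as np

def fit_distance(s, zc_grid, t0_grid, n_lines=9, sigma=0.1):
    """Grid search for (zc, t0) maximizing an assumed robust score."""
    s = np.asarray(s, dtype=float)
    # Expected line positions for every candidate offset: shape (T, n_lines).
    slots = t0_grid[:, None] + np.arange(n_lines)[None, :]
    best_score, best_zc, best_t0 = -np.inf, None, None
    for zc in zc_grid:
        t = s * zc                                   # measured positions
        # Distance from each expected position to its nearest measurement.
        resid = np.abs(slots[:, :, None] - t[None, None, :]).min(axis=2)
        scores = np.exp(-0.5 * (resid / sigma) ** 2).sum(axis=1)
        k = int(np.argmax(scores))
        if scores[k] > best_score:
            best_score, best_zc, best_t0 = scores[k], zc, t0_grid[k]
    return best_zc, best_t0, best_score

# Hypothetical data: true zc = 20, t0 = -3.3, line 4 missing, one outlier.
zc_true, t0_true = 20.0, -3.3
s = (np.arange(9) + t0_true) / zc_true
s = np.append(np.delete(s, 4), 0.9)

zc_est, t0_est, score = fit_distance(
    s, np.arange(5.0, 40.0, 0.1), np.arange(-10.0, 2.0, 0.05))
print(zc_est, t0_est, score)
```

Scoring the expected positions rather than the data points keeps integer multiples of the true distance from scoring equally well, since they would leave most of the 9 positions without a nearby measurement.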
