Computer Vision Lecture 13: Object Recognition I
Today we will move on to… • Object Recognition
Pattern and Object Recognition • Pattern recognition is used for region and object classification, and represents an important building block of complex machine vision processes. • No recognition is possible without knowledge. • Both specific knowledge about the objects being processed and hierarchically higher, more general knowledge about object classes are required.
Statistical Pattern Recognition • Object recognition is based on assigning classes to objects. • The device that makes these assignments is called the classifier. • The number of classes is usually known beforehand and can typically be derived from the problem specification. • The classifier does not decide about the class from the object itself; rather, sensed object properties called patterns are used.
Statistical Pattern Recognition • For statistical pattern recognition, quantitative descriptions of objects’ characteristics (features or patterns) are used. • The set of all possible patterns forms the pattern space or feature space. • The classes form clusters in the feature space, which can be separated by discrimination hyper-surfaces.
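The idea of clusters in feature space separated by a discrimination surface can be sketched with a minimal classifier. This is an illustrative toy example, not part of the lecture: the data, the two-dimensional feature space, and the nearest-class-mean rule (whose decision boundary is the hyperplane bisecting the two cluster means) are all assumptions made for the sketch.

```python
import numpy as np

# Toy feature space: two classes forming clusters in 2-D.
# (Illustrative data, not from the lecture.)
rng = np.random.default_rng(0)
class_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))
class_b = rng.normal(loc=(2.0, 2.0), scale=0.3, size=(50, 2))

# A nearest-class-mean classifier: its decision boundary is the
# hyperplane (here, a line) that bisects the two cluster means --
# a simple discrimination hyper-surface.
mean_a = class_a.mean(axis=0)
mean_b = class_b.mean(axis=0)

def classify(pattern):
    """Assign the class whose cluster mean is closer."""
    da = np.linalg.norm(pattern - mean_a)
    db = np.linalg.norm(pattern - mean_b)
    return "A" if da < db else "B"

print(classify(np.array([0.1, -0.2])))  # pattern near cluster A
print(classify(np.array([1.9, 2.1])))   # pattern near cluster B
```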
Object Recognition • How can we devise an algorithm that recognizes certain everyday objects? • Problems: • The same object looks different from different perspectives. • Changes in illumination create different images of the same object. • Objects can appear at different positions in the visual field (image). • Objects can be partially occluded. • Objects are usually embedded in a scene.
Object Recognition • We are going to discuss an example of view-based object recognition. • The presented algorithm (Blanz, Schölkopf, Bülthoff, Burges, Vapnik & Vetter, 1996) tackles some of the problems mentioned above: • It learns what each object in its database looks like from different perspectives. • It recognizes objects at any position in an image. • To some extent, it can compensate for changes in illumination. • However, it would perform very poorly for objects that are partially occluded or embedded in a complex scene.
The Set of Objects • The algorithm learns to recognize 25 different chairs: it is shown each chair from 25 different viewing angles.
The Algorithm • For learning each view of each chair, the algorithm performs the following steps: • Centering the object within the image, • Detecting edges in four different directions, • Downsampling (and thereby smoothing) the resulting five images (the centered image plus the four edge images), • Low-pass filtering each of the five images in four different directions.
The Algorithm • For classifying a new image of a chair (determining which of the 25 known chairs is shown), the algorithm carries out the following steps: • In the new image, centering the object, detecting edges, downsampling, and low-pass filtering as done for the database images, • Computing the difference (distance) of the representation of the new image to all representations of the 25 × 25 = 625 views stored in the database, • Determining the chair with the smallest average distance of its 25 views to the new image (“winner chair”).
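The classification step above can be sketched as follows, assuming the stored views have already been converted to feature vectors. The array shapes, names, and the tiny toy database (3 chairs, 4 views, 8-dimensional vectors instead of 25, 25, and 6400) are illustrative assumptions.

```python
import numpy as np

# Toy database: database[c] holds the view vectors of chair c.
# Chairs are separated by their "identity" value; views of the same
# chair differ only slightly. (Illustrative data, not the real images.)
n_chairs, n_views, dim = 3, 4, 8
base = np.arange(n_chairs, dtype=float)[:, None, None]   # chair identity
views = 0.1 * np.arange(n_views)[None, :, None]          # small view variation
database = np.ones((n_chairs, n_views, dim)) * (base + views)

def classify_view(new_vec, database):
    # Distance of the new image to every stored view...
    dists = np.linalg.norm(database - new_vec, axis=2)   # (chairs, views)
    # ...then the chair with the smallest average distance wins.
    avg = dists.mean(axis=1)
    return int(np.argmin(avg))

# A query identical to one of chair 2's views should select chair 2.
winner = classify_view(database[2, 0], database)
print(winner)
```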
The Algorithm • Centering the object within the image: • Binarize the image. • Compute the center of gravity. • Finally, shift the image content so that the center of gravity coincides with the center of the image.
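The centering step can be sketched as follows. The toy image, its size, and the binarization threshold are illustrative assumptions; only the binarize / center-of-gravity / shift sequence follows the slide.

```python
import numpy as np

# A small toy image with a bright object in the top-left corner.
img = np.zeros((8, 8))
img[1:4, 1:4] = 1.0

# Binarize the image (threshold 0.5 is an assumption).
binary = (img > 0.5).astype(float)

# Compute the center of gravity of the binarized object.
ys, xs = np.nonzero(binary)
cy, cx = ys.mean(), xs.mean()

# Shift the image content so the center of gravity lands on the
# image center (integer shift; wrap-around is harmless here).
shift_y = int(round((img.shape[0] - 1) / 2 - cy))
shift_x = int(round((img.shape[1] - 1) / 2 - cx))
centered = np.roll(img, (shift_y, shift_x), axis=(0, 1))

print(np.nonzero(centered))
```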
Object Recognition • Detecting edges in the image: • Use a convolution filter for edge detection. • For example, a Sobel or Canny filter would serve this purpose. • Use the filter to detect edges in four different orientations. • Store the resulting four images r1, …, r4 separately.
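Directional edge detection by convolution can be sketched with Sobel-style kernels; the exact filters used by the algorithm are not specified on the slide, so the four kernels below (for roughly 0°, 45°, 90°, and 135° edges) are an assumption.

```python
import numpy as np

# Sobel-style kernels for four edge orientations (assumed).
sobel_0 = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)       # vertical edges
sobel_90 = sobel_0.T                                # horizontal edges
sobel_45 = np.array([[ 0,  1, 2],
                     [-1,  0, 1],
                     [-2, -1, 0]], dtype=float)     # diagonal edges
sobel_135 = np.flip(sobel_45, axis=1)               # anti-diagonal edges

def convolve2d(img, k):
    """Naive 'valid' 2-D convolution (kernel flipped, as in convolution)."""
    kh, kw = k.shape
    kf = np.flip(k)
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kf)
    return out

# A vertical step edge responds strongly to the vertical-edge filter.
img = np.zeros((5, 5))
img[:, 3:] = 1.0
r1 = np.abs(convolve2d(img, sobel_0))
print(r1)
```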
Object Recognition • Downsampling the image from 256 × 256 to 16 × 16 pixels: • In order to keep as much of the original information as possible, use a Gaussian averaging filter that is slightly larger than 16 × 16 pixels. • Place the Gaussian filter successively at 16 × 16 positions throughout the original image. • Use each resulting value as the brightness value for one pixel in the downsampled image.
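The Gaussian downsampling step can be sketched as follows. The window size (20 × 20, i.e. slightly larger than the 16 × 16 source region per output pixel) and the sigma are assumptions; the slide does not give exact values.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """A normalized 2-D Gaussian averaging window."""
    ax = np.arange(size) - (size - 1) / 2
    g = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def downsample(img, out_size=16, win=20, sigma=6.0):
    """Place a Gaussian window at out_size x out_size grid positions
    and use each weighted average as one output pixel."""
    h, w = img.shape
    k = gaussian_kernel(win, sigma)
    step_y, step_x = h / out_size, w / out_size
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            # Center the window on the corresponding source region,
            # clipping at the image border.
            cy = int((i + 0.5) * step_y)
            cx = int((j + 0.5) * step_x)
            y0, x0 = max(0, cy - win // 2), max(0, cx - win // 2)
            y1, x1 = min(h, y0 + win), min(w, x0 + win)
            patch = img[y0:y1, x0:x1]
            kk = k[:y1 - y0, :x1 - x0]
            out[i, j] = (patch * kk).sum() / kk.sum()
    return out

# A constant image stays constant under weighted averaging.
small = downsample(np.ones((256, 256)))
print(small.shape)
```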
Object Recognition • Low-pass filtering the image: • Use the following four convolution filters: • Apply each filter to each of the images r0, …, r4. • For example, when you apply k1 to r1 (vertical edges), the resulting image will contain its highest values in regions where the original image contains parallel vertical edges.
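The four low-pass kernels appear only as a figure in the original slides and are not reproduced here; as a stand-in, this sketch uses simple directional averaging kernels (horizontal, vertical, and the two diagonals). Only the structure — four kernels, each applied to each of the five images r0, …, r4, giving 20 filtered images — follows the slide.

```python
import numpy as np

# Stand-in directional low-pass kernels (assumed, not from the slides).
k1 = np.ones((1, 5)) / 5.0                 # horizontal averaging
k2 = np.ones((5, 1)) / 5.0                 # vertical averaging
k3 = np.eye(5) / 5.0                       # diagonal averaging
k4 = np.flip(np.eye(5), axis=1) / 5.0      # anti-diagonal averaging

def filter_all(images, kernels):
    """Convolve every image with every kernel ('same' size, zero padding)."""
    results = []
    for img in images:
        for k in kernels:
            kh, kw = k.shape
            padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
            out = np.zeros_like(img, dtype=float)
            for i in range(img.shape[0]):
                for j in range(img.shape[1]):
                    out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * np.flip(k))
            results.append(out)
    return results

# 5 input images x 4 kernels -> 20 filtered images.
rng = np.random.default_rng(0)
images = [rng.random((16, 16)) for _ in range(5)]
filtered = filter_all(images, [k1, k2, k3, k4])
print(len(filtered))
```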
Object Recognition • Computing the difference between two views: • For each view, we have computed 25 images (r0, …, r4 and their convolutions with k1, …, k4). • Each image contains 16 × 16 = 256 brightness values. • Therefore, the two views to be compared, va and vb, can be represented as 6400-dimensional vectors. • The distance (difference) d between the two views can then be computed as the length of their difference vector: d = ||va − vb||
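The view-distance computation can be sketched directly: flatten the 25 images of 16 × 16 values into one 6400-dimensional vector per view and take the Euclidean length of the difference vector. The random stand-in images are illustrative.

```python
import numpy as np

def view_vector(images):
    """Stack 25 images of shape (16, 16) into one 6400-vector."""
    return np.concatenate([img.ravel() for img in images])

# Stand-in data for two views (25 random 16x16 images each).
rng = np.random.default_rng(2)
view_a = view_vector([rng.random((16, 16)) for _ in range(25)])
view_b = view_vector([rng.random((16, 16)) for _ in range(25)])

# d = ||va - vb||, the length of the difference vector.
d = np.linalg.norm(view_a - view_b)
print(view_a.shape, d >= 0.0)
```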
Results • Classification error: 4.7% • If no edge detection is performed, the error increases to 21%. • We should keep in mind that this algorithm was only tested on computer models of chairs shown in front of a white background. • It would fail for real-world images; handling those would additionally require components for image segmentation and for completion of occluded parts.