Computer Vision
Outline • What is Computer Vision • Introduction • Overview of computer vision • How is it being used • Products and current use • Doing Computer Vision • What tools/software are needed • Image acquisition - how you get the image, and information about image acquisition • Image conversion - converting the image into a usable format • Image processing and image analysis - tweaking pixels • Tools • Open Source Computer Vision Library • RoboRealm
CV- Introduction • Computer vs Human Vision • We want robots to see like humans • Why? • Because a sighted robot is more adaptable and useful than a non sighted robot. • For example: “Imagine that a robot is needed to pick up and use bolts that are lying loose in a box. The bolts can be at any orientation with respect to the gripper and only a sighted robot will have the ability to adapt to this random configuration. A robot with no vision system would have to be pre-programmed by somebody to know exactly where every bolt was lying in the box—not a very cost-effective solution”
CV- Introduction • We want a computer to see like a human • Human vision is very complex • The human eye has a flexible lens and muscles that stretch it; the muscles stretch and contract automatically to focus on an image • Regardless of the distance, we do not need to focus manually • The retina has cones, which are sensitive to different wavelengths and let humans see color, and rods, which respond to light intensity • Computers have none of these • Getting a computer to replicate the human retina is a serious challenge
CV - Overview • Computer vision starts with capturing pixels • Those pixels are then converted into a format readable by the PC, for example binary values such as 11111111 11111111 11111111
CV – Overview • Process image features such as color, edges, and movement. • We will discuss each of the above points later in the lecture
CV-Products and Uses • Medicine • Ultrasound • Security • www.sarnoff.com/security/security.html • Facial Recognition for bars • http://www.engadget.com/2006/02/28/biobouncer-facial-recognition-system-for-bars-clubs/
CV-Products and Uses • Eye Tracking • Games • EyeToy • Traffic Analysis • And more
What tools/software are needed • Image acquisition software - lets graphics software communicate with image capture devices • VFW - Video for Windows • WIA - Windows Image Acquisition • MATLAB Image Acquisition Toolbox • TWAIN • Check which image formats are supported • An image acquisition device • Camera • Scanner • Video
2D Image Acquisition • Capturing the image • Saved images - images stored on your system in formats such as .jpg, .tiff, etc. • Good for testing • Video capture - capturing a live video feed from a device such as a video camera • Much more practical
2D Image Acquisition • Methods of image acquisition • Cameras • Scanners • Phones • Regions of Interest (ROIs) • Areas of interest within the picture • Apply specific image processing techniques to the ROIs only • Image acquisition may be the easiest of all the steps, but it is also one of the most important: a bad image can make image processing much more difficult
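A region of interest can be pictured as a rectangular crop of the pixel array. The sketch below is a minimal illustration in plain Python; the function name `crop_roi` and the sample pixel values are made up for this example and do not come from any tool named in the lecture.

```python
def crop_roi(image, top, left, height, width):
    """Return the rows top..top+height-1 and columns left..left+width-1 of image."""
    return [row[left:left + width] for row in image[top:top + height]]


# A tiny 4x5 grayscale image; the values are invented for illustration.
image = [
    [0, 1, 2, 3, 4],
    [10, 11, 12, 13, 14],
    [20, 21, 22, 23, 24],
    [30, 31, 32, 33, 34],
]

# Keep only a 2-row by 3-column window starting at row 1, column 2.
roi = crop_roi(image, top=1, left=2, height=2, width=3)
```

Any later processing step can then run on `roi` instead of the full image, which reduces the number of pixels it has to visit.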
Image Conversion • Images must be digitized • Continuous to discrete • Sampling • Sampling rate (from Wikipedia): the number of samples per second taken from a continuous signal to make a discrete signal; for an image, the samples are taken per unit of distance rather than per second • Image resolution - number of pixels per inch (ppi), or dots per inch (dpi) • More pixels per inch give a more detailed image
Image Conversion • Digitizing images cont. • The image is divided into columns (W) and rows (H) • 2048 (W) x 1536 (H) = 3,145,728 pixels, or about 3.1 megapixels • Convert each pixel into 8-bit digital color values • RGB • For most applications a pixel is stored in 24 bits • The first 8 bits for the redness (0-255) (R) • The second 8 bits for the greenness (0-255) (G) • The last 8 bits for the blueness (0-255) (B) • Black and white • 16 shades of grey • White, black, and 14 intermediate steps
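The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the helper names are invented for the example, and averaging the three channels to get a grey level is an assumption, not something the slide specifies.

```python
def pixel_count(width, height):
    """Total number of pixels in a width x height image."""
    return width * height


def pack_rgb(r, g, b):
    """Pack three 8-bit channels into one 24-bit value: red, then green, then blue."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be in the range 0-255")
    return (r << 16) | (g << 8) | b


def unpack_rgb(value):
    """Split a 24-bit value back into its (R, G, B) channels."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


def to_grey16(r, g, b):
    """Map a color to one of 16 grey levels (0 = black, 15 = white).

    Averaging the channels is an assumption made for this sketch.
    """
    return ((r + g + b) // 3) // 16


total = pixel_count(2048, 1536)   # 3,145,728 pixels
orange = pack_rgb(255, 128, 0)    # R = 255, G = 128, B = 0
bits = format(orange, "024b")     # the same value as a 24-character bit string
```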
Image Conversion • Picture to numbers • Picture a two-dimensional array in which each cell has an (x, y) coordinate and holds the RGB value at that coordinate • The decimal RGB value is stored in binary format
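The "picture to numbers" idea might look like this in Python. It is a sketch under assumptions: the image is stored as a list of rows, so a pixel at (x, y) is read as `image[y][x]`, and the 2x2 sample image and the helper names are made up for illustration.

```python
# A 2x2 image stored as rows of (R, G, B) tuples; values chosen for illustration.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def get_pixel(image, x, y):
    """Return the RGB value at column x, row y."""
    return image[y][x]


def pixel_to_binary(rgb):
    """Render each 8-bit channel of a pixel as a binary string."""
    return " ".join(format(channel, "08b") for channel in rgb)


green = get_pixel(image, 1, 0)
green_bits = pixel_to_binary(green)
white_bits = pixel_to_binary(get_pixel(image, 1, 1))
```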
CV - Image Processing • Probably the most time-consuming and difficult step • Image processing involves taking the machine representation of the image and having the machine recognize various features • This is done with any number of techniques • These techniques will be described in the next section • Features • color - does the object have a unique color (e.g. neon green, bright purple) • intensity - is the object brighter or darker than other objects in the image • object location - is the object always at the top of the image, in the right corner, etc. • movement - does the object move in a specific way, e.g. does it wiggle, sway, move slowly, or stay stationary • texture/pattern - does the object have a unique texture (e.g. tree bark, red bricks, pebble stones) • edges - does the object have well-defined edges that are straight or circular • structure - can the object be composed of simpler blobs or parts of the image arranged in a specific manner?
CV – Image Processing • In many cases image processing takes one array of pixels as input and produces another array of pixels as output that in some way improves on the original • This process may • remove noise • improve the contrast of the image • remove blurring caused by movement of the camera during image acquisition • correct geometrical distortions caused by the lens
CV – Image Processing • Two broad categories • Real-space methods - work by directly processing the input pixel array • Fourier-space methods - first derive a new representation of the input data with a Fourier transform, then process that representation, and finally apply an inverse Fourier transform to the result to give the output image
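The three Fourier-space steps (transform, process, inverse transform) can be sketched on a single row of pixels. This example is an assumption-laden illustration: it uses a naive discrete Fourier transform written with the standard `cmath` module rather than a fast Fourier transform, it works in one dimension only, and the function names and the cutoff rule are invented here.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a 1-D sequence (O(n^2), not an FFT)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(): rebuilds the sequence from its frequency components."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def low_pass(signal, cutoff):
    """Transform, zero every component above `cutoff`, then transform back."""
    n = len(signal)
    spectrum = dft(signal)
    # Bin k and bin n - k carry the same frequency, so compare min(k, n - k).
    kept = [c if min(k, n - k) <= cutoff else 0 for k, c in enumerate(spectrum)]
    return [value.real for value in idft(kept)]


# A row that flickers between 10 and 12 on every pixel (made-up data).
row = [10, 12, 10, 12, 10, 12, 10, 12]
smoothed = low_pass(row, cutoff=1)   # the flicker is removed, the average remains
```

For real images a library FFT would normally replace the naive transform; the structure of the pipeline stays the same.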
CV - Image Processing Real Space Methods • Edge detection - characterizes boundaries • Edges in images are areas of strong intensity contrast: a jump in intensity from one pixel to the next • Edge detection filters out useless information while preserving the important structural properties of an image
CV - Image Processing RSM • The idea of edge detection is to go through every pixel and compare it to its neighbors • If it differs from a neighbor by more than a certain threshold, we have an edge and the pixel is turned white; otherwise it is turned black
CV - Image Processing RSM • An edge detection algorithm • For every pixel (i, j) on the source bitmap • Extract the components C = (R, G, B) of this pixel, C1 = (R1, G1, B1) of its right neighbor, and C2 = (R2, G2, B2) of its bottom neighbor • Compute the color distances D(C, C1) and D(C, C2) using (R1) • If D(C, C1) or D(C, C2) is greater than a parameter K, we have an edge pixel
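The algorithm on this slide could be sketched as follows. The distance formula referred to as (R1) does not appear in these slides, so the sketch assumes plain Euclidean distance in RGB space; the function names and the handling of the last row and column (a missing neighbor is treated as identical to the pixel) are also assumptions.

```python
import math


def color_distance(c1, c2):
    """Euclidean distance between two RGB colors (assumed stand-in for formula R1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def detect_edges(image, k):
    """Return a bitmap with 255 (white) at edge pixels and 0 (black) elsewhere."""
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for j in range(height):
        for i in range(width):
            c = image[j][i]
            # On the last column or row there is no neighbor; reuse the pixel itself.
            right = image[j][i + 1] if i + 1 < width else c
            below = image[j + 1][i] if j + 1 < height else c
            if color_distance(c, right) > k or color_distance(c, below) > k:
                output[j][i] = 255
    return output


BLACK, WHITE = (0, 0, 0), (255, 255, 255)
# Left half black, right half white: the edge lies between columns 1 and 2.
image = [[BLACK, BLACK, WHITE, WHITE] for _ in range(3)]
edges = detect_edges(image, k=100)
```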
CV - Image Processing RSM • Various edge detection techniques using matrix convolution. • Convolution filter (note not necessarily an edge detection technique) • http://www.roborealm.com/help/Convolution.php • Sobel Edge • http://www.roborealm.com/help/Sobel.php • Prewitt Edge • http://www.roborealm.com/help/Prewitt.php • Roberts Edge • http://www.roborealm.com/help/Roberts_Edge.php
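As one example of edge detection by matrix convolution, the Sobel operator slides two 3x3 kernels over the image and combines their responses into a gradient magnitude. The sketch below is a plain-Python illustration, not RoboRealm's implementation; the function names are invented, border pixels are simply left at zero, and the kernels are applied without flipping (correlation), which only changes the sign of each response and not the magnitude.

```python
import math

# Standard Sobel kernels for the horizontal and vertical gradient.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_kernel(image, kernel, y, x):
    """Weighted sum of the 3x3 neighborhood centered on row y, column x."""
    return sum(
        kernel[dy + 1][dx + 1] * image[y + dy][x + dx]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    )


def sobel_magnitude(image):
    """Gradient magnitude of a grayscale image; the one-pixel border stays 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = apply_kernel(image, SOBEL_X, y, x)
            gy = apply_kernel(image, SOBEL_Y, y, x)
            out[y][x] = math.hypot(gx, gy)
    return out


# Grayscale test image with a vertical step from 0 to 10 (made-up values).
image = [[0, 0, 10, 10] for _ in range(4)]
gradient = sobel_magnitude(image)
```

Thresholding `gradient` then gives a black-and-white edge map, in the same way as the pixel-comparison algorithm.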
CV - Image Processing RSM • The Canny edge detection algorithm is known to many as the optimal edge detector • Its intention is to improve on the many edge detectors already described • It follows a list of criteria for improving on those methods • First criterion • Low error rate - no false positives (non-edges reported as edges) and no missed edges • Second criterion • Edges are well localized - the distance between the edge pixels found by the detector and the actual edge is at a minimum • Third criterion • Only one response to a single edge
CV - Image Processing RSM • Canny edge detection cont. • Based on these criteria, the Canny edge detector first smooths the image to eliminate noise • It then finds the image gradient to highlight regions with high spatial derivatives • The algorithm then tracks along these regions and suppresses any pixel that is not at the maximum (non-maximum suppression) • The gradient array is then further reduced by hysteresis • Hysteresis tracks along the remaining pixels that have not been suppressed • Hysteresis uses two thresholds, a low threshold T1 and a high threshold T2 • If the magnitude is below T1, it is set to zero (made a non-edge) • If the magnitude is above T2, it is made an edge • If the magnitude is between the two thresholds, it is set to zero unless there is a path from this pixel to a pixel with a gradient above T2 • http://www.pages.drexel.edu/~weg22/can_tut.html
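The hysteresis step can be sketched on its own, assuming the gradient magnitudes have already been computed and thinned by non-maximum suppression. This is an illustrative outline rather than the tutorial's code: the names `t1` (low threshold) and `t2` (high threshold), the 8-neighbor connectivity, and the sample values are assumptions.

```python
def hysteresis(magnitude, t1, t2):
    """Keep pixels above t2, plus pixels at or above t1 that connect to one of them."""
    height, width = len(magnitude), len(magnitude[0])
    edges = [[False] * width for _ in range(height)]

    # Every pixel above the high threshold is an edge and a starting point.
    stack = [
        (y, x)
        for y in range(height)
        for x in range(width)
        if magnitude[y][x] > t2
    ]
    for y, x in stack:
        edges[y][x] = True

    # Grow outward from the strong pixels through in-between pixels.
    while stack:
        y, x = stack.pop()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (
                    0 <= ny < height
                    and 0 <= nx < width
                    and not edges[ny][nx]
                    and magnitude[ny][nx] >= t1
                ):
                    edges[ny][nx] = True
                    stack.append((ny, nx))
    return edges


# One row of made-up gradient magnitudes: a strong peak with weaker shoulders,
# and an isolated weak pixel at the far right.
magnitude = [[0, 5, 9, 5, 0, 5]]
result = hysteresis(magnitude, t1=4, t2=8)
```

The isolated weak pixel is dropped because no path links it to a pixel above the high threshold, while the two shoulders next to the peak are kept.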
CV - Image Processing RSM • Matrix convolution is also used to filter out noise, sharpen, emboss, adjust contrast, etc. • There are plenty of basic filter kernels for each of these operations • Check the references to find out more
CV - Image Processing RSM • Issues with matrix convolution • Requires a very large number of computations: one multiply-add per kernel entry for every pixel • Reducing the size of the filter helps; 3x3 matrices are common • Decompose the filter's convolution matrix into the product of a horizontal vector and a vertical vector, as seen in a few of the algorithms • FFT convolution - discussed later
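The decomposition idea can be illustrated with the Sobel horizontal-gradient kernel, which is the product of the vertical vector [1, 2, 1] and the horizontal vector [-1, 0, 1]. The sketch below, with invented function names and made-up pixel values, filters a small image once with the full 3x3 matrix and once with the two vectors in sequence; only the interior pixels are computed, so no border handling is needed.

```python
def outer(column, row):
    """Build a matrix as the product of a vertical and a horizontal vector."""
    return [[c * r for r in row] for c in column]


def filter_rows(image, h):
    """Apply the horizontal vector h along every row (interior columns only)."""
    r = len(h) // 2
    return [
        [sum(h[i] * row[x - r + i] for i in range(len(h)))
         for x in range(r, len(row) - r)]
        for row in image
    ]


def filter_cols(image, v):
    """Apply the vertical vector v down every column (interior rows only)."""
    r = len(v) // 2
    return [
        [sum(v[i] * image[y - r + i][x] for i in range(len(v)))
         for x in range(len(image[0]))]
        for y in range(r, len(image) - r)
    ]


def filter_2d(image, kernel):
    """Apply a full square kernel (interior pixels only)."""
    r = len(kernel) // 2
    size = len(kernel)
    return [
        [sum(kernel[i][j] * image[y - r + i][x - r + j]
             for i in range(size) for j in range(size))
         for x in range(r, len(image[0]) - r)]
        for y in range(r, len(image) - r)
    ]


vertical, horizontal = [1, 2, 1], [-1, 0, 1]
sobel_x = outer(vertical, horizontal)

image = [
    [3, 1, 4, 1],
    [5, 9, 2, 6],
    [5, 3, 5, 8],
    [9, 7, 9, 3],
]
full = filter_2d(image, sobel_x)
separated = filter_cols(filter_rows(image, horizontal), vertical)
```

Both routes give the same output, but the separated version needs 3 + 3 multiplications per pixel instead of 9, and the saving grows with the kernel size.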
CV – Image Processing RSM • Color extraction • Can be done by pixel comparison • Useful when looking for a specific color, for example a green ball • Similar to the first edge filter algorithm • For every pixel (i, j) on the source bitmap • Extract the components C = (R, G, B) of this pixel • Compute D(C, C0) using (R1), where C0 is the color we are looking for • If D(C, C0) is less than a parameter K, we have found a pixel whose color matches the color we are looking for, and we mark it white on the output bitmap; otherwise we leave it black
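Color extraction by pixel comparison might be sketched like this. As with the edge filter, formula (R1) is not shown in these slides, so Euclidean distance in RGB space is assumed; the function names, the target color, and the sample pixels are made up for the example.

```python
import math


def color_distance(c1, c2):
    """Euclidean distance between two RGB colors (assumed stand-in for formula R1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def extract_color(image, target, k):
    """Mark pixels within distance k of the target color white (255), others black (0)."""
    return [
        [255 if color_distance(pixel, target) < k else 0 for pixel in row]
        for row in image
    ]


# Looking for a green ball: two greenish pixels, one red, one black.
image = [
    [(0, 200, 0), (255, 0, 0)],
    [(10, 190, 20), (0, 0, 0)],
]
mask = extract_color(image, target=(0, 200, 0), k=50)
```

A larger `k` accepts more shades around the target color, at the cost of more false matches.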
CV – Image Processing Fast Fourier Transformation • A signal such as a color wave can be recorded either as a set of values measured at equally spaced distances or as an equivalent set of spatial frequency values • Each of these frequency values is referred to as a frequency component • A two-dimensional image array can likewise be described in terms of spatial frequencies • A given frequency component then specifies the contribution made by data that changes at the specified x- and y-direction spatial frequencies
CV – Image Processing FFT • Frequencies • High vs low • High frequency means the data changes rapidly over a short distance, e.g. a page of text • Low frequency means large-scale features are more important, e.g. a single simple object that occupies most of the image
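The difference between high and low frequency content can be measured on a single row of pixels. The sketch below is a rough illustration: it uses a naive discrete Fourier transform from the standard `cmath` module instead of an FFT, and both the "high frequency share" measure and the two sample rows are invented for this example.

```python
import cmath


def magnitude_spectrum(signal):
    """Magnitude of each frequency component (naive DFT, not an FFT)."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def high_frequency_share(signal):
    """Fraction of the spectral energy, excluding the average, in the upper frequencies."""
    n = len(signal)
    energy = [m * m for m in magnitude_spectrum(signal)]
    total = sum(energy[1:])   # bin 0 is the average brightness, so skip it
    if total == 0:
        return 0.0
    # Bin k and bin n - k carry the same frequency; "high" means above n // 4.
    high = sum(e for k, e in enumerate(energy) if k != 0 and min(k, n - k) > n // 4)
    return high / total


text_like = [0, 255, 0, 255, 0, 255, 0, 255]   # changes on every pixel
blob_like = [0, 0, 255, 255, 255, 255, 0, 0]   # one wide object

text_share = high_frequency_share(text_like)
blob_share = high_frequency_share(blob_like)
```

The rapidly changing row keeps nearly all of its energy in the high frequencies, while the single wide object keeps most of it in the low ones.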
CV – Image Processing • Motion Detection and Tracking
CV – Image Processing • Shape recognition
CV – Tools • Open Source Computer Vision Library • RoboRealm
CV - Resources • Computer Vision article • Good high-level summary to get you started with CV • RoboRealm • Great place to start for computer vision • Lots of examples and a simple tutorial • Vision Systems book online • Very good resource on the topics mentioned in this lecture and more • Image Processing Fundamentals • Similar to the Vision Systems book, but not as in-depth
CV - Resources • An introduction to digital image processing • A must-read for understanding the various methods used for image processing • Gives algorithms and C code • Edge detection tutorial • Canny edge detection tutorial • There are many more resources, but these will be the most useful