Robot\Machine Vision

Cherevatsky Boris

What is computer vision?
• Automatic understanding of images and videos by a computer (which could be mounted on the robot or standalone).
• Computing properties of the 3D world from visual data.
• Algorithms and representations that allow a machine to recognize objects, people, scenes, and activities (perception and interpretation).

Some applications:
• Real-time stereo (NASA Mars Rover)
• Structure from motion (Pollefeys et al.)
• Multi-view stereo for community photo collections (Goesele et al.)

Some applications: objects, activities, scenes, locations, text / writing, faces, gestures, motions, emotions…
[Figure: amusement park photo annotated with labels, e.g. sky, Ferris wheel, The Wicked Twister, Cedar Point, Lake Erie, water, ride, carousel, maxair, deck, bench, umbrellas, tree, pedestrians, people waiting in line, people sitting on ride]

3D Reconstruction: Given many images of a scene, computer vision algorithms can reconstruct a 3D model of it.

Connection to other disciplines: computer vision is connected to graphics, algorithms, robotics, artificial intelligence, image processing, mathematics, and machine learning.

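Image representation on a computer:
• An image is a grid of pixels; in the example, width = 520 and height = 500, with the indices i and j starting at 1.
• Intensity: each pixel value lies in [0, 255].
• Examples: I(176,201) = 164; I(194,203) = 37.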
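A minimal sketch of this representation in Python, assuming the image is stored as a list of rows and that I(i, j) means row i, column j (the slide does not say which index is which). The tiny 3×4 image and the helper `pixel` are hypothetical, and Python indices start at 0 while the slide's start at 1.

```python
# A tiny grayscale image: a list of rows, each value an intensity in [0, 255].
image = [
    [0, 37, 164, 255],
    [12, 80, 200, 128],
    [5, 60, 90, 30],
]

height = len(image)     # number of rows
width = len(image[0])   # number of columns


def pixel(img, i, j):
    """Return I(i, j) for 1-based (row, column) indices, as on the slide."""
    return img[i - 1][j - 1]


print(width, height)       # 4 3
print(pixel(image, 1, 3))  # 164
```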

Color images, RGB color space: three channels, R, G, and B.

Image formation – Pinhole Camera:
• The pinhole camera is a simple model that approximates the imaging process as a perspective projection.
• If we treat the pinhole as a point, only one ray from any given point can enter the camera.
[Figure labels: image plane, pinhole, virtual image]

Perspective Projection
• Far-away objects appear smaller
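The effect can be checked numerically with a small sketch of the pinhole model. It assumes camera-centred coordinates with Z as the depth along the optical axis, a virtual (non-inverted) image plane at distance f, and hypothetical helper names `project` and `projected_height`.

```python
def project(point, f=1.0):
    """Project a 3D point (X, Y, Z) in camera coordinates onto the image plane at distance f."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)


def projected_height(height, depth, f=1.0):
    """Image-plane height of a vertical segment of the given height at the given depth."""
    _, y_top = project((0.0, height, depth), f)
    _, y_bottom = project((0.0, 0.0, depth), f)
    return y_top - y_bottom


near = projected_height(2.0, depth=4.0)
far = projected_height(2.0, depth=8.0)
print(near, far)  # 0.5 0.25
```

Doubling the depth halves the projected size, which is the "far-away objects appear smaller" behaviour.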

Perspective Projection

Mathematical Equations
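For reference, the standard pinhole projection can be written as follows. The notation is an assumption and may differ from the slide's: a camera-centred frame with the Z axis along the optical axis, focal length f, and a scene point (X, Y, Z) that maps to image-plane coordinates (x', y').

```latex
x' = f\,\frac{X}{Z}, \qquad y' = f\,\frac{Y}{Z}
```

Both image coordinates are divided by the depth Z, so the projected size of an object shrinks in proportion to its distance from the camera.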

Perspective Projection & Calibration

Perspective projection (W. Freeman)
Intrinsic parameters: from idealized world coordinates to pixel values
• But "pixels" are in some arbitrary spatial units
• Maybe pixels are not square
• We don't know the origin of our camera pixel coordinates
• There may be skew between camera pixel axes

Intrinsic parameters, homogeneous coordinates
Using homogeneous coordinates, we can write this as a matrix equation relating the point in pixels to the point in camera-based coordinates. (W. Freeman)
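One common way to write the intrinsic mapping in homogeneous coordinates is sketched below. It assumes the Forsyth & Ponce convention, which may not match the slide's exact notation: α and β are the focal length expressed in horizontal and vertical pixel units, (u_0, v_0) is the pixel origin offset, θ is the angle between the pixel axes (θ = 90° means no skew), and (x, y, z) is the point in camera-based coordinates.

```latex
\begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
=
\begin{pmatrix}
\alpha & -\alpha \cot\theta & u_0 \\
0 & \beta / \sin\theta & v_0 \\
0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x/z \\ y/z \\ 1 \end{pmatrix}
```

With square pixels, no skew, and the origin at the optical centre (α = β = f, θ = 90°, u_0 = v_0 = 0), this reduces to plain perspective projection.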

Extrinsic parameters: translation and rotation of the camera frame, written in non-homogeneous coordinates and in homogeneous coordinates. (W. Freeman)

Combining extrinsic and intrinsic calibration parameters, in homogeneous coordinates: world coordinates → (extrinsic) → camera coordinates → (intrinsic) → pixels. (Forsyth & Ponce; W. Freeman)
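The full chain from world coordinates to pixels can be sketched in plain Python. This is an illustration under assumptions, not the slide's formulation: the extrinsic step is taken to be cam = R·world + t, the intrinsic matrix follows the Forsyth & Ponce parameterization (α, β, θ, u0, v0), and the function names and numeric values are hypothetical.

```python
import math


def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def intrinsic_matrix(alpha, beta, u0, v0, theta=math.pi / 2):
    """Intrinsic matrix K; theta is the angle between the pixel axes (pi/2 = no skew)."""
    return [
        [alpha, -alpha * math.cos(theta) / math.sin(theta), u0],
        [0.0, beta / math.sin(theta), v0],
        [0.0, 0.0, 1.0],
    ]


def project_to_pixels(K, R, t, world_point):
    """Apply extrinsics (R, t), then intrinsics K, then divide by the third coordinate."""
    cam = [c + ti for c, ti in zip(mat_vec(R, world_point), t)]
    if cam[2] <= 0:
        raise ValueError("point is behind the camera")
    u, v, w = mat_vec(K, cam)
    return (u / w, v / w)


K = intrinsic_matrix(alpha=800.0, beta=800.0, u0=320.0, v0=240.0)
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # camera axes aligned with world axes
t = [0.0, 0.0, 5.0]                                      # world origin is 5 units in front of the camera

u, v = project_to_pixels(K, R, t, [1.0, 0.5, 0.0])
print(round(u, 6), round(v, 6))  # 480.0 320.0
```

The world point lands at camera coordinates (1, 0.5, 5), so u = 800·1/5 + 320 and v = 800·0.5/5 + 240.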

Edge Detection

Edge Detection
Edge map of the image
(Image and Signal Processing by Computer)

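Edge Detection
• We treat the image as a continuous function f(x,y).
• The gradient of this function: $\nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right)$
• The direction of the gradient is the direction in which the gray levels change most rapidly. The magnitude of the gradient is the value of that maximal slope.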
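On a discrete image the partial derivatives have to be approximated. A minimal sketch, assuming central differences, with x running along columns and y along rows; the helper `gradient_at` and the toy image are hypothetical.

```python
import math


def gradient_at(img, row, col):
    """Central-difference gradient (df/dx, df/dy) at an interior pixel."""
    gx = (img[row][col + 1] - img[row][col - 1]) / 2.0
    gy = (img[row + 1][col] - img[row - 1][col]) / 2.0
    return gx, gy


# A vertical edge: dark on the left, bright on the right.
img = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

gx, gy = gradient_at(img, 1, 1)
magnitude = math.hypot(gx, gy)                 # size of the maximal slope
direction = math.degrees(math.atan2(gy, gx))   # direction of the fastest change

print(gx, gy, magnitude, direction)  # 95.0 0.0 95.0 0.0
```

The gradient points along +x (0 degrees), across the edge, which is the direction in which the gray levels change fastest.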

The gradient: example

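The gradient: example (continued)
>> % Load the image as double so the derivatives can take negative values
>> img = double(imread('cameraman.tif'));
>> % Sobel kernel for the derivative in x; its transpose gives the derivative in y
>> gradFilt = [-1 0 1 ; -2 0 2 ; -1 0 1]/2;
>> grad_x = imfilter(img , gradFilt , 'same' , 'replicate');
>> grad_y = imfilter(img , gradFilt' , 'same' , 'replicate');
>> % Draw one gradient arrow per pixel on top of the image
>> [x,y] = meshgrid(1:size(img,2) , 1:size(img,1));
>> figure; imshow(img , []); hold on;
>> quiver(x , y , grad_x , grad_y , 3 , 'm' , 'LineWidth' , 1);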

The gradient: another example (rice.png)

Approximating the image gradient
• To compute the gradient, we need to compute the derivative in the x and y directions:

Sobel
Filter for computing the derivative in the x direction:
-1  0  1
-2  0  2
-1  0  1
Filter for computing the derivative in the y direction:
 1  2  1
 0  0  0
-1 -2 -1

Prewitt
Filter for computing the derivative in the x direction:
-1  0  1
-1  0  1
-1  0  1
Filter for computing the derivative in the y direction:
 1  1  1
 0  0  0
-1 -1 -1
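A small sketch of how these filters are applied, assuming correlation (the kernel is laid over the 3×3 neighbourhood without flipping) at an interior pixel. The helper `correlate_at` and the toy image are hypothetical, and no normalization factor is applied.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
PREWITT_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
PREWITT_Y = [[1, 1, 1], [0, 0, 0], [-1, -1, -1]]


def correlate_at(img, kernel, row, col):
    """Weighted sum of the 3x3 neighbourhood centred on an interior pixel."""
    return sum(
        kernel[a][b] * img[row - 1 + a][col - 1 + b]
        for a in range(3)
        for b in range(3)
    )


# A vertical edge: intensity changes along x only.
img = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

print(correlate_at(img, SOBEL_X, 1, 1))    # 760
print(correlate_at(img, SOBEL_Y, 1, 1))    # 0
print(correlate_at(img, PREWITT_X, 1, 1))  # 570
print(correlate_at(img, PREWITT_Y, 1, 1))  # 0
```

Both x filters respond strongly to the vertical edge while both y filters return zero; Sobel weights the centre row twice as heavily as Prewitt.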

Edge Detection

What is a good value to choose for the threshold?
T = 100, T = 70, T = 40, T = 20, T = 10, T = 2
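The trade-off can be seen with a minimal sketch, assuming an edge pixel is one whose gradient magnitude exceeds the threshold T. The `edge_map` helper and the magnitude values are hypothetical.

```python
def edge_map(magnitude, threshold):
    """Binary edge map: 1 where the gradient magnitude exceeds the threshold."""
    return [[1 if m > threshold else 0 for m in row] for row in magnitude]


# Hypothetical gradient magnitudes for a 3x4 image.
magnitude = [
    [3, 15, 120, 8],
    [5, 45, 110, 2],
    [1, 75, 95, 12],
]

counts = {}
for T in (100, 40, 10):
    edges = edge_map(magnitude, T)
    counts[T] = sum(sum(row) for row in edges)

print(counts)  # {100: 2, 40: 5, 10: 7}
```

A high threshold keeps only the strongest responses and may break edges apart; a low threshold keeps more of each edge but also lets weak, noisy responses through.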

Edge detection with Canny
• Alternative: Canny:
• E = edge(I,'canny')
• Principles:
• Select only points that are a "local maximum" of the gradient magnitude
• Select weak gradients only if they are connected to strong gradients
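The second principle can be sketched as hysteresis thresholding. This illustrates that one step only, not the full Canny detector (smoothing and local-maximum selection are omitted). The names `hysteresis`, `low`, and `high`, and the use of 8-connectivity, are assumptions.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) connected to a strong one."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the search with every strong pixel.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                edges[r][c] = True
                queue.append((r, c))

    # Grow outward through 8-connected neighbours that are at least weak.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc]
                        and magnitude[nr][nc] >= low):
                    edges[nr][nc] = True
                    queue.append((nr, nc))
    return edges


magnitude = [
    [90, 40, 0, 0, 40],
    [0, 0, 0, 0, 0],
]
result = hysteresis(magnitude, low=30, high=80)
print(result[0])  # [True, True, False, False, False]
```

The weak pixel next to the strong one is kept, while the isolated weak pixel of the same value at the far right is discarded.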
