Advanced Computer Vision

Advanced Computer Vision Lecture 05

Pedestrian Detection
• Finding People in Images and Videos, Navneet Dalal
• http://lear.inrialpes.fr/people/dalal/NavneetDalalThesis.pdf
• Chapter 4: Histogram of Oriented Gradients Based Encoding of Images
• http://lear.inrialpes.fr/pubs/2005/DT05/cvpr2005_talk.pdf
• Histograms of Oriented Gradients for Human Detection: http://lear.inrialpes.fr/pubs/2005/DT05/hog_cvpr2005.pdf

Datasets
• INRIA Person Dataset: http://pascal.inrialpes.fr/data/human/
• Computer Vision Datasets: http://clickdamage.com/sourcecode/cv_datasets.html

Today’s Topics
• Schedule
• Motion Detection
• Histogram Matching
• Moments
• Homework Assignment
• Exam 1 on Thursday
  • Closed book
• Exam 2 – Take home
  • February 10th, 15th and 17th
  • Technical evaluation of each team’s project
  • Rate the solution approach
  • Performance of the solution – correct vs. incorrect
  • Rate each member of your own team
  • Rank all teams’ projects

Non-stationary Camera
• Example: a camera panning a scene
• One approach is to register the adjacent frames
  • Find key points in adjacent frames
  • Determine offset
  • Adjust images so that they overlap
  • Take difference
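The register-then-difference idea above can be sketched in Python for the simplest case, a pure integer translation between two frames. This is a minimal sketch under stated assumptions, not the lecture's code: it skips key-point detection and instead searches a small range of offsets exhaustively, and the names `overlap`, `estimate_shift`, `aligned_abs_diff`, and `max_shift` are hypothetical. The two "frames" are crops of a synthetic static scene, so a correct alignment should give a zero difference.

```python
def overlap(n, d):
    """Indices i in [0, n) for which i + d is also in [0, n)."""
    return range(max(0, -d), min(n, n - d))


def estimate_shift(ref, cur, max_shift=3):
    """Return (dy, dx) minimizing mean |ref[y][x] - cur[y+dy][x+dx]| over the overlap."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            diffs = [abs(ref[y][x] - cur[y + dy][x + dx])
                     for y in overlap(h, dy) for x in overlap(w, dx)]
            cost = sum(diffs) / len(diffs)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


def aligned_abs_diff(ref, cur, shift):
    """Absolute difference of the two frames over their overlap after shifting."""
    dy, dx = shift
    h, w = len(ref), len(ref[0])
    return [[abs(ref[y][x] - cur[y + dy][x + dx]) for x in overlap(w, dx)]
            for y in overlap(h, dy)]


# Synthetic static scene; the "camera" pans by cropping at different offsets.
scene = [[(7 * x * x + 13 * y + 3 * x * y) % 251 for x in range(14)]
         for y in range(14)]
frame10 = [row[2:10] for row in scene[2:10]]
frame11 = [row[4:12] for row in scene[3:11]]

shift = estimate_shift(frame10, frame11)
diff = aligned_abs_diff(frame10, frame11, shift)
print("estimated shift:", shift)
print("largest residual:", max(max(row) for row in diff))
```

With real frames, moving objects would show up as the non-zero regions of `diff`, while the static background cancels.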

Panning a Building Complex

Overall approach

Points of Interest

Correspondence

Difference

Two Sequential Frames - Color

Two Sequential Frames - Grayscale

abs(frame11-frame10)

Homework #3
• The frames were captured with a non-stationary camera
• The goals are to align the two frames and to demonstrate the alignment by calculating the absolute difference of the two frames (it should be zero if there are no moving objects)
• Approach:
  • Find key points using the Harris detector and SIFT
  • Match points using cross correlation
  • Determine the amount of shift in both the x and y directions
  • Shift one image relative to the second image
  • Calculate the difference
• Submit the following (Jan 6th)
  • A write-up describing your method and results; compare the Harris and SIFT approaches
  • The difference image should be included in the write-up
  • Code
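The "match points using cross correlation" step can be illustrated with a small Python sketch. It assumes the normalized form of cross correlation (mean-subtracted and scaled to the range -1 to 1), which the slide does not specify, and it compares equal-size patches only; `ncc` and `best_match` are hypothetical names, and the patch values are made up for illustration.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross correlation of two equal-size patches (lists of rows)."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # flat patch: correlation undefined, treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(patch, candidates):
    """Index of the candidate patch with the highest correlation score."""
    scores = [ncc(patch, c) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


patch = [[10, 20], [30, 50]]
brighter = [[2 * v + 5 for v in row] for row in patch]  # same pattern, new gain/offset
inverted = [[-v for v in row] for row in patch]          # contrast-reversed pattern
shuffled = [[50, 10], [20, 30]]                          # same values, different layout

print(ncc(patch, brighter), ncc(patch, inverted), ncc(patch, shuffled))
print(best_match(patch, [inverted, shuffled, brighter]))
```

Because the score ignores gain and offset, the brighter copy still matches the original patch, which is useful when exposure changes between frames.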

Region Description
• Invariant to
  • Translation
  • Rotation
  • Scale changes
• Binary images – shape only
  • Object of interest → 1’s
  • Background → 0’s
• Gray scale images
  • Object appearance and shape

Goals
• Recognize an object in another image by its shape, independent of position, orientation and scale

Region Representation
• Color histogram used to represent a region of an image
  • No spatial information
  • Can be used for tracking – robust to changes in pose and shape
• Template: window of pixel intensities
  • Template matching frame to frame
  • Assumes spatial arrangement does not change

Histograms
• Image: a two-dimensional mapping $I : \mathbf{x} \to v$ from pixels $\mathbf{x} = (x, y)^T$ to values $v$
• Histogram: $h_I^{(0)}(b) = n_b, \quad b = 1, 2, \ldots, B$
  • $B$ is the number of bins in the histogram
  • $n_b$ is the number of pixels in the $b$th bin
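The definition above can be sketched in a few lines of Python. This is an illustrative sketch, assuming 8-bit gray values in 0..255 and equal-width bins; the function name `histogram` and the parameter `num_levels` are hypothetical, and the bin index is 0-based here whereas the slide numbers bins from 1 to B.

```python
def histogram(image, num_bins, num_levels=256):
    """Count pixels per bin: counts[b] is n_b for equal-width bins over [0, num_levels)."""
    counts = [0] * num_bins
    for row in image:
        for v in row:
            counts[v * num_bins // num_levels] += 1
    return counts


image = [[0, 10, 64, 70],
         [128, 130, 200, 255]]
h4 = histogram(image, 4)
print(h4, sum(h4))
```

The bin counts always sum to the number of pixels in the region, which is the normalizer used when two histograms are compared.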

Similarity between Histograms
• Similarity between histogram bins: $\rho_n(n_b, n'_b) = \min(n_b, n'_b) \,/\, \sum_{j=1}^{B} n_j$
• Sum all the bin similarities: $\sum_{b=1}^{B} \rho_n(n_b, n'_b)$
• Assuming both histograms have $\sum_{j=1}^{B} n_j$ pixels
M. Swain and D. Ballard. “Color indexing,” International Journal of Computer Vision, 7(1):11–32, 1991.

Histogram Intersection
• A simple example:
>> g = [17, 23, 45, 61, 15];  % histogram bins
>> h = [15, 21, 42, 51, 17];
>> in = sum(min(h,g))/min(sum(h),sum(g))
in = 0.9863

If Histograms Identical
g = 15 21 42 51 17
h = 15 21 42 51 17
>> in = sum(min(h,g))/min(sum(h),sum(g))
in = 1

Different Histograms
h = 15 21 42 51 17
g = 57 83 15 11 1
>> in = sum(min(h,g))/min(sum(h),sum(g))
in = 0.4315
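The three worked examples above can be reproduced with a short Python sketch that mirrors the MATLAB expression `sum(min(h,g))/min(sum(h),sum(g))`. The function name `histogram_intersection` is a hypothetical choice; the bin counts are the ones from the slides.

```python
def histogram_intersection(h, g):
    """Sum of bin-wise minima, normalized by the smaller total pixel count."""
    return sum(min(a, b) for a, b in zip(h, g)) / min(sum(h), sum(g))


h = [15, 21, 42, 51, 17]

similar = histogram_intersection(h, [17, 23, 45, 61, 15])
identical = histogram_intersection(h, h)
different = histogram_intersection(h, [57, 83, 15, 11, 1])

print(round(similar, 4), round(identical, 4), round(different, 4))
```

For the similar pair the bin-wise minima sum to 144 and the smaller histogram holds 146 pixels, giving 144/146; for the different pair the minima sum to only 63.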

Use Gray Scale for Example

Region and Histogram
Similarity with itself:
>> h = hist(q(:),256);
>> g = h;
>> in = sum(min(h,g))/min(sum(h),sum(g))
in = 1

>> r = 236; c = 236;
>> g = im(1:r,1:c);
>> g = hist(g(:),256);
>> in = sum(min(h,g))/min(sum(h),sum(g))
in = 0.5474

Partial Matches
>> g = hist(g(:),256);
>> in = sum(min(h,g))/min(sum(h),sum(g))
in = 0.8014
>> in = sum(min(h,g))/min(sum(h),sum(g))
in = 0.8566

Lack of Spatial Information
• Different patches may have similar histograms

>> in = sum(min(h,g))/min(sum(h),sum(g))
in = 1
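A small Python sketch makes the point concrete: two patches that look nothing alike can have identical histograms and therefore an intersection of 1. The patches below are made-up examples, and `gray_histogram` and `intersection` are hypothetical helper names.

```python
from collections import Counter


def gray_histogram(patch):
    """One bin per gray value; returns a mapping from value to pixel count."""
    return Counter(v for row in patch for v in row)


def intersection(h, g):
    """Histogram intersection, normalized by the smaller pixel count."""
    shared = sum(min(h[k], g[k]) for k in h)
    return shared / min(sum(h.values()), sum(g.values()))


# Left half dark, right half bright.
halves = [[0, 0, 255, 255],
          [0, 0, 255, 255]]
# Same pixel values arranged as a checkerboard.
checker = [[0, 255, 0, 255],
           [255, 0, 255, 0]]

score = intersection(gray_histogram(halves), gray_histogram(checker))
print(score)
```

The histogram only records how many pixels take each value, so any rearrangement of the same pixels is indistinguishable from the original.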

Note
• The examples can be easily extended to color images: RGB, HSV, etc.
• 256 bins were used in the histograms
• A reduced number of bins allows better matches between similar, but not identical, patches
• Too few bins results in poor performance: too many mismatches
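The trade-off in the number of bins can be seen in a small Python sketch. The pixel values are made up for illustration, equal-width bins over 0..255 are assumed, and `histogram` and `similarity` are hypothetical names.

```python
def histogram(pixels, num_bins, num_levels=256):
    """Equal-width histogram of a flat list of gray values."""
    counts = [0] * num_bins
    for v in pixels:
        counts[v * num_bins // num_levels] += 1
    return counts


def similarity(a, b, num_bins):
    """Histogram intersection of two pixel lists at a given bin count."""
    h, g = histogram(a, num_bins), histogram(b, num_bins)
    return sum(min(x, y) for x, y in zip(h, g)) / min(sum(h), sum(g))


patch = [16 * k for k in range(16)]       # gray values 0, 16, ..., 240
brightened = [v + 3 for v in patch]       # same patch, slightly brighter
dark = [10] * 16                          # uniformly dark patch
gray = [100] * 16                         # clearly different, mid-gray patch

print(similarity(patch, brightened, 256))  # fine bins: near-identical patches miss
print(similarity(patch, brightened, 16))   # coarser bins: they match
print(similarity(dark, gray, 16))          # still told apart with 16 bins
print(similarity(dark, gray, 2))           # too few bins: false match
```

With 256 bins a 3-level brightness change moves every pixel into a different bin, while with 2 bins any two patches darker than mid-scale look the same.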

Attempt to Include Spatial Information
• “Spatiograms Versus Histograms for Region-Based Tracking,” Stanley T. Birchfield and Sriram Rangarajan

Moments
Moments, central moments, and invariant moments

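Moments
• 2D moment of order $(p+q)$ of image $f(x,y)$: $m_{pq} = \sum_x \sum_y x^p y^q f(x, y), \quad p, q = 0, 1, 2, \ldots$
• Central moment: $\mu_{pq} = \sum_x \sum_y (x - \bar{x})^p (y - \bar{y})^q f(x, y)$, where $\bar{x} = m_{10}/m_{00}$ and $\bar{y} = m_{01}/m_{00}$
• Normalized central moment: $\eta_{pq} = \mu_{pq} / \mu_{00}^{\gamma}$, where $\gamma = (p+q)/2 + 1$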
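The three definitions above translate directly into Python. This is a minimal sketch, assuming that x is the 0-based column index and y the 0-based row index (the slides do not fix a convention); the function names and the test blob are hypothetical.

```python
def raw_moment(f, p, q):
    """m_pq: sum over all pixels of x^p * y^q * f(x, y)."""
    return sum(x ** p * y ** q * v
               for y, row in enumerate(f) for x, v in enumerate(row))


def centroid(f):
    """(x_bar, y_bar) = (m10 / m00, m01 / m00)."""
    m00 = raw_moment(f, 0, 0)
    return raw_moment(f, 1, 0) / m00, raw_moment(f, 0, 1) / m00


def central_moment(f, p, q):
    """mu_pq: moment taken about the centroid."""
    xbar, ybar = centroid(f)
    return sum((x - xbar) ** p * (y - ybar) ** q * v
               for y, row in enumerate(f) for x, v in enumerate(row))


def normalized_central_moment(f, p, q):
    """eta_pq = mu_pq / mu_00^gamma with gamma = (p + q) / 2 + 1."""
    gamma = (p + q) / 2 + 1
    return central_moment(f, p, q) / central_moment(f, 0, 0) ** gamma


def blob_image(top, left, size=8):
    """Binary image with a 2-row by 3-column block of ones at (top, left)."""
    return [[1 if top <= y < top + 2 and left <= x < left + 3 else 0
             for x in range(size)] for y in range(size)]


a = blob_image(1, 1)
b = blob_image(4, 3)  # same blob, translated

print(raw_moment(a, 1, 0), raw_moment(b, 1, 0))          # raw moments change
print(central_moment(a, 2, 0), central_moment(b, 2, 0))  # central moments do not
print(normalized_central_moment(a, 2, 0))
```

Raw moments depend on where the object sits in the image; subtracting the centroid removes that dependence, which is why the central moments of the two blobs agree.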

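$M_1 = \eta_{20} + \eta_{02}$
$M_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$
$M_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$
$M_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$
$M_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$
$M_6 = (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$
$M_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$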
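The seven invariants can be checked numerically with a short Python sketch. It uses the standard form of Hu's seventh invariant, whose second term carries the factor (η30 − 3η12), and assumes x is the column index and y the row index. The names `eta` and `hu_invariants` and the small test image are hypothetical. A 90-degree rotation should leave all seven values unchanged, while a left-right flip should only change the sign of the seventh.

```python
def eta(f, p, q):
    """Normalized central moment eta_pq of a gray image given as a list of rows."""
    m00 = sum(v for row in f for v in row)
    xbar = sum(x * v for row in f for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(f) for v in row) / m00
    mu = sum((x - xbar) ** p * (y - ybar) ** q * v
             for y, row in enumerate(f) for x, v in enumerate(row))
    return mu / m00 ** ((p + q) / 2 + 1)


def hu_invariants(f):
    """Return [M1, ..., M7] computed from the normalized central moments."""
    n20, n02, n11 = eta(f, 2, 0), eta(f, 0, 2), eta(f, 1, 1)
    n30, n03, n21, n12 = eta(f, 3, 0), eta(f, 0, 3), eta(f, 2, 1), eta(f, 1, 2)
    a, b = n30 + n12, n21 + n03  # sums that recur in M4..M7
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        a ** 2 + b ** 2,
        (n30 - 3 * n12) * a * (a ** 2 - 3 * b ** 2)
        + (3 * n21 - n03) * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        (3 * n21 - n03) * a * (a ** 2 - 3 * b ** 2)
        - (n30 - 3 * n12) * b * (3 * a ** 2 - b ** 2),
    ]


img = [[0, 0, 0, 0, 0, 0],
       [0, 1, 2, 0, 0, 0],
       [0, 3, 0, 0, 0, 0],
       [0, 4, 5, 6, 0, 0],
       [0, 0, 0, 0, 0, 0]]
rotated = [list(r) for r in zip(*img[::-1])]  # 90 degrees clockwise
flipped = [row[::-1] for row in img]          # mirror left-right

m = hu_invariants(img)
m_rot = hu_invariants(rotated)
m_flip = hu_invariants(flipped)
print(m)
print(m_rot)
print(m_flip)
```

The sign change of the seventh value under mirroring is what lets it distinguish an object from its mirror image.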

Images
• Mean and Variance
Consider:
>> A = rand(20);
>> B = rand(20);
>> C1 = [A;B];
>> C2 = [B;A]
>> figure, imshow(C1), colormap(hot), title('C1')
>> figure, imshow(C2), colormap(hot), title('C2')

Mean? Variance?

>> mean(C2(:))
ans = 0.5064
>> var(C2(:))
ans = 0.0800
>> mean(C1(:))
ans = 0.5064
>> var(C1(:))
ans = 0.0800
Mean and variance cannot be used for recognition: C1 and C2 are different images, yet both statistics are identical.
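The same experiment can be sketched in Python with the standard library. The random values will differ from the MATLAB run above, and the seed and the helper name `flatten` are arbitrary choices; the point is only that stacking the two blocks in either order gives the same mean and sample variance.

```python
import random
from statistics import mean, variance

random.seed(0)
A = [[random.random() for _ in range(20)] for _ in range(20)]
B = [[random.random() for _ in range(20)] for _ in range(20)]

C1 = A + B  # rows of A on top of rows of B, like [A; B]
C2 = B + A  # rows of B on top of rows of A, like [B; A]


def flatten(image):
    """All pixel values of an image as one flat list, like C(:)."""
    return [v for row in image for v in row]


stats_c1 = (mean(flatten(C1)), variance(flatten(C1)))
stats_c2 = (mean(flatten(C2)), variance(flatten(C2)))
print(stats_c1)
print(stats_c2)
```

Both statistics depend only on the set of pixel values, not on where the pixels are, so they share the weakness of the plain histogram.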

Moment Invariants
>> invMomentsC1 = abs(log(invmoments(C1)))
invMomentsC1 = 0.8741 2.7730 11.1135 12.6103 24.4960 15.2538 26.1970
>> invMomentsC2 = abs(log(invmoments(C2)))
invMomentsC2 = 0.8889 2.8229 10.2520 11.6699 23.2758 13.3174 22.7919

Image Creation
im1 = imread('CornellClock.jpg');
im1rgb = double(rgb2gray(im1));  % grayscale image stored as double
figure, imshow(im1rgb, [ ]), title('Cornell Clock')
im1rgbCrop1 = im1rgb(450:549,825:924);  % 100 x 100 crop
figure, imshow(im1rgbCrop1, [ ]), title('Cornell Clock Cropped')
im1rgbCrop1Sm = im1rgbCrop1(1:2:end, 1:2:end);  % keep every second row and column
im1rgbCrop1r45 = imrotate(im1rgbCrop1,45,'bilinear');  % rotate by 45 degrees
im1rgbCrop1flr = fliplr(im1rgbCrop1);  % flip left to right
figure, imshow(im1rgbCrop1r45, [ ]), title('45 Degree Rotation')
figure, imshow(im1rgbCrop1flr, [ ]), title('Clock Flipped')
figure, imshow(im1rgbCrop1Sm, [ ]), title('Small Clock')

phi 1    phi 2    phi 3    phi 4    phi 5    phi 6    phi 7
6.3014   17.8100  23.9168  23.0700  47.7443  32.4888  46.6251
6.3016   17.8126  23.9169  23.0717  47.7114  32.4889  46.6324
6.3014   17.8100  23.9168  23.0700  47.7443  32.4888  46.7308
6.2892   17.8693  23.8802  23.0783  47.6431  32.5994  46.6333

phi 1    phi 2    phi 3    phi 4    phi 5    phi 6    phi 7
6.6675   19.2909  26.4725  25.9716  52.4851  35.7716  52.7540
6.6677   19.2901  26.4667  25.9741  52.4872  35.7738  52.7522
6.6675   19.2909  26.4725  25.9716  52.4851  35.7716  52.8475
6.6682   19.2976  26.3441  25.9768  52.3908  35.7817  52.7870

6.5112   14.7902  23.4190  26.5393  51.6422  34.5953  52.2770
6.5112   14.7902  23.4191  26.5393  51.6424  34.5953  52.2770
6.5113   14.7897  23.4136  26.5266  51.6467  34.5529  52.1718

6.7102   16.2675  23.5469  26.5158  52.3896  35.0203  51.7451
6.7101   16.2675  23.5469  26.5158  52.3898  35.0204  51.7452
6.7107   16.2645  23.5353  26.4955  52.3316  34.9976  51.7141

Invariant Moment Summary
           phi 1    phi 2    phi 3    phi 4    phi 5    phi 6    phi 7
CORNELL CLOCK REGIONS
REGION 1   6.3014   17.8100  23.9168  23.0700  47.7443  32.4888  46.6251
           6.3016   17.8126  23.9169  23.0717  47.7114  32.4889  46.6324
           6.3014   17.8100  23.9168  23.0700  47.7443  32.4888  46.7308
           6.2892   17.8693  23.8802  23.0783  47.6431  32.5994  46.6333
REGION 2   6.6675   19.2909  26.4725  25.9716  52.4851  35.7716  52.7540
           6.6677   19.2901  26.4667  25.9741  52.4872  35.7738  52.7522
           6.6675   19.2909  26.4725  25.9716  52.4851  35.7716  52.8475
           6.6682   19.2976  26.3441  25.9768  52.3908  35.7817  52.7870
PEOPLE
PERSON 1   6.5112   14.7902  23.4190  26.5393  51.6422  34.5953  52.2770
           6.5112   14.7902  23.4191  26.5393  51.6424  34.5953  52.2770
           6.5113   14.7897  23.4136  26.5266  51.6467  34.5529  52.1718
PERSON 2   6.7102   16.2675  23.5469  26.5158  52.3896  35.0203  51.7451
           6.7101   16.2675  23.5469  26.5158  52.3898  35.0204  51.7452
           6.7107   16.2645  23.5353  26.4955  52.3316  34.9976  51.7141
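One way to read the summary table is to treat each row as a 7-element feature vector and ask which other row is closest. The Python sketch below does this with plain Euclidean distance, which is an assumption of this illustration and not a method stated in the slides; the numbers are copied from the table, and the names `rows` and `nearest_label` are hypothetical.

```python
import math

rows = [
    ("REGION 1", [6.3014, 17.8100, 23.9168, 23.0700, 47.7443, 32.4888, 46.6251]),
    ("REGION 1", [6.3016, 17.8126, 23.9169, 23.0717, 47.7114, 32.4889, 46.6324]),
    ("REGION 1", [6.3014, 17.8100, 23.9168, 23.0700, 47.7443, 32.4888, 46.7308]),
    ("REGION 1", [6.2892, 17.8693, 23.8802, 23.0783, 47.6431, 32.5994, 46.6333]),
    ("REGION 2", [6.6675, 19.2909, 26.4725, 25.9716, 52.4851, 35.7716, 52.7540]),
    ("REGION 2", [6.6677, 19.2901, 26.4667, 25.9741, 52.4872, 35.7738, 52.7522]),
    ("REGION 2", [6.6675, 19.2909, 26.4725, 25.9716, 52.4851, 35.7716, 52.8475]),
    ("REGION 2", [6.6682, 19.2976, 26.3441, 25.9768, 52.3908, 35.7817, 52.7870]),
    ("PERSON 1", [6.5112, 14.7902, 23.4190, 26.5393, 51.6422, 34.5953, 52.2770]),
    ("PERSON 1", [6.5112, 14.7902, 23.4191, 26.5393, 51.6424, 34.5953, 52.2770]),
    ("PERSON 1", [6.5113, 14.7897, 23.4136, 26.5266, 51.6467, 34.5529, 52.1718]),
    ("PERSON 2", [6.7102, 16.2675, 23.5469, 26.5158, 52.3896, 35.0203, 51.7451]),
    ("PERSON 2", [6.7101, 16.2675, 23.5469, 26.5158, 52.3898, 35.0204, 51.7452]),
    ("PERSON 2", [6.7107, 16.2645, 23.5353, 26.4955, 52.3316, 34.9976, 51.7141]),
]


def nearest_label(index):
    """Label of the row closest (Euclidean distance) to rows[index], excluding itself."""
    _, query = rows[index]
    others = [(math.dist(query, vec), label)
              for i, (label, vec) in enumerate(rows) if i != index]
    return min(others)[1]


for i, (label, _) in enumerate(rows):
    print(label, "->", nearest_label(i))
```

Rows from the same object differ only in the second or third decimal place, while rows from different objects differ by whole units in several columns, so each row's nearest neighbor comes from its own group.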

Useful Papers
• Distinctive Image Features from Scale-Invariant Keypoints - David G. Lowe
• Histograms of Oriented Gradients for Human Detection - Navneet Dalal and Bill Triggs
• Finding People in Images and Videos - Navneet Dalal (PhD)
• Image Description using Scale-Space Edge Pixel Directions Histogram - António M. G. Pinheiro
