Problem Session #7 EE368/CS232 Digital Image Processing
1. Robustness of the SIFT Descriptor In this problem, we investigate how well SIFT descriptors still match if an image is modified in a variety of ways, e.g., by adding a brightness offset, by adding noise, or by blurring the image. Ideally, feature matches should not be affected by any of the following image modifications. Please download the image hw7_building.jpg from the handouts webpage.
Part A: Extract SIFT features using the vl_sift function in the VLFeat library. Show the feature keypoints overlaid on top of the image using the vl_plotframe function.

[f, d] = vl_sift(single(imgGray));
imshow(imgGray); hold on;
vl_plotframe(f);
Part B: Add a brightness offset to the grayscale intensity (assumed to be in the range [0,255]), for the following offset values: ∆ = -100, -80, …, 80, 100. Compute SIFT descriptors at the same keypoints (locations/scales/orientations) as in the original image. You can use vl_sift with the ‘Frames’ option to input custom keypoints. Match the original image’s SIFT descriptors and the modified image’s SIFT descriptors in the 128-dimensional feature space using nearest-neighbor search with a distance ratio test as implemented in the vl_ubcmatch function (with default threshold 1.5). Measure “repeatability”, which is defined as the number of matching features divided by the number of features in the original image. Display and submit a plot of repeatability versus ∆, and comment on the SIFT descriptor’s robustness against brightness changes.

% Recompute descriptors in the modified image at the (x,y,scale,angle)
% frames of the features in the original image
[fo, do] = vl_sift(single(imgOffset), 'Frames', f);
% Match descriptors of the modified image against descriptors of the original image
matches = vl_ubcmatch(do, d);
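The full Part B procedure can be sketched as the following loop, assuming img holds the grayscale original and [f, d] were computed as in Part A (variable names are illustrative, not prescribed by the assignment):

```matlab
offsets = -100:20:100;
repeatability = zeros(size(offsets));
for i = 1:numel(offsets)
    % Add the offset in double precision, then let uint8 saturate to [0,255]
    imgOffset = uint8(double(img) + offsets(i));
    % Recompute descriptors at the original keypoints
    [fo, do] = vl_sift(single(imgOffset), 'Frames', f);
    matches = vl_ubcmatch(do, d);   % default distance ratio threshold 1.5
    repeatability(i) = size(matches, 2) / size(f, 2);
end
plot(offsets, repeatability);
xlabel('\Delta'); ylabel('Repeatability');
```

The same loop skeleton can be reused for Parts C–E by swapping in the appropriate image modification.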
Part C: Repeat part (b), except instead of changing the brightness, adjust the contrast of the image by raising the grayscale intensity to a power γ, for the following γ values: γ = 0.5, 0.75, …, 1.75, 2.0. Display and submit a plot of repeatability versus γ, and comment on the SIFT descriptor’s robustness against contrast changes.
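Within the Part B loop skeleton, the per-image modification for Part C might look like the following sketch. Normalizing to [0,1] before applying the power law is an assumption; applying the exponent to raw [0,255] values would change the intensity range:

```matlab
gammas = 0.5:0.25:2.0;
% Gamma-adjust: normalize to [0,1], raise to the power, rescale to [0,255]
imgGamma = uint8(255 * (double(img) / 255) .^ gammas(i));
```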
Part D: Repeat part (b), except instead of changing the brightness, add zero-mean Gaussian noise with standard deviation σn, for the following standard deviation values: σn = 0, 5, …, 25, 30. Use the function randn to generate the noise. Display and submit a plot of repeatability versus σn, and comment on the SIFT descriptor’s robustness against additive noise.
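For Part D, the modification step inside the loop could be sketched as follows (uint8 conversion clips the noisy values back to [0,255]):

```matlab
sigmasN = 0:5:30;
% Zero-mean Gaussian noise with standard deviation sigmasN(i)
imgNoisy = uint8(double(img) + sigmasN(i) * randn(size(img)));
```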
Part E: Repeat part (b), except instead of changing the brightness, convolve the image with a Gaussian kernel of standard deviation σb, for the following standard deviation values: σb = 1, 2, …, 9, 10. Use the function fspecial to construct a Gaussian kernel of standard deviation σb with finite extent (10σb + 1) x (10σb + 1). Display and submit a plot of repeatability versus σb, and comment on the SIFT descriptor’s robustness against blurring.
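For Part E, the blurring step inside the loop might be sketched as below; using imfilter with 'replicate' boundary handling is an assumption, as the assignment only specifies the kernel:

```matlab
sigmasB = 1:10;
% Gaussian kernel of standard deviation sigmasB(i) with the specified finite extent
kernelSize = 10 * sigmasB(i) + 1;
h = fspecial('gaussian', [kernelSize kernelSize], sigmasB(i));
imgBlur = imfilter(img, h, 'replicate');
```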
2. Recognition of Posters with Local Image Features When you visit a poster at a conference/meeting, it would be useful to be able to snap a picture of the poster and automatically retrieve the authors’ contact information and the corresponding publication for later review. Please download the images hw7_poster_1.jpg, hw7_poster_2.jpg, and hw7_poster_3.jpg from the handouts webpage. These query images show 3 different posters during a previous EE368/CS232 poster session. Also download hw7_poster_database.zip, which contains clean database images of all posters shown during that poster session.
For each query image, use the following algorithm to match to the best database image:
• Extract SIFT features from the query image using vl_sift in the VLFeat library.
• Match the query image’s SIFT features to every database image’s SIFT features using nearest-neighbor search with a distance ratio test as implemented in vl_ubcmatch.
• From the feature correspondences that pass the distance ratio test, find the inliers using RANSAC with a homography as the geometric mapping.
• Report the database image with the largest number of inliers after RANSAC as the best matching database image.
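The steps above can be sketched as the following matching loop. Here dbFrames, dbDesc, and countRansacInliers are illustrative names, not VLFeat functions; the RANSAC inlier count could be adapted from the homography-fitting code in sift_mosaic.m:

```matlab
[fq, dq] = vl_sift(single(queryGray));   % extract once, outside the timed section
bestInliers = 0; bestIdx = 0;
tic;
for n = 1:numDatabaseImages
    % Distance ratio test between query and database descriptors
    matches = vl_ubcmatch(dq, dbDesc{n});
    % Hypothetical helper: RANSAC with a homography over the correspondences
    numInliers = countRansacInliers(fq, dbFrames{n}, matches);
    if numInliers > bestInliers
        bestInliers = numInliers;
        bestIdx = n;
    end
end
matchTime = toc;   % matching time only; feature extraction is excluded above
```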
Please submit the following results for each query image:
• A side-by-side view of the query image and the best matching database image.
• A side-by-side view of the query image and the best matching database image, with SIFT keypoints overlaid on each image.
• A side-by-side view of the query image and the best matching database image, with feature correspondences after the distance ratio test overlaid and connected by lines.
• A side-by-side view of the query image and the best matching database image, with feature correspondences after RANSAC overlaid and connected by lines.
• A measurement of the amount of time required to match each query image to the database. Exclude the time spent on feature extraction. You can use the functions tic and toc.
% SIFT_MATCH Match two images using SIFT and RANSAC
%
%   SIFT_MATCH demonstrates matching two images based on SIFT
%   features and RANSAC.
%
%   SIFT_MATCH by itself runs the algorithm on two standard test
%   images. Use SIFT_MATCH(IM1,IM2) to compute the matches of two
%   custom images IM1 and IM2.
%
%   SIFT_MATCH can also run on two pre-computed sets of features.
%   Use SIFT_MATCH(IM1, IM2, FEAT1, FEAT2), where FEAT1.f and FEAT1.d
%   represent the SIFT frames and descriptors of the first image.
%
%   SIFT_MATCH returns MATCHRESULT, where MATCHRESULT.RATIO_TEST
%   reports the number of correspondences after the distance ratio
%   test, MATCHRESULT.RANSAC reports the number of correspondences
%   after the distance ratio test + RANSAC with a homography, and
%   MATCHRESULT.MODEL contains the best homography found.
3. Image Stitching with Local Image Features Matching local image features between overlapping images of the same scene can be useful for stitching multiple images together into a panorama. Please download the images hw7_panorama_input_{1,2,3,4}.jpg and the script sift_mosaic.m from the handouts webpage. The script (which requires the VLFeat library) matches SIFT features between a pair of images using a distance ratio test, finds a homography using RANSAC, and blends the overlapping region of the two images together with equal weights.
% SIFT_MOSAIC Demonstrates matching two images using SIFT and RANSAC
%
%   SIFT_MOSAIC demonstrates matching two images based on SIFT
%   features and RANSAC and computing their mosaic.
%
%   SIFT_MOSAIC by itself runs the algorithm on two standard test
%   images. Use SIFT_MOSAIC(IM1, IM2) to compute the mosaic of two
%   custom images IM1 and IM2.
Part A: Generate and submit a single panorama from the four input images using the sift_mosaic.m script. One possible method is to first stitch together images 1 and 2, next stitch together images 3 and 4, and finally stitch together the two intermediate images. Briefly comment on any visual artifacts caused by the stitching algorithm.
Part B: Modify sift_mosaic.m to perform a distance-weighted linear blending in the overlapping region of the two images. For pixels where only one image has a valid sample, assign a weight of 1 to that image. Generate and submit a single panorama from the four input images using the new blending method. Briefly comment on whether the visual artifacts from Part A have now been suppressed.
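One way to sketch the distance-weighted blending is with distance transforms of the two valid-pixel masks. Here mask1/mask2 and im1Warped/im2Warped are illustrative names for the warped images and their support in the mosaic frame, which sift_mosaic.m is assumed to provide:

```matlab
% Weight each pixel by its distance to the nearest invalid pixel of that image
w1 = bwdist(~mask1);
w2 = bwdist(~mask2);
wSum = w1 + w2;
wSum(wSum == 0) = 1;   % avoid division by zero outside both images
% Normalized weights: where only one image is valid, its weight becomes 1
mosaic = (w1 .* double(im1Warped) + w2 .* double(im2Warped)) ./ wSum;
```

In the overlap, each image's contribution grows with its distance from its own boundary, which tapers the seam instead of switching abruptly between images.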
4. Recognition of Posters with a Vocabulary Tree Although the pairwise image matching algorithm described in Problem 2 can accurately find the best matching database image for each query image, the computational cost is high. Depending on the speed of your machine, the matching procedure can take tens of seconds to several minutes for each query image. In this problem, we will use a vocabulary tree to substantially speed up the image retrieval process and quickly find the best matching database candidates.
Training Please download the file hw7_training_descriptors.mat from the homework webpage. This file contains 200,000 training SIFT descriptors extracted from 9,000 DVD cover images. Train a vocabulary tree with branch factor 10 and depth 4, so that the tree has 10,000 leaf nodes. You can use the function vl_hikmeans in the VLFeat library to perform hierarchical k-means clustering.

load('hw7_training_descriptors.mat');
k = 10;
numLeaves = 10000;
[tree, assign] = vl_hikmeans(uint8(trainingDescriptors), k, numLeaves);
Testing For the poster images from Problem 2, quantize each database image’s SIFT descriptors and each query image’s SIFT descriptors through the vocabulary tree. You can use the function vl_hikmeanspush. Compute a histogram of visit counts over the leaf nodes. You can use the function vl_hikmeanshist. Normalize each histogram to have unit L1 norm; each histogram can then be considered as an empirical probability mass function. For two normalized histograms u and v, use the L1 distance between u and v as a measurement of their distance or dissimilarity from each other. Vocabulary tree-based retrieval finds the database images whose histograms have the lowest L1 distances (equivalently, the highest histogram intersection scores) to the query image’s histogram.

% Quantize descriptors through vocabulary tree
treePaths = vl_hikmeanspush(tree, descriptors);
% Compute histogram over leaf bins
treeHist = vl_hikmeanshist(tree, treePaths);
leafHist = treeHist(end-numLeaves+1:end);
leafPMF = leafHist / sum(leafHist);
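Given normalized leaf histograms, the retrieval step reduces to an L1 nearest-neighbor search. A minimal sketch, assuming queryPMF is numLeaves x 1 and dbPMF is numLeaves x numDatabase (both illustrative names, built per image as above):

```matlab
% L1 distance between the query PMF and every database PMF
dists = sum(abs(dbPMF - repmat(queryPMF, 1, size(dbPMF, 2))), 1);
% Rank database images by increasing L1 distance
[~, ranking] = sort(dists, 'ascend');
bestMatch = ranking(1);   % top retrieval candidate
```

The top-ranked candidates can then be verified, if desired, with the pairwise matching and RANSAC procedure from Problem 2.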