
Cross-Indexing of Binary Scale Invariant Feature Transform Codes for Large-Scale Image Search

Cross-Indexing of Binary Scale Invariant Feature Transform Codes for Large-Scale Image Search. Presented by Xinyu Chang. Introduction.


Presentation Transcript


  1. Cross-Indexing of Binary Scale Invariant Feature Transform Codes for Large-Scale Image Search Presented by Xinyu Chang

  2. Introduction Image matching is a fundamental aspect of many problems in computer vision, including object or scene recognition, solving for 3D structure from multiple images, stereo correspondence, and motion tracking. In recent years, there has been growing interest in mapping visual features into compact binary codes for applications on large-scale image collections. Encoding high-dimensional data as compact binary codes reduces the memory cost for storage.

  3. Introduction Goal • Extract distinctive invariant features that can be correctly matched against a large database of features from many images • Invariance to image scale and rotation • Robustness to: affine distortion, change in 3D viewpoint, addition of noise, change in illumination

  4. Introduction

  5. Content • Interest Point Detection • Scale-space extrema detection • Keypoint localization • Orientation assignment • Keypoint descriptor • Flexible Binarization • Cross Indexing • Result

  6. Interest Point Detection

  7. Interest Point Detection

  8. Interest Point Detection

  9. Interest Point Detection

  10. Initial Outlier Rejection DoG (difference of Gaussians) is most stable across scale

  11. Interest Point Detection

  12. Rotation invariance • To achieve rotation invariance, compute the central derivatives, gradient magnitude, and gradient direction of L (the smoothed image) at the scale of the keypoint (x, y)
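The gradient computation named on this slide can be sketched as follows. This is a minimal illustration, assuming L is a 2-D NumPy array holding the smoothed image and that undivided central differences are used. The function name `gradient_magnitude_orientation` is hypothetical and does not come from the presentation.

```python
import numpy as np

def gradient_magnitude_orientation(L):
    """Central-difference gradient of a smoothed image L (2-D array).

    Returns (magnitude, orientation) for interior pixels only, so each
    output is two rows and two columns smaller than L. Orientation is in
    radians, as returned by np.arctan2.
    """
    L = np.asarray(L, dtype=np.float64)
    # Rows index y and columns index x (an assumption of this sketch).
    dx = L[1:-1, 2:] - L[1:-1, :-2]   # L(x+1, y) - L(x-1, y)
    dy = L[2:, 1:-1] - L[:-2, 1:-1]   # L(x, y+1) - L(x, y-1)
    magnitude = np.hypot(dx, dy)      # sqrt(dx^2 + dy^2)
    orientation = np.arctan2(dy, dx)
    return magnitude, orientation

# A horizontal ramp: intensity grows by 1 per pixel along x.
ramp = np.tile(np.arange(5, dtype=np.float64), (5, 1))
mag, theta = gradient_magnitude_orientation(ramp)
```

On the ramp, the intensity changes only along x, so the orientation is 0 everywhere and the magnitude equals the undivided central difference.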

  13. Rotation invariance

  14. Rotation invariance

  15. Rotation invariance

  16. Key point descriptor

  17. FLEXIBLE SIFT BINARIZATION Given an image, the detected interest points are denoted by $\{f_i\}_{i=0}^{N-1}$, in which $N$ represents the total number of detected interest points. Each feature $f_i$ includes an L2-normalized descriptor $d_i \in \mathbb{R}^D$; for the SIFT descriptor, $D$ is 128. Our target is to transform the local feature descriptor $d_i$ into an $L$-bit binary code string $B = \{b_0, b_1, \ldots, b_{L-1}\}$.

  18. FLEXIBLE SIFT BINARIZATION $C$ represents the 3-D comparison array of size $D \times D \times 2$, and $C(i, j)$ is the comparison result between the magnitudes in the $i$-th and $j$-th dimensions of descriptor $d$. $\alpha$ is a scalar threshold whose impact is studied in the experiment section.

  19. FLEXIBLE SIFT BINARIZATION The comparison results are concatenated into a comparison string $S$ with $\beta = 2D(D-1)$ bits in total, as shown by the second step in Fig. 2. To simplify the notation, $S$ is denoted in the following as $S = \{s_0, s_1, s_2, \ldots, s_{\beta-1}\}$. To obtain an $L$-bit binary code $B = \{b_0, b_1, \ldots, b_{L-1}\}$, we next encode the comparison string $S$ into $L$ bits.
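To make the bit count 2D(D − 1) concrete, the sketch below builds a D × D × 2 comparison array and flattens its off-diagonal entries into a comparison string. The transcript does not preserve the rule that fills the array, so the two-bit rule used here (one bit for "greater by more than α", one bit for "smaller by more than α") is an assumption for illustration only, and the function name `comparison_string` is hypothetical.

```python
import numpy as np

def comparison_string(d, alpha):
    """Flatten pairwise thresholded comparisons of d into a bit string.

    ASSUMED rule (not taken from the slides): for each ordered pair
    (i, j) with i != j,
        bit 0 = 1 if d[i] - d[j] >  alpha
        bit 1 = 1 if d[i] - d[j] < -alpha
    """
    d = np.asarray(d, dtype=np.float64)
    D = d.shape[0]
    diff = d[:, None] - d[None, :]            # diff[i, j] = d[i] - d[j]
    C = np.zeros((D, D, 2), dtype=np.uint8)   # 3-D comparison array
    C[:, :, 0] = diff > alpha
    C[:, :, 1] = diff < -alpha
    off_diagonal = ~np.eye(D, dtype=bool)     # skip the i == j entries
    return C[off_diagonal].reshape(-1)        # 2 * D * (D - 1) bits

small = comparison_string([0.5, 0.1, 0.1, 0.3], alpha=0.05)
rng = np.random.default_rng(0)
sift_like = rng.random(128)
sift_like /= np.linalg.norm(sift_like)        # L2-normalize, as on slide 17
S = comparison_string(sift_like, alpha=0.01)
```

For D = 128 the string has 2 · 128 · 127 = 32,512 bits, which is why a further step is needed to reduce S to an L-bit code.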

  20. FLEXIBLE SIFT BINARIZATION

  21. CROSS-INDEXING STRATEGY Code word: the first 32 bits of the binary code serve as the code word. Visual words are generated by clustering randomly selected SIFT descriptors. Each feature is assigned to a visual word by a nearest-neighbor or approximate nearest-neighbor search.
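The two keys described on this slide can be sketched as below. This is a rough illustration, assuming the binary code is stored as a 0/1 array and that cluster centroids are already available. The slides do not name the clustering algorithm, and the function names `code_word` and `assign_visual_words` are hypothetical. Only exact nearest-neighbor assignment is shown.

```python
import numpy as np

def code_word(bits):
    """Pack the first 32 bits of a binary code into one unsigned integer."""
    bits = np.asarray(bits, dtype=np.uint8)
    if bits.size < 32:
        raise ValueError("binary code must have at least 32 bits")
    packed = np.packbits(bits[:32])   # 4 bytes, most significant bit first
    return int.from_bytes(packed.tobytes(), "big")

def assign_visual_words(descriptors, centroids):
    """Exact nearest-neighbor assignment of descriptors to centroids."""
    descriptors = np.asarray(descriptors, dtype=np.float64)
    centroids = np.asarray(centroids, dtype=np.float64)
    # Squared Euclidean distance between every descriptor and centroid.
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

low = code_word([0] * 31 + [1] + [1, 0, 1])   # bits beyond 32 are ignored
high = code_word([1] + [0] * 31)
words = assign_visual_words([[0.1, 0.0], [0.9, 1.2]],
                            [[0.0, 0.0], [1.0, 1.0]])
```

Packing the code word into a single integer makes it usable directly as a key in a hash table or inverted index.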

  22. CROSS-INDEXING STRATEGY In the BoVW model, an image is represented by a visual word histogram with a tf-idf weighting strategy. The similarity between two images is measured by the L1 or L2 distance between their visual word vectors. In the binary-code-based retrieval system, the features' binary codes are used to find the true matches, and the number of matches measures the similarity between two images, denoted by $\mathrm{Score}_i$. In the formulation of this strategy, $i$ represents the $i$-th database image; $B(d)$ and $B(q)$ denote the binary SIFT codes of the database feature $d$ and the query feature $q$, respectively; $T$ is a pre-defined threshold value, whose impact is studied in the experimental part; and $H(\cdot, \cdot)$ denotes the Hamming distance between two binary SIFT codes. If two images have the same score value, the image with fewer features is favored.
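The match-counting score and the tie-break rule can be sketched as follows. The transcript does not preserve the exact formula, so this sketch assumes that a pair (q, d) counts as a match when H(B(d), B(q)) ≤ T and that every query/database feature pair is compared by brute force. The names `hamming`, `match_score`, and `rank_images` are hypothetical.

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two equal-length 0/1 sequences."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))

def match_score(query_codes, db_codes, T):
    """Count (q, d) pairs whose Hamming distance is at most T (assumed rule)."""
    return sum(1 for q in query_codes for d in db_codes if hamming(d, q) <= T)

def rank_images(query_codes, database, T):
    """Rank images by score; ties go to the image with fewer features."""
    scored = [(image_id, match_score(query_codes, codes, T), len(codes))
              for image_id, codes in database.items()]
    scored.sort(key=lambda item: (-item[1], item[2]))
    return [(image_id, score) for image_id, score, _ in scored]

query = [[0, 0, 0, 0], [1, 1, 1, 1]]
database = {
    "a": [[0, 0, 0, 1], [1, 1, 1, 1], [0, 1, 0, 1]],
    "b": [[0, 0, 0, 0], [1, 1, 1, 0]],
    "c": [[1, 0, 1, 0]],
}
ranking = rank_images(query, database, T=1)
```

Images "a" and "b" both score 2 here, and "b" ranks first because it has fewer features. A practical system would use an index to limit the candidate pairs instead of comparing all of them.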

  23. CROSS-INDEXING STRATEGY

  24. CROSS-INDEXING STRATEGY

  25. Result

  26. Result

  27. Thank you
