
Aggregating local image descriptors into compact codes

Presentation Transcript


  1. Aggregating local image descriptors into compact codes Presented by: Jiří Pytela, Ayan Basu Nath Authors: Hervé Jégou, Florent Perronnin, Matthijs Douze, Jorge Sánchez, Patrick Pérez, Cordelia Schmid

  2. Outline • Introduction • Method • Evaluation • From vectors to codes • Experiments • Conclusion

  3. Objective (scale: up to 100M images) • Low memory usage • High efficiency • High accuracy

  4. Existing methods • Bag-of-words (BOW) • Approximate NN search for BOW • Min-Hash • Pre-filtering • Drawbacks: low accuracy or high memory usage

  5. Vector aggregation methods • Represent local descriptors by a single vector • BOW • Fisher Vector • Vector of Locally Aggregated Descriptors (VLAD)

  6. BOW • Requires a codebook – a set of "visual words" obtained by k-means clustering • The image is represented as a histogram of visual word occurrences (see the sketch below)
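
A minimal sketch of the BOW histogram step described above, assuming a k-means codebook has already been trained; `bow_histogram` and its arguments are illustrative names, not code from the paper:

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Bag-of-words sketch: assign descriptors to visual words, count them.

    descriptors: (T, d) array of local descriptors (e.g. SIFT)
    codebook:    (k, d) array of visual words learned by k-means
    returns:     (k,) L1-normalized histogram of word assignments
    """
    # Assign each descriptor to its nearest visual word (Euclidean distance).
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    # Count occurrences per word and normalize to a distribution.
    hist = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    return hist / max(hist.sum(), 1.0)
```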

  7. Fisher Vector • Extends BOW • Encodes the "difference from an average distribution of descriptors"

  8. Fisher Kernel • X = {x_t, t = 1…T} – set of T descriptors • u_λ – probability density function with parameters λ • Score: G_λ^X = (1/T) ∇_λ log u_λ(X) • Fisher vector: 𝒢_λ^X = L_λ G_λ^X, where L_λ comes from the Cholesky decomposition of the inverse Fisher information matrix F_λ⁻¹

  9. Image representation • Gaussian Mixture Model (GMM) • Parameters λ: • mixture weight • mean vector • variance matrix • Acts as a probabilistic visual vocabulary

  10. Image representation • Descriptor assignment to the mixture components • Vector representation: power normalization followed by L2-normalization (sketched below)
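
The two normalizations can be sketched in a few lines; `normalize_fv` is an illustrative name, and `alpha=0.5` (the signed square root) is the commonly used power, an assumption rather than something fixed by the slide:

```python
import numpy as np

def normalize_fv(v, alpha=0.5):
    """Power normalization then L2 normalization of a Fisher vector."""
    v = np.sign(v) * np.abs(v) ** alpha   # power normalization, sign preserved
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v    # L2 normalization
```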

  11. FV – image specific data • real descriptor distribution

  12. FV – image specific data • Estimation of parameters λ → image-independent information is discarded

  13. FV – final image representation • proportion of descriptors assigned to a single visual word • average of the descriptors assigned to a single visual word

  14. FV – final image representation • Includes: • the number of descriptors assigned to each visual word • the approximate location of the descriptors • → Frequent descriptors have a lower value (see the simplified sketch below)
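
A simplified sketch of these two components, assuming hard assignment and unit variances (the paper uses soft GMM posteriors, so this only approximates the real Fisher vector); all names are illustrative:

```python
import numpy as np

def fisher_vector_hard(descriptors, means, weights):
    """Hard-assignment Fisher vector sketch (unit variances assumed).

    descriptors: (T, d) local descriptors
    means:       (k, d) GMM means; weights: (k,) mixture weights
    Stacks, per visual word i (cf. slide 13):
      - (n_i/T - w_i) / sqrt(w_i): proportion of descriptors assigned to i
      - sum of residuals of the assigned descriptors / (T * sqrt(w_i))
    """
    T = descriptors.shape[0]
    k, d = means.shape
    assign = np.linalg.norm(
        descriptors[:, None, :] - means[None, :, :], axis=2).argmin(axis=1)
    fv_w = np.zeros(k)        # components from the mixture-weight gradient
    fv_mu = np.zeros((k, d))  # components from the mean gradient
    for i in range(k):
        sel = descriptors[assign == i]
        fv_w[i] = (len(sel) / T - weights[i]) / np.sqrt(weights[i])
        if len(sel):
            fv_mu[i] = (sel - means[i]).sum(axis=0) / (T * np.sqrt(weights[i]))
    return np.concatenate([fv_w, fv_mu.ravel()])
```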

  15. Vector of Locally Aggregated Descriptors (VLAD) • Non-probabilistic Fisher kernel • Requires a codebook (as BOW) • Associate each descriptor to its nearest visual word • Compute and accumulate the difference vectors (sketch below)
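
A compact sketch of the VLAD construction just described; `vlad` and its arguments are illustrative names:

```python
import numpy as np

def vlad(descriptors, codebook):
    """VLAD sketch: accumulate residuals to the nearest visual word.

    descriptors: (T, d) local descriptors; codebook: (k, d) k-means centroids
    returns:     L2-normalized (k*d,) VLAD vector
    """
    k, d = codebook.shape
    nearest = np.linalg.norm(
        descriptors[:, None, :] - codebook[None, :, :], axis=2).argmin(axis=1)
    v = np.zeros((k, d))
    for i in range(k):
        # Sum of difference vectors between assigned descriptors and centroid i.
        v[i] = (descriptors[nearest == i] - codebook[i]).sum(axis=0)
    v = v.ravel()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```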

  16. Comparison – VLAD and FV • VLAD corresponds to a FV with: • equal mixture weights • isotropic covariance matrices

  17. VLAD descriptors • Dimensionality reduction → principal component analysis (PCA)

  18. PCA comparison • Dimensionality reduction can increase accuracy

  19. Evaluation • Compact representation → D' = 128 • High-dimensional descriptions suffer from dimensionality reduction • FV and VLAD use only a few visual words (K)!

  20. Evaluation • Each collection is combined with a 10M or 100M image dataset • Copydays – near-duplicate detection • Oxford – limited object variability • UKB – the best possible score is 4

  21. From vectors to codes • Given a D-dimensional input vector, produce a code of B bits encoding the image representation • The problem is handled in two steps: • a projection that reduces the dimensionality of the vector • a quantization used to index the resulting vectors

  22. Approximate nearest neighbour • Required to handle large databases in computer vision applications • One of the most popular techniques is Euclidean Locality-Sensitive Hashing • But it is memory-consuming

  23. The product quantization-based approximate search method • Offers better accuracy • The search algorithm provides an explicit approximation of the indexed vectors • This lets us compare the approximation error introduced by the dimensionality reduction with the one introduced by the quantization • We use the asymmetric distance computation (ADC) variant of this approach

  24. ADC approach • Let x ∈ R^D be a query vector • Y = {y_1, …, y_n} a set of vectors in which we want to find the nearest neighbour NN(x) of x • ADC consists in encoding each vector y_i by a quantized version c_i = q(y_i) ∈ R^D • For a quantizer q(.) with k centroids, the vector is encoded by B = log2(k) bits, k being a power of 2 • Finding the a nearest neighbours NN_a(x) of x simply consists in computing NN_a(x) = a-argmin_i ||x − q(y_i)||² (sketched below)
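
A sketch of ADC search with a product quantizer split into m sub-quantizers, which is how the asymmetric distances are computed efficiently in practice; the array layout and the names `codes` and `codebooks` are assumptions for illustration, not the authors' code:

```python
import numpy as np

def adc_search(x, codes, codebooks):
    """Asymmetric distance computation (ADC) sketch.

    x:         (D,) query vector (kept unquantized, hence 'asymmetric')
    codes:     (n, m) integer array; row i holds the m sub-quantizer
               indices encoding database vector y_i
    codebooks: (m, k, D//m) sub-quantizer centroids (k centroids per subspace)
    returns:   database indices sorted by approximate distance to x
    """
    m, k, dsub = codebooks.shape
    # Precompute squared distances from each query subvector to every
    # centroid of the corresponding sub-quantizer.
    x_sub = x.reshape(m, dsub)
    tables = ((codebooks - x_sub[:, None, :]) ** 2).sum(axis=2)   # (m, k)
    # ||x - q(y_i)||^2 is approximated by a sum of m table lookups.
    dists = tables[np.arange(m)[None, :], codes].sum(axis=1)      # (n,)
    return np.argsort(dists)
```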

  25. Indexation-aware dimensionality reduction • There exists a trade-off between the dimensionality reduction and the indexing scheme • The D' x D PCA matrix M maps descriptor x ∈ R^D to the transformed descriptor x' = Mx ∈ R^D' • This dimensionality reduction can also be interpreted in the initial space as a projection: x is approximated by x_p = MᵀMx

  26. Therefore the projection is x_p = MᵀMx (see the sketch below) • Observations: • Due to the PCA, the variance of the different components of x' is not balanced • There is a trade-off on the number of dimensions D' to be retained by the PCA: if D' is large, the projection error vector ε_p(x) is of limited magnitude, but a large quantization error ε_q(x_p) is introduced
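
A small sketch of the mapping x' = Mx and its interpretation as the projection x_p = MᵀMx, assuming centered descriptors and a PCA matrix taken from an SVD; the function and argument names are illustrative:

```python
import numpy as np

def pca_project(X_train, x, d_prime):
    """PCA reduction x' = Mx and its back-projection x_p = M^T M x.

    X_train: (n, D) training descriptors; x: (D,) descriptor, assumed
    centered like the training data (matching x' = Mx on the slide).
    """
    _, _, Vt = np.linalg.svd(X_train - X_train.mean(axis=0),
                             full_matrices=False)
    M = Vt[:d_prime]               # (d_prime, D): orthonormal principal axes
    x_reduced = M @ x              # x' = Mx in R^{d_prime}
    x_projected = M.T @ x_reduced  # x_p = M^T M x, back in the original R^D
    return x_reduced, x_projected
```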

  27. Joint optimization of reduction/indexing • The squared Euclidean distance between the reproduction value q(x_p) and x is the sum of the projection and quantization errors: ||x − q(x_p)||² = ||ε_p(x)||² + ||ε_q(x_p)||² • The mean square error e(D') is empirically measured on a learning vector set L as: e(D') = (1/|L|) Σ_{x∈L} ( ||ε_p(x)||² + ||ε_q(x_p)||² ) (a measurement sketch follows)
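
A sketch of how e(D') could be measured empirically on a learning set, with `quantize` as a placeholder for the indexing quantizer q(.); this illustrates the formula above and is not the authors' code:

```python
import numpy as np

def empirical_mse(L, d_prime, quantize):
    """e(D') = mean of ||eps_p(x)||^2 + ||eps_q(x_p)||^2 over the set L.

    L:        (n, D) learning vectors, assumed centered
    quantize: placeholder for the quantizer q(.), returning q(x_p)
    """
    _, _, Vt = np.linalg.svd(L, full_matrices=False)
    M = Vt[:d_prime]                        # (d_prime, D) PCA matrix
    Xp = L @ M.T @ M                        # projections x_p = M^T M x
    eps_p = ((L - Xp) ** 2).sum(axis=1)     # projection error per vector
    Q = np.array([quantize(xp) for xp in Xp])
    eps_q = ((Xp - Q) ** 2).sum(axis=1)     # quantization error per vector
    return (eps_p + eps_q).mean()

# Choosing D' by scanning candidate values (illustrative usage):
# best = min(range(16, 129, 16), key=lambda dp: empirical_mse(L, dp, quantize))
```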

  28. EXPERIMENTS • Evaluating the performance of the Fisher vector when used with the joint dimensionality reduction/indexing approach • Large scale experiments on Holidays+Flickr10M

  29. Dimensionality reduction and indexation

  30. Comparison with the state of the art

  31. The proposed approach is significantly more precise at all operating points • Compared to BOW, which gives mAP=54% for a 200k vocabulary, a competitive accuracy of mAP=55.2% is obtained with only 32 bytes.

  32. Large-scale experiments • Experiments on Holidays and Flickr10M

  33. Experiments on Copydays and Exalead100M

  34. CONCLUSION • Many state-of-the-art large-scale image search systems follow the same paradigm • The BOW histogram has become a standard for the aggregation part • The first proposal is to use the Fisher kernel framework for the local feature aggregation • The second is to employ an asymmetric product quantization scheme for the vector compression part, and to jointly optimize the dimensionality reduction and compression

  35. THANK YOU
