
Performance Tuning on Multicore Systems for Feature Matching within Image Collections

Performance Tuning on Multicore Systems for Feature Matching within Image Collections. Xiaoxin Tang*, Steven Mills, David Eyers, Zhiyi Huang, Kai-Cheung Leung and Minyi Guo* Department of Computer Science University of Otago, New Zealand * Department of Computer Science


Presentation Transcript


  1. Performance Tuning on Multicore Systems for Feature Matching within Image Collections Xiaoxin Tang*, Steven Mills, David Eyers, Zhiyi Huang, Kai-Cheung Leung and Minyi Guo* Department of Computer Science University of Otago, New Zealand * Department of Computer Science Shanghai Jiao Tong University, China

  2. Contents • Motivation • Our work • Evaluation • Conclusion

  3. Contents • Motivation • Our work • Evaluation • Conclusion

  4. Similarity Search • Definition: • To preprocess a database of N objects so that, given a query object, one can effectively determine its nearest neighbors in the database. • Applications: • pattern recognition, chemical similarity analysis, statistical classification, etc.

  5. The problem – KNN Search • K Nearest Neighbor Search: • Feature: an array of D elements • $f = [e_1, e_2, \dots, e_D]$ • Feature Space: a set of features • $F_s = \{f_1, f_2, \dots, f_N\}$ • Feature Similarity: Euclidean distance • $d(f_i, f_j) = \sqrt{\sum_{m=1}^{D} (f_{im} - f_{jm})^2}$ • Search: given a query feature $f_q$, find $k$ features in $F_s$ so that they have the shortest distances to $f_q$.
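
The search defined on this slide can be sketched as a linear scan. This is a minimal illustration, not the authors' implementation: the names `Feature`, `euclidean`, and `knn_brute_force` are hypothetical, and features are assumed to be stored as `std::vector<float>` of equal length D.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Feature = std::vector<float>;  // one feature: an array of D elements

// Euclidean distance between two features of the same dimension D.
double euclidean(const Feature& a, const Feature& b) {
    double sum = 0.0;
    for (std::size_t m = 0; m < a.size(); ++m) {
        const double diff = static_cast<double>(a[m]) - static_cast<double>(b[m]);
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Returns (distance, index) pairs for the k features in `space` that are
// closest to `query`, ordered from nearest to farthest.
std::vector<std::pair<double, std::size_t>> knn_brute_force(
        const std::vector<Feature>& space, const Feature& query, std::size_t k) {
    std::vector<std::pair<double, std::size_t>> scored;
    scored.reserve(space.size());
    for (std::size_t i = 0; i < space.size(); ++i) {
        scored.emplace_back(euclidean(space[i], query), i);
    }
    k = std::min(k, scored.size());
    // Only the first k entries need to be in sorted order.
    std::partial_sort(scored.begin(), scored.begin() + k, scored.end());
    scored.resize(k);
    return scored;
}
```

Every query touches every feature once, which is why the cost of this exact search grows with both the number of features and their dimension.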

  6. Our Case Study • Feature Matching: a fundamental problem in many computer vision tasks • Use the SIFT algorithm to generate features for each image; • Use a k-Nearest Neighbors (k-NN) algorithm to find similar features between images

  7. Challenges • Very time-consuming: • datasets are becoming larger: • hundreds or thousands of images; • image resolution is increasing: • 2300×1500 pixels, or higher; • New platforms: • HPC is moving into the multi-/many-core age: • AMD 16-core and 64-core machines.

  8. Motivation • Performance evaluation: • Find out common problems that may limit the performance of feature matching on multi-/many-core platforms. • Performance tuning: • Find general methods to solve the identified problems.

  9. Contents • Motivation • Our work • Evaluation • Conclusion

  10. Data Distribution

  11. Data Size

  12. Problems • Unbalanced workload: • Levels of parallelism; • Scheduling policy. • Poor last-level cache utilization: • Memory architecture.

  13. Levels of parallelism [Figure: parallelism levels Level_1, Level_2, Level_3, Level_4 and the combined Level_1&2, drawn over query images, reference images, and features; search algorithms shown: Linear, KD-tree, Kmeans, LSH, others]

  14. Scheduling policy • OpenMP scheduling policy: • Static: the scheduler will assign an equal number of tasks to each thread (not used); • Dynamic: when one thread finishes its current task, it will take new tasks from the global task queue; • Guided: chunk size is adjusted dynamically when tasks are requested from the task queue.
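
The policies listed on this slide map onto the `schedule` clause of an OpenMP loop. The sketch below is illustrative only: `match_image` is a hypothetical stand-in for per-image matching work, with a cost that grows with the feature count so that loop iterations are uneven. If the code is compiled without OpenMP support, the pragma is ignored and the loop runs serially with the same results.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for matching one query image. The cost grows with
// the number of features, so iterations take unequal time.
long match_image(int num_features) {
    long acc = 0;
    for (int i = 0; i < num_features; ++i) {
        acc += i % 7;
    }
    return acc;
}

std::vector<long> match_all(const std::vector<int>& features_per_image) {
    std::vector<long> results(features_per_image.size());
    const long n = static_cast<long>(results.size());
    // schedule(dynamic): a thread that finishes its iteration takes the next
    // one from the shared queue. Swap in schedule(static) or schedule(guided)
    // to compare the other policies.
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i) {
        results[i] = match_image(features_per_image[i]);
    }
    return results;
}
```

Each iteration writes only to its own slot in `results`, so no synchronization is needed beyond what the loop construct already provides.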

  15. Memory architecture • More cores are sharing the memory and last-level cache: • Memory bandwidth: • AMD 16-core 12.8 GB/s • AMD 64-core 25.6 GB/s • Last-level cache: • AMD 16-core 6 MB • AMD 64-core 16 MB • Large images may not fit in cache and will cause many memory accesses, which leads to hitting the memory wall.
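
A short worked calculation makes the pressure on shared resources concrete. It assumes the figures on this slide are totals shared by all cores, and that features are 128-dimensional SIFT descriptors stored as 32-bit floats; the struct and function names are hypothetical.

```cpp
#include <cstddef>

// Totals quoted on the slide; assumed to be shared by all cores.
struct Machine {
    int cores;
    double bandwidth_gb_s;  // memory bandwidth in GB/s
    double llc_mb;          // last-level cache in MB
};

constexpr double bandwidth_per_core(const Machine& m) {
    return m.bandwidth_gb_s / m.cores;
}

constexpr double llc_per_core(const Machine& m) {
    return m.llc_mb / m.cores;
}

// Assumption: one SIFT descriptor = 128 floats = 512 bytes.
constexpr std::size_t kBytesPerFeature = 128 * sizeof(float);

// Number of whole descriptors that fit in a cache of the given size.
constexpr std::size_t features_fitting(double cache_mb) {
    return static_cast<std::size_t>(cache_mb * 1024 * 1024) / kBytesPerFeature;
}
```

Under these assumptions the per-core share of bandwidth drops from 0.8 GB/s on the 16-core machine to 0.4 GB/s on the 64-core machine, and the per-core share of last-level cache drops from 0.375 MB to 0.25 MB. The whole 6 MB cache holds 12288 descriptors and the 16 MB cache holds 32768.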

  16. Divide-and-Merge • We propose Divide-and-Merge: • Whole feature space is split into several smaller sub-spaces; • Search each sub-space independently; • Merge their results.
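
The three steps on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the slides do not say how sub-spaces are formed, so the sketch uses contiguous, roughly equal chunks, searches each with an exact linear scan, and compares squared distances. All names (`Hit`, `search_subspace`, `divide_and_merge`) are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Feature = std::vector<float>;
using Hit = std::pair<double, std::size_t>;  // (squared distance, index in whole space)

double squared_distance(const Feature& a, const Feature& b) {
    double sum = 0.0;
    for (std::size_t m = 0; m < a.size(); ++m) {
        const double diff = static_cast<double>(a[m]) - static_cast<double>(b[m]);
        sum += diff * diff;
    }
    return sum;
}

// Keep only the k smallest hits, in ascending order.
void keep_best(std::vector<Hit>& hits, std::size_t k) {
    const std::size_t keep = std::min(k, hits.size());
    std::partial_sort(hits.begin(), hits.begin() + keep, hits.end());
    hits.resize(keep);
}

// Search one sub-space, the features with indices in [begin, end).
std::vector<Hit> search_subspace(const std::vector<Feature>& space,
                                 std::size_t begin, std::size_t end,
                                 const Feature& query, std::size_t k) {
    std::vector<Hit> hits;
    for (std::size_t i = begin; i < end; ++i) {
        hits.emplace_back(squared_distance(space[i], query), i);
    }
    keep_best(hits, k);
    return hits;
}

std::vector<Hit> divide_and_merge(const std::vector<Feature>& space,
                                  const Feature& query,
                                  std::size_t k, std::size_t num_parts) {
    if (num_parts == 0) num_parts = 1;
    // Divide: contiguous chunks (an assumption of this sketch).
    const std::size_t chunk = (space.size() + num_parts - 1) / num_parts;
    std::vector<Hit> merged;
    for (std::size_t begin = 0; begin < space.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, space.size());
        // Search: each sub-space is handled independently of the others.
        std::vector<Hit> part = search_subspace(space, begin, end, query, k);
        merged.insert(merged.end(), part.begin(), part.end());
    }
    // Merge: the global k best are among the per-sub-space k best.
    keep_best(merged, k);
    return merged;
}
```

With an exact search inside each sub-space, the merged result matches a search over the whole space, while each sub-space's working set is smaller and so more likely to stay in cache.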

  17. Divide-and-Merge

  18. Time complexity • Accurate algorithms: • Brute force: • Apply DM: • Approximate algorithms: • Randomized KD-Tree: • Apply DM:

  19. Contents • Motivation • Our work • Evaluation • Conclusion

  20. Hardware and Software configuration • Environment: • OpenCV + OpenMP: one of the setups most frequently used by computer vision researchers to exploit parallel platforms

  21. Levels of parallelism

  22. Scheduling policy (on level_1&2)

  23. Scheduling policy (on level_3)

  24. Memory architecture 1. Original Execution 2. Apply Divide-and-Merge

  25. Evaluation on Manawatu Dataset

  26. Evaluation on Manawatu Dataset

  27. Contents • Motivation • Our work • Evaluation • Conclusion

  28. Conclusion • We have shown that performance tuning is demanding on modern multicore systems. • We have comprehensively evaluated the impact of three factors (levels of parallelism, scheduling policy, and memory architecture) on large-scale image feature matching. • We have proposed a Divide-and-Merge algorithm that can greatly improve the speedup and scalability of feature matching algorithms on multicore machines.
