This paper presents an algorithm for object recognition using shape contexts and locality-sensitive hashing. The algorithm efficiently calculates features from query and reference images, compares them, and returns a decision about which object matches. It reduces computation time without reducing the success rate.
Object Recognition Using Locality-Sensitive Hashing of Shape Contexts • Andrea Frome, Jitendra Malik • Presented by Ilias Apostolopoulos
Introduction • Object Recognition • Calculate features from query image • Compare with features from reference images • Return decision about which object matches • Full object recognition • Algorithm returns: • Identity • Location • Position
Introduction • Problem • Full object recognition is expensive • High dimensionality • Time linear with reference features • Reference features linear with example objects • Relaxed Object Recognition • Pruning step before actual recognition • Reduces computation time • Does not reduce success rate
Features • Shape Context • Characterize shape in 2D or 3D • Histograms of edge pixels
Features • Two-dimensional Shape Contexts • Detect edges • Select coordinate in edge map • Radar like division of region around point • Count edge points in each region
Features • Two-dimensional Shape Contexts • Each region is 1 bin • Regions further from the center summarize a larger area • Information near the center is captured more accurately and weighted more heavily
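The radar-like binning described above can be sketched as a log-polar histogram. This is a minimal illustration, not the presenters' implementation: the function name, the bin counts (5 rings, 12 wedges), and the radii are assumptions chosen for the example.

```python
import math

def shape_context(center, edge_points, n_radial=5, n_angular=12,
                  r_min=1.0, r_max=100.0):
    """Log-polar histogram of edge points around `center`.

    Bin counts and radii are illustrative assumptions, not values
    taken from the slides.
    """
    hist = [[0] * n_angular for _ in range(n_radial)]
    log_span = math.log(r_max / r_min)
    cx, cy = center
    for x, y in edge_points:
        dx, dy = x - cx, y - cy
        r = math.hypot(dx, dy)
        if r < r_min or r >= r_max:
            continue  # skip the center itself and points outside the support
        # Logarithmic ring spacing: outer rings cover a larger area.
        ring = min(int(n_radial * math.log(r / r_min) / log_span), n_radial - 1)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        wedge = int(theta / (2 * math.pi) * n_angular) % n_angular
        hist[ring][wedge] += 1  # each (ring, wedge) region is one bin
    return hist
```

Because the rings are spaced logarithmically, a small shift of an edge point near the center can change its bin, while the same shift far from the center usually does not. That is one way to read "weighted more heavily toward the center".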
Features • Two-dimensional Shape Contexts • Shapes must be similar to match • Similar orientation relative to the object • Orientation variant • Similar scale of the object • Scale variant
Features • Three-dimensional Shape Contexts • Similar idea in 3D • Use 3D range scans • No need to consider scale • Real world dimensions • Rotate sphere around azimuth in 12 discrete angles • No need to take rotation of query image into consideration
Experiments with 3D shape contexts • 56 3D car models • 376 features/model • Range scans of models • 5cm & 10cm noise • 300 features/scan • Use NN to match features
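A brute-force nearest-neighbor match makes the cost of this setup concrete. The sketch below is an assumption-laden illustration: Euclidean distance, the `(model_id, feature_vector)` layout, and the per-model vote tally are choices made for the example (the deck only names a voting method later, without detail).

```python
import math
from collections import Counter

def nearest_model(query_feature, reference):
    """Return the model owning the reference feature closest to the query.

    `reference` is a list of (model_id, feature_vector) pairs.
    """
    best_model, best_dist = None, math.inf
    for model_id, feature in reference:  # one distance per reference feature
        d = math.dist(query_feature, feature)
        if d < best_dist:
            best_model, best_dist = model_id, d
    return best_model

def rank_models(query_features, reference):
    """Assumed aggregation: each query feature votes for its nearest model."""
    votes = Counter(nearest_model(q, reference) for q in query_features)
    return [model for model, _ in votes.most_common()]
```

Taking the slide's figures at face value, 56 models with 376 features each give 21,056 reference features, so 300 query features require roughly 6.3 million distance computations per scan when every pair is compared.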
Experiments with 3D shape contexts • Experiment 1 • 5cm noise • 100% success rate on top choice
Experiments with 3D shape contexts • Experiment 1 • 10cm noise • 92.86% success rate on top choice • 100% success rate on top 4 choices
Experiments with 3D shape contexts • Experiment 1 • Pros • Great results even with a lot of noise • Correct match always within the top 4 choices • Cons • Each query image takes 3.3 hours to match • Computational cost should be reduced: • By reducing the features calculated per query image • By reducing the features calculated per reference image
Reducing Running Time with Representative Descriptors • If we densely sample from reference scans we can sparsely sample from query scans • Features are fuzzy, robust to small changes • Few regional descriptors are needed to describe scene • These features can be very discriminative • Representative Descriptor method • Use a reduced number of query points as centers • Choose which points to use as RD • Calculate a score between an RD and a reference object • Aggregate RD scores to get one score
Experiments • Experiment 2 • Same set of reference data • Variable number of random RDs • RD score: smallest distance between the RD and a model's features • Scene score: sum of the RD scores for a model • Model with the smallest sum is the best match
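The scoring rule on this slide can be sketched in a few lines. The function names and the dictionary layout are hypothetical, and Euclidean distance is an assumption; the slide does not state which distance is used.

```python
import math

def scene_scores(rds, models):
    """Score every model against a set of representative descriptors (RDs).

    `rds` is a list of feature vectors; `models` maps a model id to its
    list of feature vectors. Euclidean distance is an assumption.
    """
    scores = {}
    for model_id, features in models.items():
        # RD score: distance from the RD to its closest feature in the model.
        # Scene score: sum of the RD scores for that model.
        scores[model_id] = sum(
            min(math.dist(rd, f) for f in features) for rd in rds
        )
    return scores

def best_match(rds, models):
    """The model with the smallest summed score is the best match."""
    scores = scene_scores(rds, models)
    return min(scores, key=scores.get)
```

Sorting the models by score instead of taking only the minimum gives the ranked shortlist (top 2, top 3, top 7) that the following result slides report.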
Experiments • Experiment 2 • 5cm noise • 30 RDs • 100% recognition on top 7 matches • 40 RDs • 99.9% on top 2 • 100% on top 3 • Reduced computation by 87% to 90%
Experiments • Experiment 2 • 10cm noise • 80 RDs • 97.8% recognition on top 7 matches • 160 RDs • 98% on top 7 • Reduced computation by 47% to 73%
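The reported savings are consistent with comparing the number of RDs against the 300 features per scan from the experimental setup. That baseline is an inference from the numbers, not something the slides state, so treat the check below as a plausibility sketch.

```python
FEATURES_PER_SCAN = 300  # from the experimental setup slide

def reduction(num_rds, full=FEATURES_PER_SCAN):
    """Fraction of query-side feature computations avoided by using RDs."""
    return 1 - num_rds / full

for n in (30, 40, 80, 160):
    print(n, f"{reduction(n):.1%}")
# 30 90.0%
# 40 86.7%
# 80 73.3%
# 160 46.7%
```

Rounded, these match the 87% to 90% range quoted for 5cm noise and the 47% to 73% range quoted for 10cm noise.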
Reducing Search Space with a Locality-Sensitive Hash • Compare query features only to the reference features that are nearby • Use an approximate k-NN search called Locality-Sensitive Hashing (LSH) • Sum the range of the data over all dimensions • Choose k values from that total range • Each value defines a cut in one of the dimensions • A hyperplane parallel to that dimension’s axis • These planes divide the feature space into hypercubes • Each hypercube is represented by an array of integers called the first-level hash or locality-sensitive hash
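One way to read the cut construction above is sketched here. The function names are hypothetical, and encoding the hypercube as one 0/1 integer per cut is an assumption; the slide only says the cell is represented by an array of integers.

```python
import random

def choose_cuts(points, k, rng):
    """Pick k axis-aligned cuts; wider dimensions receive more cuts on average."""
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    ranges = [hi - lo for lo, hi in zip(lows, highs)]
    total = sum(ranges)  # summed range over all dimensions
    cuts = []
    for _ in range(k):
        v = rng.uniform(0, total)  # a value drawn from the summed range
        d = 0
        while d < dims - 1 and v >= ranges[d]:
            v -= ranges[d]  # walk to the dimension this value falls in
            d += 1
        cuts.append((d, lows[d] + v))  # cut dimension d at this coordinate
    return cuts

def first_level_hash(point, cuts):
    """One integer per cut: 1 if the point is on the upper side, else 0."""
    return tuple(int(point[d] >= value) for d, value in cuts)
```

Two features get the same first-level hash exactly when no cut separates them, so nearby features are likely, though not guaranteed, to collide.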
Reducing Search Space with a Locality-Sensitive Hash • Exponential number of possible hashes • Use a second-level hash function to translate the arrays to single integers • This integer is the bucket number in the table • Create L tables to reduce the probability of missing close neighbors • Each table has its own hash function • Each bucket stores a set of reference feature identifiers • For each reference feature j, we calculate its hash in every table and store j in the corresponding bucket
Reducing Search Space with a Locality-Sensitive Hash • Given a query feature q, we find matches in two stages • Retrieve the identifiers stored in q's bucket in each of the L tables • Fetch those features and calculate their distances to q
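The table construction and the two-stage lookup can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the class name `LSHIndex`, the bucket count, the use of Python's built-in `hash` as the second-level hash, and Euclidean distance are all hypothetical choices. As a simplification, the cut dimension is picked uniformly at random rather than from the summed data range.

```python
import math
import random
from collections import defaultdict

class LSHIndex:
    """L hash tables over random axis-aligned cuts (illustrative sketch)."""

    def __init__(self, features, k, num_tables, num_buckets=1024, seed=0):
        rng = random.Random(seed)
        self.features = features
        self.num_buckets = num_buckets
        dims = len(features[0])
        lows = [min(f[d] for f in features) for d in range(dims)]
        highs = [max(f[d] for f in features) for d in range(dims)]
        # Simplification: each cut picks its dimension uniformly at random.
        self.cuts = []
        for _ in range(num_tables):
            table_cuts = []
            for _ in range(k):
                d = rng.randrange(dims)
                table_cuts.append((d, rng.uniform(lows[d], highs[d])))
            self.cuts.append(table_cuts)
        # Store identifier j of every reference feature in each table.
        self.tables = [defaultdict(set) for _ in range(num_tables)]
        for j, f in enumerate(features):
            for i in range(num_tables):
                self.tables[i][self._bucket(i, f)].add(j)

    def _bucket(self, i, point):
        first_level = tuple(int(point[d] >= v) for d, v in self.cuts[i])
        return hash(first_level) % self.num_buckets  # second-level hash

    def query(self, q):
        # Stage 1: union of the identifiers in q's bucket of every table.
        candidates = set()
        for i, table in enumerate(self.tables):
            candidates |= table.get(self._bucket(i, q), set())
        # Stage 2: exact distances, only for the retrieved candidates.
        return sorted((math.dist(q, self.features[j]), j) for j in candidates)
```

More cuts per table (larger `k`) make buckets smaller and lookups cheaper but raise the chance of separating true neighbors; more tables (larger `num_tables`) counteract that at the cost of memory and extra candidates. This is the trade-off behind the "divisions" and "tables" settings in the result slides.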
LSH with Voting Method • 5cm • 100% with 600 divisions for 20 tables • 100% with 800 divisions for 100 tables • 10cm • 100% with 300 or 400 divisions
Using RDs with LSH • 5cm • 80% with 400 divisions and 20 tables • 94% with 400 divisions and 100 tables • 10cm • 83% with 400 divisions and 100 tables
Using x percent of RDs with LSH • 5cm • 99.8% with 40 RDs, 400 divisions and 20 tables • 10cm • 96% with 160 RDs, 400 divisions and 100 tables
Associative LSH • Variation of LSH • Get results from LSH • Check neighborhood of q for better results • Increases the success rate • Number of comparisons increases
Summary • 5cm noise • LSH with RDs needs the smallest number of comparisons for a top-3 list
Summary • 10cm noise • LSH with voting needs the smallest number of comparisons for a top-5 list