Sparse representation for coarse and fine object recognition
Thang V. Pham & Arnold W. M. Smeulders
ISIS research group, University of Amsterdam, the Netherlands
AIO-SOOS
Content • Coarse and fine recognition with PCA • A new representation • Gaussian derivative bases • Experimental results • Conclusions
Coarse and fine recognition
(Figure: training images of a bear, a car, and a duck at poses from 0 to 90 degrees; at testing time the system must answer both "Name?", the coarse task, and "Pose?", the fine task.)
… with PCA
(Figure: a test image is projected onto the eigenspace and compared against the stored representations of each object: bear, car, duck.)
Our assessment of PCA-based recognition • Storage space is huge • Recognition time is long • Spatial coherence is not exploited • permuting the pixels gives an identical result • No incremental learning • a problem for large datasets • Inefficient when the object location is unknown
Our idea • Sparsity: each object uses a small number N of bases from a potentially very large dictionary. Typically N goes up to 1000, from a dictionary of up to 200^3 bases.
Coarse and fine recognition • To model orientation, an image is represented as a 3D function I(x, y, θ) • Each 3D basis is separable: b(x, y, θ) = f(x) g(y) h(θ) • Each 1D factor is a Gaussian derivative
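As a concrete illustration of the separable bases above, the sketch below builds one 3D basis over (x, y, orientation) as an outer product of three 1D Gaussian derivatives. The grid size, σ, and derivative orders are illustrative choices, not the paper's actual dictionary.

```python
import numpy as np

def gauss_deriv_1d(x, sigma, order):
    """1D Gaussian derivative (orders 0-2 shown); a sketch, not the
    authors' exact basis functions."""
    g = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    if order == 0:
        return g
    if order == 1:
        return -x / sigma**2 * g
    if order == 2:
        return (x**2 - sigma**2) / sigma**4 * g
    raise ValueError("order must be 0, 1, or 2")

# A separable 3D basis b(x, y, theta) = f(x) g(y) h(theta) is the
# outer product of three 1D Gaussian derivatives.
x = np.linspace(-3, 3, 31)
bx = gauss_deriv_1d(x, sigma=1.0, order=1)
by = gauss_deriv_1d(x, sigma=1.0, order=0)
bo = gauss_deriv_1d(x, sigma=1.0, order=2)
basis_3d = np.einsum('i,j,k->ijk', bx, by, bo)  # shape (31, 31, 31)
```

Separability is what keeps the dictionary tractable: three small 1D vectors stand in for a full 3D volume.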
… with local bases
(Figure: a new object image is reconstructed in each object's space, e.g. the duck space and the bear space, and the reconstructions are compared to classify it.)
Remember our idea • Sparsity (each object uses a small number N of bases from a potentially very large dictionary), achieved by matching pursuit: • Initialize the residual to the object images • Select the basis that best matches the current residual • Update the residual • Go to step 2 until the number of selected bases equals N
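The matching pursuit loop above can be sketched as follows. The dictionary here is a random unit-norm matrix purely for illustration, not the Gaussian-derivative dictionary of the talk.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_bases):
    """Greedy matching pursuit: at each step pick the dictionary atom
    best correlated with the residual, then subtract its contribution.
    `dictionary` holds unit-norm atoms as columns. A sketch of the
    selection loop, not the authors' implementation."""
    residual = signal.astype(float).copy()
    indices, coeffs = [], []
    for _ in range(n_bases):
        corr = dictionary.T @ residual           # correlate all atoms
        k = int(np.argmax(np.abs(corr)))         # best-matching atom
        indices.append(k)
        coeffs.append(corr[k])
        residual = residual - corr[k] * dictionary[:, k]  # update residual
    return indices, coeffs, residual

# Toy usage: a signal built from two atoms of a random dictionary.
rng = np.random.default_rng(0)
D = rng.normal(size=(64, 256))
D /= np.linalg.norm(D, axis=0)                   # unit-norm atoms
signal = 2.0 * D[:, 3] - 1.5 * D[:, 40]
idx, c, r = matching_pursuit(signal, D, n_bases=10)
```

Note that only the selected indices and coefficients need to be stored per object, which is the storage argument made against PCA below.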
So far, in contrast to PCA + Storage space is efficient • no sampled points • we store not the axes, but their indices in the dictionary + Spatial coherence is exploited + Incremental learning is possible - Some loss in recognition and localization? Is "reconstruct and compare" inefficient?
The answer is NO. By approximating the bases with piece-wise polynomials, we turn matching into polynomial evaluation.
Polynomial recognition A new image is recognized as follows: • Compute the piece-wise polynomial approximations of the bases • Compute the complete N-jet coefficients of the test image at all locations • Select the N coefficients for each object from the learned object models • Combine them into polynomials (of degree 6 at most) • Evaluate the polynomials along the orientation to find the best matching candidate
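A minimal sketch of the key step, assuming a single polynomial segment: a 1D Gaussian derivative is approximated by a degree-6 polynomial, after which the response along orientation reduces to cheap polynomial evaluation. The interval, grid, and candidate orientations are illustrative numbers only.

```python
import numpy as np

# Approximate a Gaussian first derivative by a degree-6 polynomial on
# one interval, then evaluate the fitted polynomial at candidate
# orientations (standing in for the match score of the slide above).
theta = np.linspace(-2.0, 2.0, 200)
g1 = -theta * np.exp(-theta**2 / 2)        # Gaussian 1st derivative
coeffs = np.polyfit(theta, g1, deg=6)      # one piece-wise segment

# Evaluating the polynomial (Horner's rule inside np.polyval) replaces
# correlation with the full sampled basis.
candidates = np.linspace(-2.0, 2.0, 21)
scores = np.polyval(coeffs, candidates)
best_theta = candidates[np.argmax(np.abs(scores))]
```

The Gaussian derivative peaks in magnitude at θ = ±1, so the strongest polynomial response lands on a nearby grid candidate; in the real pipeline the polynomial would combine the selected N-jet coefficients of all chosen bases.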
So far, in contrast to PCA + Storage space is efficient • no sampled points • we store not the axes, but their indices in the dictionary + Spatial coherence is exploited + Incremental learning is possible + The recognition phase is fast -/+ Computing the N-jet is slower than a PCA projection, but it is done only once • efficient for localization • efficient for many objects
Experimental results
(Figure: recognition results comparing our method with 1000 bases against PCA with a 6-D and a 100-D eigenspace.)
Conclusions • Efficient in storage space (no sampled points) • Efficient in recognition time (polynomial evaluation) • Spatial correlation exploited (within the framework of Gaussian derivatives) • Efficient object localization & multiple objects (works with N-jet coefficients) • Large datasets and incremental learning (no re-training of existing models)