Sparse representation for coarse and fine object recognition
Thang V. Pham & Arnold W. M. Smeulders
ISIS research group, University of Amsterdam, the Netherlands
AIO-SOOS
Content • Coarse and fine recognition with PCA • A new representation • Gaussian derivative bases • Experimental results • Conclusions
Coarse and fine recognition
(Figure: training images of a bear, a car, and a duck at poses from 0 to 90 degrees; at testing time the system must answer both "Name?", the coarse task, and "Pose?", the fine task.)
… with PCA
(Figure: a test image is projected onto the eigenspace and compared against the stored representations of each object: bear, car, duck.)
Our assessment of PCA-based recognition • Storage space is huge • Recognition time is long • Spatial coherence is not exploited • permuting the pixels gives an identical result • No incremental learning • a problem for large datasets • Inefficient when the object location is unknown
Our idea • Sparsity: each object uses a small number N of bases from a potentially very large dictionary. Typically N goes up to 1000, from a dictionary of up to 200^3 bases.
Coarse and fine recognition • To model orientation, an image is represented as a 3D function I(x, y, θ) • Each 3D basis is separable: b(x, y, θ) = f(x) g(y) h(θ) • Each 1D factor is a Gaussian derivative
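As a concrete illustration of the separable bases above, the sketch below builds one 3D basis over (x, y, orientation) as an outer product of three 1D Gaussian derivatives. The grid size, σ, and derivative orders are illustrative choices, not the paper's actual dictionary.

```python
import numpy as np

def gauss_deriv_1d(x, sigma, order):
    """1D Gaussian derivative (orders 0-2 shown); a sketch, not the
    authors' exact basis functions."""
    g = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    if order == 0:
        return g
    if order == 1:
        return -x / sigma**2 * g
    if order == 2:
        return (x**2 - sigma**2) / sigma**4 * g
    raise ValueError("order must be 0, 1, or 2")

# A separable 3D basis b(x, y, theta) = f(x) g(y) h(theta) is the
# outer product of three 1D Gaussian derivatives.
x = np.linspace(-3, 3, 31)
bx = gauss_deriv_1d(x, sigma=1.0, order=1)
by = gauss_deriv_1d(x, sigma=1.0, order=0)
bo = gauss_deriv_1d(x, sigma=1.0, order=2)
basis_3d = np.einsum('i,j,k->ijk', bx, by, bo)  # shape (31, 31, 31)
```

Separability is what keeps the dictionary tractable: three small 1D vectors stand in for a full 3D volume.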
… with local bases
(Figure: a new object image is reconstructed in each object's space, e.g. the duck space and the bear space, and the reconstructions are compared to classify it.)
Remember our idea • Sparsity (each object uses a small number N of bases from a potentially very large dictionary), achieved by matching pursuit: • Initialize the residual to the object images • Select the basis that best matches the current residual • Update the residual • Go to step 2 until the number of selected bases equals N
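The matching pursuit loop above can be sketched as follows. The dictionary here is a random unit-norm matrix purely for illustration, not the Gaussian-derivative dictionary of the talk.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_bases):
    """Greedy matching pursuit: at each step pick the dictionary atom
    best correlated with the residual, then subtract its contribution.
    `dictionary` holds unit-norm atoms as columns. A sketch of the
    selection loop, not the authors' implementation."""
    residual = signal.astype(float).copy()
    indices, coeffs = [], []
    for _ in range(n_bases):
        corr = dictionary.T @ residual           # correlate all atoms
        k = int(np.argmax(np.abs(corr)))         # best-matching atom
        indices.append(k)
        coeffs.append(corr[k])
        residual = residual - corr[k] * dictionary[:, k]  # update residual
    return indices, coeffs, residual

# Toy usage: a signal built from two atoms of a random dictionary.
rng = np.random.default_rng(0)
D = rng.normal(size=(64, 256))
D /= np.linalg.norm(D, axis=0)                   # unit-norm atoms
signal = 2.0 * D[:, 3] - 1.5 * D[:, 40]
idx, c, r = matching_pursuit(signal, D, n_bases=10)
```

Note that only the selected indices and coefficients need to be stored per object, which is the storage argument made against PCA below.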
So far, in contrast to PCA + Storage space is efficient • no sampled points • we store not the axes, but their indices in the dictionary + Spatial coherence is exploited + Incremental learning is possible - Some loss in recognition and localization? Is "reconstruct and compare" inefficient?
The answer is NO. By approximating the bases with piece-wise polynomials, we turn matching into polynomial evaluation.
Polynomial recognition A new image is recognized as follows: • Compute the piece-wise polynomial approximations of the bases • Compute the complete N-jet coefficients of the test image at all locations • Select the N coefficients for each object from the learned object models • Combine them into polynomials (of degree 6 at most) • Evaluate the polynomials along the orientation to find the best matching candidate
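A minimal sketch of the key step, assuming a single polynomial segment: a 1D Gaussian derivative is approximated by a degree-6 polynomial, after which the response along orientation reduces to cheap polynomial evaluation. The interval, grid, and candidate orientations are illustrative numbers only.

```python
import numpy as np

# Approximate a Gaussian first derivative by a degree-6 polynomial on
# one interval, then evaluate the fitted polynomial at candidate
# orientations (standing in for the match score of the slide above).
theta = np.linspace(-2.0, 2.0, 200)
g1 = -theta * np.exp(-theta**2 / 2)        # Gaussian 1st derivative
coeffs = np.polyfit(theta, g1, deg=6)      # one piece-wise segment

# Evaluating the polynomial (Horner's rule inside np.polyval) replaces
# correlation with the full sampled basis.
candidates = np.linspace(-2.0, 2.0, 21)
scores = np.polyval(coeffs, candidates)
best_theta = candidates[np.argmax(np.abs(scores))]
```

The Gaussian derivative peaks in magnitude at θ = ±1, so the strongest polynomial response lands on a nearby grid candidate; in the real pipeline the polynomial would combine the selected N-jet coefficients of all chosen bases.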
So far, in contrast to PCA + Storage space is efficient • no sampled points • we store not the axes, but their indices in the dictionary + Spatial coherence is exploited + Incremental learning is possible + The recognition phase is fast -/+ Computing the N-jet is slower than a PCA projection, but it is done only once • efficient for localization • efficient for many objects
Experimental results
(Figure: recognition results comparing our method with 1000 bases against PCA with a 6-D and a 100-D eigenspace.)
Conclusions • Efficient in storage space (no sampled points) • Efficient in recognition time (polynomial evaluation) • Spatial correlation exploited (within the framework of Gaussian derivatives) • Efficient object localization & multiple objects (works with N-jet coefficients) • Large datasets and incremental learning (no re-training of existing models)