
Bilinear Classifiers for Visual Recognition



Presentation Transcript


  1. Bilinear Classifiers for Visual Recognition Hamed Pirsiavash, Deva Ramanan, Charless Fowlkes Computational Vision Lab, University of California, Irvine To be presented at NIPS 2009

  2. Introduction • Linear model for visual recognition • Features are computed on a window of the input image: n_y × n_x spatial cells with n_f features per cell • Reshape (Matlab's (:) operator) stacks the feature maps into a single feature vector x ∈ R^(n_y · n_x · n_f)

  3. Introduction • Linear classifier • Learn a template (model) w with the same dimension as the feature vector • Apply it to all possible windows; classify a window as positive when w^T x > 0

  4. Introduction • Reshape the feature vector x into a matrix X ∈ R^(n_s × n_f), where n_s = n_x · n_y is the number of spatial cells and n_f the number of features per cell; the template w reshapes likewise into W ∈ R^(n_s × n_f) • Factor the template: W = W_s W_f^T, with W_s ∈ R^(n_s × d), W_f ∈ R^(n_f × d), and d ≤ min(n_s, n_f)
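This reshaping identity is easy to sanity-check. The numpy sketch below (not from the slides; sizes and names are illustrative) builds a random window and template and verifies that the vectorized score w^T x equals tr(W^T X), using column-major order to mimic Matlab's (:) operator:

```python
import numpy as np

rng = np.random.default_rng(0)
n_s, n_f = 6, 4                      # spatial cells, features per cell

X = rng.normal(size=(n_s, n_f))      # features of one window, as a matrix
W = rng.normal(size=(n_s, n_f))      # template, same shape

# Matlab's (:) stacks columns, i.e. Fortran ("F") order in numpy
x = X.reshape(-1, order="F")
w = W.reshape(-1, order="F")

# the vectorized linear score and the matrix-trace score agree
assert np.isclose(w @ x, np.trace(W.T @ X))
```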

  5. Introduction • Motivation for bilinear models • Reduced rank: fewer parameters • Better generalization: reduced over-fitting • Run-time efficiency • Transfer learning: share a subset of parameters between different but related tasks

  6. Outline • Introduction • Sliding window classifiers • Bilinear model and its motivation • Extension • Related work • Experiments • Pedestrian detection • Human action classification • Conclusion

  7. Sliding window classifiers • Extract visual features from a spatio-temporal window • e.g., histograms of oriented gradients (HOG) as in Dalal and Triggs' method • Train a linear SVM using annotated positive and negative instances: min_w (1/2) w^T w + C Σ_n max(0, 1 − y_n w^T x_n) • Detection: evaluate the model on all possible windows in the space-scale domain, labeling a window positive when w^T x > 0 • Use convolution, since the model is linear
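Since the score is linear, evaluating the template at every window amounts to a cross-correlation of the template with the feature map. A minimal numpy sketch of dense scoring (illustrative shapes and names; a real detector would also scan scales):

```python
import numpy as np

rng = np.random.default_rng(1)
H, W_img, n_f = 12, 16, 8                 # feature map size and channels
h, w = 4, 4                               # template size, in cells

feat = rng.normal(size=(H, W_img, n_f))   # e.g. one level of a HOG feature pyramid
tmpl = rng.normal(size=(h, w, n_f))       # learned linear template

# score every window: cross-correlation of the template with the feature map
scores = np.zeros((H - h + 1, W_img - w + 1))
for i in range(H - h + 1):
    for j in range(W_img - w + 1):
        scores[i, j] = np.sum(tmpl * feat[i:i+h, j:j+w, :])

# detections: windows whose score exceeds the SVM decision threshold (0 here)
detections = np.argwhere(scores > 0)
```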

  8. Sliding window classifiers • Sample template W • Sample image and its features

  9. Bilinear model (Definition) • Visual data are better modeled as matrices/tensors than as vectors • Why not use the matrix structure?

  10. Bilinear model (Definition) • Reshape the feature vector x into a matrix X ∈ R^(n_s × n_f) (features) and the template w into W ∈ R^(n_s × n_f) (model), so that the score can be written w^T x = tr(W^T X)

  11. Bilinear model (Definition) • Bilinear model: restrict the template to low rank, W = W_s W_f^T, with W_s ∈ R^(n_s × d), W_f ∈ R^(n_f × d), and d ≤ min(n_s, n_f)

  12. Bilinear model (Definition) • Bilinear model: with W = W_s W_f^T, the score becomes tr(W^T X) = tr(W_f W_s^T X)

  13. Bilinear model (Definition) • Bilinear model: the score tr(W_f W_s^T X) = tr(W_s^T X W_f) is linear in W_s for fixed W_f, and linear in W_f for fixed W_s; hence bilinear

  14. Bilinear model (Learning) • Linear SVM for a given set of training pairs {X_n, y_n}: min_W L(W) = (1/2) tr(W^T W) + C Σ_n max(0, 1 − y_n tr(W^T X_n)) • Objective function = regularizer + hinge-loss constraints, with trade-off parameter C

  15. Bilinear model (Learning) • Linear SVM for a given set of training pairs {X_n, y_n}, with the template factored as W = W_s W_f^T: min L(W) = (1/2) tr(W^T W) + C Σ_n max(0, 1 − y_n tr(W^T X_n))

  16. Bilinear model (Learning) • Bilinear formulation: min_{W_s, W_f} (1/2) tr(W_f W_s^T W_s W_f^T) + C Σ_n max(0, 1 − y_n tr(W_f W_s^T X_n)) • Biconvex, so solve by coordinate descent • Fixing one set of parameters leaves a typical SVM problem (with a change of basis) • Use an off-the-shelf SVM solver in the loop
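As a toy illustration of this coordinate descent (an assumed, simplified setup: crude subgradient steps stand in for the off-the-shelf SVM solver, and the hinge term is averaged over examples), the numpy sketch below alternately freezes W_s and W_f on synthetic data generated by a low-rank template:

```python
import numpy as np

rng = np.random.default_rng(2)
n_s, n_f, d, N, C = 8, 5, 2, 60, 10.0

# toy data: labels produced by a ground-truth rank-d template
W_true = rng.normal(size=(n_s, d)) @ rng.normal(size=(d, n_f))
Xs = rng.normal(size=(N, n_s, n_f))
ys = np.sign(np.einsum('nij,ij->n', Xs, W_true))

def objective(W_s, W_f):
    W = W_s @ W_f.T
    hinge = sum(max(0.0, 1 - y * np.sum(W * X)) for X, y in zip(Xs, ys))
    return 0.5 * np.sum(W * W) + (C / N) * hinge

def grads(W_s, W_f):
    """Subgradients of the bilinear SVM objective w.r.t. both factors."""
    W = W_s @ W_f.T
    g_s, g_f = W @ W_f, W.T @ W_s            # from 0.5 * tr(W_f W_s^T W_s W_f^T)
    for X, y in zip(Xs, ys):
        if y * np.sum(W * X) < 1:            # hinge active for this example
            g_s -= (C / N) * y * (X @ W_f)
            g_f -= (C / N) * y * (X.T @ W_s)
    return g_s, g_f

W_s = 0.1 * rng.normal(size=(n_s, d))
W_f = 0.1 * rng.normal(size=(n_f, d))
obj0 = objective(W_s, W_f)

lr = 0.02
for it in range(100):                        # coordinate descent: alternate factors
    for _ in range(30):                      # crude inner "SVM solve"
        g_s, g_f = grads(W_s, W_f)
        if it % 2 == 0:
            W_f = W_f - lr * g_f             # W_s frozen
        else:
            W_s = W_s - lr * g_s             # W_f frozen
obj1 = objective(W_s, W_f)

acc = np.mean(np.sign(np.einsum('nij,ij->n', Xs, W_s @ W_f.T)) == ys)
```

In practice each inner problem would be handed to a standard linear SVM solver after the change of basis, rather than a handful of subgradient steps.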

  17. Motivation • Regularization: W = W_s W_f^T, d ≤ min(n_s, n_f); W_f spans a reduced-dimensional subspace and W_s is a template in that subspace • Similar to PCA, but not orthogonal, and learned discriminatively and jointly with the template • Run-time efficiency: d convolutions instead of n_f
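The run-time saving comes from the identity tr(W_f W_s^T X) = tr(W_s^T (X W_f)): project the feature map through W_f once per image (d channels), then score each window against W_s only. A small numpy check of the identity (illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(3)
n_s, n_f, d = 12, 30, 3
W_s = rng.normal(size=(n_s, d))
W_f = rng.normal(size=(n_f, d))
X = rng.normal(size=(n_s, n_f))    # one window's features

full = np.sum((W_s @ W_f.T) * X)   # tr(W_f W_s^T X): O(n_s * n_f) per window
proj = X @ W_f                     # project features once: n_s x d
fast = np.sum(W_s * proj)          # each window then costs only O(n_s * d)
assert np.isclose(full, fast)
```

Since the projection is shared across all windows of an image, detection needs d convolutions with the projected map instead of n_f with the raw one.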

  18. Motivation • Transfer learning • Share the subspace W_f between different problems (W = W_s W_f^T), e.g. a human detector and a cat detector • Optimize the sum of all the objective functions • Learn a good subspace using all the data

  19. Extension • Multi-linear: factor along more dimensions, i.e. L(W_x, W_y, W_t, W_f) instead of just L(W_s, W_f) • High-order tensors • For a 1D feature (n_f = 1) and rank = 1, this yields a separable filter • Spatio-temporal templates
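The rank-1 special case is the familiar separable filter: a 2D kernel K = u v^T can be applied as two 1D convolutions. A numpy sketch verifying this (illustrative sizes; the 2D convolution is written out by hand to stay self-contained):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 10))
u, v = rng.normal(size=4), rng.normal(size=5)
K = np.outer(u, v)                 # rank-1, i.e. separable, 2D filter

def conv2_full(a, k):
    """Direct full 2D convolution."""
    ah, aw = a.shape
    kh, kw = k.shape
    out = np.zeros((ah + kh - 1, aw + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i+ah, j:j+aw] += k[i, j] * a
    return out

full = conv2_full(img, K)
# separable version: convolve rows with v, then columns with u
rows = np.array([np.convolve(r, v) for r in img])
sep = np.array([np.convolve(c, u) for c in rows.T]).T
assert np.allclose(full, sep)
```

For an h × w kernel this replaces O(h·w) work per output pixel with O(h + w), the same flavor of saving the bilinear template gets from its factorization.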

  20. Related work (Rank restriction) • Bilinear models • Often used to increase flexibility; we use them to reduce the number of parameters • Mostly used in generative models such as density estimation; we use them for classification • Soft rank restriction • Uses the trace norm tr((W^T W)^(1/2)) rather than tr(W^T W) in the SVM to regularize the rank • Convex, but not easy to solve • Penalizes the sum of the singular values instead of the number of non-zero singular values (the rank) • Wolf et al. (CVPR'07) • Used a formulation similar to ours with a hard rank restriction • Showed results only for the soft rank restriction • Used it only for one task (did not consider multi-task learning)

  21. Related work (Transfer learning) • Dates back at least to Caruana's work (1997) • We were inspired by their work on multi-task learning • Applied to back-propagation nets and k-nearest neighbors • Ando and Zhang's work (2005) • Linear model • All models share a component in a low-dimensional subspace (transfer) • Uses the same number of parameters

  22. Experiments: Pedestrian detection • Baseline: Dalal and Triggs' spatio-temporal classifier (ECCV'06) • Linear SVM on 84 features per 8 × 8 cell: • Histogram of oriented gradients (HOG) • Histogram of optical flow • Verified that the spatio-temporal detector beats the static one by modifying the learning method

  23. Experiments: Pedestrian detection • Datasets: INRIA motion and INRIA static • 3400 video frame pairs • 3500 static images • Typical values: n_s = 14 × 6, n_f = 84, d = 5 or 10 • Evaluation: average precision • Initialize with PCA in feature space • Ours is 10 times faster
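A plausible reading of the PCA initialization (assumed details, not spelled out on the slide): stack the per-cell feature descriptors from the training data and take the top-d eigenvectors of their covariance as the initial W_f:

```python
import numpy as np

rng = np.random.default_rng(4)
n_f, d = 84, 5
cells = rng.normal(size=(5000, n_f))   # stand-in for all cell descriptors in the training set

mu = cells.mean(axis=0)
cov = np.cov(cells - mu, rowvar=False)         # n_f x n_f feature covariance
evals, evecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
W_f0 = evecs[:, -d:]                           # top-d principal directions -> n_f x d
assert W_f0.shape == (n_f, d)
```

Unlike PCA, the learned W_f need not stay orthogonal: coordinate descent is free to move it away from this starting point.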

  24. Experiments: Pedestrian detection

  25. Experiments: Pedestrian detection

  26. Experiments: Pedestrian detection

  27. Experiments: Human action classification 1 • 1-vs-all action templates • Voting: a second SVM on the confidence values • Dataset: UCF Sports Action (CVPR 2008) • They obtained 69.2%; we get 64.8%, but with: • More classes: 12 rather than 9 • A smaller dataset: 150 videos rather than 200 • A harder evaluation protocol: 2-fold cross-validation vs. LOOCV • 87 training examples rather than 199 in their case

  28. UCF action Results: PCA (0.444)

  29. UCF action Results: Linear (0.518)

  30. UCF action Results: Bilinear (0.648)

  31. UCF action Results • Bilinear (0.648) • Linear (0.518) (not always feasible) • PCA on features (0.444)

  32. UCF action Results

  33. Experiments: Human action classification 2 • Transfer • We used only two training examples for each of the 12 action classes • First trained independently • Then trained jointly, sharing the subspace • Adjusted the C parameter for the best result

  34. Transfer • Independent training: for each class i, initialize W_f^i with PCA, then alternately train W_s^i and W_f^i (W^i = W_s^i (W_f^i)^T) for each task separately

  35. Transfer • Joint training: initialize with PCA, then share a single subspace W_f across tasks, solving min_{W_f, W_s^i} Σ_i L(W_f, W_s^i) by alternating between training the shared W_f and the per-task templates W_s^i

  36. Results: Transfer Average classification rate

  37. Results: Transfer (for “walking”) Iteration 1 Refined at Iteration 2

  38. Conclusion • Introduced multi-linear classifiers • Exploit the natural matrix/tensor representation of spatio-temporal data • Trained with existing efficient linear solvers • Shared subspace for different problems • A novel form of transfer learning • Better performance and about a 10× run-time speed-up compared to the linear classifier • Easy to apply to most high-dimensional features (in place of dimensionality-reduction methods like PCA) • Simple: ~20 lines of Matlab code

  39. Thanks!

  40. Bilinear model (Learning details) • Linear SVM for a given set of training pairs {X_n, y_n}: min_W (1/2) tr(W^T W) + C Σ_n max(0, 1 − y_n tr(W^T X_n)) • Bilinear formulation: min_{W_s, W_f} (1/2) tr(W_f W_s^T W_s W_f^T) + C Σ_n max(0, 1 − y_n tr(W_f W_s^T X_n)) • It is biconvex, so solve by coordinate descent

  41. Bilinear model (Learning details) • Each coordinate descent iteration: • Freeze W_s and solve the standard SVM min (1/2) tr(W̃_f^T W̃_f) + C Σ_n max(0, 1 − y_n tr(W̃_f X̃_n)), where A = (W_s^T W_s)^(1/2), W̃_f = W_f A, and X̃_n = A^(−1) W_s^T X_n • Then freeze W_f and solve the symmetric problem for W̃_s
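This change of basis can be verified numerically. The numpy sketch below (illustrative sizes) checks that the transformed pair (W̃_f, X̃_n) leaves the score unchanged and turns the bilinear regularizer into the standard tr(W̃_f^T W̃_f):

```python
import numpy as np
from numpy.linalg import inv, eigh

rng = np.random.default_rng(5)
n_s, n_f, d = 9, 6, 2
W_s = rng.normal(size=(n_s, d))
W_f = rng.normal(size=(n_f, d))
X = rng.normal(size=(n_s, n_f))

# symmetric matrix square root A = (W_s^T W_s)^{1/2} via eigendecomposition
vals, vecs = eigh(W_s.T @ W_s)
A = vecs @ np.diag(np.sqrt(vals)) @ vecs.T

Wf_t = W_f @ A                 # transformed template
X_t = inv(A) @ W_s.T @ X       # transformed example

# the score is unchanged by the change of basis ...
assert np.isclose(np.trace(W_f @ W_s.T @ X), np.trace(Wf_t @ X_t))
# ... and the bilinear regularizer becomes the standard SVM one
assert np.isclose(np.trace(W_f @ W_s.T @ W_s @ W_f.T), np.trace(Wf_t.T @ Wf_t))
```

With both identities holding, the frozen-W_s subproblem is exactly a linear SVM in W̃_f, which is why an off-the-shelf solver can be used in the loop.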
