Learning an Attribute Dictionary for Human Action Classification

Learning an Attribute Dictionary for Human Action Classification. Qiang Qiu. Qiang Qiu, Zhuolin Jiang, and Rama Chellappa, "Sparse Dictionary-based Representation and Recognition of Action Attributes", ICCV 2011. Action Feature Representation. Shape. Motion. HOG.

Presentation Transcript


  1. Learning an Attribute Dictionary for Human Action Classification. Qiang Qiu. Qiang Qiu, Zhuolin Jiang, and Rama Chellappa, "Sparse Dictionary-based Representation and Recognition of Action Attributes", ICCV 2011

  2. Action Feature Representation • Shape • Motion • HOG

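  3. Action Sparse Representation • Each action is approximated as a sparse linear combination of atoms from the action dictionary • Sparse code of action 1: (0.43, 0.63, 0, 0, -0.33, 0, -0.36, 0, 0), i.e. action 1 = 0.43 × [atom] + 0.63 × [atom] - 0.33 × [atom] - 0.36 × [atom] • Sparse code of action 2: (0, 0, 0.64, 0.53, -0.40, 0.35, 0, 0, 0), i.e. action 2 = 0.64 × [atom] + 0.53 × [atom] - 0.40 × [atom] + 0.35 × [atom]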

  4. K-SVD • Input: signals Y = [y1 y2 …], dictionary size, sparsity T • Output: dictionary D = [d1 d2 d3 …], sparse codes X = [x1 x2 …] • Model: Y = D X, where column x_i of X is the sparse code of signal y_i • Objective: arg min_{D,X} ||Y - DX||_2^2 s.t. ∀i, ||x_i||_0 ≤ T [1] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation", IEEE Trans. on Signal Processing, 2006
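
The K-SVD loop on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `omp` and `ksvd`, the random initialization from the input signals, and the fixed iteration count are all assumptions made for the example.

```python
import numpy as np

def omp(D, y, T):
    """Greedy orthogonal matching pursuit: code for y with at most T nonzeros."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(T):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:  # no new atom helps; stop early
            break
        support.append(k)
        # Least-squares fit of y on the atoms chosen so far
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        x[support] = coef
    return x

def ksvd(Y, n_atoms, T, n_iter=10, seed=0):
    """Alternate sparse coding (OMP) and atom-by-atom SVD updates."""
    rng = np.random.default_rng(seed)
    # Assumption: initialize atoms from randomly chosen, normalized signals
    idx = rng.choice(Y.shape[1], size=n_atoms, replace=False)
    D = Y[:, idx] / (np.linalg.norm(Y[:, idx], axis=0) + 1e-12)
    for _ in range(n_iter):
        X = np.column_stack([omp(D, Y[:, i], T) for i in range(Y.shape[1])])
        for k in range(n_atoms):
            users = np.nonzero(X[k])[0]  # signals that use atom k
            if users.size == 0:
                continue
            X[k, users] = 0.0
            # Residual of those signals without atom k's contribution
            E = Y[:, users] - D @ X[:, users]
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]            # best rank-1 atom
            X[k, users] = s[0] * Vt[0]   # matching coefficients
    # Final sparse codes for the learned dictionary
    X = np.column_stack([omp(D, Y[:, i], T) for i in range(Y.shape[1])])
    return D, X
```

Calling `ksvd(Y, n_atoms, T)` with `Y` holding one signal per column returns a dictionary with unit-norm columns and codes with at most `T` nonzeros per column, which is the constraint stated in the objective.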

  5. Objective: Learn a Compact and Discriminative Dictionary.

  6. Probabilistic Model for Sparse Representation • A Gaussian Process • Dictionary Class Distribution

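  7. More Views of Sparse Representation • Y = [y1 y2 y3 y4] = D X, with D = [d1 d2 d3 …] and X = [x1 x2 x3 x4]; each signal y_i carries a class label (l1 or l2) • Column view: x_i is the sparse code of signal y_i • x1 = (0.43, 0.63, 0, 0, -0.33, 0, -0.36, 0, 0, 0, 0, 0, 0) • x2 = (0, 0, 0.64, 0.53, -0.40, 0.35, 0, 0, 0, 0, 0, 0, 0) • x3 = (0, 0, 0, 0, 0, 0, 0, 0, -0.28, 0.698, 0.37, 0.25, 0) • x4 = (0, 0, 0, 0, -0.42, 0, 0, 0, 0, 0.42, 0.47, 0, 0.32) • Row view: xd_i is row i of X, the coefficients of atom d_i across all signals, e.g. xd1 = (0.43, 0, 0, 0)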

  8. A Gaussian Process • Rows of X (columns x1 … x4, for signals y1 … y4): xd1 = (0.43, 0, 0, 0), xd2 = (0.63, 0, 0, 0), xd3 = (0, 0.64, 0, 0), xd4 = (0, 0.53, 0, 0), xd5 = (-0.33, -0.40, 0, -0.42), … • Covariance function entry: K(i,j) = cov(xd_i, xd_j) • P(Xd*|XD*) is a Gaussian with a closed-form conditional variance
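
To make the closed-form conditional variance concrete, here is a small sketch using the five rows listed on this slide. It relies on the standard Gaussian conditioning identity, var(d | S) = K(d,d) - K(d,S) K(S,S)^-1 K(S,d). The helper names, the use of the sample covariance from `np.cov`, and the small `jitter` added for numerical stability are assumptions of this example rather than details given in the slides.

```python
import numpy as np

# Rows xd1..xd5 of the sparse-code matrix (columns correspond to x1..x4).
X_rows = np.array([
    [0.43, 0.0, 0.0, 0.0],
    [0.63, 0.0, 0.0, 0.0],
    [0.0, 0.64, 0.0, 0.0],
    [0.0, 0.53, 0.0, 0.0],
    [-0.33, -0.40, 0.0, -0.42],
])

def covariance_matrix(rows):
    """K(i, j) = cov(xd_i, xd_j); each row is treated as one variable."""
    return np.cov(rows)

def conditional_variance(K, i, S, jitter=1e-8):
    """Variance of atom i given the atoms in S under a joint Gaussian."""
    S = list(S)
    if not S:
        return K[i, i]
    K_SS = K[np.ix_(S, S)] + jitter * np.eye(len(S))  # jitter: assumption
    k_iS = K[i, S]
    return K[i, i] - k_iS @ np.linalg.solve(K_SS, k_iS)

def gaussian_entropy(var):
    """Differential entropy of a 1-D Gaussian with the given variance."""
    return 0.5 * np.log(2.0 * np.pi * np.e * max(var, 1e-12))

K = covariance_matrix(X_rows)
var_d5 = conditional_variance(K, 4, [0, 2])  # d5 given {d1, d3}
print("var(d5 | d1, d3) =", var_d5)
print("H(d5 | d1, d3)   =", gaussian_entropy(var_d5))
```

Because the entropy of a Gaussian is monotone in its variance, comparing conditional variances is enough to rank atoms by conditional entropy.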

  9. Dictionary Class Distribution • Rows of X (columns x1 … x4, for signals y1 … y4): xd1 = (0.43, 0, 0, 0), xd2 = (0.63, 0, 0, 0), xd3 = (0, 0.64, 0, 0), xd4 = (0, 0.53, 0, 0), xd5 = (-0.33, -0.40, 0, -0.42), … • P(L|d_i), L ∈ [1, M] • Aggregate |xd_i| based on class labels to obtain an M-sized vector, then normalize it • Example with y1, y2 in class l1 and y3, y4 in class l2: • P(L=l1|d5) = (0.33+0.40)/(0.33+0.40+0.42) = 0.6348 • P(L=l2|d5) = (0+0.42)/(0.33+0.40+0.42) = 0.3652
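
The aggregation above is easy to check numerically. The sketch below reproduces the d5 example, assuming (as the arithmetic on the slide implies) that y1 and y2 belong to class l1 and y3 and y4 to class l2. The function name `class_distribution` and the uniform fallback for an atom that no signal uses are illustrative choices, not something the slides specify.

```python
import numpy as np

def class_distribution(row, labels, n_classes):
    """P(L | d): sum |coefficients| of one atom per class, then normalize."""
    mass = np.zeros(n_classes)
    np.add.at(mass, labels, np.abs(row))  # accumulate per class label
    total = mass.sum()
    if total == 0:
        # Assumption: an unused atom gets a uniform class distribution
        return np.full(n_classes, 1.0 / n_classes)
    return mass / total

xd5 = np.array([-0.33, -0.40, 0.0, -0.42])  # coefficients of atom d5
labels = np.array([0, 0, 1, 1])             # 0 = l1, 1 = l2 (assumed order)
p = class_distribution(xd5, labels, n_classes=2)
print(np.round(p, 4))
```

The two entries of `p` correspond to P(L=l1|d5) and P(L=l2|d5).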

  10. Dictionary Learning Approaches • Maximization of Joint Entropy (ME) • Maximization of Mutual Information (MMI) • Unsupervised Learning (MMI-1) • Supervised Learning (MMI-2)

  11. Maximization of Joint Entropy (ME) - Initialize dictionary Do using K-SVD - Start with D* = ∅ • Until |D*| = k, iteratively choose d* from Do\D*: • d* = arg max_d H(d|D*), where the conditional entropy H(d|D*) is evaluated under the Gaussian process model • The result is the ME dictionary • A good approximation to the ME criterion: • arg max_D H(D)
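
The greedy ME selection can be sketched as follows, assuming the atoms are jointly Gaussian with covariance matrix K, so that maximizing H(d|D*) is equivalent to maximizing the conditional variance of d given D*. The function names and the `jitter` regularizer are hypothetical, introduced only for this illustration.

```python
import numpy as np

def cond_var(K, i, S, jitter=1e-8):
    """Gaussian conditional variance of atom i given the atoms in S."""
    S = list(S)
    if not S:
        return K[i, i]
    K_SS = K[np.ix_(S, S)] + jitter * np.eye(len(S))  # jitter: assumption
    k_iS = K[i, S]
    return K[i, i] - k_iS @ np.linalg.solve(K_SS, k_iS)

def greedy_max_entropy(K, k):
    """Pick k atom indices, each maximizing H(d | D*) given those chosen so far."""
    selected = []
    remaining = list(range(K.shape[0]))
    while len(selected) < k:
        # Gaussian entropy grows with variance, so compare variances directly
        best = max(remaining, key=lambda i: cond_var(K, i, selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy covariance: atoms 0 and 1 are nearly redundant, atom 2 is independent.
K = np.array([
    [1.1, 0.99, 0.0],
    [0.99, 1.0, 0.0],
    [0.0, 0.0, 0.5],
])
print(greedy_max_entropy(K, 2))
```

In the toy example, once atom 0 is chosen the nearly redundant atom 1 has little remaining variance, so the independent atom 2 is preferred. This is the diversity-seeking behavior of the ME criterion.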

  12. Maximization of Mutual Information for Unsupervised Learning (MMI-1) - Initialize dictionary Do using K-SVD - Start with D* = ∅ • Until |D*| = k, iteratively choose d* from Do\D* • The resulting MMI dictionary balances coverage and diversity • A near-optimal approximation to MMI: • arg max_D I(D; Do\D) • within (1-1/e) of the optimum • Closed form: • d* = arg max_d H(d|D*) - H(d|Do\(D* ∪ d))
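
A sketch of the greedy MMI-1 rule under the same Gaussian assumption: for Gaussians, H(d|D*) - H(d|Do\(D* ∪ d)) reduces to half the log ratio of the two conditional variances. The function names, the `jitter` term, and the variance floor are assumptions of this example.

```python
import numpy as np

def cond_var(K, i, S, jitter=1e-8):
    """Gaussian conditional variance of atom i given the atoms in S."""
    S = list(S)
    if not S:
        return K[i, i]
    K_SS = K[np.ix_(S, S)] + jitter * np.eye(len(S))  # jitter: assumption
    k_iS = K[i, S]
    return K[i, i] - k_iS @ np.linalg.solve(K_SS, k_iS)

def greedy_mmi(K, k, floor=1e-12):
    """Pick k atoms by d* = argmax_d H(d | D*) - H(d | Do \\ (D* u d))."""
    n = K.shape[0]
    selected = []
    while len(selected) < k:
        best, best_score = None, -np.inf
        for d in range(n):
            if d in selected:
                continue
            rest = [j for j in range(n) if j != d and j not in selected]
            v_sel = max(cond_var(K, d, selected), floor)
            v_rest = max(cond_var(K, d, rest), floor)
            # Difference of Gaussian entropies = 0.5 * log of variance ratio
            score = 0.5 * np.log(v_sel / v_rest)
            if score > best_score:
                best, best_score = d, score
        selected.append(best)
    return selected

# Toy covariance: atom 0 is strongly correlated with both other atoms.
K = np.array([
    [1.0, 0.7, 0.7],
    [0.7, 1.0, 0.2],
    [0.7, 0.2, 1.0],
])
print(greedy_mmi(K, 2))
```

All three toy atoms have the same marginal variance, so the first pick is decided entirely by the second term: atom 0 is the one best predicted by the unselected atoms, which makes it the most representative of them.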

  13. Revisit: Dictionary Class Distribution • Rows of X (columns x1 … x4, for signals y1 … y4): xd1 = (0.43, 0, 0, 0), xd2 = (0.63, 0, 0, 0), xd3 = (0, 0.64, 0, 0), xd4 = (0, 0.53, 0, 0), xd5 = (-0.33, -0.40, 0, -0.42), … • P(L|d_i), L ∈ [1, M] • Aggregate |xd_i| based on class labels to obtain an M-sized vector, then normalize it • Example with y1, y2 in class l1 and y3, y4 in class l2: • P(l1|d5) = (0.33+0.40)/(0.33+0.40+0.42) = 0.6348 • P(l2|d5) = (0+0.42)/(0.33+0.40+0.42) = 0.3652 • P(Ld) = P(L|d) • P(LD) = P(L|D) [defining formula not preserved in the transcript]

  14. Maximization of Mutual Information for Supervised Learning (MMI-2) - Initialize dictionary Do using K-SVD - Start with D* = ∅ • Until |D*| = k, iteratively choose d* from Do\D*: • d* = arg max_d [H(d|D*) - H(d|Do\(D* ∪ d))] + λ[H(Ld|LD*) - H(Ld|LDo\(D* ∪ d))] - MMI-1 is a special case of MMI-2 with λ=0.

  15. Keck gesture dataset

  16. Representation Consistency [1] • [1] J. Liu and M. Shah, "Learning Human Actions via Information Maximization", CVPR 2008

  17. Recognition Accuracy The recognition accuracy using initial dictionary Do: (a) 0.23 (b) 0.42 (c) 0.71

  18. Recognizing Realistic Actions • 150 broadcast sports videos. • 10 different actions. • Average recognition rate: 83.6% • Best reported result: 86.6%