Pascal Grand Challenge

Felix Vilensky, 19/6/2011.

Presentation Transcript


  1. Pascal Grand Challenge. Felix Vilensky, 19/6/2011

  2. Outline • Pascal VOC challenge framework. • Successful detection methods: • Object Detection with Discriminatively Trained Part Based Models (P. F. Felzenszwalb et al.), the "UoC/TTI" method. • Multiple Kernels for Object Detection (A. Vedaldi et al.), the "Oxford/MSR India" method. • A successful classification method: • Image Classification using Super-Vector Coding of Local Image Descriptors (Xi Zhou et al.), the NEC/UIUC method. • Discussion about bias in datasets. • 2010 winners overview.

  3. Pascal VOC Challenge Framework • Reference: "The PASCAL Visual Object Classes (VOC) Challenge", Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John Winn, Andrew Zisserman.

  4. Pascal VOC Challenge • Classification Task. • Detection Task. • Pixel-level segmentation. • “Person Layout” detection. • Action Classification in still images.

  5. Classification Task • Example image label: at least one bus.

  6. Detection Task • The predicted bounding box should overlap the ground-truth box by at least 50%.

  7. Detection "near misses" • Detections that did not fulfill the bounding-box (BB) overlap criterion.
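
The overlap criterion can be sketched in a few lines of Python. This is a minimal illustration, assuming that "overlap" means intersection over union (the measure used in the VOC challenge paper) and that boxes are given as continuous `(xmin, ymin, xmax, ymax)` tuples. The function names are hypothetical and are not taken from the VOC development kit.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Width and height of the intersection rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(predicted, ground_truth, threshold=0.5):
    """Assumed acceptance rule: overlap of at least `threshold` (50% on the slide)."""
    return iou(predicted, ground_truth) >= threshold
```

For example, a prediction shifted by half its width relative to the ground truth has an overlap of 1/3 and would count as a near miss.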

  8. Pascal VOC Challenge – The Object Classes

  9. Pascal VOC Challenge – The Object Classes • Images were retrieved from the Flickr website.

  10. Pixel-Level Segmentation • Figure panels: image, object segmentation, class segmentation.

  11. Person Layout

  12. Action Classification • Classification among 9 action classes. • Example images: speaking on the phone, playing the guitar.

  13. Annotation • Class. • Bounding box. • Viewpoint. • Truncation. • Difficult (for classification/detection).

  14. Annotation Example

  15. Evaluation • A way to compare different methods. • Precision/recall curves. • Interpolated precision. • AP (Average Precision).

  16. Evaluation – Precision/Recall Curves (1) • Practical tradeoff between precision and recall. • Interpolated precision.

  17. Evaluation – Precision/Recall Curves (2)

  18. Evaluation – Average Precision (AP) • AP is the single number used to rank the competing methods.
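
The evaluation pipeline can be illustrated with a short sketch. It assumes the 11-point interpolated AP described in the VOC challenge paper, where the interpolated precision at recall level r is the maximum precision observed at any recall greater than or equal to r. Which AP variant the slides show is not stated in the transcript, so treat that choice, and all function names, as assumptions.

```python
def precision_recall(scores, labels):
    """Rank items by descending score; labels are 1 (positive) or 0 (negative).

    Assumes at least one positive label.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    true_pos = 0
    recalls, precisions = [], []
    for rank, i in enumerate(order, start=1):
        true_pos += labels[i]
        recalls.append(true_pos / n_pos)
        precisions.append(true_pos / rank)
    return recalls, precisions


def interpolated_precision(recalls, precisions, r):
    """Maximum precision over all operating points with recall >= r."""
    candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
    return max(candidates) if candidates else 0.0


def average_precision_11pt(recalls, precisions):
    """Mean interpolated precision at recall levels 0, 0.1, ..., 1."""
    levels = [i / 10 for i in range(11)]
    return sum(interpolated_precision(recalls, precisions, r) for r in levels) / 11
```

A ranking that places every positive above every negative gets an AP of 1, and swapping a negative to the top of the list lowers the precision at every recall level.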

  19. Successful Detection Methods

  20. UoC/TTI Method Overview (P. Felzenszwalb et al.) • Joint winner of the 2009 Pascal VOC challenge, together with the Oxford method. • Received the "lifetime achievement" award in 2010. • Mixture of deformable part models. • Each component has a global template plus deformable parts. • HOG feature templates. • Fully trained from bounding boxes alone.

  21. UoC/TTI Method – HOG Features (1) • Gradients are computed by filtering with [-1 0 1] and its transpose. • Gradient orientation is discretized into one of p values. • Pixel-level features are aggregated into cells of size k, with soft binning. • 8-pixel cells (k=8). • 18 contrast-sensitive bins + 9 contrast-insensitive bins = 27 bins in total.
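
The pixel-level steps can be sketched as follows. This is a simplified illustration, not the authors' implementation: it follows the bin counts in the Felzenszwalb et al. paper (18 contrast-sensitive orientations over the full circle, 9 contrast-insensitive orientations over the half circle), works on a grayscale image stored as a list of rows, and uses hard binning where the slide mentions soft binning. All function names are hypothetical.

```python
import math


def gradient(image, y, x):
    """Centered [-1 0 1] differences in x and y, with clamped borders."""
    h, w = len(image), len(image[0])
    dx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
    dy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
    return dx, dy


def orientation_bins(dx, dy):
    """Return (contrast-sensitive bin in 0..17, contrast-insensitive bin in 0..8)."""
    theta = math.atan2(dy, dx) % (2 * math.pi)
    sensitive = int(round(18 * theta / (2 * math.pi))) % 18
    insensitive = int(round(9 * theta / math.pi)) % 9
    return sensitive, insensitive


def cell_histograms(image, k=8):
    """Accumulate gradient magnitude into 27-bin histograms over k x k cells."""
    rows, cols = len(image) // k, len(image[0]) // k
    hist = [[[0.0] * 27 for _ in range(cols)] for _ in range(rows)]
    for y in range(rows * k):
        for x in range(cols * k):
            dx, dy = gradient(image, y, x)
            magnitude = math.hypot(dx, dy)
            s, i = orientation_bins(dx, dy)
            hist[y // k][x // k][s] += magnitude        # bins 0..17
            hist[y // k][x // k][18 + i] += magnitude   # bins 18..26
    return hist
```

Opposite gradient directions fall into different contrast-sensitive bins but share one contrast-insensitive bin, which is why both sets are kept.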

  22. UoC/TTI Method – HOG Features(2) …27

  23. UoC/TTI Method – HOG Features (3) • Normalization. • Truncation. • 27 bins × 4 normalization factors = 4×27 matrix. • Dimensionality reduction to 31.

  24. UoC/TTI Method – Deformable Part Models • Coarse root. • High-resolution deformable parts. • Each part is specified by (anchor position, deformation cost, resolution level).
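
One way to get from the 4×27 matrix to 31 values is the projection described in the Felzenszwalb et al. paper: sum each orientation bin over the 4 normalizations (27 values) and sum each normalization over the 27 orientations (4 values). The sketch below assumes that scheme and a truncation threshold of 0.2. The four normalization factors are taken as given inputs, and the function names are hypothetical.

```python
def normalized_matrix(cell_hist, norm_factors, alpha=0.2):
    """Normalize a 27-bin cell histogram by each factor and truncate at alpha.

    Returns a matrix with one row per normalization factor (4 x 27).
    """
    return [[min(v / n, alpha) for v in cell_hist] for n in norm_factors]


def reduce_to_31(matrix):
    """Concatenate the 27 column sums and the 4 row sums of a 4 x 27 matrix."""
    n_orientations = len(matrix[0])
    column_sums = [sum(row[j] for row in matrix) for j in range(n_orientations)]
    row_sums = [sum(row) for row in matrix]
    return column_sums + row_sums
```

The result keeps per-orientation information (27 values) and adds a coarse measure of gradient energy under each normalization (4 values).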

  25. UoC/TTI Method – Mixture Models(1) • Diversity of a rich object category. • Different views of the same object. • A mixture of deformable part models for each class. • Each deformable part model in the mixture is called a component.

  26. UoC/TTI Method – Object Hypothesis • Slide taken from the method's presentation.

  27. UoC/TTI Method – Models (1) • 6-component person model.

  28. UoC/TTI Method – Models (2) • 6-component bicycle model.

  29. UoC/TTI Method – Score of a Hypothesis • Slide taken from the method's presentation.

  30. UoC/TTI Method – Matching (1) • Sliding-window approach. • High-scoring root locations define detections; for each root location, the best location of every part is found. • Matching is done for each component separately.

  31. UoC/TTI Method – Matching(2)

  32. UoC/TTI Method – Post-Processing & Context Rescoring • Slide taken from the method's presentation.

  33. UoC/TTI Method – Training & Data Mining • Weakly labeled data in the training set. • Latent SVM (LSVM) training, with z as the latent value. • Training and data mining in 4 stages: optimize z; add hard negative examples; optimize β; remove easy negative examples.

  34. UoC/TTI Method – Results(1)

  35. UoC/TTI Method – Results(2)

  36. Oxford Method Overview (A. Vedaldi et al.) • Regions with different scales and aspect ratios. • 6 feature channels. • 3-level spatial pyramid. • Cascade: 3 SVM classifiers with 3 different kernels. • Post-processing.

  37. Oxford Method – Feature Channels • Bag of visual words: SIFT descriptors are extracted and quantized in a vocabulary of 64 words. • Dense words (PhowGray, PhowColor): another set of SIFT descriptors is quantized in 300 visual words. • Histogram of oriented edges (Phog180, Phog360): similar to the HOG descriptor used by the "UoC/TTI" method, with 8 orientation bins. • Self-similarity features (SSIM).

  38. Oxford Method – Spatial Pyramids

  39. Oxford Method – Feature Vector • Chart taken from the method's presentation.

  40. Oxford Method – Discriminant Function(1)

  41. Oxford Method – Discriminant Function (2) • The kernel of the discriminant function is a linear combination of histogram kernels. • The parameters and the weights (18 in total) are learned using MKL (Multiple Kernel Learning). • The discriminant function is used to rank candidate regions R by the likelihood of containing an instance of the object of interest.
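
A linear combination of histogram kernels can be sketched as below. The transcript does not show the kernel formula, so this is only an illustration: it assumes one histogram per combination of feature channel and pyramid level (6 × 3 = 18, matching the weight count on the slide) and uses an additive chi-squared kernel as a stand-in base kernel. The weights here are fixed by hand, whereas in the method they are learned with MKL. All names are hypothetical.

```python
def chi2_kernel(h1, h2):
    """Additive chi-squared kernel between two non-negative histograms."""
    return sum(2.0 * a * b / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def combined_kernel(hists_a, hists_b, weights, base_kernel=chi2_kernel):
    """Weighted sum of per-histogram kernels.

    hists_a and hists_b are lists of histograms for two regions, one histogram
    per (feature channel, pyramid level) pair; weights has one entry per pair.
    """
    return sum(
        w * base_kernel(ha, hb)
        for w, ha, hb in zip(weights, hists_a, hists_b)
    )
```

With L1-normalized histograms, the stand-in base kernel returns 1 for identical inputs, so equal weights that sum to 1 give a combined value of 1 when a region is compared with itself.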

  42. Oxford Method – Cascade Solution (1) • An exhaustive search for the best candidate regions R requires O(MBN) operations: • N: the number of regions. • M: the number of support vectors. • B: the dimensionality of the histograms. • To reduce this complexity, a cascade solution is applied. • The first stage uses a "cheap" linear kernel to evaluate the discriminant function. • The second stage uses a more expensive and more powerful quasi-linear kernel. • The third stage uses the most powerful non-linear kernel. • Each stage evaluates the discriminant function on a smaller number of candidate regions.

  43. Oxford Method – Cascade Solution (2) • Stage 1: linear. • Stage 2: quasi-linear. • Stage 3: non-linear.

  44. Oxford Method – Cascade Solution (3) • Chart taken from the method's presentation.

  45. Oxford Method – The Kernels • All of the aforementioned kernels have the following form: • For linear kernels both f and g are linear. For quasi-linear kernels only f is linear.
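
The cascade idea can be sketched generically: each stage scores only the regions that survived the previous stage and keeps the best ones. The scoring functions and the number of regions kept per stage are placeholders supplied by the caller. The transcript does not give the real values, so everything named here is hypothetical.

```python
def run_cascade(regions, stages):
    """Filter candidate regions through successively more expensive scorers.

    stages is a list of (score_fn, keep) pairs, ordered from cheapest to most
    expensive; after each stage only the `keep` best-scoring regions survive.
    """
    survivors = list(regions)
    for score_fn, keep in stages:
        survivors = sorted(survivors, key=score_fn, reverse=True)[:keep]
    return survivors
```

Because the expensive scorers see only a short list, the total cost is dominated by the cheap first stage instead of growing with the full O(MBN) product.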

  46. Oxford Method – Post-Processing • The output of the last stage is a ranked list of 100 candidate regions per image. • Many of these regions correspond to multiple detections of the same object. • Non-maxima suppression is used. • At most 10 regions per image remain.

  47. Oxford Method – Training/Retraining (1) • Jittered/flipped instances are used as positive samples. • Training images are partitioned into two subsets. • The classifiers are tested on each subset in turn, adding new hard negative samples for retraining.
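
Non-maxima suppression can be sketched with a common greedy scheme: take regions in descending score order and drop any region that overlaps an already kept one too much. The transcript does not say which suppression rule or overlap threshold the authors use, so the greedy rule, the 0.5 threshold and the function names are assumptions. Only the limit of 10 regions comes from the slide.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_maxima_suppression(detections, overlap_threshold=0.5, max_keep=10):
    """Greedy NMS over a list of (score, box) pairs; returns the kept pairs."""
    kept = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Keep a region only if it does not overlap any stronger kept region.
        if all(iou(box, kept_box) < overlap_threshold for _, kept_box in kept):
            kept.append((score, box))
            if len(kept) == max_keep:
                break
    return kept
```

Two heavily overlapping candidates around the same object collapse to the higher-scoring one, while a distant candidate is unaffected.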

  48. Oxford Method – Results(1)

  49. Oxford Method – Results(2)

  50. Oxford Method – Results(3) Training and testing on VOC2009. Training and testing on VOC2007. Training and testing on VOC2008. Training on VOC2008 and testing on VOC2007.
