Pascal Grand Challenge Felix Vilensky 19/6/2011
Outline • Pascal VOC challenge framework. • Successful detection methods • Object Detection with Discriminatively Trained Part Based Models (P. F. Felzenszwalb et al.) - "UoC/TTI" Method. • Multiple Kernels for Object Detection (A. Vedaldi et al.) - "Oxford/MSR India" Method. • A successful classification method • Image Classification using Super-Vector Coding of Local Image Descriptors (Xi Zhou et al.) - NEC/UIUC Method. • Discussion about bias in datasets. • 2010 Winners Overview.
Pascal VOC Challenge Framework The PASCAL Visual Object Classes (VOC) Challenge Mark Everingham · Luc Van Gool · Christopher K. I. Williams · John Winn · Andrew Zisserman
Pascal VOC Challenge • Classification Task. • Detection Task. • Pixel-level segmentation. • “Person Layout” detection. • Action Classification in still images.
Classification Task • Predict whether the image contains at least one instance of the class (e.g., "at least one bus").
Detection Task • Predict a bounding box for each object instance. • The predicted bounding box must overlap the ground-truth box by at least 50%.
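For reference, the overlap measure from the VOC protocol, with $B_p$ the predicted box and $B_{gt}$ the ground-truth box; a detection counts as correct when

$$ a_o = \frac{\operatorname{area}(B_p \cap B_{gt})}{\operatorname{area}(B_p \cup B_{gt})} > 0.5 $$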
Detection "near misses" • Detections that did not fulfill the bounding-box overlap criterion.
Pascal VOC Challenge - The Object Classes • Images retrieved from the Flickr website.
Pixel-Level Segmentation • Figure: an example image with its class segmentation and object segmentation.
Action Classification • Classification among 9 action classes (e.g., "speaking on the phone", "playing the guitar").
Annotation • Class. • Bounding Box. • Viewpoint. • Truncation. • Difficult (for classification/detection).
Evaluation • A way to compare different methods. • Precision/Recall curves. • Interpolated precision. • AP (Average Precision).
Evaluation - Precision/Recall Curves (1) • Practical tradeoff between precision and recall. • Interpolated precision: at each recall level r, take the maximum precision measured at any recall r' ≥ r.
Evaluation - Average Precision (AP) • AP summarizes the precision/recall curve into a single number used to rank methods. • In the VOC protocol it is the mean of the interpolated precision at eleven equally spaced recall levels {0, 0.1, …, 1}.
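A minimal sketch of the 11-point interpolated AP computation described above; the function and variable names are illustrative and not taken from the VOC development kit, and matching detections to ground truth (the 50% overlap rule) is assumed to have been done already.

```python
import numpy as np

def eleven_point_ap(scores, labels, n_gt):
    """Sketch of VOC-style 11-point interpolated average precision.

    scores: detection confidences for one class over the whole test set.
    labels: 1 if the detection matched a ground-truth object, else 0.
    n_gt:   total number of ground-truth objects of this class (missed objects count too).
    """
    order = np.argsort(-np.asarray(scores, dtype=float))   # rank detections by confidence
    tp = np.asarray(labels, dtype=float)[order]
    fp = 1.0 - tp
    recall = np.cumsum(tp) / max(n_gt, 1)
    precision = np.cumsum(tp) / (np.cumsum(tp) + np.cumsum(fp))

    ap = 0.0
    for r in np.linspace(0.0, 1.0, 11):                    # recall levels 0, 0.1, ..., 1
        mask = recall >= r
        p_interp = precision[mask].max() if mask.any() else 0.0  # interpolated precision
        ap += p_interp / 11.0
    return ap
```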
UoC/TTI Method Overview (P. Felzenszwalb et al.) • Joint winner in the 2009 Pascal VOC challenge with the Oxford Method. • "Lifetime achievement" award in 2010. • Mixture of deformable part models. • Each component has a global template + deformable parts. • HOG feature templates. • Fully trained from bounding boxes alone.
UoC/TTI Method – HOG Features (1) • Gradients are computed with the filter [-1 0 1] and its transpose. • Gradient orientation is discretized into one of p values. • Pixel-level features are aggregated into cells of size k (8-pixel cells, k = 8). • 18 contrast-sensitive bins + 9 contrast-insensitive bins = 27 bins in total! • Soft binning.
UoC/TTI Method – HOG Features (3) • Normalization. • Truncation. • 27 bins × 4 normalization factors = a 4×27 matrix. • Dimensionality reduction to 31 features.
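A minimal sketch of the reduction from the 4×27 matrix to 31 features, following the description in the Felzenszwalb et al. paper (sums over the 4 normalizations per orientation bin, plus sums over the 27 bins per normalization, up to the constant scaling factors used in the paper). The array names are illustrative.

```python
import numpy as np

def reduce_hog_cell(cell_4x27):
    """Project a 4x27 normalized-histogram matrix for one cell down to 31 features."""
    cell = np.asarray(cell_4x27, dtype=float).reshape(4, 27)
    orientation_feats = cell.sum(axis=0)   # 27 features: sum over the 4 normalizations
    energy_feats = cell.sum(axis=1)        # 4 features: sum over the 27 orientation bins
    return np.concatenate([orientation_feats, energy_feats])  # 31-dimensional
```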
UoC/TTI Method – Deformable Part Models • Coarse root filter. • High-resolution deformable parts. • Part = (anchor position, deformation cost, resolution level).
UoC/TTI Method – Mixture Models(1) • Diversity of a rich object category. • Different views of the same object. • A mixture of deformable part models for each class. • Each deformable part model in the mixture is called a component.
UoC/TTI Method – Object Hypothesis Slide taken from the method's presentation
UoC/TTI Method –Models(1) 6 component person model
UoC/TTI Method –Models(2) 6 component bicycle model
UoC/TTI Method – Score of a Hypothesis Slide taken from method's presentation
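For reference, the hypothesis score as defined in the Felzenszwalb et al. paper:

$$ \mathrm{score}(p_0, \ldots, p_n) = \sum_{i=0}^{n} F_i' \cdot \phi(H, p_i) \;-\; \sum_{i=1}^{n} d_i \cdot \phi_d(dx_i, dy_i) \;+\; b $$

where $F_i'$ are the root and part filters, $\phi(H, p_i)$ is the HOG feature vector at placement $p_i$, $(dx_i, dy_i)$ is the displacement of part $i$ from its anchor, $\phi_d(dx, dy) = (dx, dy, dx^2, dy^2)$, $d_i$ are the deformation parameters, and $b$ is a bias term.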
UoC/TTI Method – Matching (1) • "Sliding window" approach. • Matching is done for each component separately. • High-scoring root locations define detections; each part is placed at its best location relative to the root (see the sketch below).
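A simplified brute-force sketch of placing one part around its anchor for a single root location. The actual method computes the best placements for all locations at once using generalized distance transforms; the names and the search radius here are illustrative assumptions.

```python
import numpy as np

def best_part_placement(part_response, anchor, deform, search_radius=8):
    """Find the best placement of one part around its anchor.

    part_response: 2D array of part-filter responses at the part's resolution level.
    anchor:        (ax, ay) anchor position in that response map.
    deform:        (d1, d2, d3, d4) deformation cost weights.
    Returns (best score, best position).
    """
    ax, ay = anchor
    d1, d2, d3, d4 = deform
    h, w = part_response.shape
    best_score, best_pos = -np.inf, (ax, ay)
    for y in range(max(0, ay - search_radius), min(h, ay + search_radius + 1)):
        for x in range(max(0, ax - search_radius), min(w, ax + search_radius + 1)):
            dx, dy = x - ax, y - ay
            # filter response minus the quadratic deformation cost
            score = part_response[y, x] - (d1 * dx + d2 * dy + d3 * dx * dx + d4 * dy * dy)
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_score, best_pos
```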
UoC/TTI Method – Post Processing & Context Rescoring Slide taken from method's presentation
UoC/TTI Method – Training & Data Mining • Weakly labeled data in the training set: bounding boxes only, no part annotations. • Latent SVM (LSVM) training with the part placements z treated as latent values. • Training and data mining alternate over 4 stages: optimize z; add hard negative examples; optimize β; remove easy negative examples.
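For reference, the latent SVM scoring function and training objective from the Felzenszwalb et al. paper, which the alternation above optimizes:

$$ f_\beta(x) = \max_{z \in Z(x)} \beta \cdot \Phi(x, z) $$

$$ L_D(\beta) = \tfrac{1}{2}\lVert\beta\rVert^2 + C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\, f_\beta(x_i)\bigr) $$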
Oxford Method Overview (A. Vedaldi et al.) • Regions with different scales and aspect ratios → 6 feature channels → 3-level spatial pyramid → Cascade: 3 SVM classifiers with 3 different kernels → Post-processing.
Oxford Method – Feature Channels • Bag of visual words: SIFT descriptors are extracted and quantized into a vocabulary of 64 words. • Dense words (PhowGray, PhowColor): another set of SIFT descriptors, quantized into 300 visual words. • Histogram of oriented edges (Phog180, Phog360): similar to the HOG descriptor used by the UoC/TTI Method, with 8 orientation bins. • Self-similarity features (SSIM).
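A minimal sketch of the bag-of-visual-words quantization step (assign each descriptor to its nearest vocabulary word and build a normalized histogram). The function name, the nearest-neighbour assignment, and the L1 normalization are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Quantize local descriptors against a visual vocabulary and build a histogram.

    descriptors: (n, d) array of e.g. SIFT descriptors from one image region.
    vocabulary:  (k, d) array of cluster centers (e.g. k = 64 for the BoW channel).
    """
    # Squared Euclidean distance from every descriptor to every vocabulary word
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                                  # nearest word per descriptor
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / max(hist.sum(), 1.0)                         # L1-normalized histogram
```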
Oxford Method – Feature Vector Chart taken from the method's presentation
Oxford Method – Discriminant Function (2) • The kernel of the discriminant function is a linear combination of per-channel, per-pyramid-level histogram kernels. • The kernel parameters and the combination weights (18 in total) are learned using MKL (Multiple Kernel Learning). • The discriminant function is used to rank candidate regions R by the likelihood of containing an instance of the object of interest.
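A standard way to write such a combined kernel in MKL notation (the notation is a reconstruction, not quoted from the paper): with $K_k$ the individual histogram kernels and $d_k$ the learned, non-negative weights,

$$ K(h, h') = \sum_{k=1}^{18} d_k\, K_k(h_k, h'_k), \qquad d_k \ge 0 $$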
Oxford Method – Cascade Solution (1) • An exhaustive search over the candidate regions R requires a number of operations that is O(MBN): • N – the number of regions. • M – the number of support vectors in the discriminant function. • B – the dimensionality of the histograms. • To reduce this complexity, a cascade solution is applied. • The first stage uses a "cheap" linear kernel. • The second uses a more expensive and more powerful quasi-linear kernel. • The third uses the most powerful, non-linear kernel. • Each stage evaluates the discriminant function on a smaller number of candidate regions.
Oxford Method – Cascade Solution (2) • Stage 1: linear kernel. • Stage 2: quasi-linear kernel. • Stage 3: non-linear kernel.
Oxford Method – Cascade Solution (3) Chart taken from the method's presentation
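A schematic sketch of the cascade idea described above: each stage re-scores only the candidates kept by the previous, cheaper stage. The scoring functions and the kept-candidate counts are illustrative placeholders, not the paper's values.

```python
def run_cascade(regions, stages):
    """Run a detection cascade over candidate regions.

    regions: list of candidate regions.
    stages:  list of (score_fn, n_keep) pairs, ordered from cheapest to most expensive
             classifier; score_fn maps a region to a real-valued score.
    Returns the candidates surviving the final stage, best first.
    """
    candidates = list(regions)
    for score_fn, n_keep in stages:
        scored = sorted(candidates, key=score_fn, reverse=True)
        candidates = scored[:n_keep]        # only the best regions reach the next stage
    return candidates
```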
Oxford Method – The Kernels • All the aforementioned kernels share a common form built from two functions f and g (see the note below). • For linear kernels both f and g are linear; for quasi-linear kernels only f is linear.
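One plausible reconstruction of this common form, consistent with the f/g description above (an assumption, not a quotation from the paper), together with standard histogram kernels that fit the linear / quasi-linear / non-linear categories:

$$ K(h, h') = f\Bigl(\sum_b g(h_b, h'_b)\Bigr) $$

$$ K_{\mathrm{lin}}(h, h') = \sum_b h_b h'_b, \qquad K_{\chi^2}(h, h') = \sum_b \frac{2\, h_b h'_b}{h_b + h'_b}, \qquad K_{\mathrm{RBF}\text{-}\chi^2}(h, h') = \exp\bigl(-\gamma\, \chi^2(h, h')\bigr) $$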
Oxford Method – Post-Processing • The output of the last stage is a ranked list of 100 candidate regions per image. • Many of these regions correspond to multiple detections of the same object. • Non-maxima suppression is used. • At most 10 regions per image remain.
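A minimal sketch of greedy non-maxima suppression over ranked boxes (keep the best-scoring box, drop boxes that overlap it too much, repeat). The overlap threshold, the box format, and the cap of 10 kept regions are illustrative assumptions.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5, max_keep=10):
    """Greedy non-maxima suppression.

    boxes:  (n, 4) array of [x1, y1, x2, y2]; scores: (n,) confidences.
    Returns indices of the boxes kept, highest score first.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]          # candidates ranked by confidence
    keep = []
    while order.size > 0 and len(keep) < max_keep:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of box i with the remaining boxes
        xx1, yy1 = np.maximum(x1[i], x1[rest]), np.maximum(y1[i], y1[rest])
        xx2, yy2 = np.minimum(x2[i], x2[rest]), np.minimum(y2[i], y2[rest])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]  # drop overlapping, lower-scoring boxes
    return keep
```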
Oxford Method – Training/Retraining (1) • Jittered/flipped instances are used as positive samples. • Training images are partitioned into two subsets. • The classifiers are tested on each subset in turn, adding new hard negative samples for retraining.
Oxford Method – Results (3) Results are reported for four settings: • Training and testing on VOC2009. • Training and testing on VOC2007. • Training and testing on VOC2008. • Training on VOC2008 and testing on VOC2007.