Constructing Category Hierarchies for Visual Recognition Marcin Marszałek and Cordelia Schmid
Introduction • Hierarchical classification scales well in the number of classes: • O(n^2): one-vs-one • O(n): one-vs-rest • O(log(n)): classification tree • Previous work on constructing class hierarchies: • By hand [Zweig’07] • From external sources [Marszałek’07] • From visual similarities: • Exhaustive [Yuan’06] • Top-down [Chen’04, Griffin’08] • Bottom-up [Zhigang’05, Griffin’08]
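The scaling claims above can be made concrete with a small illustrative count of test-time classifier evaluations; this is a sketch for intuition, not code from either paper:

```python
import math

# Illustrative count of test-time binary-classifier evaluations for n
# classes under each multi-class strategy (a sketch, not from the paper).
def evaluations(n):
    return {
        "one-vs-one": n * (n - 1) // 2,            # O(n^2): one per class pair
        "one-vs-rest": n,                          # O(n): one per class
        "balanced-tree": math.ceil(math.log2(n)),  # O(log n): one per tree level
    }
```

For Caltech-256 (n = 256) this gives 32640, 256, and 8 evaluations respectively, which is why tree-structured classification is attractive at scale.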
Motivation: disjoint vs. overlapping partitions • Previous work: disjoint partitioning of classes (driven by class separability) • Finding a disjoint partition becomes increasingly difficult as the number of classes grows. • Proposal: a relaxed hierarchy – postpone uncertain classification decisions until the number of remaining classes is reduced and learning good decision boundaries becomes tractable.
Method • Building the relaxed hierarchy • Training top-down classifiers using the hierarchy
Building the top-down relaxed hierarchy • Using a balanced normalized cut, split the set of classes into two balanced, weakly-connected subsets • Further relaxation: given a partition, find the classes on the boundary and define a relaxed split in which boundary classes are assigned to both sides (α: overlap ratio)
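The split-and-relax idea can be sketched roughly as follows. This is an illustrative approximation, not the paper's exact objective: the class-similarity matrix `S`, the median-based balancing, and the margin-based rule for deciding which classes lie on the boundary are all assumptions of this sketch.

```python
import numpy as np

# Illustrative sketch: balanced spectral bipartition of a set of classes
# from a class-similarity matrix S, relaxed so that classes near the cut
# boundary join BOTH sides (alpha controls how wide the boundary is).
def relaxed_split(S, alpha=0.1):
    d = S.sum(axis=1)
    # Normalized Laplacian: L = I - D^{-1/2} S D^{-1/2}
    Dinv = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(S)) - Dinv @ S @ Dinv
    # Second-smallest eigenvector (Fiedler vector) encodes the cut
    vals, vecs = np.linalg.eigh(L)
    f = vecs[:, 1]
    median = np.median(f)                  # median split keeps sides balanced
    margin = alpha * (f.max() - f.min())   # boundary width (assumption)
    left = set(np.where(f <= median + margin)[0])   # L plus boundary classes
    right = set(np.where(f >= median - margin)[0])  # R plus boundary classes
    return left, right   # overlap X = left & right
```

With `alpha = 0` this degenerates to a disjoint balanced split; increasing `alpha` grows the overlap set X, which is exactly the relaxation the slide describes.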
Train/test top-down classifiers • Training the hierarchy: • At each node n of the DAG, samples of classes in Ln \ Rn are the positive samples and samples of classes in Rn \ Ln are the negative samples; • Samples of the overlap classes Xn = Ln ∩ Rn are not used for training. • Testing: • Traverse the DAG until a leaf is reached. • The decision is either directly the class label (for leaves containing a single class) or the result of one-vs-rest (OAR) classification over the remaining classes at the leaf.
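A minimal sketch of the node-level training-set selection and the test-time traversal described above; the `Node` structure and the classifier interfaces are illustrative assumptions, not the paper's implementation:

```python
# Sketch: training-set selection at a DAG node and top-down traversal.
# Node layout and classifier callables are assumptions of this sketch.
class Node:
    def __init__(self, L, R, left=None, right=None):
        self.L, self.R = set(L), set(R)      # possibly overlapping class sets
        self.left, self.right = left, right  # child nodes (None at a leaf)

def training_sets(node, samples_by_class):
    pos = [s for c in node.L - node.R for s in samples_by_class[c]]  # Ln \ Rn
    neg = [s for c in node.R - node.L for s in samples_by_class[c]]  # Rn \ Ln
    # classes in Xn = Ln & Rn are deliberately excluded at this node
    return pos, neg

def classify(node, x, node_classifier, one_vs_rest):
    # descend the DAG until a leaf (no children) is reached
    while node.left is not None:
        node = node.left if node_classifier(node, x) > 0 else node.right
    remaining = node.L | node.R
    if len(remaining) == 1:                 # leaf with a single class
        return next(iter(remaining))
    return one_vs_rest(remaining, x)        # OAR over the remaining classes
```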
Results: one-vs-rest • Figure: confusion between mountain/touring bikes; classes range from high to low intra-class variability
Class hierarchies: Caltech-256 • Hand-crafted hierarchy • Relaxed hierarchy • Disjoint visual hierarchy • Top-level categories: animals, natural phenomena, and man-made objects
Results Average per-class accuracy on Caltech-256
Results (cont.) • Complexity in the number of classes • r: number of relaxed training samples per class • Speed-for-accuracy trade-off
Learning and Using Taxonomies For Fast Visual Categorization Gregory Griffin and Pietro Perona
Motivation • Given a test sample: • One-vs-rest strategy: expensive, O(# categories) classifier evaluations • Hierarchical strategy: inexpensive, O(log2(# categories)) evaluations
Methods • Building confusion matrix • Building Taxonomies • Re-train top-down classifiers
Building the confusion matrix • Multi-class classification with a one-vs-rest strategy • Classifier: Spatial Pyramid Matching • Using training data only, with leave-one-out (LOO) validation
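The confusion-matrix accumulation step can be sketched as follows, assuming the leave-one-out predictions have already been produced by the one-vs-rest classifier (the classifier itself is outside this sketch):

```python
import numpy as np

# Sketch: build a row-normalized confusion matrix from (true, predicted)
# label pairs, e.g. from leave-one-out validation on the training set.
# Assumes every class appears at least once among the true labels.
def confusion_matrix(true_labels, predicted_labels, n_classes):
    C = np.zeros((n_classes, n_classes))
    for t, p in zip(true_labels, predicted_labels):
        C[t, p] += 1
    # normalize rows so each row gives per-class confusion rates
    return C / C.sum(axis=1, keepdims=True)
```

Row i then reads as "when the true class is i, how often each class is predicted", which is the quantity the clustering step below groups on.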
Building taxonomies • Intuition: • Categories that are easily confused should be grouped together; • Decisions between easily confused categories should be taken later (deeper) in the decision tree. • Methods: • Self-tuning spectral clustering • Greedy, bottom-up grouping using mutual information
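The greedy bottom-up grouping can be sketched as below. Note the hedge: the paper scores merges by mutual information, while this illustrative stand-in simply merges the pair of groups with the largest symmetrized confusion mass.

```python
# Sketch: agglomerative taxonomy construction from a confusion matrix C.
# Stand-in merge criterion: largest symmetrized confusion between groups
# (the paper uses mutual information instead).
def _key(x, y):
    return (max(x, y), min(x, y))  # canonical ordering for pair keys

def greedy_taxonomy(C):
    n = len(C)
    # symmetrized confusion between every pair of singleton clusters
    sim = {(i, j): (C[i][j] + C[j][i]) / 2.0
           for i in range(n) for j in range(i)}
    clusters = {i: (i,) for i in range(n)}
    while len(clusters) > 1:
        a, b = max(sim, key=sim.get)               # most-confused pair
        clusters[a] = (clusters[a], clusters[b])   # merge: nested-tuple node
        del clusters[b]
        for c in clusters:
            if c != a:
                # confusion with the merged cluster = sum over its parts
                sim[_key(a, c)] = sim.get(_key(a, c), 0.0) + sim.pop(_key(b, c), 0.0)
        sim = {k: v for k, v in sim.items() if b not in k}
    return next(iter(clusters.values()))   # binary taxonomy as nested tuples
```

On a block-structured confusion matrix this groups the mutually confused categories first, so the hardest distinctions end up deepest in the tree, matching the intuition above.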
Re-train top-down classifiers • Given the taxonomy tree of categories as a binary tree • At each node, formulate a binary classification problem • Again using Spatial Pyramid Matching + SVM • F_{train} = 10%
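Given a binary taxonomy as nested tuples, enumerating the per-node binary training problems (left-subtree categories vs. right-subtree categories) can be sketched as follows; the actual node classifiers (SPM + SVM in the paper) are not part of this sketch:

```python
# Sketch: list the binary training problems implied by a taxonomy tree,
# one per internal node. Trees are nested tuples; a leaf is e.g. (3,).
def leaves(tree):
    if isinstance(tree[0], int):          # a leaf like (3,)
        return set(tree)
    return leaves(tree[0]) | leaves(tree[1])

def node_problems(tree):
    if isinstance(tree[0], int):          # leaves need no classifier
        return []
    left, right = tree
    problems = [(leaves(left), leaves(right))]   # this node's binary task
    return problems + node_problems(left) + node_problems(right)
```

Each (positive-set, negative-set) pair is then handed to whatever binary learner is used at the node, which is how the taxonomy is "re-trained" top-down.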
Results • Taxonomy tree for Caltech-256 • Red: insects • Yellow: birds • Green: land mammals • Blue: aquatic mammals
Trade-off between performance and speed • Spectral clustering vs. greedy clustering • A: ordinary one-vs-rest multi-class classifier • B: intermediate level • C: each test image goes through the full tree • N_{train} = 10; 5x speed-up with a 10% performance drop
Results • Cascade performance/speed trade-off as a function of the number of training examples per class • 20x speed-up with a 10% performance drop for N_{train} = 50