A Generic Approach for Image Classification Based on Decision Tree Ensembles and Local Sub-windows

This research paper presents a generic approach to image classification that combines decision tree ensembles with local sub-windows. The approach works directly on pixel values, so it needs no problem-specific feature extraction, and it achieves competitive results on four well-known datasets (MNIST, ORL, COIL-100, OUTEX). The paper also discusses the robustness of the approach and directions for future improvement.


Presentation Transcript


  1. Abstract
     A Generic Approach for Image Classification Based on Decision Tree Ensembles and Local Sub-windows
     Raphaël Marée, Pierre Geurts, Justus Piater, Louis Wehenkel
     University of Liège, Belgium
     maree@montefiore.ulg.ac.be
     • Problem
       • Many application domains require classification of characters, symbols, faces, 3D objects, textures, …
       • Specific feature extraction methods must be manually adapted when considering a new application
     • Approach
       • Recent and generic ML algorithm based on decision tree ensembles and working directly on pixel values
       • Extension with local sub-window extraction
     • Results
       • Competitive with the state of the art on four well-known datasets: MNIST, ORL, COIL-100, OUTEX
       • Encouraging results for robustness (generalisation, rotation, scaling, occlusion)

  2. Image classification
     • Many different kinds of problems
     • Usually tackled using:
       • Problem-specific feature extraction, i.e. extracting a reduced set of « interesting » features from the initially huge number of pixels
       • + a learning or matching algorithm
     • Our generic approach:
       • Working directly on pixel values, i.e. without any feature extraction: images are described by the integer values (grey or RGB intensities) of all their pixels
       • + an ensemble of decision trees

  3. Global generic approach
     • Ensemble of extremely randomized trees (extra-trees)
     • Learning
       • Top-down induction algorithm, like a classical decision tree, with tests at the internal nodes of the form [a_{k,l} < a_th] that compare the value of the pixel at position (k,l) to a threshold a_th, but:
         • Test attributes and thresholds in internal nodes are chosen randomly,
         • Each tree is fully developed until it perfectly classifies the images in the learning sample,
         • Several extra-trees are built from the same learning sample.
     • Testing
       • Propagate the entire test image successively into all the trees (which involves comparing pixel values to thresholds in test nodes) and assign to the image the majority class among the classes given by the trees.
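The learning and testing steps of the global approach can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: it uses the fully random variant (one random pixel and one random threshold per node, with no scoring of candidate tests), images are flattened lists of integer pixel values, and all names (`build_tree`, `predict_tree`, `build_ensemble`, `predict_ensemble`, the toy data) are hypothetical.

```python
import random
from collections import Counter


def build_tree(images, labels, rng):
    """Grow one fully developed, totally randomized tree (assumed variant).

    images: flattened images (lists of integer pixel values).
    labels: one class label per image.
    """
    if len(set(labels)) == 1:
        return {"label": labels[0]}  # pure node: stop splitting

    # Only pixels taking at least two values in this node can split it.
    usable = [k for k in range(len(images[0]))
              if len({img[k] for img in images}) > 1]
    if not usable:
        # Identical images with different labels: fall back to the majority.
        return {"label": Counter(labels).most_common(1)[0][0]}

    k = rng.choice(usable)                                  # random pixel
    values = [img[k] for img in images]
    threshold = rng.randint(min(values) + 1, max(values))   # random threshold

    left = [i for i, v in enumerate(values) if v < threshold]
    right = [i for i, v in enumerate(values) if v >= threshold]
    return {
        "pixel": k,
        "threshold": threshold,
        "left": build_tree([images[i] for i in left],
                           [labels[i] for i in left], rng),
        "right": build_tree([images[i] for i in right],
                            [labels[i] for i in right], rng),
    }


def predict_tree(tree, image):
    """Propagate one image down a tree by testing pixel < threshold."""
    node = tree
    while "label" not in node:
        branch = "left" if image[node["pixel"]] < node["threshold"] else "right"
        node = node[branch]
    return node["label"]


def build_ensemble(images, labels, n_trees, seed=0):
    """Build several trees from the same learning sample."""
    rng = random.Random(seed)
    return [build_tree(images, labels, rng) for _ in range(n_trees)]


def predict_ensemble(trees, image):
    """Assign the majority class among the classes given by the trees."""
    votes = Counter(predict_tree(tree, image) for tree in trees)
    return votes.most_common(1)[0][0]


# Toy learning sample: four 4-pixel "images" in two made-up classes.
LS_IMAGES = [[0, 0, 0, 0], [10, 5, 0, 3],
             [200, 255, 250, 240], [255, 255, 255, 255]]
LS_LABELS = ["dark", "dark", "bright", "bright"]
trees = build_ensemble(LS_IMAGES, LS_LABELS, n_trees=25, seed=0)
```

Because each tree is grown until its leaves are pure, every tree reproduces the labels of the learning sample. Any benefit on unseen images comes from averaging many differently randomized trees.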

  4. Local generic approach
     • Extra-trees and Sub-windows
     • Learning
       • Given a window size w1 x w2 and a large number Nw:
         • Extract Nw sub-windows at random from the learning set images and assign to each sub-window the classification of its parent image;
         • Build a model to classify these Nw sub-windows by using the w1 x w2 pixel values that characterize them.
     • Testing
       • Given the window size w1 x w2:
         • Extract all possible sub-windows of size w1 x w2 from the test image;
         • Apply the model to each sub-window;
         • Assign to the image the majority class among the classes assigned to the sub-windows by the model.
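The sub-window procedure can be sketched independently of the learning algorithm. The example below is a minimal illustration under assumptions not stated on the slide: images are lists of pixel rows, w1 is treated as the window width and w2 as its height, and `model` is any callable that maps a flattened sub-window to a class. All function names and the toy model are hypothetical.

```python
import random
from collections import Counter


def window(image, top, left, w1, w2):
    """Flatten the sub-window of width w1 and height w2 at (top, left)."""
    return [image[r][c]
            for r in range(top, top + w2)
            for c in range(left, left + w1)]


def random_subwindows(images, labels, w1, w2, n_windows, rng):
    """Learning: draw n_windows sub-windows at random from the learning set.

    Each sub-window inherits the class of its parent image.
    """
    samples, targets = [], []
    for _ in range(n_windows):
        i = rng.randrange(len(images))
        img = images[i]
        top = rng.randint(0, len(img) - w2)
        left = rng.randint(0, len(img[0]) - w1)
        samples.append(window(img, top, left, w1, w2))
        targets.append(labels[i])
    return samples, targets


def all_subwindows(image, w1, w2):
    """Testing: enumerate every possible w1 x w2 sub-window of one image."""
    return [window(image, top, left, w1, w2)
            for top in range(len(image) - w2 + 1)
            for left in range(len(image[0]) - w1 + 1)]


def classify_image(image, model, w1, w2):
    """Assign the majority class among the sub-window predictions."""
    votes = Counter(model(win) for win in all_subwindows(image, w1, w2))
    return votes.most_common(1)[0][0]


def toy_model(win):
    """Stand-in classifier (an assumption): thresholds the mean intensity."""
    return "bright" if sum(win) / len(win) >= 128 else "dark"


# Two toy 4-row by 5-column grey-level images.
bright_image = [[255] * 5 for _ in range(4)]
dark_image = [[0] * 5 for _ in range(4)]
samples, targets = random_subwindows(
    [bright_image, dark_image], ["bright", "dark"],
    w1=2, w2=3, n_windows=50, rng=random.Random(0))
```

In the approach described on the slide, the stand-in `toy_model` would be replaced by a model trained on the `samples` and `targets` returned by `random_subwindows`, for instance an ensemble of extra-trees.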

  5. Experiments: description
     • Database specification: every image in each database is described by all its pixel values and belongs to one class.
     • Database protocols: each database is separated into two independent sets, the learning set (LS) of pre-classified images used to build a model and the test set (TS) used to evaluate the model.
     • MNIST
       • LS: first 60000 images
       • TS: remaining 10000 images
     • ORL
       • 100 random runs:
         • LS: 200 images
         • TS: 200 remaining images
     • COIL-100
       • LS: 1800 images (k*20°, k=0..17)
       • TS: 5400 remaining images
     • OUTEX
       • LS: 432 images
       • TS: 432 remaining images
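The COIL-100 figures can be checked with a short computation. The sketch below assumes the standard layout of that dataset (100 objects, each photographed every 5° over a full turn, i.e. 72 views per object), which the slide does not spell out. The variable names are illustrative only.

```python
N_OBJECTS = 100                  # assumption: COIL-100 contains 100 objects
ALL_ANGLES = range(0, 360, 5)    # assumption: one view every 5 degrees
LS_ANGLES = {k * 20 for k in range(18)}   # k*20°, k = 0..17 (from the slide)

# Every (object, angle) pair is one image; split them by viewing angle.
learning_set = [(obj, angle) for obj in range(N_OBJECTS)
                for angle in ALL_ANGLES if angle in LS_ANGLES]
test_set = [(obj, angle) for obj in range(N_OBJECTS)
            for angle in ALL_ANGLES if angle not in LS_ANGLES]

print(len(learning_set), len(test_set))
```

Under these assumptions the split yields 18 learning views and 54 test views per object, which matches the 1800 and 5400 images listed above.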

  6. Experiments: results
     • Error rates on test sets
     • References
       • [1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, 1998
       • [2] R. Paredes and A. Perez-Cortes, Local representations and a direct voting scheme for face recognition, 2001
       • [3] S. Obdržálek and J. Matas, Object recognition using local affine frames on distinguished regions, 2002
       • [4] T. Mäenpää, M. Pietikäinen, and J. Viertola, Separating color and pattern information for color texture discrimination, 2002
     • Computing times
       • Learning on OUTEX
         • Extra-trees: ≈ 5 s
         • Extra-trees + Sub-windows: ≈ 8 min
       • Testing on OUTEX (one image)
         • Extra-trees: < 1 ms
         • Extra-trees + Sub-windows: ≈ 0.6 s

  7. Evaluation of Robustness
     • Generalisation: considering different learning sample sizes (COIL-100)
     • Rotation: image-plane rotation of the test images (COIL-100)
     • Scaling: scaled versions of the test images, with the model built from 32x32 images (COIL-100)
     • Occlusion: erasing right parts of the test images (COIL-100)
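Of the four robustness tests, occlusion is the simplest to reproduce. The sketch below shows one possible way to erase the right part of a test image; the slide specifies neither the fill value nor how the erased width is rounded, so `fill=0` and rounding to the nearest column are assumptions, and `erase_right` is a hypothetical name.

```python
def erase_right(image, fraction, fill=0):
    """Return a copy of image with its rightmost columns replaced by fill.

    image: list of pixel rows; fraction: share of the width to erase (0..1).
    """
    width = len(image[0])
    n_erased = int(round(fraction * width))   # assumed rounding rule
    keep = width - n_erased
    return [row[:keep] + [fill] * n_erased for row in image]


# Erase the right half of a toy 2 x 4 image.
toy_image = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
half_occluded = erase_right(toy_image, 0.5)
```

Applying such a function to every test image for increasing values of `fraction`, then re-measuring the error rate, would give an occlusion curve of the kind the slide refers to.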

  8. Conclusion
     • Novel, generic, and simple method
     • Competitive accuracy
       • Our local generic method (Extra-trees + Sub-windows) is close to state-of-the-art methods without any problem-specific feature extraction, but still slightly inferior to the best results
       • In practice, is it necessary to develop specific methods to obtain slightly better accuracy?
     • Invariance
       • Robustness to small transformations in test images
       • Local approach more robust than global approach (many local feature vectors are left more or less intact by a given image transformation)
     Future work directions
     • Improving robustness
       • Augmenting the learning sample with transformed versions of the original images
       • Normalization of sub-window sizes and orientations
     • Speed/accuracy trade-off for prediction
     • Combining Sub-windows with other Machine Learning algorithms
