This work applies the SMART-TV algorithm to image classification using the KNN method. The algorithm uses the P-tree vertical data structure to scale to large image databases. Color and texture features are extracted and used for classification; the algorithm classifies faster than classical KNN with comparable accuracy.
Efficient Image Classification on Vertically Decomposed Data Taufik Abidin, Aijuan Dong, Hongli Li, and William Perrizo Computer Science North Dakota State University The 1st IEEE International Workshop on Multimedia Databases and Data Management (MDDM-06)
Outline • Image classification • The application of SMART-TV algorithm in image classification • SMART-TV algorithm • Experimental results • Summary
Image Classification • Why classify images? • The proliferation of digital images • The need to organize them into semantic categories for effective browsing and retrieval • Techniques for image classification: • SVM, Bayesian, Neural Network, KNN
Image Classification Cont. • In this work, we focus on the KNN method • KNN is widely used in image classification: • Simple and easy to implement • Good classification results • Problems: • Classification time is linear in the size of the image repository • When the repository is very large, containing millions of images, KNN is impractical
Our Contributions • We apply our recently developed classification algorithm, SMART-TV, to the image classification task and analyze its performance • We demonstrate that SMART-TV, a classification algorithm that uses the P-tree vertical data structure, is fast and scalable to very large image databases • We show that, for Corel images, a combination of color and texture features is a good alternative for representing the low-level features of images
Image Preprocessing • We extracted color and texture features from the original pixels of the images • We created a 54-dimensional color histogram in HSV (6x3x3) color space for the color features, and created 8 multi-resolution Gabor filters (4 orientations and 2 scales) to extract the texture features of the images (see B.S. Manjunath, IEEE Trans. on Pattern Analysis and Machine Intelligence, 1996, for more detail about the filter)
Image Preprocessing Cont. Color Features • Convert RGB to HSV • HSV is closer to the way humans tend to perceive color • Each HSV value is in the range 0..1 • Quantize the image into 54 bins, i.e. (6 x 3 x 3) bins • Record the frequency of the HSV bin of each pixel in the image
Image Preprocessing Cont. Texture Features • Transform the images into the frequency domain using the 8 generated filters (4 orientations and 2 scales) and record the mean and the standard deviation of the pixel values in the image after each transformation • This process produces 16 texture features for each image
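The color-feature steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `hsv_histogram`, the 0..255 input range, the bin ordering, and the normalization of counts to relative frequencies are all assumptions made for the example.

```python
import colorsys

def hsv_histogram(pixels, bins=(6, 3, 3)):
    """Quantize (r, g, b) pixels in 0..255 into an HSV histogram.

    Assumption: frequencies are normalized so the histogram sums to 1.
    """
    h_bins, s_bins, v_bins = bins
    hist = [0] * (h_bins * s_bins * v_bins)
    count = 0
    for r, g, b in pixels:
        # colorsys works on floats in 0..1 and returns h, s, v in 0..1
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # min(...) keeps a value of exactly 1.0 inside the last bin
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1
        count += 1
    return [c / count for c in hist] if count else hist

# Two red pixels, one white, one black
hist = hsv_histogram([(255, 0, 0), (255, 0, 0), (255, 255, 255), (0, 0, 0)])
```

With the default 6 x 3 x 3 bins the result has 54 entries, matching the 54 color attributes used in the dataset.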
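A rough sketch of how 8 filters yield 16 texture features is given below. It is a simplification under stated assumptions: it uses a generic real-valued Gabor kernel applied in the spatial domain (which corresponds to multiplication in the frequency domain), not the exact filter bank of the cited Manjunath paper, and the names `gabor_kernel`, `filter_valid`, `texture_features` and the scale parameters are hypothetical.

```python
import math

def gabor_kernel(sigma, wavelength, theta, half=3):
    """Real-valued Gabor kernel with (2*half+1) rows and columns."""
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # rotate coordinates by the filter orientation
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def filter_valid(image, kernel):
    """Absolute filter responses at positions where the kernel fits entirely."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    out = []
    for i in range(rows - k + 1):
        for j in range(cols - k + 1):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            out.append(abs(acc))
    return out

def texture_features(image, scales=((2.0, 4.0), (4.0, 8.0)), orientations=4):
    """Mean and standard deviation of each response: 2 scales x 4 orientations x 2."""
    feats = []
    for sigma, wavelength in scales:  # assumed (sigma, wavelength) pairs
        for o in range(orientations):
            theta = o * math.pi / orientations
            resp = filter_valid(image, gabor_kernel(sigma, wavelength, theta))
            mean = sum(resp) / len(resp)
            var = sum((r - mean) ** 2 for r in resp) / len(resp)
            feats.extend([mean, math.sqrt(var)])
    return feats

# A 12 x 12 grayscale image with vertical stripes, two pixels wide
stripes = [[1.0 if (j // 2) % 2 == 0 else 0.0 for j in range(12)] for _ in range(12)]
features = texture_features(stripes)
```

The pure-Python loops are only practical for small images; a real pipeline would use an FFT or an image-processing library.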
Overview of SMART-TV • Preprocessing Phase: Large Training Set → Compute Root Counts → Measure TV of each object in each class → Store the root count and TV values • Classifying Phase: Unclassified Object → Approximate the candidate set of NNs → Search the K-nearest neighbors from the candidate set → Vote
SMART-TV Algorithm • SMART-TV: SMall Absolute diffeRence of ToTal Variation • Approximates a set of candidates of nearest neighbors by examining the absolute difference between the total variation of each data object in the training set and the total variation of the unclassified object • The k-nearest neighbors are searched from the candidate set • Computing Total Variation (TV):
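Total Variation • The total variation of a set X about a point a, TV(X, a), measures the total squared separation of the objects in X about a (a may be the mean of X), defined as follows: $$TV(X, a) = \sum_{i=1}^{n} \sum_{j=1}^{d} (x_{ij} - a_j)^2$$ where n is the number of objects in X and d is the number of dimensions • [Figure: total variation plotted over a grid with axis ticks 1 to 5; annotation: TV(X, μ) = TV(X, x33)]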
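As a small worked example, total variation taken as the sum of squared differences between every object in X and a reference point a can be computed directly. The function name `total_variation` is chosen for this sketch only.

```python
def total_variation(X, a):
    """Sum of squared coordinate differences between each object in X and point a."""
    return sum((xj - aj) ** 2 for x in X for xj, aj in zip(x, a))

X = [(1, 1), (3, 3)]
tv_about_mean = total_variation(X, (2, 2))    # (1 + 1) + (1 + 1) = 4
tv_about_origin = total_variation(X, (0, 0))  # (1 + 1) + (9 + 9) = 20
```

The value is smallest when a is the mean of X and grows as a moves away from it, which is what makes the difference in total variation usable as a cheap proximity filter.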
The Independence of RC • The root count operations are independent of a, the point about which the total variation is measured, which allows us to run the operations once in advance and retain the count results • In a classification task, the sets of classes are known and unchanged. Thus, the total variation of an object about its class can be pre-computed
Preprocessing Phase Preprocessing: • Compute the root counts of each class Cj, where 1 ≤ j ≤ number of classes. Cost: O(kdb^2), where k is the number of classes, d is the number of dimensions, and b is the bit-width • Compute TV(Cj, x) for every training object x in Cj, 1 ≤ j ≤ number of classes. Cost: O(n), where n is the number of images in the training set
Classifying Phase Classifying: • For each class Cj, where 1 ≤ j ≤ number of classes, do: a. Compute TV(Cj, a), where a is the feature vector of the unclassified image b. Find the hs images in Cj for which the absolute difference between the total variation of the image about Cj and the total variation of a about Cj is the smallest, i.e. let A be an array with A[i] = |TV(Cj, x_i) − TV(Cj, a)|, where x_i ∈ Cj, and select the hs smallest entries c. Store the IDs of these images in an array TVGapList
Classifying Phase (Cont.) • For each objectID_t, 1 ≤ t ≤ Len(TVGapList), where Len(TVGapList) is equal to hs times the total number of classes, retrieve the corresponding object features x_t from the training set, measure the pair-wise Euclidean distance between x_t and a, i.e. $$d_2(x_t, a) = \sqrt{\sum_{j=1}^{d} (x_{tj} - a_j)^2}$$ and determine the k nearest neighbors of a • Vote the class label for a from the k nearest neighbors
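One way to see why the counting work can be done in advance is to expand the squared differences, taking total variation as the sum of squared coordinate differences between the n objects of X and a point a. The expansion below is plain algebra, not a reproduction of the original slide.

```latex
\begin{aligned}
TV(X, a) &= \sum_{i=1}^{n} \sum_{j=1}^{d} (x_{ij} - a_j)^2 \\
         &= \sum_{j=1}^{d} \Big( \sum_{i=1}^{n} x_{ij}^2 \;-\; 2\, a_j \sum_{i=1}^{n} x_{ij} \;+\; n\, a_j^2 \Big)
\end{aligned}
```

The inner sums over i involve only the training data, so they can be computed once and reused; only the terms in a_j change from one unclassified object to the next.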
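The b^2 factor in the preprocessing cost is consistent with combining pairs of bit slices. The sketch below illustrates that reading under loud assumptions: features are non-negative integers, plain Python integers used as bitmaps stand in for P-trees (no compression), and the names `bit_slices`, `root_count`, `column_sums` and `tv_1d` are hypothetical.

```python
def bit_slices(values, bit_width):
    """Vertical decomposition: slices[i] has bit r set when bit i of values[r] is 1."""
    slices = [0] * bit_width
    for row, v in enumerate(values):
        for i in range(bit_width):
            if (v >> i) & 1:
                slices[i] |= 1 << row
    return slices

def root_count(bitmap):
    """Number of 1 bits in a bitmap."""
    return bin(bitmap).count("1")

def column_sums(slices):
    """Return (sum of x, sum of x^2) for one attribute using only bit counts."""
    b = len(slices)
    s1 = sum((1 << i) * root_count(slices[i]) for i in range(b))
    # x^2 = sum_i sum_l 2^(i+l) * bit_i * bit_l, hence b^2 AND-and-count steps
    s2 = sum((1 << (i + l)) * root_count(slices[i] & slices[l])
             for i in range(b) for l in range(b))
    return s1, s2

def tv_1d(s1, s2, n, a):
    """Total variation of one attribute about a, from the precomputed sums."""
    return s2 - 2 * a * s1 + n * a * a

values = [3, 5, 6, 7]
s1, s2 = column_sums(bit_slices(values, 3))
tv = tv_1d(s1, s2, len(values), 5)  # (3-5)^2 + 0 + (6-5)^2 + (7-5)^2
```

Once the two sums are stored per class and per dimension, the total variation about any new point costs a constant amount of work per dimension, independent of the class size.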
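The two classifying slides can be sketched end to end as below. This is an illustration rather than the authors' implementation: it computes total variation directly instead of from stored root counts, keeps feature vectors in place of object IDs, and the names `preprocess` and `classify` are chosen for the example.

```python
import math
from collections import Counter

def total_variation(X, a):
    """Sum of squared coordinate differences between each object in X and point a."""
    return sum((xj - aj) ** 2 for x in X for xj, aj in zip(x, a))

def euclidean(x, a):
    return math.sqrt(sum((xj - aj) ** 2 for xj, aj in zip(x, a)))

def preprocess(training):
    """training maps class label -> list of feature vectors.

    Returns, per class, a list of (TV of the object about its class, object).
    """
    return {label: [(total_variation(X, x), x) for x in X]
            for label, X in training.items()}

def classify(training, model, a, hs, k):
    candidates = []  # plays the role of TVGapList
    for label, X in training.items():
        tv_a = total_variation(X, a)
        # keep the hs objects whose TV is closest to the TV of a
        by_gap = sorted(model[label], key=lambda t: abs(t[0] - tv_a))
        candidates.extend((x, label) for _, x in by_gap[:hs])
    # exact k-nearest-neighbor search, restricted to the candidate set
    nearest = sorted(candidates, key=lambda c: euclidean(c[0], a))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

training = {
    "a": [(0, 0), (1, 0), (0, 1)],
    "b": [(10, 10), (11, 10), (10, 11)],
}
model = preprocess(training)
label_near_a = classify(training, model, (0.5, 0.5), hs=2, k=3)
label_near_b = classify(training, model, (10.5, 10.5), hs=2, k=3)
```

The distance computation touches at most hs times the number of classes objects, which is where the saving over a full KNN scan comes from.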
Dataset We used Corel images (http://wang.ist.psu.edu/docs/related) • 10 categories • Originally, each category has 100 images • 70 feature attributes (54 from color and 16 from texture) • We randomly generated several larger datasets to evaluate the speed and scalability of the algorithms • Testing set: 50 images, 5 from each category
Experimental Results • Experimental setup: Intel P4 2.6 GHz machine with 3.8 GB RAM running Red Hat Linux • [Figure: Classification Accuracy Comparison]
Experimental Results Cont. • [Figure: Loading Time] • [Figure: Classification Time]
Summary • We have presented the SMART-TV algorithm, a classification algorithm that uses a vertical data structure, and applied it to the image classification task • We found that our algorithm is faster than the classical KNN algorithm • Our method scales well to large image repositories. Its classification accuracy is comparable to that of the KNN algorithm