Class distributions on SOM surfaces for feature extraction and object retrieval Advisor: Dr. Hsu Graduate: Kuo-min Wang Authors: Jorma T. Laaksonen*, J. Markus Koskela, Erkki Oja 2005 Expert Systems with Applications.
Outline • Motivation • Objective • Introduction • Class Distributions • BMU Probabilities • BMU Entropy • SOM Surface Convolutions • Multiple Feature Extractions • Bayesian Decision Estimation • Conclusions • Personal Opinions
Motivation • A Self-Organizing Map (SOM) is typically trained in unsupervised mode, using a large batch of training data. • Even from the same data, qualitatively different distributions can be obtained by using different feature extraction techniques.
Objective • We use such distributions for comparing different classes and different feature representations of the data in our content-based image retrieval system PicSOM. • The information-theoretic measures of entropy and mutual information are suggested to evaluate the compactness of a distribution and the independence of two distributions.
Introduction • Image retrieval consists of four main stages: segmentation, feature extraction, representation, and query processing • Segmentation • Partitions the image into distinct regions; in most cases this means locating the edges of the objects in the image and then deciding whether each region is meaningful • Feature extraction • Describes the characteristics of a region of the image • Extraction is directly tied to the chosen representation, since different representations require different extraction methods • Typical features: color, shape, and texture
Introduction (cont.) • We study how object class histograms on SOMs can be interpreted in terms of probability densities and information-theoretic measures • Entropy and mutual information (Cover & Thomas, 1991) • A good feature • the class is heavily concentrated on only a few nearby map units, giving a low value of entropy • The mutual information of two features’ distributions is a measure of how independent those features are
Class Distributions • Normalized to unit sum, the hit frequencies give a discrete histogram which is a sample estimate of the probability distribution of the class on the SOM surface. • The shape of the distribution depends on several factors • The distribution of the original data • We cannot control the very-high-dimensional pattern space • The feature extraction technique in use affects the metrics and the distribution of all the generated feature vectors • Feature invariance • Some pattern space directions are retained better than others. • When the feature works properly, semantically similar patterns are mapped near each other
Class Distributions (cont.) • The overall shape of the training set • After the training set has been mapped from the original data space to the feature vector space, its overall shape determines the overall organization of the SOM. • The class distribution of the studied object subset or class is seen relative to the overall shape of the feature vector distribution.
Class Distributions (cont.) • Measuring the denseness or locality of feature vectors on a SOM • SDH (Pampalk, Rauber, & Merkl, 2002) • Each data point is mapped not only to its nearest SOM unit but to its s nearest units with reciprocally decreasing fractions (see the sketch below). • Quantitative locality measures • Map usage, average pair distance, fragmentation, and purity (Pullwitt, 2002) • These fail to take into account the topological structure of the class
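Below is a minimal Python sketch of the SDH idea, assuming `X` holds the feature vectors, `weights` the SOM codebook vectors, and that the reciprocally decreasing fractions are 1, 1/2, …, 1/s (the exact fractions are an assumption of this sketch, not the paper's formulation):

```python
import numpy as np

def sdh_histogram(X, weights, s=3):
    """Smoothed data histogram: every sample contributes to its s nearest
    map units with reciprocally decreasing fractions (assumed 1, 1/2, ..., 1/s)."""
    # Squared Euclidean distance from every sample to every map unit.
    d2 = ((X[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argsort(axis=1)[:, :s]      # s nearest units per sample
    frac = 1.0 / np.arange(1, s + 1)         # fractions 1, 1/2, ..., 1/s
    frac /= frac.sum()                       # each sample contributes 1 in total
    hist = np.zeros(len(weights))
    np.add.at(hist, nearest, frac)           # frac broadcasts over the samples
    return hist / hist.sum()                 # normalize to unit sum
```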
BMU Probabilities • Calculate the a priori probability of each SOM unit being the BMU for any vector x of the feature space. • This probability is obtained from the probability density function (pdf) of the data
BMU Probabilities (cont.) • Voronoi region • The set of vectors in the original feature space that lie closer to the weight vector of unit i than to any other weight vector • We are actually replacing the continuous pdf with a discrete probability histogram by counting the number of times each map unit is the BMU. • The probability histogram of class C on the SOM surface can be computed as sketched below
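A minimal sketch of such a class histogram, assuming `X_class` holds the feature vectors of class C and `weights` the K SOM codebook vectors (both names are illustrative):

```python
import numpy as np

def class_histogram(X_class, weights):
    """Unit-sum BMU hit histogram of one class on the SOM surface."""
    # Nearest codebook vector (BMU) for every feature vector of the class.
    d2 = ((X_class[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmu = d2.argmin(axis=1)
    hist = np.bincount(bmu, minlength=len(weights)).astype(float)
    return hist / hist.sum()                 # normalize to unit sum
```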
BMU Entropy • The entropy H of a distribution P = (P0, P1, …, PK−1) is calculated as H(P) = −Σi Pi log2 Pi, with the sum over i = 0, …, K−1 and the convention 0·log 0 = 0 (a sketch follows)
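The computation is direct; a minimal sketch with the 0·log 0 = 0 convention handled explicitly:

```python
import numpy as np

def bmu_entropy(P):
    """Entropy H(P) of a unit-sum BMU histogram, with 0*log2(0) taken as 0."""
    P = np.asarray(P, dtype=float)
    nz = P > 0                               # skip empty map units
    return float(-(P[nz] * np.log2(P[nz])).sum())
```

A class concentrated on a few units gives a low entropy, while a class spread uniformly over all K units reaches the maximum, log2 K.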
SOM Surface Convolutions • Drawback of entropy • The calculation of entropies does not take into account the spatial topology of the SOM units in any way. • Yet it is the topological order of the units that separates the SOM from other vector quantization methods. • The convolution method bears similarity to the smoothed data histogram approach (Pampalk et al., 2002) • data points are not mapped one-to-one to their BMUs but spread into the s closest map units in the feature space.
SOM Surface Convolutions (cont.) • The larger the convolution window, the smoother the overall shape of the distribution, because details vanish. • Selecting a proper size for the convolution mask can be identified as a form of the general scale-space problem (a sketch of such a convolution follows)
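One way to realize such a surface convolution, assuming the histogram has been reshaped to the 2-D map grid and using an illustrative 3×3 triangular mask (the mask choice is an assumption; the paper leaves the mask size as the scale-space parameter):

```python
import numpy as np
from scipy.signal import convolve2d

def smooth_on_surface(hist2d, mask):
    """Convolve a 2-D SOM hit histogram with a mask and renormalize to unit sum."""
    smoothed = convolve2d(hist2d, mask, mode="same", boundary="symm")
    return smoothed / smoothed.sum()

# An assumed 3x3 triangular mask; a larger mask gives a smoother surface.
mask = np.array([[1., 2., 1.],
                 [2., 4., 2.],
                 [1., 2., 1.]])
mask /= mask.sum()
```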
Multiple Feature Extractions • It is possible to use more than one feature extraction method in parallel. • In CBIR, three different feature categories are generally recognized: color, texture, and shape features. • Let us denote the BMU distributions of two features by P = (P0, P1, …, PK−1) and Q = (Q0, Q1, …, QK−1) • While H(P) and H(Q) measure the distributions of the single feature vectors, the mutual information I(P, Q) can be used for studying the interplay between them (see the sketch below)
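The mutual information is computed from the joint BMU statistics of the same objects on the two SOMs, I(P, Q) = Σij Jij log2(Jij / (Pi Qj)); a minimal sketch with assumed inputs (paired BMU index arrays and the two map sizes):

```python
import numpy as np

def mutual_information(bmu_p, bmu_q, k_p, k_q):
    """I(P, Q) from paired BMU indices of the same objects on two SOMs."""
    J = np.zeros((k_p, k_q))
    np.add.at(J, (bmu_p, bmu_q), 1.0)        # joint BMU hit counts
    J /= J.sum()
    P, Q = J.sum(axis=1), J.sum(axis=0)      # the two marginal histograms
    nz = J > 0
    return float((J[nz] * np.log2(J[nz] / np.outer(P, Q)[nz])).sum())
```

A high I(P, Q) indicates that the two features map objects in a strongly dependent way, so one of them adds little new information.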
Multiple Feature Extractions (cont.) • HT and nHT have by far the largest values of mutual information • CS and SC have the largest values on both SOMs • The mutual information of EH and HT is high on the smaller SOM, but not as high on the larger SOM with more resolution.
Bayesian Decision Estimation • Use the Bayesian decision rule to make the optimal classification • The posterior probability is used to decide on the jth object’s membership in class C (a sketch follows)
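A sketch of the decision rule, using the (smoothed) class histograms as discrete likelihoods P(BMU = i | C) together with assumed class priors; the names are illustrative:

```python
import numpy as np

def most_probable_class(bmu, class_hists, priors):
    """Bayes rule: argmax over C of P(bmu | C) * P(C) (common normalizer omitted).

    class_hists: (n_classes, K) unit-sum BMU histograms; priors: (n_classes,)."""
    posterior = class_hists[:, bmu] * priors
    return int(posterior.argmax())
```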
Bayesian Decision Estimation (cont.) • Query by example • Presents a number of images to the user at each query round, and the user is expected to evaluate their relevance to her current task. • Relevance feedback • Incrementally fine-tunes the selection so that more and more relevant images are shown at consecutive query rounds • Choosing the next images for the user • Maximal probability of relevance • Minimal probability of nonrelevance
Bayesian Decision Estimation (cont.) • The relevance feedback problem is handled • by adding the hits caused by the new relevant and nonrelevant samples to the map units, • convolving them with the mask used, • and renormalizing the distributions to unit sums • Let us denote the history of the query up to the (t−1)th round by … • Maximize the current probability of relevance (a sketch of one round follows)
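A hedged sketch of one feedback round along these lines; the additive update, convolution, and renormalization follow the slide's description, while the relevant-minus-nonrelevant scoring of map units is an assumption of this sketch rather than the paper's exact PicSOM rule:

```python
import numpy as np
from scipy.signal import convolve2d

def feedback_round(rel_hist, nonrel_hist, rel_bmus, nonrel_bmus, mask):
    """Add new hits, convolve with the mask, renormalize, and score the units."""
    for rc in rel_bmus:                      # rc is a (row, col) BMU index
        rel_hist[rc] += 1.0
    for rc in nonrel_bmus:
        nonrel_hist[rc] += 1.0
    rel = convolve2d(rel_hist, mask, mode="same")
    non = convolve2d(nonrel_hist, mask, mode="same")
    rel /= max(rel.sum(), 1e-12)             # renormalize to unit sums
    non /= max(non.sum(), 1e-12)
    return rel - non   # show unseen images whose BMUs score highest next round
```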
Conclusions • The entropy of the distribution quantitatively characterizes the compactness of an object class. • The proposed method can be used as an efficient way of comparing different features and the SOMs produced with them. • We showed that the mutual information of the distributions • can be used to identify both the most similar and the most uncorrelated features • can also be used to select the subset of feature extraction methods with the most independent features. • Bayesian decision • used for choosing either the most probable class for a data item, or the most likely data item belonging to a given class.
Personal Opinions • Advantage • Combines entropy, mutual information, and smoothing to find the most important and most independent features. • Application • Feature extraction • Drawbacks • The structure of the paper is not well organized • Some diagrams are unclear, making them difficult to understand