Machine Learning & Bioinformatics
Tien-Hao Chang (Darby Chang)
Using kernel density estimation to enhance a conventional instance-based classifier
• Kernel approximation
• Kernel density estimation
• Kernel approximation in O(n)
• Multi-dimensional kernel approximation
• Application in data classification
Identifying the boundary between different classes of objects
Boundary identified
Problem definition of kernel approximation
• Given the values of a function f at a set of samples s_1, s_2, …, s_n
• We want to find a set of symmetric kernel functions K_i and the corresponding weights w_i such that f(v) ≈ Σ_i w_i K_i(v)
Kernel approximation: spherical Gaussian functions
• Hartman et al. showed that a linear combination of spherical Gaussian functions can approximate any function with arbitrarily small error
• “Layered neural networks with Gaussian hidden units as universal approximations”, Neural Computation, Vol. 2, No. 2, 1990
Kernel approximation: spherical Gaussian functions
• With the Gaussian kernel functions, we want to find weights w_i and bandwidths σ_i such that f(v) ≈ Σ_i w_i exp(−‖v − s_i‖² / 2σ_i²)
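To make the construction concrete, here is a minimal sketch of fitting the weights of spherical Gaussian kernels to sampled function values. The target function sin(x), the sample layout, the single shared bandwidth, and the use of a least-squares solve are all illustrative assumptions, not prescribed by the slides.

```python
import numpy as np

# Sketch: approximate f(x) = sin(x) on [0, 2*pi] by a linear combination
# of spherical Gaussian kernels centred at the samples.  Target function,
# sample layout, bandwidth, and the least-squares solve are illustrative.

def gaussian_kernel_matrix(x, centers, sigma):
    """G[i, j] = exp(-(x_i - c_j)^2 / (2 * sigma^2))."""
    diff = x[:, None] - centers[None, :]
    return np.exp(-diff ** 2 / (2.0 * sigma ** 2))

samples = np.linspace(0.0, 2.0 * np.pi, 20)
values = np.sin(samples)

sigma = samples[1] - samples[0]              # bandwidth ~ sample spacing
G = gaussian_kernel_matrix(samples, samples, sigma)
weights, *_ = np.linalg.lstsq(G, values, rcond=None)

# the fitted combination reproduces f at the samples
approx = G @ weights
max_err = np.max(np.abs(approx - values))
```

Solving a dense linear system like this costs O(n³); the O(n) algorithm discussed later avoids it entirely.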
Problem definition of kernel density estimation
• Assume that we are given a set of samples s_1, s_2, …, s_n taken from a probability distribution in a d-dimensional vector space
• The kernel density estimation problem is to find a linear combination of kernel functions that provides a good approximation of the probability density function
Probability density estimation
• The value of the probability density function at a vector v can be estimated as
  p̂(v) = (k / n) / ( π^(d/2) R(v)^d / Γ(d/2 + 1) )
• n is the total number of samples
• R(v) is the distance between vector v and its k-th nearest sample
• π^(d/2) R(v)^d / Γ(d/2 + 1) is the volume of a sphere with radius R(v) in a d-dimensional vector space
The Gamma function
• Γ(z) = ∫_0^∞ t^(z−1) e^(−t) dt, with Γ(z + 1) = z·Γ(z), Γ(1) = 1, and Γ(1/2) = √π
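The k-th-nearest-neighbour density estimate above can be sketched as follows. The uniform 2-D sample data and the choice k = 20 are illustrative assumptions; the true density of the uniform distribution on the unit square is 1, so the estimate should land near that value.

```python
from math import gamma, pi

import numpy as np

# Sketch of the k-th-nearest-neighbour density estimate:
# p_hat(v) = (k / n) / V(v), where V(v) is the volume of the d-dimensional
# sphere whose radius R(v) is the distance from v to its k-th nearest
# sample.  The uniform 2-D sample data and k = 20 are illustrative.

def sphere_volume(radius, d):
    """Volume of a d-dimensional ball: pi^(d/2) * R^d / Gamma(d/2 + 1)."""
    return pi ** (d / 2) * radius ** d / gamma(d / 2 + 1)

def knn_density(v, samples, k):
    n, d = samples.shape
    r = np.sort(np.linalg.norm(samples - v, axis=1))[k - 1]
    return (k / n) / sphere_volume(r, d)

rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=(5000, 2))   # true density is 1 on [0,1]^2
p = knn_density(np.array([0.5, 0.5]), samples, k=20)
```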
Kernel approximation: a 1-D example
Kernel approximation in O(n)
• In the learning algorithm, we assume uniform sampling
• namely, samples are located at the crosses of an evenly-spaced grid in the d-dimensional vector space
• Let δ denote the distance between two adjacent samples
• If the assumption of uniform sampling does not hold, then some sort of interpolation can be conducted to obtain the approximate function values at the crosses of the grid
A 2-D example of uniform sampling
The O(n) kernel approximation algorithm: the basic idea
• Under the assumption that the sampling density is sufficiently high, the function values at a sample s_i and at its k nearest samples s_i1, …, s_ik are virtually equal: f(s_i) ≈ f(s_i1) ≈ … ≈ f(s_ik)
• In other words, f(v) is virtually a constant function in the proximity of s_i
• Accordingly, we can expect that the approximation f̂(v) ≈ f(s_i) holds in the proximity of s_i
The O(n) kernel approximation algorithm: a 1-D example
The O(n) kernel approximation algorithm
• In the 1-D example, samples are located at x = hδ, where h is an integer
• Under the assumption above, the approximation takes the form f̂(x) = Σ_h w_h exp(−(x − hδ)² / 2σ²)
The O(n) kernel approximation algorithm
• The issue now is to find appropriate w_h and σ such that f̂(x) ≈ f(hδ) in the proximity of x = hδ
The O(n) kernel approximation algorithm
• Therefore, with σ = δ, we can set w_h = f(hδ) / √(2π) and obtain f̂(x) ≈ f(hδ) for x in the proximity of hδ
The O(n) kernel approximation algorithm
• In fact, it can be shown that the deviation of (1/√(2π)) Σ_h exp(−(x − hδ)² / 2δ²) from 1 is bounded by a small constant
• Therefore, we have the following function approximation: f̂(x) = (1/√(2π)) Σ_h f(hδ) exp(−(x − hδ)² / 2δ²)
The O(n) kernel approximation algorithm: generalization
• We can generalize the result by setting σ = βδ, where β is a positive real number
The O(n) kernel approximation algorithm: generalization
• The table shows the bounds of the approximation error for different settings of β
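A minimal sketch of the O(n) construction on a uniform 1-D grid: each sample carries one Gaussian with bandwidth σ = βδ and weight f(hδ)/(√(2π)β), so no linear system has to be solved. The test function cos(x) and β = 0.7 are illustrative assumptions.

```python
import numpy as np

# Sketch of the O(n) construction on a uniform 1-D grid with spacing
# delta: each sample carries one Gaussian with bandwidth sigma = beta*delta
# and weight w_h = f(h*delta) / (sqrt(2*pi) * beta).  Because the grid of
# Gaussians sums to ~1 everywhere (a Riemann sum of the normal density),
# the weighted sum reproduces f directly.

def kernel_approx(x, grid, f_vals, beta):
    delta = grid[1] - grid[0]
    sigma = beta * delta
    w = f_vals / (np.sqrt(2.0 * np.pi) * beta)
    diff = x[:, None] - grid[None, :]
    return (w[None, :] * np.exp(-diff ** 2 / (2.0 * sigma ** 2))).sum(axis=1)

grid = np.linspace(-3.0, 3.0, 121)           # delta = 0.05
x = np.linspace(-1.0, 1.0, 50)               # stay away from the grid edges
approx = kernel_approx(x, grid, np.cos(grid), beta=0.7)
err = np.max(np.abs(approx - np.cos(x)))
```

Computing the weights is a single pass over the samples, which is where the O(n) cost comes from.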
An example of the effect of different settings of β
The smoothing effect
• The kernel approximation function is actually a weighted average of the sampled function values
• Selecting a larger β implies that the smoothing effect will be more significant
• Our suggestion is to set β to a fixed default value
An example of the smoothing effect
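The smoothing effect can be illustrated by approximating noisy samples of cos(x) with a small and a large β: a larger β averages over more samples and so suppresses noise more strongly, at the cost of more bias. The noise level 0.2 and the specific β values are illustrative assumptions.

```python
import numpy as np

# Sketch of the smoothing effect of beta.  A larger beta widens the
# effective averaging window of the kernel approximation, so the noise
# in the sampled values is damped more strongly.

def kernel_approx(x, grid, f_vals, beta):
    delta = grid[1] - grid[0]
    sigma = beta * delta
    w = f_vals / (np.sqrt(2.0 * np.pi) * beta)
    diff = x[:, None] - grid[None, :]
    return (w[None, :] * np.exp(-diff ** 2 / (2.0 * sigma ** 2))).sum(axis=1)

rng = np.random.default_rng(0)
grid = np.linspace(-3.0, 3.0, 121)
noisy = np.cos(grid) + rng.normal(0.0, 0.2, grid.size)   # noisy samples

x = np.linspace(-1.0, 1.0, 50)
true = np.cos(x)
rms_small_beta = np.sqrt(np.mean((kernel_approx(x, grid, noisy, 0.7) - true) ** 2))
rms_large_beta = np.sqrt(np.mean((kernel_approx(x, grid, noisy, 3.0) - true) ** 2))
```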
Multi-dimensional kernel approximation
• Under the assumption that the sampling density is sufficiently high, the function values at a sample s_i and at its k nearest samples s_i1, …, s_ik are virtually equal: f(s_i) ≈ f(s_i1) ≈ … ≈ f(s_ik)
Multi-dimensional kernel approximation
• As a result, we can expect that w_i1 ≈ … ≈ w_ik and σ_i1 ≈ … ≈ σ_ik, where w_ij and σ_ij are the weights and bandwidths of the Gaussian functions located at s_i1, …, s_ik, respectively
Multi-dimensional kernel approximation
• Since the influence of a Gaussian function decreases exponentially as the distance increases, we can set k to a value such that, for a vector v in the proximity of sample s_i, only the k nearest Gaussian functions contribute materially: f̂(v) ≈ Σ_{j=1..k} w_ij exp(−‖v − s_ij‖² / 2σ_ij²)
• Since we have w_i1 ≈ … ≈ w_ik ≈ w_i and σ_i1 ≈ … ≈ σ_ik ≈ σ, our objective is to find w_i and σ such that f̂(v) ≈ f(s_i) in the proximity of s_i
Multi-dimensional kernel approximation • Let,then we have Machine Learning and Bioinformatics
• With samples on a grid of spacing δ and σ set to βδ, g(v) is bounded within a narrow range around (√(2π)β)^d, by the same argument as in the 1-D case
• Accordingly, f̂(v) ≈ w_i · (√(2π)β)^d is likewise bounded in the proximity of s_i
Multi-dimensional kernel approximation
• Therefore, with σ = βδ, g(v) is virtually a constant function equal to (√(2π)β)^d for v in the proximity of s_i
• Accordingly, we want to set w_i = f(s_i) / (√(2π)β)^d
Multi-dimensional kernel approximation
• Finally, by setting σ uniformly to βδ, we obtain the following kernel smoothing function that approximates f(v):
  f̂(v) = (1/(√(2π)β)^d) Σ_{j=1..k} f(s_ij) exp(−‖v − s_ij‖² / 2β²δ²),
  where s_i1, …, s_ik are the k nearest samples of the object located at v
• Generally speaking, if we set σ uniformly to βδ, then g(v) is bounded within a range around (√(2π)β)^d whose width depends on β
• If we select a proper value for β, then g(v) ≈ (√(2π)β)^d
Multi-dimensional kernel approximation
• As a result, we want to set w_i = f(s_i) / (√(2π)β)^d, with σ = βδ, and obtain the following function approximation:
  f̂(v) = (1/(√(2π)β)^d) Σ_{j=1..k} f(s_ij) exp(−‖v − s_ij‖² / 2β²δ²),
  where s_i1, …, s_ik are the k nearest samples of the object located at v
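The multi-dimensional smoothing function can be sketched as follows: each grid sample carries a Gaussian with bandwidth σ = βδ and weight f(s)/(√(2π)β)^d, and only the k nearest samples of v are summed. The 2-D grid, the test function, β = 0.7 and k = 200 are illustrative assumptions.

```python
import numpy as np

# Sketch of the multi-dimensional kernel smoothing function on a uniform
# d-dimensional grid with spacing delta.  Each sample s carries a Gaussian
# with bandwidth sigma = beta*delta and weight f(s) / (sqrt(2*pi)*beta)^d;
# only the k nearest samples of v are summed.

def kernel_smooth(v, grid_pts, f_vals, delta, beta, k):
    d = grid_pts.shape[1]
    sigma = beta * delta
    dist2 = ((grid_pts - v) ** 2).sum(axis=1)
    nearest = np.argsort(dist2)[:k]          # k nearest grid samples of v
    w = f_vals[nearest] / ((2.0 * np.pi) ** (d / 2.0) * beta ** d)
    return float((w * np.exp(-dist2[nearest] / (2.0 * sigma ** 2))).sum())

delta = 0.05
axis = np.linspace(-2.0, 2.0, 81)            # spacing delta in each axis
xx, yy = np.meshgrid(axis, axis)
grid_pts = np.column_stack([xx.ravel(), yy.ravel()])
f_vals = np.sin(grid_pts[:, 0]) + np.cos(grid_pts[:, 1])

v = np.array([0.3, -0.4])
approx = kernel_smooth(v, grid_pts, f_vals, delta, beta=0.7, k=200)
err = abs(approx - (np.sin(0.3) + np.cos(-0.4)))
```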
Lacking something? How about kernel density estimation?
Application in data classification
• Recent development in data classification focuses on support vector machines (SVM), due to accuracy concerns
• In this lecture, we will describe a kernel density estimation based data classification algorithm that delivers the same level of accuracy as the SVM and enjoys some advantages
The KDE-based classifier
• The learning algorithm constructs one approximate probability density function for each class of objects
• Let us adopt the following estimate of the probability density function at each training sample:
  p̂_m(s_i) = (k / n_m) · Γ(d/2 + 1) / ( π^(d/2) R(s_i)^d ),
  where
• p̂_m is the probability density function of class-m objects, and n_m is the number of class-m training samples
• R(s_i) is the distance between training sample s_i and its k-th nearest training sample of the same class
• d is the dimension of the vector space
The KDE-based classifier
• In the kernel approximation problem, we set the bandwidth of each Gaussian function uniformly to βδ, where δ is the distance between two adjacent training samples
• In the kernel density estimation problem, for each training sample, we need to determine the average distance between two adjacent training samples of the same class in the local region
The KDE-based classifier
• In the d-dimensional vector space, if the average distance between samples is δ, then the number of samples in a subspace of volume V is approximately equal to V / δ^d
• Accordingly, since about k samples fall in the sphere of radius R(s_i), we can estimate the local average distance δ_i by δ_i ≈ ( π^(d/2) R(s_i)^d / (Γ(d/2 + 1) · k) )^(1/d)
• Accordingly, with the kernel approximation function that we obtained earlier, we have the following approximate probability density function for class-m objects:
  p̂_m(v) = (1/(√(2π)β)^d) Σ_{j=1..k} p̂_m(s_ij) exp(−‖v − s_ij‖² / 2β²δ_ij²),
  where s_i1, …, s_ik are the k nearest class-m training samples of the object located at v, and δ_ij is the local average sample distance at s_ij
The KDE-based classifier
• An interesting observation is that, as long as β is set to a value within an appropriate range, then regardless of its exact value, we obtain virtually the same classifier
• With this observation, parameter tuning is greatly simplified
• In the discussion above, R(s_i) is defined to be the distance between sample s_i and its k-th nearest training sample
• However, this definition depends on only one single sample and tends to be unreliable if the data set is noisy
• We can replace R(s_i) with the average (1/k) Σ_{j=1..k} ‖s_i − s_ij‖, where s_i1, …, s_ik are the k nearest training samples of the same class as s_i
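A classifier in the spirit described above can be sketched as follows: one density estimate per class, with a query assigned to the class whose prior-weighted density at the query point is largest. For brevity this sketch uses the plain k-th-nearest-neighbour density estimate rather than the full Gaussian-mixture construction; the two-cluster data, the cluster locations, and k = 10 are illustrative assumptions.

```python
from math import gamma, pi

import numpy as np

# Sketch of a KDE-based classifier: build one density estimate per class,
# then assign a query to the class with the largest prior-weighted density
# at the query point.  Uses the plain kNN density estimate for brevity.

def knn_density(v, samples, k):
    n, d = samples.shape
    r = np.sort(np.linalg.norm(samples - v, axis=1))[k - 1]
    volume = pi ** (d / 2) * r ** d / gamma(d / 2 + 1)
    return (k / n) / volume

def classify(v, class_samples, k=10):
    n_total = sum(len(s) for s in class_samples.values())
    scores = {m: (len(s) / n_total) * knn_density(v, s, k)
              for m, s in class_samples.items()}
    return max(scores, key=scores.get)

rng = np.random.default_rng(1)
data = {
    0: rng.normal([0.0, 0.0], 0.5, size=(200, 2)),   # class-0 cluster
    1: rng.normal([3.0, 3.0], 0.5, size=(200, 2)),   # class-1 cluster
}
```

Unlike an SVM, nothing is trained here: all the work happens at query time, which is one of the practical differences the lecture alludes to.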
Parameter tuning
• The discussion so far is based on the assumption that the sampling density is sufficiently high, which may not hold for some real data sets
Parameter tuning
• Three parameters, among them k and β, are incorporated in the learning algorithm
Parameter tuning
• One may wonder how these parameters should be set
• According to our experimental results, the exact values have essentially no effect, as long as they are set within an appropriate range