540 likes | 680 Views
Machine Learning and Bioinformatics 機器學習與生物資訊學. Using kernel density estimation in enhancing conventional instance-based classifier. Kernel approximation Kernel a pproximation in O ( n ) Multi-dimensional kernel approximation Kernel d ensity e stimation
E N D
Machine Learning and Bioinformatics機器學習與生物資訊學 Machine Learning & Bioinformatics
Using kernel density estimation in enhancing conventional instance-based classifier • Kernel approximation • Kernel approximation in O(n) • Multi-dimensional kernel approximation • Kernel density estimation • Application in data classification Machine Learning and Bioinformatics
Identifying class boundary Machine Learning and Bioinformatics
Boundary identified Machine Learning and Bioinformatics
Kernel approximation Machine Learning and Bioinformatics
Kernel approximationProblem definition • Given the values of function at a set of samples • We want to find a set of symmetric kernel functions and the corresponding weights such that Machine Learning and Bioinformatics
Equations • (1) Machine Learning and Bioinformatics
Kernel approximationAn 1-D example Machine Learning and Bioinformatics
Kernel approximationSpherical Gaussian functions • Hartman et al. showed that a linear combination of spherical Gaussian functions can approximate any function with arbitrarily small error • “Layered neural networks with Gaussian hidden units as universal approximations”, Neural Computation, Vol. 2, No. 2, 1990 • With the Gaussian kernel functions, we want to find such that Machine Learning and Bioinformatics
Equations • (1) • (2) Machine Learning and Bioinformatics
Kernel approximation in O(n) Machine Learning and Bioinformatics
Kernel approximation in O(n) • In the learning algorithm, we assume uniform sampling • namely, samples are located at the crosses of an evenly-spaced grid in the d-dimensional vector space • Let denote the distance between two adjacent samples • If the assumption of uniform sampling does not hold, then some sort of interpolation can be conducted to obtain the approximate function values at the crosses of the grid Machine Learning and Bioinformatics
A 2-D example of uniform sampling Machine Learning and Bioinformatics
The O(n)kernel approximation • Under the assumption that the sampling density is sufficiently high, i.e., we have the function values at a sample and its k nearest samples, , are virtually equal: • In other words, is virtually a constant function equal to in the proximity of • Accordingly, we can expect that ; Machine Learning and Bioinformatics
The O(n)kernel approximationAn 1-D example Machine Learning and Bioinformatics
In the 1-D example, samples located at , where h is an integer • Under the assumption that , we have • The issue now is to find appropriate and such thatin the proximity of Machine Learning and Bioinformatics
Equations • (1) • (2) • (3) Machine Learning and Bioinformatics
What’s the difference between equations (2) and (3)? Machine Learning and Bioinformatics
Therefore, with , we can set and obtain • In fact, it can be shown that is bounded by • Therefore, we have the following function approximate: Machine Learning and Bioinformatics
Equations • (1) • (2) • (3) • (4) • (5) Machine Learning and Bioinformatics
The O(n)kernel approximationGeneralization • We can generalize the result by setting , where is a real number • The table shows the bounds of Machine Learning and Bioinformatics
Equations • (1) • (2) • (3) • (4) • (5) • (6) Machine Learning and Bioinformatics
The effect of Machine Learning and Bioinformatics
The smoothing effect • The kernel approximation function is actually a weighted average of the sampled function values • Selecting a larger value implies that the smoothing effect will be more significant • Our suggestion is set Machine Learning and Bioinformatics
An example of the smoothing effect Machine Learning and Bioinformatics
Multi-dimensional kernel approximation Machine Learning and Bioinformatics
Multi-dimensional kernel approximationThe basic concepts • Under the assumption that the sampling density is sufficiently high, i.e., we have the function values at a sample and its k nearest samples, , are virtually equal: • As a result, we can expect that and , where and are the weights and bandwidths of the Gaussian functions located at , respectively Machine Learning and Bioinformatics
Let , the scalar form of is • If we set to , then is bounded by Machine Learning and Bioinformatics
Accordingly, we have is bounded by and, with , • In RVKDE, d is relaxed to a parameter, • Finally, by setting uniformly to , we obtain the following kernel smoothing function that approximates :, where are the k nearest samples of the object located at v Machine Learning and Bioinformatics
Equations • (1) • (2) • (3) • (4) • (5) • (6) • (7) Machine Learning and Bioinformatics
Multi-dimensional kernel approximationGeneralization • If we set uniformly to, then • As a result, we want to set , where and obtain the following function approximate:,where are the k nearest samples of the object located at v • In RVKDE, k is a parameter kt Machine Learning and Bioinformatics
Equations • (1) • (2) • (3) • (4) • (5) • (6) • (7) • (8) Machine Learning and Bioinformatics
Multi-dimensional kernel approximationSimplification • An interesting observation is that, as long as , then regardless of the value of , we have • With this observation, we have Machine Learning and Bioinformatics
Equations • (1) • (2) • (5) • (6) • (7) • (8) • (9) Machine Learning and Bioinformatics
Lack something? ? Machine Learning and Bioinformatics
Kernel density estimation Machine Learning and Bioinformatics
Kernel density estimationProblem definition • Assume that we are given a set of samples taken from a probability distribution in a d-dimensional vector space • The problem definition of kernel density estimation is to find a linear combination of kernel functions that provides a good approximation of the probability density function Machine Learning and Bioinformatics
Probability density estimation • The value of the probability density function at a vector v can be estimated as follows: • n is the total number of samples • R(v) is the distance between vector v and its ks-thnearest samples • is the volume of a sphere with radius = R(v) in a d-dimensional vector space Machine Learning and Bioinformatics
The Gamma function • , since • , since Machine Learning and Bioinformatics
Application in data classification Machine Learning and Bioinformatics
Application in data classification • Recent development in data classification focuses on the support vector machines (SVM), due to accuracy concern • RVKDE • delivers the same level of accuracy as the SVM and enjoys some advantages • constructs one approximate probability density function for one class of objects Machine Learning and Bioinformatics
Time complexity • The average time complexity to construct a classifier is , if the k-d tree structure is employed, where n is the number of training samples • The time complexity to classify m new objects with unknown class is Machine Learning and Bioinformatics
Comparison of classification accuracy Machine Learning and Bioinformatics
Comparison of classification accuracy3 larger data sets Machine Learning and Bioinformatics
Data reduction • As the learning algorithm is instance-based, removal of redundant training samples will lower the complexity of the prediction phase • The effect of a naïve data reduction mechanism was studied • The naïve mechanism removes a training sample, if all of its 10 nearest samples belong to the same class as this particular sample Machine Learning and Bioinformatics
With the KDE based statistical learning algorithm, each training sample is associated with a kernel function with typically a varying width Machine Learning and Bioinformatics
In comparison with the decision function of SVM in which only support vectors are associated with a kernel function with a fixed width Machine Learning and Bioinformatics
Effect of data reduction Machine Learning and Bioinformatics