A Clustered Particle Swarm Algorithm for Retrieving all the Local Minima of a function
C. Voglis & I. E. Lagaris
Computer Science Department, University of Ioannina, GREECE
Presentation Outline
• Global Optimization Problem
• Particle Swarm Optimization
• Modifying Particle Swarm to form clusters
• Clustering Approach
• Modifying the affinity matrix
• Putting the pieces together
• Determining the number of minima
• Identification of the clusters
• Preliminary results – Future research
Global Optimization
• The goal is to find the global minimum inside a bounded domain:
  find x* = arg min_{x ∈ S} f(x), with S ⊂ R^n bounded
• One way to do this is to find all the local minima and choose among them the global one (or ones).
• Popular methods of this kind are Multistart, MLSL, TMLSL*, etc.
*M. Ali
Particle Swarm Optimization
• Developed in 1995 by James Kennedy and Russ Eberhart.
• Inspired by the social behavior of bird flocking and fish schooling.
• PSO applies the concept of social interaction to problem solving.
• It finds a global optimum.
PSO – Description
• The method lets a population of particles move through and explore the space of interest.
• Each particle updates its position in discrete unit time steps.
• The velocity is updated by a linear combination of two terms:
  • the first along the direction pointing to the best position discovered by the particle,
  • the second towards the overall best position.
PSO – Relations
v_{k+1}^{(i)} = χ [ v_k^{(i)} + c_1 r_1 (p^{(i)} − x_k^{(i)}) + c_2 r_2 (p^{(g)} − x_k^{(i)}) ]
x_{k+1}^{(i)} = x_k^{(i)} + v_{k+1}^{(i)}
Where:
• x_k^{(i)} is the position of the i-th particle at step k
• v_k^{(i)} is its velocity
• p^{(i)} is the best position visited by the i-th particle (particle's best position)
• p^{(g)} is the overall best position ever visited (swarm's best position)
• χ is the constriction factor; r_1, r_2 are uniform random numbers in [0, 1]
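As a concrete illustration, here is a minimal numpy sketch of one constriction-factor PSO step. The function name, the array layout, and the coefficient values (χ = 0.729, c_1 = c_2 = 2.05 are common choices from the PSO literature, not necessarily the authors') are assumptions.

```python
import numpy as np

def pso_step(x, v, p_best, g_best, chi=0.729, c1=2.05, c2=2.05, rng=None):
    """One constriction-factor PSO step.

    x, v   : (n_particles, dim) positions and velocities
    p_best : (n_particles, dim) best position found by each particle
    g_best : (dim,) best position found by the whole swarm
    """
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(x.shape)   # uniform in [0, 1], one per component
    r2 = rng.random(x.shape)
    v_new = chi * (v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x))
    return x + v_new, v_new
```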
PS + Clustering Optimization
• If the global component is weakened, the swarm is expected to form clusters around the minima.
• If a bias towards the steepest descent direction is added, this process is accelerated.
• Locating the minima may then be tackled, to a large extent, as a Clustering Problem (CP).
• However, it is not a regular CP, since it can benefit from information supplied by the objective function.
Modified PSO
• The global component is set to zero.
• A component pointing towards the steepest descent direction* is added to accelerate the process.
• So the swarm motion is described by:
  v_{k+1}^{(i)} = χ [ v_k^{(i)} + c_1 r_1 (p^{(i)} − x_k^{(i)}) − c_3 r_3 ∇f(x_k^{(i)}) ]
  x_{k+1}^{(i)} = x_k^{(i)} + v_{k+1}^{(i)}
*A. Ismael F. Vaz, M.G.P. Fernandes
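Under the same assumptions, a sketch of the modified step: the social (global) term is dropped and a steepest-descent bias is added. The gradient weight c3 is illustrative, not taken from the paper.

```python
import numpy as np

def modified_pso_step(x, v, p_best, grad_f, chi=0.729, c1=2.05, c3=1.0, rng=None):
    """Cluster-forming PSO step: no global term, plus a -grad f(x) bias,
    so particles drift towards (and gather around) nearby local minima."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(x.shape)
    r3 = rng.random(x.shape)
    g = np.array([grad_f(xi) for xi in x])   # gradient at each particle
    v_new = chi * (v + c1 * r1 * (p_best - x) - c3 * r3 * g)
    return x + v_new, v_new
```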
Clustering
• Clustering problem: "Partition a data set into M disjoint subsets containing points with one or more properties in common."
• A commonly used property refers to topographical grouping based on distances.
• Plethora of algorithms: k-means, hierarchical (single linkage), quantum clustering, Newtonian clustering.
Global k-means*
• Minimize the clustering error:
  E(m_1, …, m_M) = Σ_{i=1}^{N} Σ_{j=1}^{M} I(x_i ∈ C_j) ‖x_i − m_j‖²
• It is an incremental procedure that uses the k-means algorithm repeatedly.
• Independent of the initialization choice.
• Has been successfully applied to many problems.
*A. Likas
Spectral Clustering
• Algorithms that cluster points using eigenvectors of matrices derived from the data.
• They obtain a data representation in a low-dimensional space that can be easily clustered.
• A variety of methods use the eigenvectors differently.
• Useful information can also be extracted from the eigenvalues.
The Affinity Matrix
This symmetric matrix A is of key importance. Each off-diagonal element is given by:
A_{ij} = exp( −‖x_i − x_j‖² / 2σ² ), i ≠ j, with A_{ii} = 0
The Affinity Matrix
• Let D be the diagonal matrix with D_{ii} = Σ_j A_{ij}, and let M = D^{−1/2} A D^{−1/2}.
• The matrix M is diagonalized; let λ_1 ≥ λ_2 ≥ … ≥ λ_N be its eigenvalues sorted in descending order.
• The eigengap that is biggest identifies the number of clusters k.
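A minimal numpy sketch of this construction, assuming the Gaussian affinity of the previous slide; the function name and the fixed σ are illustrative.

```python
import numpy as np

def affinity_eigenvalues(X, sigma=1.0):
    """Build the Gaussian affinity matrix A (zero diagonal) and return the
    eigenvalues of M = D^{-1/2} A D^{-1/2} sorted in descending order."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise ||x_i - x_j||^2
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)                   # off-diagonal elements only
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))  # D^{-1/2} as a vector
    M = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    lam = np.linalg.eigvalsh(M)[::-1]          # symmetric M => real spectrum
    return A, lam
```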
Simple example
• Subset of the CISI/Medline dataset
• Two clusters: IR abstracts, medical abstracts
• 650 documents, 3366 terms after pre-processing
• Spectral embedded space constructed from the two largest eigenvectors
How to select k?
• Eigengap: the difference between two consecutive eigenvalues.
• The most stable clustering is generally given by the value of k that maximizes the eigengap δ_k = λ_k − λ_{k+1}.
[Plot: sorted eigenvalues λ_1, λ_2, …; the largest eigengap appears after λ_2, so k = 2 is chosen]
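Given the sorted eigenvalues, the eigengap rule is a few lines of code; select_k below is a hypothetical helper. For the CISI/Medline example above, the largest gap after λ_2 would yield k = 2.

```python
import numpy as np

def select_k(lam, k_max=None):
    """Return the k that maximizes the eigengap lambda_k - lambda_{k+1},
    given eigenvalues lam sorted in descending order."""
    k_max = len(lam) - 1 if k_max is None else k_max
    gaps = lam[:k_max] - lam[1:k_max + 1]   # gaps[i] corresponds to k = i + 1
    return int(np.argmax(gaps)) + 1
```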
Putting the pieces together
• Apply the modified particle swarm to form clusters around the minima.
• Construct the affinity matrix A and compute the eigenvalues of M:
  • using only distance information, or
  • adding gradient information.
• Find the largest eigengap and identify k.
• Perform global k-means using the determined k:
  • with pairwise distances and centroids, or
  • with the affinity matrix and medoids (using gradient info).
A sketch of the whole pipeline follows.
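A hedged end-to-end sketch of these steps, reusing the hypothetical helpers sketched around the earlier slides (modified_pso_step, affinity_eigenvalues, select_k, plus global_kmeans from a later slide); bounds handling and the medoid variant are omitted.

```python
import numpy as np

def find_local_minima(f, grad_f, x0, n_steps=200, sigma=1.0):
    """Run the cluster-forming PSO, estimate k from the eigengap,
    then cluster the final particle positions."""
    x, v = x0.copy(), np.zeros_like(x0)
    p_best = x0.copy()
    f_best = np.array([f(xi) for xi in x])
    for _ in range(n_steps):
        x, v = modified_pso_step(x, v, p_best, grad_f)
        f_val = np.array([f(xi) for xi in x])
        better = f_val < f_best
        p_best[better], f_best[better] = x[better], f_val[better]
    A, lam = affinity_eigenvalues(x, sigma)  # distance-only affinities
    k = select_k(lam)                        # largest eigengap
    centers = global_kmeans(x, k)            # cluster centers ~ local minima
    return centers, k
```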
Adding information to the Affinity Matrix
• Use the gradient vectors to zero out pairwise affinities.
• Do not associate particles that would move farther apart if they followed the negative gradient.
• New formula: keep the Gaussian affinity, but set A_{ij} = 0 whenever (x_i − x_j)ᵀ(∇f(x_i) − ∇f(x_j)) < 0, i.e. whenever a steepest-descent step increases the distance between particles i and j (see the sketch after the figure).
Adding information to the Affinity Matrix
[Figure] Black arrow: gradient of particle i. Green arrows: gradients of particles j with non-zero affinity to i. Red arrows: gradients of particles j with zero affinity to i.
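A sketch of this gradient filter, implementing the zeroing criterion stated above; the names and the fixed σ are illustrative.

```python
import numpy as np

def gradient_filtered_affinity(X, G, sigma=1.0):
    """Gaussian affinities with A_ij zeroed whenever particles i and j
    would move apart under steepest descent: (x_i - x_j)^T (g_i - g_j) < 0.
    X: positions (n, dim); G: gradients at those positions (n, dim)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    dx = X[:, None, :] - X[None, :, :]          # x_i - x_j
    dg = G[:, None, :] - G[None, :, :]          # grad f(x_i) - grad f(x_j)
    A[np.sum(dx * dg, axis=-1) < 0.0] = 0.0     # distance grows along -grad
    return A
```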
From global k-means to global k-medoids
• Original global k-means: the M-cluster solution is built incrementally, starting each k-means run from the best (M−1)-cluster centers plus one candidate data point as the new center.
• The k-medoids variant restricts the centers to data points (medoids), so the gradient-modified affinity matrix can be used directly.
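Since the slide's algorithm listing did not survive extraction, here is a sketch of the standard global k-means skeleton (after Likas et al.); it is deliberately naive (one k-means run per candidate point), and the k-medoids variant would replace the mean update with the cluster member minimizing within-cluster distance.

```python
import numpy as np

def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations from the given initial centers."""
    for _ in range(n_iter):
        d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = d2.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    return centers, d2.min(axis=1).sum()       # centers, clustering error

def global_kmeans(X, k, n_iter=100):
    """Incremental global k-means: grow from the 1-cluster solution,
    trying every data point as the new initial center at each stage."""
    centers = X.mean(axis=0, keepdims=True)
    for _ in range(2, k + 1):
        best_err, best_centers = np.inf, None
        for xi in X:
            c, err = kmeans(X, np.vstack([centers, xi]), n_iter)
            if err < best_err:
                best_err, best_centers = err, c
        centers = best_centers
    return centers
```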
Rastrigin function (49 minima)
[Figure panels: "After modified particle swarm", "Gradient information"]
Rastrigin function
[Figure panels: "Estimation of k using distance", "Estimation of k using gradient info"]
Shubert function (100 minima)
[Figure panels: "After modified particle swarm", "Gradient information"]
Shubert function
[Figure panels: "Estimation of k using distance", "Estimation of k using gradient info"]
Ackley function (25 minima)
[Figure panels: "After modified particle swarm", "Gradient information"]
Ackley function
[Figure panels: "Estimation of k using distance", "Estimation of k using gradient info"]