Recent Advances in Particle Swarm Optimization Yong Wang Associate Professor, Ph.D. School of Information Science and Engineering, Central South University ywang@csu.edu.cn
Outline of Talk • The Origin of Particle Swarm Optimization • The Classic Particle Swarm Optimization • The State-of-the-Art Particle Swarm Optimization
The Origin of Particle Swarm Optimization • Particle Swarm Optimization (PSO) was invented by James Kennedy (a social psychologist) and Russ Eberhart (an electrical engineer) in 1995 • Originally, PSO was proposed for the optimization of continuous nonlinear functions J. Kennedy and R. C. Eberhart, “Particle swarm optimization,” in Proc. IEEE Int. Conf. Neural Networks, 1995, pp. 1942-1948.
The Origin of Particle Swarm Optimization • PSO has roots in two main component methodologies • Artificial life (in general) • Bird flocking, fish schooling, and swarming theory (in particular) • It is also related to evolutionary algorithms • Genetic algorithms • Evolutionary programming
The Origin of Particle Swarm Optimization • Simulating social behavior • Bird flocking movement • Reynolds: bird flocking choreography • Heppner: synchrony, sudden direction change, scattering, and regrouping • Their simulations relied heavily on manipulation of inter-individual distances
The Origin of Particle Swarm Optimization • Simulating social behavior • Fish schooling movement • Wilson: social sharing of information among conspecifics offers an evolutionary advantage • This point of view is fundamental to the development of PSO
The Origin of Particle Swarm Optimization • Simulating social behavior • Human social behavior • Its abstractness: change is psychosocial rather than merely physical • Mapping: change corresponds to movement
The Origin of Particle Swarm Optimization • The etiology of PSO: a simulation of a simplified social milieu • In the simulation, each agent was considered as a collision-proof bird • The original intent was to graphically simulate the graceful but unpredictable choreography of a bird flock
The Origin of Particle Swarm Optimization • Conceptual development of PSO • Nearest neighbor velocity matching and craziness • Nearest neighbor velocity matching (synchrony): each agent copies the velocity (vx, vy) of its nearest neighbor • Craziness: random variation added to the velocities • The drawback: the variation was wholly artificial
The Origin of Particle Swarm Optimization • Conceptual development of PSO • The cornfield vector • Heppner’s simulation (a roost): eliminates the need for craziness • The drawback of Heppner’s simulation: the birds knew where the roost was • How do the birds find food: by using one another’s knowledge • Each agent remembers its best position: pbesti,x and pbesti,y • Each agent knows the globally best position: gbestx and gbesty • Movement equation (p_increment and g_increment are system parameters):
If an agent is to the left of its pbesti,x, vi,x = vi,x + rand*p_increment; if it is to the right, vi,x = vi,x - rand*p_increment
If presenti,x > gbestx, vi,x = vi,x - rand*g_increment; if presenti,x < gbestx, vi,x = vi,x + rand*g_increment (and likewise for the y dimension)
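To make the rule concrete, here is a minimal Python sketch of the cornfield-vector update for one agent in one dimension; the function name and the default increment values are illustrative, not from the original simulation.

import random

# Cornfield-vector update for one agent in one dimension: fixed-size
# random steps toward the agent's own best and the globally best position.
def cornfield_step(v, present, pbest, gbest, p_increment=0.1, g_increment=0.1):
    # Pull toward the agent's own best position.
    if present < pbest:
        v += random.random() * p_increment
    elif present > pbest:
        v -= random.random() * p_increment
    # Pull toward the globally best position.
    if present > gbest:
        v -= random.random() * g_increment
    elif present < gbest:
        v += random.random() * g_increment
    return v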
The Origin of Particle Swarm Optimization • Conceptual development of PSO • Some observations from the cornfield vector’s simulation • If p_increment and g_increment are set relatively high, the flock seemed to be sucked violently into the cornfield • If p_increment and g_increment are set relatively low, the flock swirled around the cornfield, swinging out rhythmically with subgroups synchronized, and finally landed on the target
The Origin of Particle Swarm Optimization • Conceptual development of PSO • Eliminating ancillary variables • The algorithm without craziness works just as well and looks just as realistic • The algorithm without nearest neighbor velocity matching converges slightly faster • Therefore, both of them can be removed • However, the pbest and gbest and their increments are necessary
The Origin of Particle Swarm Optimization • Conceptual development of PSO • Multidimensional search • Change presenti,x, presenti,y, vi,x, and vi,y to N*D matrices, where N is the number of agents and D is the number of dimensions
The Origin of Particle Swarm Optimization • Conceptual development of PSO • Acceleration by distance • Though the algorithm worked well, there was something aesthetically displeasing and hard to understand • The case tests (is the agent to the left or right of pbesti,j? is presenti,j above or below gbestj?) are replaced by adjustments proportional to the distance: vi,j = vi,j + rand*p_increment*(pbesti,j - presenti,j) + rand*g_increment*(gbestj - presenti,j), i=1,…,N, j=1,…,D
The Origin of Particle Swarm Optimization • The simplified version • It is very hard to set p_increment and g_increment, so both increments are dropped and each difference term is weighted by 2*rand, a stochastic average of pbest and gbest (the random weights have a mean of 1): vi,j = vi,j + 2*rand*(pbesti,j - presenti,j) + 2*rand*(gbestj - presenti,j)
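A minimal NumPy sketch of this simplified update over the N*D matrices introduced above; the array names follow the slides, while the function itself is illustrative.

import numpy as np

# Simplified 1995 velocity update over N*D matrices (N agents, D dimensions).
# present, v, pbest are N*D arrays; gbest is a length-D array.
def velocity_update(v, present, pbest, gbest):
    r1 = np.random.rand(*present.shape)  # fresh random weights per agent and dimension
    r2 = np.random.rand(*present.shape)
    # gbest broadcasts across the rows (agents)
    return v + 2 * r1 * (pbest - present) + 2 * r2 * (gbest - present)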
The Origin of Particle Swarm Optimization • Swarm • The behavior of the population of the agents is now more like a swarm than a flock • Millonas defined five basic principles of swarm intelligence • Proximity: the population should be able to carry out simple space and time computations • Quality: the population should be able to respond to quality factors in the environment • Diverse response: the population should not commit its activities along excessively narrow channels • Stability: the population should not change its mode of behavior every time the environment changes • Adaptability: The population must be able to change behavior mode when it is worth the computational price
The Origin of Particle Swarm Optimization • Particle • Each member of the population is mass-less and volume-less, so maybe it is more suitable to call each member a “point” • However, velocity and acceleration are more appropriately applied to particles, even if each is defined to have an arbitrarily small mass and volume
The Origin of Particle Swarm Optimization • The similarity with other evolutionary algorithms (EAs) • It is highly dependent on stochastic processes • The adjustment toward pbest and gbest by PSO is conceptually similar to the crossover operation of genetic algorithms • It uses the concept of fitness
The Origin of Particle Swarm Optimization • The difference from other evolutionary algorithms (EAs) • In PSO, each particle is a potential solution flying through hyperspace, accelerating toward better solutions • Other EAs operate directly on potential solutions, which are represented as locations in hyperspace • PSO has memory and does not have a selection operator
The Origin of Particle Swarm Optimization • Why is PSO an effective tool for optimization? • The balance between exploration and exploitation • The momentum effect: the extant velocities are modified rather than replaced • The stochastic factors: they allow thorough search of the spaces between regions that have already been found • vi,j = vi,j + 2*rand*(pbesti,j - presenti,j) + 2*rand*(gbestj - presenti,j), i=1,…,N, j=1,…,D
The Origin of Particle Swarm Optimization • The local version of PSO vi,j=vi,j+2*rand*(pbesti,j-presenti,j) +2*rand*(gbestj-presenti,j) vi,j=vi,j+2*rand*(pbesti,j-presenti,j) +2*rand*(lbestj-presenti,j) The local best of the neighborhood of each particle Remark: how to define the local topological neighborhood of the particles is one of the most important aspects in the local version of PSO R. C. Eberhart and J. Kennedy, “A new optimizer using particle swarm theory,” in Proc. 6th Int. Symp. Micromachine Human Sci., Nagoya, Japan, 1995, pp. 39-43.
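The slide does not fix a particular neighborhood; a ring topology, in which each particle’s neighborhood is itself and its two index-adjacent particles, is one common choice and is assumed in this minimal sketch (fitness is minimized).

import numpy as np

# lbest under an assumed ring topology: each particle's neighborhood is
# itself and its two index-adjacent particles (indices wrap around).
def local_best(pbest, pbest_fitness):
    N = len(pbest_fitness)
    lbest = np.empty_like(pbest)
    for i in range(N):
        neighborhood = [(i - 1) % N, i, (i + 1) % N]
        best = min(neighborhood, key=lambda k: pbest_fitness[k])
        lbest[i] = pbest[best]
    return lbest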
The Origin of Particle Swarm Optimization • The global version vs. the local version • The global version biases more toward exploitation, has a faster convergence speed, and performs better on unimodal problems • The local version focuses more on exploration, is less likely to get trapped in a local optimal solution, and performs better on multimodal problems • The global version has a higher probability of converging to a local optimum than the local version • The local version has a slower convergence speed than the global version Remark 1: expanding the neighborhood speeds up convergence, but introduces the frailties of the global version Remark 2: the global version is an extreme case of the local version (the neighborhood is the whole swarm)
Outline of Talk • The Origin of Particle Swarm Optimization • The Classic Particle Swarm Optimization • The State-of-the-Art Particle Swarm Optimization
The Classic Particle Swarm Optimization • The velocity update equation of the original PSO
the global version: vi,j = vi,j + 2*rand*(pbesti,j - presenti,j) + 2*rand*(gbestj - presenti,j)
the local version: vi,j = vi,j + 2*rand*(pbesti,j - presenti,j) + 2*rand*(lbestj - presenti,j)
• A generalization
v^{t+1}_{i,j} = v^t_{i,j} + c1*r1*(pbest^t_{i,j} - x^t_{i,j}) + c2*r2*(gbest^t_j - x^t_{i,j})
v^{t+1}_{i,j} = v^t_{i,j} + c1*r1*(pbest^t_{i,j} - x^t_{i,j}) + c2*r2*(lbest^t_j - x^t_{i,j})
where t is the generation number, i=1,…,N, j=1,…,D, c1 and c2 are the acceleration constants, and r1 and r2 are uniformly distributed random numbers in the range [0,1]
The Classic Particle Swarm Optimization • The movement equation of the classic PSO
velocity update equation: v^{t+1}_{i,j} = v^t_{i,j} + c1*r1*(pbest^t_{i,j} - x^t_{i,j}) + c2*r2*(gbest^t_j - x^t_{i,j}), where the pbest term is the cognition part and the gbest term is the social part
position update equation: x^{t+1}_{i,j} = x^t_{i,j} + v^{t+1}_{i,j}
Remark: in PSO, each variable is updated independently J. Kennedy, “The particle swarm: social adaptation of knowledge,” in Proc. IEEE Int. Conf. Evolutionary Computation, Indianapolis, IN, 1997, pp. 303-308.
The Classic Particle Swarm Optimization • The main principle of the movement equation (figure illustrating the global version)
The Classic Particle Swarm Optimization • The framework of the classic PSO (flowchart; a code sketch is given below)
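The flowchart itself is not reproduced here. The following is a minimal, self-contained Python sketch of the classic global-version framework under common assumptions noted in the comments: the objective is minimized, c1 = c2 = 2, velocities are clamped to a vmax chosen relative to the search range, and out-of-range positions are repaired by clipping to [aj, bj].

import numpy as np

def classic_pso(f, a, b, n_particles=30, n_generations=200, c1=2.0, c2=2.0, seed=0):
    # f: objective to minimize; a, b: length-D lower/upper bounds.
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, float), np.asarray(b, float)
    vmax = b - a                                        # one common heuristic choice
    x = rng.uniform(a, b, size=(n_particles, len(a)))   # initial positions
    v = rng.uniform(-vmax, vmax, size=x.shape)          # initial velocities
    pbest = x.copy()
    pbest_f = np.apply_along_axis(f, 1, x)
    g = pbest_f.argmin()
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    for _ in range(n_generations):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # velocity update
        v = np.clip(v, -vmax, vmax)    # limit the velocity by vmax
        x = np.clip(x + v, a, b)       # position update + repair to [a, b]
        fx = np.apply_along_axis(f, 1, x)
        improved = fx < pbest_f        # update personal bests
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        g = pbest_f.argmin()           # update the global best
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    return gbest, gbest_f

# Usage: minimize the 10-dimensional sphere function.
if __name__ == "__main__":
    best, value = classic_pso(lambda z: float(np.sum(z * z)),
                              a=[-5.0] * 10, b=[5.0] * 10)
    print(best, value)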
The Classic Particle Swarm Optimization • Some issues • The velocity vi,j should be limited by some value vmax • It keeps the computer from overflowing • It simulates the incremental changes of human learning and attitude change • It determines the granularity of search of the problem space • If the position xi,j is out of the search region [aj, bj], a repair operator should be applied
The Classic Particle Swarm Optimization • There are three main parameters in the classic PSO: vmax, c1, and c2 • If vmax is too high, particles might fly past good solutions • If vmax is too small, particles might not explore sufficiently beyond locally good regions • Usually, vmax is set in proportion to the width of the search region • c1 and c2 • Low values allow particles to roam far from target regions before being tugged back • High values result in abrupt movement toward target regions • Usually, c1 = c2 = 2
Outline of Talk • The Origin of Particle Swarm Optimization • The Classic Particle Swarm Optimization • The State-of-the-Art Particle Swarm Optimization
The Current Research Directions of PSO • Due to its simplicity and effectiveness, PSO has become a popular optimizer • Much effort has been made to improve the performance of PSO • Adapting the control parameters • such as w, c1, and c2 • Designing the population topology • the way particles communicate or share information with each other • static topology or dynamic topology • Hybridizing PSO with auxiliary operations • Self-learning and adaptive strategies
The Current Research Directions of PSO • Part I: Adapting the control parameters
A Modified PSO (1/2) • The classic velocity update equation: v^{t+1}_{i,j} = v^t_{i,j} + c1*r1*(pbest^t_{i,j} - x^t_{i,j}) + c2*r2*(gbest^t_j - x^t_{i,j}) • The first term ① is the momentum that drives the global search; the second and third terms ② ③ pull the particle back for the local search • If the equation only has the first term, the particles will keep on “flying” at their current speeds in the same directions until they hit the boundary • If the equation does not have the first term, the search range statistically shrinks through the generations, and the swarm stagnates due to the lack of momentum • By adding the first term, the particles have a tendency to expand the search space, i.e., the ability to explore new areas
A Modified PSO (2/2) • How to balance the global and local search in PSO • An inertia weight w is added into the generalized velocity update equation: v^{t+1}_{i,j} = w*v^t_{i,j} + c1*r1*(pbest^t_{i,j} - x^t_{i,j}) + c2*r2*(gbest^t_j - x^t_{i,j}) • A large inertia weight facilitates a global search • A small inertia weight facilitates a local search • How to set the parameter value of w • A medium setting, such as 0.9 < w < 1.2 • A linearly decreasing setting, such as w starting at 0.9 and ending at 0.4 Y. Shi and R. C. Eberhart, “A modified particle swarm optimizer,” in Proc. IEEE Int. Conf. Evolutionary Computation, Anchorage, AK, May 1998, pp. 69-73.
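A one-function sketch of the linearly decreasing setting; the endpoints 0.9 and 0.4 are from the slide, while the function name is illustrative.

# Linearly decreasing inertia weight: from w_start at generation 0
# down to w_end at the final generation.
def inertia_weight(t, n_generations, w_start=0.9, w_end=0.4):
    return w_start - (w_start - w_end) * t / (n_generations - 1)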
A Generalized Model (1/3) • Explosion in the classic PSO • Due to the random weighting of the control parameters, the velocities and positions of the particles might careen toward infinity • The explosion can be contained by making use of vmax and the inertia weight • However, it is still an open issue to set the value of vmax
A Generalized Model (2/3) • Clerc and Kennedy proposed the use of a constriction coefficient by analyzing a particle’s trajectory • The generalized model: v^{t+1}_{i,j} = χ*(v^t_{i,j} + c1*r1*(pbest^t_{i,j} - x^t_{i,j}) + c2*r2*(gbest^t_j - x^t_{i,j})), where the constriction coefficient is χ = 2/|2 - φ - sqrt(φ^2 - 4φ)| with φ = c1 + c2, φ > 4 • A special case: in general c1 = c2 = 2.05, so φ = 4.1, χ = 0.729, and χ*c1 = χ*c2 = 1.49445 Clerc, M. and Kennedy, J., “The particle swarm-explosion, stability, and convergence in a multidimensional complex space,” IEEE Trans. Evol. Comput., vol. 6, no. 1, pp. 58-73, 2002.
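A small sketch of computing the constriction coefficient; the values below reproduce the constants that survive on the slide.

import math

# Constriction coefficient chi for phi = c1 + c2 > 4 (Clerc & Kennedy, 2002).
def constriction(c1=2.05, c2=2.05):
    phi = c1 + c2
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))

# With c1 = c2 = 2.05 (phi = 4.1), chi is approximately 0.729, and
# chi * c1 = chi * c2 is approximately 1.49445, matching the slide.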
A Generalized Model (3/3) • The advantages of the generalized model • The explosion can be prevented • It makes the vmax limit unnecessary • It can guarantee convergence on local optima • However, the performance of the generalized model can be significantly improved by additionally limiting vi,j to a vmax equal to the dynamic range of each variable R. C. Eberhart and Y. Shi, “Comparing inertia weights and constriction factors in particle swarm optimization,” in Proc. 2000 Congress on Evolutionary Computation, San Diego, CA, July 2000, pp. 84-88.
Self-organizing Hierarchical PSO with Time-varying Acceleration Coefficients (1/5) • Motivation • The ability of PSO with time-varying inertia weight to find the optimum is relatively weak, due to the lack of diversity at the end of the search • In PSO, problem-based tuning of parameters is also a key factor to find the optimum accurately and efficiently • Main ideas • Time-varying acceleration coefficients (TVAC) • PSO with mutation and TVAC (MPSO-TVAC) • Self-organizing hierarchical PSO with TVAC (HPSO-TVAC) A. Ratnaweera, S. Halgamuge, and H. Watson, “Self-organizing hierarchical particle swarm optimizer with time varying accelerating coefficients,” IEEE Trans. Evol. Comput., vol. 8, pp. 240-255, Jun. 2004.
Self-organizing Hierarchical PSO with Time-varying Acceleration Coefficients (2/5) • Time-varying acceleration coefficients (TVAC) • During the early stages, it is desirable to encourage the particles to wander through the entire search space: with a large c1 and small c2, particles have the capability to explore the search space • During the later stages, it is very important to enhance the convergence speed toward the optimum: with a small c1 and large c2, particles have a very fast convergence speed • Meanwhile, the inertia weight w varies from 0.9 at the beginning to 0.4 at the end
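A minimal sketch of a linear TVAC schedule; the linear form is a plausible reading of the method, and the endpoint values are taken from the ranges reported later on the experimental-results slide.

# Linear time-varying acceleration coefficients (TVAC): c1 decreases and
# c2 increases over the run (c1: 2.5 -> 0.5, c2: 0.5 -> 2.5 per the slides).
def tvac(t, n_generations, c1_init=2.5, c1_final=0.5, c2_init=0.5, c2_final=2.5):
    frac = t / n_generations
    c1 = c1_init + (c1_final - c1_init) * frac
    c2 = c2_init + (c2_final - c2_init) * frac
    return c1, c2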
Self-organizing Hierarchical PSO with Time-varying Acceleration Coefficients (3/5) • PSO with mutation and TVAC (MPSO-TVAC) • A mutation operator is proposed to enhance the global search ability of the particles by providing additional diversity • Whether to mutate is governed by a mutation probability and by the improvement of the global best over the generations; the perturbation is governed by a mutation step size • Remark: a time-varying mutation step size is an effective parameter tuning strategy
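A rough sketch of such a mutation. The details not on the slide are assumptions made here for illustration: which particle and dimension to mutate, and a uniform perturbation. The step size decays from vmax to 0.1*vmax and the default probability sits in the [0.4, 0.8] range, both per the experimental-results slide.

import random

# Sketch of an MPSO-TVAC-style mutation: if the global best has not improved,
# perturb one random velocity component of one random particle.
def mutate(v, t, n_generations, vmax, gbest_improved, p_mutation=0.6):
    if gbest_improved or random.random() > p_mutation:
        return v
    step = vmax * (1.0 - 0.9 * t / n_generations)  # vmax -> 0.1*vmax
    i = random.randrange(len(v))       # random particle (assumed)
    j = random.randrange(len(v[0]))    # random dimension (assumed)
    v[i][j] += random.uniform(-step, step)
    return v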
Self-organizing Hierarchical PSO with Time-varying Acceleration Coefficients (4/5) • Self-organizing hierarchical PSO with TVAC (HPSO-TVAC) • Only the social part and the cognition part are considered when estimating the new velocity of each particle (the previous-velocity term is dropped) • The velocity of a particle is reinitialized, with a magnitude proportional to vmax, whenever the particle stagnates in the search space • Remark: a time-varying reinitialization velocity is used
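A rough sketch of this update for one velocity component. Detecting stagnation via a (near-)zero computed velocity is an assumption made here for illustration; the reinitialization magnitude decays from vmax to 0.1*vmax per the experimental-results slide.

import random

# Sketch of the HPSO-TVAC update for one velocity component: no momentum
# term; if the computed velocity is (near) zero, reinitialize it with a
# random velocity proportional to vmax.
def hpso_velocity(x, pbest, gbest, c1, c2, t, n_generations, vmax):
    v = (c1 * random.random() * (pbest - x)
         + c2 * random.random() * (gbest - x))
    if abs(v) < 1e-12:                                    # particle stagnates (assumed test)
        reinit = vmax * (1.0 - 0.9 * t / n_generations)   # vmax -> 0.1*vmax
        v = random.uniform(-reinit, reinit)
    return v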
Self-organizing Hierarchical PSO with Time-varying Acceleration Coefficients (5/5) • Experimental results • In TVAC, the best varying ranges for c1 and c2 are 2.5 to 0.5 and 0.5 to 2.5, respectively • In MPSO-TVAC, the mutation step size changes from vmax to 0.1vmax • In MPSO-TVAC, it is good to set the mutation probability in the range [0.4, 0.8] • In HPSO-TVAC, the reinitialization velocity changes from vmax to 0.1vmax • The performance of MPSO and HPSO with the acceleration coefficients fixed at 2 (i.e., c1 = 2 and c2 = 2) is very poor
Adaptive PSO (1/11) • Motivation • In PSO, the time-varying parameters are set based on the generation number; this may be inappropriate because no information about the evolutionary state is utilized • Some auxiliary techniques (such as mutation, selection, and so on) have been developed for PSO; however, the performance can be further enhanced if they are performed adaptively according to the evolutionary state Z.-H. Zhan, J. Zhang, Y. Li, and H. S.-H. Chung, “Adaptive particle swarm optimization,” IEEE Trans. Syst. Man Cybern. B, vol. 39, no. 6, pp. 1362-1381, 2009.
Adaptive PSO (2/11) • Main ideas • The population distribution information is used to estimate the evolutionary state • The parameters (such as w, c1 and c2) of PSO are adaptively adjusted according to the estimated evolutionary state • An elitist learning strategy is designed
Adaptive PSO (3/11) • Population distribution information in PSO (figure: snapshots of the swarm at generations 1, 25, 49, 50, 60, and 80) • Remark: the population distribution information can vary significantly during the evolution
Adaptive PSO (4/11) • How to formulate the population distribution information • Calculate the mean distance di of each particle to all the other particles in the decision space • Compare all di and determine the maximum dmax and the minimum dmin • Define the mean distance of the globally best particle as dg • Compute the evolutionary factor: f = (dg - dmin)/(dmax - dmin) • f is used to distinguish four evolutionary states: S1 exploration, S2 exploitation, S3 convergence, and S4 jumping out • Remark: f formulates the population distribution information
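A minimal NumPy sketch of computing the evolutionary factor from the steps above; the function name is illustrative.

import numpy as np

# Evolutionary factor f = (dg - dmin) / (dmax - dmin), where d_i is the mean
# Euclidean distance of particle i to all other particles and dg is the mean
# distance of the globally best particle. x is an N*D position matrix.
def evolutionary_factor(x, gbest_index):
    diff = x[:, None, :] - x[None, :, :]       # pairwise differences
    dist = np.sqrt((diff ** 2).sum(axis=2))    # N*N distance matrix
    d = dist.sum(axis=1) / (len(x) - 1)        # mean distance per particle
    dg, dmin, dmax = d[gbest_index], d.min(), d.max()
    return (dg - dmin) / (dmax - dmin)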
Adaptive PSO (5/11) • Evolutionary state estimation (ESE), based on f = (dg - dmin)/(dmax - dmin) • Exploitation or convergence: dg is smaller than the di of most particles (the swarm clusters around the global best), so f takes a relatively small value • Jumping out: dg is larger than the di of most particles (the global best lies away from the clustered swarm), so f takes a relatively large value • Exploration: dg ≈ di, so f takes a medium value
Adaptive PSO (6/11) • Evolutionary state estimation (ESE) • The state transition is nondeterministic and fuzzy, and different algorithms can exhibit different characters of transition • A fuzzy classification is therefore adopted • How to define the membership functions • Exploration: a medium-large value of f • Exploitation: a medium-small (shrunk) value of f • Convergence: a small value of f • Jumping out: a large value of f
Adaptive PSO (7/11) • Evolutionary state estimation (ESE) • In the fuzzy classification, the membership functions overlap • Defuzzification: the singleton method • For example, at f = 0.45, PSO is in a transitional period between S1 and S2, and f can be classified to either S1 or S2