Cooperative Learning of Homogeneous and Heterogeneous Particles in Area-extension PSO
Adham Atyabi, Somnuk Phon-Amnuaisuk, Chin Kuan Ho
Multimedia University, Malaysia
Paper accepted at the IEEE Congress on Evolutionary Computation (CEC 2008), Hong Kong, June 1-6, 2008.
Outline
• The environment
• PSO
• Area-extension PSO
• Learning in PSO
• Results and conclusion
Simulated Environment
• The environment is a hostile scenario in which cooperative robots try to locate bombs and disarm them.
• The robots know the likelihood of bombs being present in an area, but not their precise locations.
• The likelihood information may be uncertain because of noise.
Plausible applications: applying PSO to navigate agents in uncertain environments
Particle Swarm Optimisation
• PSO is an Evolutionary Algorithm (EA) inspired by animal social behavior (Kennedy, Eberhart, 1995).
• The method was inspired by the movement of flocking birds and their interactions with their neighbors in the group.
• EA achieves optimization using three primary principles: Evaluation, where a quantitative fitness can be determined for each agent (particle); Comparison, where the best performer among the agents can be selected; Imitation, where the qualities of better agents are mimicked by others.
Update equations
• Every particle in the population begins with a randomized position Xij and a randomized velocity Vij in the n-dimensional search space, where i represents the particle index and j represents the dimension in the search space.
• Each particle remembers the position at which it achieved its highest performance (p).
• Each particle is also a member of some neighborhood of particles, and remembers which particle achieved the best overall position in that neighborhood (g).
Vij(t) = Last velocity + Cognitive component + Social component
Vij(t) = w*Vij(t-1) + C1*R1*(pij - Xij(t-1)) + C2*R2*(gi - Xij(t-1))
Xij(t) = Xij(t-1) + Vij(t)
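The update equations can be sketched for a single particle as follows. This is a minimal illustration, not the authors' code: the function name `pso_step` and the default values of w, C1 and C2 are assumptions, R1 and R2 are taken to be uniform random numbers in [0, 1) as is usual in PSO, and the neighborhood best g is indexed per dimension.

```python
import random

def pso_step(x, v, p, g, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One velocity/position update for a single particle.

    x, v : current position and velocity (one entry per dimension j)
    p    : best position this particle has found so far
    g    : best position found in the particle's neighborhood
    w, c1, c2 : inertia weight and acceleration coefficients
                (illustrative defaults, not taken from the slides)
    """
    new_x, new_v = [], []
    for j in range(len(x)):
        r1, r2 = rng.random(), rng.random()  # fresh R1, R2 per dimension
        vj = (w * v[j]                       # last velocity
              + c1 * r1 * (p[j] - x[j])      # cognitive component
              + c2 * r2 * (g[j] - x[j]))     # social component
        new_v.append(vj)
        new_x.append(x[j] + vj)              # Xij(t) = Xij(t-1) + Vij(t)
    return new_x, new_v
```

When a particle already sits on both p and g, the cognitive and social terms vanish and only the inertia term moves it, which is one way to see why particles in basic PSO can stall.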
Trajectory of five simulated agents navigated using basic PSO
On a good day
Basic PSO does not seem to work. Why?
Points to ponder:
• Particles seem to stick in one place. This results in ineffective exploitation and exploration.
• How should we inform the swarm about fruitful positions (without giving away the solution)?
• How should the swarm communicate and share useful information?
• There is no free lunch.
Area-Extension PSO
• The idea is to divide the environment into fixed virtual sub-areas with different credits.
• The credit of an area is defined as the proportion of goals and obstacles positioned in the area.
• Each particle knows the credits of the first and second layers of its current neighborhood.
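One possible reading of the area and credit idea is sketched below. Everything specific here is an assumption made for illustration: the areas are square grid cells of side `cell_size`, the credit of an area is taken as goals / (goals + obstacles) inside it, and the first and second layers are the rings of areas at grid distance 1 and 2 around the particle's own area. The slides do not give the exact formula.

```python
from collections import Counter

def area_of(pos, cell_size):
    """Map an (x, y) position to the (column, row) index of its fixed area."""
    return (int(pos[0] // cell_size), int(pos[1] // cell_size))

def area_credits(goals, obstacles, cell_size):
    """Credit per non-empty area.

    ASSUMPTION: credit = goals / (goals + obstacles) within the area.
    """
    n_goals = Counter(area_of(p, cell_size) for p in goals)
    n_obst = Counter(area_of(p, cell_size) for p in obstacles)
    credits = {}
    for area in set(n_goals) | set(n_obst):
        total = n_goals[area] + n_obst[area]  # always > 0 for listed areas
        credits[area] = n_goals[area] / total
    return credits

def layer(area, k):
    """Areas at grid (Chebyshev) distance exactly k from `area`."""
    cx, cy = area
    return [(cx + dx, cy + dy)
            for dx in range(-k, k + 1)
            for dy in range(-k, k + 1)
            if max(abs(dx), abs(dy)) == k]

def visible_credits(pos, credits, cell_size):
    """Credits a particle at `pos` can see: its first and second layers.

    Areas with no goals or obstacles are reported with credit 0.0.
    """
    here = area_of(pos, cell_size)
    return {a: credits.get(a, 0.0) for k in (1, 2) for a in layer(here, k)}
```

A particle could then move toward the visible area with the highest credit; how AEPSO actually combines credits with the velocity update is not specified in the text above.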
Heuristics to guide search
• New velocity update rules.
• A Help Request Signal, which provides cooperation between different sub-swarms.
• Rewards and penalties for the particles' actions, which are used to control the Leave Force and Speculation mechanisms.
• The Leave Force and Speculation mechanisms help prevent particles from over-exploring unfruitful areas.
Velocity update rules
Communication among particles
• Particles can only communicate with those within their communication range.
• Various communication ranges are used (500, 250, 125, and 5 pixels).
• This heuristic has a major effect on the sub-swarm size.
• The help request signal can provide a chain of connections.
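The effect of the communication range on sub-swarm size can be illustrated with a small sketch. It assumes that two particles are linked when their Euclidean distance is at most the communication range, and that a sub-swarm is every particle reachable through a chain of such links; the function name `sub_swarms` and this chain model are illustrative assumptions, not the authors' implementation.

```python
import math

def sub_swarms(positions, comm_range):
    """Group particle indices into sub-swarms.

    Two particles are linked when they are within `comm_range` of each
    other; a sub-swarm is a set of particles connected by a chain of links.
    """
    unvisited = set(range(len(positions)))
    groups = []
    while unvisited:
        start = unvisited.pop()
        group, frontier = {start}, [start]
        while frontier:
            i = frontier.pop()
            # Particles not yet assigned that particle i can reach directly.
            linked = {j for j in unvisited
                      if math.dist(positions[i], positions[j]) <= comm_range}
            unvisited -= linked
            group |= linked
            frontier.extend(linked)
        groups.append(sorted(group))
    return sorted(groups)
```

For example, five particles spread along a line split into many small sub-swarms at a short range and merge into a single swarm at a long one:

```python
robots = [(0, 0), (100, 0), (200, 0), (600, 0), (1000, 0)]
for r in (500, 250, 125, 5):
    print(r, sub_swarms(robots, r))
```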
More knowledge
Fitness of PSO, P(.) and G(.)
• The fitness of a particle is derived from the number of bombs in its observation.
• The local best P(.) and global best G(.) are derived from the fitness values according to the particle's own observations and observations shared by other particles.
• Movement in AEPSO is derived from both the velocity update rules and the informed fitness derived from the hot zone.
• Movement in cooperative learning AEPSO is derived from the velocity update rules, the hot zone, and the effective direction learned from experience.
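A rough sketch of the first two points: fitness is counted as the number of bombs a particle observes, and the bests are updated from its own and shared observations. The circular observation radius `obs_radius`, the (position, fitness) pair representation, and both function names are assumptions introduced for illustration; hot zones and learned directions are not modeled.

```python
import math

def fitness(pos, bombs, obs_radius):
    """Number of bombs within the particle's (assumed circular) observation."""
    return sum(1 for b in bombs if math.dist(pos, b) <= obs_radius)

def update_bests(pos, fit, personal_best, shared_bests):
    """Return (P, G), each a (position, fitness) pair.

    personal_best : the particle's previous best as (position, fitness)
    shared_bests  : bests received from particles in communication range
    """
    # P: keep the old best unless the current observation is strictly better.
    p = (pos, fit) if fit > personal_best[1] else personal_best
    # G: best among the particle's own best and everything shared with it.
    g = max([p] + list(shared_bests), key=lambda pf: pf[1])
    return p, g
```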
Movement Trajectories (homogeneous)
AEPSO vs. Random Search and Linear Search
Area-Extension-Cooperative PSO
• Particles incorporate knowledge from their training session.
• Particles share their priority areas with others.
• In homogeneous PSO, all particles are considered to have the same properties.
• In heterogeneous PSO, particles do not have the same properties. In the experiment, this translates to environments with different bomb types, which require different particles (types of robots) to disarm them.
Learning in AEPSO
• The aim is to learn (a) which is the best area, (b) what decision is made, and (c) whether it is the right decision.
• In the training phase, the training method can be either individual training or team-based training.
• Initialization in the testing phase may be either the same as in training or different.
More knowledge
Simulation results
Homogeneous # bomb explosions
The results are from 20 runs (each run is 20,000 iterations). In each run, 5 robots, 51 bombs, and 51 obstacles are used (Table IV).
Homogeneous # disarmed bombs
Heterogeneous # bomb explosions
Heterogeneous # disarmed bombs
Heterogeneous AEPSO: Movement trajectory after learning
Conclusion
• Balancing exploitation and exploration is the key to good performance.
• The no free lunch theorem holds in general.
• Enhancing PSO with extra knowledge may lead to useful applications in exploration tasks (e.g., sea bed exploration).
Thank you. Q&A
Parameter settings
Heterogeneous AEPSO: Movement trajectory before learning