Swarming Agents for Discovering Clusters in Spatial Data
G. Folino, A. Forestiero, G. Spezzano {folino,forestiero,spezzano}@icar.cnr.it
Second International Symposium on Parallel and Distributed Computing, Ljubljana, Slovenia · 13-14 October 2003
Outline
• Introduction
• Swarm intelligence
• Flocking algorithm
• Clustering and spatial datasets
• Sparrow-SNN
• Experimental results
• Conclusions and Future Works
Swarm Intelligence
• Swarm Intelligence (SI) is the property of a system whereby the collective behaviors of (unsophisticated) agents interacting locally with their environment cause coherent functional global patterns to emerge.
• A swarm has the following interesting properties:
  • Distributed, without central control
  • Ability to change the environment
  • Stigmergy (indirect communication via interaction with the environment)
  • Fault tolerance
  • Adaptivity and self-organization
• Typical examples are ant colonies and flocks of birds.
Flocking algorithm
• A typical example of emergent collective behavior.
• No global control.
• Every agent has limited visibility.
• The collective behavior emerges only from local interaction, following three simple rules: separation, alignment, and cohesion.
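The three rules can be illustrated with a minimal Python sketch. It is not the implementation used in the talk: the function name `flock_step`, the visibility `radius`, the separation distance `min_dist`, the rule weights, and the speed cap are all assumptions chosen only to make the example concrete.

```python
import math

def flock_step(positions, velocities, radius=5.0, min_dist=1.0,
               w_sep=1.5, w_ali=1.0, w_coh=1.0, max_speed=2.0):
    """Advance a 2-D flock by one synchronous step (illustrative parameters).

    Returns (new_positions, new_velocities) as lists of (x, y) tuples.
    """
    new_velocities = []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        sep_x = sep_y = ali_x = ali_y = coh_x = coh_y = 0.0
        seen = 0
        for j, (q, u) in enumerate(zip(positions, velocities)):
            d = math.dist(p, q)
            if i == j or d > radius:
                continue  # limited visibility: ignore agents out of range
            seen += 1
            ali_x += u[0]
            ali_y += u[1]
            coh_x += q[0]
            coh_y += q[1]
            if 0 < d < min_dist:
                # Separation: push away from neighbors that are too close.
                sep_x += (p[0] - q[0]) / d
                sep_y += (p[1] - q[1]) / d
        vx, vy = v
        if seen:
            # Alignment: steer towards the mean velocity of visible neighbors.
            # Cohesion: steer towards the mean position of visible neighbors.
            vx += (w_sep * sep_x + w_ali * (ali_x / seen - v[0])
                   + w_coh * (coh_x / seen - p[0]))
            vy += (w_sep * sep_y + w_ali * (ali_y / seen - v[1])
                   + w_coh * (coh_y / seen - p[1]))
        speed = math.hypot(vx, vy)
        if speed > max_speed:
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        new_velocities.append((vx, vy))
    new_positions = [(p[0] + v[0], p[1] + v[1])
                     for p, v in zip(positions, new_velocities)]
    return new_positions, new_velocities
```

Each agent reads only the neighbors inside its own radius, so no global controller appears anywhere in the update.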
Flocking algorithm
• Agents can also have an exploratory behavior:
  • First, agents search for a goal of particular interest.
  • Then, the other flock members are driven towards the goal in order to explore the interesting area more carefully.
Clustering
• Clustering means dividing all objects into groups (clusters) so that the members of a cluster are as similar as possible, whereas the members of different clusters differ as much as possible from each other.
• Spatial clustering should identify clusters of different dimensions, sizes, shapes and densities, which is particularly difficult.
Clustering
[Figure: a spatial dataset with clusters of different density]
SNN algorithm (1)
• SNN is based on the well-known Jarvis-Patrick algorithm, which:
  • identifies the K nearest neighbors of each object (data point) in the dataset;
  • lets two objects i and j join the same cluster if:
    1) i is one of the K nearest neighbors of j;
    2) j is one of the K nearest neighbors of i;
    3) i and j have at least Kmin of their K nearest neighbors in common;
    where K and Kmin are user-defined parameters.
• A link with an associated weight is defined for each pair of points i and j.
• The connectivity of a data point is computed as the sum of the weights associated with its outgoing links.
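A small Python sketch may make the shared-nearest-neighbor quantities concrete. The slide does not say how a link weight is computed, so the sketch assumes that a link exists only between mutual K-nearest neighbors and that its weight is the number of neighbors the two points share. The function names `k_nearest`, `snn_links`, `jarvis_patrick_pairs`, and `connectivity` are hypothetical.

```python
import math

def k_nearest(points, k):
    """For each point index, return the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors

def snn_links(points, k):
    """Map each mutual-neighbor pair (i, j), i < j, to its link weight.

    Assumption: weight = number of shared K nearest neighbors.
    """
    nn = k_nearest(points, k)
    links = {}
    for i in range(len(points)):
        for j in nn[i]:
            if i < j and i in nn[j]:      # conditions 1) and 2)
                links[(i, j)] = len(nn[i] & nn[j])
    return links

def jarvis_patrick_pairs(links, k_min):
    """Pairs that also satisfy condition 3): at least k_min shared neighbors."""
    return {pair for pair, shared in links.items() if shared >= k_min}

def connectivity(links, n):
    """Sum of the weights of the links touching each of the n points."""
    conn = [0] * n
    for (i, j), weight in links.items():
        conn[i] += weight
        conn[j] += weight
    return conn

# Two well-separated squares of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
links = snn_links(points, k=3)
conn = connectivity(links, len(points))
```

The brute-force neighbor search is quadratic in the number of points; it is used here only to keep the example short.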
SNN algorithm (2)
• For every node (data point), calculate the connectivity.
• Identify representative points by choosing the points that have high connectivity (> core_threshold).
• Identify noise points by choosing the points that have low connectivity (< noise_threshold) and remove them.
• Remove all links between points whose weight is smaller than a threshold (merge_threshold).
• Take connected components of points to form clusters, where every point in a cluster is either a representative point or is connected to a representative point.
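The steps above can be sketched in Python as follows. The sketch takes precomputed link weights and connectivities as input, and it reads "connected to a representative point" as "in the same connected component as one", which is an assumption; the function name `snn_clusters` and the sample values are hypothetical.

```python
def snn_clusters(links, conn, core_threshold, noise_threshold, merge_threshold):
    """Return (clusters, noise) from link weights and per-point connectivity.

    links: dict mapping (i, j) pairs to weights; conn: list of connectivities.
    """
    n = len(conn)
    core = {i for i in range(n) if conn[i] > core_threshold}
    noise = {i for i in range(n) if conn[i] < noise_threshold}

    # Keep only strong links between points that are not noise.
    adjacency = {i: set() for i in range(n) if i not in noise}
    for (i, j), weight in links.items():
        if weight >= merge_threshold and i in adjacency and j in adjacency:
            adjacency[i].add(j)
            adjacency[j].add(i)

    clusters, visited = [], set()
    for start in sorted(adjacency):
        if start in visited:
            continue
        component, stack = set(), [start]
        while stack:                      # depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        visited |= component
        if component & core:              # must contain a representative point
            clusters.append(sorted(component))
    return clusters, sorted(noise)

# Hypothetical input: points 0-2 are strongly linked, 3-4 weakly, 5 is isolated.
links = {(0, 1): 3, (1, 2): 3, (0, 2): 3, (2, 3): 1, (3, 4): 1}
conn = [6, 6, 7, 2, 1, 0]
clusters, noise = snn_clusters(links, conn, core_threshold=5,
                               noise_threshold=1, merge_threshold=2)
```

Components without any representative point are dropped rather than reported as clusters, which follows from the last step on the slide.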
SPARROW-SNN
• Sparrow-SNN combines the stochastic search of an adaptive flocking algorithm with SNN to discover clusters in spatial data.
• It uses a variant of the flocking algorithm:
  • First, agents search for a goal of particular interest.
  • Then, the other flock members are driven towards the goal in order to explore the interesting area more carefully.
• We used Swarm, a software package for multi-agent simulation of complex systems, to implement Sparrow-SNN.
SPARROW-SNN
[Figure: pseudo-code of the algorithm]
SPARROW-SNN
• N agents are generated randomly in the search space.
• When an agent falls on a data point not previously explored, it computes the connectivity of that point.
• Depending on the connectivity, agents take different colors:
  • conn > core_threshold -> mycolor = red
  • noise_threshold < conn <= core_threshold -> mycolor = green
  • 0 < conn < noise_threshold -> mycolor = yellow
  • conn = 0 -> mycolor = white
• Agents can thus indicate a representative point (red), a border point (green), noise (yellow), or an obstacle (white).
• Red and white agents stop and signal the interesting and desert regions, respectively, to the other agents.
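The color rule can be written as a small Python function. The name `agent_color` is hypothetical. The slide does not assign a color to the boundary case conn = noise_threshold, so the sketch treats it as yellow; that choice is an assumption.

```python
def agent_color(conn, core_threshold, noise_threshold):
    """Map the connectivity of the visited data point to an agent color."""
    if conn > core_threshold:
        return "red"       # representative point
    if conn > noise_threshold:
        return "green"     # border point
    if conn > 0:
        # Includes conn == noise_threshold, which the slide leaves unspecified.
        return "yellow"    # noise
    return "white"         # obstacle: no connectivity at all
```

For example, with core_threshold = 10 and noise_threshold = 3, a point of connectivity 7 yields a green agent.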
SPARROW-SNN
• Yellow and green agents move following the modified rules of the flock (with repulsion from white agents and attraction towards red agents).
• In addition, yellow agents move quickly (they are in uninteresting zones), whereas green agents move slowly.
• Red agents (placed on a representative point) run the merge procedure, which includes in the final cluster the discovered representative point together with the points that share with it a significant number of neighbors (greater than Pmin).
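The attraction and repulsion terms, together with the color-dependent speed, can be sketched in Python as below. This covers only those two modifications and leaves out the standard separation, alignment, and cohesion rules. The name `steer`, the `SPEED` values, and the visibility `radius` are assumptions, not values from the talk.

```python
import math

# Hypothetical step lengths: yellow agents move fast, green agents slowly.
SPEED = {"yellow": 2.0, "green": 1.0}

def steer(position, color, red_positions, white_positions, radius=10.0):
    """Return the (dx, dy) displacement of one agent for the next step."""
    if color not in SPEED:
        return (0.0, 0.0)  # red and white agents stay where they are
    dx = dy = 0.0
    for q in red_positions:               # attraction towards red agents
        d = math.dist(position, q)
        if 0 < d <= radius:
            dx += (q[0] - position[0]) / d
            dy += (q[1] - position[1]) / d
    for q in white_positions:             # repulsion from white agents
        d = math.dist(position, q)
        if 0 < d <= radius:
            dx -= (q[0] - position[0]) / d
            dy -= (q[1] - position[1]) / d
    norm = math.hypot(dx, dy)
    if norm == 0:
        return (0.0, 0.0)  # nothing visible, or the forces cancel out
    step = SPEED[color]
    return (dx / norm * step, dy / norm * step)
```

A full agent would add this displacement to the one produced by the ordinary flocking rules before moving.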
Experimental results (random search vs Sparrow-SNN)
[Figure panels: a) GEORGE, b) North-East]
Conclusions and Future Works
• Sparrow-SNN is able to discover clusters of arbitrary shape, size and density in spatial data.
• It performs approximate clustering well.
• It is naturally distributed, fault tolerant and scalable.
• We are working on implementing a new version of Sparrow using Anthill, a peer-to-peer multi-agent system based on JXTA.