Swarm-based Distributed Clustering in Peer-to-Peer Systems
Folino Gianluigi; Spezzano Giandomenico; Forestiero Agostino
2006
Abstract
Clustering can be defined as the process of partitioning a set of patterns into disjoint, homogeneous, and meaningful groups, called clusters. Traditional clustering methods require all data to be located at the site where they are analyzed, and so cannot be applied to multiple distributed datasets. This paper describes a multi-agent algorithm for clustering distributed data in a peer-to-peer environment. The proposed algorithm is based on the biologically inspired paradigm of a flock of birds. Agents, in this context, are used to discover clusters using a density-based approach. Swarm-based algorithms have attractive features, including adaptation, robustness, and a distributed, decentralized nature, which make them well suited for clustering in P2P networks, where centralized network control is difficult to implement. We have applied this algorithm to synthetic and real-world datasets and measured the impact of the flocking search strategy on performance in terms of accuracy and scalability.


