Clustering can be defined as the process of partitioning a set of patterns into disjoint and homogeneous meaningful groups, called clusters. Traditional clustering methods require that all data have to be located at the site where they are analyzed and cannot be applied in the case of multiple distributed datasets. This paper describes a multi-agent algorithm for clustering distributed data in a peer-to-peer environment. The algorithm proposed is based on the biology-inspired paradigm of a flock of birds. Agents, in this context, are used to discovery clusters us ing a density-based approach. Swarm-based algorithms have attractive features that include adaptation, robustness and a distributed, decentralized nature, making them well-suited for clustering in p2p networks, in which it is difficult to implement centralized network control. We have applied this algorithm on synthetic and real world datasets and we have measured the impact of the flocking search strategy on performance in terms of accuracy and scalability.

Swarm-based Distributed Clustering in Peer-to-Peer Systems

Folino Gianluigi;Spezzano Giandomenico;Forestiero Agostino
2006

Abstract

Clustering can be defined as the process of partitioning a set of patterns into disjoint and homogeneous meaningful groups, called clusters. Traditional clustering methods require that all data have to be located at the site where they are analyzed and cannot be applied in the case of multiple distributed datasets. This paper describes a multi-agent algorithm for clustering distributed data in a peer-to-peer environment. The algorithm proposed is based on the biology-inspired paradigm of a flock of birds. Agents, in this context, are used to discovery clusters us ing a density-based approach. Swarm-based algorithms have attractive features that include adaptation, robustness and a distributed, decentralized nature, making them well-suited for clustering in p2p networks, in which it is difficult to implement centralized network control. We have applied this algorithm on synthetic and real world datasets and we have measured the impact of the flocking search strategy on performance in terms of accuracy and scalability.
2006
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/126617
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact