NG-DBSCAN: scalable density-based clustering for arbitrary data

2016

Abstract

We present NG-DBSCAN, an approximate density-based clustering algorithm that can operate with arbitrary similarity metrics. The distributed design of our algorithm makes it scalable to very large datasets; its approximate nature makes it fast, yet capable of producing high-quality clustering results. We provide a detailed overview of the steps of NG-DBSCAN, together with their analysis. Our results, obtained through an extensive experimental campaign on real and synthetic data, substantiate our claims about NG-DBSCAN's performance and scalability.
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Clustering
Files in this record:
File: prod_367399-doc_121538.pdf
Access: open access
Description: NG-DBSCAN: scalable density-based clustering for arbitrary data
Type: Publisher's version (PDF)
Size: 1.07 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/355283
Citations
  • PMC: not available
  • Scopus: 75
  • ISI: 58