DESCRY: A Grid and Density Based Clustering Algorithm for Very Large Data Sets
Angiulli, Fabrizio
2004
Abstract
A novel algorithm, named DESCRY, for clustering very large multidimensional data sets with numerical attributes is presented. DESCRY discovers clusters of different shape, size, and density, even when the data contain noise, by first finding and clustering a small set of points, called meta-points, that depict the shape of the clusters present in the data set. Final clusters are obtained by assigning each point to one of the partial clusters. The computational complexity of DESCRY is linear in both the data set size and the data set dimensionality. Experiments show that the quality of the clusters obtained is comparable to that of state-of-the-art clustering algorithms.
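The three stages named in the abstract (find meta-points, cluster them, assign every point to a partial cluster) can be illustrated with a minimal sketch. This is not the DESCRY algorithm itself: the abstract does not say how meta-points are found or clustered, so the sketch assumes a uniform grid whose dense cells each contribute their centroid as a meta-point, and a simple single-link grouping of meta-points. The function names and the parameters `cell_size`, `min_pts`, and `link_dist` are hypothetical, and the pairwise loops make no attempt to reproduce the linear complexity claimed for DESCRY.

```python
from collections import defaultdict
from math import dist, floor


def find_meta_points(points, cell_size, min_pts):
    """Assumed stage 1: bin points into a uniform grid and return the
    centroid of every cell holding at least min_pts points."""
    cells = defaultdict(list)
    for p in points:
        key = tuple(floor(x / cell_size) for x in p)
        cells[key].append(p)
    metas = []
    for members in cells.values():
        if len(members) >= min_pts:
            n = len(members)
            metas.append(tuple(sum(c) / n for c in zip(*members)))
    return metas


def cluster_meta_points(metas, link_dist):
    """Assumed stage 2: single-link grouping with union-find; meta-points
    no farther apart than link_dist end up in the same partial cluster."""
    parent = list(range(len(metas)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(metas)):
        for j in range(i + 1, len(metas)):
            if dist(metas[i], metas[j]) <= link_dist:
                parent[find(i)] = find(j)

    # Relabel union-find roots as consecutive cluster ids 0, 1, 2, ...
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(metas))]


def assign_points(points, metas, meta_labels):
    """Stage 3: give each point the label of its nearest meta-point."""
    labels = []
    for p in points:
        nearest = min(range(len(metas)), key=lambda i: dist(p, metas[i]))
        labels.append(meta_labels[nearest])
    return labels


def sketch_cluster(points, cell_size=1.0, min_pts=2, link_dist=1.5):
    """Run the three assumed stages and return one label per input point."""
    metas = find_meta_points(points, cell_size, min_pts)
    if not metas:
        raise ValueError("no dense cells; lower min_pts or enlarge cell_size")
    meta_labels = cluster_meta_points(metas, link_dist)
    return assign_points(points, metas, meta_labels)
```

Calling `sketch_cluster(points)` on a list of equal-length numeric tuples returns a list of integer cluster labels in input order. Points in sparse cells do not produce meta-points but are still labeled in the last stage, which mirrors the abstract's statement that each point is assigned to one of the partial clusters.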