Enhanced clustering of complex database objects in the ClustCube framework
Cuzzocrea, Alfredo
2012
Abstract
This paper significantly extends our previous research contribution [1], where we introduced the OLAP-based ClustCube framework for clustering and mining complex database objects extracted from distributed database settings. We provide two novel contributions over [1]. First, we propose a tree-based distance function over complex objects that takes into account the tree-like structure these objects typically have in distributed database settings. This distance improves on the simpler low-level-field-based distance presented in [1]. Second, we report a comprehensive experimental evaluation of the ClustCube algorithms for computing ClustCube cubes, covering both performance and accuracy metrics, on a well-known benchmark data set and in comparison with a state-of-the-art subspace clustering algorithm for high-dimensional data. The results obtained demonstrate the superiority of our approach. Copyright © 2012 ACM.