ClustCube: An OLAP-based framework for clustering and mining complex database objects
Cuzzocrea, Alfredo
2011
Abstract
In this paper, we introduce and experimentally assess ClustCube, an innovative OLAP-based framework for clustering and mining complex database objects extracted from distributed database settings by means of complex SQL statements involving multiple JOIN queries across (distributed) relational tables. To this end, ClustCube combines conventional clustering techniques with well-consolidated OLAP methodologies in order to achieve higher expressive power and mining effectiveness than traditional methodologies for mining tuple-oriented information. A relevant challenge in our research is efficiently computing ClustCube cubes, enriched by their respective cuboid lattices, which may represent a critical bottleneck for the proposed framework. To address this drawback, we propose a collection of algorithms that implement an innovative distributive approach, taking advantage of both the structured nature of complex database objects within cuboids and the distributive nature of clustering across hierarchical domains, such as those defined by conventional OLAP schemas. © 2011 ACM.