In this paper, we focus on the problem of providing accurate estimates of a target data cube from sets of source data cubes that share the same summary measures. We investigate acyclic and cyclic schemas of data sources and show that a more accurate target data cube can be computed on the basis of the third and fourth standardized moments (i.e., skewness and kurtosis, respectively) of the source data cubes. In particular, we consider the mean absolute deviation of these parameters over the data cubes in each set. For both standardized moments, we formally prove that the set with the higher mean skewness/kurtosis has the lower mean absolute deviation of skewness/kurtosis. On the basis of these results, we prove that the set of data cubes with the lower mean absolute deviation of skewness/kurtosis yields a more accurate target estimate. We also show that groups with cyclic schemas that have a lower mean absolute deviation of standardized moments yield a more accurate target estimate than those with acyclic schemas. We define an algorithm that identifies, within each set of acyclic and cyclic groups, the one that yields the more accurate estimate of the target data cube on the basis of standardized moments. We provide experimental results to verify our theoretical results. (C) 2020 Elsevier Ltd. All rights reserved.
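The selection criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes that each data cube is flattened to a plain list of summary-measure values, that standardized moments are computed from population (biased) central moments, and that the mean absolute deviation is taken about the mean of the per-cube moments. The names `standardized_moment`, `moment_mad`, and `pick_set` are hypothetical and introduced only for this illustration.

```python
from statistics import fmean


def standardized_moment(values, k):
    """k-th standardized moment (k=3: skewness, k=4: kurtosis).

    Assumption: population (biased) central moments are used.
    """
    mu = fmean(values)
    var = fmean([(x - mu) ** 2 for x in values])
    if var == 0:
        raise ValueError("standardized moments are undefined for constant data")
    return fmean([(x - mu) ** k for x in values]) / var ** (k / 2)


def mean_absolute_deviation(xs):
    """Mean absolute deviation about the mean (an assumed definition)."""
    m = fmean(xs)
    return fmean([abs(x - m) for x in xs])


def moment_mad(cube_set, k):
    """MAD of the k-th standardized moment over the cubes in one set."""
    return mean_absolute_deviation([standardized_moment(c, k) for c in cube_set])


def pick_set(cube_sets, k=3):
    """Return the name of the set with the lowest MAD of the k-th moment."""
    return min(cube_sets, key=lambda name: moment_mad(cube_sets[name], k))


# Hypothetical toy data: each cube is a flat list of measure values.
cube_sets = {
    "A": [[1, 2, 3], [2, 4, 6]],
    "B": [[1, 2, 3], [1, 1, 1, 10]],
}
chosen = pick_set(cube_sets, k=3)
```

In this toy data, both cubes in set `A` are symmetric, so their skewness values coincide and the mean absolute deviation is zero, while set `B` mixes a symmetric cube with a strongly right-skewed one. Passing `k=4` ranks the sets by kurtosis instead. How the paper combines the two moments, and how it treats acyclic versus cyclic schemas, is not specified in the abstract and is not modeled here.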
Providing accurate answers to OLAP queries based on standardized moments of data cubes
Pourabbas Dolatabad, Elaheh
2020
File | Size | Format |
---|---|---|
prod_434603-doc_155312.pdf (authorized users only). Description: Providing accurate answers to OLAP queries based on standardized moments of data cubes. Type: Publisher's version (PDF). License: Not public (private/restricted access) | 1.41 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.