Providing accurate answers to OLAP queries based on standardized moments of data cubes

Pourabbas, Elaheh
2020

Abstract

In this paper, we focus on the problem of providing accurate estimates of a target data cube from sets of source data cubes that share the same summary measures. We investigate acyclic and cyclic schemas of data sources and show that a more accurate target data cube can be computed on the basis of the third and fourth standardized moments (i.e., skewness and kurtosis, respectively) of the source data cubes. In particular, we consider the mean absolute deviation of these parameters over the data cubes in each set. For both standardized moments, we formally prove that the set with the higher mean skewness/kurtosis has the lower mean absolute deviation of skewness/kurtosis. On the basis of these results, we prove that the set of data cubes with the lower mean absolute deviation of skewness/kurtosis yields a more accurate target estimate. We also show that groups with cyclic schemas that have a lower mean absolute deviation of the standardized moments yield a more accurate target estimate than those with acyclic schemas. We define an algorithm that identifies, within each set of acyclic and cyclic groups, the one that yields the more accurate estimate of the target data cube on the basis of the standardized moments. We provide experimental results to verify our theoretical results. © 2020 Elsevier Ltd. All rights reserved.
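The selection criterion described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not spell out. It assumes each source data cube is flattened to a list of measure values, uses population-form standardized moments, and introduces the hypothetical names `standardized_moment`, `mean_abs_deviation`, and `pick_set` purely for illustration.

```python
from statistics import fmean


def standardized_moment(values, k):
    """k-th standardized moment (population form): k=3 skewness, k=4 kurtosis."""
    mu = fmean(values)
    var = fmean([(x - mu) ** 2 for x in values])
    if var == 0:
        raise ValueError("standardized moments are undefined for constant data")
    return fmean([(x - mu) ** k for x in values]) / var ** (k / 2)


def mean_abs_deviation(xs):
    """Mean absolute deviation of xs around their own mean."""
    m = fmean(xs)
    return fmean([abs(x - m) for x in xs])


def pick_set(cube_sets, k):
    """Return the name of the set whose cubes have the lowest mean absolute
    deviation of the k-th standardized moment.

    cube_sets maps a set name to a list of cubes; each cube is assumed to be
    a flat list of measure values (an illustrative simplification).
    """
    def spread(cubes):
        return mean_abs_deviation([standardized_moment(c, k) for c in cubes])

    return min(cube_sets, key=lambda name: spread(cube_sets[name]))


# Toy data, invented for illustration only.
cube_sets = {
    "A": [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10]],
    "B": [[1, 1, 1, 10], [1, 2, 3, 4, 5]],
}
best_by_skewness = pick_set(cube_sets, 3)
best_by_kurtosis = pick_set(cube_sets, 4)
```

Under the abstract's claim, the set returned by `pick_set` would be the candidate expected to give the more accurate target estimate. The estimation step itself (for example via IPFP, listed among the keywords) is outside this sketch.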
Istituto di Analisi dei Sistemi ed Informatica "Antonio Ruberti" - IASI
OLAP
Query estimation
Accuracy
Skewness
Kurtosis
Acyclic and cyclic schemas
IPFP

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/385710