Efficient Estimation of Joint Queries from Multiple OLAP Databases

2007

Abstract

Given an OLAP query expressed over multiple source OLAP databases, we study the problem of estimating the resulting OLAP target database. The problem arises when the result cannot be derived from a single database. We use linear indirect estimation, a technique common in statistical estimation. We examine two straightforward methods for computing such a target database, the full cross-product (F) method and the preaggregation (P) method, and study their accuracy and computational cost. The F method yields a more accurate estimate but is computationally more expensive than P. Our contribution is a third method, partial preaggregation (PP), which is significantly less expensive than F but just as accurate. We formally prove that the PP method yields the same results as the F method, and we provide analytical and experimental results on its accuracy and computational benefits.
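The relationship between the three methods can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two source tables, the dimension names, and the simple proportional estimator (scale one source by the other source's shares within each value of the common dimension). Under those assumptions, aggregating away a dimension that is neither common nor in the target before crossing the sources gives the same target cells as crossing everything first, while aggregating away the common dimension as well gives a different, coarser estimate.

```python
from collections import defaultdict
import math

# Hypothetical source A: counts by (state, age, sex).
A = {
    ("CA", "young", "F"): 30, ("CA", "young", "M"): 20,
    ("CA", "old", "F"): 25,   ("CA", "old", "M"): 25,
    ("NY", "young", "F"): 10, ("NY", "young", "M"): 10,
    ("NY", "old", "F"): 40,   ("NY", "old", "M"): 40,
}
# Hypothetical source B: counts by (state, group).
B = {
    ("CA", "g1"): 80, ("CA", "g2"): 20,
    ("NY", "g1"): 30, ("NY", "g2"): 70,
}
# Target (assumed): counts by (age, group). Common dimension: state.


def state_totals(b):
    """Total count per state, taken from source B."""
    totals = defaultdict(float)
    for (state, _group), value in b.items():
        totals[state] += value
    return totals


def full_cross_product(a, b):
    """F-style sketch: estimate every (state, age, sex, group) cell, then aggregate."""
    totals = state_totals(b)
    cells = {}
    for (state, age, sex), av in a.items():
        for (state_b, group), bv in b.items():
            if state == state_b:
                cells[(state, age, sex, group)] = av * bv / totals[state]
    target = defaultdict(float)
    for (_state, age, _sex, group), value in cells.items():
        target[(age, group)] += value
    return dict(target)


def partial_preaggregation(a, b):
    """PP-style sketch: drop 'sex' from A first, but keep the common 'state'."""
    totals = state_totals(b)
    a_pre = defaultdict(float)
    for (state, age, _sex), av in a.items():
        a_pre[(state, age)] += av
    target = defaultdict(float)
    for (state, age), av in a_pre.items():
        for (state_b, group), bv in b.items():
            if state == state_b:
                target[(age, group)] += av * bv / totals[state]
    return dict(target)


def preaggregation(a, b):
    """P-style sketch: aggregate both sources down to target dimensions only."""
    a_age = defaultdict(float)
    for (_state, age, _sex), av in a.items():
        a_age[age] += av
    b_group = defaultdict(float)
    for (_state, group), bv in b.items():
        b_group[group] += bv
    grand_total = sum(b.values())
    return {(age, group): av * bv / grand_total
            for age, av in a_age.items() for group, bv in b_group.items()}


est_f = full_cross_product(A, B)
est_pp = partial_preaggregation(A, B)
est_p = preaggregation(A, B)
```

In this toy data the F-style and PP-style estimates coincide cell by cell, while the PP-style version never materializes the four-dimensional intermediate table. The P-style estimate differs because the per-state information is lost before the sources are combined.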
Istituto di Analisi dei Sistemi ed Informatica ''Antonio Ruberti'' - IASI
OLAP
Query Estimation
Multiple Summary Databases

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/170248