In this paper, we provide foundations and theoretical results for a novel paradigm that supports data stream mining algorithms effectively and efficiently, the so-called non-linear data stream compression model. In particular, the proposed model addresses the class of data stream mining applications in which interesting knowledge is extracted from data streams via suitable collections of OLAP queries, the latter being baseline operations of complex knowledge discovery tasks over data streams implemented by ad-hoc data stream mining algorithms. Here, a fruitful line of research consists in admitting approximate, i.e. compressed, representation models and query/mining results in exchange for more efficient and faster computation. Building on this main assumption, the proposed non-linear data stream compression model pursues the idea of maintaining a lower degree of approximation (and, as a consequence, a lower query error) for aggregate information on data stream readings related to interesting events and, by contrast, a higher degree of approximation (and, as a consequence, a higher query error) for aggregate information on other data stream readings, i.e. readings not related to any particular event, or related to events of low interest. © 2012 Springer-Verlag.
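The idea of event-dependent approximation can be illustrated with a minimal sketch. It is not the model defined in the paper: the bucket-based summary, the `is_interesting` predicate, the `fine`/`coarse` widths, and the uniform-spread estimate for range-SUM queries are all assumptions chosen only to show how readings tied to interesting events could be kept at finer granularity than the rest.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    start: int    # first timestamp covered (inclusive)
    end: int      # last timestamp covered (inclusive)
    width: int    # maximum number of readings this bucket may hold
    total: float  # SUM of the readings in [start, end]


def compress(readings, is_interesting, fine=1, coarse=4):
    """Summarize a stream into SUM buckets of event-dependent width.

    Hypothetical scheme: readings flagged by `is_interesting` go into
    buckets of width `fine`; all other readings go into buckets of
    width `coarse`, so they are approximated more heavily.
    """
    buckets = []
    for t, value in enumerate(readings):
        width = fine if is_interesting(t, value) else coarse
        last = buckets[-1] if buckets else None
        # Extend the current bucket only if it has the same granularity
        # and still has room; otherwise open a new bucket.
        if last and last.width == width and last.end - last.start + 1 < width:
            last.end = t
            last.total += value
        else:
            buckets.append(Bucket(t, t, width, float(value)))
    return buckets


def range_sum(buckets, lo, hi):
    """Approximate SUM over timestamps [lo, hi].

    Each bucket's total is assumed to be spread uniformly over the
    timestamps it covers (an assumption of this sketch).
    """
    estimate = 0.0
    for b in buckets:
        overlap = min(hi, b.end) - max(lo, b.start) + 1
        if overlap > 0:
            estimate += b.total * overlap / (b.end - b.start + 1)
    return estimate


# Toy stream: readings at timestamps 4..6 belong to an "interesting" event.
readings = [1, 2, 3, 4, 100, 120, 110, 5, 6, 7, 8, 9]
buckets = compress(readings, lambda t, v: v >= 50)

print(len(buckets), "buckets for", len(readings), "readings")
print("event range  :", range_sum(buckets, 4, 6), "exact:", sum(readings[4:7]))
print("non-event range:", range_sum(buckets, 0, 1), "exact:", sum(readings[0:2]))
```

In this toy run, a query confined to the event range is answered exactly because each of those readings has its own bucket, while a query over the non-event range is answered from a coarse bucket and therefore carries some error. That is the trade-off the abstract describes, under the assumptions stated above.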
Non-linear data stream compression: Foundations and theoretical results
Cuzzocrea, Alfredo
2012