Distributed mining of frequent closed itemsets: some preliminary results
Orlando S;Lucchese C;Perego R
2005
Abstract
In this paper we address the problem of mining frequent closed itemsets in a distributed setting. We envision an environment where a transactional dataset is horizontally partitioned and stored at different sites. We assume that, due to the huge size of the datasets and to privacy concerns, the dataset partitions cannot be moved to a centralized site where the whole dataset could be materialized and the mining task performed. It thus becomes mandatory to perform separate mining at each site, and then merge the local results to derive global knowledge. This paper shows how frequent closed itemsets, mined independently at each site, can be merged in order to derive globally frequent closed itemsets. Unfortunately, such merging might produce a superset of all the frequent closed itemsets, while the associated supports could be smaller than the exact ones, because some globally frequent closed itemsets might not be locally frequent in some partitions. A post-processing

| File | Description | Type | Access | Size | Format |
|---|---|---|---|---|---|
| prod_120539-doc_128050.pdf | Distributed mining of frequent closed itemsets: some preliminary results | Pre-print | Open access | 182.16 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


