Distributed mining of frequent closed itemsets: some preliminary results

Orlando S; Lucchese C; Perego R
2005

Abstract

In this paper we address the problem of mining frequent closed itemsets in a distributed setting. We consider an environment where a transactional dataset is horizontally partitioned and stored at different sites. We assume that, due to the huge size of the datasets and to privacy concerns, dataset partitions cannot be moved to a centralized site to materialize the whole dataset and perform the mining task there. Thus it becomes mandatory to perform separate mining on each site, and then merge the local results to derive global knowledge. This paper shows how frequent closed itemsets, mined independently at each site, can be merged in order to derive globally frequent closed itemsets. Unfortunately, such merging might produce a superset of all the frequent closed itemsets, while the associated supports could be smaller than the exact ones, because some globally frequent closed itemsets might not be locally frequent in some partition. A post-processing
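The support underestimation mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy partitions `site1` and `site2`, the absolute local threshold `MIN_COUNT`, the brute-force miner, and the merge rule (candidates are the local closed itemsets plus their non-empty pairwise intersections; a candidate's estimated support is the sum, over sites, of the largest support among the local closed itemsets that contain it, or 0 when it is not locally frequent there).

```python
from itertools import combinations

def frequent_closed(transactions, min_count):
    """Brute-force miner for one partition: {closed itemset: support}."""
    items = sorted(set().union(*transactions))
    support = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            itemset = frozenset(combo)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_count:
                support[itemset] = count
    # Closed = no proper superset with the same support.
    return {s: c for s, c in support.items()
            if not any(s < o and c == support[o] for o in support)}

def local_support(itemset, closed):
    """Support recovered from local closed sets only; 0 if not locally frequent."""
    return max((c for s, c in closed.items() if itemset <= s), default=0)

def merge(local_results):
    """Hypothetical merge rule (an assumption, see the lead-in)."""
    candidates = set()
    for closed in local_results:
        candidates |= set(closed)
    # Add non-empty pairwise intersections of the collected itemsets.
    for x, y in combinations(list(candidates), 2):
        if x & y:
            candidates.add(x & y)
    return {c: sum(local_support(c, closed) for closed in local_results)
            for c in candidates}

# Toy horizontally partitioned dataset (illustrative data only).
site1 = [{"a", "b"}, {"a", "b"}, {"a", "c"}]
site2 = [{"a", "c"}, {"b", "c"}, {"b", "c"}]
MIN_COUNT = 2  # assumed absolute local threshold

estimates = merge([frequent_closed(site1, MIN_COUNT),
                   frequent_closed(site2, MIN_COUNT)])

def true_support(itemset):
    """Exact support on the full dataset, for comparison only."""
    return sum(1 for t in site1 + site2 if itemset <= t)

for itemset in sorted(estimates, key=sorted):
    print(sorted(itemset), estimates[itemset], true_support(itemset))
```

In this toy run the itemset {a} gets an estimated support of 3 against an exact support of 4, because {a} occurs only once at the second site and is therefore not locally frequent there. The itemset {b}, which is not closed at either site, is recovered as an intersection with its exact support of 4.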
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
H.2.8 Database Applications
Frequent Closed Itemsets Mining
Files in this record:
prod_120539-doc_128050.pdf
Description: Distributed mining of frequent closed itemsets: some preliminary results
Type: Preprint
Access: open access
Format: Adobe PDF
Size: 182.16 kB

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/97378