Inspired by emerging multi-core computer architectures, in this paper we present MT CL O S E D, a multi-threaded algorithm for frequent closed itemset mining (FCIM). To the best of our knowledge, this is the first FCIM parallel algorithm proposed so far. We studied how different duplicate checking techniques, typical of FCIM algorithms, may affect this parallelization. We showed that only one of them allows to decompose the global FCIM problem into independent tasks that can be executed in any order, and thus in parallel. Finally we show how MT CL O S E D efficiently harness modern CPUs. We designed and tested several parallelization paradigms by investigating static/dynamic decomposition and scheduling of tasks, thus showing its scalability w.r.t. to the number of CPUs. We analyzed the cache friendliness of the algorithm. Finally, we provided additional speed-up by introducing SIMD extensions.

Parallel mining of frequent closed patterns: harnessing modern computer architectures

Lucchese C;Orlando S;Perego R
2007

Abstract

Inspired by emerging multi-core computer architectures, in this paper we present MT CL O S E D, a multi-threaded algorithm for frequent closed itemset mining (FCIM). To the best of our knowledge, this is the first FCIM parallel algorithm proposed so far. We studied how different duplicate checking techniques, typical of FCIM algorithms, may affect this parallelization. We showed that only one of them allows to decompose the global FCIM problem into independent tasks that can be executed in any order, and thus in parallel. Finally we show how MT CL O S E D efficiently harness modern CPUs. We designed and tested several parallelization paradigms by investigating static/dynamic decomposition and scheduling of tasks, thus showing its scalability w.r.t. to the number of CPUs. We analyzed the cache friendliness of the algorithm. Finally, we provided additional speed-up by introducing SIMD extensions.
2007
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
978-0-7695-3018-5
H.2.8 Database Applications
Data mining
Parallel algorithms
File in questo prodotto:
File Dimensione Formato  
prod_91675-doc_24809.pdf

solo utenti autorizzati

Descrizione: articolo pubblicato
Tipologia: Versione Editoriale (PDF)
Dimensione 287.42 kB
Formato Adobe PDF
287.42 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/102634
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 25
  • ???jsp.display-item.citation.isi??? 15
social impact