kDCI: a Multi-Strategy Algorithm for Mining Frequent Sets

Orlando S; Lucchese C; Palmerini P; Perego R; Silvestri F
2003

Abstract

This paper presents the implementation of DCI++, an enhancement of DCI, a scalable algorithm for discovering frequent sets in large databases. The main contribution of DCI++ lies in a novel counting inference strategy, inspired by previously known results by Bastide et al. Moreover, multiple heuristics and efficient data structures are used to adapt the algorithm's behavior to the features of the specific dataset being mined and of the computing platform used. DCI++ turns out to be effective in mining both short and long patterns from a variety of datasets. We conducted a wide range of experiments on synthetic and real-world datasets, both in-core and out-of-core. The results allow us to state that the performance of DCI++ is not over-fitted to a special case, and that its high performance is maintained on datasets with different characteristics.
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Frequent Patterns Mining
Algorithms
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/56751