We present an exact and highly scalable sampling algorithm that can be used as an alternative to exhaustive local pattern discovery methods. It samples patterns according to their frequency of occurrence and can substantially improve efficiency and controllability of the pattern discovery processes. While previous sampling approaches mainly rely on the Markov chain Monte Carlo method, our procedure is direct, i.e. a non process-simulating sampling algorithm. The ad- vantages of this direct method are an almost optimal time complexity per pattern as well as an exactly controlled distribution of the produced pat- terns. In addition we present experimental results which demonstrate that these procedures can improve the accuracy of pattern-based models similar to frequent sets and often also lead to substantial gains in terms of scalability

Direct pattern sampling with respect to pattern frequency

Lucchese C;
2011

Abstract

We present an exact and highly scalable sampling algorithm that can be used as an alternative to exhaustive local pattern discovery methods. It samples patterns according to their frequency of occurrence and can substantially improve efficiency and controllability of the pattern discovery processes. While previous sampling approaches mainly rely on the Markov chain Monte Carlo method, our procedure is direct, i.e. a non process-simulating sampling algorithm. The ad- vantages of this direct method are an almost optimal time complexity per pattern as well as an exactly controlled distribution of the produced pat- terns. In addition we present experimental results which demonstrate that these procedures can improve the accuracy of pattern-based models similar to frequent sets and often also lead to substantial gains in terms of scalability
2011
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Frequent pattern mining
File in questo prodotto:
File Dimensione Formato  
prod_206862-doc_99876.pdf

non disponibili

Descrizione: Direct pattern sampling with respect to pattern frequency
Dimensione 477.68 kB
Formato Adobe PDF
477.68 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/181582
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact