Multi-sorted inverse frequent itemsets mining for generating realistic no-SQL datasets

Rullo A.
2021

Abstract

Developing novel platforms and techniques for emerging “Big Data” applications requires real-life datasets for data-driven experiments. In most cases, however, such datasets are not accessible, for reasons such as confidentiality, privacy, or simply insufficient availability. One way to still obtain high-quality experimental findings is to synthesize datasets that reflect the patterns of real ones. A promising approach relies on inverse mining techniques such as inverse frequent itemset mining (IFM), which generates a transactional dataset that satisfies given support constraints on the itemsets of an input set, typically the frequent and infrequent ones. This paper describes an extension of IFM that considers more structured schemes for the datasets to be generated, as required in emerging big data applications such as social network analytics.
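To make the notion of support constraints concrete, the following minimal sketch checks whether a candidate transactional dataset satisfies a set of constraints. It is an illustration only, not the method of the paper: it assumes that the support of an itemset is the number of transactions containing it and that each constraint is a closed interval on that count. The names `support`, `satisfies`, `constraints`, and `candidate` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]


def support(dataset: List[Itemset], itemset: Itemset) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for transaction in dataset if itemset <= transaction)


def satisfies(dataset: List[Itemset],
              constraints: Dict[Itemset, Tuple[int, int]]) -> bool:
    """Return True if every itemset's support lies in its [low, high] interval."""
    return all(low <= support(dataset, itemset) <= high
               for itemset, (low, high) in constraints.items())


# Assumed toy constraints (not taken from the paper):
constraints = {
    frozenset({"a", "b"}): (2, 3),  # a "frequent" itemset: at least 2 occurrences
    frozenset({"a", "c"}): (0, 1),  # an "infrequent" itemset: at most 1 occurrence
}

# A candidate dataset that an IFM procedure might output for these constraints.
candidate = [
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
    frozenset({"c"}),
]

print(satisfies(candidate, constraints))
```

IFM itself is the inverse task: instead of checking a given dataset, it searches for a dataset on which such a check succeeds.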
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
IFM
Itemset mining
No-SQL
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/534019