Co-clustering is a specific type of clustering that addresses the problem of simultaneously clustering objects and attributes of a data matrix. Although general clustering techniques find non-overlapping co-clusters, finding possible overlaps between co-clusters can reveal embedded patterns in the data that the disjoint clusters cannot discover. The overlapping co-clustering approaches proposed in the literature focus on finding global overlapped co-clusters and they might overlook interesting local patterns that are not necessarily identified as global co-clusters. Discovering such local co-clusters increases the granularity of the analysis, and therefore more specific patterns can be captured. This is the objective of the present paper, which proposes the new Overlapped Co-Clustering (OCoClus) method for finding overlapped co-clusters on binary data, including both global and local patterns. This is a non-exhaustive method based on the co-occurrence of attributes and objects in the data. Another novelty of this method is that it is driven by an objective cost function that can automatically determine the number of co-clusters. We evaluate the proposed approach on publicly available datasets, both real and synthetic data, and compare the results with a number of baselines. Our approach shows better results than the baseline methods on synthetic data and demonstrates its efficacy in real data.

A co-occurrence based approach for mining overlapped co-clusters in binary data

Perego R;Renso C;
2021

Abstract

Co-clustering is a specific type of clustering that addresses the problem of simultaneously clustering objects and attributes of a data matrix. Although general clustering techniques find non-overlapping co-clusters, finding possible overlaps between co-clusters can reveal embedded patterns in the data that the disjoint clusters cannot discover. The overlapping co-clustering approaches proposed in the literature focus on finding global overlapped co-clusters and they might overlook interesting local patterns that are not necessarily identified as global co-clusters. Discovering such local co-clusters increases the granularity of the analysis, and therefore more specific patterns can be captured. This is the objective of the present paper, which proposes the new Overlapped Co-Clustering (OCoClus) method for finding overlapped co-clusters on binary data, including both global and local patterns. This is a non-exhaustive method based on the co-occurrence of attributes and objects in the data. Another novelty of this method is that it is driven by an objective cost function that can automatically determine the number of co-clusters. We evaluate the proposed approach on publicly available datasets, both real and synthetic data, and compare the results with a number of baselines. Our approach shows better results than the baseline methods on synthetic data and demonstrates its efficacy in real data.
2021
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
978-3-030-91702-9
Clustering
Trajectories
Co-clustering data mining
File in questo prodotto:
File Dimensione Formato  
prod_465551-doc_182837.pdf

solo utenti autorizzati

Descrizione: A Co-occurrence Based Approach for Mining Overlapped Co-clusters in Binary Data
Tipologia: Versione Editoriale (PDF)
Dimensione 674.01 kB
Formato Adobe PDF
674.01 kB Adobe PDF   Visualizza/Apri   Richiedi una copia
prod_465551-doc_182911.pdf

accesso aperto

Descrizione: Postprint - A Co-occurrence Based Approach for Mining Overlapped Co-clusters in Binary Data
Tipologia: Versione Editoriale (PDF)
Dimensione 577.01 kB
Formato Adobe PDF
577.01 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/443667
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact