CNR Institutional Research Information System

Scalable parallel co-clustering over multiple heterogeneous data types

Francesco Paolo Folino; Gianluigi Greco; Antonella Guzzo; Luigi Pontieri

2010

Abstract

Bi-clustering, i.e., simultaneously clustering two types of objects based on their correlations, has been actively studied in the last few years, owing to its impact on several relevant applications, such as text mining, collaborative filtering, and gene expression analysis. In particular, much recent research effort has gone into extending the problem to higher-order scenarios, where more than two data types are to be clustered jointly, according to pairwise inter-type relations. A number of alternate-optimization methods have recently been developed that measure co-clustering quality as a weighted combination of the distortions over the input relations and scale linearly with the size of the data. Even linear scaling is likely to be inadequate for large-scale applications involving massive volumes of data, where high-performance solutions would be desirable. To date, however, parallel clustering approaches have been investigated in depth only for the case of one or two inter-related data types. In this paper, we address the more general (high-order) co-clustering problem by proposing a parallel implementation of an effective, state-of-the-art method, leveraging a parallel computation infrastructure that implements the popular Map-Reduce paradigm.

Year: 2010
Organizational units: Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
Language(s): English
Conference title: International Conference on High Performance Computing & Simulation (HPCS 2010)
First page: 529
Last page: 535
ISBN: 978-1-4244-6828-7
DOI: https://dx.doi.org/10.1109/HPCS.2010.5547087
URL: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5547087&url=http%3A%2F%2Fieeexplore.ieee.org%2Fstamp%2Fstamp.jsp%3Ftp%3D%26arnumber%3D5547087
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Publisher city: New York
Publisher country: United States of America
Peer reviewed: Yes, but type not specified
Conference dates: June 28, 2010 - July 2, 2010
Conference location: Caen, France
Keywords: Co-Clustering; Data Mining
Scopus ID: 2-s2.0-77956965556
Number of authors: 2
Full text: none
All authors: Francesco Paolo Folino; Gianluigi Greco; Antonella Guzzo; Luigi Pontieri
MIUR login type: 273
Type: info:eu-repo/semantics/conferenceObject
Type: 04 Conference contribution::04.01 Contribution in conference proceedings
Appears in types: 04.01 Contribution in conference proceedings

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/71014
