CNR Institutional Research Information System

Clustering is one of the most important unsupervised learning problems and it deals with finding a structure in a collection of unlabeled data; however, different clustering algorithms applied to the same data-set produce different solutions. In many applications the problem of multiple solutions becomes crucial and providing a limited group of good clusterings is often more desirable than a single solution. In this work we propose the Least Square Consensus clustering that allows a user to extrapolate a small number of different clustering solutions from an initial (large) set of solutions obtained by applying any clustering algorithm to a given data-set. Two different implementations are presented. In both cases, each consensus is accomplished with a measure of quality defined in terms of Least Square error and a graphical visualization is provided in order to make immediately interpretable the result. Numerical experiments are carried out on both synthetic and real data-sets. © 2010 Springer-Verlag.

Multiple clustering solutions analysis through least-squares consensus algorithms

Murino;La;Angelini;Cb;Bifulco;Ia;De Feis;Ib;Raiconi;Ga;Tagliaferri;Ra

2010

Abstract

Clustering is one of the most important unsupervised learning problems and it deals with finding a structure in a collection of unlabeled data; however, different clustering algorithms applied to the same data-set produce different solutions. In many applications the problem of multiple solutions becomes crucial and providing a limited group of good clusterings is often more desirable than a single solution. In this work we propose the Least Square Consensus clustering that allows a user to extrapolate a small number of different clustering solutions from an initial (large) set of solutions obtained by applying any clustering algorithm to a given data-set. Two different implementations are presented. In both cases, each consensus is accomplished with a measure of quality defined in terms of Least Square error and a graphical visualization is provided in order to make immediately interpretable the result. Numerical experiments are carried out on both synthetic and real data-sets. © 2010 Springer-Verlag.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2010
			
	Strutture organizzative
	
				Istituto Applicazioni del Calcolo ''Mauro Picone''
			
	Parole chiave
	
				Clustering solutions
Clusterings
Consensus algorithms
Consensus clustering
Data sets
Graphical visualization
Least Square
Least square errors
Multiple clusterings
Multiple solutions
Numerical experiments
Synthetic and real data
Unlabeled data
Artificial intelligence
Bioinformatics
Cluster analysis
Visualization
Clustering algorithms
			
	Appare nelle tipologie:
	
				01.01 Articolo in rivista

File in questo prodotto:

Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/19561

Citazioni

ND

0

ND

social impact