Soft clustering for information retrieval applications

Bordogna, Gloria
2011

Abstract

This paper surveys soft clustering algorithms applied to information retrieval (IR). It first discusses why soft clustering approaches are useful in IR. It then outlines the two main flat soft approaches, namely probabilistic clustering and fuzzy clustering. Specifically, it introduces the expectation maximization and fuzzy c-means algorithms, together with some of the extensions defined to overcome their main drawbacks when organizing large document collections. Finally, it introduces soft hierarchical clustering algorithms designed for generating taxonomies of documents. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1: 138-146. DOI: 10.1002/widm.3
Istituto per la Dinamica dei Processi Ambientali - IDPA - Sede Venezia
soft clustering
fuzzy clustering

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/341989