Topical Cluster Discovery in Semistructured Healthcare Data

Gianni Costa; Riccardo Ortale
2018

Abstract

We propose an approach to clustering XML-based corpora of healthcare documents by latent topic similarity. The approach has two steps. First, the latent topic distributions of the input healthcare documents are inferred by collapsed Gibbs sampling and parameter estimation under an XML topic model. Second, the inferred distributions are grouped using established clustering techniques.
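
The two steps can be illustrated with a minimal sketch. This record does not specify the XML topic model, so the sketch substitutes plain latent Dirichlet allocation (LDA) over bag-of-words documents, and it uses k-means as one example of an established clustering technique. The function names `gibbs_lda` and `kmeans`, the hyperparameters `alpha` and `beta`, and the toy corpus are all assumptions made for illustration; they are not taken from the paper.

```python
import random

def gibbs_lda(docs, n_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for plain LDA (a stand-in, NOT the paper's XML topic model).

    docs is a list of documents, each a list of integer word ids.
    Returns one topic distribution (list of floats) per document.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # words of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # occurrences of word w in topic k
    topic_total = [0] * n_topics                              # total words in topic k
    assign = []
    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Full conditional of the topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Parameter estimation: smoothed topic proportions from the final counts.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]

def kmeans(points, k, iters=50, seed=0):
    """Basic k-means with squared Euclidean distance; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy corpus: word ids 0-2 and 3-5 stand for two distinct vocabularies.
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3], [4, 5, 3, 5, 4]]
theta = gibbs_lda(docs, n_topics=2, vocab_size=6)   # step 1: infer topic distributions
labels = kmeans(theta, k=2)                         # step 2: cluster the distributions
```

Clustering the inferred topic distributions instead of the raw text means documents are grouped by what they are about, not by surface word overlap. A model that also accounts for XML structure, as the abstract describes, would replace `gibbs_lda` in this sketch.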
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
English
International Workshop on Social Media Analytics for Healthcare (@ IEEE/WIC/ACM International Conference on Web Intelligence 2018)
4
Yes, but type not specified
03/12/2018
Topical Clusters
Semistructured Healthcare Data Analysis
2
none
Costa, Gianni; Ortale, Riccardo
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/353495