Combining syntactic and semantic vector space models in the health domain by using a clustering ensemble

Gargiulo, Francesco
2013

Abstract

The adoption of services for automatic information management is one of the most interesting open problems in various professional and social fields. We focus on the health domain, which is characterized by the production of a huge volume of documents and in which innovative information management systems can significantly improve both the tasks performed by the actors involved and the quality of the health services offered. In this work we propose a methodology for automatic document categorization based on unsupervised learning techniques. We extract both semantic and syntactic features to define the vector space models, and we propose a clustering ensemble to increase the discriminative power of the approach. Results on real medical records, digitized by means of a state-of-the-art OCR technique, demonstrate the effectiveness of the proposed approach.
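
The idea of combining clusterings obtained from different vector space models can be illustrated with a minimal sketch. The abstract does not say which combination rule the authors used, so the co-association (evidence accumulation) scheme below is an assumption chosen for illustration, and the three base partitions are hypothetical labels standing in for clusterings of six documents under syntactic and semantic features.

```python
from itertools import combinations


def co_association(partitions):
    """Count, for each pair of documents, how many base partitions group them together."""
    n = len(partitions[0])
    votes = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                votes[i][j] += 1
                votes[j][i] += 1
    return votes


def consensus(partitions, threshold=0.5):
    """Link documents co-clustered in more than `threshold` of the partitions,
    then return the connected components as consensus cluster labels."""
    n = len(partitions[0])
    m = len(partitions)
    votes = co_association(partitions)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if votes[i][j] / m > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical base clusterings of six documents (not data from the paper).
partitions = [
    [0, 0, 1, 1, 2, 2],  # e.g. from one syntactic feature space
    [0, 0, 0, 1, 2, 2],  # e.g. from a second syntactic feature space
    [1, 1, 0, 0, 0, 2],  # e.g. from a semantic feature space
]
labels = consensus(partitions)
print(labels)
```

Each base partition disagrees with the others on one document, yet majority voting over document pairs recovers three consistent groups, which is the sense in which an ensemble can be more discriminative than any single feature space.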
2013
English
HEALTHINF2013 - 6th International Conference on Health Informatics
382
385
9789898565372
http://www.scopus.com/record/display.url?eid=2-s2.0-84877972740&origin=inward
Yes, but type not specified
11-14/02/2013
Barcelona, Spain
Clustering ensemble
Semantic processing
5
none
Amato, Flora; Gargiulo, Francesco; Mazzeo, Antonino; Romano, Sara; Sansone, Carlo
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/321786
Citations
  • Scopus 9