Health Data Information Retrieval For Improved Simulation

Mario Ciampi;Giuseppe De Pietro;Stefano Silvestri
2020

Abstract

In this paper we propose an architecture specifically devoted to the analysis of very large collections of natural-language biomedical text, with the purpose of searching for semantic similarity in order to obtain useful hints for effective simulation that could help physicians in diagnosis tasks. We leverage Word Embedding models trained with the word2vec algorithm and a Big Data architecture for their processing and management. We performed some preliminary analyses using a dataset extracted from the whole PubMed library, and we developed a web front-end to show the usability of this methodology in a real context.
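
As a rough illustration of what searching for semantic similarity over word embeddings can look like, the following minimal sketch ranks terms by cosine similarity to a query term. It is not the authors' implementation: the vectors are hand-written toy values standing in for word2vec embeddings, and the names `TOY_EMBEDDINGS`, `cosine_similarity`, and `most_similar` are hypothetical and introduced only for this example.

```python
import math

# Toy 3-dimensional vectors standing in for word2vec embeddings.
# ASSUMPTION: these values are invented for illustration; they do not come
# from the paper, from PubMed, or from any trained model.
TOY_EMBEDDINGS = {
    "fever": [0.9, 0.1, 0.0],
    "pyrexia": [0.85, 0.15, 0.05],
    "fracture": [0.0, 0.2, 0.95],
    "aspirin": [0.4, 0.9, 0.1],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(term, embeddings, top_n=3):
    """Rank every other term in `embeddings` by cosine similarity to `term`."""
    if term not in embeddings:
        raise KeyError(f"term not in vocabulary: {term}")
    query = embeddings[term]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != term
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    for word, score in most_similar("fever", TOY_EMBEDDINGS):
        print(f"{word}\t{score:.3f}")
```

With real embeddings the dictionary would be replaced by vectors loaded from a trained model, and a linear scan like this one would likely need to give way to an indexed or distributed search at the scale the abstract describes.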
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
978-1-7281-6582-0
Medical Information Retrieval
Big Data Architecture
Semantic Search
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/383186