Health Data Information Retrieval For Improved Simulation
Mario Ciampi;Giuseppe De Pietro;Stefano Silvestri
2020
Abstract
In this paper we propose an architecture specifically devoted to the analysis of very large natural language biomedical text collections. Its purpose is to search for semantic similarity in order to obtain useful hints for effective simulation that could help physicians in diagnosis tasks. We leverage Word Embedding models trained with the word2vec algorithm, together with a Big Data architecture for their processing and management. We performed some preliminary analyses using a dataset extracted from the whole PubMed library, and we developed a web front-end to show the usability of this methodology in a real context.
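As an illustration of the kind of semantic similarity search mentioned in the abstract, the following minimal sketch ranks terms by the cosine similarity of their vectors. It is not the authors' implementation: the vocabulary, the three-dimensional vectors, and the helper names `cosine_similarity` and `most_similar` are assumptions made for this example, standing in for embeddings that a word2vec model trained on PubMed text would provide.

```python
import math

# Toy 3-dimensional vectors standing in for trained word2vec embeddings.
# The terms and values are invented for illustration only.
EMBEDDINGS = {
    "fever": [0.9, 0.1, 0.0],
    "pyrexia": [0.8, 0.2, 0.1],
    "fracture": [0.0, 0.9, 0.4],
    "cough": [0.6, 0.1, 0.5],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(term, embeddings, top_n=2):
    """Rank every other term by cosine similarity to `term`."""
    query = embeddings[term]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != term
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# With these toy vectors, "pyrexia" ranks closest to "fever".
print(most_similar("fever", EMBEDDINGS))
```

In a real setting the dictionary would be replaced by the vectors of a trained model, and the exhaustive scan by an index suited to the vocabulary size.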