Some lessons learned using health data literature for smart information retrieval

Mario Ciampi; Giuseppe De Pietro; Elio Masciari; Stefano Silvestri
2020

Abstract

The Big Data paradigm is driving both research and industry efforts and calls for new approaches in many areas of computer science. In this paper, we show how semantic similarity search over natural language texts in the biomedical domain can be supported by Word Embedding models obtained with the word2vec algorithm, exploiting a specifically developed Big Data architecture. We tested our approach on a dataset extracted from the whole PubMed library. Moreover, we describe a user-friendly web front-end that demonstrates the usability of this methodology in a real context, which allowed us to learn some useful lessons about this peculiar kind of data.
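
As a rough illustration of the kind of similarity search the abstract refers to, the following minimal sketch ranks terms by cosine similarity between their embedding vectors. It is not the authors' implementation: the vocabulary, the 3-dimensional vectors, and the names `TOY_EMBEDDINGS`, `cosine_similarity`, and `most_similar` are all assumptions invented for this example, whereas a real word2vec model trained on PubMed text would supply learned vectors of much higher dimension.

```python
import math

# Hypothetical toy vectors, invented for illustration only.
# A trained word2vec model would provide the real embeddings.
TOY_EMBEDDINGS = {
    "aspirin": [0.9, 0.1, 0.0],
    "ibuprofen": [0.8, 0.2, 0.1],
    "fracture": [0.1, 0.9, 0.2],
    "bone": [0.0, 0.8, 0.3],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def most_similar(term, embeddings, top_n=2):
    """Rank every other term by cosine similarity to `term`."""
    query = embeddings[term]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != term
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    for neighbour, score in most_similar("aspirin", TOY_EMBEDDINGS):
        print(f"{neighbour}\t{score:.3f}")
```

With these toy vectors, the drug names rank closest to each other, as do the two bone-related terms. This exhaustive scan is only practical for a small vocabulary.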
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
English
SAC '20: Proceedings of the 35th Annual ACM Symposium on Applied Computing
931
934
4
978-1-4503-6866-7
https://dl.acm.org/doi/10.1145/3341105.3374128
ACM, Association for Computing Machinery
New York
United States of America
Yes, but type not specified
30/03/2020-03/04/2020
Brno, Czech Republic
Smart Information Retrieval Systems
Word Embeddings
Big Data Architecture
PubMed
4
reserved
Ciampi, Mario; DE PIETRO, Giuseppe; Masciari, Elio; Silvestri, Stefano
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
Files in this record:
152_1880.pdf (not available)

Type: Publisher's version (PDF)
License: NOT PUBLIC - Private/restricted access
Size: 1.56 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/364893
Citations
  • Scopus 5