CNR Institutional Research Information System

Representation learning of asplenic patient data for disease risk prediction

Teresa Cappuccio (first author); Maddalena Casale; Laura Casalino; Maurizio Giordano; Marcella Vacca; Ilaria Granata (last author)

2025

Abstract

In recent years, the storage and transfer of clinical data within electronic health records (EHRs) have been a significant step forward for research, providing direct and almost instantaneous access to information useful for patient care. Despite this potential, extracting generalisable knowledge from these records remains a challenge: the data they contain are not uniform and their formats are not standardised, which hinders the direct use of EHR data for training predictive models of disease-associated risks. Access to the information contained in EHRs is nevertheless crucial for identifying issues and risk factors, and essential for developing new therapies. In this context, word embedding algorithms can help standardise and analyse clinical data within EHRs by representing medical terms as numerical vectors that capture semantic and syntactic similarities between terms. This approach facilitates the extraction of clinically meaningful patterns and can reveal hidden relationships between features that may help improve the quality of healthcare. As a concrete application, an embedding model was tested on a dataset of clinical records of asplenic patients provided by the Italian Network for Asplenia (INA), a collaborative network of more than 60 Italian hospital centres. The model was used to identify possible relationships between clinical features, predict potential issues related to the disorder, and identify the most suitable therapies for each patient.
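The idea of representing clinical terms as vectors and comparing them by similarity can be illustrated with a minimal sketch. The abstract does not specify which embedding algorithm was used, so the example below swaps in plain co-occurrence count vectors, a much simpler stand-in for a learned word embedding. The records, term names, and helper functions (`cooccurrence_vectors`, `cosine`, `most_similar`) are all hypothetical and invented for illustration; they are not drawn from the INA dataset.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy records: one list of coded terms per patient.
# Invented for illustration only; NOT taken from the INA dataset.
records = [
    ["splenectomy", "pneumococcal_vaccine", "penicillin_prophylaxis"],
    ["splenectomy", "pneumococcal_vaccine", "thrombosis"],
    ["congenital_asplenia", "penicillin_prophylaxis", "sepsis"],
    ["splenectomy", "thrombosis", "anticoagulant"],
]

# Fixed, sorted vocabulary so every vector uses the same dimensions.
vocab = sorted({term for record in records for term in record})


def cooccurrence_vectors(records, vocab):
    """Map each term to a vector counting how often it shares a record
    with every other term (a count-based stand-in for an embedding)."""
    counts = {term: Counter() for term in vocab}
    for record in records:
        # sorted(set(...)) removes duplicate terms within one record.
        for a, b in combinations(sorted(set(record)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return {term: [counts[term][other] for other in vocab] for term in vocab}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = cooccurrence_vectors(records, vocab)


def most_similar(term, k=3):
    """Return the k terms whose vectors are closest to `term` by cosine similarity."""
    scores = [(other, cosine(vectors[term], vectors[other]))
              for other in vocab if other != term]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    for neighbour, score in most_similar("splenectomy"):
        print(f"{neighbour}: {score:.2f}")
```

In a realistic setting the count vectors would typically be replaced by embeddings learned on a much larger corpus of records, but the comparison step (cosine similarity between term vectors) would look much the same.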

	Year
	
				2025
			
	Organisational units
	
				Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR - Sede Secondaria Napoli
				Istituto di genetica e biofisica "Adriano Buzzati Traverso" - IGB - Sede Napoli
			
	Keywords
	
				Asplenia, word embedding algorithm, electronic health records (EHRs), clinical data analysis, predictive models.
			
	Appears in types:
	
				04.02 Abstract in conference proceedings

Files in this product:

File	Size	Format
Abstracts_ICSA_2025.pdf (open access; type: Abstract; licence: Creative Commons)	224.66 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/568101
