Exploratory analysis of longitudinal data of patients with dementia through unsupervised techniques

Ribino P. (first author); Di Napoli C.; Paragliola G.; Serino L.
2023

Abstract

Dementia is a set of mental diseases affecting millions of people worldwide. As with other mental health conditions, it is often difficult to forecast how the disease will progress in a given patient. In this context, data on patients with mental health conditions are usually collected through questionnaires and psychological and cognitive tests at several timepoints. Such longitudinal data can help identify disease trajectories and allow medical doctors to plan specific treatments. In this study, we analyze OASIS-2, an open, unrestricted dataset of electronic health records (EHRs) of patients suffering from dementia, with several unsupervised machine learning methods (K-means, Hierarchical Clustering, Gaussian Mixture Model, and Spectral Clustering). The dataset contains demographic data and psychological test data collected over five independent visits, with 142 patients at the first visit and ten features. Our goal is to identify patient clusters that stay stable over the first four visits (we discarded the data of the fifth visit because of its small size), and then to characterize these clusters by studying their variables. We also measure the performance of the clustering methods through conventional metrics for internal and external validation. Our preliminary results show that unsupervised techniques can identify significant clusters of patients with mental health issues in this dataset and that Hierarchical Clustering outperforms the other algorithms for this purpose.
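The abstract does not say how cluster stability across visits is quantified. As an illustrative sketch only, one common way to compare two cluster assignments of the same patients is the Rand index, the fraction of patient pairs on which the two assignments agree. The choice of this index, the function name `rand_index`, and the label lists below are assumptions made for illustration and are not taken from the paper or from OASIS-2.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of patient pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two
    patients together, or both put them apart. Cluster ids themselves
    do not matter, only the grouping.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same patients")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two patients")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical cluster labels for the same six patients at two visits
# (made-up values, not data from the study).
visit_1 = [0, 0, 0, 1, 1, 1]
visit_2 = [1, 1, 1, 0, 0, 1]  # cluster ids swapped; last patient moved

print(rand_index(visit_1, visit_1))  # identical clusterings
print(rand_index(visit_1, visit_2))  # partial agreement
```

A value of 1 means the grouping is unchanged between the two visits, regardless of how the clusters are numbered; lower values mean more patients changed groups. Chance-corrected variants such as the adjusted Rand index are often preferred when cluster sizes are unbalanced.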
Year: 2023
Affiliation: Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR - Sede Secondaria Napoli
Keywords: clustering; electronic health records; dementia; mental health; older adults; unsupervised machine learning
Files in this item:
File: AIxAS_2023_paper_8.pdf (not available)
Type: Post-print document
License: Other type of license
Size: 485.52 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/519533