A Federated K-Means-Based Approach in eHealth Domains with Heterogeneous Data Distributions
Paragliola G.; Ribino P.; Mannone M.
2024
Abstract
Healthcare organizations collect and store significant amounts of patient health information. However, sharing or accessing this information outside their facilities is often hindered by factors such as privacy concerns. Federated Learning (FL) data systems are emerging to overcome the siloed nature of health data and the barriers to sharing it. While federated approaches have been extensively studied, especially for classification problems, clustering-oriented approaches remain relatively few and less widespread, both in the formulation of algorithms and in their application to eHealth domains. The primary objective of this paper is to introduce a federated K-means-based approach for clustering tasks within the healthcare domain and to explore the impact of heterogeneous health data distributions. The proposed federated K-means approach has been evaluated on several health-related datasets by comparing it with the centralized version and by estimating the trade-off between privacy and performance. The preliminary findings suggest that, in the case of heterogeneous health data distributions, the difference between the centralized and federated approaches is marginal, with the federated approach outperforming the centralized one on some healthcare datasets.
File | Type | License | Size | Format
---|---|---|---|---
129812.pdf (not available) | Post-print document | Other type of license | 303.34 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.