Missing data represents a challenge in large-scale epidemiological studies as it can introduce a strong and negative bias in the final estimates when not handled appropriately. Addressing missing values is considered important for the correct assignment of cases from one hand and the characterisation of risk factors from another. In this study, we present a robust experimental comparison between MICE and several ML-based imputation approaches applied to the Ecuadorian birth data. We assess their performance and discuss the respective strengths and limitations within an epidemiological context.
Missing data imputation in epidemiology: a comparison between MICE and Machine Learning methods
Franco Alberto Cardillo
2026
Abstract
Missing data represents a challenge in large-scale epidemiological studies as it can introduce a strong and negative bias in the final estimates when not handled appropriately. Addressing missing values is considered important for the correct assignment of cases from one hand and the characterisation of risk factors from another. In this study, we present a robust experimental comparison between MICE and several ML-based imputation approaches applied to the Ecuadorian birth data. We assess their performance and discuss the respective strengths and limitations within an epidemiological context.File in questo prodotto:
Non ci sono file associati a questo prodotto.
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


