Missing data imputation in epidemiology: a comparison between MICE and Machine Learning methods

Mahmoud Hashoush, Emmanuelle Cadot, Franco Alberto Cardillo
2026

Abstract

Missing data are a challenge in large-scale epidemiological studies because, when not handled appropriately, they can introduce substantial bias in the final estimates. Handling missing values properly matters both for the correct assignment of cases and for the characterisation of risk factors. In this study, we present a robust experimental comparison between Multiple Imputation by Chained Equations (MICE) and several machine learning (ML) based imputation approaches applied to Ecuadorian birth data. We assess their performance and discuss their respective strengths and limitations in an epidemiological context.
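The kind of comparison described above can be illustrated with a toy experiment. The sketch below is not the authors' pipeline and uses no real data: it generates synthetic values, hides some of them completely at random, imputes them with three simple strategies, and scores each strategy against the hidden truth. The variable roles, the 20% missingness rate, the neighbour count, and the RMSE metric are all assumptions made for illustration. Regression imputation stands in for the conditional model that chained equations iterate over (proper MICE also draws from the predictive distribution and produces several imputed datasets), and a nearest-neighbour imputer stands in for an ML-based approach.

```python
import math
import random


def make_data(n=300, seed=0):
    """Synthetic (x, y) pairs: y depends linearly on x plus Gaussian noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(30.0, 42.0)                     # hypothetical covariate
        y = 100.0 * x - 500.0 + rng.gauss(0.0, 150.0)   # hypothetical outcome
        rows.append((x, y))
    return rows


def mask_mcar(rows, rate=0.2, seed=1):
    """Choose rows whose y is hidden, completely at random (assumed MCAR)."""
    rng = random.Random(seed)
    return {i for i in range(len(rows)) if rng.random() < rate}


def impute_mean(rows, missing):
    """Baseline: replace every hidden y with the mean of the observed y."""
    observed = [y for i, (_, y) in enumerate(rows) if i not in missing]
    mean_y = sum(observed) / len(observed)
    return {i: mean_y for i in missing}


def impute_regression(rows, missing):
    """Least-squares fit of y on x over observed rows, then predict hidden y."""
    obs = [(x, y) for i, (x, y) in enumerate(rows) if i not in missing]
    mx = sum(x for x, _ in obs) / len(obs)
    my = sum(y for _, y in obs) / len(obs)
    sxx = sum((x - mx) ** 2 for x, _ in obs)
    sxy = sum((x - mx) * (y - my) for x, y in obs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return {i: intercept + slope * rows[i][0] for i in missing}


def impute_knn(rows, missing, k=5):
    """Average y over the k observed rows whose x is closest."""
    obs = [(x, y) for i, (x, y) in enumerate(rows) if i not in missing]
    imputed = {}
    for i in missing:
        x0 = rows[i][0]
        nearest = sorted(obs, key=lambda r: abs(r[0] - x0))[:k]
        imputed[i] = sum(y for _, y in nearest) / k
    return imputed


def rmse(rows, imputed):
    """Root mean squared error between imputed and true hidden values."""
    total = sum((rows[i][1] - v) ** 2 for i, v in imputed.items())
    return math.sqrt(total / len(imputed))


rows = make_data()
missing = mask_mcar(rows)
scores = {
    "mean": rmse(rows, impute_mean(rows, missing)),
    "regression": rmse(rows, impute_regression(rows, missing)),
    "knn": rmse(rows, impute_knn(rows, missing)),
}
for name, value in scores.items():
    print(f"{name}: RMSE = {value:.1f}")
```

Because the synthetic outcome depends strongly on the covariate, both model-based imputers should recover the hidden values more closely than the mean baseline. Real epidemiological data would call for mixed variable types, missingness mechanisms other than completely at random, and evaluation of downstream estimates rather than point error alone.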
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Mahmoud Hashoush en
dc.authority.people Emmanuelle Cadot en
dc.authority.people Franco Alberto Cardillo en
dc.authority.project corda_____he::86c21b1aa82d5bdc53411947d7ebd9f8 en
dc.collection.id.s 2e1a85b5-484d-45dd-a997-50e67e31babd *
dc.collection.name 04.05 Poster/Abstract not published in conference proceedings *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.area Not assigned *
dc.date.firstsubmission 2026/03/04 11:36:34 *
dc.date.issued 2026 -
dc.date.submission 2026/03/04 11:36:34 *
dc.description.allpeople Hashoush, Mahmoud; Cadot, Emmanuelle; Cardillo, Franco Alberto -
dc.description.allpeopleoriginal Mahmoud Hashoush, Emmanuelle Cadot, Franco Alberto Cardillo en
dc.description.fulltext none en
dc.description.numberofauthors 3 -
dc.identifier.source manual *
dc.identifier.uri https://hdl.handle.net/20.500.14243/570982 -
dc.language.iso eng en
dc.relation.conferencename EGU General Assembly 2026 en
dc.relation.projectAcronym STARWARS en
dc.relation.projectAwardNumber 101086252 en
dc.relation.projectAwardTitle STormwAteR and WastewAteR networkS heterogeneous data AI-driven management en
dc.relation.projectFunderName European Commission en
dc.relation.projectFundingStream Horizon Europe Framework Programme en
dc.subject.keywordseng Missing data imputation, machine learning -
dc.subject.singlekeyword Missing data imputation *
dc.subject.singlekeyword machine learning *
dc.title Missing data imputation in epidemiology: a comparison between MICE and Machine Learning methods en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Conference contribution::04.05 Poster/Abstract not published in conference proceedings it
dc.type.miur -2 -
iris.orcid.lastModifiedDate 2026/03/04 11:36:34 *
iris.orcid.lastModifiedMillisecond 1772620594236 *
iris.sitodocente.maxattempts 1 -
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/570982