Characterization of synthetic health data using rule-based artificial intelligence models

Lenatti Marta (co-first author); Paglialonga Alessia (co-first author); Ferretti Melissa (co-first author); Mongelli Maurizio (co-first author)
2023

Abstract

The aim of this study is to apply and characterize eXplainable AI (XAI) to assess the quality of synthetic health data generated using a data augmentation algorithm. In this exploratory study, several synthetic datasets are generated using various configurations of a conditional Generative Adversarial Network (GAN) from a set of 156 observations related to adult hearing screening. A rule-based native XAI algorithm, the Logic Learning Machine, is used in combination with conventional utility metrics. The classification performance in different conditions is assessed: models trained and tested on synthetic data, models trained on synthetic data and tested on real data, and models trained on real data and tested on synthetic data. The rules extracted from real and synthetic data are then compared using a rule similarity metric. The results indicate that XAI may be used to assess the quality of synthetic data by (i) the analysis of classification performance and (ii) the analysis of the rules extracted on real and synthetic data (number, covering, structure, cut-off values, and similarity). These results suggest that XAI can be used in an original way to assess synthetic health data and extract knowledge about the mechanisms underlying the generated data.
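The three evaluation conditions named in the abstract (train and test on synthetic data, train on synthetic and test on real, train on real and test on synthetic) can be illustrated with a minimal sketch. This is not the Logic Learning Machine or the GAN pipeline used in the study: it swaps in a single-threshold rule of the form "IF x > cutoff THEN class 1" learned by exhaustive search, and the feature values, labels, and all function names below are made up for illustration only.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, int]  # (hypothetical feature value, binary class label)


def accuracy(cutoff: float, data: List[Sample]) -> float:
    """Accuracy of the rule 'IF x > cutoff THEN class 1 ELSE class 0'."""
    hits = sum((x > cutoff) == bool(y) for x, y in data)
    return hits / len(data)


def fit_rule(train: List[Sample]) -> float:
    """Pick the cut-off (midpoint between distinct values) with best training accuracy."""
    values = sorted({x for x, _ in train})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return max(candidates, key=lambda t: accuracy(t, train))


def evaluate_conditions(real: List[Sample], synthetic: List[Sample]) -> Dict[str, float]:
    """Compare rules learned on real and synthetic data across the train/test conditions."""
    cutoff_real = fit_rule(real)
    cutoff_syn = fit_rule(synthetic)
    return {
        "cutoff_real": cutoff_real,
        "cutoff_synthetic": cutoff_syn,
        # train on synthetic, test on synthetic
        "train_syn_test_syn": accuracy(cutoff_syn, synthetic),
        # train on synthetic, test on real
        "train_syn_test_real": accuracy(cutoff_syn, real),
        # train on real, test on synthetic
        "train_real_test_syn": accuracy(cutoff_real, synthetic),
        # a crude stand-in for comparing rule cut-off values
        "cutoff_gap": abs(cutoff_real - cutoff_syn),
    }


# Toy data, invented for this sketch (not taken from the study).
real_data = [(20, 0), (25, 0), (30, 0), (45, 1), (50, 1), (60, 1)]
synthetic_data = [(22, 0), (28, 0), (41, 0), (44, 1), (48, 1), (55, 1)]

results = evaluate_conditions(real_data, synthetic_data)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

In this toy setting the rule learned on real data misclassifies one synthetic sample, and the two cut-offs differ. Discrepancies of this kind, in both classification performance and rule cut-off values, are what the abstract proposes to examine, alongside the number, covering, structure, and similarity of the rules.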
DC Field Value Language
dc.authority.ancejournal IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS en
dc.authority.orgunit Istituto di Elettronica e di Ingegneria dell'Informazione e delle Telecomunicazioni - IEIIT en
dc.authority.people Lenatti Marta en
dc.authority.people Paglialonga Alessia en
dc.authority.people Orani Vanessa en
dc.authority.people Ferretti Melissa en
dc.authority.people Mongelli Maurizio en
dc.collection.id.s b3f88f24-048a-4e43-8ab1-6697b90e068e *
dc.collection.name 01.01 Articolo in rivista *
dc.contributor.appartenenza Istituto di Elettronica e di Ingegneria dell'Informazione e delle Telecomunicazioni - IEIIT *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 877 *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/20 04:47:53 -
dc.date.available 2024/02/20 04:47:53 -
dc.date.issued 2023 -
dc.description.abstracteng The aim of this study is to apply and characterize eXplainable AI (XAI) to assess the quality of synthetic health data generated using a data augmentation algorithm. In this exploratory study, several synthetic datasets are generated using various configurations of a conditional Generative Adversarial Network (GAN) from a set of 156 observations related to adult hearing screening. A rule-based native XAI algorithm, the Logic Learning Machine, is used in combination with conventional utility metrics. The classification performance in different conditions is assessed: models trained and tested on synthetic data, models trained on synthetic data and tested on real data, and models trained on real data and tested on synthetic data. The rules extracted from real and synthetic data are then compared using a rule similarity metric. The results indicate that XAI may be used to assess the quality of synthetic data by (i) the analysis of classification performance and (ii) the analysis of the rules extracted on real and synthetic data (number, covering, structure, cut-off values, and similarity). These results suggest that XAI can be used in an original way to assess synthetic health data and extract knowledge about the mechanisms underlying the generated data. -
dc.description.affiliations Lenatti M.; Paglialonga A.; Orani V.; Ferretti M.; Mongelli M.: CNR IEIIT -
dc.description.allpeople Lenatti, Marta; Paglialonga, Alessia; Orani, Vanessa; Ferretti, Melissa; Mongelli, Maurizio -
dc.description.allpeopleoriginal Lenatti Marta; Paglialonga Alessia; Orani Vanessa; Ferretti Melissa; Mongelli Maurizio en
dc.description.fulltext open en
dc.description.international no en
dc.description.numberofauthors 5 -
dc.identifier.doi 10.1109/JBHI.2023.3236722 en
dc.identifier.isi WOS:001045824200007 -
dc.identifier.scopus 2-s2.0-85147261752 en
dc.identifier.uri https://hdl.handle.net/20.500.14243/418837 -
dc.identifier.url https://ieeexplore.ieee.org/document/10016704 en
dc.language.iso eng en
dc.miur.last.status.update 2024-12-06T14:55:43Z *
dc.relation.firstpage 3760 en
dc.relation.issue 8 en
dc.relation.lastpage 3769 en
dc.relation.medium ELETTRONICO en
dc.relation.numberofpages 10 en
dc.relation.volume 27 en
dc.subject.keywordseng Synthetic data -
dc.subject.keywordseng Auditory system -
dc.subject.keywordseng Data models -
dc.subject.keywordseng Biomedical measurement -
dc.subject.keywordseng eXplainable AI (XAI) -
dc.subject.keywordseng hearing screening -
dc.subject.keywordseng rule similarity -
dc.subject.keywordseng Generative Adversarial Networks (GAN) -
dc.subject.keywordseng data augmentation -
dc.subject.keywordseng rule-based models -
dc.subject.singlekeyword Synthetic data *
dc.subject.singlekeyword Auditory system *
dc.subject.singlekeyword Data models *
dc.subject.singlekeyword Biomedical measurement *
dc.subject.singlekeyword eXplainable AI (XAI) *
dc.subject.singlekeyword hearing screening *
dc.subject.singlekeyword rule similarity *
dc.subject.singlekeyword Generative Adversarial Networks (GAN) *
dc.subject.singlekeyword data augmentation *
dc.subject.singlekeyword rule-based models *
dc.title Characterization of synthetic health data using rule-based artificial intelligence models en
dc.type.circulation Internazionale en
dc.type.driver info:eu-repo/semantics/article -
dc.type.full 01 Contributo su Rivista::01.01 Articolo in rivista it
dc.type.impactfactor si en
dc.type.miur 262 -
dc.type.referee Esperti anonimi en
dc.ugov.descaux1 476441 -
dc.ugov.descaux2 CC BY 4.0 -
iris.isi.extIssued 2023 -
iris.isi.extTitle Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models -
iris.mediafilter.data 2025/03/28 03:38:42 *
iris.orcid.lastModifiedDate 2025/03/13 01:43:39 *
iris.orcid.lastModifiedMillisecond 1741826619212 *
iris.scopus.extIssued 2023 -
iris.scopus.extTitle Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models -
iris.sitodocente.maxattempts 1 -
iris.unpaywall.bestoahost publisher *
iris.unpaywall.bestoaversion publishedVersion *
iris.unpaywall.doi 10.1109/jbhi.2023.3236722 *
iris.unpaywall.hosttype publisher *
iris.unpaywall.isoa true *
iris.unpaywall.journalisindoaj false *
iris.unpaywall.landingpage https://doi.org/10.1109/jbhi.2023.3236722 *
iris.unpaywall.license cc-by-nc-sa *
iris.unpaywall.metadataCallLastModified 22/04/2026 05:05:54 -
iris.unpaywall.metadataCallLastModifiedMillisecond 1776827154312 -
iris.unpaywall.oastatus hybrid *
iris.unpaywall.pdfurl https://ieeexplore.ieee.org/ielx7/6221020/6363502/10016704.pdf *
isi.authority.ancejournal IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS###2168-2194 *
isi.category EV *
isi.category PT *
isi.category MC *
isi.category ET *
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.name Marta -
isi.contributor.name Alessia -
isi.contributor.name Vanessa -
isi.contributor.name Melissa -
isi.contributor.name Maurizio -
isi.contributor.researcherId ABV-5822-2022 -
isi.contributor.researcherId F-9847-2010 -
isi.contributor.researcherId GEB-2460-2022 -
isi.contributor.researcherId COJ-1891-2022 -
isi.contributor.researcherId DFY-2820-2022 -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.surname Lenatti -
isi.contributor.surname Paglialonga -
isi.contributor.surname Orani -
isi.contributor.surname Ferretti -
isi.contributor.surname Mongelli -
isi.date.issued 2023 *
isi.description.abstracteng The aim of this study is to apply and characterize eXplainable AI (XAI) to assess the quality of synthetic health data generated using a data augmentation algorithm. In this exploratory study, several synthetic datasets are generated using various configurations of a conditional Generative Adversarial Network (GAN) from a set of 156 observations related to adult hearing screening. A rule-based native XAI algorithm, the Logic Learning Machine, is used in combination with conventional utility metrics. The classification performance in different conditions is assessed: models trained and tested on synthetic data, models trained on synthetic data and tested on real data, and models trained on real data and tested on synthetic data. The rules extracted from real and synthetic data are then compared using a rule similarity metric. The results indicate that XAI may be used to assess the quality of synthetic data by (i) the analysis of classification performance and (ii) the analysis of the rules extracted on real and synthetic data (number, covering, structure, cut-off values, and similarity). These results suggest that XAI can be used in an original way to assess synthetic health data and extract knowledge about the mechanisms underlying the generated data. *
isi.description.allpeopleoriginal Lenatti, M; Paglialonga, A; Orani, V; Ferretti, M; Mongelli, M; *
isi.document.sourcetype WOS.SCI *
isi.document.type Article *
isi.document.types Article *
isi.identifier.doi 10.1109/JBHI.2023.3236722 *
isi.identifier.eissn 2168-2208 *
isi.identifier.isi WOS:001045824200007 *
isi.journal.journaltitle IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS *
isi.journal.journaltitleabbrev IEEE J BIOMED HEALTH *
isi.language.original English *
isi.publisher.place 445 HOES LANE, PISCATAWAY, NJ 08855-4141 USA *
isi.relation.firstpage 3760 *
isi.relation.issue 8 *
isi.relation.lastpage 3769 *
isi.relation.volume 27 *
isi.title Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models *
scopus.authority.ancejournal IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS###2168-2194 *
scopus.category 1706 *
scopus.category 2718 *
scopus.category 2208 *
scopus.category 3605 *
scopus.contributor.affiliation CNR-IEIIT -
scopus.contributor.affiliation CNR-IEIIT -
scopus.contributor.affiliation CNR-IEIIT -
scopus.contributor.affiliation CNR-IEIIT -
scopus.contributor.affiliation CNR-IEIIT -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.auid 57222472784 -
scopus.contributor.auid 23668671800 -
scopus.contributor.auid 57217857599 -
scopus.contributor.auid 57203499432 -
scopus.contributor.auid 7005882346 -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.name Marta -
scopus.contributor.name Alessia -
scopus.contributor.name Vanessa -
scopus.contributor.name Melissa -
scopus.contributor.name Maurizio -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.surname Lenatti -
scopus.contributor.surname Paglialonga -
scopus.contributor.surname Orani -
scopus.contributor.surname Ferretti -
scopus.contributor.surname Mongelli -
scopus.date.issued 2023 *
scopus.description.abstracteng The aim of this study is to apply and characterize eXplainable AI (XAI) to assess the quality of synthetic health data generated using a data augmentation algorithm. In this exploratory study, several synthetic datasets are generated using various configurations of a conditional Generative Adversarial Network (GAN) from a set of 156 observations related to adult hearing screening. A rule-based native XAI algorithm, the Logic Learning Machine, is used in combination with conventional utility metrics. The classification performance in different conditions is assessed: models trained and tested on synthetic data, models trained on synthetic data and tested on real data, and models trained on real data and tested on synthetic data. The rules extracted from real and synthetic data are then compared using a rule similarity metric. The results indicate that XAI may be used to assess the quality of synthetic data by (i) the analysis of classification performance and (ii) the analysis of the rules extracted on real and synthetic data (number, covering, structure, cut-off values, and similarity). These results suggest that XAI can be used in an original way to assess synthetic health data and extract knowledge about the mechanisms underlying the generated data. *
scopus.description.allpeopleoriginal Lenatti M.; Paglialonga A.; Orani V.; Ferretti M.; Mongelli M. *
scopus.differences scopus.subject.keywords *
scopus.differences scopus.description.allpeopleoriginal *
scopus.document.type ar *
scopus.document.types ar *
scopus.funding.funders 100008180 - Capita Foundation; *
scopus.identifier.doi 10.1109/JBHI.2023.3236722 *
scopus.identifier.eissn 2168-2208 *
scopus.identifier.pmid 37018683 *
scopus.identifier.pui 2022545519 *
scopus.identifier.scopus 2-s2.0-85147261752 *
scopus.journal.sourceid 21100256982 *
scopus.language.iso eng *
scopus.publisher.name Institute of Electrical and Electronics Engineers Inc. *
scopus.relation.firstpage 3760 *
scopus.relation.issue 8 *
scopus.relation.lastpage 3769 *
scopus.relation.volume 27 *
scopus.subject.keywords Data augmentation; eXplainable AI (XAI); Generative Adversarial Networks (GAN); hearing screening; rule similarity; *
scopus.title Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models *
scopus.titleeng Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models *
Appears in types: 01.01 Articolo in rivista (journal article)
Files in this item:
File: prod_476441-doc_194721.pdf
Access: open access
Description: Lenatti et al., Characterization of synthetic health data using rule-based artificial intelligence models
Type: Publisher's version (PDF)
License: Creative Commons
Size: 1.16 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/418837
Citations
  • PMC: ND
  • Scopus: 19
  • Web of Science (ISI): 14