CNR Institutional Research Information System

Characterization of synthetic health data using rule-based artificial intelligence models

Lenatti Marta (co-first author); Paglialonga Alessia (co-first author); Orani Vanessa (co-first author); Ferretti Melissa (co-first author); Mongelli Maurizio (co-first author)

2023

Abstract

The aim of this study is to apply and characterize eXplainable AI (XAI) to assess the quality of synthetic health data generated using a data augmentation algorithm. In this exploratory study, several synthetic datasets are generated using various configurations of a conditional Generative Adversarial Network (GAN) from a set of 156 observations related to adult hearing screening. A rule-based native XAI algorithm, the Logic Learning Machine, is used in combination with conventional utility metrics. The classification performance in different conditions is assessed: models trained and tested on synthetic data, models trained on synthetic data and tested on real data, and models trained on real data and tested on synthetic data. The rules extracted from real and synthetic data are then compared using a rule similarity metric. The results indicate that XAI may be used to assess the quality of synthetic data by (i) the analysis of classification performance and (ii) the analysis of the rules extracted on real and synthetic data (number, covering, structure, cut-off values, and similarity). These results suggest that XAI can be used in an original way to assess synthetic health data and extract knowledge about the mechanisms underlying the generated data.
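The three train/test conditions named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are random toy values (only the sample size of 156 comes from the abstract), `make_synthetic` is a noise-jitter placeholder rather than a conditional GAN, and `fit_stump` is a one-threshold rule standing in for the Logic Learning Machine, which is not reproduced here.

```python
import random

def accuracy(cut, rows):
    """Share of rows for which the rule 'x > cut' matches the label."""
    hits = sum(1 for x, y in rows if (x > cut) == bool(y))
    return hits / len(rows)

def fit_stump(rows):
    """Fit a single rule 'if x > cut then class 1' by exhaustive search."""
    best_cut, best_acc = None, -1.0
    for cut in sorted({x for x, _ in rows}):
        acc = accuracy(cut, rows)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut

def make_real(n, rng):
    """Toy 'real' data (hypothetical): class 1 has larger feature values."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(40.0 if y else 20.0, 5.0)
        rows.append((x, y))
    return rows

def make_synthetic(real, rng, noise=2.0):
    """Placeholder generator (NOT a GAN): jitter each real row, keep its label."""
    return [(x + rng.gauss(0.0, noise), y) for x, y in real]

def split(rows, frac=0.7):
    """Split rows into a training part and a test part, preserving order."""
    k = int(len(rows) * frac)
    return rows[:k], rows[k:]

rng = random.Random(0)
real = make_real(156, rng)          # 156 observations, as in the abstract
synth = make_synthetic(real, rng)
real_train, real_test = split(real)
synth_train, synth_test = split(synth)

# One rule is learned per data source; its cut-off value can be compared too.
cut_real = fit_stump(real_train)
cut_synth = fit_stump(synth_train)

results = {
    "train synthetic / test synthetic": accuracy(cut_synth, synth_test),
    "train synthetic / test real": accuracy(cut_synth, real_test),
    "train real / test synthetic": accuracy(cut_real, synth_test),
}
for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
print(f"cut-off learned on real: {cut_real:.1f}, on synthetic: {cut_synth:.1f}")
```

In this toy setting, similar accuracies across the three conditions and similar cut-off values would suggest that the placeholder generator preserves the class structure; the study itself compares full rule sets with a rule similarity metric that this sketch does not implement.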

Full record (DC)

DC field	Value	Language
dc.authority.ancejournal	IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS	en
dc.authority.orgunit	Istituto di Elettronica e di Ingegneria dell'Informazione e delle Telecomunicazioni - IEIIT	en
dc.authority.people	Lenatti Marta	en
dc.authority.people	Paglialonga Alessia	en
dc.authority.people	Orani Vanessa	en
dc.authority.people	Ferretti Melissa	en
dc.authority.people	Mongelli Maurizio	en
dc.collection.id.s	b3f88f24-048a-4e43-8ab1-6697b90e068e	*
dc.collection.name	01.01 Articolo in rivista	*
dc.contributor.appartenenza	Istituto di Elettronica e di Ingegneria dell'Informazione e delle Telecomunicazioni - IEIIT	*
dc.contributor.appartenenza	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	*
dc.contributor.appartenenza.mi	877	*
dc.contributor.appartenenza.mi	918	*
dc.date.accessioned	2024/02/20 04:47:53	-
dc.date.available	2024/02/20 04:47:53	-
dc.date.issued	2023	-
dc.description.abstracteng	The aim of this study is to apply and characterize eXplainable AI (XAI) to assess the quality of synthetic health data generated using a data augmentation algorithm. In this exploratory study, several synthetic datasets are generated using various configurations of a conditional Generative Adversarial Network (GAN) from a set of 156 observations related to adult hearing screening. A rule-based native XAI algorithm, the Logic Learning Machine, is used in combination with conventional utility metrics. The classification performance in different conditions is assessed: models trained and tested on synthetic data, models trained on synthetic data and tested on real data, and models trained on real data and tested on synthetic data. The rules extracted from real and synthetic data are then compared using a rule similarity metric. The results indicate that XAI may be used to assess the quality of synthetic data by (i) the analysis of classification performance and (ii) the analysis of the rules extracted on real and synthetic data (number, covering, structure, cut-off values, and similarity). These results suggest that XAI can be used in an original way to assess synthetic health data and extract knowledge about the mechanisms underlying the generated data.	-
dc.description.affiliations	Lenatti M.; Paglialonga A.; Orani V.; Ferretti M.; Mongelli M.: CNR IEIIT	-
dc.description.allpeople	Lenatti, Marta; Paglialonga, Alessia; Orani, Vanessa; Ferretti, Melissa; Mongelli, Maurizio	-
dc.description.allpeopleoriginal	Lenatti Marta; Paglialonga Alessia; Orani Vanessa; Ferretti Melissa; Mongelli Maurizio	en
dc.description.fulltext	open	en
dc.description.international	no	en
dc.description.numberofauthors	5	-
dc.identifier.doi	10.1109/JBHI.2023.3236722	en
dc.identifier.isi	WOS:001045824200007	-
dc.identifier.scopus	2-s2.0-85147261752	en
dc.identifier.uri	https://hdl.handle.net/20.500.14243/418837	-
dc.identifier.url	https://ieeexplore.ieee.org/document/10016704	en
dc.language.iso	eng	en
dc.miur.last.status.update	2024-12-06T14:55:43Z	*
dc.relation.firstpage	3760	en
dc.relation.issue	8	en
dc.relation.lastpage	3769	en
dc.relation.medium	ELETTRONICO	en
dc.relation.numberofpages	10	en
dc.relation.volume	27	en
dc.subject.keywordseng	Synthetic data	-
dc.subject.keywordseng	Auditory system	-
dc.subject.keywordseng	Data models	-
dc.subject.keywordseng	Biomedical measurement	-
dc.subject.keywordseng	eXplainable AI (XAI)	-
dc.subject.keywordseng	hearing screening	-
dc.subject.keywordseng	rule similarity	-
dc.subject.keywordseng	Generative Adversarial Networks (GAN)	-
dc.subject.keywordseng	data augmentation	-
dc.subject.keywordseng	rule-based models	-
dc.subject.singlekeyword	Synthetic data	*
dc.subject.singlekeyword	Auditory system	*
dc.subject.singlekeyword	Data models	*
dc.subject.singlekeyword	Biomedical measurement	*
dc.subject.singlekeyword	eXplainable AI (XAI)	*
dc.subject.singlekeyword	hearing screening	*
dc.subject.singlekeyword	rule similarity	*
dc.subject.singlekeyword	Generative Adversarial Networks (GAN)	*
dc.subject.singlekeyword	data augmentation	*
dc.subject.singlekeyword	rule-based models	*
dc.title	Characterization of synthetic health data using rule-based artificial intelligence models	en
dc.type.circulation	Internazionale	en
dc.type.driver	info:eu-repo/semantics/article	-
dc.type.full	01 Contributo su Rivista::01.01 Articolo in rivista	it
dc.type.impactfactor	si	en
dc.type.miur	262	-
dc.type.referee	Esperti anonimi	en
dc.ugov.descaux1	476441	-
dc.ugov.descaux2	CC BY 4.0	-
iris.isi.extIssued	2023	-
iris.isi.extTitle	Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models	-
iris.mediafilter.data	2025/03/28 03:38:42	*
iris.orcid.lastModifiedDate	2025/03/13 01:43:39	*
iris.orcid.lastModifiedMillisecond	1741826619212	*
iris.scopus.extIssued	2023	-
iris.scopus.extTitle	Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models	-
iris.sitodocente.maxattempts	1	-
iris.unpaywall.bestoahost	publisher	*
iris.unpaywall.bestoaversion	publishedVersion	*
iris.unpaywall.doi	10.1109/jbhi.2023.3236722	*
iris.unpaywall.hosttype	publisher	*
iris.unpaywall.isoa	true	*
iris.unpaywall.journalisindoaj	false	*
iris.unpaywall.landingpage	https://doi.org/10.1109/jbhi.2023.3236722	*
iris.unpaywall.license	cc-by-nc-sa	*
iris.unpaywall.metadataCallLastModified	22/04/2026 05:05:54	-
iris.unpaywall.metadataCallLastModifiedMillisecond	1776827154312	-
iris.unpaywall.oastatus	hybrid	*
iris.unpaywall.pdfurl	https://ieeexplore.ieee.org/ielx7/6221020/6363502/10016704.pdf	*
isi.authority.ancejournal	IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS###2168-2194	*
isi.category	EV	*
isi.category	PT	*
isi.category	MC	*
isi.category	ET	*
isi.contributor.affiliation	Consiglio Nazionale delle Ricerche (CNR)	-
isi.contributor.affiliation	Consiglio Nazionale delle Ricerche (CNR)	-
isi.contributor.affiliation	Consiglio Nazionale delle Ricerche (CNR)	-
isi.contributor.affiliation	Consiglio Nazionale delle Ricerche (CNR)	-
isi.contributor.affiliation	Consiglio Nazionale delle Ricerche (CNR)	-
isi.contributor.country	Italy	-
isi.contributor.country	Italy	-
isi.contributor.country	Italy	-
isi.contributor.country	Italy	-
isi.contributor.country	Italy	-
isi.contributor.name	Marta	-
isi.contributor.name	Alessia	-
isi.contributor.name	Vanessa	-
isi.contributor.name	Melissa	-
isi.contributor.name	Maurizio	-
isi.contributor.researcherId	ABV-5822-2022	-
isi.contributor.researcherId	F-9847-2010	-
isi.contributor.researcherId	GEB-2460-2022	-
isi.contributor.researcherId	COJ-1891-2022	-
isi.contributor.researcherId	DFY-2820-2022	-
isi.contributor.subaffiliation		-
isi.contributor.subaffiliation		-
isi.contributor.subaffiliation		-
isi.contributor.subaffiliation		-
isi.contributor.subaffiliation		-
isi.contributor.surname	Lenatti	-
isi.contributor.surname	Paglialonga	-
isi.contributor.surname	Orani	-
isi.contributor.surname	Ferretti	-
isi.contributor.surname	Mongelli	-
isi.date.issued	2023	*
isi.description.abstracteng	The aim of this study is to apply and characterize eXplainable AI (XAI) to assess the quality of synthetic health data generated using a data augmentation algorithm. In this exploratory study, several synthetic datasets are generated using various configurations of a conditional Generative Adversarial Network (GAN) from a set of 156 observations related to adult hearing screening. A rule-based native XAI algorithm, the Logic Learning Machine, is used in combination with conventional utility metrics. The classification performance in different conditions is assessed: models trained and tested on synthetic data, models trained on synthetic data and tested on real data, and models trained on real data and tested on synthetic data. The rules extracted from real and synthetic data are then compared using a rule similarity metric. The results indicate that XAI may be used to assess the quality of synthetic data by (i) the analysis of classification performance and (ii) the analysis of the rules extracted on real and synthetic data (number, covering, structure, cut-off values, and similarity). These results suggest that XAI can be used in an original way to assess synthetic health data and extract knowledge about the mechanisms underlying the generated data.	*
isi.description.allpeopleoriginal	Lenatti, M; Paglialonga, A; Orani, V; Ferretti, M; Mongelli, M;	*
isi.document.sourcetype	WOS.SCI	*
isi.document.type	Article	*
isi.document.types	Article	*
isi.identifier.doi	10.1109/JBHI.2023.3236722	*
isi.identifier.eissn	2168-2208	*
isi.identifier.isi	WOS:001045824200007	*
isi.journal.journaltitle	IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS	*
isi.journal.journaltitleabbrev	IEEE J BIOMED HEALTH	*
isi.language.original	English	*
isi.publisher.place	445 HOES LANE, PISCATAWAY, NJ 08855-4141 USA	*
isi.relation.firstpage	3760	*
isi.relation.issue	8	*
isi.relation.lastpage	3769	*
isi.relation.volume	27	*
isi.title	Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models	*
scopus.authority.ancejournal	IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS###2168-2194	*
scopus.category	1706	*
scopus.category	2718	*
scopus.category	2208	*
scopus.category	3605	*
scopus.contributor.affiliation	CNR-IEIIT	-
scopus.contributor.affiliation	CNR-IEIIT	-
scopus.contributor.affiliation	CNR-IEIIT	-
scopus.contributor.affiliation	CNR-IEIIT	-
scopus.contributor.affiliation	CNR-IEIIT	-
scopus.contributor.afid	60021199	-
scopus.contributor.afid	60021199	-
scopus.contributor.afid	60021199	-
scopus.contributor.afid	60021199	-
scopus.contributor.afid	60021199	-
scopus.contributor.auid	57222472784	-
scopus.contributor.auid	23668671800	-
scopus.contributor.auid	57217857599	-
scopus.contributor.auid	57203499432	-
scopus.contributor.auid	7005882346	-
scopus.contributor.country	Italy	-
scopus.contributor.country	Italy	-
scopus.contributor.country	Italy	-
scopus.contributor.country	Italy	-
scopus.contributor.country	Italy	-
scopus.contributor.dptid		-
scopus.contributor.dptid		-
scopus.contributor.dptid		-
scopus.contributor.dptid		-
scopus.contributor.dptid		-
scopus.contributor.name	Marta	-
scopus.contributor.name	Alessia	-
scopus.contributor.name	Vanessa	-
scopus.contributor.name	Melissa	-
scopus.contributor.name	Maurizio	-
scopus.contributor.subaffiliation		-
scopus.contributor.subaffiliation		-
scopus.contributor.subaffiliation		-
scopus.contributor.subaffiliation		-
scopus.contributor.subaffiliation		-
scopus.contributor.surname	Lenatti	-
scopus.contributor.surname	Paglialonga	-
scopus.contributor.surname	Orani	-
scopus.contributor.surname	Ferretti	-
scopus.contributor.surname	Mongelli	-
scopus.date.issued	2023	*
scopus.description.abstracteng	The aim of this study is to apply and characterize eXplainable AI (XAI) to assess the quality of synthetic health data generated using a data augmentation algorithm. In this exploratory study, several synthetic datasets are generated using various configurations of a conditional Generative Adversarial Network (GAN) from a set of 156 observations related to adult hearing screening. A rule-based native XAI algorithm, the Logic Learning Machine, is used in combination with conventional utility metrics. The classification performance in different conditions is assessed: models trained and tested on synthetic data, models trained on synthetic data and tested on real data, and models trained on real data and tested on synthetic data. The rules extracted from real and synthetic data are then compared using a rule similarity metric. The results indicate that XAI may be used to assess the quality of synthetic data by (i) the analysis of classification performance and (ii) the analysis of the rules extracted on real and synthetic data (number, covering, structure, cut-off values, and similarity). These results suggest that XAI can be used in an original way to assess synthetic health data and extract knowledge about the mechanisms underlying the generated data.	*
scopus.description.allpeopleoriginal	Lenatti M.; Paglialonga A.; Orani V.; Ferretti M.; Mongelli M.	*
scopus.differences	scopus.subject.keywords	*
scopus.differences	scopus.description.allpeopleoriginal	*
scopus.document.type	ar	*
scopus.document.types	ar	*
scopus.funding.funders	100008180 - Capita Foundation;	*
scopus.identifier.doi	10.1109/JBHI.2023.3236722	*
scopus.identifier.eissn	2168-2208	*
scopus.identifier.pmid	37018683	*
scopus.identifier.pui	2022545519	*
scopus.identifier.scopus	2-s2.0-85147261752	*
scopus.journal.sourceid	21100256982	*
scopus.language.iso	eng	*
scopus.publisher.name	Institute of Electrical and Electronics Engineers Inc.	*
scopus.relation.firstpage	3760	*
scopus.relation.issue	8	*
scopus.relation.lastpage	3769	*
scopus.relation.volume	27	*
scopus.subject.keywords	Data augmentation; eXplainable AI (XAI); Generative Adversarial Networks (GAN); hearing screening; rule similarity;	*
scopus.title	Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models	*
scopus.titleeng	Characterization of Synthetic Health Data Using Rule-Based Artificial Intelligence Models	*
Appears in types:	01.01 Articolo in rivista (journal article)

Files in this item:

File	Size	Format
prod_476441-doc_194721.pdf (open access; description: Lenatti et al., Characterization of synthetic health data using rule-based artificial intelligence models; type: publisher's version (PDF); license: Creative Commons)	1.16 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/418837