CNR Institutional Research Information System

Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models

Occhipinti D.; Marchi M.; Mondella I.; Lai H.; Dell'Orletta F.; Nissim M.; Guerini M.

2024

Abstract

Automatic methods for generating and gathering linguistic data have proven effective for fine-tuning Language Models (LMs) in languages less resourced than English. Still, while data quantity has been emphasized, less attention has been given to data quality. In this work, we investigate the impact of human intervention on machine-generated data when fine-tuning dialogical models. In particular, we study (1) whether post-edited dialogues exhibit higher perceived quality than the automatically generated originals; (2) whether fine-tuning with post-edited dialogues results in noticeable differences in the generated outputs; and (3) whether the effect of post-edited dialogues varies with the parameter size of the LMs. To this end, we created HED-IT, a large-scale dataset in which machine-generated dialogues are paired with their human post-edited versions. Using both the edited and unedited portions of HED-IT, we fine-tuned three different sizes of an LM. Results from both human and automatic evaluation show that the difference in training-data quality is clearly perceived and also affects the models trained on such data. Additionally, our findings indicate that larger models are less sensitive to data quality, whereas data quality has a crucial impact on smaller models. These results improve our understanding of the impact of human intervention on training data in the development of high-quality LMs.

Full record (DC)

DC field	Value	Language
dc.authority.anceserie	PROCEEDINGS OF THE CONFERENCE - ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. MEETING	en
dc.authority.orgunit	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	en
dc.authority.orgunit	Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI	en
dc.authority.people	Occhipinti D.	en
dc.authority.people	Marchi M.	en
dc.authority.people	Mondella I.	en
dc.authority.people	Lai H.	en
dc.authority.people	Dell'Orletta F.	en
dc.authority.people	Nissim M.	en
dc.authority.people	Guerini M.	en
dc.collection.id.s	71c7200a-7c5f-4e83-8d57-d3d2ba88f40d	*
dc.collection.name	04.01 Contributo in Atti di convegno	*
dc.contributor.appartenenza	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	*
dc.contributor.appartenenza.mi	918	*
dc.date.accessioned	2024/12/19 16:29:06	-
dc.date.available	2024/12/19 16:29:06	-
dc.date.firstsubmission	2024/12/18 17:04:06	*
dc.date.issued	2024	-
dc.date.submission	2024/12/18 17:04:06	*
dc.description.abstracteng	Automatic methods for generating and gathering linguistic data have proven effective for fine-tuning Language Models (LMs) in languages less resourced than English. Still, while there has been emphasis on data quantity, less attention has been given to its quality. In this work, we investigate the impact of human intervention on machine-generated data when fine-tuning dialogical models. In particular, we study (1) whether post-edited dialogues exhibit higher perceived quality compared to the originals that were automatically generated; (2) whether fine-tuning with post-edited dialogues results in noticeable differences in the generated outputs; and (3) whether post-edited dialogues influence the outcomes when considering the parameter size of the LMs. To this end we created HED-IT, a large-scale dataset where machine-generated dialogues are paired with the version post-edited by humans. Using both the edited and unedited portions of HED-IT, we fine-tuned three different sizes of an LM. Results from both human and automatic evaluation show that the different quality of training data is clearly perceived and it has an impact also on the models trained on such data. Additionally, our findings indicate that larger models are less sensitive to data quality, whereas this has a crucial impact on smaller models. These results enhance our comprehension of the impact of human intervention on training data in the development of high-quality LMs.	-
dc.description.allpeople	Occhipinti, D.; Marchi, M.; Mondella, I.; Lai, H.; Dell'Orletta, F.; Nissim, M.; Guerini, M.	-
dc.description.allpeopleoriginal	Occhipinti D.; Marchi M.; Mondella I.; Lai H.; Dell'Orletta F.; Nissim M.; Guerini M.	en
dc.description.fulltext	open	en
dc.description.numberofauthors	7	-
dc.identifier.scopus	2-s2.0-85205303737	en
dc.identifier.source	scopus	*
dc.identifier.uri	https://hdl.handle.net/20.500.14243/519999	-
dc.language.iso	eng	en
dc.publisher.name	Association for Computational Linguistics (ACL)	en
dc.relation.conferencedate	2024	en
dc.relation.conferencename	Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024	en
dc.relation.conferenceplace	Bangkok, Thailand	en
dc.relation.firstpage	11892	en
dc.relation.ispartofbook	Proceedings of the Annual Meeting of the Association for Computational Linguistics	en
dc.relation.lastpage	11907	en
dc.relation.numberofpages	16	en
dc.subject.keywordseng	Large Language Models (LLMs)	-
dc.subject.keywordseng	Detecting Synthetic Texts	-
dc.subject.singlekeyword	Large Language Models (LLMs)	*
dc.subject.singlekeyword	Detecting Synthetic Texts	*
dc.title	Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models	en
dc.type.driver	info:eu-repo/semantics/conferenceObject	-
dc.type.full	04 Contributo in convegno::04.01 Contributo in Atti di convegno	it
dc.type.miur	273	-
iris.mediafilter.data	2025/04/15 04:08:06	*
iris.orcid.lastModifiedDate	2024/12/19 16:29:06	*
iris.orcid.lastModifiedMillisecond	1734622146439	*
iris.scopus.extIssued	2024	-
iris.scopus.extTitle	Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models	-
iris.sitodocente.maxattempts	1	-
scopus.authority.anceserie	PROCEEDINGS OF THE CONFERENCE - ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. MEETING###0736-587X	*
scopus.category	1203	*
scopus.category	3310	*
scopus.category	1706	*
scopus.contributor.affiliation	University of Trento	-
scopus.contributor.affiliation	University of Trento	-
scopus.contributor.affiliation	ItaliaNLP Lab @ CNR-ILC	-
scopus.contributor.affiliation	University of Groningen	-
scopus.contributor.affiliation	ItaliaNLP Lab @ CNR-ILC	-
scopus.contributor.affiliation	University of Groningen	-
scopus.contributor.affiliation	Fondazione Bruno Kessler	-
scopus.contributor.afid	60015986	-
scopus.contributor.afid	60015986	-
scopus.contributor.afid	60021199	-
scopus.contributor.afid	60010023	-
scopus.contributor.afid	60021199	-
scopus.contributor.afid	60010023	-
scopus.contributor.afid	60083112	-
scopus.contributor.auid	57220749030	-
scopus.contributor.auid	59206205800	-
scopus.contributor.auid	59156649100	-
scopus.contributor.auid	57222015938	-
scopus.contributor.auid	57540567000	-
scopus.contributor.auid	8281949100	-
scopus.contributor.auid	16686458800	-
scopus.contributor.country	Italy	-
scopus.contributor.country	Italy	-
scopus.contributor.country	Italy	-
scopus.contributor.country	Netherlands	-
scopus.contributor.country	Italy	-
scopus.contributor.country	Netherlands	-
scopus.contributor.country	Italy	-
scopus.contributor.dptid		-
scopus.contributor.dptid		-
scopus.contributor.dptid	121833164	-
scopus.contributor.dptid		-
scopus.contributor.dptid	121833164	-
scopus.contributor.dptid		-
scopus.contributor.dptid		-
scopus.contributor.name	Daniela	-
scopus.contributor.name	Michele	-
scopus.contributor.name	Irene	-
scopus.contributor.name	Huiyuan	-
scopus.contributor.name	Felice	-
scopus.contributor.name	Malvina	-
scopus.contributor.name	Marco	-
scopus.contributor.subaffiliation		-
scopus.contributor.subaffiliation		-
scopus.contributor.subaffiliation		-
scopus.contributor.subaffiliation		-
scopus.contributor.subaffiliation		-
scopus.contributor.subaffiliation		-
scopus.contributor.subaffiliation		-
scopus.contributor.surname	Occhipinti	-
scopus.contributor.surname	Marchi	-
scopus.contributor.surname	Mondella	-
scopus.contributor.surname	Lai	-
scopus.contributor.surname	Dell'Orletta	-
scopus.contributor.surname	Nissim	-
scopus.contributor.surname	Guerini	-
scopus.date.issued	2024	*
scopus.description.abstracteng	Automatic methods for generating and gathering linguistic data have proven effective for fine-tuning Language Models (LMs) in languages less resourced than English. Still, while there has been emphasis on data quantity, less attention has been given to its quality. In this work, we investigate the impact of human intervention on machine-generated data when fine-tuning dialogical models. In particular, we study (1) whether post-edited dialogues exhibit higher perceived quality compared to the originals that were automatically generated; (2) whether fine-tuning with post-edited dialogues results in noticeable differences in the generated outputs; and (3) whether post-edited dialogues influence the outcomes when considering the parameter size of the LMs. To this end we created HED-IT, a large-scale dataset where machine-generated dialogues are paired with the version post-edited by humans. Using both the edited and unedited portions of HED-IT, we fine-tuned three different sizes of an LM. Results from both human and automatic evaluation show that the different quality of training data is clearly perceived and it has an impact also on the models trained on such data. Additionally, our findings indicate that larger models are less sensitive to data quality, whereas this has a crucial impact on smaller models. These results enhance our comprehension of the impact of human intervention on training data in the development of high-quality LMs.	*
scopus.description.allpeopleoriginal	Occhipinti D.; Marchi M.; Mondella I.; Lai H.; Dell'Orletta F.; Nissim M.; Guerini M.	*
scopus.differences	scopus.identifier.isbn	*
scopus.differences	scopus.identifier.doi	*
scopus.differences	scopus.relation.conferenceplace	*
scopus.document.type	cp	*
scopus.document.types	cp	*
scopus.identifier.doi	10.18653/v1/2024.findings-acl.707	*
scopus.identifier.isbn	9798891760998	*
scopus.identifier.pui	645398611	*
scopus.identifier.scopus	2-s2.0-85205303737	*
scopus.journal.sourceid	21101138302	*
scopus.language.iso	eng	*
scopus.publisher.name	Association for Computational Linguistics (ACL)	*
scopus.relation.conferencedate	2024	*
scopus.relation.conferencename	Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024	*
scopus.relation.conferenceplace	tha	*
scopus.relation.firstpage	11892	*
scopus.relation.lastpage	11907	*
scopus.title	Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models	*
scopus.titleeng	Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models	*
Appears in types:	04.01 Contributo in Atti di convegno (conference proceedings contribution)

Files in this item:

File	Size	Format
2024.findings-acl.707.pdf (open access; license: Creative Commons)	312.55 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/519999
