Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models

Occhipinti D.; Marchi M.; Mondella I.; Lai H.; Dell'Orletta F.; Nissim M.; Guerini M.
2024

Abstract

Automatic methods for generating and gathering linguistic data have proven effective for fine-tuning Language Models (LMs) in languages less resourced than English. Still, while data quantity has been emphasized, data quality has received less attention. In this work, we investigate the impact of human intervention on machine-generated data when fine-tuning dialogical models. In particular, we study (1) whether post-edited dialogues exhibit higher perceived quality than the automatically generated originals; (2) whether fine-tuning with post-edited dialogues results in noticeable differences in the generated outputs; and (3) whether the effect of post-edited dialogues on the outcomes depends on the parameter size of the LMs. To this end, we created HED-IT, a large-scale dataset in which machine-generated dialogues are paired with their human post-edited versions. Using both the edited and unedited portions of HED-IT, we fine-tuned three different sizes of an LM. Results from both human and automatic evaluation show that the difference in training data quality is clearly perceived and that it also affects the models trained on such data. Additionally, our findings indicate that larger models are less sensitive to data quality, whereas data quality has a crucial impact on smaller models. These results enhance our understanding of the impact of human intervention on training data in the development of high-quality LMs.
DC Field Value Language
dc.authority.anceserie PROCEEDINGS OF THE CONFERENCE - ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. MEETING en
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.orgunit Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI en
dc.authority.people Occhipinti D. en
dc.authority.people Marchi M. en
dc.authority.people Mondella I. en
dc.authority.people Lai H. en
dc.authority.people Dell'Orletta F. en
dc.authority.people Nissim M. en
dc.authority.people Guerini M. en
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/12/19 16:29:06 -
dc.date.available 2024/12/19 16:29:06 -
dc.date.firstsubmission 2024/12/18 17:04:06 *
dc.date.issued 2024 -
dc.date.submission 2024/12/18 17:04:06 *
dc.description.allpeople Occhipinti, D.; Marchi, M.; Mondella, I.; Lai, H.; Dell'Orletta, F.; Nissim, M.; Guerini, M. -
dc.description.allpeopleoriginal Occhipinti D.; Marchi M.; Mondella I.; Lai H.; Dell'Orletta F.; Nissim M.; Guerini M. en
dc.description.fulltext open en
dc.description.numberofauthors 7 -
dc.identifier.scopus 2-s2.0-85205303737 en
dc.identifier.source scopus *
dc.identifier.uri https://hdl.handle.net/20.500.14243/519999 -
dc.language.iso eng en
dc.publisher.name Association for Computational Linguistics (ACL) en
dc.relation.conferencedate 2024 en
dc.relation.conferencename Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 en
dc.relation.conferenceplace Bangkok, Thailand en
dc.relation.firstpage 11892 en
dc.relation.ispartofbook Proceedings of the Annual Meeting of the Association for Computational Linguistics en
dc.relation.lastpage 11907 en
dc.relation.numberofpages 16 en
dc.subject.keywordseng Large Language Models (LLMs) -
dc.subject.keywordseng Detecting Synthetic Texts -
dc.subject.singlekeyword Large Language Models (LLMs) *
dc.subject.singlekeyword Detecting Synthetic Texts *
dc.title Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
iris.mediafilter.data 2025/04/15 04:08:06 *
iris.orcid.lastModifiedDate 2024/12/19 16:29:06 *
iris.orcid.lastModifiedMillisecond 1734622146439 *
iris.scopus.extIssued 2024 -
iris.scopus.extTitle Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models -
iris.sitodocente.maxattempts 1 -
scopus.authority.anceserie PROCEEDINGS OF THE CONFERENCE - ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. MEETING###0736-587X *
scopus.category 1203 *
scopus.category 3310 *
scopus.category 1706 *
scopus.contributor.affiliation University of Trento -
scopus.contributor.affiliation University of Trento -
scopus.contributor.affiliation ItaliaNLP Lab @ CNR-ILC -
scopus.contributor.affiliation University of Groningen -
scopus.contributor.affiliation ItaliaNLP Lab @ CNR-ILC -
scopus.contributor.affiliation University of Groningen -
scopus.contributor.affiliation Fondazione Bruno Kessler -
scopus.contributor.afid 60015986 -
scopus.contributor.afid 60015986 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60010023 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60010023 -
scopus.contributor.afid 60083112 -
scopus.contributor.auid 57220749030 -
scopus.contributor.auid 59206205800 -
scopus.contributor.auid 59156649100 -
scopus.contributor.auid 57222015938 -
scopus.contributor.auid 57540567000 -
scopus.contributor.auid 8281949100 -
scopus.contributor.auid 16686458800 -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Netherlands -
scopus.contributor.country Italy -
scopus.contributor.country Netherlands -
scopus.contributor.country Italy -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.dptid 121833164 -
scopus.contributor.dptid -
scopus.contributor.dptid 121833164 -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.name Daniela -
scopus.contributor.name Michele -
scopus.contributor.name Irene -
scopus.contributor.name Huiyuan -
scopus.contributor.name Felice -
scopus.contributor.name Malvina -
scopus.contributor.name Marco -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.surname Occhipinti -
scopus.contributor.surname Marchi -
scopus.contributor.surname Mondella -
scopus.contributor.surname Lai -
scopus.contributor.surname Dell'Orletta -
scopus.contributor.surname Nissim -
scopus.contributor.surname Guerini -
scopus.date.issued 2024 *
scopus.description.allpeopleoriginal Occhipinti D.; Marchi M.; Mondella I.; Lai H.; Dell'Orletta F.; Nissim M.; Guerini M. *
scopus.differences scopus.identifier.isbn *
scopus.differences scopus.identifier.doi *
scopus.differences scopus.relation.conferenceplace *
scopus.document.type cp *
scopus.document.types cp *
scopus.identifier.doi 10.18653/v1/2024.findings-acl.707 *
scopus.identifier.isbn 9798891760998 *
scopus.identifier.pui 645398611 *
scopus.identifier.scopus 2-s2.0-85205303737 *
scopus.journal.sourceid 21101138302 *
scopus.language.iso eng *
scopus.publisher.name Association for Computational Linguistics (ACL) *
scopus.relation.conferencedate 2024 *
scopus.relation.conferencename Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 *
scopus.relation.conferenceplace tha *
scopus.relation.firstpage 11892 *
scopus.relation.lastpage 11907 *
scopus.title Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models *
scopus.titleeng Fine-tuning with HED-IT: The impact of human post-editing for dialogical language models *
Appears in types: 04.01 Contributo in Atti di convegno (contribution in conference proceedings)
Files in this item:
File: 2024.findings-acl.707.pdf
Access: open access
License: Creative Commons
Size: 312.55 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/519999
Citations
  • Scopus 3