Invisible to People but not to Machines: Evaluation of Style-aware Headline Generation in Absence of Reliable Human Judgment

De Mattei L; Cafagna M; Dell'Orletta F; Nissim M
2020

Abstract

We automatically generate headlines that are expected to comply with the specific styles of two different Italian newspapers. Through a data alignment strategy and different training/testing settings, we aim to decouple content from style and to preserve the latter in generation. To evaluate the quality of the generated headlines in terms of their compliance with a specific newspaper's style, we devise a fine-grained evaluation strategy based on automatic classification. We observe that our models do indeed learn newspaper-specific style. Importantly, we also observe that humans are not reliable judges for this task: although familiar with the newspapers, they are not able to discern their specific styles even in the original human-written headlines. The utility of automatic evaluation therefore goes beyond saving the costs and hurdles of manual annotation, and its design deserves particular care.
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC -
dc.authority.people De Mattei L it
dc.authority.people Cafagna M it
dc.authority.people Dell'Orletta F it
dc.authority.people Nissim M it
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/21 05:21:18 -
dc.date.available 2024/02/21 05:21:18 -
dc.date.issued 2020 -
dc.description.affiliations Department of Computer Science, University of Pisa, Italy; University of Malta, Malta; Istituto di Linguistica Computazionale "Antonio Zampolli" (ILC-CNR); University of Groningen, The Netherlands -
dc.description.allpeople De Mattei, L; Cafagna, M; Dell'Orletta, F; Nissim, M -
dc.description.allpeopleoriginal De Mattei L., Cafagna M., Dell'Orletta F., Nissim M. -
dc.description.fulltext none en
dc.description.numberofauthors 4 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/401393 -
dc.identifier.url http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.828.pdf -
dc.language.iso eng -
dc.relation.conferencedate 11-16/05/2020 -
dc.relation.conferencename 12th Edition of International Conference on Language Resources and Evaluation (LREC 2020) -
dc.relation.conferenceplace online -
dc.subject.keywords Natural Language Generation -
dc.subject.keywords Stylistic variations -
dc.subject.keywords Evaluation -
dc.subject.singlekeyword Natural Language Generation *
dc.subject.singlekeyword Stylistic variations *
dc.subject.singlekeyword Evaluation *
dc.title Invisible to People but not to Machines: Evaluation of Style-aware Headline Generation in Absence of Reliable Human Judgment en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
dc.type.referee Yes, but type not specified -
dc.ugov.descaux1 450806 -
Appears in types: 04.01 Contributo in Atti di convegno (contribution in conference proceedings)
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/401393