
AI 'News' Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian

Puccetti G.; Rogers A.; Alzetta C.; Dell'Orletta F.; Esuli A.
2024

Abstract

Large Language Models (LLMs) are increasingly used as 'content farm' models (CFMs), to generate synthetic text that could pass for real news articles. This is already happening even for languages that do not have high-quality monolingual LLMs. We show that fine-tuning Llama (v1), mostly trained on English, on as little as 40K Italian news articles, is sufficient for producing news-like texts that native speakers of Italian struggle to identify as synthetic. We investigate three LLMs and three methods of detecting synthetic texts (log-likelihood, DetectGPT, and supervised classification), finding that they all perform better than human raters, but they are all impractical in the real world (requiring either access to token likelihood information or a large dataset of CFM texts). We also explore the possibility of creating a proxy CFM: an LLM fine-tuned on a similar dataset to one used by the real 'content farm'. We find that even a small amount of fine-tuning data suffices for creating a successful detector, but we need to know which base LLM is used, which is a major challenge. Our results suggest that there are currently no practical methods for detecting synthetic news-like texts 'in the wild', while generating them is too easy. We highlight the urgency of more NLP research on this problem.
DC field Value Language
dc.authority.anceserie PROCEEDINGS OF THE CONFERENCE - ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. MEETING en
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.orgunit Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI en
dc.authority.people Puccetti G. en
dc.authority.people Rogers A. en
dc.authority.people Alzetta C. en
dc.authority.people Dell'Orletta F. en
dc.authority.people Esuli A. en
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.appartenenza.mi 973 *
dc.contributor.area Not assigned *
dc.contributor.area Not assigned *
dc.contributor.area Not assigned *
dc.contributor.area Not assigned *
dc.date.accessioned 2024/12/19 16:13:12 -
dc.date.available 2024/12/19 16:13:12 -
dc.date.firstsubmission 2024/12/18 16:50:48 *
dc.date.issued 2024 -
dc.date.submission 2024/12/18 16:50:48 *
dc.description.abstracteng Large Language Models (LLMs) are increasingly used as 'content farm' models (CFMs), to generate synthetic text that could pass for real news articles. This is already happening even for languages that do not have high-quality monolingual LLMs. We show that fine-tuning Llama (v1), mostly trained on English, on as little as 40K Italian news articles, is sufficient for producing news-like texts that native speakers of Italian struggle to identify as synthetic. We investigate three LLMs and three methods of detecting synthetic texts (log-likelihood, DetectGPT, and supervised classification), finding that they all perform better than human raters, but they are all impractical in the real world (requiring either access to token likelihood information or a large dataset of CFM texts). We also explore the possibility of creating a proxy CFM: an LLM fine-tuned on a similar dataset to one used by the real 'content farm'. We find that even a small amount of fine-tuning data suffices for creating a successful detector, but we need to know which base LLM is used, which is a major challenge. Our results suggest that there are currently no practical methods for detecting synthetic news-like texts 'in the wild', while generating them is too easy. We highlight the urgency of more NLP research on this problem. -
dc.description.allpeople Puccetti, G.; Rogers, A.; Alzetta, C.; Dell'Orletta, F.; Esuli, A. -
dc.description.allpeopleoriginal Puccetti G.; Rogers A.; Alzetta C.; Dell'Orletta F.; Esuli A. en
dc.description.fulltext open en
dc.description.numberofauthors 5 -
dc.identifier.doi 10.18653/v1/2024.acl-long.817 en
dc.identifier.isi WOS:001391776306025 -
dc.identifier.scopus 2-s2.0-85204461442 en
dc.identifier.source scopus *
dc.identifier.uri https://hdl.handle.net/20.500.14243/519993 -
dc.identifier.url https://aclanthology.org/2024.acl-long.817/ en
dc.language.iso eng en
dc.publisher.name Association for Computational Linguistics (ACL) en
dc.relation.conferencedate 2024 en
dc.relation.conferencename ACL 2024 - 62nd Annual Meeting of the Association for Computational Linguistics en
dc.relation.conferenceplace tha en
dc.relation.firstpage 15312 en
dc.relation.ispartofbook Proceedings of the Annual Meeting of the Association for Computational Linguistics en
dc.relation.lastpage 15338 en
dc.relation.numberofpages 27 en
dc.relation.volume 1 en
dc.subject.keywordseng Large Language Models (LLMs) -
dc.subject.keywordseng Detecting synthetic texts -
dc.subject.singlekeyword Large Language Models (LLMs) *
dc.subject.singlekeyword Detecting synthetic texts *
dc.title AI 'News' Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
iris.isi.extIssued 2024 -
iris.isi.extTitle AI 'News' Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian -
iris.mediafilter.data 2025/03/19 03:35:27 *
iris.orcid.lastModifiedDate 2025/03/16 11:48:07 *
iris.orcid.lastModifiedMillisecond 1742122087860 *
iris.scopus.extIssued 2024 -
iris.scopus.extTitle AI 'News' Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian -
iris.sitodocente.maxattempts 1 -
iris.unpaywall.bestoaversion publishedVersion *
iris.unpaywall.doi 10.18653/v1/2024.acl-long.817 *
iris.unpaywall.isoa true *
iris.unpaywall.journalisindoaj false *
iris.unpaywall.landingpage https://doi.org/10.18653/v1/2024.acl-long.817 *
iris.unpaywall.license cc-by *
iris.unpaywall.metadataCallLastModified 29/04/2026 05:53:50 -
iris.unpaywall.metadataCallLastModifiedMillisecond 1777434830651 -
iris.unpaywall.oastatus gold *
isi.authority.sdg Goal 3: Good health and well-being###12083 *
isi.category EV *
isi.category EX *
isi.category EP *
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation IT University Copenhagen -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.country Italy -
isi.contributor.country Denmark -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.name Giovanni -
isi.contributor.name Anna -
isi.contributor.name Chiara -
isi.contributor.name Felice -
isi.contributor.name Andrea -
isi.contributor.researcherId MIO-0767-2025 -
isi.contributor.researcherId KGX-6755-2024 -
isi.contributor.researcherId KVX-9760-2024 -
isi.contributor.researcherId AAX-1864-2020 -
isi.contributor.researcherId B-6343-2015 -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation Ist Linguist Computaz Antonio Zampolli -
isi.contributor.subaffiliation Ist Linguist Computaz Antonio Zampolli -
isi.contributor.subaffiliation -
isi.contributor.surname Puccetti -
isi.contributor.surname Rogers -
isi.contributor.surname Alzetta -
isi.contributor.surname Dell'Orletta -
isi.contributor.surname Esuli -
isi.date.issued 2024 *
isi.description.abstracteng Large Language Models (LLMs) are increasingly used as 'content farm' models (CFMs), to generate synthetic text that could pass for real news articles. This is already happening even for languages that do not have high-quality monolingual LLMs. We show that fine-tuning Llama (v1), mostly trained on English, on as little as 40K Italian news articles, is sufficient for producing news-like texts that native speakers of Italian struggle to identify as synthetic. We investigate three LLMs and three methods of detecting synthetic texts (log-likelihood, DetectGPT, and supervised classification), finding that they all perform better than human raters, but they are all impractical in the real world (requiring either access to token likelihood information or a large dataset of CFM texts). We also explore the possibility of creating a proxy CFM: an LLM fine-tuned on a similar dataset to one used by the real 'content farm'. We find that even a small amount of fine-tuning data suffices for creating a successful detector, but we need to know which base LLM is used, which is a major challenge. Our results suggest that there are currently no practical methods for detecting synthetic news-like texts 'in the wild', while generating them is too easy. We highlight the urgency of more NLP research on this problem. *
isi.description.allpeopleoriginal Puccetti, G; Rogers, A; Alzetta, C; Dell'Orletta, F; Esuli, A; *
isi.document.sourcetype WOS.ISTP *
isi.document.type Proceedings Paper *
isi.document.types Proceedings Paper *
isi.identifier.isi WOS:001391776306025 *
isi.journal.journaltitle PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS *
isi.language.original English *
isi.publisher.place 209 N EIGHTH STREET, STROUDSBURG, PA 18360 USA *
isi.relation.firstpage 15312 *
isi.relation.lastpage 15338 *
isi.title AI 'News' Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian *
scopus.authority.anceserie PROCEEDINGS OF THE CONFERENCE - ASSOCIATION FOR COMPUTATIONAL LINGUISTICS. MEETING###0736-587X *
scopus.category 1203 *
scopus.category 3310 *
scopus.category 1706 *
scopus.contributor.affiliation Istituto di Scienza e Tecnologia dell'Informazione “A. Faedo” -
scopus.contributor.affiliation IT University of Copenhagen -
scopus.contributor.affiliation Istituto di Linguistica Computazionale “Antonio Zampolli” -
scopus.contributor.affiliation Istituto di Linguistica Computazionale “Antonio Zampolli” -
scopus.contributor.affiliation Istituto di Scienza e Tecnologia dell'Informazione “A. Faedo” -
scopus.contributor.afid 131428502 -
scopus.contributor.afid 60018567 -
scopus.contributor.afid 60008941 -
scopus.contributor.afid 60008941 -
scopus.contributor.afid 131428502 -
scopus.contributor.auid 57220748419 -
scopus.contributor.auid 57198517078 -
scopus.contributor.auid 57192938832 -
scopus.contributor.auid 57540567000 -
scopus.contributor.auid 15044356100 -
scopus.contributor.country Italy -
scopus.contributor.country Denmark -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.dptid 114087935 -
scopus.contributor.dptid 114087935 -
scopus.contributor.dptid -
scopus.contributor.name Giovanni -
scopus.contributor.name Anna -
scopus.contributor.name Chiara -
scopus.contributor.name Felice -
scopus.contributor.name Andrea -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation ItaliaNLP Lab; -
scopus.contributor.subaffiliation ItaliaNLP Lab; -
scopus.contributor.subaffiliation -
scopus.contributor.surname Puccetti -
scopus.contributor.surname Rogers -
scopus.contributor.surname Alzetta -
scopus.contributor.surname Dell'Orletta -
scopus.contributor.surname Esuli -
scopus.date.issued 2024 *
scopus.description.abstracteng Large Language Models (LLMs) are increasingly used as 'content farm' models (CFMs), to generate synthetic text that could pass for real news articles. This is already happening even for languages that do not have high-quality monolingual LLMs. We show that fine-tuning Llama (v1), mostly trained on English, on as little as 40K Italian news articles, is sufficient for producing news-like texts that native speakers of Italian struggle to identify as synthetic. We investigate three LLMs and three methods of detecting synthetic texts (log-likelihood, DetectGPT, and supervised classification), finding that they all perform better than human raters, but they are all impractical in the real world (requiring either access to token likelihood information or a large dataset of CFM texts). We also explore the possibility of creating a proxy CFM: an LLM fine-tuned on a similar dataset to one used by the real 'content farm'. We find that even a small amount of fine-tuning data suffices for creating a successful detector, but we need to know which base LLM is used, which is a major challenge. Our results suggest that there are currently no practical methods for detecting synthetic news-like texts 'in the wild', while generating them is too easy. We highlight the urgency of more NLP research on this problem. *
scopus.description.allpeopleoriginal Puccetti G.; Rogers A.; Alzetta C.; Dell'Orletta F.; Esuli A. *
scopus.differences scopus.relation.conferencename *
scopus.differences scopus.identifier.isbn *
scopus.document.type cp *
scopus.document.types cp *
scopus.identifier.doi 10.18653/v1/2024.acl-long.817 *
scopus.identifier.isbn 9798891760943 *
scopus.identifier.pui 645308969 *
scopus.identifier.scopus 2-s2.0-85204461442 *
scopus.journal.sourceid 21101138302 *
scopus.language.iso eng *
scopus.publisher.name Association for Computational Linguistics (ACL) *
scopus.relation.conferencedate 2024 *
scopus.relation.conferencename 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 *
scopus.relation.conferenceplace tha *
scopus.relation.firstpage 15312 *
scopus.relation.lastpage 15338 *
scopus.relation.volume 1 *
scopus.title AI 'News' Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian *
scopus.titleeng AI 'News' Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian *
Appears in types: 04.01 Contributo in Atti di convegno
Files in this item:
File Size Format
2024.acl-long.817.pdf

open access

Description: AI 'News' Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian
Type: Published version (PDF)
License: Creative Commons
Size: 1.55 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14243/519993
Citations
  • Scopus: 6
  • Web of Science: 1