Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation

Cima L.; Miaschi A.; Trujillo A.; Avvenuti M.; Dell'Orletta F.; Cresci S.
2025

Abstract

AI-generated counterspeech offers a promising and scalable strategy to curb online toxicity through direct replies that promote civil discourse. However, current counterspeech is one-size-fits-all, lacking adaptation to the moderation context and the users involved. We propose and evaluate multiple strategies for generating tailored counterspeech that is adapted to the moderation context and personalized for the moderated user. We instruct a LLaMA2-13B model to generate counterspeech, experimenting with various configurations based on different contextual information and fine-tuning strategies. We identify the configurations that generate persuasive counterspeech through a combination of quantitative indicators and human evaluations collected via a pre-registered mixed-design crowdsourcing experiment. Results show that contextualized counterspeech can significantly outperform state-of-the-art generic counterspeech in adequacy and persuasiveness, without compromising other characteristics. Our findings also reveal a poor correlation between quantitative indicators and human evaluations, suggesting that these methods assess different aspects and highlighting the need for nuanced evaluation methodologies. The effectiveness of contextualized AI-generated counterspeech and the divergence between human and algorithmic evaluations underscore the importance of increased human-AI collaboration in content moderation.
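The abstract reports a poor correlation between quantitative indicators and human evaluations but does not say here which correlation measure was used. As an illustration only, the sketch below computes a Spearman rank correlation between one automatic indicator and mean human ratings. The choice of Spearman, the function names, and the example scores are all assumptions made for this sketch and are not taken from the paper.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: one automatic indicator score and one mean human
# persuasiveness rating per generated reply (values invented for illustration).
indicator_scores = [0.10, 0.40, 0.35, 0.80, 0.60]
human_ratings = [3, 2, 5, 4, 1]

rho = spearman(indicator_scores, human_ratings)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near zero would indicate that the indicator and the human raters order the replies differently, which is the kind of divergence the abstract describes.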
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.orgunit Istituto di informatica e telematica - IIT en
dc.authority.people Cima L. en
dc.authority.people Miaschi A. en
dc.authority.people Trujillo A. en
dc.authority.people Avvenuti M. en
dc.authority.people Dell'Orletta F. en
dc.authority.people Cresci S. en
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di informatica e telematica - IIT *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 912 *
dc.contributor.appartenenza.mi 918 *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.date.accessioned 2026/03/03 15:17:04 -
dc.date.available 2026/03/03 15:17:04 -
dc.date.firstsubmission 2026/03/02 18:09:26 *
dc.date.issued 2025 -
dc.date.submission 2026/03/02 18:09:26 *
dc.description.allpeople Cima, L.; Miaschi, A.; Trujillo, A.; Avvenuti, M.; Dell'Orletta, F.; Cresci, S. -
dc.description.allpeopleoriginal Cima L.; Miaschi A.; Trujillo A.; Avvenuti M.; Dell'Orletta F.; Cresci S. en
dc.description.fulltext open en
dc.description.international no en
dc.description.numberofauthors 6 -
dc.identifier.doi 10.1145/3696410.3714507 en
dc.identifier.isi WOS:001505285200414 -
dc.identifier.scopus 2-s2.0-105005139241 en
dc.identifier.source scopus *
dc.identifier.uri https://hdl.handle.net/20.500.14243/570444 -
dc.language.iso eng en
dc.publisher.name Association for Computing Machinery, Inc en
dc.publisher.place 1601 Broadway, 10th Floor, NEW YORK, NY, UNITED STATES en
dc.relation.conferencedate 2025 en
dc.relation.conferencename 34th ACM Web Conference, WWW 2025 en
dc.relation.conferenceplace Sydney Convention and Exhibition Centre, aus en
dc.relation.firstpage 5022 en
dc.relation.ispartofbook WWW 2025 - Proceedings of the ACM Web Conference en
dc.relation.lastpage 5033 en
dc.relation.numberofpages 12 en
dc.subject.keywords content moderation -
dc.subject.keywords Counterspeech -
dc.subject.keywords generative AI -
dc.subject.keywords online toxicity -
dc.subject.keywords personalization -
dc.subject.singlekeyword content moderation *
dc.subject.singlekeyword Counterspeech *
dc.subject.singlekeyword generative AI *
dc.subject.singlekeyword online toxicity *
dc.subject.singlekeyword personalization *
dc.title Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
iris.isi.extIssued 2025 -
iris.isi.extTitle Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation -
iris.mediafilter.data 2026/03/04 02:52:21 *
iris.orcid.lastModifiedDate 2026/03/04 01:09:50 *
iris.orcid.lastModifiedMillisecond 1772582990316 *
iris.scopus.extIssued 2025 -
iris.scopus.extTitle Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation -
iris.sitodocente.maxattempts 1 -
iris.unpaywall.bestoaversion publishedVersion *
iris.unpaywall.doi 10.1145/3696410.3714507 *
iris.unpaywall.isoa true *
iris.unpaywall.landingpage https://doi.org/10.1145/3696410.3714507 *
iris.unpaywall.license cc-by *
iris.unpaywall.metadataCallLastModified 04/03/2026 04:34:00 -
iris.unpaywall.metadataCallLastModifiedMillisecond 1772595240623 -
iris.unpaywall.oastatus gold *
iris.unpaywall.pdfurl https://dl.acm.org/doi/pdf/10.1145/3696410.3714507 *
isi.authority.sdg Goal 3: Good health and well-being###12083 *
isi.category EV *
isi.contributor.affiliation University of Pisa -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation University of Pisa -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.affiliation Consiglio Nazionale delle Ricerche (CNR) -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.country Italy -
isi.contributor.name Lorenzo -
isi.contributor.name Alessio -
isi.contributor.name Amaury -
isi.contributor.name Marco -
isi.contributor.name Felice -
isi.contributor.name Stefano -
isi.contributor.researcherId KLP-3367-2024 -
isi.contributor.researcherId GCD-5321-2022 -
isi.contributor.researcherId Q-3946-2018 -
isi.contributor.researcherId M-7150-2019 -
isi.contributor.researcherId AAX-1864-2020 -
isi.contributor.researcherId Q-4031-2018 -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.subaffiliation -
isi.contributor.surname Cima -
isi.contributor.surname Miaschi -
isi.contributor.surname Trujillo -
isi.contributor.surname Avvenuti -
isi.contributor.surname Dell'Orletta -
isi.contributor.surname Cresci -
isi.date.issued 2025 *
isi.description.allpeopleoriginal Cima, L; Miaschi, A; Trujillo, A; Avvenuti, M; Dell'Orletta, F; Cresci, S; *
isi.document.sourcetype WOS.ISTP *
isi.document.type Proceedings Paper *
isi.document.types Proceedings Paper *
isi.identifier.doi 10.1145/3696410.3714507 *
isi.identifier.isi WOS:001505285200414 *
isi.journal.journaltitle PROCEEDINGS OF THE ACM WEB CONFERENCE 2025, WWW 2025 *
isi.language.original English *
isi.publisher.place 1601 Broadway, 10th Floor, NEW YORK, NY, UNITED STATES *
isi.relation.firstpage 5022 *
isi.relation.lastpage 5033 *
isi.title Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation *
scopus.category 1710 *
scopus.category 2611 *
scopus.category 2213 *
scopus.category 1804 *
scopus.category 1705 *
scopus.category 1802 *
scopus.category 1702 *
scopus.contributor.affiliation IIT-CNR -
scopus.contributor.affiliation ILC-CNR -
scopus.contributor.affiliation IIT-CNR -
scopus.contributor.affiliation University of Pisa -
scopus.contributor.affiliation ILC-CNR -
scopus.contributor.affiliation IIT-CNR -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60028868 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.auid 58886153500 -
scopus.contributor.auid 57211678681 -
scopus.contributor.auid 56421715800 -
scopus.contributor.auid 6602976787 -
scopus.contributor.auid 57540567000 -
scopus.contributor.auid 56178304900 -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.dptid -
scopus.contributor.name Lorenzo -
scopus.contributor.name Alessio -
scopus.contributor.name Amaury -
scopus.contributor.name Marco -
scopus.contributor.name Felice -
scopus.contributor.name Stefano -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.subaffiliation -
scopus.contributor.surname Cima -
scopus.contributor.surname Miaschi -
scopus.contributor.surname Trujillo -
scopus.contributor.surname Avvenuti -
scopus.contributor.surname Dell’Orletta -
scopus.contributor.surname Cresci -
scopus.date.issued 2025 *
scopus.description.allpeopleoriginal Cima L.; Miaschi A.; Trujillo A.; Avvenuti M.; Dell'Orletta F.; Cresci S. *
scopus.differences scopus.subject.keywords *
scopus.differences scopus.identifier.isbn *
scopus.document.type cp *
scopus.document.types cp *
scopus.funding.funders 501100021856 - Ministero dell'Università e della Ricerca; 501100000780 - European Commission; 501100000781 - European Research Council; 501100000781 - European Research Council; *
scopus.funding.ids 101113826; CUP B53D23013290006; *
scopus.identifier.doi 10.1145/3696410.3714507 *
scopus.identifier.isbn 9798400712746 *
scopus.identifier.pui 647338839 *
scopus.identifier.scopus 2-s2.0-105005139241 *
scopus.journal.sourceid 21101295705 *
scopus.language.iso eng *
scopus.publisher.name Association for Computing Machinery, Inc *
scopus.relation.conferencedate 2025 *
scopus.relation.conferencename 34th ACM Web Conference, WWW 2025 *
scopus.relation.conferenceplace Sydney Convention and Exhibition Centre, aus *
scopus.relation.firstpage 5022 *
scopus.relation.lastpage 5033 *
scopus.subject.keywords content moderation; Counterspeech; generative AI; online toxicity; personalization; *
scopus.title Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation *
scopus.titleeng Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation *
Appears in categories: 04.01 Contributo in Atti di convegno (conference proceedings paper)
Files in this item:
File Access License Size Format
3696410.3714507.pdf open access Creative Commons 2.08 MB Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/570444
Citations
  • PubMed Central: N/A
  • Scopus: 3
  • Web of Science (ISI): 2