You Can Spread But You Cannot Hide: Discovering Accurate Multi-modal Deep Fusion Models for Fake News Detection

Liliana Martirano;Paolo Zicari;Massimo Guarascio;Francesco Sergio Pisani;Carmela Comito
2024

Abstract

Modern social networks allow news to spread rapidly worldwide. However, this news is frequently unverified or shared on the basis of users’ opinions or beliefs, which can cause widespread confusion, erode public trust, and contribute to social and political instability. In this complex and evolving scenario, the early detection of fake news has become a critical issue. Multimodal approaches, which integrate various data types such as text, images, audio, video, and network structures, have shown promising results in addressing this problem. The literature presents different fusion strategies, but there is no consensus on which one is the most effective. In this work, we propose M3DUSA, a modular multi-modal framework able to combine different modalities to effectively detect malicious and misleading content. By using deep attention-based architectures, our framework discovers informative latent representations that can be combined using different early or late fusion strategies. Experiments conducted on a real-world dataset demonstrate the effectiveness of our solution. The results show that, while both early and late fusion approaches can effectively exploit the complementary contributions of different modalities, each can offer distinct advantages depending on the desired outcome.
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
9783031824265
9783031824272
Deep Fusion Methods
Heterogeneous Information Networks
Multimodal Fake News Detection
Social Networks
Files in this record:
978-3-031-82427-2_17.pdf (authorized users only)
Description: Article
Type: Publisher's version (PDF)
License: Non-public, private/restricted access
Size: 833.22 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/555886