Benchmarking Active Learning Techniques: Insights from Multi-Domain Fake News Detection
Scala F.; Vocaturo E.
2025
Abstract
The rapid spread of fake news poses a major societal challenge, requiring efficient and generalizable detection methods. Active Learning offers a viable solution by reducing annotation costs while enhancing model performance. This study benchmarks multiple Active Learning strategies for fake news detection across two distinct domains: political discourse (Politifact) and entertainment news (GossipCop). We evaluate uncertainty-based methods (Entropy Sampling, Least Confidence) alongside more advanced techniques (Core-Set, K-Means, BADGE, BALD), assessing their effectiveness, efficiency, and sustainability. Our findings highlight Entropy Sampling as the most accurate approach, particularly in the political domain, while K-Means emerges as the most computationally efficient. Additionally, we analyze the environmental impact of Active Learning-based training, underscoring its role in optimizing both performance and resource consumption. These insights contribute to the development of scalable and energy-efficient misinformation detection systems.

| File | Size | Format |
|---|---|---|
| paper8.pdf (open access; license: Creative Commons) | 1.23 MB | Adobe PDF |
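
The two uncertainty-based strategies named in the abstract, Entropy Sampling and Least Confidence, can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the toy probabilities, and the batch size are all hypothetical, and the code only assumes that a classifier has already produced class probabilities for each unlabeled article.

```python
import math

def entropy_score(probs):
    """Shannon entropy of one predicted class distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def least_confidence_score(probs):
    """One minus the top class probability (higher = more uncertain)."""
    return 1.0 - max(probs)

def select_most_uncertain(pool_probs, score_fn, k):
    """Return the indices of the k pool items with the highest uncertainty score."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: score_fn(pool_probs[i]),
                    reverse=True)
    return ranked[:k]

# Hypothetical (real, fake) probabilities for four unlabeled articles.
pool_probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.50, 0.50],
]

entropy_pick = select_most_uncertain(pool_probs, entropy_score, k=2)
lc_pick = select_most_uncertain(pool_probs, least_confidence_score, k=2)
print(entropy_pick, lc_pick)
```

In a binary task such as real versus fake, both scores rank items identically, because each is maximal when the two classes are equally likely. The two strategies can only select different items when there are three or more classes.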
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


