CNR Institutional Research Information System

Bots in social and interaction networks: Detection and impact estimation

M. Mendoza; M. Tesconi; S. Cresci

2020

Abstract

The rise of social bots and their influence on social networks have attracted considerable research interest. Despite sustained efforts to detect social bots, distinguishing them from legitimate users remains difficult. Here, we propose a simple yet effective semi-supervised method that distinguishes bots from legitimate users with high accuracy. The method learns a joint representation of the social connections and interactions between users by leveraging graph-based representation learning. Then, on the proximity graph derived from the user embeddings, a sample of bots is used as seeds for a label propagation algorithm. We demonstrate that when label propagation follows pairwise account proximity, our method achieves F1 = 0.93, whereas other state-of-the-art techniques achieve F1 ≤ 0.87. Applying our method to a large dataset of retweets, we uncover distinct clusters of bots in the network of Twitter interactions. Interestingly, these clusters differ in their degree of integration with legitimate users. Our analysis of the interactions produced by the different bot clusters suggests that a significant group of users was systematically exposed to bot-produced content and to interactions with bots, indicating the presence of a selective exposure phenomenon.
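
The pipeline outlined in the abstract (user embeddings, a proximity graph, then propagation from bot seeds) can be illustrated with a minimal sketch. This is not the authors' implementation. The synthetic embeddings, the cosine k-nearest-neighbour graph, the random-walk-with-restart update, and every name and parameter below (knn_proximity_graph, propagate_bot_scores, k, alpha, iters) are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def knn_proximity_graph(emb, k=5):
    """Symmetric k-nearest-neighbour graph weighted by cosine similarity."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)               # exclude self-matches
    n = len(emb)
    adj = np.zeros((n, n))
    nearest = np.argsort(-sim, axis=1)[:, :k]    # k most similar accounts per row
    rows = np.repeat(np.arange(n), k)
    cols = nearest.ravel()
    adj[rows, cols] = np.clip(sim[rows, cols], 0.0, None)
    return np.maximum(adj, adj.T)                # make the graph undirected

def propagate_bot_scores(adj, seeds, alpha=0.85, iters=50):
    """Spread bot evidence from seed accounts (random walk with restart)."""
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                          # guard against isolated nodes
    trans = adj / deg                            # row-normalised transitions
    restart = np.zeros(len(adj))
    restart[seeds] = 1.0 / len(seeds)
    scores = restart.copy()
    for _ in range(iters):
        scores = alpha * trans.T @ scores + (1 - alpha) * restart
    return scores

# Hypothetical data: 30 "bot" and 30 "legitimate" accounts in an 8-d space.
rng = np.random.default_rng(0)
bot_center = np.zeros(8)
bot_center[0] = 4.0
human_center = np.zeros(8)
human_center[1] = 4.0
emb = np.vstack([
    bot_center + 0.3 * rng.standard_normal((30, 8)),
    human_center + 0.3 * rng.standard_normal((30, 8)),
])
is_bot = np.arange(60) < 30

seeds = np.array([0, 1, 2])                      # a small sample of labelled bots
scores = propagate_bot_scores(knn_proximity_graph(emb), seeds)
predicted_bot = scores > scores.mean()           # illustrative threshold only
```

In this toy setting the seeds' evidence stays mostly within the cluster that contains them, which is the intuition behind propagating labels along account proximity. The embedding model, graph construction, and decision rule used in the paper may differ.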

Year: 2020
Organizational units: Istituto di informatica e telematica - IIT
Keywords: Machine Learning; Data Mining
Appears in types: 01.01 Journal article

Files in this record:

File	Access	Type	License	Size	Format
prod_436332-doc_156260.pdf	Authorized users only	Publisher's version (PDF)	Not public - private/restricted access	7.3 MB	Adobe PDF
AAM_TOIS.pdf	Open access	Post-print	Creative Commons	9.28 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/381815
