CNR Institutional Research Information System

A Fake Follower Story: improving fake accounts detection on Twitter

Stefano Cresci;Roberto Di Pietro;Marinella Petrocchi;Angelo Spognardi;Maurizio Tesconi

2014

Abstract

Fake followers are Twitter accounts created to inflate the number of followers of a target account. They are dangerous to the social platform and beyond, since they may distort notions such as popularity and influence in the Twittersphere, thereby affecting the economy, politics, and society. This paper contributes along several dimensions. First, we review some of the most relevant existing features and rules (proposed by Academia and Media) for detecting anomalous Twitter accounts. Second, we create a gold standard of verified human and fake accounts. Third, we use the gold standard to train a set of machine-learning classifiers built over the reviewed rules and features. Most of the rules proposed by Media perform unsatisfactorily in revealing fake followers, while the features proposed by Academia for spam detection perform well. Building on the most promising features, we optimise the classifiers both to reduce overfitting and to lower the cost of gathering the data needed to compute the features. The final result is a "Class A" classifier that is general enough to thwart overfitting and that uses the least costly features, while still correctly classifying more than 95% of the accounts of the training set. Besides being supported by a thorough experimental methodology and being interesting in their own right, the findings reported in this paper pave the way for further investigation.
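The workflow summarised in the abstract (label a gold standard, compute features, train a classifier, check that it generalises) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Account` fields, the two profile-only features, the toy accounts in `GOLD`, and the one-threshold "stump" classifier are hypothetical and are not the features, data, or classifiers used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Account:
    followers: int
    friends: int
    tweets: int
    is_fake: bool  # gold-standard label (toy values, not real data)


def features(acc):
    """Hypothetical profile-only features: cheap, since no timeline crawl is needed."""
    return {
        "followers_per_friend": acc.followers / max(acc.friends, 1),
        "tweets": float(acc.tweets),
    }


def predict_fake(acc, name, threshold):
    """Rule: an account is flagged as fake when the feature is below the threshold."""
    return features(acc)[name] < threshold


def fit_stump(rows, name):
    """Choose the threshold (midpoint between sorted values) with the best training accuracy."""
    values = sorted({features(a)[name] for a in rows})
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])] or values
    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        acc = sum(predict_fake(a, name, t) == a.is_fake for a in rows) / len(rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def cross_val_accuracy(rows, name, k=4):
    """k-fold cross-validation: accuracy is measured only on accounts held out of training."""
    correct = 0
    for i in range(k):
        test = rows[i::k]
        train = [a for j, a in enumerate(rows) if j % k != i]
        t = fit_stump(train, name)
        correct += sum(predict_fake(a, name, t) == a.is_fake for a in test)
    return correct / len(rows)


# Toy gold standard, invented for this sketch only.
GOLD = [
    Account(300, 280, 1500, False),
    Account(120, 150, 800, False),
    Account(950, 400, 5200, False),
    Account(60, 90, 300, False),
    Account(3, 1800, 2, True),
    Account(10, 2000, 0, True),
    Account(1, 950, 5, True),
    Account(7, 1200, 1, True),
]

if __name__ == "__main__":
    for name in ("followers_per_friend", "tweets"):
        print(name, cross_val_accuracy(GOLD, name))
```

Scoring on held-out folds rather than on the training accounts is one common way to detect overfitting, and restricting the features to profile fields is one way to keep data-gathering costs low; the paper's own optimisation procedure may differ.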

Year: 2014
Organizational unit: Istituto di informatica e telematica - IIT
Keywords: fake followers detection; gold standard; machine learning; Twitter
Appears in types: 08.04 Technical report

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/264320
