Clustering spam emails into campaigns
SheikhAlishahi M
2015
Abstract
Spam emails are a fast-growing and costly problem associated with the Internet today. To fight spammers effectively, it is not enough to block spam messages; it is also necessary to analyze the behavior of spammers. This analysis is extremely difficult if the huge volume of spam messages is considered as a whole. Clustering spam emails into smaller groups according to their inherent similarity facilitates the discovery of spam campaigns sent by a spammer, so that the spammer's behavior can be analyzed. This paper proposes a methodology to group large sets of spam emails into spam campaigns on the basis of categorical attributes of spam messages. A new informative clustering algorithm, named Categorical Clustering Tree (CCTree), is introduced to cluster and characterize spam campaigns. The complexity of the algorithm is also analyzed, and its efficiency has been proven.


