A Scalable Architecture Exploiting Elastic Stack and Meta Ensemble of Classifiers for Profiling User Behaviour

Gianluigi Folino; Francesco Sergio Pisani
2022

Abstract

Many organisations generate and store large user and application logs at a rate that makes them hard to analyse, especially in real time. In cybersecurity in particular, it is of great interest to quickly analyse user logs coming from different, heterogeneous sources in order to prevent data breaches caused by user behaviour. A further problem is that part of the data, or even entire sources, are often missing. To overcome these issues, we propose a framework based on the Elastic Stack (ELK) that processes and stores log data coming from different users and applications and generates an ensemble of classifiers in order to classify user behaviour and, eventually, to detect anomalies. The system exploits the scalable architecture of ELK by running on top of a Kubernetes platform and adopts a distributed evolutionary algorithm to classify users on the basis of their digital footprints, derived from many data sources. Preliminary experiments show that the system is effective in classifying the behaviour of different users, and that this classification can serve as an auxiliary task for detecting anomalies in their behaviour, helping to reduce the number of false alarms.
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
user behaviour
cybersecurity
ensemble learning

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/417492