An Incremental Ensemble Evolved by using Genetic Programming to Efficiently Detect Drifts in Cyber Security Datasets

Folino, Gianluigi (co-first author);
Pisani, Francesco Sergio (co-first author);
Sabatino, Pietro (co-first author)
2016

Abstract

Unbalanced classes, the need to detect changes in real time, the speed of the streams, and other peculiar characteristics make most data mining algorithms unsuitable for datasets in the cyber security domain. To overcome these issues, we propose an ensemble-based algorithm that uses a distributed Genetic Programming framework to generate the function that combines the classifiers, together with efficient strategies for reacting to changes in the data. Once the base classifiers are trained, the combining function of the ensemble, which is built from non-trainable functions, can be generated without any extra training phase, while the adopted drift detection function, together with a strategy for replacing classifiers, allows the ensemble to respond efficiently to changes. Preliminary experiments conducted on an artificial dataset and on a real intrusion detection dataset show the effectiveness of the approach.
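
As a rough illustration of what a combining function built from non-trainable functions might look like, the following Python sketch composes class-wise average and maximum over the outputs of already-trained base classifiers. The function names, the probability values, and the particular composition are hypothetical and are not taken from the paper; in the approach described above, such a composition would be evolved by Genetic Programming rather than written by hand.

```python
from statistics import mean

# Each base classifier is assumed to output one probability per class,
# e.g. [p_normal, p_attack]. The combiners below have no parameters to fit,
# so composing them requires no extra training phase.

def avg(*outputs):
    """Class-wise average of the given probability vectors."""
    return [mean(col) for col in zip(*outputs)]

def maximum(*outputs):
    """Class-wise maximum of the given probability vectors."""
    return [max(col) for col in zip(*outputs)]

def predict(probs):
    """Index of the class with the highest combined score."""
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical outputs of three trained base classifiers for one sample.
c1 = [0.9, 0.1]
c2 = [0.4, 0.6]
c3 = [0.3, 0.7]

# One tree a GP run might produce (hand-written here for illustration).
combined = avg(maximum(c1, c2), c3)
label = predict(combined)
print(combined, label)
```

Because the combiners are fixed functions, replacing a base classifier after a detected drift would only change the inputs to the tree, not the tree's building blocks.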
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
intrusion detection systems, ensemble, data mining, cyber security
Files in this item:
File: An_Incremental_Ensemble_Evolved_by_using_Genetic_Programming_to.pdf
Access: authorized users only
License: NOT PUBLIC - Private/restricted access
Size: 238.68 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/556846