An incremental ensemble evolved by using genetic programming to efficiently detect drifts in cyber security datasets

Gianluigi Folino; Pietro Sabatino; Francesco Sergio Pisani
2016

Abstract

Unbalanced classes, the need to detect changes in real time, the speed of the streams, and other peculiar characteristics make most data mining algorithms ill-suited to datasets in the cyber security domain. To overcome these issues, we propose an ensemble-based algorithm that uses a distributed Genetic Programming framework to generate the function that combines the classifiers, together with efficient strategies for reacting to changes in the data. Once the base classifiers are trained, the combining function of the ensemble, which is built from non-trainable functions, can be generated without any extra training phase, while the adopted drift detection function, together with a strategy for replacing classifiers, makes it possible to respond efficiently to changes. Preliminary experiments conducted on an artificial dataset and on a real intrusion detection dataset show the effectiveness of the approach.
2016
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
English
Genetic and Evolutionary Computation Conference, GECCO 2016
ACM, Association for Computing Machinery
New York
United States of America
20-24 July 2016
Denver, Colorado, USA
Cyber security
Data mining
Ensemble
1
none
Gianluigi Folino; Pietro Sabatino; Francesco Sergio Pisani
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/308652
Citations
  • Scopus 26