Classification-oriented Machine Learning methods are a precious tool, in modern Intrusion Detection Systems (IDSs), for discriminating between suspected intrusion attacks and normal behaviors. Many recent proposals in this field leveraged Deep Neural Network (DNN) methods, capable of learning effective hierarchical data representations automatically. However, many of these solutions were validated on data featuring stationary distributions and/or large amounts of training examples. By contrast, in real IDS applications, different kinds of attack tend to occur over time, and only a small fraction of the data instances is labeled (usually with far fewer examples of attacks than of normal behavior). A novel ensemble-based Deep Learning framework is proposed here that tries to face the challenging issues above. Basically, the non-stationary nature of IDS log data is faced by maintaining an ensemble consisting of a number of specialized base DNN classifiers, trained on disjoint chunks of the data instances' stream, plus a combiner model (reasoning on both the base classifiers predictions and original instance features). In order to learn deep base classifiers effectively from small training samples, an ad-hoc shared DNN architecture is adopted, featuring a combination of dropout capabilities, skip- connections, along with a cost-sensitive loss (for dealing with unbalanced data). Tests results, conducted on two benchmark IDS datasets and involving several competitors, confirmed the effectiveness of our proposal (in terms of both classification accuracy and robustness to data scarcity), and allowed us to evaluate different ensemble combination schemes.
On learning effective ensembles of deep neural networks for intrusion detection
Francesco Folino;Gianluigi Folino;Massimo Guarascio;Francesco Sergio Pisani;Luigi Pontieri
2021
Abstract
Classification-oriented Machine Learning methods are a precious tool, in modern Intrusion Detection Systems (IDSs), for discriminating between suspected intrusion attacks and normal behaviors. Many recent proposals in this field leveraged Deep Neural Network (DNN) methods, capable of learning effective hierarchical data representations automatically. However, many of these solutions were validated on data featuring stationary distributions and/or large amounts of training examples. By contrast, in real IDS applications, different kinds of attack tend to occur over time, and only a small fraction of the data instances is labeled (usually with far fewer examples of attacks than of normal behavior). A novel ensemble-based Deep Learning framework is proposed here that tries to face the challenging issues above. Basically, the non-stationary nature of IDS log data is faced by maintaining an ensemble consisting of a number of specialized base DNN classifiers, trained on disjoint chunks of the data instances' stream, plus a combiner model (reasoning on both the base classifiers predictions and original instance features). In order to learn deep base classifiers effectively from small training samples, an ad-hoc shared DNN architecture is adopted, featuring a combination of dropout capabilities, skip- connections, along with a cost-sensitive loss (for dealing with unbalanced data). Tests results, conducted on two benchmark IDS datasets and involving several competitors, confirmed the effectiveness of our proposal (in terms of both classification accuracy and robustness to data scarcity), and allowed us to evaluate different ensemble combination schemes.File | Dimensione | Formato | |
---|---|---|---|
prod_458775-doc_178461.pdf
solo utenti autorizzati
Descrizione: pdf
Tipologia:
Versione Editoriale (PDF)
Licenza:
NON PUBBLICO - Accesso privato/ristretto
Dimensione
1.85 MB
Formato
Adobe PDF
|
1.85 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.