Detecting deviant traces in business process logs is a crucial task in modern organizations due to the detrimental effect of certain deviant behaviors (e.g., attacks, frauds, faults). Training a Deviance Detection Model (DDM)only over labeled traces with supervised learning methods unfits real-life contexts where a small fraction of the traces are labeled. Thus, we here propose an Active-Learning-based approach to discovering a deep DDM ensemble that exploits a temporal ensembling method to train and fuse multiple DDMs sharing the same DNN architecture, devised in a way ensuring rapid convergence in relatively few training epochs. Experts' supervision is required only on small numbers of unlabelled traces exhibiting high values of (epistemic) prediction uncertainty, estimated in an ensemble-driven fashion. Tests on real data confirmed the approach's effectiveness, even compared to the results obtained by state-of-the-art supervised methods in the ideal case where all the data are labeled.

Combining Active Learning and Fast DNN Ensembles for Process Deviance Discovery

Francesco Folino;Gianluigi Folino;Massimo Guarascio;Luigi Pontieri
2022

Abstract

Detecting deviant traces in business process logs is a crucial task in modern organizations due to the detrimental effect of certain deviant behaviors (e.g., attacks, frauds, faults). Training a Deviance Detection Model (DDM)only over labeled traces with supervised learning methods unfits real-life contexts where a small fraction of the traces are labeled. Thus, we here propose an Active-Learning-based approach to discovering a deep DDM ensemble that exploits a temporal ensembling method to train and fuse multiple DDMs sharing the same DNN architecture, devised in a way ensuring rapid convergence in relatively few training epochs. Experts' supervision is required only on small numbers of unlabelled traces exhibiting high values of (epistemic) prediction uncertainty, estimated in an ensemble-driven fashion. Tests on real data confirmed the approach's effectiveness, even compared to the results obtained by state-of-the-art supervised methods in the ideal case where all the data are labeled.
2022
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
Process deviance discovery
Deep Ensembles
Active Learning
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/417490
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact