Increasing attention has been paid to the detection and analysis of "deviant" instances of a business process that are connected with some kind of "hidden" undesired behavior (e.g., frauds, faults). In particular, several recent works faced the problem of inducing a binary classification model (here named Deviance Detection Model) that can discriminate between deviant traces and normal ones, based on a set of historical log traces (labelled as either deviant or normal). Current solutions rely on applying standard classifier-induction methods to a feature-based representation of the given traces, where the features include sequence-based patterns extracted from the corresponding sequences of activities. However, there is no consensus on which kinds of patterns are the most suitable for such a task. On the other hand, mixing multiple pattern families together may produce a heterogenous, redundant and sparse representation of the traces that likely leads to poor deviance-detection models. In this paper, we propose an ensemble-learning method for solving this problem, where multiple base classifiers are trained on different feature-based views of the log (obtained each by mapping the traces onto a distinguished collection of patterns). A stacking procedure is used to combine the discovered base models into an overall probabilistic model that associates any new trace with an estimate of the probability that it reflects a deviant process instance. This helps the analyst prioritize the inspection of the cases that are more likely to be deviant. The method also takes advantage of all non-structural data available in the log, and employs a resampling mechanism to deal with the rarity of deviances in the training log. It has been conceived as the core of a comprehensive framework for detecting and analyzing business process deviances. The framework supports the analyst to investigate suspect deviances, and provides some feedback to the learning method for improving the accuracy of the discovered deviance detection models. Tests on several real-life datasets proved the validity of the approach, as concerns its capability to discover an accurate deviance detection model, and to effectively exploit new (originally unlabeled) traces via active learning and self-training mechanisms.
A ROBUST AND VERSATILE MULTI-VIEW LEARNING FRAMEWORK FOR THE DETECTION OF DEVIANT BUSINESS PROCESS INSTANCES
Cuzzocrea A;Folino F;Guarascio M;Pontieri L
2016
Abstract
Increasing attention has been paid to the detection and analysis of "deviant" instances of a business process that are connected with some kind of "hidden" undesired behavior (e.g., frauds, faults). In particular, several recent works faced the problem of inducing a binary classification model (here named Deviance Detection Model) that can discriminate between deviant traces and normal ones, based on a set of historical log traces (labelled as either deviant or normal). Current solutions rely on applying standard classifier-induction methods to a feature-based representation of the given traces, where the features include sequence-based patterns extracted from the corresponding sequences of activities. However, there is no consensus on which kinds of patterns are the most suitable for such a task. On the other hand, mixing multiple pattern families together may produce a heterogenous, redundant and sparse representation of the traces that likely leads to poor deviance-detection models. In this paper, we propose an ensemble-learning method for solving this problem, where multiple base classifiers are trained on different feature-based views of the log (obtained each by mapping the traces onto a distinguished collection of patterns). A stacking procedure is used to combine the discovered base models into an overall probabilistic model that associates any new trace with an estimate of the probability that it reflects a deviant process instance. This helps the analyst prioritize the inspection of the cases that are more likely to be deviant. The method also takes advantage of all non-structural data available in the log, and employs a resampling mechanism to deal with the rarity of deviances in the training log. It has been conceived as the core of a comprehensive framework for detecting and analyzing business process deviances. The framework supports the analyst to investigate suspect deviances, and provides some feedback to the learning method for improving the accuracy of the discovered deviance detection models. Tests on several real-life datasets proved the validity of the approach, as concerns its capability to discover an accurate deviance detection model, and to effectively exploit new (originally unlabeled) traces via active learning and self-training mechanisms.File | Dimensione | Formato | |
---|---|---|---|
prod_361926-doc_163459.pdf
solo utenti autorizzati
Descrizione: A Robust and Versatile Multi-View Learning Framework for the Detection of Deviant Business Process Instances
Tipologia:
Versione Editoriale (PDF)
Licenza:
NON PUBBLICO - Accesso privato/ristretto
Dimensione
1.26 MB
Formato
Adobe PDF
|
1.26 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.