A ROBUST AND VERSATILE MULTI-VIEW LEARNING FRAMEWORK FOR THE DETECTION OF DEVIANT BUSINESS PROCESS INSTANCES

Cuzzocrea A; Folino F; Guarascio M; Pontieri L
2016

Abstract

Increasing attention has been paid to the detection and analysis of "deviant" instances of a business process, i.e., instances connected with some kind of "hidden" undesired behavior (e.g., frauds, faults). In particular, several recent works have addressed the problem of inducing a binary classification model (here named Deviance Detection Model) that can discriminate between deviant and normal traces, based on a set of historical log traces, each labelled as either deviant or normal. Current solutions apply standard classifier-induction methods to a feature-based representation of the given traces, where the features include sequence-based patterns extracted from the corresponding sequences of activities. However, there is no consensus on which kinds of patterns are the most suitable for this task. Moreover, mixing multiple pattern families together may produce a heterogeneous, redundant and sparse representation of the traces, which is likely to yield poor deviance-detection models. In this paper, we propose an ensemble-learning method for solving this problem, where multiple base classifiers are trained on different feature-based views of the log, each obtained by mapping the traces onto a distinct collection of patterns. A stacking procedure combines the discovered base models into an overall probabilistic model that associates any new trace with an estimate of the probability that it reflects a deviant process instance. This helps the analyst prioritize the inspection of the cases that are more likely to be deviant. The method also takes advantage of all non-structural data available in the log, and employs a resampling mechanism to deal with the rarity of deviances in the training log. It has been conceived as the core of a comprehensive framework for detecting and analyzing business process deviances.
The framework supports the analyst in investigating suspected deviances, and provides feedback to the learning method for improving the accuracy of the discovered deviance detection models. Tests on several real-life datasets confirmed the validity of the approach, with regard to its capability to discover an accurate deviance detection model and to effectively exploit new (originally unlabelled) traces via active learning and self-training mechanisms.
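The multi-view stacking idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two views (single activities and activity 2-grams), the tiny naive Bayes base learner, the logistic meta-learner, the random oversampling step and the toy log are all assumptions chosen only to keep the example self-contained and dependency-free.

```python
import math
import random
from collections import Counter

# Two hypothetical pattern-based views of a trace (a list of activity names).
def activity_view(trace):
    return set(trace)                      # individual activities

def bigram_view(trace):
    return set(zip(trace, trace[1:]))      # consecutive activity pairs

def oversample(traces, labels, rng):
    """Randomly duplicate minority-class traces until both classes are balanced."""
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    pool = [t for t, y in zip(traces, labels) if y == minority]
    extra = [rng.choice(pool) for _ in range(max(counts.values()) - counts[minority])]
    return traces + extra, labels + [minority] * len(extra)

class BernoulliNB:
    """Tiny Bernoulli naive Bayes over set-valued features (Laplace smoothing)."""
    def fit(self, feature_sets, labels):
        self.vocab = set().union(*feature_sets)
        self.n = Counter(labels)
        self.df = {0: Counter(), 1: Counter()}
        for fs, y in zip(feature_sets, labels):
            self.df[y].update(fs)
        return self

    def predict_proba(self, fs):
        """Estimated probability that the trace is deviant (class 1)."""
        log_post = {}
        for y in (0, 1):
            lp = math.log((self.n[y] + 1) / (sum(self.n.values()) + 2))
            for f in self.vocab:
                p = (self.df[y][f] + 1) / (self.n[y] + 2)
                lp += math.log(p if f in fs else 1 - p)
            log_post[y] = lp
        m = max(log_post.values())
        e0, e1 = math.exp(log_post[0] - m), math.exp(log_post[1] - m)
        return e1 / (e0 + e1)

def fit_logistic(X, y, epochs=500, lr=0.5):
    """Meta-learner: logistic regression trained by SGD (last weight is the bias)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = w[-1] + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1 / (1 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                w[j] -= lr * err * xj
            w[-1] -= lr * err
    return w

def fit_base_models(traces, labels, views, rng):
    """Train one base classifier per view on a class-balanced copy of the data."""
    t_bal, y_bal = oversample(traces, labels, rng)
    return [BernoulliNB().fit([v(t) for t in t_bal], y_bal) for v in views]

def train_stack(traces, labels, views, k=3, seed=0):
    """Stacking: the meta-learner is fitted on out-of-fold base predictions."""
    rng = random.Random(seed)
    idx = list(range(len(traces)))
    rng.shuffle(idx)
    meta_X = [None] * len(traces)
    for fold in range(k):
        held = set(idx[fold::k])
        train = [i for i in idx if i not in held]
        models = fit_base_models([traces[i] for i in train],
                                 [labels[i] for i in train], views, rng)
        for i in held:
            meta_X[i] = [m.predict_proba(v(traces[i])) for m, v in zip(models, views)]
    w = fit_logistic(meta_X, labels)
    return fit_base_models(traces, labels, views, rng), w

def deviance_probability(trace, models, w, views):
    """Combined estimate of the probability that a trace is deviant."""
    x = [m.predict_proba(v(trace)) for m, v in zip(models, views)]
    z = w[-1] + sum(wj * xj for wj, xj in zip(w, x))
    return 1 / (1 + math.exp(-z))

# Toy log (invented): deviant traces perform "d" right after "a".
normal = [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["a", "b", "d"], ["a", "c", "d"]] * 4
deviant = [["a", "d", "c"], ["a", "d", "b"], ["a", "d", "b", "c"]] * 2
traces, labels = normal + deviant, [0] * len(normal) + [1] * len(deviant)
views = [activity_view, bigram_view]

models, w = train_stack(traces, labels, views)
p_dev = deviance_probability(["a", "d", "c", "b"], models, w, views)
p_norm = deviance_probability(["a", "b", "c", "d"], models, w, views)
print(f"unseen suspicious trace: {p_dev:.3f}, normal trace: {p_norm:.3f}")
```

In this toy log the two test traces contain exactly the same activities, so only the 2-gram view can tell them apart; the meta-learner decides how much weight each view's score receives. Ranking traces by the combined probability mirrors the prioritization use case mentioned in the abstract.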
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
Business process intelligence
Classification
Deviance detection
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/321504