CNR Institutional Research Information System

A ROBUST AND VERSATILE MULTI-VIEW LEARNING FRAMEWORK FOR THE DETECTION OF DEVIANT BUSINESS PROCESS INSTANCES

Cuzzocrea A;Folino F;Guarascio M;Pontieri L

2016

Abstract

Increasing attention has been paid to detecting and analyzing "deviant" instances of a business process, i.e., instances connected with some kind of "hidden" undesired behavior (e.g., frauds, faults). In particular, several recent works have addressed the problem of inducing a binary classification model (here named Deviance Detection Model) that can discriminate between deviant and normal traces, based on a set of historical log traces (each labelled as either deviant or normal). Current solutions apply standard classifier-induction methods to a feature-based representation of the given traces, where the features include sequence-based patterns extracted from the corresponding sequences of activities. However, there is no consensus on which kinds of patterns are the most suitable for this task. Moreover, mixing multiple pattern families together may produce a heterogeneous, redundant and sparse representation of the traces, which is likely to yield poor deviance-detection models. In this paper, we propose an ensemble-learning method for solving this problem, where multiple base classifiers are trained on different feature-based views of the log (each obtained by mapping the traces onto a distinct collection of patterns). A stacking procedure combines the discovered base models into an overall probabilistic model that associates any new trace with an estimate of the probability that it reflects a deviant process instance. This helps the analyst prioritize the inspection of the cases that are more likely to be deviant. The method also takes advantage of all non-structural data available in the log, and employs a resampling mechanism to deal with the rarity of deviances in the training log. It has been conceived as the core of a comprehensive framework for detecting and analyzing business process deviances. The framework helps the analyst investigate suspected deviances, and provides feedback to the learning method for improving the accuracy of the discovered deviance detection models. Tests on several real-life datasets confirmed the validity of the approach, as concerns both its capability to discover an accurate deviance detection model and its ability to effectively exploit new (originally unlabeled) traces via active learning and self-training mechanisms.
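The multi-view stacking idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two views (single-activity counts and consecutive activity pairs), the tiny hand-written logistic regression used as both base and meta learner, random oversampling as the resampling step, and the invented toy log. For brevity the meta-learner is fitted on in-sample base predictions, whereas a careful stacking setup would normally use out-of-fold predictions.

```python
import math
import random
from collections import Counter

def activity_view(trace):
    """Hypothetical view 1: counts of individual activities."""
    return Counter(trace)

def bigram_view(trace):
    """Hypothetical view 2: counts of consecutive activity pairs."""
    return Counter(zip(trace, trace[1:]))

VIEWS = [activity_view, bigram_view]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict(model, feats):
    weights, bias = model
    return sigmoid(bias + sum(weights[f] * v for f, v in feats.items()))

def train_logreg(rows, labels, epochs=300, lr=0.1):
    """Sparse logistic regression fitted with plain stochastic gradient steps."""
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for feats, y in zip(rows, labels):
            err = y - predict((weights, bias), feats)
            bias += lr * err
            for f, v in feats.items():
                weights[f] += lr * err * v
    return weights, bias

def oversample(traces, labels, rng):
    """Randomly duplicate deviant traces until the two classes are balanced."""
    deviant = [t for t, y in zip(traces, labels) if y == 1]
    normal = [t for t, y in zip(traces, labels) if y == 0]
    extra = [rng.choice(deviant) for _ in range(len(normal) - len(deviant))]
    balanced = [(t, 0) for t in normal] + [(t, 1) for t in deviant + extra]
    rng.shuffle(balanced)
    return [t for t, _ in balanced], [y for _, y in balanced]

def base_scores(base_models, trace):
    """Meta-features: the deviance probability estimated by each view's model."""
    return {i: predict(m, view(trace))
            for i, (m, view) in enumerate(zip(base_models, VIEWS))}

def fit_stack(traces, labels, seed=0):
    traces, labels = oversample(traces, labels, random.Random(seed))
    # One base classifier per view of the log.
    base_models = [train_logreg([view(t) for t in traces], labels)
                   for view in VIEWS]
    # Simplification: in-sample base predictions feed the meta-learner.
    meta_rows = [base_scores(base_models, t) for t in traces]
    return base_models, train_logreg(meta_rows, labels)

def deviance_probability(stack, trace):
    base_models, meta = stack
    return predict(meta, base_scores(base_models, trace))

# Invented toy log: label 1 marks a deviant trace, 0 a normal one.
log = [
    (["receive", "check", "approve", "pay"], 0),
    (["receive", "check", "approve", "pay", "archive"], 0),
    (["receive", "check", "check", "approve", "pay"], 0),
    (["receive", "check", "reject"], 0),
    (["receive", "check", "approve", "pay"], 0),
    (["receive", "check", "reject", "archive"], 0),
    (["receive", "pay", "check", "approve"], 1),  # payment before the check
    (["receive", "approve", "pay"], 1),           # check skipped
]
stack = fit_stack([t for t, _ in log], [y for _, y in log])
p_normal = deviance_probability(stack, ["receive", "check", "approve", "pay"])
p_deviant = deviance_probability(stack, ["receive", "pay", "check", "approve"])
print(f"normal-looking trace: {p_normal:.3f}")
print(f"deviant-looking trace: {p_deviant:.3f}")
```

The two scored traces contain exactly the same activities in a different order, so the activity-count view cannot tell them apart; any difference in the final estimate comes from the pair-based view. This is a small-scale illustration of why combining several views can help when no single pattern family is known to be best.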

Year: 2016
Organizational units: Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
Keywords: Business process intelligence; Classification; Deviance detection
Publication type: 01.01 Journal article

Files in this record:

File	Size	Format
prod_361926-doc_163459.pdf (publisher's version, PDF; not public, restricted access for authorized users only)	1.26 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/321504
