CNR Institutional Research Information System

A Multi-view Multi-dimensional Ensemble Learning Approach to Mining Business Process Deviances

Alfredo Cuzzocrea; Francesco Folino; Massimo Guarascio; Luigi Pontieri

2016

Abstract

The execution logs of a business process have recently been exploited to extract classification models for discriminating "deviant" instances, i.e., instances that diverge from normal or desired outcomes (e.g., frauds, faults, SLA violations). Regarding all log traces as sequences of task labels, current solutions essentially map each trace onto a vector space whose features correspond to sequence-oriented patterns, so that any standard classifier-induction method can be applied to separate the two classes of instances. An ensemble-learning approach was also recently proposed to combine multiple base learners trained on heterogeneous pattern-based log views. However, by simply abstracting each event into an activity symbol, these approaches disregard all the non-structural event data that are typically stored in real-life logs and that can nevertheless help improve the detection of deviances. Moreover, the usefulness of deviance models could be enhanced by equipping each prediction with a confidence measure, allowing the analyst to focus on (or prioritize) the more suspicious cases. To overcome these limitations, we propose a multi-view ensemble learning approach that: (i) fully exploits the multi-dimensional nature of log events, with the help of a clustering-based trace abstraction method; and (ii) implements a context- and probability-aware stacking method for combining the base models' predictions. Tests on a real-life log confirmed the validity of the approach and its capability to achieve compelling performance with respect to state-of-the-art methods.
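
The pattern-based encoding that the abstract attributes to current solutions can be illustrated with a minimal sketch. It is not the authors' implementation: the toy log, the function names (`ngrams`, `encode`, `meta_features`) and the choice of n-grams as the "sequence-oriented patterns" are all assumptions made for illustration, and the clustering-based trace abstraction is not shown. The last helper only hints at how a stacking input might combine base models' probabilities with context features.

```python
from collections import Counter


def ngrams(trace, n):
    """Return the contiguous n-grams of a trace as tuples."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def encode(traces, n):
    """Map each trace to a count vector over the n-grams seen in the log."""
    vocab = sorted({g for t in traces for g in ngrams(t, n)})
    vectors = []
    for t in traces:
        counts = Counter(ngrams(t, n))
        vectors.append([counts[g] for g in vocab])
    return vocab, vectors


def meta_features(base_probs, context):
    """Assumed stacking input: base models' deviance probabilities
    concatenated with context features of the trace."""
    return list(base_probs) + list(context)


# Hypothetical toy log: each trace is a sequence of task labels.
log = [
    ["register", "check", "approve"],
    ["register", "check", "check", "reject"],
]

# Two pattern-based views of the same log (single activities and bigrams).
vocab1, view1 = encode(log, 1)
vocab2, view2 = encode(log, 2)
```

Each view could feed a separate base classifier; a meta-learner would then be trained on rows such as `meta_features([0.8, 0.3], [4])`, where the two probabilities and the context value (here, a trace length) are made-up numbers.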

Year: 2016
Organizational units: Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
Keywords: Business process intelligence; Classification.
Appears in types: 04.01 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/309114
