On the Discovery of Explainable and Accurate Behavioral Models for Complex Lowly-Structured Business Processes
Francesco Folino;Massimo Guarascio;Luigi Pontieri
2015
Abstract
Process discovery (i.e. the automated induction of a behavioral process model from execution logs) is an important tool for business process analysts/managers, who can exploit the extracted knowledge in key process improvement and (re-)design tasks. Unfortunately, when directly applied to the logs of complex and/or lowly-structured processes, such techniques tend to produce low-quality workflow schemas, featuring both poor readability ("spaghetti-like") and low fitness (i.e. low ability to reproduce log traces). Trace clustering methods alleviate this problem by helping detect different execution scenarios, for which simpler and better-fitting workflow schemas can eventually be discovered. However, most of these methods focus only on the sequence of activities performed in each log trace, without fully exploiting the non-structural data (such as case data and environmental variables) available in many real logs, which might well help discover more meaningful (context-related) process variants. To overcome these limitations, we propose a two-phase clustering-based process discovery approach, where the clusters are inherently defined through logical decision rules over context data, ensuring a satisfactory trade-off between the readability/explainability of the discovered clusters and the behavioral fitness of the workflow schemas eventually extracted from them. The approach has been implemented in a system prototype, which supports the discovery, evaluation and reuse of such multi-variant process models. Experimental results on a real-life log confirmed the capability of our approach to achieve compelling performance with respect to state-of-the-art clustering approaches, in terms of both fitness and explainability.
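The idea of clusters defined by decision rules over context data, each with its own simpler workflow schema, can be illustrated with a small sketch. This is not the authors' algorithm: the toy log, the attribute names (`channel`, `amount`), the hand-written rules, the directly-follows graph used as a stand-in for a workflow schema, and the crude replay-based fitness measure are all assumptions made for illustration only.

```python
from collections import defaultdict

# Toy log (hypothetical): each trace carries context data and an activity sequence.
LOG = [
    {"context": {"channel": "web", "amount": 50},
     "activities": ["receive", "check", "ship"]},
    {"context": {"channel": "web", "amount": 80},
     "activities": ["receive", "check", "ship"]},
    {"context": {"channel": "phone", "amount": 900},
     "activities": ["receive", "approve", "check", "ship"]},
    {"context": {"channel": "web", "amount": 700},
     "activities": ["receive", "approve", "check", "ship"]},
]

# Hypothetical decision rules over context data; the first matching rule wins.
RULES = [
    ("high_value", lambda ctx: ctx["amount"] >= 500),
    ("standard", lambda ctx: True),  # default rule
]


def assign_cluster(context):
    """Return the name of the first rule whose condition holds for the context."""
    for name, condition in RULES:
        if condition(context):
            return name
    raise ValueError("no rule matched")


def with_boundaries(trace):
    """Wrap the activity sequence with artificial start and end markers."""
    return ["<start>"] + trace["activities"] + ["<end>"]


def directly_follows(traces):
    """Build a directly-follows edge set, used here as a simplified schema."""
    edges = set()
    for trace in traces:
        acts = with_boundaries(trace)
        edges.update(zip(acts, acts[1:]))
    return edges


def fitness(edges, traces):
    """Fraction of traces whose every step is an edge of the schema."""
    if not traces:
        return 0.0
    replayable = 0
    for trace in traces:
        acts = with_boundaries(trace)
        if all(pair in edges for pair in zip(acts, acts[1:])):
            replayable += 1
    return replayable / len(traces)


# Partition the log with the rules, then build one schema per cluster.
clusters = defaultdict(list)
for trace in LOG:
    clusters[assign_cluster(trace["context"])].append(trace)

models = {name: directly_follows(traces) for name, traces in clusters.items()}

for name, edges in models.items():
    print(name, "edges:", len(edges),
          "fitness on own cluster:", fitness(edges, clusters[name]))
print("single schema for the whole log, edges:", len(directly_follows(LOG)))
```

In this toy setting each per-cluster schema has fewer edges than the single schema built from the whole log, and each cluster is described by a human-readable condition on context data, which is the kind of readability/fitness trade-off the abstract refers to.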