The Trade-off between Efficiency, Sustainability and Explainability: a Comparative Study on the Quality Control of Laboratory Consumables
Pagliuca, Paolo; Pitolli, Francesca
2025
Abstract
Artificial Intelligence and Deep Learning have revolutionized the capability to handle vast amounts of data and maintain a high standard of efficiency in many sectors. However, there are still some major shortcomings. First, due to their increasing complexity, these methods are often used as black boxes, and the features they consider in order to predict the outputs are not easy to explain. Furthermore, complex deep networks consist of multiple layers performing several computations in parallel, with a remarkable demand for resources. This implies that these methods have a non-negligible impact on sustainability, waste and other environmental concerns. Finally, such sophisticated models cannot be used in real-time scenarios, in which the capability to respond promptly is pivotal. In this work, we delve into the analysis of deep network models coping with an industrial monitoring scenario involving the control of plastic consumables. In particular, we compared two state-of-the-art models, a vision transformer and a convolutional neural network, in terms of efficiency, sustainability and explainability. Our findings show that the former is slightly more effective, but less sustainable than the latter. With respect to explainability, both approaches identify specific portions of the input images affecting their predictions, hence providing a valid justification for the returned output. Nevertheless, the underlying mechanisms determining the importance attributions are quite different between the two models.


