You've got the wrong number: evaluating deep learning training paradigms using handwritten digit recognition data
Ignesti G.; Martinelli M.; Moroni D.
2024
Abstract
To build more accurate and trustworthy deep learning algorithms, it is essential to understand the mechanisms that drive classification systems to identify their targets. Typically, post hoc methods provide insights into this process. In this preliminary work, we shift the reconstruction of the class activation map to the training phase to evaluate how the model's performance changes compared to standard classification approaches. The MNIST dataset and its variants, such as Fashion MNIST, consist of well-defined images that make them suitable for testing this type of training process: the classification targets are the only significant content in the images, apart from the background, allowing a direct comparison of the reconstruction against the input images. To better guide the network, we introduce a contrastive loss term that complements the standard classification loss, typically categorical cross-entropy. By comparing the accuracy and the extracted patterns of the standard approach with those of the proposed method, we gain valuable insights into the network's learning process. This approach aims to improve the interpretability and effectiveness of the model during training, ultimately leading to higher classification accuracy and reliability.
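The abstract combines a standard classification objective with a term that reconstructs the class activation map (CAM) during training and compares it with the input image. The sketch below is not the authors' code; it is a minimal illustration of that idea on MNIST-like 28x28 inputs. The network architecture, the weighting factor `lambda_rec`, and the use of a normalised-CAM MSE term as a stand-in for the paper's contrastive loss are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CamClassifier(nn.Module):
    """Toy classifier whose CAM can be formed at training time (assumed architecture)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # Global average pooling + linear head, so a CAM can be built from the
        # feature maps and the classifier weights (Zhou et al.-style CAM).
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        feats = self.features(x)                    # (B, 64, 28, 28)
        pooled = feats.mean(dim=(2, 3))             # global average pooling -> (B, 64)
        logits = self.classifier(pooled)            # (B, num_classes)
        # CAM for the predicted class: weighted sum of the feature maps.
        weights = self.classifier.weight[logits.argmax(dim=1)]   # (B, 64)
        cam = torch.einsum("bc,bchw->bhw", weights, feats)       # (B, 28, 28)
        return logits, cam


def training_loss(model, images, labels, lambda_rec: float = 0.1):
    """Combined objective: cross-entropy plus a CAM-vs-input reconstruction term."""
    logits, cam = model(images)
    # Standard classification objective (categorical cross-entropy).
    loss_cls = F.cross_entropy(logits, labels)
    # Normalise the CAM to [0, 1] and compare it with the input image; an MSE
    # term is used here only as a stand-in for the paper's contrastive term.
    cam = cam.unsqueeze(1)                          # (B, 1, 28, 28)
    cam_min = cam.amin(dim=(2, 3), keepdim=True)
    cam_max = cam.amax(dim=(2, 3), keepdim=True)
    cam = (cam - cam_min) / (cam_max - cam_min + 1e-6)
    loss_rec = F.mse_loss(cam, images)
    return loss_cls + lambda_rec * loss_rec
```

In this reading, the reconstruction term is computed at every training step rather than post hoc, which is the shift the abstract describes; the relative weight of the two terms (`lambda_rec` here) would control how strongly the extracted pattern is pushed toward the input.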
File | Description | Type | License | Size | Format
---|---|---|---|---|---
IMTA-IX-2024_CameraReady.pdf (open access) | Camera ready | Pre-print | Creative Commons | 370.33 kB | Adobe PDF