This sounds like that: explainable audio classification via prototypical parts
Guidotti R.; Pedreschi D.
2025
Abstract
The demand for understanding machine learning models has led to the development of interpretable-by-design models that provide both outcomes and explanations. In this paper, we extend the concept of Prototypical Part Networks to the audio domain with SonicProtoPNet. This model enables “this sounds like that” reasoning for audio classification, in which a test audio instance is classified based on the prototypical parts that most resemble specific areas of specific training instances. Quantitative results from genre and environmental sound classification, as well as musical instrument recognition tasks, demonstrate satisfactory performance using the Log-Mel transformation of the audio input signal, further supported by backbone pre-training on image-input data. Furthermore, we introduce a high-quality back-soundification method for the learned sonic prototypes, facilitating intuitive interpretation of classification decisions through auditory inspection.

| File | Size | Format |
|---|---|---|
| Fedele-Guidotti-Pedreschi_Springer 2025.pdf (authorized users only). Description: This Sounds Like That: Explainable Audio Classification via Prototypical Parts. Type: Publisher's version (PDF). License: Not public (private/restricted access) | 1.13 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


