
Information dissimilarity measures in decentralized knowledge distillation: a comparative analysis

Vadicamo L.; Carlini E.; Gennaro C.
2024

Abstract

Knowledge distillation (KD) is a key technique for transferring knowledge from a large, complex “teacher” model to a smaller, more efficient “student” model. Although initially developed for model compression, it has found applications across various domains thanks to the benefits of its knowledge transfer mechanism. While Cross Entropy (CE) and Kullback-Leibler (KL) divergence are commonly used as KD losses, this work investigates the applicability of loss functions based on underexplored information dissimilarity measures, such as Triangular Divergence (TD), Structural Entropic Distance (SED), and Jensen-Shannon Divergence (JS), under both independent and identically distributed (iid) and non-iid data distributions. The primary contribution of this study is an empirical evaluation of these dissimilarity measures in a decentralized learning context, i.e., one where independent clients collaborate without a central server coordinating the learning process. Additionally, the paper assesses client performance by comparing pairwise distillation averaging among clients against conventional peer-to-peer pairwise distillation. Results indicate that while the dissimilarity measures perform comparably in iid settings, non-iid distributions favor SED and JS, which also demonstrated consistent performance across clients.
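For concreteness, the divergence-based losses named in the abstract can be sketched over discrete probability vectors (e.g., softmax outputs of a teacher and a student). The sketch below is illustrative, not the paper's implementation; it covers KL, JS, and TD using their standard definitions, and omits SED, whose exact formulation should be taken from the paper. Function names and the example distributions are assumptions.

```python
import math

def kl(p, q):
    # Kullback-Leibler divergence KL(p || q); assumes q_i > 0 wherever p_i > 0.
    # Asymmetric: kl(p, q) != kl(q, p) in general.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    # Jensen-Shannon divergence: average KL of each distribution
    # against the mixture m = (p + q) / 2. Symmetric and bounded.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def td(p, q):
    # Triangular divergence: sum of (p_i - q_i)^2 / (p_i + q_i).
    # Symmetric; zero iff p == q.
    return sum((pi - qi) ** 2 / (pi + qi) for pi, qi in zip(p, q) if pi + qi > 0)

# Hypothetical teacher/student softmax outputs over 3 classes.
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]
loss_js = js(teacher, student)
loss_td = td(teacher, student)
```

In a KD loss, one of these measures replaces the usual CE/KL term between the teacher's and student's softened output distributions; symmetric measures like JS and TD do not require choosing which model plays the role of the reference distribution.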
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
ISBN: 9783031758225; 9783031758232
Keywords: Information dissimilarity measure; Divergence function; Knowledge distillation; Distributed intelligence
Files for this item:

2024_SISAP__Information_Dissimilarity_Measures_in_Decentralized_Knowledge_Distillation.pdf (open access)
Description: This is the submitted version (preprint) of the following paper: Molo M.J. et al. “Information Dissimilarity Measures in Decentralized Knowledge Distillation: A Comparative Analysis”, 2024, submitted to “Similarity Search and Applications. 17th International Conference, SISAP 2024, Providence, RI, USA, November 4–6, 2024, Proceedings”. The final published version is available on the publisher’s website: https://link.springer.com/chapter/10.1007/978-3-031-75823-2_12.
Type: Preprint
License: Creative Commons
Size: 1.32 MB
Format: Adobe PDF
978-3-031-75823-2_12.pdf (authorized users only)
Description: Information Dissimilarity Measures in Decentralized Knowledge Distillation: A Comparative Analysis
Type: Published version (PDF)
License: Non-public - private/restricted access
Size: 2.3 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/509644