A simple method for classifier accuracy prediction under prior probability shift

Volpi L.; Moreo Fernandez A.; Sebastiani F.
2025

Abstract

The standard technique for predicting the accuracy that a classifier will have on unseen data (classifier accuracy prediction, CAP) is cross-validation (CV). However, CV relies on the assumption that the training data and the test data are sampled from the same distribution, an assumption that is often violated in real-world scenarios. When such violations occur (i.e., in the presence of dataset shift), the estimates returned by CV are unreliable. In this paper we propose a CAP method specifically designed to address prior probability shift (PPS), an instance of dataset shift in which the training and test distributions are characterized by different class priors. By solving a system of n² independent linear equations, with n the number of classes, our method estimates the entries of the contingency table of the test data, and thus allows estimating any specific evaluation measure. Since a key step in this method involves predicting the class priors of the test data, we further observe a connection between our method and the field of “learning to quantify”. Our experiments show that, when combined with state-of-the-art quantification techniques, under PPS our method tends to outperform existing CAP methods.
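The idea of recovering the test contingency table from a linear system can be illustrated with a minimal sketch. This is not the authors' exact formulation, which the abstract does not spell out. It assumes that, under PPS, the class-conditional prediction rates P(predicted = j | true = i) measured on held-out validation data also hold on the test data, and that the test class priors have already been estimated by some quantifier. The function name `estimate_test_contingency` and its arguments are hypothetical.

```python
import numpy as np

def estimate_test_contingency(val_confusion, test_priors):
    """Illustrative sketch (hypothetical helper, not the paper's code).

    val_confusion[i, j]: validation count of items with true class i predicted as j.
    test_priors[i]: estimated test prevalence of class i (e.g. from a quantifier).
    Returns an n x n table of estimated joint proportions on the test data.
    """
    val_confusion = np.asarray(val_confusion, dtype=float)
    test_priors = np.asarray(test_priors, dtype=float)
    n = val_confusion.shape[0]
    # rates[i, j] approximates P(predicted = j | true = i); assumed invariant under PPS.
    # Every class needs at least one validation example, or the row sum is zero.
    rates = val_confusion / val_confusion.sum(axis=1, keepdims=True)

    # Unknowns: the n*n table entries, flattened row by row.
    A = np.zeros((n * n, n * n))
    b = np.zeros(n * n)
    row = 0
    for i in range(n):
        cols = slice(i * n, (i + 1) * n)
        # One equation per class: row i of the table sums to the test prior of class i.
        A[row, cols] = 1.0
        b[row] = test_priors[i]
        row += 1
        # n - 1 equations per class: entry (i, j) equals rates[i, j] times the row sum.
        for j in range(n - 1):
            A[row, cols] = -rates[i, j]
            A[row, i * n + j] += 1.0
            row += 1
    # n + n * (n - 1) = n^2 equations in n^2 unknowns.
    return np.linalg.solve(A, b).reshape(n, n)

# Toy binary example: validation priors are 50/50, assumed test priors are 30/70.
table = estimate_test_contingency([[80, 20], [10, 90]], [0.3, 0.7])
predicted_accuracy = np.trace(table)  # accuracy is the sum of the diagonal entries
```

In this toy example the validation accuracy is 0.85, while the accuracy predicted for the shifted test priors is about 0.87. Any other measure defined on the contingency table (e.g., F1) could be computed from `table` in the same way.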
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
9783031789793
9783031789809
Classifier accuracy prediction
Prior probability shift
Label shift
Quantification
Files in this record:

978-3-031-78980-9_17.pdf

authorized users only

Description: A Simple Method for Classifier Accuracy Prediction Under Prior Probability Shift
Type: Publisher's version (PDF)
License: NOT PUBLIC - Private/restricted access
Size: 1 MB
Format: Adobe PDF

LEAP_for_CAP.pdf

embargoed until 27/01/2026

Description: This is the Author Accepted Manuscript (postprint) version of the following paper: Volpi L., Moreo A., Sebastiani F. “A Simple Method for Classifier Accuracy Prediction Under Prior Probability Shift”. DOI: 10.1007/978-3-031-78980-9_17.
Type: Post-print document
License: Other license type
Size: 3.02 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/532051