A simple method for classifier accuracy prediction under prior probability shift
Volpi L.; Moreo Fernandez A.; Sebastiani F.
2025
Abstract
The standard technique for predicting the accuracy that a classifier will have on unseen data (classifier accuracy prediction – CAP) is cross-validation (CV). However, CV relies on the assumption that the training data and the test data are sampled from the same distribution, an assumption that is often violated in real-world scenarios. When such violations occur (i.e., in the presence of dataset shift), the estimates returned by CV are unreliable. In this paper we propose a CAP method specifically designed to address prior probability shift (PPS), an instance of dataset shift in which the training and test distributions are characterized by different class priors. By solving a system of n × n independent linear equations, with n the number of classes, our method estimates the entries of the contingency table of the test data, and thus allows estimating any specific evaluation measure. Since a key step in this method involves predicting the class priors of the test data, we further observe a connection between our method and the field of “learning to quantify”. Our experiments show that, when combined with state-of-the-art quantification techniques, under PPS our method tends to outperform existing CAP methods.

File | Access | Description | Type | License | Size | Format |
---|---|---|---|---|---|---|
978-3-031-78980-9_17.pdf | Authorized users only | A Simple Method for Classifier Accuracy Prediction Under Prior Probability Shift | Publisher's version (PDF) | Not public - private/restricted access | 1 MB | Adobe PDF |
LEAP_for_CAP.pdf | Embargoed until 27/01/2026 | This is the Author Accepted Manuscript (postprint) version of the following paper: Volpi L., Moreo A., Sebastiani F. “A Simple Method for Classifier Accuracy Prediction Under Prior Probability Shift”. DOI: 10.1007/978-3-031-78980-9_17. | Post-print document | Other type of license | 3.02 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.