CNR Institutional Research Information System

Tutorial: Supervised Learning for Prevalence Estimation

Moreo Fernandez AD; Sebastiani F

2019

Abstract

Quantification is the task of estimating, given a set of unlabelled items and a set of classes, the relative frequency (or "prevalence") of each class in the set of items. Quantification is important in many disciplines (e.g., market research, political science, the social sciences, and epidemiology) that usually deal with aggregate (as opposed to individual) data. In these contexts, classifying individual unlabelled instances is usually not a primary goal, while estimating the prevalence of the classes of interest in the data is. Quantification may in principle be solved via classification, i.e., by classifying each item in the set and counting, for each class, how many items have been labelled with it. However, it has been shown in a multitude of works that this "classify and count" (CC) method yields suboptimal quantification accuracy, one of the reasons being that most classifiers are optimized for classification accuracy, and not for quantification accuracy. As a result, quantification is no longer considered a mere byproduct of classification, and has evolved as a task of its own, devoted to designing methods and algorithms that deliver better prevalence estimates than CC. The goal of this tutorial is to introduce the main supervised learning techniques that have been proposed for solving quantification, the metrics used to evaluate them, and the most promising directions for further research.
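The "classify and count" (CC) baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not code from the tutorial: the function name `classify_and_count`, the keyword-based `toy_classifier`, and the sample reviews are all hypothetical, chosen only to show the idea of labelling every item and reporting the relative frequency of each predicted class.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Iterable, Sequence


def classify_and_count(
    items: Iterable[object],
    classes: Sequence[Hashable],
    classify: Callable[[object], Hashable],
) -> Dict[Hashable, float]:
    """Estimate class prevalences by classifying each item and counting labels."""
    counts = Counter(classify(item) for item in items)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot estimate prevalence on an empty set of items")
    # Relative frequency of each class among the predicted labels.
    return {c: counts.get(c, 0) / total for c in classes}


def toy_classifier(text: str) -> str:
    """Hypothetical stand-in for a trained classifier: a simple keyword rule."""
    return "positive" if "good" in text.lower() else "negative"


# Illustrative data only (assumed, not from the tutorial).
reviews = ["Good value", "Too slow", "Really good", "Broken on arrival"]
estimates = classify_and_count(reviews, ["positive", "negative"], toy_classifier)
print(estimates)
```

Any error the classifier makes on individual items feeds directly into these estimates, which is one way to see why the abstract describes CC as yielding suboptimal quantification accuracy.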

Year
	2019
Organizational units
	Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Keywords
	Te
	Tutorial
	Class Prevalence Estimation
	Supervised Learning
Appears in types:
	04.01 Contribution in conference proceedings

Files in this record:

File	Size	Format
prod_415597-doc_146375.pdf (authorized users only). Description: Tutorial: Supervised Learning for Prevalence Estimation. Type: Publisher's version (PDF)	177.12 kB	Adobe PDF
prod_415597-doc_146646.pdf (open access). Description: Tutorial: Supervised Learning for Prevalence Estimation. Type: Publisher's version (PDF)	190.28 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/374197
