CNR Institutional Research Information System

Preference learning for category-ranking based interactive text categorization

Aiolli F; Sperduti A; Sebastiani F

2007

Abstract

Category ranking is a variant of the multi-label text categorization problem in which, rather than performing a (hard) assignment of (zero, one, or more) categories from a predefined set C to a document d_j, we rank all categories in C according to their estimated 'degree of suitability' to d_j. Category ranking has many applications, all pertaining to 'interactive' classification contexts in which the system, rather than taking a final categorization decision, is simply required to support a human expert who is in charge of taking this decision. Despite its high application potential, category ranking has not received much attention from the information retrieval and text categorization communities, and has mainly been tackled by standard text categorization methods, i.e. by training one binary classifier for each category and ranking the categories by the confidence scores that the respective classifiers return when asked to classify d_j. In this paper we take a radically different approach to category ranking, one in which supervision is provided to the learner not in the standard form of labels attached to training documents, but in the form of preferences of type 'category c_1 is to be preferred to category c_2 for document d_j'. We apply to this problem a recently proposed, very general model for preference learning, and show, through experiments performed on the standard Reuters-21578 benchmark, that this model outperforms support vector machines, the learning method that has so far proved the best-performing one in comparative text categorization experiments.
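
The two forms of supervision contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the model used in the paper, which the abstract does not specify. It only shows, under stated assumptions, how a label set can be turned into preferences of the form 'c_1 is preferred to c_2 for d_j', and how categories are ranked by per-category scores. The function names, the document identifier, the example categories, and the scores are all hypothetical.

```python
from itertools import product


def preferences_from_labels(doc_id, relevant, categories):
    """Derive pairwise preferences (doc, preferred, less_preferred) from a label set.

    Assumption: every relevant category is preferred to every non-relevant one.
    """
    relevant = set(relevant)
    irrelevant = [c for c in categories if c not in relevant]
    return [(doc_id, c1, c2) for c1, c2 in product(sorted(relevant), irrelevant)]


def rank_categories(scores):
    """Rank categories by decreasing score; ties are broken alphabetically."""
    return sorted(scores, key=lambda c: (-scores[c], c))


def violated_preferences(preferences, scores):
    """Return the preferences that the given scores fail to respect."""
    return [(d, c1, c2) for d, c1, c2 in preferences if scores[c1] <= scores[c2]]


# Hypothetical example: four categories, two of them relevant to document "d1".
categories = ["acq", "corn", "earn", "grain"]
prefs = preferences_from_labels("d1", {"corn", "grain"}, categories)

# Made-up confidence scores, e.g. as returned by one binary classifier per category.
scores = {"acq": 0.1, "corn": 0.7, "earn": 0.8, "grain": 0.9}
ranking = rank_categories(scores)
violations = violated_preferences(prefs, scores)
```

With these made-up scores, the ranking places "earn" above "corn", so the preference ("d1", "corn", "earn") is violated. A preference-based learner would be trained to reduce such violations directly.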

Year: 2007
Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Keywords: Preference learning; Text classification; Kernel machines; Supervised learning Analysis and Indexing
Appears in types: 04.01 Conference proceedings contribution

Files in this item:

File	Size	Format
prod_91676-doc_131550.pdf (open access; publisher's version, PDF)	109.64 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/102635
