The impact of negative samples on learning to rank

Nardini FM; Perego R; Trani S
2017

Abstract

Learning-to-Rank (LtR) techniques leverage machine learning algorithms and large amounts of training data to induce high-quality ranking functions. Given a set of documents and a user query, these functions predict a score for each document, and the scores are then used to induce a relevance ranking. The effectiveness of these learned functions has been shown to depend significantly on the data used to learn them. Several analysis and document selection strategies have been proposed in the past to address this aspect. In this paper we review the state-of-the-art proposals and report the results of a preliminary investigation of a new sampling strategy aimed at reducing the number of non-relevant query-document pairs, so as to significantly decrease the training time of the learning algorithm and to increase the final effectiveness of the model by reducing noise and redundancy in the training set.
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Learning systems
Document selection
Learning to rank
Negative samples
Reducing noise
Relevance ranking
Relevant query
Sampling strategies
State of the art

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/411774