CNR Institutional Research Information System

Fast Ranking with Additive Ensembles of Oblivious and Non-Oblivious Regression Trees

Dato D; Lucchese C; Nardini F M; Orlando S; Perego R; Tonellotto N; Venturini R

2016

Abstract

Learning-to-Rank models based on additive ensembles of regression trees have proven very effective for scoring query results returned by large-scale Web search engines. Unfortunately, the computational cost of scoring thousands of candidate documents by traversing large ensembles of trees is high. Thus, several works have investigated solutions aimed at improving the efficiency of document scoring by exploiting advanced features of modern CPUs and memory hierarchies. In this article, we present QUICKSCORER, a new algorithm that adopts a novel cache-efficient representation of a given tree ensemble, performs an interleaved traversal by means of fast bitwise operations, and supports ensembles of oblivious trees. An extensive and detailed experimental assessment is conducted on two standard Learning-to-Rank datasets and on a novel, very large dataset that we made publicly available for conducting significant efficiency tests. The experiments show unprecedented speedups over the best state-of-the-art baselines, ranging from 1.9x to 6.6x. The analysis of low-level profiling traces shows that QUICKSCORER's efficiency is due to its cache-aware approach, in terms of both data layout and access patterns, and to a control flow that entails very low branch misprediction rates.
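The interleaved, bitwise traversal mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it does not reproduce the cache-efficient memory layout of the article. It assumes that every internal node tests `x[f] <= threshold` and sends the document to the left child when the test is true. It also assumes that leaves are numbered left to right, with leaf `i` mapped to bit `i` of a per-tree bitvector. The names `score`, `nodes_by_feature`, and `leaf_values`, as well as the toy two-tree ensemble, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# One entry per internal node: (threshold, tree_id, mask).
# The mask has a 0 for every leaf in the node's LEFT subtree, i.e. the
# leaves that become unreachable when the test x[f] <= threshold is false.
Node = Tuple[float, int, int]


def score(x: Sequence[float],
          nodes_by_feature: List[List[Node]],
          leaf_values: List[List[float]]) -> float:
    """Score one document; nodes of each feature are sorted by ascending threshold."""
    # One bitvector per tree: bit i set means leaf i is still a candidate.
    leafidx = [(1 << len(vals)) - 1 for vals in leaf_values]

    # Interleaved traversal: visit nodes feature by feature, across all trees.
    for f, nodes in enumerate(nodes_by_feature):
        for threshold, tree, mask in nodes:
            if x[f] <= threshold:
                break  # test is true here and for every larger threshold
            leafidx[tree] &= mask  # false node: discard its left-subtree leaves

    total = 0.0
    for tree, bits in enumerate(leafidx):
        # The exit leaf is the leftmost surviving leaf (lowest set bit here).
        exit_leaf = (bits & -bits).bit_length() - 1
        total += leaf_values[tree][exit_leaf]
    return total


# Toy ensemble (hypothetical):
# tree 0: x[0] <= 0.5 ? leaf0 : (x[1] <= 2.0 ? leaf1 : leaf2)
# tree 1: x[0] <= 1.5 ? leaf0 : leaf1
nodes_by_feature = [
    [(0.5, 0, 0b110), (1.5, 1, 0b10)],  # feature 0
    [(2.0, 0, 0b101)],                  # feature 1
]
leaf_values = [[1.0, 2.0, 3.0], [10.0, 20.0]]

print(score([1.0, 3.0], nodes_by_feature, leaf_values))
```

In this sketch no tree is walked from root to leaf. The inner loop stops at the first true test for each feature, and the only per-node work is a single bitwise AND, which is the kind of control flow the abstract credits for the low branch misprediction rates.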

Year: 2016

Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI

Keywords: Learning to rank; Additive ensembles of regression trees; Document scoring; Efficiency; Cache-awareness

Appears in types: 01.01 Journal article

Files in this item:

File	Access	Type	Size	Format
prod_366884-doc_121221.pdf	authorized users only	Publisher's version (PDF)	1 MB	Adobe PDF
prod_366884-doc_157375.pdf	open access	Publisher's version (PDF)	984.07 kB	Adobe PDF

Description (both files): Fast Ranking with Additive Ensembles of Oblivious and Non-Oblivious Regression Trees

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/329618
