CNR Institutional Research Information System

Load-sensitive selective pruning for distributed search

Broccolo D; Macdonald C; Orlando S; Ounis I; Perego R; Silvestri F; Tonellotto N

2013

Abstract

A search engine infrastructure must be able to provide the same quality of service to all queries received during a day. During normal operating conditions, the demand for resources is considerably lower than under peak conditions, yet an oversized infrastructure would result in an unnecessary waste of computing power. A possible solution in this situation consists of defining a maximum processing time threshold for each query and dropping queries for which this threshold elapses, leading to disappointed users. In this paper, we propose and evaluate a different approach: given a set of query processing strategies with differing efficiency, each query is considered by a framework that sets a maximum query processing time and selects the best processing strategy for that query, such that the processing time for all queries is kept below the threshold. The processing time estimates used by the scheduler are learned from past queries. We experimentally validate our approach on 10,000 queries from a standard TREC dataset with over 50 million documents, and we compare it with several baselines. These experiments test the system under different query loads and different maximum tolerated query response times. Our results show that, at the cost of a marginal loss in response quality, our search system is able to answer 90% of queries within half a second during times of high query volume.
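The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the strategy names, the toy time predictors (the paper learns its estimates from past queries), and the idea of subtracting queueing time from the threshold are all assumptions made for illustration. The sketch picks the most effective strategy whose predicted processing time still fits the remaining time budget, and falls back to the cheapest strategy otherwise.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Strategy:
    name: str
    # Hypothetical predictor: estimated processing time (seconds) for a query.
    predict_seconds: Callable[[str], float]


def select_strategy(
    query: str,
    strategies: Sequence[Strategy],
    threshold_seconds: float,
    queued_seconds: float = 0.0,
) -> Strategy:
    """Pick the first strategy whose predicted time fits the remaining budget.

    `strategies` must be ordered from most effective (slowest) to least
    effective (fastest). If nothing fits, the last (cheapest) one is used.
    """
    if not strategies:
        raise ValueError("at least one strategy is required")
    budget = threshold_seconds - queued_seconds
    for strategy in strategies:
        if strategy.predict_seconds(query) <= budget:
            return strategy
    return strategies[-1]


# Toy predictors (assumption): cost grows linearly with the number of terms.
STRATEGIES = [
    Strategy("exhaustive", lambda q: 0.20 * len(q.split())),
    Strategy("pruned", lambda q: 0.05 * len(q.split())),
    Strategy("aggressive", lambda q: 0.01 * len(q.split())),
]

if __name__ == "__main__":
    for waited in (0.0, 0.4):
        chosen = select_strategy("load sensitive pruning", STRATEGIES, 0.5, waited)
        print(f"queued {waited:.1f}s -> {chosen.name}")
```

Under light load the query waits little in the queue, so a more effective strategy fits the budget; as the queueing time grows, the same query is routed to a cheaper strategy instead of being dropped.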

	Year
				2013

	Organizational units
				Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI

	ISBN
				978-1-4503-2263-8

	Keywords
				Distributed Search Engines
				Efficiency
				Effectiveness
				Throughput
				H.3.3 Information Search and Retrieval

	Appears in types:
				04.01 Conference proceedings contribution

Files in this item:

File	Access	Description	Type	Size	Format
prod_277740-doc_78313.pdf	Authorized users only	Load-Sensitive Selective Pruning for Distributed Search	Publisher's version (PDF)	906.93 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/253210
