Dynamic pruning strategies permit efficient retrieval by not fully scoring all postings of the documents matching a query - without degrading the retrieval effectiveness of the topranked results. However, the amount of pruning achievable for a query can vary, resulting in queries taking different amounts of time to execute. Knowing in advance the execution time of queries would permit the exploitation of online algorithms to schedule queries across replicated servers in order to minimise the average query waiting and completion times. In this work, we investigate the impact of dynamic pruning strategies on query response times, and propose a framework for predicting the efficiency of a query. Within this framework, we analyse the accuracy of several query efficiency predictors across 10,000 queries submitted to in-memory inverted indices of a 50-million-document Web crawl. Our results show that combining multiple efficiency predictors with regression can accurately predict the response time of a query before it is executed. Moreover, using the efficiency predictors to facilitate online scheduling algorithms can result in a 22% reduction in the mean waiting time experienced by queries before execution, and a 7% reduction in the mean completion time experienced by users.

Learning to predict response times for online query scheduling

Tonellotto N;
2012

Abstract

Dynamic pruning strategies permit efficient retrieval by not fully scoring all postings of the documents matching a query - without degrading the retrieval effectiveness of the topranked results. However, the amount of pruning achievable for a query can vary, resulting in queries taking different amounts of time to execute. Knowing in advance the execution time of queries would permit the exploitation of online algorithms to schedule queries across replicated servers in order to minimise the average query waiting and completion times. In this work, we investigate the impact of dynamic pruning strategies on query response times, and propose a framework for predicting the efficiency of a query. Within this framework, we analyse the accuracy of several query efficiency predictors across 10,000 queries submitted to in-memory inverted indices of a 50-million-document Web crawl. Our results show that combining multiple efficiency predictors with regression can accurately predict the response time of a query before it is executed. Moreover, using the efficiency predictors to facilitate online scheduling algorithms can result in a 22% reduction in the mean waiting time experienced by queries before execution, and a 7% reduction in the mean completion time experienced by users.
2012
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
978-1-4503-1472-5
Performance
Experimentation
H.3.3 Information Search & Retrieval
File in questo prodotto:
File Dimensione Formato  
prod_218941-doc_51431.pdf

solo utenti autorizzati

Descrizione: Learning to predict response times for online query scheduling
Tipologia: Versione Editoriale (PDF)
Dimensione 1.18 MB
Formato Adobe PDF
1.18 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/4606
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 60
  • ???jsp.display-item.citation.isi??? ND
social impact