In this article we study the trade-offs in designing ef?cient caching systems for Web search engines. We explore the impact of different approaches, such as static vs. dynamic caching, and caching query results vs. caching posting lists. Using a query log spanning a whole year, we explore the limitations of caching and we demonstrate that caching posting lists can achieve higher hit rates than caching query answers. We propose a new algorithm for static caching of posting lists, which outperforms previous methods. We also study the problem of ?nding the optimal way to split the static cache between answers and posting lists. Finally, we measure how the changes in the query log in?uence the effectiveness of static caching, given our observation that the distribution of the queries changes slowly over time. Our results and observations are applicable to different levels of the data-access hierarchy, for instance, for a memory/disk layer or a broker/remote server layer.

Design trade offs for search engine caching

Silvestri F
2008

Abstract

In this article we study the trade-offs in designing ef?cient caching systems for Web search engines. We explore the impact of different approaches, such as static vs. dynamic caching, and caching query results vs. caching posting lists. Using a query log spanning a whole year, we explore the limitations of caching and we demonstrate that caching posting lists can achieve higher hit rates than caching query answers. We propose a new algorithm for static caching of posting lists, which outperforms previous methods. We also study the problem of ?nding the optimal way to split the static cache between answers and posting lists. Finally, we measure how the changes in the query log in?uence the effectiveness of static caching, given our observation that the distribution of the queries changes slowly over time. Our results and observations are applicable to different levels of the data-access hierarchy, for instance, for a memory/disk layer or a broker/remote server layer.
2008
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Caching
Web search
Query logs
File in questo prodotto:
File Dimensione Formato  
prod_168826-doc_21528.pdf

solo utenti autorizzati

Descrizione: Design trade offs for search engine caching
Tipologia: Versione Editoriale (PDF)
Dimensione 1.32 MB
Formato Adobe PDF
1.32 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/150237
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 76
  • ???jsp.display-item.citation.isi??? 51
social impact