CNR Institutional Research Information System

This paper discusses the design and implementation of an efficient caching system aimed to exploit the locality present in the queries submitted to a Web search engine. Previous works showed that there is a significative temporal locality in the queries, and demonstrated that caching query results is a viable strategy to increase search engine throughput. We enhance previous proposals in several directions. First we propose the adoption of a hybrid strategy for caching, where the results of the most frequently submitted queries are maintained in a static cache of fixed size, and only the queries that cannot be satisfied by the static cache compete for the use of a dynamic cache. We experimentally demonstrate the superiority of our hybrid strategy over a purely static or dynamic caching policy by evaluating the hit-rate achieved on three large query logs by varying the size of the cache, the percentage of static cache entries, and the replacement policy used for managing dynamic cache entries. Moreover, we show that search engine query logs also exhibit spatial locality, since users often require subsequent pages of results for the same query. Our caching system also take advantage of this type of locality by exploiting a sort of adaptive prefetching strategy. Finally, differently from other works, we accurately evaluate cost and scalability of our cache implementation.

A hybrid strategy for caching Web search engine results

Fagni T;Orlando S;Palmerini P;Perego R;Silvestri F

2003

Abstract

This paper discusses the design and implementation of an efficient caching system aimed to exploit the locality present in the queries submitted to a Web search engine. Previous works showed that there is a significative temporal locality in the queries, and demonstrated that caching query results is a viable strategy to increase search engine throughput. We enhance previous proposals in several directions. First we propose the adoption of a hybrid strategy for caching, where the results of the most frequently submitted queries are maintained in a static cache of fixed size, and only the queries that cannot be satisfied by the static cache compete for the use of a dynamic cache. We experimentally demonstrate the superiority of our hybrid strategy over a purely static or dynamic caching policy by evaluating the hit-rate achieved on three large query logs by varying the size of the cache, the percentage of static cache entries, and the replacement policy used for managing dynamic cache entries. Moreover, we show that search engine query logs also exhibit spatial locality, since users often require subsequent pages of results for the same query. Our caching system also take advantage of this type of locality by exploiting a sort of adaptive prefetching strategy. Finally, differently from other works, we accurately evaluate cost and scalability of our cache implementation.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2003
			
	Strutture organizzative
	
				Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
			
	Parole chiave
	
				Caching
Search engines
Query log analysis
Performance
			
	Appare nelle tipologie:
	
				04.01 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
prod_90997-doc_130961.pdf solo utenti autorizzati Descrizione: A hybrid strategy for caching Web search engine results Tipologia: Documento in Pre-print Dimensione 32.8 kB Formato Adobe PDF Visualizza/Apri Richiedi una copia	32.8 kB	Adobe PDF	Visualizza/Apri Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/56726

Citazioni

ND

ND

ND

social impact