CNR Institutional Research Information System

Caching content-based queries for robust and efficient image retrieval

Falchi F; Lucchese C; Orlando S; Perego R; Rabitti F

2009

Abstract

To become an effective complement to traditional Web-scale text-based image retrieval solutions, content-based image retrieval must address scalability and efficiency issues. In this paper we investigate the possibility of caching the answers to content-based image retrieval queries in metric space, with the aim of reducing the average cost of query processing and boosting the overall system throughput. Our proposal exploits the similarity between the query object and the cache content, and allows the cache to return approximate answers with acceptable quality guarantees even if the query being processed has never been encountered in the past. Moreover, since popular images that are likely to be used as queries have several near-duplicate versions, we show that our caching algorithm is robust and does not suffer from cache pollution problems due to near-duplicate query objects. We report very promising results obtained with a collection of one million high-quality digital photos. We show that caching strategies are also worth pursuing in similarity search systems, since the proposed caching techniques can have a significant impact on performance, just as caching of text queries has proven effective for traditional Web search engines.
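The idea of answering an unseen query from cached results can be illustrated with a minimal sketch. This is not the authors' algorithm; it is an assumption-laden toy that relies only on the triangle inequality of a metric. The names `MetricCache`, `knn`, `put`, and `lookup` are hypothetical, the distance is assumed to be Euclidean, and a brute-force scan stands in for the expensive similarity index. If a new query falls inside the ball covered by a cached query, the cached objects are re-ranked for the new query, and those closer than the "safe" radius are guaranteed to be its true nearest neighbours.

```python
import math
from collections import OrderedDict


def knn(dataset, query, k, distance=math.dist):
    """Exact kNN by brute force; stands in for the costly similarity index."""
    scored = sorted(((obj, distance(query, obj)) for obj in dataset),
                    key=lambda pair: pair[1])
    return scored[:k]


class MetricCache:
    """LRU cache of exact kNN answers keyed by query object (hypothetical design)."""

    def __init__(self, capacity, distance=math.dist):
        self.capacity = capacity
        self.distance = distance
        self.entries = OrderedDict()  # query -> (results, radius)

    def put(self, query, results):
        """Store exact results: a non-empty list of (object, distance), sorted by distance."""
        radius = results[-1][1]  # distance of the k-th neighbour
        self.entries[query] = (results, radius)
        self.entries.move_to_end(query)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry

    def lookup(self, query, k):
        """Return (approximate results, count guaranteed exact), or None on a miss."""
        best = None
        for cached_query, (results, radius) in self.entries.items():
            # Any object closer to `query` than `safe` lies strictly inside the
            # cached ball, so it must already be among the cached results.
            safe = radius - self.distance(query, cached_query)
            if safe > 0 and (best is None or safe > best[0]):
                best = (safe, cached_query, results)
        if best is None:
            return None
        safe, cached_query, results = best
        self.entries.move_to_end(cached_query)  # refresh LRU position
        reranked = sorted(((obj, self.distance(query, obj)) for obj, _ in results),
                          key=lambda pair: pair[1])[:k]
        guaranteed = sum(1 for _, dist in reranked if dist < safe)
        return reranked, guaranteed


# Toy usage: 2D points stand in for image feature vectors.
dataset = [(float(x), float(y)) for x in range(5) for y in range(5)]
cache = MetricCache(capacity=100)
first_query = (2.0, 2.0)
cache.put(first_query, knn(dataset, first_query, k=9))

near_duplicate = (2.25, 2.0)  # never seen before, but close to a cached query
answer = cache.lookup(near_duplicate, k=4)
if answer is not None:
    results, guaranteed = answer
    print(f"{guaranteed} of {len(results)} results are guaranteed exact")
```

In this sketch a near-duplicate query is served from the existing entry and never inserted, which is one simple way to avoid filling the cache with many almost identical entries. A real system would also need a policy for deciding when the guaranteed portion of the answer is too small and the query must be sent to the index.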

Year: 2009
Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
ISBN: 978-1-60558-422-5
Keywords: H.3.3 Information Search and Retrieval; Content-based retrieval; Query-result caching; Metric space; Query popularity
Appears in types: 04.01 Conference proceedings contribution

Files in this item:

File	Description	Type	Access	Size	Format
prod_91957-doc_36537.pdf	Caching content-based queries for robust and efficient image retrieval	Publisher's version (PDF)	Authorized users only	1.64 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/62306
