Similarity search in metric spaces is a general paradigm that applies to several application fields, one of which is content-based image retrieval. To become an effective complement to traditional Web-scale text-based image retrieval solutions, content-based image retrieval must be efficient and scalable. In this paper we investigate caching the answers to content-based image retrieval queries in metric spaces, with the aim of reducing the average cost of query processing and boosting overall system throughput. Our proposal allows the cache to return approximate answers with acceptable quality guarantees even if the query being processed has never been encountered before. Through tests on a collection of one million high-quality digital photos, we show that the proposed caching techniques can have a significant impact on performance. Moreover, we show that our caching algorithm does not suffer from cache pollution problems caused by near-duplicate query objects.
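The abstract does not spell out how a cache can answer a query it has never seen, so the following is only a minimal sketch of one way such a guarantee could be obtained, assuming each cache entry stores an exact k-nearest-neighbour result. If a new query q lies at distance d from a cached query whose k-th neighbour is at radius r, the triangle inequality implies that every dataset object closer than r - d to q is already in the cached result, so those objects are true nearest neighbours of q. The class name `ApproxResultCache`, its methods, and the toy one-dimensional distance are hypothetical and are not taken from the paper.

```python
from typing import Callable, Generic, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


class ApproxResultCache(Generic[T]):
    """Hypothetical cache of exact kNN results, reused for unseen queries."""

    def __init__(self, dist: Callable[[T, T], float]) -> None:
        self.dist = dist  # assumed to be a metric (triangle inequality holds)
        self.entries: List[Tuple[T, List[T], float]] = []

    def insert(self, query: T, knn_result: Sequence[T]) -> None:
        # Store the exact kNN result together with the distance of its
        # farthest object, i.e. the radius of the ball covered by the result.
        ranked = sorted(knn_result, key=lambda o: self.dist(query, o))
        radius = self.dist(query, ranked[-1])
        self.entries.append((query, ranked, radius))

    def lookup(self, query: T, k: int) -> Optional[Tuple[List[T], int]]:
        # Returns (approximate result, number of guaranteed-exact leading
        # objects), or None when no cached ball contains the query.
        best: Optional[Tuple[List[T], int]] = None
        for cached_query, objects, radius in self.entries:
            safe = radius - self.dist(query, cached_query)
            if safe <= 0:
                continue  # query is outside this cached ball: no guarantee
            ranked = sorted(objects, key=lambda o: self.dist(query, o))[:k]
            # Objects strictly closer than `safe` cannot be beaten by any
            # object missing from the cached result.
            guaranteed = sum(1 for o in ranked if self.dist(query, o) < safe)
            if best is None or guaranteed > best[1]:
                best = (ranked, guaranteed)
        return best


# Toy usage on the real line with absolute difference as the metric.
cache = ApproxResultCache(lambda a, b: abs(a - b))
cache.insert(1.0, [0.0, 1.0, 2.0])   # exact 3-NN of the query 1.0
print(cache.lookup(1.25, 3))         # unseen query near a cached one
print(cache.lookup(5.0, 3))          # unseen query far from every entry
```

In this sketch a lookup scans all entries linearly, which is enough to show the guarantee; a real system would need an index over the cached queries and an eviction policy, neither of which is described in the abstract.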
Caching algorithms for similarity search
Lucchese C; Falchi F; Perego R; Rabitti F; Orlando S
2009
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


