Techniques of the Hamming embedding, producing bit string sketches, have been recently successfully applied to speed up similarity search. Sketches are usually compared by the Hamming distance, and applied to filter out non-relevant objects during the query evaluation. As several sketching techniques exist and each can produce sketches with different lengths, it is hard to select a proper configuration for a particular dataset. We assume that the (dis)similarity of objects is expressed by an arbitrary metric function, and we propose a way to efficiently estimate the quality of sketches using just a small sample set of data. Our approach is based on a probabilistic analysis of sketches which describes how separated are objects after projection to the Hamming space.

Selecting sketches for similarity search

Vadicamo L;
2018

Abstract

Techniques of the Hamming embedding, producing bit string sketches, have been recently successfully applied to speed up similarity search. Sketches are usually compared by the Hamming distance, and applied to filter out non-relevant objects during the query evaluation. As several sketching techniques exist and each can produce sketches with different lengths, it is hard to select a proper configuration for a particular dataset. We assume that the (dis)similarity of objects is expressed by an arbitrary metric function, and we propose a way to efficiently estimate the quality of sketches using just a small sample set of data. Our approach is based on a probabilistic analysis of sketches which describes how separated are objects after projection to the Hamming space.
2018
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Bit string sketches
Similarity Search
File in questo prodotto:
File Dimensione Formato  
prod_403042-doc_140239.pdf

non disponibili

Descrizione: Selecting sketches for similarity search
Tipologia: Versione Editoriale (PDF)
Dimensione 1.23 MB
Formato Adobe PDF
1.23 MB Adobe PDF   Visualizza/Apri   Richiedi una copia
prod_403042-doc_140759.pdf

accesso aperto

Descrizione: Selecting sketches for similarity search
Tipologia: Versione Editoriale (PDF)
Dimensione 655.12 kB
Formato Adobe PDF
655.12 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/387098
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? ND
social impact