A novel access structure for similarity search in metric data, called Similarity Hashing (SH), is proposed. Its multi-level hash structure of separable buckets on each level supports easy insertion and bounded search costs, because at most one bucket needs to be accessed at each level for range queries up to a pre-defined value of search radius. At the same time, the number of distance computations is always significantly reduced by use of pre-computed distances obtained at insertion time. Buckets of static files can be arranged in such a way that the I/O costs never exceed the costs to scan a compressed sequential file. Experimental results demonstrate that the performance of SH is superior to the available tree-based structures. Contrary to tree organizations, the SH structure is suitable for distributed and parallel implementations.

A hashed schema for similarity search in metric spaces

Gennaro C;Savino P;
2000

Abstract

A novel access structure for similarity search in metric data, called Similarity Hashing (SH), is proposed. Its multi-level hash structure of separable buckets on each level supports easy insertion and bounded search costs, because at most one bucket needs to be accessed at each level for range queries up to a pre-defined value of search radius. At the same time, the number of distance computations is always significantly reduced by use of pre-computed distances obtained at insertion time. Buckets of static files can be arranged in such a way that the I/O costs never exceed the costs to scan a compressed sequential file. Experimental results demonstrate that the performance of SH is superior to the available tree-based structures. Contrary to tree organizations, the SH structure is suitable for distributed and parallel implementations.
2000
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Similarity search
Metric space
Information search and retrieval
File in questo prodotto:
File Dimensione Formato  
prod_228637-doc_56487.pdf

accesso aperto

Descrizione: A hashed schema for similarity search in metric spaces
Tipologia: Versione Editoriale (PDF)
Dimensione 459.07 kB
Formato Adobe PDF
459.07 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/116025
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact