A novel access structure for similarity search in metric data, called Similarity Hashing (SH), is proposed. Its multi-level hash structure of separable buckets on each level supports easy insertion and bounded search costs, because at most one bucket needs to be accessed at each level for range queries up to a pre-defined value of search radius. At the same time, the number of distance computations is always significantly reduced by use of pre-computed distances obtained at insertion time. Buckets of static files can be arranged in such a way that the I/O costs never exceed the costs to scan a compressed sequential file. Experimental results demonstrate that the performance of SH is superior to the available tree-based structures. Contrary to tree organizations, the SH structure is suitable for distributed and parallel implementations.
A hashed schema for similarity search in metric spaces
Gennaro C;Savino P;
2000
Abstract
A novel access structure for similarity search in metric data, called Similarity Hashing (SH), is proposed. Its multi-level hash structure of separable buckets on each level supports easy insertion and bounded search costs, because at most one bucket needs to be accessed at each level for range queries up to a pre-defined value of search radius. At the same time, the number of distance computations is always significantly reduced by use of pre-computed distances obtained at insertion time. Buckets of static files can be arranged in such a way that the I/O costs never exceed the costs to scan a compressed sequential file. Experimental results demonstrate that the performance of SH is superior to the available tree-based structures. Contrary to tree organizations, the SH structure is suitable for distributed and parallel implementations.File | Dimensione | Formato | |
---|---|---|---|
prod_228637-doc_56487.pdf
accesso aperto
Descrizione: A hashed schema for similarity search in metric spaces
Tipologia:
Versione Editoriale (PDF)
Dimensione
459.07 kB
Formato
Adobe PDF
|
459.07 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.