Access structures for advanced similarity search in metric spaces
Gennaro C; Savino P
2003
Abstract
Similarity retrieval is an important paradigm for searching in environments where exact matching has little meaning. The mathematical notion of a metric space provides a useful abstraction of similarity and enlarges the set of data types for which similarity search can be performed efficiently. In this paper we consider the problem of organizing and searching large data sets from arbitrary metric spaces, and we discuss D-Index, a novel access structure for similarity search in metric data. D-Index combines a novel clustering technique with a pivot-based distance searching strategy to speed up similarity range and nearest-neighbor queries over large files of objects stored on disk. We also propose an extension of this access structure, eD-Index, which addresses the similarity self-join problem. Although this approach cannot eliminate the intrinsic quadratic complexity of similarity joins, experiments confirm significant performance improvements.
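The pivot-based strategy mentioned in the abstract can be illustrated with a minimal sketch. It is not the D-Index itself: the clustering technique and the disk organization are not described here and are left out. The sketch only shows generic pivot filtering, which relies on the triangle inequality: if |d(q, p) - d(o, p)| > r for some pivot p, then d(q, o) > r, so the object o can be discarded without computing d(q, o). The names `PivotTable`, `range_query`, and `euclidean`, and the choice of the Euclidean distance, are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance, used here only as an example of a metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class PivotTable:
    """Hypothetical in-memory index that precomputes object-to-pivot distances."""

    def __init__(
        self,
        objects: Sequence[Point],
        pivots: Sequence[Point],
        dist: Callable[[Point, Point], float],
    ) -> None:
        self.objects = list(objects)
        self.pivots = list(pivots)
        self.dist = dist
        # One row per object, one column per pivot.
        self.table = [[dist(o, p) for p in self.pivots] for o in self.objects]

    def range_query(self, q: Point, r: float) -> Tuple[List[Point], int]:
        """Return the objects within distance r of q, plus the number of
        object distances that had to be computed after pivot filtering."""
        q_to_pivots = [self.dist(q, p) for p in self.pivots]
        hits: List[Point] = []
        computed = 0
        for obj, row in zip(self.objects, self.table):
            # Triangle inequality: |d(q, p) - d(o, p)| is a lower bound on d(q, o).
            if any(abs(dq - do) > r for dq, do in zip(q_to_pivots, row)):
                continue  # safely pruned, no distance computation needed
            computed += 1
            if self.dist(q, obj) <= r:
                hits.append(obj)
        return hits, computed
```

The filter never discards a true answer, because the bound it uses holds in any metric space; how many objects it prunes depends on the data and on the choice of pivots.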