Similarity retrieval is an important paradigm for searching in environments where exact match has little meaning. Moreover, in order to enlarge the set of data types for which the similarity search can efficiently be performed, the notion of mathematical metric space provides a useful abstraction for similarity. In this paper we consider the problem of organizing and searching large data-sets from arbitrary metric spaces, and a novel access structure for similarity search in metric data, called D-Index, is discussed. D-Index combines a novel clustering technique and the pivot-based distance searching strategy to speed up execution of similarity range and nearest neighbor queries for large files with objects stored in disk memories. Moreover, we propose an extension of this access structure (eD-Index) which is able to deal with the problem of similarity self join. Though this approach is not able to eliminate the intrinsic quadratic complexity of similarity joins, significant performance improvements are confirmed by experiments.

Access structures for advanced similarity search in metric spaces

Gennaro C;Savino P;
2003

Abstract

Similarity retrieval is an important paradigm for searching in environments where exact match has little meaning. Moreover, in order to enlarge the set of data types for which the similarity search can efficiently be performed, the notion of mathematical metric space provides a useful abstraction for similarity. In this paper we consider the problem of organizing and searching large data-sets from arbitrary metric spaces, and a novel access structure for similarity search in metric data, called D-Index, is discussed. D-Index combines a novel clustering technique and the pivot-based distance searching strategy to speed up execution of similarity range and nearest neighbor queries for large files with objects stored in disk memories. Moreover, we propose an extension of this access structure (eD-Index) which is able to deal with the problem of similarity self join. Though this approach is not able to eliminate the intrinsic quadratic complexity of similarity joins, significant performance improvements are confirmed by experiments.
2003
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
88-498-0629-9
Metric space
Access structure
Similarity Search
Similarity Join
Edit distance
File in questo prodotto:
File Dimensione Formato  
prod_91133-doc_62233.pdf

solo utenti autorizzati

Descrizione: Access structures for advanced similarity search in metric spaces
Tipologia: Versione Editoriale (PDF)
Dimensione 256.44 kB
Formato Adobe PDF
256.44 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/57588
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact