Early exit strategies for approximate k-NN search in dense retrieval
Nardini F. M.; Perego R.; Trani S.
2024
Abstract
Learned dense representations are a popular family of techniques for encoding queries and documents using high-dimensional embeddings, which enable retrieval by performing approximate k-nearest-neighbors search (A-kNN). A popular technique for making A-kNN search efficient is based on a two-level index, where the document embeddings are clustered offline and, at query processing time, a fixed number N of clusters closest to the query is visited exhaustively to compute the result set. In this paper, we build upon the state of the art in early-exit A-kNN and propose an unsupervised method based on the notion of patience, which achieves competitive effectiveness with large efficiency gains. Moreover, we discuss a cascade approach where we first identify the queries that find their nearest neighbor within the closest τ ≪ N clusters, and then decide how many more clusters to visit based on our patience approach or other state-of-the-art strategies. Reproducible experiments employing state-of-the-art dense retrieval models and publicly available resources show that our techniques improve A-kNN efficiency with speedups of up to 5× while incurring negligible effectiveness losses. All the code used is available at https://github.com/francescobusolin/faiss_pEE
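As an illustration of the patience idea described in the abstract, the following Python/NumPy sketch visits the clusters of a two-level index in order of centroid distance and exits as soon as the running top-k has been stable for a fixed number of consecutive clusters. The function `search_with_patience`, its parameters (`patience`, `max_probe`), and the toy data are assumptions made for exposition, not the authors' implementation; the actual code is in the faiss_pEE repository linked above.

```python
# Minimal sketch (not the authors' implementation) of a patience-based
# early exit over a two-level index: clusters are visited in order of
# centroid distance, and the scan stops once the top-k result set has
# been stable for `patience` consecutive clusters.
import numpy as np

def search_with_patience(query, centroids, clusters, k=10, max_probe=64, patience=4):
    """Return (top-k doc ids, their distances, number of clusters visited)."""
    # Rank clusters by squared Euclidean distance of their centroid to the query.
    order = np.argsort(((centroids - query) ** 2).sum(axis=1))
    top_ids, top_dists = np.empty(0, dtype=np.int64), np.empty(0)
    unchanged, n_probe = 0, 0
    for n_probe, c in enumerate(order[:max_probe], start=1):
        doc_ids, doc_vecs = clusters[c]                # documents assigned to cluster c
        dists = ((doc_vecs - query) ** 2).sum(axis=1)  # exhaustive scan of the cluster
        cand_ids = np.concatenate([top_ids, doc_ids])
        cand_dists = np.concatenate([top_dists, dists])
        keep = np.argsort(cand_dists)[:k]              # merge into the running top-k
        new_ids = cand_ids[keep]
        # Patience counter: reset whenever the top-k changes, otherwise grow.
        unchanged = unchanged + 1 if n_probe > 1 and set(new_ids) == set(top_ids) else 0
        top_ids, top_dists = new_ids, cand_dists[keep]
        if unchanged >= patience:                      # early exit: top-k looks converged
            break
    return top_ids, top_dists, n_probe

# Toy usage with random data; a real index would use k-means centroids.
rng = np.random.default_rng(0)
docs = rng.normal(size=(2000, 32)).astype(np.float32)
centroids = docs[rng.choice(len(docs), 32, replace=False)]
assign = np.argmin(((docs[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
clusters = {c: (np.where(assign == c)[0], docs[assign == c]) for c in range(len(centroids))}
ids, dists, probed = search_with_patience(docs[0], centroids, clusters)
print(f"visited {probed}/{len(centroids)} clusters; nearest neighbor id: {ids[0]}")
```

Unlike a fixed probe budget N, a patience counter adapts per query: easy queries whose top-k stabilizes after a few clusters stop early, while harder ones keep scanning, which is the kind of per-query adaptivity the abstract's efficiency gains rely on.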
| File | Description | Type | License | Size | Format | |
|---|---|---|---|---|---|---|
| Early_Exit_Strategies_for_Learning-to-Rank_Cascades.pdf (open access) | Early Exit Strategies for Approximate k-NN Search in Dense Retrieval | Publisher's version (PDF) | Creative Commons | 1.84 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


