Recent advances in representation learning have enabled neural Information Retrieval (IR) systems to use learned dense representations for queries and documents to effectively handle semantics, language nuances, and vocabulary mismatch problems. In contrast to traditional IR systems that rely on word matching, dense IR models exploit query/document similarity in dense latent spaces to account for semantics. This requires substantial training data and comes with increased computational demands. Thus, it would be beneficial to predict how a system will perform for a given query to decide whether a dense IR model is the best option or alternatives should be used. Traditional Query Performance Prediction (QPP) models are designed for lexical IR approaches and perform sub-optimally when applied to dense neural IR systems. Therefore, there has been a renewed interest in QPP methods to improve their effectiveness for dense neural IR models. While the results of the new QPP methods are generally encouraging, there is ample room for improvement in absolute performance and stability. We argue that by using features more aligned with the underlying rationale of dense IR models, we can enhance the performance of QPP. In this respect, we propose the Projection-Displacement-Based QPP (PDQPP), which exploits the geometric properties of dense IR models, projects queries and retrieved documents onto subspaces defined by pseudo-relevant documents, and considers changes in retrieval scores within them as a proxy for retrieval coherence. Minor score changes suggest robust and coherent retrieval, while significant alterations indicate semantic divergence and potentially poor performance. Results over a wide range of experimental settings on both traditional (TREC Robust) and neural-oriented (TREC Deep Learning) test collections show that PDQPP mostly outperforms the state-of-the-art QPP baselines.

Projection-displacement-based query performance prediction for embedded space of dense retrievers

Muntean Cristina Ioana;Perego Raffaele;Tonellotto Nicola
2026

Abstract

Recent advances in representation learning have enabled neural Information Retrieval (IR) systems to use learned dense representations for queries and documents to effectively handle semantics, language nuances, and vocabulary mismatch problems. In contrast to traditional IR systems that rely on word matching, dense IR models exploit query/document similarity in dense latent spaces to account for semantics. This requires substantial training data and comes with increased computational demands. Thus, it would be beneficial to predict how a system will perform for a given query to decide whether a dense IR model is the best option or alternatives should be used. Traditional Query Performance Prediction (QPP) models are designed for lexical IR approaches and perform sub-optimally when applied to dense neural IR systems. Therefore, there has been a renewed interest in QPP methods to improve their effectiveness for dense neural IR models. While the results of the new QPP methods are generally encouraging, there is ample room for improvement in absolute performance and stability. We argue that by using features more aligned with the underlying rationale of dense IR models, we can enhance the performance of QPP. In this respect, we propose the Projection-Displacement-Based QPP (PDQPP), which exploits the geometric properties of dense IR models, projects queries and retrieved documents onto subspaces defined by pseudo-relevant documents, and considers changes in retrieval scores within them as a proxy for retrieval coherence. Minor score changes suggest robust and coherent retrieval, while significant alterations indicate semantic divergence and potentially poor performance. Results over a wide range of experimental settings on both traditional (TREC Robust) and neural-oriented (TREC Deep Learning) test collections show that PDQPP mostly outperforms the state-of-the-art QPP baselines.
2026
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Query performance prediction
File in questo prodotto:
File Dimensione Formato  
3765617.pdf

accesso aperto

Descrizione: Projection-displacement-based query performance prediction for embedded space of dense retrievers
Tipologia: Versione Editoriale (PDF)
Licenza: Creative commons
Dimensione 1.47 MB
Formato Adobe PDF
1.47 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/562502
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact