CNR Institutional Research Information System

Expansion via prediction of importance with contextualization

MacAvaney S; Nardini FM; Perego R; Tonellotto N; Goharian N; Frieder O

2020

Abstract

The identification of relevance with little textual context is a primary challenge in passage retrieval. We address this problem with a representation-based ranking approach that: (1) explicitly models the importance of each term using a contextualized language model; (2) performs passage expansion by propagating the importance to similar terms; and (3) grounds the representations in the lexicon, making them interpretable. Passage representations can be pre-computed at index time to reduce query-time latency. We call our approach EPIC (Expansion via Prediction of Importance with Contextualization). We show that EPIC significantly outperforms prior importance-modeling and document expansion approaches. We also observe that the performance is additive with the current leading first-stage retrieval methods, further narrowing the gap between inexpensive and cost-prohibitive passage ranking approaches. Specifically, EPIC achieves an MRR@10 of 0.304 on the MS-MARCO passage ranking dataset with 78 ms average query latency on commodity hardware. We also find that the latency is further reduced to 68 ms by pruning document representations, with virtually no difference in effectiveness.
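
The abstract does not state the scoring formula, so the following is only an illustrative sketch and not the authors' implementation. It assumes that query and passage representations are sparse vectors indexed by vocabulary terms, that scoring is a dot product over shared terms, and that pruning keeps the k highest-weighted terms of each passage vector. All function names, the dictionary layout, and the toy weights are hypothetical. The last function shows how MRR@10, the metric reported above, is computed.

```python
from typing import Dict, List, Set

SparseVec = Dict[str, float]  # term -> importance weight (hypothetical layout)


def prune(vec: SparseVec, k: int) -> SparseVec:
    """Keep only the k highest-weighted terms of a passage vector."""
    top = sorted(vec.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(top)


def score(query: SparseVec, passage: SparseVec) -> float:
    """Dot product over the terms that the query and passage share."""
    return sum(w * passage[t] for t, w in query.items() if t in passage)


def mrr_at_10(rankings: List[List[str]], relevant: List[Set[str]]) -> float:
    """Mean reciprocal rank of the first relevant passage in the top 10."""
    total = 0.0
    for ranked, rel in zip(rankings, relevant):
        for rank, pid in enumerate(ranked[:10], start=1):
            if pid in rel:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Made-up weights; in this sketch passage vectors are built once, at index time.
passages = {
    "p1": {"latency": 0.9, "query": 0.6, "speed": 0.4, "the": 0.01},
    "p2": {"grape": 0.8, "wine": 0.7, "the": 0.02},
}
query = {"query": 1.0, "speed": 0.5}

# Pruning shrinks each stored vector; only the query is processed at query time.
pruned = {pid: prune(vec, k=3) for pid, vec in passages.items()}
ranking = sorted(pruned, key=lambda pid: score(query, pruned[pid]), reverse=True)
```

Because the vectors are keyed by vocabulary terms, each weight can be inspected directly, which is one way to read the abstract's point about interpretability. Dropping low-weight terms such as "the" in this toy example does not change the ranking.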

Year: 2020
Organizational units: Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Language: English
Conference title: 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
First page: 1573
Last page: 1576
ISBN: 9781450380164
DOI: https://dx.doi.org/10.1145/3397271.3401262
URL: https://doi.org/10.1145/3397271.3401262
Peer reviewed: Yes, but type not specified
Conference dates: 25-30 July 2020
Conference location: online
Keywords: Document representation; Query representation; Neural ranking; Efficient ranking
Scopus ID: 2-s2.0-85090127678
Web of Science ID: WOS:000722377700174
Number of authors: 6
Full text: partially_open
All authors: MacAvaney, S; Nardini, FM; Perego, R; Tonellotto, N; Goharian, N; Frieder, O
MIUR login type: 273
Type: info:eu-repo/semantics/conferenceObject
Type: 04 Conference contribution::04.01 Contribution in conference proceedings
Project title: Big Data to Enable Global Disruption of the Grapevine-powered Industries
Acronym: BigDataGrapes
Funding: H2020
Contract no.: 780751
Appears in types: 04.01 Contribution in conference proceedings

Files in this item:

File	Access	Description	Type	Size	Format
prod_440218-doc_157963.pdf	open access	preprint	Publisher's version (PDF)	1.06 MB	Adobe PDF
prod_440218-doc_158108.pdf	not available	Expansion via prediction of importance with contextualization	Publisher's version (PDF)	1.26 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/420623