On using query logs for static index pruning

Perego R; Silvestri F
2010

Abstract

Static index pruning techniques aim to remove from the posting lists of an inverted file the references to documents that are unlikely to be relevant for answering user queries. The reduction in index size results in better exploitation of memory hierarchies and faster query processing. On the other hand, pruning may reduce the precision of the information retrieval system, since pruned entries are unavailable at query processing time. The static pruning techniques proposed so far exploit query-independent measures to evaluate the importance of a document within a posting list. This paper proposes a general framework that aims to enhance the precision of any static pruning method by exploiting usage information extracted from query logs. Experiments conducted on the TREC WT10g Web collection and a large AltaVista query log show that integrating usage knowledge into the pruning process is profitable and remarkably improves the performance figures obtained with Carmel's state-of-the-art static pruning method.
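
The general idea can be illustrated with a small sketch. This is not the framework proposed in the paper, whose details are not given in the abstract. It only shows a per-term static pruning step that keeps the highest-scoring postings, plus one hypothetical way usage information from a query log could influence how much of each posting list is kept. The function name `prune_index`, the `keep_fraction` parameter, and the popularity-based adjustment are all assumptions made for illustration.

```python
import math
from collections import Counter


def prune_index(index, keep_fraction, term_query_freq=None):
    """Return a pruned copy of `index` (term -> list of (doc_id, score)).

    Without a query log, every posting list keeps its top `keep_fraction`
    entries by score. With `term_query_freq` (term -> count in the log),
    terms that are queried more often keep a larger share. This weighting
    is a hypothetical choice, not the method from the paper.
    """
    max_freq = max(term_query_freq.values()) if term_query_freq else 0
    pruned = {}
    for term, postings in index.items():
        fraction = keep_fraction
        if max_freq:
            # Popularity in [0, 1]: 1 for the most frequently queried term.
            popularity = term_query_freq.get(term, 0) / max_freq
            fraction = min(1.0, keep_fraction * (1.0 + popularity))
        keep = max(1, math.ceil(len(postings) * fraction))
        # Select the highest-scoring postings for this term.
        ranked = sorted(postings, key=lambda p: p[1], reverse=True)
        # Restore doc_id order, as posting lists are usually stored sorted.
        pruned[term] = sorted(ranked[:keep])
    return pruned


# Toy inverted index with made-up scores.
index = {
    "index": [(1, 0.9), (2, 0.1), (3, 0.5), (4, 0.3)],
    "pruning": [(1, 0.2), (3, 0.8), (5, 0.4), (7, 0.6)],
}

# Toy query log: count how often each term appears in past queries.
query_log = ["index pruning", "index", "index compression"]
term_query_freq = Counter(t for q in query_log for t in q.split())

query_independent = prune_index(index, keep_fraction=0.5)
usage_aware = prune_index(index, keep_fraction=0.5,
                          term_query_freq=term_query_freq)
```

In this toy setting the query-independent variant halves every posting list, while the usage-aware variant retains more postings for terms that appear often in the log. A real system would take the scores from its ranking function and would need to validate any such weighting against retrieval quality.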
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
978-1-4244-8482-9
Database Applications. Data mining
Query log mining
Static pruning
Indexing
Search engine
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/63109