In this article, we propose a new index representation based on clustering the collection of posting lists and, for each created cluster, building an ad hoc reference list with respect to which all lists in the cluster are encoded with Elias-Fano. We describe a posting lists clustering algorithm tailored for our encoder and two methods for building the reference list for a cluster. Both approaches are heuristic and differ in the way postings are added to the reference list: according to their frequency in the cluster or according to the number of bits necessary for their representation.

State-of-the-art encoders for inverted indexes compress each posting list individually. Encoding clusters of posting lists offers the possibility of reducing the redundancy of the lists while maintaining a noticeable query processing speed.

Clustered Elias-Fano Indexes

Pibiri GE;Venturini R
2017

Abstract

State-of-the-art encoders for inverted indexes compress each posting list individually. Encoding clusters of posting lists offers the possibility of reducing the redundancy of the lists while maintaining a noticeable query processing speed.
2017
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
In this article, we propose a new index representation based on clustering the collection of posting lists and, for each created cluster, building an ad hoc reference list with respect to which all lists in the cluster are encoded with Elias-Fano. We describe a posting lists clustering algorithm tailored for our encoder and two methods for building the reference list for a cluster. Both approaches are heuristic and differ in the way postings are added to the reference list: according to their frequency in the cluster or according to the number of bits necessary for their representation.
Elias-Fano encoding
inverted indexes
performance
File in questo prodotto:
File Dimensione Formato  
prod_385702-doc_140162.pdf

solo utenti autorizzati

Descrizione: Clustered Elias-Fano Indexes
Tipologia: Versione Editoriale (PDF)
Dimensione 961.86 kB
Formato Adobe PDF
961.86 kB Adobe PDF   Visualizza/Apri   Richiedi una copia
prod_385702-doc_166152.pdf

accesso aperto

Descrizione: Clustered Elias-Fano Indexes
Tipologia: Versione Editoriale (PDF)
Dimensione 1.45 MB
Formato Adobe PDF
1.45 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/349460
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 14
  • ???jsp.display-item.citation.isi??? 12
social impact