Dictionary-based compression schemes provide fast decoding operation, typically at the expense of reduced compression effectiveness compared to statistical or probability-based approaches. In this work, we apply dictionary-based techniques to the compression of inverted lists, showing that the high degree of regularity that these integer sequences exhibit is a good match for certain types of dictionary methods, and that an important new trade-off balance between compression effectiveness and compression efficiency can be achieved. Our observations are supported by experiments using the document-level inverted index data for two large text collections, and a wide range of other index compression implementations as reference points. Those experiments demonstrate that the gap between efficiency and effectiveness can be substantially narrowed.

Fast dictionary-based compression for inverted indexes

Pibiri G E;
2019

Abstract

Dictionary-based compression schemes provide fast decoding operation, typically at the expense of reduced compression effectiveness compared to statistical or probability-based approaches. In this work, we apply dictionary-based techniques to the compression of inverted lists, showing that the high degree of regularity that these integer sequences exhibit is a good match for certain types of dictionary methods, and that an important new trade-off balance between compression effectiveness and compression efficiency can be achieved. Our observations are supported by experiments using the document-level inverted index data for two large text collections, and a wide range of other index compression implementations as reference points. Those experiments demonstrate that the gap between efficiency and effectiveness can be substantially narrowed.
2019
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Inglese
International Conference on Web Search and Data Mining
6
14
https://dl.acm.org/doi/10.1145/3289600.3290962
Sì, ma tipo non specificato
11/02/2019,15/02/2019
Compression
Decoding
Efficiency
Inverted index
1
partially_open
Pibiri G. E.; Petri M.; Moffat A.
273
info:eu-repo/semantics/conferenceObject
04 Contributo in convegno::04.01 Contributo in Atti di convegno
File in questo prodotto:
File Dimensione Formato  
prod_402784-doc_140201.pdf

accesso aperto

Descrizione: Fast dictionary-based compression for inverted indexes
Tipologia: Versione Editoriale (PDF)
Dimensione 579.43 kB
Formato Adobe PDF
579.43 kB Adobe PDF Visualizza/Apri
prod_402784-doc_164452.pdf

non disponibili

Descrizione: Fast dictionary-based compression for inverted indexes
Tipologia: Versione Editoriale (PDF)
Dimensione 1.13 MB
Formato Adobe PDF
1.13 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/365126
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 24
  • ???jsp.display-item.citation.isi??? 17
social impact