Fast dictionary-based compression for inverted indexes
Pibiri G E;
2019
Abstract
Dictionary-based compression schemes provide fast decoding, typically at the expense of reduced compression effectiveness compared to statistical or probability-based approaches. In this work, we apply dictionary-based techniques to the compression of inverted lists, showing that the high degree of regularity these integer sequences exhibit is a good match for certain types of dictionary methods, and that an important new trade-off between compression effectiveness and compression efficiency can be achieved. Our observations are supported by experiments using document-level inverted index data for two large text collections, with a wide range of other index compression implementations as reference points. Those experiments demonstrate that the gap between efficiency and effectiveness can be substantially narrowed.
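As a rough illustration of why regular integer sequences suit dictionary methods, the following Python sketch converts a posting list to gaps between consecutive document IDs and replaces frequently recurring blocks of gaps with references into a small dictionary. It is a toy example under stated assumptions, not the scheme evaluated in the paper: the function names, the fixed block length, the dictionary size, and the tagged-tuple output format are all hypothetical choices made for readability.

```python
from collections import Counter


def to_gaps(doc_ids):
    """Turn a strictly increasing list of document IDs into gaps.

    The first ID is kept as-is, so a prefix sum restores the list.
    """
    if not doc_ids:
        return []
    return [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]


def build_dictionary(gap_lists, block=4, size=256):
    """Collect the `size` most frequent blocks of `block` consecutive gaps.

    Blocks are sampled at aligned offsets only (an assumption made to keep
    the sketch short).
    """
    counts = Counter()
    for gaps in gap_lists:
        for i in range(0, len(gaps) - block + 1, block):
            counts[tuple(gaps[i:i + block])] += 1
    return [pattern for pattern, _ in counts.most_common(size)]


def encode(gaps, dictionary, block=4):
    """Greedily replace dictionary blocks with ("ref", index) codes.

    Gaps that do not start a dictionary block are emitted one at a time
    as ("lit", gap) codes.
    """
    index = {pattern: i for i, pattern in enumerate(dictionary)}
    codes = []
    i = 0
    while i < len(gaps):
        chunk = tuple(gaps[i:i + block])
        if len(chunk) == block and chunk in index:
            codes.append(("ref", index[chunk]))
            i += block
        else:
            codes.append(("lit", gaps[i]))
            i += 1
    return codes


def decode(codes, dictionary):
    """Expand codes back into gaps, then prefix-sum into document IDs."""
    doc_ids = []
    total = 0
    for kind, value in codes:
        block = dictionary[value] if kind == "ref" else (value,)
        for gap in block:
            total += gap
            doc_ids.append(total)
    return doc_ids
```

Decoding here is a table lookup followed by a prefix sum, which is the intuition behind the fast decoding that the abstract attributes to dictionary methods. A real index codec would additionally pack the codes into a compact binary layout, which this sketch omits.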