Tries are popular data structures for storing a set of strings, where common prefixes are represented by common root-to-node paths. More than 50 years of usage have produced many variants and implementations to overcome some of their limitations.We explore new succinct representations of path-decomposed tries and experimentally evaluate the corresponding reduction in space usage and memory latency, comparing with the state of the art.We study the following applications: compressed string dictionary and monotone minimal perfect hash for strings. In compressed string dictionary, we obtain data structures that outperform other state-of-the-art compressed dictionaries in space efficiency while obtaining predictable query times that are competitive with data structures preferred by the practitioners. On real-world datasets, our compressed tries obtain the smallest space (except for one case) and have the fastest lookup times, whereas access times are within 20% slower than the best-known solutions. In monotone minimal perfect hash for strings, our compressed tries perform several times faster than other trie-based monotone perfect hash functions while occupying nearly the same space. On real-world datasets, our tries are approximately 2 to 5 times faster than previous solutions, with a space occupancy less than 10% larger. Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval; E.1 [Data Structures]: Trees General Terms: Algorithms, Experimentation, Performance.

Fast compressed tries through path decompositions

2014

Abstract

Tries are popular data structures for storing a set of strings, where common prefixes are represented by common root-to-node paths. More than 50 years of usage have produced many variants and implementations to overcome some of their limitations.We explore new succinct representations of path-decomposed tries and experimentally evaluate the corresponding reduction in space usage and memory latency, comparing with the state of the art.We study the following applications: compressed string dictionary and monotone minimal perfect hash for strings. In compressed string dictionary, we obtain data structures that outperform other state-of-the-art compressed dictionaries in space efficiency while obtaining predictable query times that are competitive with data structures preferred by the practitioners. On real-world datasets, our compressed tries obtain the smallest space (except for one case) and have the fastest lookup times, whereas access times are within 20% slower than the best-known solutions. In monotone minimal perfect hash for strings, our compressed tries perform several times faster than other trie-based monotone perfect hash functions while occupying nearly the same space. On real-world datasets, our tries are approximately 2 to 5 times faster than previous solutions, with a space occupancy less than 10% larger. Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval; E.1 [Data Structures]: Trees General Terms: Algorithms, Experimentation, Performance.
2014
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Monotone perfect hash functions
String dictionaries
Succinct data structures
Tries
File in questo prodotto:
File Dimensione Formato  
prod_303200-doc_86611.pdf

solo utenti autorizzati

Descrizione: Fast Compressed Tries through Path Decompositions
Tipologia: Versione Editoriale (PDF)
Dimensione 343.1 kB
Formato Adobe PDF
343.1 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/271621
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 39
  • ???jsp.display-item.citation.isi??? ND
social impact