Tries are popular data structures for storing a set of strings, where common prefixes are represented by common root-to-node paths. More than 50 years of usage have produced many variants and implementations to overcome some of their limitations.We explore new succinct representations of path-decomposed tries and experimentally evaluate the corresponding reduction in space usage and memory latency, comparing with the state of the art.We study the following applications: compressed string dictionary and monotone minimal perfect hash for strings. In compressed string dictionary, we obtain data structures that outperform other state-of-the-art compressed dictionaries in space efficiency while obtaining predictable query times that are competitive with data structures preferred by the practitioners. On real-world datasets, our compressed tries obtain the smallest space (except for one case) and have the fastest lookup times, whereas access times are within 20% slower than the best-known solutions. In monotone minimal perfect hash for strings, our compressed tries perform several times faster than other trie-based monotone perfect hash functions while occupying nearly the same space. On real-world datasets, our tries are approximately 2 to 5 times faster than previous solutions, with a space occupancy less than 10% larger. Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval; E.1 [Data Structures]: Trees General Terms: Algorithms, Experimentation, Performance.
Fast compressed tries through path decompositions
2014
Abstract
Tries are popular data structures for storing a set of strings, where common prefixes are represented by common root-to-node paths. More than 50 years of usage have produced many variants and implementations to overcome some of their limitations.We explore new succinct representations of path-decomposed tries and experimentally evaluate the corresponding reduction in space usage and memory latency, comparing with the state of the art.We study the following applications: compressed string dictionary and monotone minimal perfect hash for strings. In compressed string dictionary, we obtain data structures that outperform other state-of-the-art compressed dictionaries in space efficiency while obtaining predictable query times that are competitive with data structures preferred by the practitioners. On real-world datasets, our compressed tries obtain the smallest space (except for one case) and have the fastest lookup times, whereas access times are within 20% slower than the best-known solutions. In monotone minimal perfect hash for strings, our compressed tries perform several times faster than other trie-based monotone perfect hash functions while occupying nearly the same space. On real-world datasets, our tries are approximately 2 to 5 times faster than previous solutions, with a space occupancy less than 10% larger. Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval; E.1 [Data Structures]: Trees General Terms: Algorithms, Experimentation, Performance.File | Dimensione | Formato | |
---|---|---|---|
prod_303200-doc_86611.pdf
solo utenti autorizzati
Descrizione: Fast Compressed Tries through Path Decompositions
Tipologia:
Versione Editoriale (PDF)
Dimensione
343.1 kB
Formato
Adobe PDF
|
343.1 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.