2020-09-14
FSST: Fast random access string compression
Publication
Presented at the 46th International Conference on Very Large Data Bases (August 2020), Tokyo, Japan
Strings are prevalent in real-world data sets. They often occupy a large fraction of the data and are slow to process. In this work, we present Fast Static Symbol Table (FSST), a lightweight compression scheme for strings. On text data, FSST offers decompression and compression speed similar to or better than the best speed-optimized compression methods, such as LZ4, yet offers significantly better compression factors. Moreover, its use of a static symbol table allows random access to individual, compressed strings, enabling lazy decompression and query processing on compressed data. We believe these features will make FSST a valuable piece in the standard compression toolbox.
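To make the random-access property concrete, here is a toy sketch of symbol-table string compression in Python. It is not the authors' implementation: the symbol table is hand-picked and hypothetical, the names `SYMBOLS`, `ESCAPE`, `compress`, and `decompress` are invented for illustration, and the encoding (one-byte codes plus an escape byte for literals) is an assumption of this sketch. The procedure the paper uses to construct a good table from the data is not reproduced.

```python
# Toy illustration of static symbol-table compression.
# Assumptions (not taken from the abstract): one-byte codes, an escape
# byte followed by a literal byte, and a hand-picked symbol table.

ESCAPE = 255  # assumed escape code: the next byte is an uncompressed literal

# Hypothetical static table; a symbol's index is its one-byte code.
SYMBOLS = [b"http://", b"www.", b".com", b".org", b"e"]


def compress(data: bytes, symbols=SYMBOLS) -> bytes:
    """Greedily replace the longest matching symbol at each position."""
    out = bytearray()
    i = 0
    while i < len(data):
        best_code, best_len = -1, 0
        for code, sym in enumerate(symbols):
            if len(sym) > best_len and data.startswith(sym, i):
                best_code, best_len = code, len(sym)
        if best_code >= 0:
            out.append(best_code)
            i += best_len
        else:
            # No symbol matches: emit the escape code and the raw byte.
            out.append(ESCAPE)
            out.append(data[i])
            i += 1
    return bytes(out)


def decompress(codes: bytes, symbols=SYMBOLS) -> bytes:
    """Decode one string using only the shared static table."""
    out = bytearray()
    i = 0
    while i < len(codes):
        code = codes[i]
        if code == ESCAPE:
            out.append(codes[i + 1])
            i += 2
        else:
            out += symbols[code]
            i += 1
    return bytes(out)


urls = [b"http://www.example.com", b"http://www.vldb.org"]
compressed = [compress(u) for u in urls]

# Random access: the second string is decoded without touching the first.
second = decompress(compressed[1])
```

Because the table is fixed for the whole collection and each string is encoded independently, any single compressed string can be decoded on its own. Block-oriented compressors generally require decompressing a whole block to reach one value.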
Additional Metadata | |
---|---|
DOI | doi.org/10.14778/3407790.3407851 |
Conference | 46th International Conference on Very Large Data Bases |
Organisation | Database Architectures |
Citation | Boncz, P., Neumann, T., & Leis, V. (2020). FSST: Fast random access string compression. In Proceedings of the VLDB Endowment (pp. 2649–2661). doi:10.14778/3407790.3407851 |