Reverse-safe data structures for text indexing

Bernardini, Giulia; Chen, Huiping; Fici, Gabriele; Loukides, Grigorios; Pissis, Solon

doi:10.1145/3461698

G. Bernardini (Giulia), H. Chen (Huiping), G. Fici (Gabriele), G. Loukides (Grigorios) and S. Pissis (Solon)

2021-12-01

Presented at the Workshop on Algorithm Engineering and Experiments (January 2020), Salt Lake City, UT, USA

We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm that constructs a z-reverse-safe data structure of size O(n) that answers pattern matching queries of length at most d optimally, where d is maximal for any such z-reverse-safe data structure. The construction algorithm takes O(n^ω log d) time, where ω is the matrix multiplication exponent. We show that, despite the n^ω factor, our engineered implementation takes only a few minutes to finish for million-letter texts. We further show that plugging our method into data analysis applications results in insignificant or no loss of data utility. Finally, we show how our technique can be extended to support applications under a realistic adversary model.
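The definition of z-reverse-safety can be made concrete with a brute-force sketch for toy inputs. It assumes that the "answers" stored by the structure are the occurrence counts of every pattern of length at most d (an assumption made here for illustration), and it enumerates all candidate texts exhaustively, so it runs in exponential time. It is not the O(n^ω log d) construction described in the abstract. The function names `substring_counts`, `count_consistent_texts` and `max_safe_d` are hypothetical.

```python
from collections import Counter
from itertools import product


def substring_counts(text, d):
    """Occurrence counts of every substring of length 1..d (assumed query answers)."""
    counts = Counter()
    for k in range(1, d + 1):
        for i in range(len(text) - k + 1):
            counts[text[i:i + k]] += 1
    return counts


def count_consistent_texts(text, d):
    """Number of texts of the same length that yield exactly the same answers.

    Only letters occurring in `text` are tried, because matching the
    length-1 counts already rules out any other letter.
    """
    alphabet = sorted(set(text))
    target = substring_counts(text, d)
    return sum(
        1
        for cand in product(alphabet, repeat=len(text))
        if substring_counts("".join(cand), d) == target
    )


def max_safe_d(text, z):
    """Largest d such that at least z texts share all answers up to length d.

    Returns 0 if no d >= 1 qualifies. The count cannot grow as d grows
    (larger d only adds constraints), so the scan stops at the first failure.
    """
    best = 0
    for d in range(1, len(text) + 1):
        if count_consistent_texts(text, d) >= z:
            best = d
        else:
            break
    return best


if __name__ == "__main__":
    print(count_consistent_texts("abaab", 2))  # texts sharing all counts up to length 2
    print(max_safe_d("abaab", 2))              # largest d that keeps z = 2 candidates
```

For the text "abaab", the texts "abaab" and "aabab" have identical counts for all patterns of length at most 2, so a structure limited to d = 2 would be 2-reverse-safe under these assumptions, while d = 3 would pin the text down uniquely.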

Additional Metadata
Persistent URL	doi.org/10.1145/3461698
Conference	Workshop on Algorithm Engineering and Experiments
Organisation	Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands
Citation	Bernardini, G., Chen, H., Fici, G., Loukides, G., & Pissis, S. (2021). Reverse-safe data structures for text indexing. Proceedings of the Workshop on Algorithm Engineering and Experiments, 199–213. https://doi.org/10.1145/3461698

See Also
Software/data: RSDS: Reverse-Safe-data-structure. G. Loukides (Grigorios), S. Pissis (Solon) and H. Chen (Huiping)
