Elastic Degenerate (ED) strings and Elastic Founder (EF) graphs are two versions of acyclic components of pangenomes. Both ED strings and EF graphs (which we collectively name variable strings) extend the well-known notion of indeterminate string. Recent work has extensively investigated algorithmic tasks over these structures, and over several other variable strings notions that they generalise. Among such tasks, the basic operation of matching a pattern into a text, which can serve as a toolkit for many pangenomic data analyses using these data structures, deserves special attention. In this paper we: (1) highlight a clear taxonomy within both ED strings and EF graphs ranging through variable strings of all types, from the linear string up to the most general one; (2) investigate the problem PvarT(X,Y) of matching a solid or variable pattern of type X into a variable text of type Y; (3) using as a reference the quadratic conditional lower bounds that are known for PvarT(solid,ED) and PvarT(solid,EF), for all possible types of variable strings X and Y we either prove the quadratic conditional lower bound for PvarT(X,Y), or provide non-trivial, often sub-quadratic, upper bounds, also exploiting the above-mentioned taxonomy.

, , , ,
doi.org/10.4230/LIPIcs.WABI.2024.14
Leibniz International Proceedings in Informatics (LIPIcs)
Algorithms for PAngenome Computational Analysis
24th International Workshop on Algorithms in Bioinformatics, WABI 2024
Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands

Ascone, R., Bernardini, G., Conte, A., Equi, M., Gabory, E., Grossi, R., & Pisanti, N. (2024). A unifying taxonomy of pattern matching in degenerate strings and founder graphs. In Proceedings of the 24th International Workshop on Algorithms in Bioinformatics, WABI 2024 (pp. 14:1–14:21). doi:10.4230/LIPIcs.WABI.2024.14