SMART: SuperMaximal approximate repeats tool
Bioinformatics, Volume 36, Issue 8, pp. 2589-2591
State-of-the-art repeat analysis tools rely on extending maximal repeated pairs to enumerate maximal k-mismatch repeats. The number of such pairs can be quadratic in n, the length of the input sequence, so greedy heuristics are applied to speed up the extension. Here, we introduce supermaximal k-mismatch repeats, whose number is linear in n and which capture all maximal k-mismatch repeats: every maximal k-mismatch repeat is a substring of some supermaximal k-mismatch repeat. We present SMART, a tool implemented in C++ and based on recent algorithmic advances, which computes supermaximal k-mismatch repeats directly, and we show that these elements are statistically much more significant than the output of state-of-the-art tools.
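
To make the notion of a k-mismatch repeat concrete, the following is a minimal brute-force sketch of extending a pair of positions to the right while tolerating at most k mismatches. It is an illustration only and is not SMART's algorithm, which the abstract does not describe in detail. The function names `hamming` and `extend_right` are hypothetical, and the sketch assumes 0-based positions with i < j.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of positions at which s[i..i+len) and s[j..j+len) differ.
// Assumes both ranges lie inside s.
std::size_t hamming(const std::string& s, std::size_t i, std::size_t j,
                    std::size_t len) {
    std::size_t d = 0;
    for (std::size_t t = 0; t < len; ++t) {
        if (s[i + t] != s[j + t]) ++d;
    }
    return d;
}

// Largest len such that s[i..i+len) and s[j..j+len) differ in at most
// k positions and the second copy still fits inside s.
// Hypothetical helper for illustration; requires i < j.
std::size_t extend_right(const std::string& s, std::size_t i, std::size_t j,
                         std::size_t k) {
    assert(i < j && j <= s.size());
    std::size_t len = 0;
    std::size_t mismatches = 0;
    while (j + len < s.size()) {
        if (s[i + len] != s[j + len]) {
            if (mismatches == k) break;  // a further mismatch would exceed k
            ++mismatches;
        }
        ++len;
    }
    return len;
}
```

For the sequence ACGTTACGAT and positions 0 and 5, `extend_right` returns 3 with k = 0 (the exact repeat ACG) and 5 with k = 1 (ACGTT against ACGAT, which differ in one position). Running such an extension over every repeated pair is what becomes expensive when the number of pairs is quadratic in n.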