A third strike against perfect phylogeny
Perfect phylogenies are fundamental in the study of evolutionary trees because they capture the situation in which each evolutionary trait emerges only once in history; if such events are believed to be rare, then by Occam’s Razor such parsimonious trees are preferable as a hypothesis of evolution. A classical result states that a set of 2-state characters permits a perfect phylogeny precisely if each subset of 2 characters permits one. More recently, it was shown that the same property holds for 3-state characters, with subsets of size 3. A long-standing open problem asked whether such a constant exists for each number of states. More precisely, it has been conjectured that for any fixed number of states r there exists a constant f(r) such that a set of r-state characters C has a perfect phylogeny if and only if every subset of at most f(r) characters has a perfect phylogeny. Informally, the conjecture states that checking fixed-size subsets of characters is enough to correctly determine whether input data permits a perfect phylogeny, irrespective of the number of characters in the input. In this article, we show that this conjecture is false. In particular, we show that for any constant t, there exists a set C of 8-state characters such that C has no perfect phylogeny, but there exists a perfect phylogeny for every subset of at most t characters. Moreover, a perfect phylogeny already exists when just one of the characters is ignored, regardless of which character is ignored. This negative result complements the two negative results (“strikes”) of Bodlaender et al. (1992, 2000). We reflect on the consequences of this third strike, pointing out that while it does close off some routes for efficient algorithm development, many others remain open.
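To make the classical 2-state result concrete, the following is a minimal sketch of the pairwise check usually called the four gamete test. It assumes each character is given as a sequence of 0/1 states, one entry per taxon, with no missing data; the function names are illustrative and do not come from the article. Two binary characters are taken to be incompatible exactly when all four state combinations (0,0), (0,1), (1,0) and (1,1) occur among the taxa.

```python
from itertools import combinations


def pair_is_compatible(a, b):
    """Four gamete test for two binary characters (assumed 0/1, no missing data).

    The pair is compatible unless all four state combinations occur.
    """
    if len(a) != len(b):
        raise ValueError("characters must cover the same taxa")
    return len(set(zip(a, b))) < 4


def binary_characters_permit_perfect_phylogeny(characters):
    """Apply the classical 2-state result: check every pair of characters.

    This pairwise shortcut is only valid for 2-state characters.
    """
    return all(pair_is_compatible(a, b) for a, b in combinations(characters, 2))


# Hypothetical data: each inner list is one character, each position one taxon.
compatible = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1]]
conflicting = [[0, 0, 1, 1], [0, 1, 0, 1]]

print(binary_characters_permit_perfect_phylogeny(compatible))
print(binary_characters_permit_perfect_phylogeny(conflicting))
```

The check inspects subsets of fixed size 2, which is why it scales to any number of binary characters. The result of this article is that no analogous fixed-size subset check can be correct for 8-state characters.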
Keywords: Four gamete condition, Local obstructions conjecture, Maximum parsimony, Perfect phylogeny, Phylogenetic tree
van Iersel, L.J.J., Jones, M.E.L., & Kelk, S.M. (2019). A third strike against perfect phylogeny. Systematic Biology, 68(5), 814–827. doi:10.1093/sysbio/syz009