Motivation: Determining the interaction partners among protein/domain families poses hard computational problems, in particular in the presence of paralogous proteins. Available approaches aim to identify interaction partners among protein/domain families through maximizing the similarity between trimmed versions of their phylogenetic trees. Since maximization of any natural similarity score is computationally difficult, many approaches employ heuristics to evaluate the distance matrices corresponding to the tree topologies in question. In this article, we devise an efficient deterministic algorithm which directly maximizes the similarity between two leaf labeled trees with edge lengths, obtaining a score-optimal alignment of the two trees in question.

Results: Our algorithm is significantly faster than those methods based on distance matrix comparison: 1 min on a single processor versus 730 h on a supercomputer. Furthermore, we outperform the current state-of-the-art exhaustive search approach in terms of precision, while incurring acceptable losses in recall.

Availability: A C implementation of the method demonstrated in this article is available at http://compbio.cs.sfu.ca/mirrort.htm

Contact: imanh@sfu.ca; cenk@sfu.ca; as@cwi.nl

Additional Metadata
Theme: Life Sciences (theme 5), Energy (theme 4)
Publisher: Oxford U.P.
Journal: Bioinformatics
Note: I. Hajirasouliha and A. Schönhuth are joint first authors.
Citation
Hajirasouliha, I, Schönhuth, A, Juan, D, Valencia, A, & Sahinalp, S.C. (2012). Mirroring trees in the light of their topologies. Bioinformatics, 28(9), 1202–1208.