Identifying patterns in temporal data is key to uncovering meaningful relationships in diverse domains, from stock trading to social interactions. Also of great interest are clinical and biological applications, such as monitoring patient response to treatment or characterizing activity at the molecular level. In biology, researchers study the patterns that emerge from gene expression time series to gain insight into gene functions and the dynamics of biological processes, as well as perturbations of these processes that may lead to disease. Clustering can group genes exhibiting similar expression profiles, but it focuses on global patterns that denote rather broad, unspecific responses. Biclustering reveals local patterns, which more naturally capture the intricate collaboration between biological players, particularly in a temporal setting. Although the general biclustering formulation is NP-hard, exploiting specific properties of time series has led to efficient solutions for the discovery of temporally aligned patterns. The identification of biclusters with time-lagged patterns, suggestive of transcriptional cascades, nevertheless remains a challenge due to the combinatorial explosion of delayed occurrences. Herein, we propose LateBiclustering, a sensible heuristic algorithm that solves the problem in polynomial rather than exponential time. We show that it identifies meaningful time-lagged biclusters relevant to the response of Saccharomyces cerevisiae to heat stress.
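To make the notion of a time-lagged pattern concrete, the following minimal sketch may help. It is not the LateBiclustering algorithm; it only illustrates, under stated assumptions, how the same expression pattern can occur in several genes with different delays. The function names, the three-symbol discretization (U = up, D = down, N = no change), the threshold, and the toy expression values are all hypothetical and do not come from the paper.

```python
from typing import Dict, List


def discretize(series: List[float], threshold: float = 0.5) -> str:
    """Encode changes between consecutive time points as U, D or N.

    Assumption: a simple fixed threshold on the difference; real studies
    may use other discretization schemes.
    """
    symbols = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > threshold:
            symbols.append("U")
        elif delta < -threshold:
            symbols.append("D")
        else:
            symbols.append("N")
    return "".join(symbols)


def lagged_occurrences(patterns: Dict[str, str], motif: str) -> Dict[str, List[int]]:
    """Return, per gene, every start position at which the motif occurs.

    Genes sharing the motif at different start positions exhibit the same
    pattern with a time lag.
    """
    result: Dict[str, List[int]] = {}
    for gene, symbols in patterns.items():
        starts = [
            i
            for i in range(len(symbols) - len(motif) + 1)
            if symbols[i:i + len(motif)] == motif
        ]
        if starts:
            result[gene] = starts
    return result


# Hypothetical toy data: geneB and geneC repeat geneA's rise-and-fall
# pattern one and two time points later; geneD does not show it.
expression = {
    "geneA": [0.0, 1.0, 2.0, 1.0, 1.0, 1.0],
    "geneB": [0.0, 0.0, 1.0, 2.0, 1.0, 1.0],
    "geneC": [0.0, 0.0, 0.0, 1.0, 2.0, 1.0],
    "geneD": [2.0, 1.0, 0.0, 0.0, 0.0, 0.0],
}
discretized = {gene: discretize(values) for gene, values in expression.items()}
occurrences = lagged_occurrences(discretized, "UUD")
print(occurrences)
```

In this toy example the genes that share the motif, together with the shifted time windows in which it occurs, would form one time-lagged bicluster. The sketch checks a single given motif; enumerating all candidate motifs and all their delayed occurrences is what leads to the combinatorial explosion mentioned above.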
Additional Metadata
ACM: Combinatorics (acm G.2.1), Clustering (acm I.5.3), Life and Medical Sciences (acm J.3), Nonnumerical Algorithms and Problems (acm F.2.2), Data Structures (acm E.1)
Theme: Life Sciences (theme 5)
Stakeholder: Unspecified
Citation
de Pinho Goncalves, J.S., & Madeira, S.C. (2014). LateBiclustering: Efficient Heuristic Algorithm for Time-Lagged Bicluster Identification.