Similarity measures play an important role in many data mining algorithms. To allow the use of such algorithms on non-standard databases, such as databases of financial time series, their similarity measure has to be defined. We present a simple and powerful technique which allows for the rapid evaluation of similarity between time series in large data bases. It is based on the orthonormal decomposition of the time series into the Haar basis. We demonstrate that this approach is capable of providing estimates of the local slope of the time series in the sequence of multi-resolution steps. The Haar representation and a number of related represenations derived from it are suitable for direct comparison, e.g. evaluation of the correlation product. We demonstrate that the distance between such representations closely corresponds to the subjective feeling of similarity between the time series. In order to test the validity of subjective criteria, we test the records of currency exchanges, finding convincing levels of correlation.

Springer-Verlag
Database Architectures

Struzik, Z. R., & Siebes, A. (1999). The Haar Wavelet Transform in the Time Series Similarity Paradigm. In Principles of Data Mining and Knowledge Discovery (pp. 12–22). Springer-Verlag.