We compare recommender system algorithms along four dimensions. The first is offline evaluation, where we compare the performance of the algorithms in an offline setting. The second is online evaluation, where we deploy the algorithms online in order to compare their performance patterns. The third is time, where we compare the algorithms across two years, 2015 and 2016. The fourth is the effect of non-algorithmic factors on the performance of an online recommender system, which we quantify with an A/A test. We then analyze the similarities and differences in performance along these dimensions to identify meaningful patterns and draw conclusions.
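
To illustrate the idea behind an A/A test, the following is a minimal sketch and not the procedure used in the paper. It assumes two instances of the same algorithm receive equal traffic, and it uses simulated clicks with a hypothetical click-through rate (`true_ctr`). The function names and the choice of a two-proportion z-statistic are likewise assumptions made for this example. Because both instances are identical, any difference in their observed click-through rates reflects non-algorithmic variation.

```python
import math
import random


def simulate_clicks(n_requests, click_prob, rng):
    """Count clicks over n_requests recommendations (simulated data)."""
    return sum(1 for _ in range(n_requests) if rng.random() < click_prob)


def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z-statistic for the difference between two click-through rates."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    # With no clicks (or all clicks) the standard error is zero.
    return (p_a - p_b) / se if se > 0 else 0.0


rng = random.Random(0)   # fixed seed so the sketch is repeatable
n = 10_000               # hypothetical number of requests per instance
true_ctr = 0.01          # hypothetical CTR shared by both instances

# Two instances of the SAME algorithm: an A/A test.
clicks_1 = simulate_clicks(n, true_ctr, rng)
clicks_2 = simulate_clicks(n, true_ctr, rng)

z = two_proportion_z(clicks_1, n, clicks_2, n)
print(f"CTR instance 1: {clicks_1 / n:.4f}")
print(f"CTR instance 2: {clicks_2 / n:.4f}")
print(f"z-statistic:    {z:.3f}")
```

The spread observed between the two identical instances gives a rough baseline: a difference between two distinct algorithms that is no larger than this spread may not be attributable to the algorithms themselves.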

Conference and Labs of the Evaluation Forum

Gebremeskel, G., & de Vries, A. (2016). Recommender systems evaluations: Offline, online, time and A/A test. In CEUR Workshop Proceedings (pp. 642–656).