During the last decade, recommender systems have become a ubiquitous feature in the online world. Research on systems and algorithms in this area has flourished, leading to novel techniques for personalization and recommendation. The evaluation of recommender systems, however, has not seen similar progress: techniques have changed little since the advent of recommender systems, when evaluation methodologies were "borrowed" from related research areas. As an effort to move evaluation methodology forward, this paper describes a production recommender system infrastructure that allows research systems to be evaluated in situ, by real-world metrics such as user clickthrough. We present an analysis of one month of interactions with this infrastructure and share our findings.
Workshop on Living Labs for Information Retrieval Evaluation
Human-Centered Data Analytics

Said, A., Lin, J., Bellogín Kouki, A., & de Vries, A. (2013). A Month in the Life of a Production News Recommender System.