In this work, we conduct a feature-aware comparison of approaches to Cumulative Citation Recommendation (CCR), a task that aims to filter and rank a stream of documents according to their relevance to entities in a knowledge base. Starting from a large feature set, we identify a powerful subset and use it to compare classification and learning-to-rank algorithms. With this small set of powerful features, we achieve better performance than the state of the art. Surprisingly, our findings challenge the previously reported preference for learning-to-rank over classification: in our study, the classification approach outperforms learning-to-rank on CCR. This indicates that comparing two approaches is problematic because of the interplay between the approaches themselves and the feature sets one chooses to use.
Cumulative Citation Recommendation, Information Filtering, Knowledge Base Acceleration, Feature Study, System Comparison
Information (theme 2)
M. Spies, R.R. Wagner, L. Lhotská, H. Decker, S. Link
COMMIT: Infinity (P01)
International Workshop on Text-based Information Retrieval
At DEXA 2014
Human-centered Data Analysis

Gebremeskel, G.G, He, J, de Vries, A.P, & Lin, J.J.P. (2014). Cumulative Citation Recommendation: A Feature-aware Comparison of Approaches. In M Spies, R.R Wagner, L Lhotská, H Decker, & S Link (Eds.), . doi:10.1109/DEXA.2014.49