Untangling Result List Refinement and Ranking Quality: a Framework for Evaluation and Prediction

He, Jiyin; Bron, M.; de Vries, Arjen; Azzopardi, L.; de Rijke, Maarten

doi:10.1145/2766462.2767740

J. He (Jiyin), M. Bron, A.P. de Vries (Arjen), L. Azzopardi and M. de Rijke (Maarten)

2015-08-01

Presented at the Annual ACM SIGIR Conference, Santiago, Chile

Traditional batch evaluation metrics assume that user interaction with search results is limited to scanning down a ranked list. However, modern search interfaces come with additional elements supporting result list refinement (RLR) through facets and filters, making user search behavior increasingly dynamic. We develop an evaluation framework that takes a step beyond the interaction assumption of traditional evaluation metrics and allows for batch evaluation of systems with and without RLR elements. In our framework, we model user interaction as switching between different sublists. This provides a measure of user effort based on the joint effect of user interaction with RLR elements and result quality. We validate our framework by conducting a user study and comparing model predictions with real user performance. Our model predictions show a significant positive correlation with real user effort. Further, in contrast to traditional evaluation metrics, our framework's predictions of when users stand to benefit from RLR elements reflect the findings of our user study. Finally, we use the framework to investigate under what conditions systems with and without RLR elements are likely to be effective. We simulate varying conditions concerning ranking quality, users, task, and interface properties, demonstrating a cost-effective way to study whole-system performance.

Additional Metadata
Keywords	Simulation, Search behavior, Faceted search, Evaluation
Theme	Information (theme 2)
Stakeholder	Unspecified
Publisher	ACM
Persistent URL	doi.org/10.1145/2766462.2767740
Project	Behavior-aware Search Evaluation for Information Retrieval
Conference	Annual ACM SIGIR Conference
Grant	This work was funded by the Netherlands Organisation for Scientific Research (NWO); grant id nwo/13675 - Behavior-aware Search Evaluation for Information Retrieval
Organisation	Human-Centered Data Analytics
Citation	He, J., Bron, M., de Vries, A., Azzopardi, L., & de Rijke, M. (2015). Untangling Result List Refinement and Ranking Quality: a Framework for Evaluation and Prediction. In Proceedings of Annual ACM SIGIR Conference 2015 (SIGIR 38). ACM. doi:10.1145/2766462.2767740
