2019
Pure exploration with multiple correct answers
Publication
Publication
Presented at the
Conference on Neural Information Processing Systems (January 2019)
We determine the sample complexity of pure exploration bandit problems with multiple good answers. We derive a lower bound using a new game equilibrium argument. We show how continuity and convexity properties of single-answer problems ensure that the existing Track-and-Stop algorithm has asymptotically optimal sample complexity. However, that convexity is lost when going to the multiple-answer setting. We present a new algorithm which extends Track-and-Stop to the multiple-answer case and has asymptotic sample complexity matching the lower bound.
Additional Metadata | |
---|---|
Conference on Neural Information Processing Systems | |
Organisation | Machine Learning |
Degenne, R., & Koolen-Wijkstra, W. (2019). Pure exploration with multiple correct answers. In Advances in Neural Information Processing Systems. |