Pure exploration with multiple correct answers

Degenne, Rémy; Koolen-Wijkstra, Wouter

We determine the sample complexity of pure exploration bandit problems with multiple good answers. We derive a lower bound using a new game equilibrium argument. We show how continuity and convexity properties of single-answer problems ensure that the existing Track-and-Stop algorithm has asymptotically optimal sample complexity. However, that convexity is lost when going to the multiple-answer setting. We present a new algorithm which extends Track-and-Stop to the multiple-answer case and has asymptotic sample complexity matching the lower bound.

Additional Metadata
Conference	Conference on Neural Information Processing Systems
Organisation	Machine Learning
Citation	Degenne, R., & Koolen-Wijkstra, W. (2019). Pure exploration with multiple correct answers. In Advances in Neural Information Processing Systems.

Free Full Text (Final Version, 364kb)
