Non-asymptotic pure exploration by solving games

Degenne, Rémy; Koolen-Wijkstra, Wouter; Ménard, Pierre

R.R.B.P. Degenne (Rémy), W.M. Koolen-Wijkstra (Wouter) and P. Ménard (Pierre)

2019

Presented at the Conference on Neural Information Processing Systems (January 2019)

Pure exploration (aka active testing) is the fundamental task of sequentially gathering information to answer a query about a stochastic environment. Good algorithms make few mistakes and take few samples. Lower bounds (for multi-armed bandit models with arms in an exponential family) reveal that the sample complexity is determined by the solution to an optimisation problem. The existing state-of-the-art algorithms achieve asymptotic optimality by solving a plug-in estimate of that optimisation problem at each step. We interpret the optimisation problem as an unknown game, and propose sampling rules based on iterative strategies to estimate and converge to its saddle point. We apply no-regret learners to obtain the first finite-confidence guarantees that are adapted to the exponential family and which apply to any pure exploration query and bandit structure. Moreover, our algorithms only use a best-response oracle instead of fully solving the optimisation problem.
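
As a rough illustration of the game view described in the abstract, the sketch below approximates the saddle point of max over sampling weights w, inf over alternatives λ, of the sum over arms k of w_k d(μ_k, λ_k). It is not the authors' algorithm: it assumes the means μ are known, the arms are unit-variance Gaussians (so d(a, b) = (a - b)^2 / 2), and the query is best-arm identification. The w-player is a plain exponential-weights (Hedge) learner and the λ-player is a best-response oracle. The names `best_response` and `solve_game` are hypothetical and are not taken from the paper or from the Purex_games software.

```python
import math


def best_response(w, mu):
    """Cheapest alternative (hypothetical helper): the mean vector lam, closest
    to mu in w-weighted Gaussian divergence, in which some other arm ties with
    the current best arm. Assumes unit-variance Gaussian arms."""
    best = max(range(len(mu)), key=lambda k: mu[k])
    value, lam = math.inf, None
    for j in range(len(mu)):
        if j == best:
            continue
        total = w[best] + w[j]
        # Moving both arms to their w-weighted mean is the cheapest way to tie them.
        m = (w[best] * mu[best] + w[j] * mu[j]) / total
        cost = (w[best] * (mu[best] - m) ** 2 + w[j] * (mu[j] - m) ** 2) / 2
        if cost < value:
            value = cost
            lam = list(mu)
            lam[best] = lam[j] = m
    return value, lam


def solve_game(mu, rounds=5000):
    """Hedge for the w-player against a best-responding lambda-player.
    Returns the averaged weights and the game value they secure."""
    K = len(mu)
    gains = [0.0] * K   # cumulative per-arm payoffs d(mu_k, lam_k)
    avg_w = [0.0] * K
    for t in range(1, rounds + 1):
        eta = math.sqrt(math.log(K) / t)  # decreasing learning rate (an assumption)
        top = max(gains)
        p = [math.exp(eta * (g - top)) for g in gains]
        s = sum(p)
        w = [x / s for x in p]
        _, lam = best_response(w, mu)
        for k in range(K):
            gains[k] += (mu[k] - lam[k]) ** 2 / 2
            avg_w[k] += w[k] / rounds
    value, _ = best_response(avg_w, mu)
    return avg_w, value


if __name__ == "__main__":
    weights, value = solve_game([1.0, 0.5, 0.0])
    print("approximate weights:", weights)
    print("approximate game value:", value)
```

For two arms with means 1.0 and 0.0, the uniform weights (0.5, 0.5) are a fixed point of this iteration and the value is 0.125. The sampling rules in the paper work with estimated means and come with finite-confidence guarantees, neither of which this known-means sketch attempts to reproduce.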

Additional Metadata
Conference	Conference on Neural Information Processing Systems
Organisation	Machine Learning
Citation	Degenne, R., Koolen-Wijkstra, W., & Ménard, P. (2019). Non-asymptotic pure exploration by solving games. In Advances in Neural Information Processing Systems.

Free Full Text (Final Version, 557kb)

See Also
software Purex_games W.M. Koolen-Wijkstra (Wouter)
