Monte-Carlo Tree Search by Best Arm Identification

Kaufmann, Emilie; Koolen-Wijkstra, Wouter

doi:10.48550/arXiv.1706.02986

Recent advances in bandit tools and techniques for sequential learning are steadily enabling new applications and are promising the resolution of a range of challenging related problems. We study the game tree search problem, where the goal is to quickly identify the optimal move in a given game tree by sequentially sampling its stochastic payoffs. We develop new algorithms for trees of arbitrary depth that operate by summarizing all deeper levels of the tree into confidence intervals at depth one and applying a best arm identification procedure at the root. We prove new sample complexity guarantees with a refined dependence on the problem instance. We show experimentally that our algorithms outperform existing elimination-based algorithms and match previous special-purpose methods for depth-two trees.
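
The idea of summarizing deeper levels into depth-one confidence intervals can be illustrated with a minimal sketch. It is not the paper's implementation and makes several assumptions the abstract does not state: payoffs lie in [0, 1], the root is a MAX node, leaf intervals use a plain Hoeffding radius with a fixed `delta` (no union bound over leaves or time), and the names `Node`, `leaf`, `interval`, `depth_one_intervals`, and `confident_best` are hypothetical. Intervals are propagated upward by taking the max of the children's bounds at MAX nodes and the min at MIN nodes; the root then stops once one child's lower bound exceeds every other child's upper bound.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Game-tree node; a node without children is a leaf with sampled payoffs."""
    is_max: bool = True
    children: List["Node"] = field(default_factory=list)
    total: float = 0.0  # sum of observed payoffs (used at leaves only)
    count: int = 0      # number of observed payoffs (used at leaves only)


def leaf(mean: float, n: int) -> Node:
    """Hypothetical helper: a leaf that has been sampled n times with the given empirical mean."""
    return Node(total=mean * n, count=n)


def interval(node: Node, delta: float = 0.1) -> Tuple[float, float]:
    """Confidence interval on the node's value, propagated up from the leaves."""
    if not node.children:
        if node.count == 0:
            return (0.0, 1.0)  # nothing observed yet: trivial interval
        mean = node.total / node.count
        # Assumption: simple Hoeffding radius for payoffs in [0, 1].
        radius = math.sqrt(math.log(1.0 / delta) / (2 * node.count))
        return (max(0.0, mean - radius), min(1.0, mean + radius))
    bounds = [interval(child, delta) for child in node.children]
    agg = max if node.is_max else min
    return (agg(lo for lo, _ in bounds), agg(hi for _, hi in bounds))


def depth_one_intervals(root: Node, delta: float = 0.1) -> List[Tuple[float, float]]:
    """One interval per move at the root, each summarizing its whole subtree."""
    return [interval(child, delta) for child in root.children]


def confident_best(root: Node, delta: float = 0.1) -> Optional[int]:
    """Index of a root move whose lower bound beats all other upper bounds, else None."""
    bounds = depth_one_intervals(root, delta)
    for i, (lo, _) in enumerate(bounds):
        if all(lo > hi for j, (_, hi) in enumerate(bounds) if j != i):
            return i
    return None


# Depth-two example: MAX root, two MIN children, two leaves each.
root = Node(is_max=True, children=[
    Node(is_max=False, children=[leaf(0.8, 1000), leaf(0.7, 1000)]),
    Node(is_max=False, children=[leaf(0.9, 1000), leaf(0.3, 1000)]),
])
best = confident_best(root)
```

In this example the second move contains the single highest leaf (0.9), but its MIN node is dominated by the 0.3 leaf, so the first move is the one the depth-one intervals separate. A full procedure would also need a sampling rule that decides which leaf to sample next; that part is omitted here.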

Additional Metadata
Persistent URL	doi.org/10.48550/arXiv.1706.02986
Project	Machine Learning at the Intrinsic Task Pace
Grant	This work was funded by The Netherlands Organisation for Scientific Research (NWO); grant id nwo/639.021.439 - Machine Learning at the Intrinsic Task Pace
Organisation	Machine Learning
Citation	Kaufmann, E., & Koolen-Wijkstra, W. (2017). Monte-Carlo Tree Search by Best Arm Identification. doi:10.48550/arXiv.1706.02986
