Robust online planning with imperfect models

Rostov, Maxim; Kaisers, Michael

Environment models are not always known a priori, and approximating stochastic transition dynamics may introduce errors, especially when only a small amount of data is available or the model may be misspecified. This work introduces a robust decision-time planning method to cope with such imprecise models. The objective of robust planning is to find a policy with the best guaranteed performance, which we approach by transferring a two-stage minimization-maximization optimization procedure from the field of robust control to online planning. We assume that a Markov Decision Process underlies the environment and aim for the best worst-case performance within specific model error bounds. To compute solutions, we introduce a family of locally robust decision-time planning algorithms, specifically robust Monte Carlo Tree Search (rMCTS). Robust MCTS methods are then evaluated empirically with model error bounded by Wasserstein distance; in this setting, we find that the resulting robust policies yield safer and more uncertainty-aware behavior than their non-robust counterparts. Because both the model error bounds and the corresponding model minimizers can be adapted, robust MCTS is extensible to a variety of online planning settings.
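
The minimization-maximization idea in the abstract can be illustrated with a minimal sketch. This is not the authors' rMCTS: it assumes a single one-step backup rather than a tree search, and it stands in a small finite list of candidate transition models for the Wasserstein-bounded error set. The function name `robust_action` and all numbers are hypothetical and chosen only for illustration.

```python
def robust_action(candidate_models, rewards, values, gamma=0.95):
    """Return the action with the best worst-case one-step backup.

    candidate_models: list of models; each model is a list (one entry per
        action) of next-state probability lists. ASSUMPTION: this finite
        list stands in for the set of models within the error bound.
    rewards: immediate reward per action.
    values: value estimate per next state.
    """
    worst_case = []
    for action, reward in enumerate(rewards):
        # Inner minimisation: the candidate model that is worst for this action.
        worst_expected = min(
            sum(p * v for p, v in zip(model[action], values))
            for model in candidate_models
        )
        worst_case.append(reward + gamma * worst_expected)
    # Outer maximisation: the action whose worst case is best.
    best = max(range(len(worst_case)), key=worst_case.__getitem__)
    return best, worst_case


# Hypothetical two-state, two-action example.
nominal = [[0.1, 0.9], [0.4, 0.6]]    # estimated model
perturbed = [[0.6, 0.4], [0.5, 0.5]]  # another model assumed to lie within the bound
values = [0.0, 10.0]
rewards = [0.0, 0.0]

action, worst_case = robust_action([nominal, perturbed], rewards, values, gamma=1.0)
nominal_action, _ = robust_action([nominal], rewards, values, gamma=1.0)
```

In this toy setting, planning against the nominal model alone prefers action 0, while the robust choice is action 1, because action 0 loses more value if the perturbed model turns out to be the true one.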

Additional Metadata
Keywords	Planning, Robust optimization, Reinforcement learning
Stakeholder	Dexter Energy B.V., Amsterdam, The Netherlands
Conference	ALA 2021 - Adaptive and Learning Agents Workshop
Organisation	Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands
Citation	Rostov, M., & Kaisers, M. (2021). Robust online planning with imperfect models. In Adaptive and Learning Agents Workshop - ALA 2021.

